From 18ae790739cf6140ed0c37127cb11ea32cd6841f Mon Sep 17 00:00:00 2001 From: Michael Kennedy Date: Tue, 10 Feb 2026 15:29:55 -0800 Subject: [PATCH 01/16] transcripts --- .../513-stories-from-python-history.txt | 2 +- .../513-stories-from-python-history.vtt | 2 +- .../514-python-language-summit-2025.txt | 4 +- .../514-python-language-summit-2025.vtt | 4 +- transcripts/518-django-birthday.txt | 2 +- transcripts/518-django-birthday.vtt | 2 +- transcripts/536-fly-inside-fastapi-cloud.txt | 2294 ++++++++++ transcripts/536-fly-inside-fastapi-cloud.vtt | 3842 +++++++++++++++++ 8 files changed, 6144 insertions(+), 8 deletions(-) create mode 100644 transcripts/536-fly-inside-fastapi-cloud.txt create mode 100644 transcripts/536-fly-inside-fastapi-cloud.vtt diff --git a/transcripts/513-stories-from-python-history.txt b/transcripts/513-stories-from-python-history.txt index 57b7d4f..8291d4c 100644 --- a/transcripts/513-stories-from-python-history.txt +++ b/transcripts/513-stories-from-python-history.txt @@ -10,7 +10,7 @@ 00:00:34 This is Talk Python To Me, episode 513, recorded June 6th, 2025. -00:00:53 to unite. We started in pyramid cruising. +00:00:53 to unite. We started in Pyramid cruising. 00:00:56 Welcome to Talk Python To Me, a weekly podcast on Python. diff --git a/transcripts/513-stories-from-python-history.vtt b/transcripts/513-stories-from-python-history.vtt index fdd8459..7ba9b3f 100644 --- a/transcripts/513-stories-from-python-history.vtt +++ b/transcripts/513-stories-from-python-history.vtt @@ -19,7 +19,7 @@ You'll hear how Import This came to be and how the first PyCon had around 30 att This is Talk Python To Me, episode 513, recorded June 6th, 2025. 00:00:53.520 --> 00:00:55.820 -to unite. We started in pyramid cruising. +to unite. We started in Pyramid cruising. 00:00:56.120 --> 00:00:58.920 Welcome to Talk Python To Me, a weekly podcast on Python. 
diff --git a/transcripts/514-python-language-summit-2025.txt b/transcripts/514-python-language-summit-2025.txt index fed2300..8ff0a85 100644 --- a/transcripts/514-python-language-summit-2025.txt +++ b/transcripts/514-python-language-summit-2025.txt @@ -14,9 +14,9 @@ 00:00:31 This is Talk Python To Me, episode 514, recorded June 17th, 2025. -00:00:51 to unite. We started in pyramid cruising. Welcome to Talk Python To Me, a weekly podcast on Python. +00:00:51 to unite. We started in Pyramid cruising. Welcome to Talk Python To Me, a weekly podcast on Python. -00:00:54 to unite. We started in pyramid cruising. Welcome to Talk Python To Me, a weekly podcast on Python. +00:00:54 to unite. We started in Pyramid cruising. Welcome to Talk Python To Me, a weekly podcast on Python. 00:00:57 This is your host, Michael Kennedy. Follow me on Mastodon where I'm @mkennedy and follow the diff --git a/transcripts/514-python-language-summit-2025.vtt b/transcripts/514-python-language-summit-2025.vtt index 02daee9..edd5e2b 100644 --- a/transcripts/514-python-language-summit-2025.vtt +++ b/transcripts/514-python-language-summit-2025.vtt @@ -25,10 +25,10 @@ giving their attention to. This is Talk Python To Me, episode 514, recorded June 17th, 2025. 00:00:51.460 --> 00:00:53.760 -to unite. We started in pyramid cruising. Welcome to Talk Python To Me, a weekly podcast on Python. +to unite. We started in Pyramid cruising. Welcome to Talk Python To Me, a weekly podcast on Python. 00:00:54.060 --> 00:00:56.860 -to unite. We started in pyramid cruising. Welcome to Talk Python To Me, a weekly podcast on Python. +to unite. We started in Pyramid cruising. Welcome to Talk Python To Me, a weekly podcast on Python. 00:00:57.460 --> 00:01:02.700 This is your host, Michael Kennedy. 
Follow me on Mastodon where I'm @mkennedy and follow the diff --git a/transcripts/518-django-birthday.txt b/transcripts/518-django-birthday.txt index 8e70256..2176567 100644 --- a/transcripts/518-django-birthday.txt +++ b/transcripts/518-django-birthday.txt @@ -12,7 +12,7 @@ 00:01:02 This is Talk Python To Me, episode 518, recorded August 18th, 2025. -00:01:19 It's time to unite. We started in pyramid cruising. Welcome to Talk Python To Me, a weekly podcast +00:01:19 It's time to unite. We started in Pyramid cruising. Welcome to Talk Python To Me, a weekly podcast 00:01:25 on Python. This is your host, Michael Kennedy. Follow me on Mastodon where I'm @mkennedy and follow the podcast using @talkpython, both accounts over at fosstodon.org and keep up with the show and listen to over nine years of episodes at talkpython.fm. If you want to be part of our live episodes, you can find the live streams over on YouTube. Subscribe to our YouTube channel over at talkpython.fm/youtube and get notified about upcoming shows. This episode is brought to you entirely by Sentry. It's a bit of an episode takeover, if you will. Sentry has two excellent and exciting services to tell you about. Sear, your agentic AI debugging assistant, which takes all the data already gathered by Sentry to help discover the problems and even propose fixes as GitHub PRs. And the other is AI agent monitoring, which adds deep observability to your AI agents in your app. If you're adding AI and LLM features to your Python apps, you'll want to know about AI agent monitoring. I'll tell you more about both of these later in the episode. And remember, however you happen to sign up for Sentry, if you do, use our code TALKPYTHON, one word, all caps. 
diff --git a/transcripts/518-django-birthday.vtt b/transcripts/518-django-birthday.vtt index 06575a0..f5992f7 100644 --- a/transcripts/518-django-birthday.vtt +++ b/transcripts/518-django-birthday.vtt @@ -22,7 +22,7 @@ Finally, we look ahead at the next decade of speed, security, and sustainability This is Talk Python To Me, episode 518, recorded August 18th, 2025. 00:01:19.960 --> 00:01:25.480 -It's time to unite. We started in pyramid cruising. Welcome to Talk Python To Me, a weekly podcast +It's time to unite. We started in Pyramid cruising. Welcome to Talk Python To Me, a weekly podcast 00:01:25.860 --> 00:02:36.740 on Python. This is your host, Michael Kennedy. Follow me on Mastodon where I'm @mkennedy and follow the podcast using @talkpython, both accounts over at fosstodon.org and keep up with the show and listen to over nine years of episodes at talkpython.fm. If you want to be part of our live episodes, you can find the live streams over on YouTube. Subscribe to our YouTube channel over at talkpython.fm/youtube and get notified about upcoming shows. This episode is brought to you entirely by Sentry. It's a bit of an episode takeover, if you will. Sentry has two excellent and exciting services to tell you about. Sear, your agentic AI debugging assistant, which takes all the data already gathered by Sentry to help discover the problems and even propose fixes as GitHub PRs. And the other is AI agent monitoring, which adds deep observability to your AI agents in your app. If you're adding AI and LLM features to your Python apps, you'll want to know about AI agent monitoring. I'll tell you more about both of these later in the episode. And remember, however you happen to sign up for Sentry, if you do, use our code TALKPYTHON, one word, all caps. 
diff --git a/transcripts/536-fly-inside-fastapi-cloud.txt b/transcripts/536-fly-inside-fastapi-cloud.txt new file mode 100644 index 0000000..6701191 --- /dev/null +++ b/transcripts/536-fly-inside-fastapi-cloud.txt @@ -0,0 +1,2294 @@ +00:00:00 You've built your FastAPI app. It's running great locally. Now you want to share it with the world. + +00:00:05 But then reality hits. Containers, load balancers, HTTPS certificates, cloud consoles with 200 + +00:00:12 options. What if deploying was just one command? That's exactly what Sebastian Ramirez and the + +00:00:18 FastAPI cloud team are building. On this episode, we sit down with Sebastian, Patrick Arminio, + +00:00:24 Savannah Ostrowski, and Jonathan Ewald to go inside FastAPI cloud, explore what it means + +00:00:30 to build a Pythonic cloud and dig into how this commercial venture is actually making FastAPI, + +00:00:35 the open source project, stronger than ever. This is Talk Python To Me, episode 536, recorded January + +00:00:42 13th, 2026. Talk Python To Me, yeah, we ready to roll. Upgrading the code, no fear of getting old. + +00:00:51 Async in the air, new frameworks in sight, geeky rap on deck. Quark crew, it's time to unite. We + +00:01:01 Welcome to Talk Python To Me, the number one Python podcast for developers and data scientists. + +00:01:06 This is your host, Michael Kennedy. I'm a PSF fellow who's been coding for over 25 years. + +00:01:12 Let's connect on social media. You'll find me and Talk Python on Mastodon, BlueSky, and X. + +00:01:17 The social links are all in your show notes. You can find over 10 years of past episodes at + +00:01:23 Talk Python.fm. And if you want to be part of the show, you can join our recording live streams. + +00:01:27 That's right, we live stream the raw uncut version of each episode on YouTube. + +00:01:32 Just visit talkpython.fm/youtube to see the schedule of upcoming events. 
+
+00:01:36 Be sure to subscribe there and press the bell so you'll get notified anytime we're recording.
+
+00:01:41 This episode is brought to you by CommandBook, a native macOS app that I built that gives
+
+00:01:46 long-running terminal commands a permanent home.
+
+00:01:49 No more juggling six terminal tabs every morning.
+
+00:01:51 Carefully craft a command once, run it forever with auto-restart, URL detection, and a full CLI.
+
+00:01:56 Download it for free at talkpython.fm/command book app.
+
+00:02:00 And it's brought to you by the Talk Python in Production Book,
+
+00:02:03 an inside look at 10 years of the real-world DevOps behind the Talk Python sites and apps.
+
+00:02:08 Check it out at talkpython.fm/DevOps book.
+
+00:02:12 Before we dive in, I have something excellent to announce.
+
+00:02:14 A few episodes back, I told you about our new AI integrations,
+
+00:02:18 the Talk Python MCP servers, and our llms.txt file, so that AI tools can tap into our over 550 episodes and 7.5 million words of info around the Python community and history.
+
+00:02:32 Well, I'm building on that now.
+
+00:02:34 Talk Python now has an open source CLI tool.
+
+00:02:37 You can search episodes, transcripts, guests, and even our courses right from your terminal.
+
+00:02:42 No browser required.
+
+00:02:44 It's fast too, backed by our optimized APIs, taking about five milliseconds of response time,
+
+00:02:49 plus, you know, whatever the internet ping time is.
+
+00:02:51 It supports rich text for humans, JSON for programs, and markdown output for AIs.
+
+00:02:57 You just install it with uv tool install talk-python-cli, and check it out on GitHub.
+
+00:03:04 The link is in the podcast player show notes.
+
+00:03:07 I also wrote up a blog post on the hows and whys of it.
+
+00:03:09 Check that out over on the Talk Python blog at, well, talkpython.fm/blog.
+
+00:03:14 The full link to the exact article is in the show notes.
+
+00:03:17 All right, let's jump in.
+
+00:03:20 Hello, everyone.
+
+00:03:21 Sebastian, Patrick, Savannah, and Jonathan.
+
+00:03:24 Awesome to have you all here.
+
+00:03:26 Excited to talk about FastAPI Cloud.
+
+00:03:28 Welcome.
+
+00:03:28 Yes.
+
+00:03:28 Thanks for having me.
+
+00:03:29 Thank you.
+
+00:03:30 We're also ahead, Mike.
+
+00:03:31 What a project.
+
+00:03:32 It's been going on for a while.
+
+00:03:34 I've heard stuff from Sebastian that maybe something was brewing and all these things,
+
+00:03:40 but not too long ago, you all announced it.
+
+00:03:43 And I heard that FastAPI, some people have been using it recently.
+
+00:03:47 Some of the surveys show that some people use it for websites.
+
+00:03:51 I'm not sure.
+
+00:03:51 Rumors.
+
+00:03:52 Yeah, yeah.
+
+00:03:53 Rumors.
+
+00:03:54 Rumors.
+
+00:03:55 Oh, my gosh.
+
+00:03:56 I mean, congratulations on that.
+
+00:03:58 But before we dive into FastAPI and FastAPI Cloud, let's just do a quick introduction.
+
+00:04:03 Who are you?
+
+00:04:04 We'll just go around the Brady Bunch squares of our live stream here and start with Sebastian.
+
+00:04:08 You've been on the show a few times.
+
+00:04:09 In fact, you've been on the show just recently for a really fun episode, Sebastian.
+
+00:04:14 Who are you?
+
+00:04:15 For real, that was super fun.
+
+00:04:17 So, hello, everyone.
+
+00:04:18 I'm Sebastian Ramirez, or Tiangolo.
+
+00:04:20 I created FastAPI.
+
+00:04:22 That is this Python framework for building web APIs and backend.
+
+00:04:26 In case you've been living in a hole and haven't done any Python for 10 years.
+
+00:04:30 You also are famous for really pointing out the ridiculousness of modern tech recruiting.
+
+00:04:36 You know what I'm talking about?
+
+00:04:38 Yeah, you know, like it's fun.
+
+00:04:40 This is probably the thing that I am known for.
+
+00:04:42 It's for writing a tweet saying, yeah, what was it?
+
+00:04:47 that I saw a job asking for five years of experience in FastAPI,
+
+00:04:52 and I only had 2.5 since I created the thing.
+
+00:04:56 So you didn't qualify for the FastAPI job?
+
+00:04:59 I didn't qualify for that, yeah.
+
+00:05:00 And then the funny thing is, you know, like people sometimes,
+
+00:05:03 even people in Python itself, and tell me like, oh, wait, like you're, and I say like, oh, yeah,
+
+00:05:08 I created this thing called FastAPI.
+
+00:05:09 Oh, wait, okay, so what is FastAPI?
+
+00:05:11 Oh, wait, you are the guy from the meme, the meme about FastAPI.
+
+00:05:16 Are you serious?
+
+00:05:17 Yeah, you know, like suddenly that is super important that I am the guy for the meme about FastAPI.
+
+00:05:23 Not the guy from FastAPI, the guy from the meme.
+
+00:05:26 Oh my gosh, I saw you on TikTok. It was amazing.
+
+00:05:29 It was a life achievement. I wrote a viral tweet. So yeah, nice to meet you all.
+
+00:05:34 You know what? Sometimes your moment in the sun is not the one you expected.
+
+00:05:37 No, congratulations on how good FastAPI is.
+
+00:05:40 On the tweet.
+
+00:05:41 Exactly. You really nailed it. Patrick, welcome to the show.
+
+00:05:45 Nice to be here. Yeah, I'm Patrick.
+
+00:05:47 I guess the main thing I'm kind of known for in the community is like this library called
+
+00:05:50 Strawberry, which is similar to FastAPI, but instead of REST is for GraphQL.
+
+00:05:55 Other than that, I help organize PyCon Italy and I used to also do EuroPython as well,
+
+00:06:00 but I stopped because of way too many things.
+
+00:06:03 Yeah, that's pretty much me.
+
+00:06:05 How do you see GraphQL these days?
+
+00:06:07 Is it still popular?
+
+00:06:08 I think it's mostly popular in the enterprises, unfortunately.
+
+00:06:12 I'm a bit, to be honest, I'm a bit annoyed about the companies that do tooling around
+
+00:06:16 GraphQL because I don't know, I feel like they're not really pushing it forward.
They're just,
+
+00:06:20 I don't know, trying to work with enterprises and that's it. Or maybe people think to AI.
+
+00:06:25 Yeah. It feels a little bit like the SOAP, WSDL, XML, modern version. Savannah.
+
+00:06:33 Yeah.
+
+00:06:34 You like tapping out of being an organizer for EuroPython is like, you know, the classic
+
+00:06:40 open source oversubscribed doing all the things very relatable.
+
+00:06:44 Yeah.
+
+00:06:45 Yeah.
+
+00:06:45 But yeah, I'm Savannah.
+
+00:06:46 What can I say?
+
+00:06:47 I am on the Python Steering Council for 2026, which is very exciting.
+
+00:06:52 Congratulations.
+
+00:06:53 Oh, thank you.
+
+00:06:53 I am also the release manager for the upcoming version of Python, Python 3.16.
+
+00:06:59 And so that'll kick off later this year, which is really cool and very exciting.
+
+00:07:04 I work on CPython stuff, the JIT, argparse, basically whatever needs help.
+
+00:07:09 It's kind of where you'll find me.
+
+00:07:09 Awesome.
+
+00:07:10 Congratulations on the Steering Council.
+
+00:07:12 And yeah, that's a lot of cool stuff.
+
+00:07:15 Hopefully we don't get a Python 4.0 right after 3.16 because then your job will never end, is what I've learned.
+
+00:07:23 Yeah, yeah.
+
+00:07:23 Benjamin Peterson, Python 2.7 forever kind of situation.
+
+00:07:28 Yeah, yeah.
+
+00:07:29 I mean, release management is still, I mean, it's still quite a commitment.
+
+00:07:33 It's like seven-ish years when you think about all the staggered releases
+
+00:07:36 because you're a release manager for two and then you have the five-year maintenance cycle.
+
+00:07:41 So yeah, it's Python forever is what I always say.
+
+00:07:45 Yeah, it's probably not a fad.
+
+00:07:47 It's probably going to stick around, this Python thing.
+
+00:07:50 No, that's awesome.
+
+00:07:51 Congratulations.
+
+00:07:51 Also, cool with argparse.
+
+00:07:54 I feel like that's making a strong comeback now that we have these AI things that can just put stuff together for us instead of like, oh, I need to depend on this library and that library.
+
+00:08:04 Like, I just need to take a few arguments and have a little help text.
+
+00:08:08 And it's like, well, you've already got this built-in thing.
+
+00:08:10 Oh, who knew?
+
+00:08:11 You know, people are like, oh, I didn't even know.
+
+00:08:12 I thought I used typer or click or something, right?
+
+00:08:15 There's, you know, the typers and clicks of the world.
+
+00:08:18 But sometimes you just want the simplest thing.
+
+00:08:20 And argparse is pretty great at that.
+
+00:08:22 Although it has many quirks that are probably and most definitely unfixable at this point.
+
+00:08:28 Because bugs are features when you have things that have been around as long as Python.
+
+00:08:32 But yeah, no, I mean, AI loves to write Python.
+
+00:08:35 I think it's like the language used the most, AI-generated code.
+
+00:08:39 I'll just say we live in weird times, very weird times.
+
+00:08:42 I would love a precedented time at some time.
+
+00:08:45 Exactly.
+
+00:08:46 Yeah.
+
+00:08:46 Can we just get the boring times?
+
+00:08:48 No, nothing interesting, please.
+
+00:08:49 What I said about GraphQL may sound like a bit of a smash, but I didn't mean it in a negative,
+
+00:08:56 super negative way anyway.
+
+00:08:57 Like it used to be all the enterprises were all about SOAP and WSDL and like describing
+
+00:09:02 your tooling.
+
+00:09:03 Please don't write me.
+
+00:09:04 I'm not trying to bash on your technology.
+
+00:09:07 All right.
+
+00:09:08 Jonathan, also welcome.
+
+00:09:10 Hi.
+
+00:09:10 Yeah, I'm not nearly as famous as everyone else in this call.
+
+00:09:14 I'm more infamous internally at FastAPI Cloud, I would say, for a bunch of things.
+
+00:09:21 I've heard of emojis or something along those lines.
+
+00:09:23 One meme away.
+
+00:09:24 You're just one meme away.
+
+00:09:25 Just one meme away.
+
+00:09:27 Yeah, that's true.
+
+00:09:28 We keep piling them up internally.
+
+00:09:30 But yeah, I used to work with Patrick together for years, also on Craftia.
+
+00:09:34 Same library as him.
+
+00:09:35 That's how I know him.
+
+00:09:36 And that's why I, well, made a weird sound when you said SOAP.
+
+00:09:41 I've been with FastAPI Cloud since EuroPython, actually, the last one.
+
+00:09:45 I promised Sebastian I would implement server-sent events in FastAPI,
+
+00:09:48 and I haven't started with it yet at all, but somehow I'm still here.
+
+00:09:54 So that's great.
+
+00:09:55 Well, actually, yeah, and it was actually like a sneak peek, I guess.
+
+00:10:00 We already started having a bunch of chats and discussing what we do.
+
+00:10:04 Should we do it here? Should we do it there?
+
+00:10:06 What should we do?
+
+00:10:07 So like, yeah, it's something that is coming to FastAPI probably soon-ish.
+
+00:10:11 Like there was a lot of things that needed to happen before that.
+
+00:10:14 Like Patrick is slightly smiley, like, oh no, this is pressure.
+
+00:10:19 There were some things that needed to happen in FastAPI, as you know, dropping support for
+
+00:10:23 Pydantic version one or things like that, that just made the internal code so complex.
+
+00:10:28 And now that it's over, we can actually work more on improving performance, adding features
+
+00:10:33 and things like that.
+
+00:10:34 I definitely want to dive into how FastAPI Cloud has sort of influenced the whole FastAPI side of things.
+
+00:10:43 But I was made aware that there is, in fact, an entire website, an entire website dedicated to the meme.
+
+00:10:51 Yeah. And out in the audience, we get, hey, everyone, is that the guy from the meme?
+
+00:10:54 And meme is greater than Nobel Prize.
+
+00:10:57 So, you know what?
+
+00:10:59 It may be true.
+
+00:11:00 It may be true.
+
+00:11:01 I recognize the person saying, this is the guy from the meme.
+
+00:11:04 He might be my husband.
+
+00:11:09 Incredible.
+
+00:11:11 Incredible.
+
+00:11:12 All right, well, let's start with FastAPI Cloud, and then we'll bring it back around to the FastAPI.
+
+00:11:19 Let's talk origin story.
+
+00:11:21 So what is this FastAPI Cloud?
+
+00:11:23 Nice.
+
+00:11:24 So if you were looking at the FastAPI Labs website,
+
+00:11:27 that doesn't really show that much.
+
+00:11:29 If you click on the join the waiting list that takes you to the website for FastAPI Cloud,
+
+00:11:34 there we can see this is what we are building.
+
+00:11:36 This is the thing that we are doing.
+
+00:11:39 It's actually super simple.
+
+00:11:40 The funny thing is that the pitch, the explanation of the product is so short.
+
+00:11:44 So it's one command.
+
+00:11:45 It's FastAPI Deploy.
+
+00:11:46 You have a FastAPI app, you just hit FastAPI Deploy, and then it's on the cloud.
+
+00:11:51 We take care of everything.
+
+00:11:52 We build a thing, deploy it, handle HTTPS, auto-scaling, all this stuff.
+
+00:11:57 and then you can just like focus on building the application,
+
+00:12:01 building apps.
+
+00:12:01 The funny thing is that it's super short to explain,
+
+00:12:03 but then building it is so complicated.
+
+00:12:06 I'm glad it's so short.
+
+00:12:07 So thanks for being here.
+
+00:12:08 That was a great show, y'all.
+
+00:12:09 Well, no, just kidding.
+
+00:12:11 I think it's a little bit like Jupyter Notebooks in that sense that like you all are taking one for the team
+
+00:12:18 so that other people can have a simple experience.
+
+00:12:22 Whereas, you know, it's like those Jupyter folks, they write tons of TypeScript and do all sorts of things
+
+00:12:27 that nobody wants to necessarily do in the data science space
+
+00:12:31 so that you can just drag your widgets around.
+
+00:12:33 You know what I mean?
+
+00:12:33 I think that is a great analogy.
+
+00:12:35 I feel like the deployment space, it's a bit of a mixed bag.
+
+00:12:39 And I've been really frustrated to the point such that I wrote a book about it,
+
+00:12:45 that I think about an alternative, that I think over the last five plus years,
+
+00:12:51 it's just trended towards a little more complex, a little more complex.
+
+00:12:56 Could we just add one of these things?
+
+00:12:57 And oh, now we got these three.
+
+00:12:58 We need one more thing to like make sure those things are doing, you know what I mean?
+
+00:13:02 And it's just like, wow, why are there 200 choices in my console to use this?
+
+00:13:07 Which is like kind of funny, right?
+
+00:13:08 Because I feel like a lot of these companies started with this, like,
+
+00:13:12 I don't want to understand all the ins and outs of all the infrastructure
+
+00:13:15 that comes with the cloud service provider.
+
+00:13:17 And that's really complicated to understand because I'm an app dev
+
+00:13:19 and I don't know anything about, you know, whatever, right?
+
+00:13:22 Now we're like, I don't know, kind of slowly accumulating
+
+00:13:25 complexity. But I think one of the cool things about what we're building, and like, I've worked
+
+00:13:30 on cloud tooling before, is like, this is like just bespoke for Python developers. And I think that's like
+
+00:13:35 quite like unique in that like we are really trying to like bring the bleeding edge and like
+
+00:13:41 all the new tooling that people are using and making sure that we play well with like uv and
+
+00:13:45 like, I think like there's a lot of thought and care put into that by the team. That's a super good
+
+00:13:52 point. I mean, I remember Azure came out with like, here's your platform as a service. You just
+
+00:13:57 upload your web app and we'll just take it and go. And now that thing is so complicated along with
+
+00:14:01 many, many others, right? It's not just them. It's you've got AWS, you've got Vercel.
There's
+
+00:14:06 lots of things we could point at for, wow, there's a lot of options here, you know?
+
+00:14:09 And then there are a lot of tools and like, you know, like many tools and many companies are also
+
+00:14:14 like doing a great job at many of the things that they are doing. But in many cases, it's just so
+
+00:14:18 complex, it's so complicated.
+
+00:14:19 You know, like I was, I have always been so, what, so adamant, I think is the word, to
+
+00:14:25 just teaching people how to use the tools.
+
+00:14:28 I think I have the most documentation about how to deploy things on your own than any other
+
+00:14:33 framework.
+
+00:14:34 I have so much information.
+
+00:14:35 I hear that all the time from people.
+
+00:14:37 They say one of the reasons they chose FastAPI is because of how clear the documentation
+
+00:14:41 was, you know?
+
+00:14:41 And then the thing is, you know, like just learning all those concepts
+
+00:14:45 and learning all the stuff that needs to be learned
+
+00:14:47 just to deploy something, and then you barely have the minimum.
+
+00:14:51 It's like, this is just too much.
+
+00:14:53 It's too much complexity.
+
+00:14:54 I think for me, I guess personally, my analogy is that FastAPI Cloud
+
+00:15:00 is the equivalent of what FastAPI is to building web APIs and backend.
+
+00:15:06 You could do the same with any other framework.
+
+00:15:08 You could validate data.
+
+00:15:09 You could generate OpenAPI.
+
+00:15:11 You could have automatic docs. But you will probably have to do a lot of the wiring yourself and making sure that it's actually correct and that it doesn't explode, all that stuff.
+
+00:15:19 That is, you know, like we are trying to do a lot of that work for the final users.
+
+00:15:25 Yeah, and I think it's great.
+ +00:15:27 I think it's really nice to just provide this on-ramp because, as you said at the opening, when I asked, you know, what the origin story is just FastAPI deploy. + +00:15:37 That solves so many stories. + +00:15:38 But I'm sure behind the scenes, what happens is just about as simple as that. + +00:15:44 Oh my gosh. + +00:15:45 About that. + +00:15:48 Some of us don't even get to write Python anymore to make all of this happen. + +00:15:52 So speaking about taking one for the team. + +00:15:56 Yeah, that is taking one for a team, right? + +00:15:57 It is. + +00:16:00 This portion of Talk Python To Me is brought to you by us. + +00:16:02 I'm thrilled to announce a brand new app built for developers created by yours truly. + +00:16:07 It's called Command Book. + +00:16:09 You know that thing you do every morning? + +00:16:11 Open up six terminal tabs, CD into this directory, activate that virtual environment, + +00:16:16 run the server with --reload. + +00:16:18 Now, CD somewhere else, start the background worker, another tab for Docker, another one to tail production logs. + +00:16:24 Every tab just says Python, Python, Python, Docker tail. + +00:16:28 And you're clicking through them going, which Python was that again? + +00:16:31 Where my app is running? + +00:16:32 Then sometime later, your dev server silently dies because it tried to reload + +00:16:36 while you're in the middle of a code edit, unmatched brace, a half-written import, or something. + +00:16:42 Now you're hunting through tabs to figure out which process crashed and how to restart it. + +00:16:46 My app, CommandBook, gives all of these long-running commands a permanent home. + +00:16:51 You save a command once, the working directory, the environment, + +00:16:54 three commands like git pull, and from then on, you just click run. + +00:16:58 You can even group commands together to start and stop everything for a project + +00:17:01 with a single click. 
+ +00:17:02 It also has what I call honey badger mode, auto restart on crash. + +00:17:07 so when your dev server goes down mid-reload, Command Book just brings it right back up + +00:17:12 and does so over and over until the code is fixed. + +00:17:15 It also detects URLs from your output so you're never scrolling through thousands of lines of logs + +00:17:19 just to figure out how to reopen your web app. + +00:17:22 And it shows you uptime, memory usage, and all sorts of cool things about your process. + +00:17:26 The whole thing is a native macOS app. + +00:17:28 No Electron, no Chromium, just 21 megs. + +00:17:31 And it comes with a full CLI so anything you've configured in the UI, + +00:17:35 you can fire off from your terminal with just a single command. + +00:17:38 Right now, it's macOS only, but if there's enough interest, + +00:17:41 I'll build a Windows version too, so let me know. + +00:17:44 Please check it out at talkpython.fm/commandbook app. + +00:17:49 Download it for free, level up your developer workflow. + +00:17:52 The link is in your podcast player's show notes. + +00:17:54 That's talkpython.fm/commandbook. + +00:17:56 I really hope you enjoy this new app that I built. + +00:18:00 Let's save the internals for a little bit later. + +00:18:02 Maybe what we could do right now, Maybe we could do a bit of a walkthrough of just kind of what it's like to set up an app from scratch, right? + +00:18:12 Nice. + +00:18:12 I see that uv is here, which is, I've been certainly an advocate for uv in all sorts of deployment, + +00:18:20 but especially when you have like repeated build type of scenarios for like Docker, + +00:18:26 Docker Compose or Kubernetes or whatever, uv makes that stuff so much faster and so on. + +00:18:31 So who would like to be my guide that just kind of talks us through what it means to set up a new project here? 
+
+00:18:37 I mean, there is like this really nice command that Savannah built, just FastAPI new,
+
+00:18:41 which I think is something like, I don't know, like super helpful.
+
+00:18:45 What does FastAPI new do?
+
+00:18:46 Like, is that kind of a cookie cutter-esque experience or what is it?
+
+00:18:51 Yes, exactly.
+
+00:18:52 At the moment, it scaffolds a super basic FastAPI application using uv.
+
+00:18:57 It also installs dependencies, creates a folder, everything that you need.
+
+00:19:00 In the future, I think we're going to plan support for templates
+
+00:19:03 so you can build multiple kinds of things as well.
+
+00:19:06 But for now, it's just basically just uv FastAPI new,
+
+00:19:09 sorry, uvx FastAPI new, and then that scaffolds the project for you.
+
+00:19:12 I don't know if you want to try it live or...
+
+00:19:14 No, go ahead.
+
+00:19:15 Just, I would think it might disrupt you.
+
+00:19:18 Just let's talk us through it.
+
+00:19:19 It could work.
+
+00:19:20 I'm just going to put that out there.
+
+00:19:21 I'll tell you the most insane, like let's do that live on the podcast experience.
+
+00:19:25 I'm pretty sure, yeah, this is definitely the most insane.
+
+00:19:28 I had Matthew Rocklin on from Coiled, and those guys are all about like, hey, we're going to scale up like a bunch of available servers for you, right?
+
+00:19:36 So that you can do your data science.
+
+00:19:38 Like I want to do some ML thing, and it needs 500 servers.
+
+00:19:41 So during the podcast, he's, oh, let me just spin up 2,000 EC2 instances.
+
+00:19:45 Hold on.
+
+00:19:47 And then we ran some code on it during the show.
+
+00:19:49 And he's like, oh, let's try that on ARM.
+
+00:19:50 And then spin up another 2,000 on ARM Linux machines.
+
+00:19:53 I'm like, okay, that's nuts.
+
+00:19:55 But let's just.
+
+00:19:56 That's a lot of power.
+
+00:19:57 So I was impressed, but Patrick, sorry, I kind of derailed there.
+
+00:20:02 Let's talk through it.
+
+00:20:03 Yeah, so you do uvx FastAPI app, FastAPI new, then you specify the name of the application.
+
+00:20:08 And that's almost there.
+
+00:20:10 You just need one more command to deploy, which is FastAPI deploy.
+
+00:20:13 The first time it's going to ask you to log in or join the waiting list if you haven't been invited yet.
+
+00:20:17 It's still in beta.
+
+00:20:19 And then you follow the steps.
+
+00:20:21 So like FastAPI deploy, log in, decide the team.
+
+00:20:25 If you have multiple teams, decide the application name,
+
+00:20:28 and then you wait a few seconds and the application is going to be live.
+
+00:20:31 And just to be clear, FastAPI new is not required if you already have a FastAPI app.
+
+00:20:36 If you've already written your own code and you have your application,
+
+00:20:40 you can just go right into logging in and deploying.
+
+00:20:43 This is just so that if you're starting something new,
+
+00:20:46 you don't have to do any thinking about all the right things that need to be there.
+
+00:20:50 So this is more of a greenfield application.
+
+00:20:52 I'm bootstrapping a project.
+
+00:20:54 Right, right, because you want to have the best structure.
+
+00:20:57 Now, it uses uv.
+
+00:20:59 It is not required.
+
+00:21:00 Yeah, I was going to say, do I have to use the uv project management type of thing?
+
+00:21:04 Do I have to use the uv.lock files and uv add uv sync?
+
+00:21:08 Can I do requirements.txt?
+
+00:21:09 What's the story there?
+
+00:21:10 Yes, so we support uv with uv lock.
+
+00:21:13 We also support the, forget the name, the other, the PyLock file.
+
+00:21:17 And we also support plain requirements.txt.
+
+00:21:19 And maybe something else, I don't know, Jonathan, can you?
+
+00:21:22 PyLock's pretty new, right?
+
+00:21:23 I think Brett Cannon just got that out pretty recently, right?
+
+00:21:26 Brett was pretty excited.
+
+00:21:27 I know.
+
+00:21:29 Implemented that.
+
+00:21:30 Oh, was he?
+
+00:21:30 Okay, I'm sure he was.
+
+00:21:31 That's awesome.
+
+00:21:32 He put years of work into that.
+
+00:21:33 And he also said that one of the motivations was also like, you know, like cloud providers.
+
+00:21:38 So it's like, yes.
+
+00:21:40 The other thing is like, you know, if you use other different package managers, if they use the standard pyproject.toml format, that will also be supported.
+
+00:21:49 That means that, you know, like if you use PDM or if you use Poetry with one of the recent versions, like that will work.
+
+00:21:56 If you use a very old version of Poetry or like you use some other strange package manager or something, that will probably be problematic.
+
+00:22:03 But for like most of the use cases that use the standard package formats, it will just work.
+
+00:22:08 And if you use uv, then like you're going to have the best experience because we are fans of uv and Astral.
+
+00:22:14 They've definitely put a dent in the way that sort of Python gets started and making that a lot easier.
+
+00:22:19 So it totally makes sense.
+
+00:22:21 And also, I noticed, speaking of uv, that there's, at least in the recommended way, or the way in the docs, let's say,
+
+00:22:29 it doesn't say, here's how you install FastAPI.
+
+00:22:32 You just, here's how you run FastAPI new, leveraging uv, which then will silently install and manage.
+
+00:22:42 All right, that's pretty neat.
+
+00:22:43 That helps you guys tell a simpler story, right?
+
+00:22:47 Instead of, here's how you create the virtual environment to install our thing and so on, you know?
+
+00:22:51 The idea is to make it like, as I was saying, just super simple for people just to start from scratch.
+
+00:22:56 Like no idea how to create an app, how to start, how to create an environment.
+
+00:23:00 It's just you run this command and you're off to go.
+
+00:23:03 Off to the races, I'm mixing sayings.
+
+00:23:06 Anyway, that's what Colombians do.
+
+00:23:09 But then if you already have an app, you have like, you know, like anything with FastAPI standard installed, then like that also just works.
+
+00:23:18 And Savannah, you pointed out that it doesn't have to be a new project.
+
+00:23:21 If you want to start from an existing one, that's totally fine.
+
+00:23:24 But what do I got to do if I'm starting from, if I'm migrating an existing one?
+
+00:23:29 Like how easy or hard is this?
+
+00:23:30 I have some like legacy project demo apps I've built at other companies I've worked with that have used FastAPI.
+
+00:23:37 And I literally just ran like FastAPI login and then FastAPI deploy and it just worked, which felt really magical.
+
+00:23:44 Right. Like I think that's like, I don't know, like having worked on cloud products for quite a while, like I think one of the biggest gaps is like the just I don't know, like the disparity between like my local dev environment and what is actually like lives up in the cloud somewhere.
+
+00:23:59 And so being able to just run one command and having the project as it exists on my machine go and work somewhere without having to think about like the infrastructure.
+
+00:24:08 And of course, like, you know, we want to be like amenable to folks who do want a little bit, you know, like higher touch.
+
+00:24:15 But we also want to work for people who are like learning FastAPI and Python, right?
+
+00:24:19 Like educators and people that are teaching Python.
+
+00:24:22 I think this is like something that you've had some interest in as well from those folks.
+
+00:24:26 Yeah, I was just listening to the Teaching Python podcast folks just the other day and thinking, you know, like this, when I look at this, I know this is not necessarily your focus, but certainly people who are trying to teach a class, be it college class or high school class or whatever.
+
+00:24:44 And if you build anything on the web, the next question is, this is cool.
+
+00:24:49 How do I share it with people?
+
+00:24:50 And then they're like, oh, no.
+
+00:24:52 Oh, no.
+
+00:24:53 Hold on.
+
+00:24:54 Like coding boot camps, right?
+
+00:24:55 Like if you're teaching someone how to write Python or how to build an API with FastAPI,
+
+00:25:02 like actually setting up the environment for them to deploy is not part of it, right?
+
+00:25:06 Like that's not actually part of the curriculum.
+
+00:25:08 It's like this peripheral thing that ends up eating up a bunch of the educator's time or
+
+00:25:12 the student's time trying to understand both like how to write code and then also understand
+
+00:25:16 cloud stuff.
+
+00:25:17 And that's like a lot to ask people when they're fresh out of the gate.
+
+00:25:20 I feel the same way about like tutorials and stuff at conferences.
+
+00:25:24 Yeah, totally.
+
+00:25:25 Yeah. Or training sessions. If you're doing like corporate training or like, they're all like,
+
+00:25:29 Oh, well, let's get everybody's machine working. There goes an hour and whatever. But yeah,
+
+00:25:36 if you can just say, look, I think when you're either, when you're trying to learn something,
+
+00:25:40 be it through school or on your own or through these like more structured ways,
+
+00:25:46 like bootcamps and training and so on. I think if it's not the main purpose,
+
+00:25:51 I feel so often there's like, we're going to do 20 steps for four hours before you get any sort of
+
+00:25:56 reward of what you've done. And if you can go, okay, do you have it running? Okay, now you run
+
+00:26:01 this command. Look, now it's on the internet. Like, oh, wait, awesome. I got an app on the internet.
+
+00:26:05 Everybody look at me. You know what I mean? And I think shortening that cycle to where people can
+
+00:26:10 have that aha moment. And then later they can dive into like, well, how is it really working? And what
+
+00:26:14 do we really need to understand?
But that quick iteration cycle, especially in the early parts of
+
+00:26:21 learning new tech. It's really important. But also, you know, like down the line as well, I think,
+
+00:26:25 like, I don't know, there are so many things that I have been wanting to build and I don't,
+
+00:26:29 but I didn't because it was just so complex to deploy stuff. You know, like knowing,
+
+00:26:34 knowing how to do the whole thing, how to set up the clusters, the machines, install the Linux
+
+00:26:40 systems, deploy the cluster, whatever, like all that stuff, deploy the things,
+
+00:26:44 handling load balancers and HTTPS. I'm like, you know, like I know how to do that. I built one of
+
+00:26:49 the most popular websites teaching how to use Docker Swarm, which was like the contender
+
+00:26:54 before Kubernetes won everything.
+
+00:26:56 I like it.
+
+00:26:57 But still, it's just so complicated, like doing all those steps that are like, yeah, no, I'll
+
+00:27:02 just not do it.
+
+00:27:04 Like some other day.
+
+00:27:05 Now I can just like play around and do random stuff and just like deploy when it just works.
+
+00:27:09 It is, I really like that.
+
+00:27:11 I guess like coming back to that, like taking one for the team point earlier, like I feel
+
+00:27:16 like building Python tooling. It's kind of like taking one for the team sometimes because you have
+
+00:27:20 these folks that are like, you know, brand new to Python. Like Python is an extremely approachable
+
+00:27:26 language for people who are new to writing code. But then, you know, we also want to make FastAPI
+
+00:27:30 cloud work for someone that's building like an enterprise grade application, right? And so like,
+
+00:27:35 like pretty wide spectrum of folks with like a million different use cases and different types
+
+00:27:41 of applications they want to deploy with different constraints and like security stuff.
+ +00:27:46 And like, so yeah, I think, I don't know, maybe that's just like Python tooling. + +00:27:50 It's a lot of work, I guess, to like build something that works for the masses. + +00:27:54 Yeah, well, it's certainly tough to make something that feels simple, + +00:27:57 but it's not overly simplistic, you know, that can actually solve the problems. + +00:28:01 Has the right knobs for the right users too, right? + +00:28:04 I would argue we're not only trying to do it simple and easy. + +00:28:08 I feel like we're choosing a particular flavor of simple, which is... + +00:28:13 We have this discussion a few times. + +00:28:15 It's like, if you make a cloud, how do we make it feel Pythonic? + +00:28:18 What does that mean in a cloud setting? + +00:28:20 We talk about Pythonic libraries, Pythonic coding style in the community a lot. + +00:28:25 And now we kind of try to transfer that flavor, that feeling to the cloud + +00:28:29 and make everything around that feel just like we want our libraries to feel. + +00:28:34 So you feel at home as a Python developer and it just feels right. + +00:28:37 So that's extra step on top of making it simple. + +00:28:40 And we discuss that a lot. + +00:28:41 That's how I feel about it. + +00:28:42 I love it. + +00:28:43 I think it's one of the coolest things about this team. + +00:28:47 Like, you know, like people are being able to hear a few of us. + +00:28:50 There's like, there are like a bunch of others, but like that each one of us is so passionate + +00:28:55 about the things that we are working on. + +00:28:57 So like, you know, like each one of us is trying to make the best out of the things that + +00:29:01 we are building. + +00:29:02 And then like, we are so passionate about the thing that we care about and that we are building. + +00:29:07 that I think that ends up being an amazing result. + +00:29:11 For example, the CLI. 
+
+00:29:12 We wanted to have some specific, you know, like behavior,
+
+00:29:17 some look and feel.
+
+00:29:18 And like we wanted to be able to have like the best kind of CLIs.
+
+00:29:22 So Patrick went ahead and built this bunch of tooling
+
+00:29:25 that we needed to be able to have it and like made it open source and everything.
+
+00:29:29 So we could have this great experience when working with CLIs.
+
+00:29:33 Jonathan recently was doing so much stuff about the something that caches and handling security,
+
+00:29:38 making sure that everything was super secure, super fast, super snappy.
+
+00:29:42 You know, like Alejandra is super careful about all the UI.
+
+00:29:46 Martin is super careful about all the infra.
+
+00:29:49 You know, it's like this passionateness, which is a word I just made up.
+
+00:29:57 This, Alejandra goes and says, like, this thing doesn't have the proper margins.
+
+00:30:01 We need to increase this a little bit.
+
+00:30:02 I don't like it.
+
+00:30:04 She just goes and fixes it.
+
+00:30:05 The same with Marvin.
+
+00:30:06 He says, like, we need to have, like, this sort of thing in infrastructure.
+
+00:30:09 And, like, just comes and tells me, hey, we are doing this.
+
+00:30:12 And he's like, yes, sir.
+
+00:30:13 The same with, like, Yurii, for example,
+
+00:30:16 that is mainly focused on the open source, is constantly looking at all the discussions, PRs, conversations,
+
+00:30:22 making sure that everything that we do, that doesn't break anything.
+
+00:30:24 That's why, you know, like, there have been, like, recently way more releases of FastAPI and friends, of the open source projects,
+
+00:30:31 and very fast bug fixes, very fast responses to handle everything for the community.
+
+00:30:37 Now we actually have people that is paying attention constantly
+
+00:30:40 to what is happening, what are the things that we have to do,
+
+00:30:43 and that really care about that part as well.
+
+00:30:46 So I think this extreme care about what we are doing.
+
+00:30:50 You know, like Savannah is making Python.
+
+00:30:53 This detail that each one of us cares so, so much about each one of the things that we build,
+
+00:30:59 making sure that the product is actually amazing.
+
+00:31:01 It's as good as it can be, and we can all feel at home when...
+
+00:31:06 I get so excited for talking about it because I really enjoy the end result of the product
+
+00:31:12 and of being able to use it.
+
+00:31:13 I would use it in the end.
+
+00:31:15 I would use to work with it.
+
+00:31:16 It's super simple.
+
+00:31:17 Yeah, that's awesome.
+
+00:31:18 Hey, let me adjust your mic real quick.
+
+00:31:20 I think it was like ducking, ducking out a little bit.
+
+00:31:23 We just went through a lot, a lot of content and a lot of sweating
+
+00:31:26 because your microphone went through like six different stages of weirdness.
+
+00:31:31 I think that really leads to like something I wanted to talk about is just what impact has this had on FastAPI?
+
+00:31:39 And before you jump in and answer that question, everyone, there's especially I think with Astral,
+
+00:31:44 because they've had so much success, there's been an undercurrent of concern of like,
+
+00:31:49 oh, my gosh, commercialism is getting into our open source.
+
+00:31:53 And what if it pollutes it and causes these negative aspects?
+
+00:31:57 But just hearing all of the energy around FastAPI with so many people,
+
+00:32:03 because of FastAPI Cloud, that's super neat.
+
+00:32:05 So I wanted to throw out to you all, how has this building FastAPI Cloud and the existence of FastAPI Cloud
+
+00:32:12 been giving back to FastAPI, I guess?
+
+00:32:14 I'm waiting to see if someone will speak first.
+
+00:32:18 I'm always the one that is speaking the most.
+
+00:32:22 I mean, it might be your project.
+
+00:32:24 Like, you may have started the project.
+
+00:32:25 Yeah, maybe so.
+
+00:32:27 Like, last year, I had, like, a few keynotes in some PyCons in different places.
+
+00:32:31 And, like, one of the key points that I wanted to bring was this idea that I'm trying to show that, in many cases, people worry about the bus factor.
+
+00:32:41 And the bus factor is just this idea.
+
+00:32:43 Yes, yes, I've heard this, yes.
+
+00:32:45 Yeah, you know, like, the bus factor is the idea that, oh, what happens if, like, there's one person doing this work?
+
+00:32:51 What happens if a bus runs over this person?
+
+00:32:54 And there's so much worry about this bus factor.
+
+00:32:58 It's sort of a morbid analogy, but I understand, right?
+
+00:33:02 Like, what will happen to the open source project if the maintainer vanishes for some reason, right?
+
+00:33:07 Exactly.
+
+00:33:08 But, you know, like, it also applies to projects and to many other different things.
+
+00:33:11 But what I think is that it's a disproportionate amount of attention to this detail of the bus factor.
+
+00:33:19 And I think every time people talk about the bus factor, you know, like one of my points in what I was trying to say in these talks was I would like people to think about the bus ticket factor.
+
+00:33:30 Who is paying for those tickets?
+
+00:33:32 It doesn't matter how big is the team.
+
+00:33:34 You know, like you have seen Google, Amazon, Meta, all the big ones.
+
+00:33:38 They don't have a small bus factor.
+
+00:33:40 They have a lot of people in their payroll and still they finish products.
+
+00:33:45 They just cancel them.
+
+00:33:46 Open source or private or whatever
+
+00:33:49 is not the main factor defining the success of a project,
+
+00:33:55 being it commercial or open source of any type, is not really how many people are behind it.
+
+00:34:00 It's more of what is the value that whoever is putting the effort to keep it alive
+
+00:34:05 is getting from putting all that effort.
+
+00:34:07 It could be just satisfaction.
+
+00:34:09 It could be like open source, like, oh, I feel so good that I'm contributing to society.
+
+00:34:13 And that is valid.
+
+00:34:14 It doesn't pay the rent, but it's still valid.
+
+00:34:17 It might last for a while.
+
+00:34:18 But then also like, you know, like when you see like there are so many Python projects, so many Python, so many open source projects that can do well or can do bad.
+
+00:34:26 And it doesn't really depend on how many people they have.
+
+00:34:28 And when you are using a project, when you're using an open source project or when you are using a product of any type, I will encourage you to think about what is the bus ticket factor of this project?
+
+00:34:41 What are the things that whoever is building this is receiving in exchange for giving it away?
+
+00:34:47 So like, you know, like what are they expecting to sell you at some point?
+
+00:34:52 Or what are they receiving in exchange?
+
+00:34:55 You know, for example, Bun, the JavaScript runtime.
+
+00:34:58 Like it was like, we don't know what they're going to sell.
+
+00:35:01 But now, you know, Claude and Anthropic really want to have like this thing keep working because they are using it internally.
+
+00:35:07 So you can say like, OK, I'm going to use it.
+
+00:35:09 I'm going to use it for free.
+
+00:35:10 I know that what they receive for me using is just like that they just really want it.
+
+00:35:15 So I can just like, whenever you are using Bun, you are getting, now you are getting free services
+
+00:35:20 from Anthropic, that's it.
+
+00:35:21 But you know, like every time you are using a project,
+
+00:35:24 you can think about what are people receiving in exchange
+
+00:35:27 for giving this away to me?
+
+00:35:28 This is like the thing that I would like people to think about, you know, like also like
+
+00:35:34 how can they give back?
+
+00:35:35 Maybe they can actually contribute to that community
+
+00:35:38 or to that project.
+
+00:35:38 There are many ways and in many cases, the thing that is needed the most is just like help
+
+00:35:43 and work, just answering questions and issues.
+
+00:35:47 This portion of Talk Python To Me is brought to you by us. I'm excited to talk about my first solo
+
+00:35:52 book, Talk Python in Production. It's an inside look at how we host all the Talk Python sites,
+
+00:35:58 APIs, mobile apps, and way more. Here's the thing: I believe most hosting stories sold to developers
+
+00:36:03 and data scientists are way overcomplicated and overpriced. You've heard me say you're not Google,
+
+00:36:09 you're not Netflix, so you shouldn't run your infrastructure the way they do. But if not that,
+
+00:36:13 then what? This book is both a blueprint for what I chose for Talk Python and a story arc of 10 years
+
+00:36:20 of running my own infrastructure from a complete newbie, apprehensive to Linux, to some pretty
+
+00:36:25 neat infrastructure-as-code DevOps. It covers Docker, Nginx, Let's Encrypt, self-hosted analytics and
+
+00:36:32 monitoring, CDN setup, framework migrations, and a whole philosophy that I've termed stack native,
+
+00:36:38 keeping things streamlined, powerful, and free of cloud lock-in. And it's more than just your
+
+00:36:43 standard tech book. It comes with code and figure galleries on GitHub, a discussion forum, and
+
+00:36:48 something unique, over an hour of audio reader's briefs, short conversations that bookend each
+
+00:36:54 chapter to prime your focus or broaden your takeaways. Oh, and 0% of this book was written
+
+00:37:00 by AI. Every word is mine, written over the course of nine months, for better or worse.
+
+00:37:04 I've made the first third of the book available for free online. After that, you can grab the DRM
+
+00:37:09 free EPUB and Kindle editions. And I'm working on a paperback edition as well. Please check it out
+
+00:37:15 at talkpython.fm/DevOps, or just click book in the nav bar on the website.
It's a great way to
+
+00:37:20 support the podcast. And I hope it changes a bit how you think about running your apps in production.
+
+00:37:26 Kind of related to what you're saying, I think one of the angles that I really appreciate about
+
+00:37:30 the way we think about FastAPI and FastAPI Cloud is like where like a lot of our team was involved
+
+00:37:36 in open source before coming to work at FastAPI Cloud on various projects around the Python
+
+00:37:41 ecosystem, outside of Python.
+
+00:37:42 And I think all of us have deep appreciation and understanding of the value of open source
+
+00:37:48 and really, really try and build in a way that is like, I mean, Sebastian, you've talked
+
+00:37:53 about this a lot, but solving a real problem for folks, right?
+
+00:37:56 And so FastAPI Cloud is sort of this extension of this open source ecosystem people would
+
+00:38:01 be using.
+
+00:38:03 FastAPI Cloud may be an option.
+
+00:38:05 Maybe someone picks some other cloud for some reason.
+
+00:38:07 I don't think like, I think we're all very mindful of that.
+
+00:38:10 But like the angle that's very cool, I think, is that like, because we all work at FastAPI Cloud,
+
+00:38:15 like I know that I personally have time, more time for my open source work
+
+00:38:19 and my employer understands the value of my open source work,
+
+00:38:23 which is net positive for the open source community.
+
+00:38:25 Like I get to work on CPython sometimes and I have, you know, the bandwidth
+
+00:38:29 to go and do my steering council work or upcoming release management work.
+
+00:38:33 I understand like this sort of like, tempering, like open source, commercial, bad, all bad.
+
+00:38:39 It's not all bad.
+
+00:38:40 It's actually like really good in a lot of cases for folks to build business.
+
+00:38:43 Look at uv for an example to hold up, right?
+
+00:38:46 Astral, yeah, yeah, totally.
+
+00:38:47 Yeah, yeah.
+
+00:38:48 I think there are some really good examples of this.
+
+00:38:50 So I think like that's another angle that, I mean, I really, I get a lot of energy out of our team
+
+00:38:55 because we all, I don't have to, I don't have to fight the open source battle
+
+00:38:59 at FastAPI Cloud.
+
+00:39:01 I think that's really cool.
+
+00:39:02 I do think that's super cool as well.
+
+00:39:03 Let me put out two examples for you.
+
+00:39:05 here that I think everyone will be aware of as sort of to add to what Sebastian was saying is
+
+00:39:12 look how much Apple freaked out when Steve Jobs died and how many people work at Apple, right?
+
+00:39:17 Like that was still like, oh my gosh. But, you know, I think there's, they're hanging in there.
+
+00:39:23 They're going to be probably making it. They are not out of business. I tell you what,
+
+00:39:27 they got some of my money. That's for sure. But also, you know, look at Flask, right? Armin
+
+00:39:35 drifted away, which is totally fine. And David Lord and Pallets picked it up and kept running,
+
+00:39:41 right? Like it's still one of the most popular frameworks out there, right? So it's, I think
+
+00:39:46 the bus factor is over, overblown a bit, but also looking at the team of folks here, I think it's,
+
+00:39:51 it's even more obvious that there's a bunch of people who are on the inside, you know?
+
+00:39:54 For example, Flask, you know, like I learned so many things from Flask and like, the thing is,
+
+00:39:59 I feel like sometimes, sometimes people go and complain about the tool and say like, oh,
+
+00:40:04 this is not working for this or for that. And in many cases, it's in this insensitive way towards
+
+00:40:09 the people that are working on that. And it's like, you know, like in the end, realize that
+
+00:40:13 there's actually people behind the scenes doing the work. And like, in many cases, it's just like
+
+00:40:17 one or two people doing a lot of work in many cases, just for free.
And, you know, like, I think
+
+00:40:22 it's worth calling that out. Like all the work that David Lord does for Flask is just like so
+
+00:40:27 much work. And yeah, deserves a lot of respect. I totally agree. The other thing that I forgot to
+
+00:40:32 mention is that there are so many ideas of potential products that I could build over the years, and I
+
+00:40:37 never did, and I never started a company because I didn't have clarity of what will be a good thing
+
+00:40:43 to actually sell and will have a good alignment. The cloud product has such a good alignment with
+
+00:40:49 the open source side because as more successful FastAPI is, the more successful FastAPI cloud
+
+00:40:57 has a potential to be.
+
+00:40:59 The more people using Python effectively, the more people might end up checking out FastAPI
+
+00:41:05 and the more people might end up checking out the product.
+
+00:41:08 So if FastAPI does well, if the open source does well,
+
+00:41:11 if Python does well, that's better for the company.
+
+00:41:13 So it doesn't really depend on my personal principles
+
+00:41:17 and values or something like that.
+
+00:41:19 It's aligned with, it's financially aligned with the company.
+
+00:41:24 So it's just going to be beneficial in the end It doesn't depend on good intentions.
+
+00:41:30 And FastAPI is open source.
+
+00:41:31 It has like 7,000 forks or something.
+
+00:41:34 So if a bus runs over me, there are 7,000 forks.
+
+00:41:38 It's not going away.
+
+00:41:39 I definitely agree with you on that.
+
+00:41:40 I feel like I should maybe give a little bit of a, I'll tell a little bit of the story
+
+00:41:45 of what's going on with, where did I put it?
+
+00:41:47 I don't think I pasted it over here, is what's going on with Tailwind right now.
+
+00:41:51 And I think Tailwind is having a tough time, Tailwind CSS.
+
+00:41:55 Traffic to Tailwind is up six times year over year on npm downloads.
+
+00:42:02 But the revenue of Tailwind is down five times.
+
+00:42:07 You know, I mean, these are completely out of whack things because instead of people going
+
+00:42:11 to docs to learn about it, it's just like, well, when you go to the docs, you learn they
+
+00:42:15 also have premium offerings, right?
+
+00:42:17 And I think you guys are different because it's not just, oh, here's a little bit nicer
+
+00:42:22 of a thing, right?
+
+00:42:23 I feel like it would be a little bit as if you were selling cookie cutter templates for FastAPI apps, you know, it's like, well, the AI can make the shape of the thing that comes out of the cookie cutter, to be honest.
+
+00:42:34 But you're offering something that has ongoing value that it costs more and is more complex in other places.
+
+00:42:42 And so I think maybe just thinking about the how this just keeps the team going for FastAPI is really awesome.
+
+00:42:50 And I think it's got a nice flywheel effect there.
+
+00:42:53 What I'll do is I'll link to this, I guess, audio track.
+
+00:42:56 I don't know what I call it.
+
+00:42:57 It's a blog post that has one sentence, but a 30-minute audio you can check out
+
+00:43:01 from the guy, Adam, who's one of the founders of Tailwind,
+
+00:43:04 talking about going into this.
+
+00:43:06 It's kind of rough.
+
+00:43:07 I think I don't necessarily want to go into a deep AI,
+
+00:43:10 what it means for the industry.
+
+00:43:11 Like, let's stay focused on what you guys are doing.
+
+00:43:13 But I think it's going to be its own series.
+
+00:43:17 I mean, Stack Overflow had as many questions asked this month as they did in the first month of their existence, right?
+
+00:43:25 Three or 4,000, whereas at their peak, they were 200,000 questions a month.
+
+00:43:29 There's like real turmoil that's coming from some of these things, which is tricky.
+
+00:43:35 But I'm really excited to see you all doing this because I'm a big fan of FastAPI.
+
+00:43:40 And I think this is just sustaining and more for FastAPI, right?
+
+00:43:45 Like, what do you all think?
+
+00:43:46 That's what we hope that is going on.
+
+00:43:49 I thought about Tailwind for a second, right?
+
+00:43:52 It's not like we're immune to what happened to them.
+
+00:43:54 Like we also have a lot of documentation online.
+
+00:43:56 AI could train on that.
+
+00:43:58 And if it's good enough, it could maintain your infrastructure and stuff.
+
+00:44:01 It's just too hard at the moment.
+
+00:44:03 And there's an additional thing we're kind of selling, which is like, I guess, responsibility.
+
+00:44:08 Like you're shifting the risk from like letting your AI or your infra team maintain your infrastructure to us.
+
+00:44:15 So we're staying up at night and worry about it.
+
+00:44:18 that has a lot of value as well.
+
+00:44:20 And that's probably not going to get removed by AI.
+
+00:44:24 Here's a very common Claude Code, Cursor, whatever conversation.
+
+00:44:29 Hey, build me something with Python and needs an API.
+
+00:44:33 Okay, we built it with FastAPI.
+
+00:44:34 How do I host it?
+
+00:44:36 Right, that doesn't just, it won't build a cloud for you, right?
+
+00:44:39 It's going to recommend something out there.
+
+00:44:41 And a real natural way to how do I host FastAPI is FastAPI Cloud, right?
+
+00:44:46 Like if it suggests, oh, you're just going to like spread it across Lambdas by breaking it up.
+
+00:44:51 Like, whoa, no, I want something simple.
+
+00:44:52 Okay, give me FastAPI Cloud, right?
+
+00:44:54 I think that that's a really good thing.
+
+00:44:55 And then on the enterprise side, enterprise folks are notoriously not good at supporting open source
+
+00:45:03 in that they're not like paying for it.
+
+00:45:05 I know some companies are big supporters of the PSF and Python and open source.
+
+00:45:11 But in general, it's like, yeah, we have this project with 5,000 people working on it.
+
+00:45:15 It's all Python.
+
+00:45:16 And are we sponsoring this?
+
+00:45:19 Nope.
+
+00:45:20 We're just enjoying the money, right?
+
+00:45:22 And we're a bank.
+
+00:45:23 So we got the money.
+
+00:45:24 We got all the money.
+
+00:45:25 So they're just not good at paying for like a really great framework that they use a lot.
+
+00:45:29 But they got plenty of hosting, plenty of internal apps that they just need to make run and stuff.
+
+00:45:34 So I think both on like the low end and the high end, there's a lot of synergy between these things.
+
+00:45:40 That is not just, you know, slightly advanced, not to diminish it, but slightly advanced UI widgets that you could ask your AI to build or something or like cookie cutter templates for project starters.
+
+00:45:52 I think we are in a somewhat fortunate position of like, you know, like FastAPI. FastAPI has grown so much.
+
+00:45:59 Like, you know, like when you check the statistics about downloads or GitHub stars or entries in developer surveys,
+
+00:46:06 like it's at the top in like in each category.
+
+00:46:08 It's like, you know, like the backend framework with the most GitHub stars across languages,
+
+00:46:14 even like, you know, like Java, Go, Ruby, JS, like whatever.
+
+00:46:17 It's like the top one, at least in GitHub stars.
+
+00:46:20 So like, you know, like FastAPI is like people are liking it, fortunately.
+
+00:46:25 And there's probably going to be people deploying things to FastAPI Cloud.
+
+00:46:29 So that's probably going to be like, we are probably going to be fine.
+
+00:46:33 I think, you know, like the, I guess it will be like a good point to ask people to go and
+
+00:46:38 check what are the open source projects that they are using and check what is the bus ticket
+
+00:46:42 factor for those open source projects.
+
+00:46:45 You know, like if you are using Tailwind CSS, it would have been very cool if at some point you check if the premium things were useful for you and for your company or your project or something like that, you know?
+
+00:46:56 Yeah, because what is the thing that keeps that project going?
+
+00:46:59 Exactly.
+
+00:47:00 And I really personally admire if a project or something offers like more value, not just, hey, buy me a coffee, but here's a thing that you get way more of, you know?
+
+00:47:12 And in that regard, I think Tailwind was doing that, right?
+
+00:47:14 They were offering this suite of pre-built things.
+
+00:47:17 And I think that that's great.
+
+00:47:19 But yeah, I do think you've got more of these crazy AI things
+
+00:47:24 are going to maybe recommend FastAPI Cloud more than they're just going to undercut it.
+
+00:47:28 So I think that's really great.
+
+00:47:29 And by the way, I was just looking for the GitHub Stars graph.
+
+00:47:33 Like there's a whole, I can't remember what the domain of that site is.
+
+00:47:36 And I ran across, by the way, I just want to give a quick shout out.
+
+00:47:39 Like your CultRepo documentary on FastAPI was awesome.
+
+00:47:44 Right?
+
+00:47:44 That was so fun.
+
+00:47:45 They made me look good.
+
+00:47:46 I didn't see that coming.
+
+00:47:47 Yeah, it came right on the heels of the official Python
+
+00:47:50 documentary, the one hour one.
+
+00:47:51 This is the same group, and the production quality is really nice.
+
+00:47:54 So like--
+
+00:47:55 When they released the trailer for the Python documentary,
+
+00:47:58 before releasing the documentary, when they released the trailer,
+
+00:48:01 they contacted me and said, hey, we're doing these mini documentaries about different frameworks,
+
+00:48:06 different tools, and we want to include FastAPI there.
+
+00:48:08 I was like, oh, nice.
+
+00:48:10 But then I was just trying to play it cool, but I was super excited.
+
+00:48:13 Oh, that's so cool. Yeah, I watched it as soon as it came out. So I'll link to that. People should
+
+00:48:17 definitely, it's only like 10 minutes or something, but it's worth checking out. So it's not
+
+00:48:22 a huge time investment. People can watch it, I suppose. It's not TikTok. I mean, it's not like,
+
+00:48:27 oh, I saw the documentary, but it doesn't take a huge bite out of your day.
+
+00:48:34 You just have to listen for 10 minutes to an overly excited Colombian.
+
+00:48:37 I don't understand what's happened to the attention span of society. I'm really,
+
+00:48:41 honestly a little concerned. I used to, when I would create my courses, people would say,
+
+00:48:45 you know, like a four hour course and there'd be like a 10, 15 minute sort of, hey, here's how you
+
+00:48:49 set up your computer. And here's all the introduction and people, oh, that's so awesome.
+
+00:48:52 I loved how you kind of set the stage. I'm really motivated to take the course.
+
+00:48:55 Nowadays, I just get messages like, why are you still talking? This is five minutes long. Do you
+
+00:48:59 understand? I'm like, this is your job. You can't spend five minutes? Oh my gosh. Anyway, that's,
+
+00:49:06 that's sort of the origin of my comment there. All right. So we're kind of getting short on
+
+00:49:11 time, I think. I want to talk about a couple of things. Let's talk a little bit about internals.
+
+00:49:17 Like what, I don't know who wants to take this one, but let's talk about just how, when I say
+
+00:49:23 fastapi deploy, then what? It's just a uv pip install and it just goes and it's magic and it's
+
+00:49:30 easy, right? We have a nickname for Jonathan. Can we say it or no? I don't know. It's so funny.
+
+00:49:34 This happened because I told my friends, I'm so concerned about being at the podcast because
+
+00:49:38 everyone here is a visionary, and then I'm the back-end guy.
+
+00:49:42 I think the things I could contribute to this conversation, I should probably keep to myself.
+
+00:49:47 But you're just leaking your internals, right?
+
+00:49:50 There are some things that are not really secret.
+
+00:49:53 Like, as Sebastian said earlier, Kubernetes won in the infrastructure and deployment field, to some extent.
+
+00:50:01 So that's somewhere in there, right?
+
+00:50:03 But it's all the way deep down, so no one has to worry about it.
+
+00:50:07 But it's still a foundation,
+
+00:50:08 which is a good foundation.
+
+00:50:09 I think one thing that's, you might have guessed it,
+
+00:50:12 but FastAPI Cloud is built on FastAPI, which kind of makes sense, right?
+
+00:50:16 And that also has an effect on like recent patches, updates and stuff.
+
+00:50:20 Because if we find something internally which we're not happy with,
+
+00:50:24 then we just fix it.
+
+00:50:25 And that's how some releases came out faster than months before.
+
+00:50:30 Power of dogfooding.
+
+00:50:31 Yeah, that's awesome.
+
+00:50:32 Dogfooding a lot.
+
+00:50:33 Also all the related libraries like SQLModel and, well, others,
+
+00:50:38 they experience the same thing.
+
+00:50:40 New library is coming out.
+
+00:50:42 Patrick will announce at some point soon.
+
+00:50:43 It's not just FastAPI and friends.
+
+00:50:45 We're like really open.
+
+00:50:47 Like recently, Patrick just open-sourced everything we use for authentication and authorization,
+
+00:50:52 for example.
+
+00:50:52 Is it open source yet?
+
+00:50:53 Did they just leak something?
+
+00:50:55 It will be announced soon at some point.
+
+00:50:57 We build stuff internally in the moment really.
+
+00:51:00 Like we build it in a way, like in a separate package,
+
+00:51:03 just like an open-source library.
+
+00:51:04 And if we feel like the time is ripe, it just gets open-sourced
+
+00:51:07 because a lot of things are reusable.
+
+00:51:09 And that's in all departments.
+
+00:51:10 That happens a lot.
+
+00:51:12 When I started there, I already realized that.
+
+00:51:14 Everyone was building open source, but now I joined in myself as well.
+
+00:51:17 I open-sourced a library for compressing and decompressing archives in Python
+
+00:51:24 because the internal tarfile thing is just slow and we needed it to be faster
+
+00:51:28 because we're staring at the deployment process and we're like, hey, we could probably shave off a few seconds here.
+
+00:51:33 Then that's just open source for everyone to use.
+
+00:51:35 So we're contributing to the whole Python ecosystem as well.
+
+00:51:38 You have to say the name.
+
+00:51:40 It's so good.
+
+00:51:41 Is it good?
+
+00:51:42 No, it's just, it's faster because it's, you know, faster than just tar.
+
+00:51:47 Fast tar?
+
+00:51:48 I love it.
+
+00:51:48 Fast tar, yes.
+
+00:51:49 And you can say it with that very, very German accent, fast tar.
+
+00:51:53 I'll go star it.
+
+00:51:54 We'll get you some stars.
+
+00:51:56 This is going to happen.
+
+00:51:57 That's the irony about it.
+
+00:51:58 Like, it literally has no stars.
+
+00:52:00 But if you scroll down, you see the downloads.
+
+00:52:01 That's going to prove we're actually using it.
+
+00:52:04 Yeah, I like it.
+
+00:52:05 It's a little context manager.
+
+00:52:07 It's almost working the same as the tarfile in the standard library.
+
+00:52:13 Like the same, like almost similar API to that.
+
+00:52:16 It's basically a drop-in replacement, more or less.
+
+00:52:19 But, you know, they need everything to happen in Rust.
+
+00:52:21 Because Rust.
+
+00:52:22 Because Rust, yeah.
+
+00:52:23 Well, as soon as it becomes infrastructure and you've got to run it a million times,
+
+00:52:27 that starts to make sense, right?
+
+00:52:29 Yeah.
+
+00:52:29 Python is one of the fastest programming languages in the world
+
+00:52:33 when you think about human time to build the things, right?
+
+00:52:37 Like that's one of its real superpowers is like, I mean, there's the whole story of Google Video and YouTube, right?
+
+00:52:44 And Google Video was written in C++ with 100 engineers
+
+00:52:47 and YouTube was a small team in Python and they just couldn't keep up with the features.
+
+00:52:51 So they bought this little old thing, YouTube, and said, let's see if we can make something with it.
+
+00:52:55 And last I checked, it was still in Python.
+
+00:52:57 I'm sure some of it isn't, but a few years ago it was, which is wild.
+
+00:53:00 Anyway, there's different ways of being fast. But when it's down to like little utilities, yeah.
+
+00:53:04 I know some people that are trying to make Python fast.
+
+00:53:06 I know a couple.
+
+00:53:07 Yeah.
+
+00:53:08 And honestly, massive success in the last five years, right?
+
+00:53:11 Like since 3.11, since the specializing adaptive interpreter, there's been pretty big improvements.
+
+00:53:17 3.9 and 3.11 did a lot of like foundational work.
+
+00:53:20 And then 3.9 onward really just uncorked a lot of innovation there.
+
+00:53:24 Yeah, that's pretty awesome.
+
+00:53:26 All right.
+
+00:53:26 It sounds like, Sebastian, you've talked a lot about Kubernetes.
+
+00:53:29 So I imagine Kubernetes is happening.
+
+00:53:31 Do we get to pick what data centers it runs on?
+
+00:53:34 Do we get to pick what clouds it runs on?
+
+00:53:37 You're going to get to pick some of these things.
+
+00:53:41 Not yet.
+
+00:53:42 It's not released yet.
+
+00:53:43 But, you know, like it's top, of course.
+
+00:53:45 Like we have like regular cloud providers underneath
+
+00:53:48 and there's a bunch of Kubernetes.
+
+00:53:49 Then there's a bunch of additional stuff that needs to run on top.
+
+00:53:53 Then there's like custom Kubernetes controllers and things that Jonathan was saying
+
+00:53:58 that he's having to write in Go so that people in Python can be happy, to be able to, you know,
+
+00:54:04 like manage all the Kubernetes shenanigans that need to happen
+
+00:54:07 because there's so much complexity that needs to be handled.
+
+00:54:10 There's a lot of that.
+
+00:54:11 We do a lot of advanced tricks also.
+
+00:54:15 Jonathan was recently doing a bunch of advanced tricks
+
+00:54:17 to handle the caches for the builds.
+
+00:54:20 So the way that we handle caches, and we also like tap into uv and how things work
+
+00:54:25 so that builds can be super, super fast because it's like something is,
+
+00:54:30 we are, you know, like we are very much targeted at FastAPI and Python in general.
+
+00:54:36 So we can take advantage of knowing how things run internally,
+
+00:54:40 how things are installed, how to optimize everything.
+
+00:54:43 So everything is just like super fast, super fast to install, to run, to like do everything.
+
+00:54:47 I imagine you all have base Docker images that are like just one layer away
+
+00:54:54 from whoever's code is running.
+
+00:54:56 You know, like you've got it all optimized, already pre-built with FastAPI and whatever settings of Python you want.
+
+00:55:02 A bunch of things and tricks.
+
+00:55:04 But there are also different things, like the different ways that we do to actually build the things
+
+00:55:08 and install things and put them inside of the actual built application.
+
+00:55:14 There's a lot of sourcing that we do there, and Jonathan has been working on a lot of that.
+
+00:55:21 And there's also all the logic and all the stuff.
+
+00:55:23 We have a bunch of stuff on top of that to handle auto-scaling,
+
+00:55:27 which is something that is actually not that easy to find in different providers.
+
+00:55:31 We have auto-scaling based on requests, including scaling down to zero, which saves costs.
+
+00:55:37 But this is not Lambdas.
+
+00:55:40 It's not AWS Lambdas.
+
+00:55:42 It's like the full deployed application, the full container or whatever it is,
+
+00:55:47 which is the full thing with all the dependencies.
+
+00:55:50 It's running for whenever it has to run, but we can scale based on requests.
+
+00:55:54 So I guess it's like the type of thing that you will have if you have this giant cluster for a huge enterprise with a bunch of infra people making sure everything just works perfectly.
+
+00:56:08 But you just pay us to do that for you.
+
+00:56:10 This is also a good time for us to probably say lots of stuff is coming and we're in private beta.
+
+00:56:15 And so you should sign up for the waitlist so that you can get admitted and try out these very cool things we've been talking about.
+
+00:56:22 Absolutely.
+
+00:56:23 And I think I'll let Tech Insider out in the audience sort of lead into it.
+
+00:56:29 Public release when?
+
+00:56:30 Sebastian, when?
+
+00:56:31 Public release when?
+
+00:56:32 My final topic, which is just, what's the roadmap?
+
+00:56:35 When is this stuff?
+
+00:56:36 Like, how do we get into it here?
+
+00:56:38 What about things like Litestar?
+
+00:56:40 We have the, right now we have the waiting list and we are onboarding people.
+
+00:56:44 We already have like a bunch of people in the private beta.
+
+00:56:46 We're going to keep onboarding people from the waiting list and like, you know, like ramp that up.
+
+00:56:51 But it will be like, you know, like through the waiting list is the main place where we are onboarding.
+
+00:56:56 We want to make sure that everything is super fine-tuned.
+
+00:56:58 And we're going to keep it that way for a while.
+
+00:57:00 So like people that are on the waiting list are going to be like the ones that are going to be able to start using it the soonest.
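The request-based auto-scaling with scale-to-zero that Sebastian describes can be sketched as a simple replica-count decision. The function name, capacity number, and cap below are illustrative assumptions, not FastAPI Cloud's actual algorithm.

```python
import math


def desired_replicas(requests_per_sec: float,
                     capacity_per_replica: float = 50.0,
                     max_replicas: int = 10) -> int:
    """Pick a replica count from the observed request rate.

    Illustrative only: zero replicas when there is no traffic (scale to
    zero), otherwise enough replicas to cover the load, capped at a max.
    """
    if requests_per_sec <= 0:
        return 0  # idle app scales to zero, so it costs nothing while unused
    needed = math.ceil(requests_per_sec / capacity_per_replica)
    return min(needed, max_replicas)


print(desired_replicas(0))       # idle -> 0
print(desired_replicas(120))     # ceil(120 / 50) -> 3
print(desired_replicas(10_000))  # capped at max_replicas -> 10
```

The point of the sketch is the shape of the policy: unlike fixed instance counts, the replica count follows traffic in both directions, which is where the cost savings come from.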
+
+00:57:06 At some point, we'll probably have ways for people to invite others and things like that.
+
+00:57:10 About the things that we are building, we want to, you know, like we are super focused on FastAPI and then Python in general.
+
+00:57:18 At some point we'll probably support different tools,
+
+00:57:21 different ways to run, also like Python code in general,
+
+00:57:24 probably different frameworks.
+
+00:57:25 It will also depend a lot on what the users are asking for,
+
+00:57:30 whether it's like the tools, the frameworks, the use cases,
+
+00:57:33 the things that they need to build.
+
+00:57:34 And like, we're going to evolve the platform and the system
+
+00:57:38 based on what people need out of it.
+
+00:57:40 We have like a GitHub repo where we have issues, but we also have like a Slack that once people are admitted,
+
+00:57:45 they can talk directly to us and that feedback is really, really valuable for shaping the roadmap
+
+00:57:50 and figuring out all the fun things you want us to support.
+
+00:57:53 Awesome. Of course, you're going to charge money for it. It runs on servers and you guys are not
+
+00:57:59 a charity, but can you give any sense of what you're thinking about that kind of stuff or
+
+00:58:05 join the waitlist and see?
+
+00:58:06 Well, first join the waitlist and see, but we don't have that predefined yet,
+
+00:58:12 but it will be in the ballpark of what you could get from different cloud providers.
+
+00:58:16 So from similar-ish providers, it will be in the ballpark of what you will get.
+
+00:58:24 But it's not written in stone yet.
+
+00:58:27 It's still a little bit different because we can auto-scale based on requests.
+
+00:58:32 So we can increase the amount of replicas of your application automatically,
+
+00:58:36 and then we can decrease them automatically, and we can scale down to zero.
+ +00:58:40 So you can probably handle all the load that you need and in the end spend a lot less because you don't have to have a bunch of instances constantly running or things like that, you know. + +00:58:50 So it will probably work a little bit different than what it will be for other providers. + +00:58:56 But in the end, it should be roughly similar. + +00:58:58 Okay. And given the fact that you all handle so much of it as a platform as a service type of thing, you don't have to have a cloud expert on hand or a DevOps expert necessarily, right? As soon as a company hires somebody to be an AWS cloud architect or something like that, it's no longer just what is your AWS bill. + +00:59:18 It's also a little bit of pain that we are swallowing so you don't have to take it. + +00:59:22 Exactly. It's part of taking one for the team, right? + +00:59:24 Yes. + +00:59:25 Yes. + +00:59:27 Indeed. + +00:59:27 All right, so I had one or two things specifically that I was seeking. + +00:59:32 It's like custom domains. + +00:59:34 How far off are custom domains? + +00:59:35 I was like, oh, I could put some cool things on there. + +00:59:39 I could tell Jonathan is psyched about this. + +00:59:41 It'd be really fun to put one of my really small FastAPI projects over there, + +00:59:46 something I set up for some of my courses or something, + +00:59:48 and then I can point people to go, look, it's running on FastAPI Cloud. + +00:59:51 How neat, you guys can check that out over there. + +00:59:54 And I'm like, but it's on its own domain, that domain is baked into the course videos, you know what I mean? And it's written in stone. + +01:00:00 It's marketing. + +01:00:01 Yeah, exactly. So I can't really move it because it has, you know, some subdomain of Talk Python, + +01:00:08 right? + +01:00:08 I was working on it. And then I got the notification by Google Calendar that I should + +01:00:12 join a certain podcast. So... 
+
+01:00:14 Are you telling me we don't have custom domains? Because I'm here asking you about custom domains.
+
+01:00:19 How meta is that?
+
+01:00:19 You got it. It could be here already. But no, you have to wait a bit more.
+
+01:00:23 Okay. But soon?
+
+01:00:24 Yeah. Soon enough, but I'm actively working on it. Let's put it like that.
+
+01:00:29 Okay. That sounds great. And then, I mean, just, it's never simple. You know, I just,
+
+01:00:34 I set up some stuff and it's like, you get the pop-up. Oh, you got to put this, you know,
+
+01:00:39 this TXT record or this CNAME or whatever record into your DNS and then we're checking it. Oh,
+
+01:00:45 it might take three days for your DNS to propagate. So hang in there and just, I can imagine like
+
+01:00:50 you're having fun. Yeah, I guess. You're kidding me. That's like, wow, I thought I'm almost off work,
+
+01:00:56 but no, you're bringing it all back. But yeah, that's a thing. I'm sure the company
+
+01:01:00 could support therapy to, like, work through the issues and the trauma that you've suffered
+
+01:01:04 from the DNS. It's always DNS. That's right. I mean, you got, yes, it's always DNS. Yes.
+
+01:01:10 I guess one of our goals with custom domains is also to make it super simple for you to set them up.
+
+01:01:15 Like, for example, if you're using one of the providers that support OAuth, we can also just do one click and then it's going to be automatic.
+
+01:01:23 Oh, that's cool. Yeah, that's really nice.
+
+01:01:25 But unfortunately, it depends on the platform you're using.
+
+01:01:27 Not all of them support this.
+
+01:01:29 This is said by the person in charge of most of the integrations.
+
+01:01:32 So Patrick has built, we have integrations for a bunch of database providers and things like that.
+
+01:01:38 I think now Patrick knows the OpenID specification by memory.
+
+01:01:42 I don't know.
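The TXT-record dance Michael describes (add a record at your DNS provider, then the platform polls until it propagates) boils down to checking for an ownership token. Here is a toy version of that check with a made-up record format; a real verifier would fetch the records via a DNS query (for example with dnspython) and retry while propagation completes, and nothing here reflects FastAPI Cloud's actual record names.

```python
def domain_verified(txt_records: list[str], expected_token: str) -> bool:
    """Return True if any TXT record carries the expected ownership token.

    The "fastapi-cloud-verify=" record name is invented for illustration;
    a real verifier would look txt_records up via DNS and poll, since
    propagation can take a while.
    """
    prefix = "fastapi-cloud-verify="
    return any(
        record.startswith(prefix) and record[len(prefix):] == expected_token
        for record in txt_records
    )


records = ["v=spf1 -all", "fastapi-cloud-verify=abc123"]
print(domain_verified(records, "abc123"))  # token matches -> verified
print(domain_verified(records, "zzz"))     # no record carries this token
```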
+
+01:01:44 Yeah, the other thing I wanted to talk a bit about was just integrations, like what kind of stuff you guys have coming in.
+
+01:01:48 I saw that Hugging Face is going to be integrated soon.
+
+01:01:52 You've got Supabase, which is kind of Postgres as a service.
+
+01:01:56 There's a lot of those things out there that theoretically could be added.
+
+01:02:00 Someone also asked for MongoDB.
+
+01:02:01 Maybe that's one that we're going to take a look into.
+
+01:02:04 It really depends on the provider.
+
+01:02:06 So at the moment, we don't want to host databases for you because that's also another kind of rabbit hole.
+
+01:02:11 Jonathan is probably not ready for that.
+
+01:02:14 But yeah, definitely databases.
+
+01:02:16 But I guess we can say that we're also talking with the people from Pydantic
+
+01:02:21 so we can integrate maybe Logfire automatically, that kind of stuff.
+
+01:02:25 Yeah.
+
+01:02:25 And also things like Redis, which is also another kind of database.
+
+01:02:29 That's also coming soon.
+
+01:02:30 Yeah, there's a couple of database as a service type things
+
+01:02:33 that don't require too much other than just connecting API keys
+
+01:02:37 and something like that, right?
+
+01:02:38 Those seem like low-hanging fruit.
+
+01:02:40 Like the kind of goal with the integrations is not just that.
+
+01:02:44 Like, yeah, right now it's just setting up an environment variable.
+
+01:02:47 But the idea is also to do more, I don't know, like the proper integration, I would say.
+
+01:02:52 Like, for example, for things like Supabase, which, yeah, I think supports branching.
+
+01:02:56 Like, for example, once we support pull request previews for GitHub, we can also create
+
+01:03:00 a branch automatically for you if you have the Supabase integration enabled.
+
+01:03:04 And we can do this kind of stuff as well.
+
+01:03:06 Or even we could show like some information about the database.
+
+01:03:10 I don't know, like load or like memory usage, things like that, directly from our dashboard.
+
+01:03:14 So you don't have to go there.
+
+01:03:15 That's the main reason why we're building this infrastructure
+
+01:03:18 for the integrations.
+
+01:03:19 Well, people can sign up to the waiting list and hopefully get on the private beta.
+
+01:03:24 We actually check the waiting list.
+
+01:03:26 We actually check the use cases, team sizes, like what are people building with it?
+
+01:03:31 Like we actually go and check it and we bring in people from the waiting list.
+
+01:03:37 Nice. You know, I didn't join the waiting list directly.
+
+01:03:39 I was added by some guy I know who was very kind to help me get some behind-the-scenes look.
+
+01:03:45 So I don't know what the process is.
+
+01:03:46 Do you actually say what you want to do with it?
+
+01:03:48 And you evaluate that a little bit as well based on, like,
+
+01:03:51 hey, this would be a cool use case for us to support?
+
+01:03:53 There are many types of applications and many types of different team sizes,
+
+01:03:57 many types of things that people might want to build.
+
+01:04:00 And we try to see, like, okay, where is a case where we could be a good fit and we can provide a great service?
+
+01:04:06 And where are the things that people are trying to build?
+
+01:04:09 Also, it also helps us see, like, you know, like, what are people trying to do with FastAPI
+
+01:04:14 Cloud so that we know what we have to provide?
+
+01:04:17 But we actually go and check those, you know, like those submissions. There are, like, actually thousands
+
+01:04:24 of people in the waiting list, but we still go and check and approve kind of manually still
+
+01:04:30 to bring a bunch of people on board in the different ways that we have been bringing people.
+
+01:04:35 So if people go and join the waiting list and actually tell us what they are, what is their
+
+01:04:39 use case, their team, what are they planning on doing,
+
+01:04:42 there's a much higher chance that we are going to go
+
+01:04:45 on to bring them on.
+
+01:04:46 Awesome.
+
+01:04:47 So everyone, go join the waitlist.
+
+01:04:49 If you're doing FastAPI, I'll link to it in the show notes,
+
+01:04:53 of course.
+
+01:04:53 Thank you all for being here and sharing the story.
+
+01:04:57 And I, for one, am very excited to see FastAPI Cloud exist
+
+01:05:01 and just one more way to make FastAPI stronger and more resilient and so on.
+
+01:05:06 Thank you very much.
+
+01:05:07 Thank you for having us.
+
+01:05:08 Yeah, it's super fun.
+
+01:05:09 Thanks for having us.
+
+01:05:10 Yeah, you bet.
+
+01:05:11 Bye, everyone.
+
+01:05:11 Bye, folks.
+
+01:05:12 Bye-bye.
+
+01:05:12 Bye.
+
+01:05:14 This has been another episode of Talk Python To Me.
+
+01:05:16 Thank you to our sponsors.
+
+01:05:17 Be sure to check out what they're offering.
+
+01:05:19 It really helps support the show.
+
+01:05:21 This episode is brought to you by CommandBook, a native macOS app that I built
+
+01:05:26 that gives long-running terminal commands a permanent home.
+
+01:05:29 No more juggling six terminal tabs every morning.
+
+01:05:31 Carefully craft a command once, run it forever with auto-restart,
+
+01:05:34 URL detection, and a full CLI.
+
+01:05:36 Download it for free at talkpython.fm/commandbookapp.
+
+01:05:41 And it's brought to you by the Talk Python in Production Book,
+
+01:05:44 an inside look at 10 years of the real-world DevOps behind the Talk Python sites and apps.
+
+01:05:49 Check it out at talkpython.fm/devopsbook.
+ +01:05:52 If you or your team needs to learn Python, we have over 270 hours of beginner and advanced courses on topics ranging from + +01:05:59 complete beginners to async code, Flask, Django, HTML, and even LLMs. + +01:06:05 Best of all, there's no subscription in sight. + +01:06:08 Browse the catalog at talkpython.fm. + +01:06:10 And if you're not already subscribed to the show on your favorite podcast player, + +01:06:14 what are you waiting for? + +01:06:15 Just search for Python in your podcast player. + +01:06:17 We should be right at the top. + +01:06:18 If you enjoy that geeky rap song, you can download the full track. + +01:06:21 The link is actually in your podcast blur show notes. + +01:06:24 This is your host, Michael Kennedy. + +01:06:26 Thank you so much for listening. + +01:06:27 I really appreciate it. + +01:06:28 I'll see you next time. + +01:06:41 I'm out. + diff --git a/transcripts/536-fly-inside-fastapi-cloud.vtt b/transcripts/536-fly-inside-fastapi-cloud.vtt new file mode 100644 index 0000000..1367c48 --- /dev/null +++ b/transcripts/536-fly-inside-fastapi-cloud.vtt @@ -0,0 +1,3842 @@ +WEBVTT + +00:00:00.020 --> 00:00:04.600 +You've built your FastAPI app. It's running great locally. Now you want to share it with the world. + +00:00:05.360 --> 00:00:12.480 +But then reality hits. Containers, load balancers, HTTPS certificates, cloud consoles with 200 + +00:00:12.680 --> 00:00:18.560 +options. What if deploying was just one command? That's exactly what Sebastian Ramirez and the + +00:00:18.660 --> 00:00:24.460 +FastAPI cloud team are building. On this episode, we sit down with Sebastian, Patrick Arminio, + +00:00:24.720 --> 00:00:29.980 +Savannah Ostrowski, and Jonathan Ewald to go inside FastAPI cloud, explore what it means + +00:00:30.000 --> 00:00:35.340 +to build a Pythonic cloud and dig into how this commercial venture is actually making FastAPI, + +00:00:35.380 --> 00:00:42.060 +the open source project, stronger than ever. 
This is Talk Python To Me, episode 536, recorded January + +00:00:42.440 --> 00:00:51.440 +13th, 2026. Talk Python To Me, yeah, we ready to roll. Upgrading the code, no fear of getting old. + +00:00:51.580 --> 00:00:59.960 +Async in the air, new frameworks in sight, geeky rap on deck. Quark crew, it's time to unite. We + +00:01:01.820 --> 00:01:06.460 +Welcome to Talk Python To Me, the number one Python podcast for developers and data scientists. + +00:01:06.980 --> 00:01:12.300 +This is your host, Michael Kennedy. I'm a PSF fellow who's been coding for over 25 years. + +00:01:12.940 --> 00:01:17.460 +Let's connect on social media. You'll find me and Talk Python on Mastodon, BlueSky, and X. + +00:01:17.710 --> 00:01:23.000 +The social links are all in your show notes. You can find over 10 years of past episodes at + +00:01:23.120 --> 00:01:27.400 +Talk Python.fm. And if you want to be part of the show, you can join our recording live streams. + +00:01:27.680 --> 00:01:31.580 +That's right, we live stream the raw uncut version of each episode on YouTube. + +00:01:32.180 --> 00:01:36.600 +Just visit talkpython.fm/youtube to see the schedule of upcoming events. + +00:01:36.860 --> 00:01:40.480 +Be sure to subscribe there and press the bell so you'll get notified anytime we're recording. + +00:01:41.480 --> 00:01:46.440 +This episode is brought to you by CommandBook, a native macOS app that I built that gives + +00:01:46.640 --> 00:01:48.680 +long-running terminal commands a permanent home. + +00:01:49.080 --> 00:01:51.080 +No more juggling six terminal tabs every morning. + +00:01:51.700 --> 00:01:56.340 +Carefully craft a command once, run it forever with auto-restart, URL detection, and a full CLI. + +00:01:56.780 --> 00:01:59.840 +Download it for free at talkpython.fm/command book app. 
+
+00:02:00.520 --> 00:02:03.140
+And it's brought to you by the Talk Python in Production Book,
+
+00:02:03.700 --> 00:02:08.360
+an inside look at 10 years of the real-world DevOps behind the Talk Python sites and apps.
+
+00:02:08.640 --> 00:02:11.340
+Check it out at talkpython.fm/devopsbook.
+
+00:02:12.100 --> 00:02:14.220
+Before we dive in, I have something excellent to announce.
+
+00:02:14.440 --> 00:02:17.820
+A few episodes back, I told you about our new AI integrations,
+
+00:02:18.120 --> 00:02:21.880
+the Talk Python MCP servers, and our llms.txt file,
+
+00:02:22.300 --> 00:02:31.420
+so that AI tools can tap into our over 550 episodes and 7.5 million words of info around the Python community and history.
+
+00:02:32.080 --> 00:02:33.580
+Well, I'm building on that now.
+
+00:02:34.240 --> 00:02:37.120
+Talk Python now has an open source CLI tool.
+
+00:02:37.490 --> 00:02:41.960
+You can search episodes, transcripts, guests, and even our courses right from your terminal.
+
+00:02:42.260 --> 00:02:43.240
+No browser required.
+
+00:02:44.240 --> 00:02:49.040
+It's fast too, backed by our optimized APIs, taking about five milliseconds of response time,
+
+00:02:49.300 --> 00:02:51.140
+plus, you know, whatever the internet ping time is.
+
+00:02:51.680 --> 00:02:57.120
+It supports rich text for humans, JSON for programs, and markdown output for AIs.
+
+00:02:57.900 --> 00:03:04.540
+You just install it with uv tool install talk-python-cli, and check it out on GitHub.
+
+00:03:04.920 --> 00:03:06.460
+The link is in the podcast player show notes.
+
+00:03:07.160 --> 00:03:09.560
+I also wrote up a blog post on the hows and whys of it.
+
+00:03:09.880 --> 00:03:14.100
+Check that out over on the Talk Python blog at, well, talkpython.fm/blog.
+
+00:03:14.780 --> 00:03:17.120
+The full link to the exact article is in the show notes.
+
+00:03:17.860 --> 00:03:18.680
+All right, let's jump in.
+
+00:03:20.100 --> 00:03:20.700
+Hello, everyone.
+ +00:03:21.220 --> 00:03:23.940 +Sebastian, Patrick, Savannah, and Jonathan. + +00:03:24.460 --> 00:03:25.580 +Awesome to have you all here. + +00:03:26.160 --> 00:03:27.460 +Excited to talk about FastAPI Cloud. + +00:03:28.120 --> 00:03:28.300 +Welcome. + +00:03:28.310 --> 00:03:28.460 +Yes. + +00:03:28.980 --> 00:03:29.620 +Thanks for having me. + +00:03:29.860 --> 00:03:30.100 +Thank you. + +00:03:30.240 --> 00:03:31.080 +We're also ahead, Mike. + +00:03:31.660 --> 00:03:32.640 +What a project. + +00:03:32.920 --> 00:03:34.420 +It's been going on for a while. + +00:03:34.880 --> 00:03:40.100 +I've heard stuff from Sebastian that maybe something was brewing and all these things, + +00:03:40.320 --> 00:03:42.780 +but not too long ago, you all announced it. + +00:03:43.020 --> 00:03:47.100 +And I heard that FastAPI, some people have been using it recently. + +00:03:47.500 --> 00:03:50.820 +Some of the surveys show that some people use it for websites. + +00:03:51.140 --> 00:03:51.560 +I'm not sure. + +00:03:51.860 --> 00:03:52.320 +Rumors. + +00:03:52.900 --> 00:03:53.180 +Yeah, yeah. + +00:03:53.380 --> 00:03:53.800 +Rumors. + +00:03:54.680 --> 00:03:55.080 +Rumors. + +00:03:55.440 --> 00:03:56.020 +Oh, my gosh. + +00:03:56.100 --> 00:03:58.060 +I mean, congratulations on that. + +00:03:58.100 --> 00:04:03.120 +But before we dive into FastAPI and FastAPI Cloud, let's just do a quick introduction. + +00:04:03.720 --> 00:04:04.320 +Who are you? + +00:04:04.380 --> 00:04:08.220 +We'll just go around the Brady Bunch squares of our live stream here and start with Sebastian. + +00:04:08.620 --> 00:04:09.680 +You've been on the show a few times. + +00:04:09.800 --> 00:04:14.380 +In fact, you've been on the show just recently for a really fun episode, Sebastian. + +00:04:14.760 --> 00:04:15.060 +Who are you? + +00:04:15.260 --> 00:04:16.519 +In real, that was super fun. + +00:04:17.019 --> 00:04:18.100 +So, hello, everyone. 
+
+00:04:18.280 --> 00:04:20.299
+I'm Sebastian Ramirez, or tiangolo.
+
+00:04:20.799 --> 00:04:22.300
+I created FastAPI.
+
+00:04:22.400 --> 00:04:26.200
+That is this Python framework for building web APIs and backend.
+
+00:04:26.940 --> 00:04:29.980
+In case you've been living in a hole and haven't done any Python for 10 years.
+
+00:04:30.500 --> 00:04:36.480
+You also are famous for really pointing out the ridiculousness of modern tech recruiting.
+
+00:04:36.800 --> 00:04:38.220
+You know what I'm talking about?
+
+00:04:38.580 --> 00:04:40.140
+Yeah, you know, like it's fun.
+
+00:04:40.280 --> 00:04:42.700
+This is probably the thing that I am known for.
+
+00:04:42.780 --> 00:04:47.700
+It's for writing a tweet saying, yeah, what was it?
+
+00:04:47.980 --> 00:04:52.460
+that I saw a job asking for five years of experience in FastAPI,
+
+00:04:52.920 --> 00:04:56.040
+and I only had 2.5 since I created the thing.
+
+00:04:56.140 --> 00:04:58.920
+So you didn't qualify for the FastAPI job?
+
+00:04:59.000 --> 00:05:00.460
+I didn't qualify for that, yeah.
+
+00:05:00.800 --> 00:05:03.260
+And then the funny thing is, you know, like people sometimes,
+
+00:05:03.940 --> 00:05:06.240
+even people in Python itself, come and tell me like,
+
+00:05:06.400 --> 00:05:08.220
+oh, wait, like you're, and I say like, oh, yeah,
+
+00:05:08.270 --> 00:05:09.580
+I created this thing called FastAPI.
+
+00:05:09.700 --> 00:05:11.340
+Oh, wait, okay, so what is FastAPI?
+
+00:05:11.480 --> 00:05:14.440
+Oh, wait, you are the guy from the meme,
+
+00:05:14.900 --> 00:05:16.160
+the meme about FastAPI.
+
+00:05:16.160 --> 00:05:16.780
+Are you serious?
+
+00:05:17.440 --> 00:05:22.960
+Yeah, you know, like suddenly that is super important that I am the guy for the meme about FastAPI.
+
+00:05:23.260 --> 00:05:26.020
+Not the guy from FastAPI, the guy from the meme.
+
+00:05:26.740 --> 00:05:29.080
+Oh my gosh, I saw you on TikTok. It was amazing.
+
+00:05:29.840 --> 00:05:34.240
+It was a life achievement. I wrote a viral tweet. So yeah, nice to meet you all.
+
+00:05:34.360 --> 00:05:37.680
+You know what? Sometimes your moment in the sun is not the one you expected.
+
+00:05:37.920 --> 00:05:40.340
+No, congratulations on how good FastAPI is.
+
+00:05:40.340 --> 00:05:40.620
+On the tweet.
+
+00:05:41.820 --> 00:05:44.980
+Exactly. You really nailed it. Patrick, welcome to the show.
+
+00:05:45.160 --> 00:05:46.580
+Nice to be here. Yeah, I'm Patrick.
+
+00:05:47.040 --> 00:05:50.780
+I guess the main thing I'm kind of known for in the community is like this library called
+
+00:05:50.940 --> 00:05:54.820
+Strawberry, which is similar to FastAPI, but instead of REST it's for GraphQL.
+
+00:05:55.130 --> 00:06:00.740
+Other than that, I help organize PyCon Italy and I used to also do EuroPython as well,
+
+00:06:00.940 --> 00:06:03.160
+but I stopped because of way too many things.
+
+00:06:03.550 --> 00:06:05.120
+Yeah, that's pretty much me.
+
+00:06:05.200 --> 00:06:06.780
+How do you see GraphQL these days?
+
+00:06:07.290 --> 00:06:08.220
+Is it still popular?
+
+00:06:08.500 --> 00:06:12.160
+I think it's mostly popular in the enterprises, unfortunately.
+
+00:06:12.800 --> 00:06:16.039
+I'm a bit, to be honest, I'm a bit annoyed about the companies that do tooling around
+
+00:06:16.060 --> 00:06:20.200
+GraphQL because I don't know, I feel like they're not really pushing it forward. They're just,
+
+00:06:20.380 --> 00:06:24.900
+I don't know, trying to work with enterprises and that's it. Or maybe people think to AI.
+
+00:06:25.140 --> 00:06:31.180
+Yeah. It feels a little bit like the SOAP, WSDL, XML modern version. Savannah.
+
+00:06:33.040 --> 00:06:33.180
+Yeah.
+
+00:06:34.740 --> 00:06:39.840
+You like tapping out of being an organizer for EuroPython is like, you know, the classic
+
+00:06:40.060 --> 00:06:44.180
+open source oversubscribed, doing all the things, very relatable.
+
+00:06:44.720 --> 00:06:44.880
+Yeah.
+
+00:06:45.030 --> 00:06:45.180
+Yeah.
+
+00:06:45.660 --> 00:06:46.280
+But yeah, I'm Savannah.
+
+00:06:46.760 --> 00:06:47.660
+What can I say?
+
+00:06:47.780 --> 00:06:52.360
+I am on the Python Steering Council for 2026, which is very exciting.
+
+00:06:52.640 --> 00:06:52.920
+Congratulations.
+
+00:06:53.340 --> 00:06:53.780
+Oh, thank you.
+
+00:06:53.980 --> 00:06:59.280
+I am also the release manager for the upcoming version of Python, Python 3.16.
+
+00:06:59.910 --> 00:07:03.800
+And so that'll kick off later this year, which is really cool and very exciting.
+
+00:07:04.320 --> 00:07:08.800
+I work on CPython stuff, the JIT, argparse, basically whatever needs help.
+
+00:07:09.060 --> 00:07:09.880
+It's kind of where you'll find me.
+
+00:07:09.960 --> 00:07:10.160
+Awesome.
+
+00:07:10.760 --> 00:07:12.300
+Congratulations on the Steering Council.
+
+00:07:12.880 --> 00:07:15.000
+And yeah, that's a lot of cool stuff.
+
+00:07:15.540 --> 00:07:19.740
+Hopefully we don't get a Python 4.0 right after 3.16
+
+00:07:20.540 --> 00:07:22.960
+because then your job will never end, is what I've learned.
+
+00:07:23.240 --> 00:07:23.560
+Yeah, yeah.
+
+00:07:23.920 --> 00:07:27.920
+Benjamin Peterson, Python 2.7 forever kind of situation.
+
+00:07:28.460 --> 00:07:28.860
+Yeah, yeah.
+
+00:07:29.070 --> 00:07:32.920
+I mean, release management is still, I mean, it's still quite a commitment.
+
+00:07:33.140 --> 00:07:36.900
+It's like seven-ish years when you think about all the staggered releases
+
+00:07:36.990 --> 00:07:38.620
+because you're the release manager for two releases
+
+00:07:38.620 --> 00:07:40.700
+and then you have the five-year maintenance cycle.
+
+00:07:41.980 --> 00:07:45.800
+So yeah, it's Python forever, is what I'm saying.
+
+00:07:45.960 --> 00:07:47.040
+Yeah, it's probably not a fad.
+
+00:07:47.120 --> 00:07:48.720
+It's probably going to stick around, this Python thing.
+ +00:07:50.400 --> 00:07:51.080 +No, that's awesome. + +00:07:51.360 --> 00:07:51.580 +Congratulations. + +00:07:51.940 --> 00:07:53.780 +Also, cool with arg parse. + +00:07:54.080 --> 00:08:04.680 +I feel like that's making a strong comeback now that we have these AI things that can just put stuff together for us instead of like, oh, I need to depend on this library and that library. + +00:08:04.700 --> 00:08:08.060 +Like, I just need to take a few arguments and have a little help text. + +00:08:08.120 --> 00:08:10.000 +And it's like, well, you've already got this built-in thing. + +00:08:10.120 --> 00:08:11.240 +Oh, who knew? + +00:08:11.420 --> 00:08:12.820 +You know, people are like, oh, I didn't even know. + +00:08:12.870 --> 00:08:15.800 +I thought I used typer or click or something, right? + +00:08:15.940 --> 00:08:18.320 +There's, you know, the typers and clicks of the world. + +00:08:18.430 --> 00:08:20.720 +But sometimes you just want the simplest thing. + +00:08:20.880 --> 00:08:22.740 +And ArgParse is pretty great at that. + +00:08:22.950 --> 00:08:27.900 +Although it has many quirks that are probably and most definitely unfixable at this point. + +00:08:28.220 --> 00:08:32.080 +Because bugs are features when you have things that have been around as long as Python. + +00:08:32.490 --> 00:08:35.000 +But yeah, no, I mean, AI loves to write Python. + +00:08:35.320 --> 00:08:39.240 +I think it's like the language used the most, AI-generated code. + +00:08:39.680 --> 00:08:42.039 +I'll just say we live in weird times, very weird times. + +00:08:42.539 --> 00:08:44.940 +I would love a precedented time at some time. + +00:08:45.440 --> 00:08:45.840 +Exactly. + +00:08:46.160 --> 00:08:46.220 +Yeah. + +00:08:46.320 --> 00:08:47.660 +Can we just get the boring times? + +00:08:48.000 --> 00:08:49.640 +No, nothing interesting, please. 
+
+00:08:49.940 --> 00:08:55.460
+What I said about GraphQL may sound like a bit of a bash, but I didn't mean it in a negative,
+
+00:08:56.020 --> 00:08:56.940
+super negative way anyway.
+
+00:08:57.220 --> 00:09:02.500
+Like it used to be all the enterprises were all about SOAP and WSDL and like subscribing
+
+00:09:02.580 --> 00:09:03.060
+your tooling.
+
+00:09:03.460 --> 00:09:04.160
+Please don't write me.
+
+00:09:04.160 --> 00:09:05.780
+I'm not trying to bash on your technology.
+
+00:09:07.780 --> 00:09:08.140
+All right.
+
+00:09:08.800 --> 00:09:10.080
+Jonathan, also welcome.
+
+00:09:10.320 --> 00:09:10.440
+Hi.
+
+00:09:10.580 --> 00:09:14.080
+Yeah, I'm not nearly as famous as everyone else in this call.
+
+00:09:14.980 --> 00:09:20.420
+I'm more infamous internally at FastAPI Cloud, I would say, for a bunch of things.
+
+00:09:21.000 --> 00:09:23.860
+I've heard of emojis or something along those lines.
+
+00:09:23.920 --> 00:09:24.160
+One meme away.
+
+00:09:24.300 --> 00:09:25.580
+You're just one meme away.
+
+00:09:25.680 --> 00:09:26.820
+Just one meme away.
+
+00:09:27.020 --> 00:09:27.800
+Yeah, that's true.
+
+00:09:28.400 --> 00:09:30.000
+We keep piling them up internally.
+
+00:09:30.560 --> 00:09:34.060
+But yeah, I used to work with Patrick together for years, also on Strawberry.
+
+00:09:34.360 --> 00:09:35.280
+Same library as him.
+
+00:09:35.540 --> 00:09:36.340
+That's how I know him.
+
+00:09:36.620 --> 00:09:40.300
+And that's why I, well, made a weird sound when you said SOAP.
+
+00:09:41.320 --> 00:09:44.940
+I've been with FastAPI Cloud since EuroPython, actually, the last one.
+
+00:09:45.220 --> 00:09:48.620
+I promised Sebastian I would implement server-sent events in FastAPI,
+
+00:09:48.720 --> 00:09:54.220
+and I haven't started with it yet at all, but somehow I'm still here.
+
+00:09:54.290 --> 00:09:54.840
+So that's great.
+
+00:09:55.960 --> 00:10:00.420
+Well, actually, yeah, and it was actually like a sneak peek, I guess.
+
+00:10:00.540 --> 00:10:04.280
+We already started having a bunch of chats and discussing what we do.
+
+00:10:04.430 --> 00:10:05.980
+Should we do it here? Should we do it there?
+
+00:10:06.100 --> 00:10:06.660
+What should we do?
+
+00:10:07.140 --> 00:10:11.640
+So like, yeah, it's something that is coming to FastAPI probably soon-ish.
+
+00:10:11.800 --> 00:10:14.520
+Like there were a lot of things that needed to happen before that.
+
+00:10:14.700 --> 00:10:18.360
+Like Patrick is slightly smiley, like, oh no, this is pressure.
+
+00:10:19.740 --> 00:10:23.160
+There were some things that needed to happen in FastAPI, as you know, dropping support for
+
+00:10:23.420 --> 00:10:27.480
+Pydantic version one or things like that, that just made the internal code so complex.
+
+00:10:28.020 --> 00:10:32.820
+And now that it's over, we can actually work more on improving performance, adding features
+
+00:10:33.040 --> 00:10:33.760
+and things like that.
+
+00:10:34.220 --> 00:10:42.820
+I definitely want to dive into how FastAPI Cloud has sort of influenced the whole FastAPI side of things.
+
+00:10:43.480 --> 00:10:50.780
+But I was made aware that there is, in fact, an entire website, an entire website dedicated to the meme.
+
+00:10:51.180 --> 00:10:54.680
+Yeah. And out in the audience, we get, hey, everyone, is that the guy from the meme?
+
+00:10:54.960 --> 00:10:56.980
+And meme is greater than Nobel Prize.
+
+00:10:57.480 --> 00:10:59.040
+So, you know what?
+
+00:10:59.460 --> 00:11:00.280
+It may be true.
+
+00:11:00.660 --> 00:11:01.220
+It may be true.
+
+00:11:01.240 --> 00:11:04.540
+I recognize the person saying, this is the guy from the meme.
+
+00:11:04.720 --> 00:11:05.740
+He might be my husband.
+
+00:11:09.140 --> 00:11:09.580
+Incredible.
+
+00:11:11.500 --> 00:11:11.840
+Incredible.
+
+00:11:12.600 --> 00:11:16.100
+All right, well, let's start with FastAPI Cloud,
+
+00:11:16.920 --> 00:11:19.260
+and then we'll bring it back around to FastAPI.
+
+00:11:19.380 --> 00:11:21.300
+Let's talk origin story.
+
+00:11:21.540 --> 00:11:23.700
+So what is this FastAPI Cloud?
+
+00:11:23.900 --> 00:11:24.240
+Nice.
+
+00:11:24.660 --> 00:11:27.600
+So if you were looking at the FastAPI Labs website,
+
+00:11:27.660 --> 00:11:29.600
+that doesn't really show that much.
+
+00:11:29.660 --> 00:11:33.680
+If you click on the join the waiting list, that takes you to the website for FastAPI Cloud,
+
+00:11:34.200 --> 00:11:36.720
+there we can see this is what we are building.
+
+00:11:36.920 --> 00:11:38.120
+This is the thing that we are doing.
+
+00:11:39.120 --> 00:11:40.220
+It's actually super simple.
+
+00:11:40.400 --> 00:11:44.540
+The funny thing is that the pitch, the explanation of the product is so short.
+
+00:11:44.760 --> 00:11:45.520
+So it's one command.
+
+00:11:45.590 --> 00:11:46.600
+It's FastAPI Deploy.
+
+00:11:46.960 --> 00:11:51.400
+You have a FastAPI app, you just hit FastAPI Deploy, and then it's on the cloud.
+
+00:11:51.410 --> 00:11:52.520
+We take care of everything.
+
+00:11:52.590 --> 00:11:57.580
+We build the thing, deploy it, handle HTTPS, auto-scaling, all this stuff.
+
+00:11:57.920 --> 00:12:00.760
+And then you can just like focus on building the application,
+
+00:12:01.230 --> 00:12:01.580
+building apps.
+
+00:12:01.920 --> 00:12:03.760
+The funny thing is that it's super short to explain,
+
+00:12:03.880 --> 00:12:06.100
+but then building it is so complex, like.
+
+00:12:06.220 --> 00:12:07.020
+I'm glad it's so short.
+
+00:12:07.220 --> 00:12:08.000
+So thanks for being here.
+
+00:12:08.080 --> 00:12:08.960
+That was a great show, y'all.
+
+00:12:09.260 --> 00:12:10.040
+Well, no, just kidding.
+
+00:12:11.120 --> 00:12:14.960
+I think it's a little bit like Jupyter Notebooks
+
+00:12:15.050 --> 00:12:17.960
+in that sense that like you all are taking one for the team
+
+00:12:18.620 --> 00:12:21.600
+so that other people can have a simple experience.
+
+00:12:22.500 --> 00:12:25.100
+Whereas, you know, it's like those Jupyter folks,
+
+00:12:25.130 --> 00:12:27.620
+they write tons of TypeScript and do all sorts of things
+
+00:12:27.640 --> 00:12:30.380
+that nobody wants to necessarily do in the data science space
+
+00:12:31.040 --> 00:12:32.760
+so that you can just drag your widgets around.
+
+00:12:33.000 --> 00:12:33.360
+You know what I mean?
+
+00:12:33.500 --> 00:12:35.060
+I think that is a great analogy.
+
+00:12:35.360 --> 00:12:39.000
+I feel like the deployment space, it's a bit of a mixed bag.
+
+00:12:39.500 --> 00:12:44.540
+And I've been really frustrated to the point that I wrote a book about it,
+
+00:12:45.020 --> 00:12:47.340
+thinking about an alternative,
+
+00:12:48.000 --> 00:12:51.240
+and I think over the last five plus years,
+
+00:12:51.580 --> 00:12:55.080
+it's just trended towards a little more complex, a little more complex.
+
+00:12:56.000 --> 00:12:57.460
+Could we just add one of these things?
+
+00:12:57.680 --> 00:12:58.760
+And oh, now we got these three.
+
+00:12:58.860 --> 00:13:01.920
+We need one more thing to like make sure those things are doing, you know what I mean?
+
+00:13:02.500 --> 00:13:07.180
+And it's just like, wow, why are there 200 choices in my console to use this?
+
+00:13:07.440 --> 00:13:08.780
+Which is like kind of funny, right?
+
+00:13:08.920 --> 00:13:12.100
+Because I feel like a lot of these companies started with this, like,
+
+00:13:12.280 --> 00:13:15.300
+I don't want to understand all the ins and outs of all the infrastructure
+
+00:13:15.680 --> 00:13:17.400
+that comes with the cloud service provider.
+
+00:13:17.580 --> 00:13:19.840
+And that's really complicated to understand because I'm an app dev
+
+00:13:19.960 --> 00:13:22.380
+and I don't know anything about, you know, whatever, right?
+
+00:13:22.520 --> 00:13:25.400
+Now we're like, I don't know, kind of slowly accumulating
+
+00:13:25.460 --> 00:13:30.480
+complexity. But I think one of the cool things about what we're building, and I've worked
+
+00:13:30.480 --> 00:13:35.800
+on cloud tooling before, is like, this is like just bespoke for Python developers. And I think that's
+
+00:13:35.940 --> 00:13:41.220
+quite unique, in that we are really trying to, like, bring the bleeding edge and
+
+00:13:41.240 --> 00:13:45.420
+all the new tooling that people are using and making sure that we play well with, like, uv, and
+
+00:13:45.520 --> 00:13:52.180
+I think there's a lot of thought and care put into that by the team. That's a super good
+
+00:13:52.200 --> 00:13:56.880
+point. I mean, I remember Azure came out with like, here's your platform as a service. You just
+
+00:13:57.280 --> 00:14:01.640
+upload your web app and we'll just take it and go. And now that thing is so complicated along with
+
+00:14:01.860 --> 00:14:06.160
+many, many others, right? It's not just them. It's you've got AWS, you've got Vercel. There's
+
+00:14:06.240 --> 00:14:09.900
+lots of things we could point at for, wow, there's a lot of options here, you know?
+
+00:14:09.960 --> 00:14:13.920
+And then there are a lot of tools and like, you know, like many tools and many companies are also
+
+00:14:14.040 --> 00:14:18.379
+like doing a great job at many of the things that they are doing. But in many cases, it's just so
+
+00:14:18.400 --> 00:14:19.500
+complex, it's so complicated.
+
+00:14:19.740 --> 00:14:25.340
+You know, like I was, I have always been so, what, so adamant, I think is the word, about
+
+00:14:25.700 --> 00:14:27.580
+just teaching people how to use the tools.
+
+00:14:28.000 --> 00:14:33.540
+I think I have more documentation about how to deploy things on your own than any other
+
+00:14:33.760 --> 00:14:33.920
+framework.
+
+00:14:34.020 --> 00:14:35.360
+I have so much information.
+
+00:14:35.460 --> 00:14:36.940
+I hear that all the time from people.
+
+00:14:37.140 --> 00:14:41.100
+They say one of the reasons they chose FastAPI is because of how clear the documentation
+
+00:14:41.440 --> 00:14:41.760
+was, you know?
+
+00:14:41.760 --> 00:14:44.979
+And then the thing is, you know, like just learning all those concepts
+
+00:14:45.000 --> 00:14:47.520
+and learning all the stuff that needs to be learned
+
+00:14:47.750 --> 00:14:48.740
+just to deploy something,
+
+00:14:48.810 --> 00:14:50.900
+and then you barely have the minimum.
+
+00:14:51.460 --> 00:14:53.480
+It's like, this is just too much.
+
+00:14:53.640 --> 00:14:54.740
+It's too much complexity.
+
+00:14:54.960 --> 00:14:58.340
+I think for me, I guess personally,
+
+00:14:58.520 --> 00:15:00.400
+my analogy is that FastAPI Cloud
+
+00:15:00.560 --> 00:15:03.280
+is the equivalent of what FastAPI is
+
+00:15:03.350 --> 00:15:05.800
+to building web APIs and backend.
+
+00:15:06.140 --> 00:15:08.340
+You could do the same with any other framework.
+
+00:15:08.390 --> 00:15:09.580
+You could validate data.
+
+00:15:09.650 --> 00:15:10.940
+You could generate OpenAPI.
+
+00:15:11.070 --> 00:15:12.280
+You could have automatic docs.
+
+00:15:12.640 --> 00:15:19.340
+But you will probably have to do a lot of the wiring yourself and making sure that it's actually correct and that it doesn't explode, all that stuff.
+
+00:15:19.710 --> 00:15:25.600
+That is, you know, like we are trying to do a lot of that work for the final users.
+
+00:15:25.800 --> 00:15:26.840
+Yeah, and I think it's great.
+ +00:15:27.260 --> 00:15:36.500 +I think it's really nice to just provide this on-ramp because, as you said at the opening, when I asked, you know, what the origin story is just FastAPI deploy. + +00:15:37.050 --> 00:15:38.300 +That solves so many stories. + +00:15:38.580 --> 00:15:42.920 +But I'm sure behind the scenes, what happens is just about as simple as that. + +00:15:44.280 --> 00:15:44.940 +Oh my gosh. + +00:15:45.060 --> 00:15:45.720 +About that. + +00:15:48.460 --> 00:15:51.820 +Some of us don't even get to write Python anymore to make all of this happen. + +00:15:52.300 --> 00:15:54.480 +So speaking about taking one for the team. + +00:15:56.040 --> 00:15:57.620 +Yeah, that is taking one for a team, right? + +00:15:57.740 --> 00:15:58.200 +It is. + +00:16:00.220 --> 00:16:02.660 +This portion of Talk Python To Me is brought to you by us. + +00:16:02.980 --> 00:16:07.520 +I'm thrilled to announce a brand new app built for developers created by yours truly. + +00:16:07.980 --> 00:16:09.140 +It's called Command Book. + +00:16:09.930 --> 00:16:11.200 +You know that thing you do every morning? + +00:16:11.880 --> 00:16:14.460 +Open up six terminal tabs, CD into this directory, + +00:16:14.980 --> 00:16:16.020 +activate that virtual environment, + +00:16:16.550 --> 00:16:17.960 +run the server with --reload. + +00:16:18.260 --> 00:16:20.960 +Now, CD somewhere else, start the background worker, + +00:16:21.360 --> 00:16:23.940 +another tab for Docker, another one to tail production logs. + +00:16:24.400 --> 00:16:27.120 +Every tab just says Python, Python, Python, Docker tail. + +00:16:28.220 --> 00:16:29.160 +And you're clicking through them going, + +00:16:29.620 --> 00:16:30.840 +which Python was that again? + +00:16:31.260 --> 00:16:32.040 +Where my app is running? 
+ +00:16:32.620 --> 00:16:35.560 +Then sometime later, your dev server silently dies + +00:16:35.750 --> 00:16:36.879 +because it tried to reload + +00:16:36.900 --> 00:16:38.420 +while you're in the middle of a code edit, + +00:16:39.020 --> 00:16:41.580 +unmatched brace, a half-written import, or something. + +00:16:42.320 --> 00:16:43.340 +Now you're hunting through tabs + +00:16:43.360 --> 00:16:45.800 +to figure out which process crashed and how to restart it. + +00:16:46.280 --> 00:16:47.260 +My app, CommandBook, + +00:16:47.500 --> 00:16:50.820 +gives all of these long-running commands a permanent home. + +00:16:51.380 --> 00:16:52.560 +You save a command once, + +00:16:53.000 --> 00:16:54.180 +the working directory, the environment, + +00:16:54.500 --> 00:16:55.640 +three commands like git pull, + +00:16:55.980 --> 00:16:57.640 +and from then on, you just click run. + +00:16:58.180 --> 00:16:59.540 +You can even group commands together + +00:16:59.840 --> 00:17:01.640 +to start and stop everything for a project + +00:17:01.880 --> 00:17:02.520 +with a single click. + +00:17:02.960 --> 00:17:05.040 +It also has what I call honey badger mode, + +00:17:05.260 --> 00:17:06.439 +auto restart on crash. + +00:17:07.220 --> 00:17:09.640 +so when your dev server goes down mid-reload, + +00:17:10.199 --> 00:17:12.079 +Command Book just brings it right back up + +00:17:12.360 --> 00:17:14.400 +and does so over and over until the code is fixed. + +00:17:15.020 --> 00:17:16.819 +It also detects URLs from your output + +00:17:17.079 --> 00:17:19.400 +so you're never scrolling through thousands of lines of logs + +00:17:19.600 --> 00:17:21.560 +just to figure out how to reopen your web app. + +00:17:22.079 --> 00:17:23.780 +And it shows you uptime, memory usage, + +00:17:24.120 --> 00:17:25.959 +and all sorts of cool things about your process. + +00:17:26.660 --> 00:17:28.640 +The whole thing is a native macOS app. 
+
+00:17:28.780 --> 00:17:31.200
+No Electron, no Chromium, just 21 megs.
+
+00:17:31.640 --> 00:17:32.960
+And it comes with a full CLI
+
+00:17:33.180 --> 00:17:35.040
+so anything you've configured in the UI,
+
+00:17:35.420 --> 00:17:37.900
+you can fire off from your terminal with just a single command.
+
+00:17:38.440 --> 00:17:41.580
+Right now, it's macOS only, but if there's enough interest,
+
+00:17:41.840 --> 00:17:43.840
+I'll build a Windows version too, so let me know.
+
+00:17:44.660 --> 00:17:48.580
+Please check it out at talkpython.fm/commandbook.
+
+00:17:49.140 --> 00:17:51.640
+Download it for free, level up your developer workflow.
+
+00:17:52.080 --> 00:17:53.720
+The link is in your podcast player's show notes.
+
+00:17:54.320 --> 00:17:56.380
+That's talkpython.fm/commandbook.
+
+00:17:56.580 --> 00:17:58.500
+I really hope you enjoy this new app that I built.
+
+00:18:00.320 --> 00:18:02.540
+Let's save the internals for a little bit later.
+
+00:18:02.940 --> 00:18:04.440
+Maybe what we could do right now,
+
+00:18:04.540 --> 00:18:12.300
+maybe we could do a bit of a walkthrough of just kind of what it's like to set up an app from scratch, right?
+
+00:18:12.480 --> 00:18:12.700
+Nice.
+
+00:18:12.900 --> 00:18:20.220
+I see that uv is here, which is, I've certainly been an advocate for uv in all sorts of deployment,
+
+00:18:20.260 --> 00:18:25.380
+but especially when you have like repeated build type of scenarios for like Docker,
+
+00:18:26.060 --> 00:18:31.680
+Docker Compose or Kubernetes or whatever, uv makes that stuff so much faster and so on.
+
+00:18:31.780 --> 00:18:37.140
+So who would like to be my guide that just kind of talks us through what it means to set up a new project here?
+
+00:18:37.300 --> 00:18:41.500
+I mean, there is like this really nice command that Savannah built, just FastAPI new,
+
+00:18:41.920 --> 00:18:44.960
+which I think is something like, I don't know, like super helpful.
+
+00:18:45.160 --> 00:18:46.660
+What does FastAPI new do?
+
+00:18:46.880 --> 00:18:51.540
+Like, is that kind of a cookie cutter-esque experience or what is it?
+
+00:18:51.600 --> 00:18:52.080
+Yes, exactly.
+
+00:18:52.160 --> 00:18:56.960
+At the moment, it scaffolds a super basic FastAPI application using uv.
+
+00:18:57.360 --> 00:19:00.080
+It also installs dependencies, creates a folder, everything that you need.
+
+00:19:00.440 --> 00:19:03.240
+In the future, I think we're planning to add support for templates
+
+00:19:03.500 --> 00:19:05.940
+so you can build multiple kinds of things as well.
+
+00:19:06.210 --> 00:19:08.840
+But for now, it's just basically just uv FastAPI new,
+
+00:19:09.220 --> 00:19:10.640
+sorry, uvx FastAPI new,
+
+00:19:10.910 --> 00:19:12.540
+and then that scaffolds the project for you.
+
+00:19:12.900 --> 00:19:14.680
+I don't know if you want to try it live or...
+
+00:19:14.700 --> 00:19:15.440
+No, go ahead.
+
+00:19:15.580 --> 00:19:17.980
+Just, I would think it might disrupt you.
+
+00:19:18.030 --> 00:19:19.080
+Just let's talk us through it.
+
+00:19:19.240 --> 00:19:19.920
+It could work.
+
+00:19:20.240 --> 00:19:21.740
+I'm just going to put that out there.
+
+00:19:21.880 --> 00:19:23.000
+I'll tell you the most insane,
+
+00:19:23.330 --> 00:19:25.320
+like let's do that live on the podcast experience.
+
+00:19:25.520 --> 00:19:28.160
+I'm pretty sure, yeah, this is definitely the most insane.
+
+00:19:28.620 --> 00:19:36.320
+I had Matthew Rocklin on from Coiled, and those guys are all about like, hey, we're going to scale up like a bunch of available servers for you, right?
+
+00:19:36.700 --> 00:19:37.940
+So that you can do your data science.
+
+00:19:38.020 --> 00:19:41.000
+Like I want to do some ML thing, and it needs 500 servers.
+
+00:19:41.600 --> 00:19:45.520
+So during the podcast, he's, oh, let me just spin up 2,000 EC2 instances.
+
+00:19:45.800 --> 00:19:46.060
+Hold on.
+
+00:19:47.020 --> 00:19:49.040
+And then we ran some code on it during the show.
+
+00:19:49.100 --> 00:19:50.200
+And he's like, oh, let's try that on ARM.
+
+00:19:50.380 --> 00:19:53.040
+And then spin up another 2,000 on ARM Linux machines.
+
+00:19:53.120 --> 00:19:54.180
+I'm like, okay, that's nuts.
+
+00:19:55.440 --> 00:19:55.840
+But let's just.
+
+00:19:56.420 --> 00:19:57.180
+That's a lot of power.
+
+00:19:57.260 --> 00:20:01.960
+So I was impressed, but Patrick, sorry, I kind of derailed there.
+
+00:20:02.500 --> 00:20:02.980
+Let's talk through it.
+
+00:20:03.020 --> 00:20:08.260
+Yeah, so you do uvx FastAPI new, then you specify the name of the application.
+
+00:20:08.740 --> 00:20:10.080
+And that's almost there.
+
+00:20:10.080 --> 00:20:13.240
+You just need one more command to deploy, which is FastAPI deploy.
+
+00:20:13.640 --> 00:20:17.660
+The first time it's going to ask you to log in or join the waiting list if you haven't been invited yet.
+
+00:20:17.940 --> 00:20:18.940
+It's still in beta.
+
+00:20:19.500 --> 00:20:20.900
+And then you follow the steps.
+
+00:20:21.200 --> 00:20:24.640
+So like FastAPI deploy, log in, decide the team.
+
+00:20:25.040 --> 00:20:27.820
+If you have multiple teams, decide the application name,
+
+00:20:28.220 --> 00:20:31.020
+and then you wait a few seconds and the application is going to be live.
+
+00:20:31.100 --> 00:20:36.220
+And just to be clear, FastAPI new is not required if you already have a FastAPI app.
+
+00:20:36.520 --> 00:20:39.780
+If you've already written your own code and you have your application,
+
+00:20:40.120 --> 00:20:42.940
+you can just go right into logging in and deploying.
+
+00:20:43.440 --> 00:20:46.120
+This is just so that if you're starting something new,
+
+00:20:46.500 --> 00:20:49.900
+you don't have to do any thinking about all the right things that need to be there.
+
+00:20:50.180 --> 00:20:52.160
+So this is more of a greenfield application.
+
+00:20:52.880 --> 00:20:53.960
+I'm bootstrapping a project.
+
+00:20:54.240 --> 00:20:56.660
+Right, right, because you want to have the best structure.
+
+00:20:57.560 --> 00:20:58.980
+Now, it uses uv.
+
+00:20:59.420 --> 00:21:00.240
+It is not required.
+
+00:21:00.420 --> 00:21:03.900
+Yeah, I was going to say, do I have to use the uv project management type of thing?
+
+00:21:04.190 --> 00:21:08.100
+Do I have to use the uv.lock files and uv add, uv sync?
+
+00:21:08.360 --> 00:21:09.480
+Can I do requirements.txt?
+
+00:21:09.640 --> 00:21:10.380
+What's the story there?
+
+00:21:10.500 --> 00:21:13.320
+Yes, so we support uv with uv lock.
+
+00:21:13.390 --> 00:21:16.860
+We also support the, forget the name, the other, the PyLock file.
+
+00:21:17.200 --> 00:21:19.580
+And we also support plain requirements.txt.
+
+00:21:19.980 --> 00:21:22.200
+And maybe something else, I don't know, Jonathan, can you?
+
+00:21:22.700 --> 00:21:23.720
+PyLock's pretty new, right?
+
+00:21:23.900 --> 00:21:26.520
+I think Brett Cannon just got that out pretty recently, right?
+
+00:21:26.820 --> 00:21:27.680
+Brett was pretty excited.
+
+00:21:27.980 --> 00:21:28.360
+I know.
+
+00:21:29.240 --> 00:21:29.960
+Implemented that.
+
+00:21:30.240 --> 00:21:30.660
+Oh, was he?
+
+00:21:30.850 --> 00:21:31.620
+Okay, I'm sure he was.
+
+00:21:31.800 --> 00:21:32.160
+That's awesome.
+
+00:21:32.260 --> 00:21:33.740
+He put years of work into that.
+
+00:21:33.860 --> 00:21:38.100
+And he also said that one of the motivations was also like, you know, like cloud providers.
+
+00:21:38.560 --> 00:21:39.200
+So it's like, yes.
+
+00:21:40.600 --> 00:21:49.220
+The other thing is like, you know, if you use other different package managers, if they use the standard pyproject.toml format, that will also be supported.
+
+00:21:49.500 --> 00:21:56.300
+That means that, you know, like if you use PDM or if you use Poetry with one of the recent versions, like that will work.
+
+00:21:56.600 --> 00:22:03.100
+If you use a very old version of Poetry or like you use some other strange package manager or something, that will probably be problematic.
+
+00:22:03.500 --> 00:22:08.440
+But for like most of the use cases that use the standard package formats, it will just work.
+
+00:22:08.640 --> 00:22:14.080
+And if you use uv, then like you're going to have the best experience because we are fans of uv and Astral.
+
+00:22:14.220 --> 00:22:19.380
+They've definitely put a dent in the way that sort of Python gets started and making that a lot easier.
+
+00:22:19.490 --> 00:22:20.500
+So it totally makes sense.
+
+00:22:21.310 --> 00:22:28.920
+And also, I noticed, speaking of uv, that there's, at least in the recommended way, or the way in the docs, let's say,
+
+00:22:29.580 --> 00:22:32.240
+it doesn't say, here's how you install FastAPI.
+
+00:22:32.650 --> 00:22:41.660
+You just, here's how you run FastAPI new, leveraging uv, which then will silently install and manage.
+
+00:22:42.120 --> 00:22:43.260
+All right, that's pretty neat.
+
+00:22:43.420 --> 00:22:46.960
+That helps you guys tell a simpler story, right?
+
+00:22:47.350 --> 00:22:51.380
+Instead of, here's how you create the virtual environment to install our thing and so on, you know?
+
+00:22:51.380 --> 00:22:56.340
+The idea is to make it like, as I was saying, just super simple for people just to start from scratch.
+
+00:22:56.680 --> 00:23:00.760
+Like no idea how to create an app, how to start, how to create an environment.
+
+00:23:00.890 --> 00:23:03.340
+It's just you run this command and you're off to go.
+
+00:23:03.760 --> 00:23:06.200
+Off to the races, I'm mixing sayings.
+
+00:23:06.820 --> 00:23:09.460
+Anyway, that's what Colombians do.
+
+00:23:09.920 --> 00:23:17.880
+But then if you already have an app, you have like, you know, like anything with FastAPI standard installed, then like that also just works.
+ +00:23:18.010 --> 00:23:21.000 +And Savannah, you pointed out that it doesn't have to be a new project. + +00:23:21.620 --> 00:23:23.860 +If you want to start from an existing one, that's totally fine. + +00:23:24.300 --> 00:23:28.820 +But what do I got to do if I'm starting from, if I'm migrating an existing one? + +00:23:29.040 --> 00:23:30.320 +Like how easy or hard is this? + +00:23:30.580 --> 00:23:36.860 +I have some like legacy project demo apps I've built at other companies I've worked with that have used FastAPI. + +00:23:37.120 --> 00:23:44.460 +And I literally just ran like FastAPI login and then FastAPI deploy and it just worked, which felt really magical. + +00:23:44.740 --> 00:23:58.860 +Right. Like I think that's like, I don't know, like having worked on cloud products for quite a while, like I think one of the biggest gaps is like the just I don't know, like the disparity between like my local dev environment and what is actually like lives up in the cloud somewhere. + +00:23:59.360 --> 00:24:08.160 +And so being able to just run one command and having the project as it exists on my machine go and work somewhere without having to think about like the infrastructure. + +00:24:08.340 --> 00:24:14.920 +And of course, like, you know, we want to be like amenable to folks who do want a little bit, you know, like higher touch. + +00:24:15.480 --> 00:24:19.280 +But we also want to work for people who are like learning FastAPI and Python, right? + +00:24:19.400 --> 00:24:22.240 +Like educators and people that are teaching Python. + +00:24:22.420 --> 00:24:26.600 +I think this is like something that you've had some interest in as well from those folks. 
+

00:24:26.720 --> 00:24:44.200
Yeah, I was just listening to the Teaching Python podcast folks just the other day and thinking, you know, like this, when I look at this, I know this is not necessarily your focus, but certainly people who are trying to teach a class, be it college class or high school class or whatever.

00:24:44.840 --> 00:24:49.020
And if you build anything on the web, the next question is, this is cool.

00:24:49.340 --> 00:24:50.420
How do I share it with people?

00:24:50.620 --> 00:24:51.800
And then they're like, oh, no.

00:24:52.420 --> 00:24:53.500
Oh, no.

00:24:53.720 --> 00:24:54.080
Hold on.

00:24:54.280 --> 00:24:55.560
Like coding boot camps, right?

00:24:55.640 --> 00:25:01.780
Like if you're teaching someone how to write Python or how to build an API with FastAPI,

00:25:02.110 --> 00:25:06.480
like actually setting up the environment for them to deploy is not part of it, right?

00:25:06.720 --> 00:25:08.660
Like that's not actually part of the curriculum.

00:25:08.840 --> 00:25:12.880
It's like this peripheral thing that ends up eating up a bunch of the educator's time or

00:25:12.890 --> 00:25:16.680
the student's time trying to understand both like how to write code and then also understand

00:25:16.840 --> 00:25:17.240
cloud stuff.

00:25:17.520 --> 00:25:20.380
And that's like a lot to ask people when they're fresh out the gate.

00:25:20.520 --> 00:25:23.720
I feel the same way about like tutorials and stuff at conferences.

00:25:24.200 --> 00:25:25.240
Yeah, totally.

00:25:25.780 --> 00:25:29.820
Yeah. Or training sessions. If you're doing like corporate training or like, they're all like,

00:25:29.940 --> 00:25:36.120
Oh, well, let's get everybody's machine working. There goes an hour and whatever. 
But yeah,

00:25:36.140 --> 00:25:40.520
if you can just say, look, I think when you're either, when you're trying to learn something,

00:25:40.960 --> 00:25:45.860
be it through school or on your own or through these like more structured ways,

00:25:46.180 --> 00:25:50.760
like bootcamps and training and so on. I think if it's not the main purpose,

00:25:51.380 --> 00:25:56.200
I feel so often there's like, we're going to do 20 steps for four hours before you get any sort of

00:25:56.520 --> 00:26:01.080
reward of what you've done. And if you can go, okay, do you have it running? Okay, now you run

00:26:01.180 --> 00:26:04.880
this command. Look, now it's on the internet. Like, oh, wait, awesome. I got an app on the internet.

00:26:05.140 --> 00:26:09.780
Everybody look at me. You know what I mean? And I think shortening that cycle to where people can

00:26:10.000 --> 00:26:14.780
have that aha moment. And then later they can dive into like, well, how is it really working? And what

00:26:14.780 --> 00:26:21.340
do we really need to understand? But that quick iteration cycle, especially in the early parts of

00:26:21.360 --> 00:26:24.980
learning new tech. It's really important. But also, you know, like down the line as well, I think,

00:26:25.280 --> 00:26:29.400
like, I don't know, there are so many things that I have been wanting to build and I don't,

00:26:29.780 --> 00:26:34.320
but I didn't because it was just so complex to deploy stuff. You know, like knowing,

00:26:34.680 --> 00:26:39.560
knowing how to do the whole thing, how to set up the clusters, the machines, install the Linux

00:26:40.299 --> 00:26:44.060
systems, deploy the cluster, whatever, like all that stuff, deploy the things,

00:26:44.360 --> 00:26:49.940
handling load balancers and HTTPS. I'm like, you know, like I know how to do that. 
I built one of

00:26:49.960 --> 00:26:54.400
the most popular websites teaching how to use Docker Swarm, which was like the contender

00:26:54.620 --> 00:26:56.160
before Kubernetes won everything.

00:26:56.580 --> 00:26:56.960
I like it.

00:26:57.380 --> 00:27:02.520
But still, it's just so complicated, like doing all those steps that are like, yeah, no, I'll

00:27:02.660 --> 00:27:03.380
just not do it.

00:27:04.000 --> 00:27:04.780
Like some other day.

00:27:05.360 --> 00:27:09.500
Now I can just like play around and do random stuff and just like deploy when it just works.

00:27:09.860 --> 00:27:11.300
It is, I really like that.

00:27:11.300 --> 00:27:16.220
I guess like coming back to that, like taking one for the team point earlier, like I feel

00:27:16.240 --> 00:27:20.860
like building Python tooling. It's kind of like taking one for the team sometimes because you have

00:27:20.860 --> 00:27:25.920
these folks that are like, you know, brand new to Python. Like Python is an extremely approachable

00:27:26.000 --> 00:27:30.160
language for people who are new to writing code. But then, you know, we also want to make FastAPI

00:27:30.420 --> 00:27:34.760
Cloud work for someone that's building like an enterprise grade application, right? And so like,

00:27:35.240 --> 00:27:41.159
like pretty wide spectrum of folks with like a million different use cases and different types

00:27:41.180 --> 00:27:45.760
of applications they want to deploy with different constraints and like security stuff.

00:27:46.180 --> 00:27:49.540
And like, so yeah, I think, I don't know, maybe that's just like Python tooling.

00:27:50.060 --> 00:27:53.600
It's a lot of work, I guess, to like build something that works for the masses. 
+ +00:27:54.000 --> 00:27:57.000 +Yeah, well, it's certainly tough to make something that feels simple, + +00:27:57.520 --> 00:28:01.260 +but it's not overly simplistic, you know, that can actually solve the problems. + +00:28:01.540 --> 00:28:04.200 +Has the right knobs for the right users too, right? + +00:28:04.300 --> 00:28:07.920 +I would argue we're not only trying to do it simple and easy. + +00:28:08.220 --> 00:28:12.440 +I feel like we're choosing a particular flavor of simple, which is... + +00:28:13.400 --> 00:28:14.880 +We have this discussion a few times. + +00:28:15.080 --> 00:28:18.620 +It's like, if you make a cloud, how do we make it feel Pythonic? + +00:28:18.760 --> 00:28:20.480 +What does that mean in a cloud setting? + +00:28:20.900 --> 00:28:25.340 +We talk about Pythonic libraries, Pythonic coding style in the community a lot. + +00:28:25.700 --> 00:28:29.620 +And now we kind of try to transfer that flavor, that feeling to the cloud + +00:28:29.700 --> 00:28:33.720 +and make everything around that feel just like we want our libraries to feel. + +00:28:34.020 --> 00:28:36.960 +So you feel at home as a Python developer and it just feels right. + +00:28:37.280 --> 00:28:40.160 +So that's extra step on top of making it simple. + +00:28:40.560 --> 00:28:41.560 +And we discuss that a lot. + +00:28:41.820 --> 00:28:42.740 +That's how I feel about it. + +00:28:42.770 --> 00:28:43.160 +I love it. + +00:28:43.170 --> 00:28:46.860 +I think it's one of the coolest things about this team. + +00:28:47.440 --> 00:28:50.280 +Like, you know, like people are being able to hear a few of us. + +00:28:50.360 --> 00:28:55.040 +There's like, there are like a bunch of others, but like that each one of us is so passionate + +00:28:55.230 --> 00:28:57.320 +about the things that we are working on. + +00:28:57.760 --> 00:29:01.780 +So like, you know, like each one of us is trying to make the best out of the things that + +00:29:01.960 --> 00:29:02.540 +we are building. 
+ +00:29:02.860 --> 00:29:07.040 +And then like, we are so passionate about the thing that we care about and that we are building. + +00:29:07.160 --> 00:29:10.740 +that I think that ends up being an amazing result. + +00:29:11.000 --> 00:29:12.360 +For example, the CLI. + +00:29:12.660 --> 00:29:16.960 +We wanted to have some specific, you know, like behavior, + +00:29:17.140 --> 00:29:18.140 +some look and feel. + +00:29:18.220 --> 00:29:22.200 +And like we wanted to be able to have like the best kind of CLIs. + +00:29:22.320 --> 00:29:25.700 +So Patrick went ahead and built this bunch of tooling + +00:29:25.840 --> 00:29:27.380 +that we needed to be able to have it + +00:29:27.460 --> 00:29:28.720 +and like made it open source and everything. + +00:29:29.160 --> 00:29:32.780 +So we could have this great experience when working with CLIs. + +00:29:33.140 --> 00:29:35.859 +Jonathan recently was doing so much stuff + +00:29:35.880 --> 00:29:38.580 +about the something that caches and handling security, + +00:29:38.940 --> 00:29:40.680 +making sure that everything was super secure, + +00:29:41.060 --> 00:29:42.420 +super fast, super snappy. + +00:29:42.800 --> 00:29:46.220 +You know, like Alejandra is super careful about all the UI. + +00:29:46.740 --> 00:29:48.820 +Martin is super careful about all the infra. + +00:29:49.200 --> 00:29:52.940 +You know, it's like this passionate mess, + +00:29:53.320 --> 00:29:54.420 +which is a word I just made up. + +00:29:57.340 --> 00:29:58.600 +This, Alejandra goes and says, + +00:29:58.640 --> 00:30:01.220 +like, this thing doesn't have the proper margins. + +00:30:01.320 --> 00:30:02.920 +We need to increase this a little bit. + +00:30:02.980 --> 00:30:04.340 +I don't like it. + +00:30:04.360 --> 00:30:05.360 +She just goes and fixes it. + +00:30:05.600 --> 00:30:06.100 +The same with Marvin. + +00:30:06.190 --> 00:30:09.340 +He says, like, we need to have, like, this sort of thing in infrastructure. 
+

00:30:09.550 --> 00:30:12.040
And, like, just comes and tells me, hey, we are doing this.

00:30:12.140 --> 00:30:12.800
And I'm like, yes, sir.

00:30:13.340 --> 00:30:16.260
Like this with the OpenVPN, like, Unix, for example,

00:30:16.350 --> 00:30:18.100
that is mainly focused on the open source,

00:30:18.320 --> 00:30:21.960
is constantly looking at all the discussions, PRs, conversations,

00:30:22.340 --> 00:30:24.840
making sure that everything that we do, that doesn't.

00:30:24.840 --> 00:30:26.300
So why, you know, like, there have been, like,

00:30:26.810 --> 00:30:31.159
recently way more releases of FastAPI and friends of the open source projects

00:30:31.620 --> 00:30:33.520
and very fast bug fixes,

00:30:34.150 --> 00:30:37.060
very fast responses to handle everything for the community.

00:30:37.600 --> 00:30:40.360
Now we actually have people that is paying attention constantly

00:30:40.720 --> 00:30:41.460
to what is happening,

00:30:42.320 --> 00:30:43.360
what are the things that we have to do,

00:30:43.700 --> 00:30:46.620
and that really care about that part as well.

00:30:46.650 --> 00:30:50.220
So I think this extreme care about what we are doing.

00:30:50.440 --> 00:30:52.180
You know, like Savannah is making Python.

00:30:53.060 --> 00:30:56.860
This detail that each one of us cares so, so much

00:30:57.100 --> 00:30:58.500
about each one of the things that we build,

00:30:59.540 --> 00:31:01.620
making sure that the product is actually amazing.

00:31:01.840 --> 00:31:03.140
It's as good as it can be,

00:31:03.280 --> 00:31:05.500
and we can all feel at home when...

00:31:06.940 --> 00:31:08.920
I get so excited for talking about it

00:31:08.980 --> 00:31:12.140
because I really enjoy the end result of the product

00:31:12.220 --> 00:31:13.280
and of being able to use it.

00:31:13.820 --> 00:31:15.020
I would use it in the end. 
+

00:31:15.140 --> 00:31:16.420
I would love to work with it.

00:31:16.800 --> 00:31:17.560
It's super simple.

00:31:17.660 --> 00:31:18.300
Yeah, that's awesome.

00:31:18.480 --> 00:31:19.680
Hey, let me adjust your mic real quick.

00:31:20.140 --> 00:31:21.840
I think it was like ducking,

00:31:22.300 --> 00:31:23.080
ducking out a little bit.

00:31:23.080 --> 00:31:24.000
We just went through a lot,

00:31:24.140 --> 00:31:26.399
a lot of content and a lot of sweating

00:31:26.420 --> 00:31:31.380
because your microphone went through like six different stages of weirdness.

00:31:31.720 --> 00:31:39.180
I think that really leads to like something I wanted to talk about is just what impact has this had on FastAPI?

00:31:39.180 --> 00:31:44.640
And before you jump in and answer that question, everyone, there's especially I think with Astral,

00:31:44.820 --> 00:31:49.920
because they've had so much success, there's been an undercurrent of concern of like,

00:31:49.920 --> 00:31:53.200
oh, my gosh, commercialism is getting into our open source.

00:31:53.480 --> 00:31:56.440
And what if it pollutes it and causes these negative aspects?

00:31:57.560 --> 00:32:02.500
But just hearing all of the energy around FastAPI with so many people,

00:32:03.140 --> 00:32:05.580
because of FastAPI Cloud, that's super neat.

00:32:05.770 --> 00:32:07.260
So I wanted to throw out to you all,

00:32:07.660 --> 00:32:12.100
how has this building FastAPI Cloud and the existence of FastAPI Cloud

00:32:12.300 --> 00:32:14.840
been giving back to FastAPI, I guess?

00:32:14.980 --> 00:32:17.560
I'm waiting to see if someone will speak FastAPI.

00:32:18.760 --> 00:32:20.580
I'm always the one that is speaking the most.

00:32:22.400 --> 00:32:23.920
I mean, it might be your project.

00:32:24.220 --> 00:32:25.680
Like, you may have started the project. 
+

00:32:25.900 --> 00:32:26.600
Yeah, maybe so.

00:32:27.020 --> 00:32:31.220
Like, last year, I had, like, a few keynotes in some PyCons in different places.

00:32:31.320 --> 00:32:41.660
And, like, one of the key points that I wanted to bring was this idea that I'm trying to show that, in many cases, people worry about the bus factor.

00:32:41.960 --> 00:32:43.580
And the bus factor is just this idea.

00:32:43.840 --> 00:32:45.180
Yes, yes, I've heard this, yes.

00:32:45.320 --> 00:32:51.320
Yeah, you know, like, the bus factor is the idea that, oh, what happens if, like, there's one person doing this work?

00:32:51.420 --> 00:32:54.180
What happens if a bus runs over this person?

00:32:54.860 --> 00:32:58.260
And there's so much worry about this bus factor.

00:32:58.420 --> 00:33:01.960
It's sort of a morbid analogy, but I understand, right?

00:33:02.040 --> 00:33:06.980
Like, what will happen to the open source project if the maintainer vanishes for some reason, right?

00:33:07.140 --> 00:33:07.460
Exactly.

00:33:08.080 --> 00:33:11.160
But, you know, like, it also applies to projects and to many other different things.

00:33:11.600 --> 00:33:18.760
But what I think is that it's a disproportionate amount of attention to this detail of the bus factor.

00:33:19.180 --> 00:33:30.300
And I think every time people talk about the bus factor, you know, like one of my points in what I was trying to say in these talks was I would like people to think about the bus ticket factor.

00:33:30.720 --> 00:33:32.140
Who is paying for those tickets?

00:33:32.400 --> 00:33:34.140
It doesn't matter how big is the team.

00:33:34.440 --> 00:33:37.960
You know, like you have seen Google, Amazon, Meta, all the big ones.

00:33:38.240 --> 00:33:39.960
They don't have a small bus factor.

00:33:40.020 --> 00:33:45.100
They have a lot of people in their payroll and still they finish products. 
+

00:33:45.340 --> 00:33:46.340
They just cancel them.

00:33:46.660 --> 00:33:48.940
Open source or private or whatever.

00:33:49.620 --> 00:33:51.820
is not the main factor

00:33:52.680 --> 00:33:54.640
defining the success of a project,

00:33:55.080 --> 00:33:57.240
being it commercial or open source of any type,

00:33:57.700 --> 00:34:00.280
is not really how many people are behind it.

00:34:00.640 --> 00:34:02.960
It's more of what is the value

00:34:03.140 --> 00:34:05.380
that whoever is putting the effort to keep it alive

00:34:05.840 --> 00:34:07.660
is getting from putting all that effort.

00:34:07.980 --> 00:34:09.020
It could be just satisfaction.

00:34:09.419 --> 00:34:10.600
It could be like open source,

00:34:10.679 --> 00:34:13.240
like, oh, I feel so good that I'm contributing to society.

00:34:13.360 --> 00:34:13.940
And that is valid.

00:34:14.600 --> 00:34:15.440
It doesn't pay the rent,

00:34:15.899 --> 00:34:16.899
but it's still valid.

00:34:17.080 --> 00:34:18.100
It might last for a while.

00:34:18.440 --> 00:34:25.820
But then also like, you know, like when you see like there are so many Python projects, so many Python, so many open source projects that can do well or can do bad.

00:34:26.000 --> 00:34:28.300
And it doesn't really depend on how many people they have.

00:34:28.820 --> 00:34:40.780
And when you are using a project, when you're using an open source project or when you are using a product of any type, I will encourage you to think about what is the bus ticket factor of this project?

00:34:41.260 --> 00:34:47.500
What are the things that whoever is building this is receiving in exchange for giving it away?

00:34:47.940 --> 00:34:51.840
So like, you know, like what are they expecting to sell you at some point?

00:34:52.300 --> 00:34:54.500
Or what are they receiving in exchange? 
+

00:34:55.080 --> 00:34:58.520
You know, for example, Bun, the JavaScript runtime.

00:34:58.820 --> 00:35:00.880
Like it was like, we don't know what they're going to sell.

00:35:01.040 --> 00:35:06.940
But now, you know, Claude and Anthropic really want to have like this thing keep working because they are using it internally.

00:35:07.000 --> 00:35:09.360
So you can say like, OK, I'm going to use it.

00:35:09.420 --> 00:35:10.340
I'm going to use it for free.

00:35:10.500 --> 00:35:15.840
I know that what they receive for me using is just like that they just really want it.

00:35:15.980 --> 00:35:17.740
So I can just like, whenever you are using Bun,

00:35:17.860 --> 00:35:20.380
you are getting, now you are getting free services

00:35:20.600 --> 00:35:21.460
from Anthropic, that's it.

00:35:21.860 --> 00:35:23.940
But you know, like every time you are using a project,

00:35:24.040 --> 00:35:26.680
you can think about what are people receiving in exchange

00:35:27.080 --> 00:35:28.700
for giving this away for me?

00:35:28.960 --> 00:35:31.100
This is like the thing that I would like people

00:35:31.340 --> 00:35:33.800
to think about, you know, like also like

00:35:34.020 --> 00:35:35.120
how can they give back?

00:35:35.280 --> 00:35:37.860
Maybe they can actually contribute to that community

00:35:38.080 --> 00:35:38.660
or to that project.

00:35:38.800 --> 00:35:40.440
There are many ways and in many cases,

00:35:41.140 --> 00:35:43.260
the thing that is needed the most is just like help

00:35:43.620 --> 00:35:45.420
and work, just answering questions and issues. 
+

00:35:47.260 --> 00:35:52.320
This portion of Talk Python To Me is brought to you by us. I'm excited to talk about my first solo

00:35:52.520 --> 00:35:57.780
book, Talk Python in Production. It's an inside look at how we host all the Talk Python sites,

00:35:58.120 --> 00:36:03.440
APIs, mobile apps, and way more. Here's the thing: I believe most hosting stories sold to developers

00:36:03.520 --> 00:36:08.720
and data scientists are way overcomplicated and overpriced. You've heard me say you're not Google,

00:36:09.000 --> 00:36:13.859
you're not Netflix, so you shouldn't run your infrastructure the way they do. But if not that,

00:36:13.960 --> 00:36:20.140
then what? This book is both a blueprint for what I chose for Talk Python and a story arc of 10 years

00:36:20.140 --> 00:36:25.680
of running my own infrastructure from a complete newbie, apprehensive to Linux, to some pretty

00:36:25.860 --> 00:36:31.940
neat infrastructure-as-code DevOps. It covers Docker, Nginx, Let's Encrypt, self-hosted analytics and

00:36:32.140 --> 00:36:37.880
monitoring, CDN setup, framework migrations, and a whole philosophy that I've termed stack native,

00:36:38.460 --> 00:36:42.980
keeping things streamlined, powerful, and free of cloud lock-in. And it's more than just your

00:36:43.000 --> 00:36:48.120
standard tech book. It comes with code and figure galleries on GitHub, a discussion forum, and

00:36:48.500 --> 00:36:54.020
something unique, over an hour of audio reader's briefs, short conversations that bookend each

00:36:54.460 --> 00:36:59.920
chapter to prime your focus or broaden your takeaways. Oh, and 0% of this book was written

00:37:00.040 --> 00:37:03.960
by AI. Every word is mine, written over the course of nine months, for better or worse.

00:37:04.500 --> 00:37:09.820
I've made the first third of the book available for free online. 
After that, you can grab the DRM + +00:37:09.840 --> 00:37:15.380 +free EPUB and Kindle editions. And I'm working on a paperback edition as well. Please check it out + +00:37:15.460 --> 00:37:20.740 +at talkpython.fm/DevOps, or just click book in the nav bar on the website. It's a great way to + +00:37:20.820 --> 00:37:25.040 +support the podcast. And I hope it changes a bit how you think about running your apps in production. + +00:37:26.040 --> 00:37:30.480 +Kind of related to what you're saying, I think one of the angles that I really appreciate about + +00:37:30.560 --> 00:37:36.160 +the way we think about FastAPI and FastAPI Cloud is like where like a lot of our team was involved + +00:37:36.180 --> 00:37:41.020 +in open source before coming to work at FastAPI Cloud on various projects around the Python + +00:37:41.260 --> 00:37:42.240 +ecosystem, outside of Python. + +00:37:42.740 --> 00:37:48.140 +And I think all of us have deep appreciation and understanding of the value of open source + +00:37:48.780 --> 00:37:53.460 +and really, really try and build in a way that is like, I mean, Sebastian, you've talked + +00:37:53.560 --> 00:37:56.540 +about this a lot, but solving a real problem for folks, right? + +00:37:56.840 --> 00:38:01.920 +And so FastAPI Cloud is sort of this extension of this open source ecosystem people would + +00:38:01.980 --> 00:38:02.380 +be using. + +00:38:03.680 --> 00:38:05.120 +FastAPI Cloud may be an option. + +00:38:05.500 --> 00:38:07.580 +Maybe someone picks some other cloud for some reason. + +00:38:07.970 --> 00:38:10.580 +I don't think like, I think we're all very mindful of that. 
+

00:38:10.790 --> 00:38:12.520
But like the angle that's very cool, I think,

00:38:12.600 --> 00:38:15.600
is that like, because we all work at FastAPI Cloud,

00:38:15.880 --> 00:38:17.580
like I know that I personally have time,

00:38:17.860 --> 00:38:19.220
more time for my open source work

00:38:19.380 --> 00:38:21.420
and my employer understands the value

00:38:21.560 --> 00:38:22.540
of my open source work,

00:38:23.020 --> 00:38:25.160
which is net positive for the open source community.

00:38:25.290 --> 00:38:27.300
Like I get to work on CPython sometimes

00:38:27.820 --> 00:38:29.780
and I have, you know, the bandwidth

00:38:29.950 --> 00:38:31.480
to go and do my steering council work

00:38:31.540 --> 00:38:32.880
or upcoming release management work.

00:38:33.300 --> 00:38:34.880
I understand like this sort of like,

00:38:35.420 --> 00:38:39.200
tempering, like open source, commercial, bad, all bad.

00:38:39.400 --> 00:38:39.940
It's not all bad.

00:38:40.060 --> 00:38:41.920
It's actually like really good in a lot of cases

00:38:42.280 --> 00:38:43.740
for folks to build business.

00:38:43.920 --> 00:38:46.320
Look at uv for an example to hold up, right?

00:38:46.420 --> 00:38:47.700
Astral, yeah, yeah, totally.

00:38:47.860 --> 00:38:48.160
Yeah, yeah.

00:38:48.340 --> 00:38:50.180
I think there are some really good examples of this.

00:38:50.370 --> 00:38:52.900
So I think like that's another angle that,

00:38:52.980 --> 00:38:55.480
I mean, I really, I get a lot of energy out of our team

00:38:55.890 --> 00:38:57.380
because we all, I don't have to,

00:38:57.630 --> 00:38:59.220
I don't have to fight the open source battle

00:38:59.580 --> 00:39:00.860
at FastAPI Cloud.

00:39:01.160 --> 00:39:02.080
I think that's really cool.

00:39:02.260 --> 00:39:03.620
I do think that's super cool as well. 
+ +00:39:03.660 --> 00:39:05.260 +Let me put out two examples for you. + +00:39:05.340 --> 00:39:11.840 +here that I think everyone will be aware of as sort of to add to what Sebastian was saying is + +00:39:12.140 --> 00:39:17.500 +look how much Apple freaked out when Steve Jobs died and how many people work at Apple, right? + +00:39:17.940 --> 00:39:23.040 +Like that was still like, oh my gosh. But, you know, I think there's, they're hanging in there. + +00:39:23.140 --> 00:39:27.920 +They're going to be probably making it. They are not in our business. I tell you what, + +00:39:27.950 --> 00:39:35.300 +they got some of my money. That's for sure. But also, you know, look at Flask, right? Armin + +00:39:35.500 --> 00:39:41.000 +drifted away, which is totally fine. And David Lord and Pallets picked it up and kept running, + +00:39:41.090 --> 00:39:46.020 +right? Like it's still one of the most popular frameworks out there, right? So it's, I think + +00:39:46.030 --> 00:39:51.200 +the bus factor is over, overblown a bit, but also looking at the team of folks here, I think it's, + +00:39:51.340 --> 00:39:54.880 +it's even more obvious that there's a bunch of people who are on the inside, you know? + +00:39:54.990 --> 00:39:59.560 +For example, Flask, you know, like I learned so many things from Flask and like, the thing is, + +00:39:59.750 --> 00:40:04.020 +I feel like sometimes, sometimes people go and complain about the tool and say like, oh, + +00:40:04.140 --> 00:40:08.980 +this is not working for this or for that. And in many cases, it's in this insensitive way towards + +00:40:09.260 --> 00:40:13.360 +the people that are working on that. And it's like, you know, like in the end, realize that + +00:40:13.480 --> 00:40:17.440 +there's actually people behind the scenes doing the work. And like, in many cases, it's just like + +00:40:17.620 --> 00:40:22.480 +one or two people doing a lot of work in many cases, just for free. 
And, you know, like, I think

00:40:22.820 --> 00:40:27.480
it's worth calling that out. Like all the work that David Lord does for Flask is just like so

00:40:27.600 --> 00:40:32.780
much work. And yeah, deserves a lot of respect. I totally agree. The other thing that I forgot to

00:40:32.800 --> 00:40:37.740
mention is that there are so many ideas of potential products that I could build over the years, and I

00:40:37.820 --> 00:40:42.840
never did, and I never started a company because I didn't have clarity of what will be a good thing

00:40:43.140 --> 00:40:49.860
to actually sell and will have a good alignment. The cloud product has such a good alignment with

00:40:49.860 --> 00:40:57.640
the open source side because as more successful FastAPI is, the more successful FastAPI Cloud

00:40:57.920 --> 00:40:59.140
has a potential to be.

00:40:59.840 --> 00:41:02.640
The more people using Python effectively,

00:41:03.040 --> 00:41:05.800
the more people might end up checking out FastAPI

00:41:05.820 --> 00:41:07.800
and the more people might end up checking out the product.

00:41:08.240 --> 00:41:10.980
So if FastAPI does well, if the open source does well,

00:41:11.060 --> 00:41:13.140
if Python does well, that's better for the company.

00:41:13.340 --> 00:41:17.420
So it doesn't really depend on my personal principles

00:41:17.580 --> 00:41:19.020
and values or something like that.

00:41:19.340 --> 00:41:23.680
It's aligned with, it's financially aligned with the company.

00:41:24.040 --> 00:41:27.620
So it's just going to be beneficial in the end

00:41:27.640 --> 00:41:29.680
It doesn't depend on good intentions.

00:41:30.519 --> 00:41:31.760
And FastAPI is open source.

00:41:31.980 --> 00:41:33.800
It has like 7,000 forks or something.

00:41:34.020 --> 00:41:36.320
So if a bus runs over me,

00:41:37.020 --> 00:41:38.100
there are 7,000 forks. 
+ +00:41:38.220 --> 00:41:39.000 +It's not going away. + +00:41:39.160 --> 00:41:40.420 +I definitely agree with you on that. + +00:41:40.470 --> 00:41:43.300 +I feel like I should maybe give a little bit of a, + +00:41:43.860 --> 00:41:44.880 +I'll tell a little bit of the story + +00:41:45.130 --> 00:41:46.120 +of what's going on with, + +00:41:46.910 --> 00:41:47.600 +where did I put it? + +00:41:47.840 --> 00:41:48.780 +I don't think I pasted it over here, + +00:41:49.200 --> 00:41:51.160 +is what's going on with Tailwind right now. + +00:41:51.550 --> 00:41:54.180 +And I think Tailwind is having a tough time, + +00:41:54.280 --> 00:41:54.700 +Tailwind CSS. + +00:41:55.480 --> 00:42:01.500 +Traffic to Tailwind is up six times year over year on npm downloads. + +00:42:02.280 --> 00:42:07.060 +But the revenue of Tailwind is down five times. + +00:42:07.540 --> 00:42:11.580 +You know, I mean, these are completely out of whack things because instead of people going + +00:42:11.580 --> 00:42:15.560 +to docs to learn about it, it's just like, well, when you go to the docs, you learn they + +00:42:15.640 --> 00:42:16.960 +also have premium offerings, right? + +00:42:17.440 --> 00:42:22.240 +And I think you guys are different because it's not just, oh, here's a little bit nicer + +00:42:22.420 --> 00:42:23.520 +of a thing, right? + +00:42:23.680 --> 00:42:34.040 +I feel like it would be a little bit as if you were selling cookie cutter templates for FastAPI apps, you know, it's like, well, the AI can make the shape of the thing that comes out of the cookie cutter, to be honest. + +00:42:34.260 --> 00:42:41.380 +But you're offering something that has ongoing value that it costs more and is more complex in other places. + +00:42:42.020 --> 00:42:50.720 +And so I think maybe just thinking about the how this just keeps the team going for FastAPI is really awesome. + +00:42:50.880 --> 00:42:53.260 +And I think it's got a nice flywheel effect there. 
+ +00:42:53.340 --> 00:42:56.540 +is I'll link to this, I guess, audio track. + +00:42:56.730 --> 00:42:57.540 +I don't know what I call it. + +00:42:57.580 --> 00:42:59.000 +It's a blog post that has one sentence, + +00:42:59.260 --> 00:43:00.960 +but a 30-minute audio you can check out + +00:43:01.120 --> 00:43:02.960 +from the guy, Adam, + +00:43:03.420 --> 00:43:04.600 +who's one of the founders of Tailwind, + +00:43:04.780 --> 00:43:06.060 +talking about going into this. + +00:43:06.740 --> 00:43:07.360 +It's kind of rough. + +00:43:07.450 --> 00:43:09.900 +I think I don't necessarily want to go into a deep AI, + +00:43:10.080 --> 00:43:11.000 +what it means for the industry. + +00:43:11.180 --> 00:43:13.120 +Like, let's stay focused on what you guys are doing. + +00:43:13.150 --> 00:43:16.940 +But I think it's going to be its own series. + +00:43:17.280 --> 00:43:21.240 +I mean, Stack Overflow had as many questions asked + +00:43:21.260 --> 00:43:25.240 +this month as they did in the first month of their existence, right? + +00:43:25.530 --> 00:43:29.480 +Three or 4,000, whereas at their peak, they were 200,000 questions a month. + +00:43:29.900 --> 00:43:34.920 +There's like real turmoil that's coming from some of these things, which is tricky. + +00:43:35.560 --> 00:43:40.000 +But I'm really excited to see you all doing this because I'm a big fan of FastAPI. + +00:43:40.500 --> 00:43:45.840 +And I think this is just sustaining and more for FastAPI, right? + +00:43:45.920 --> 00:43:46.520 +Like, what do you all think? + +00:43:46.640 --> 00:43:48.260 +That's what we hope that is going on. + +00:43:49.920 --> 00:43:51.840 +I thought about Taiwan for a second, right? + +00:43:52.060 --> 00:43:54.280 +It's not like we're immune to what happened to them. + +00:43:54.420 --> 00:43:56.500 +Like we also have a lot of documentation online. + +00:43:56.880 --> 00:43:58.040 +AI could train on that. 
+ +00:43:58.240 --> 00:43:59.400 +And if it's good enough, + +00:43:59.420 --> 00:44:01.300 +it could maintain your infrastructure and stuff. + +00:44:01.380 --> 00:44:02.960 +It's just too hard at the moment. + +00:44:03.080 --> 00:44:05.460 +And there's an additional thing we're kind of selling, + +00:44:05.760 --> 00:44:07.620 +which is like, I guess, responsibility. + +00:44:08.180 --> 00:44:11.480 +Like you're shifting the risk from like letting your AI + +00:44:11.800 --> 00:44:15.540 +or your infantry team maintain your infrastructure to us. + +00:44:15.920 --> 00:44:18.660 +So we're staying up at night and worry about it. + +00:44:18.980 --> 00:44:20.400 +that has a lot of value as well. + +00:44:20.780 --> 00:44:24.300 +And that's probably not going to get removed by AI. + +00:44:24.980 --> 00:44:27.040 +Here's a very common cloud code, + +00:44:27.540 --> 00:44:28.760 +cursor, whatever conversation. + +00:44:29.520 --> 00:44:33.140 +Hey, build me something with Python and needs an API. + +00:44:33.240 --> 00:44:34.600 +Okay, we built it with FastAPI. + +00:44:34.900 --> 00:44:35.920 +How do I host it? + +00:44:36.230 --> 00:44:37.160 +Right, that doesn't just, + +00:44:37.380 --> 00:44:39.060 +it will build a cloud for you, right? + +00:44:39.140 --> 00:44:41.400 +It's going to recommend something out there. + +00:44:41.840 --> 00:44:45.120 +And a real natural way to how do I host FastAPI + +00:44:45.280 --> 00:44:46.740 +is FastAPI cloud, right? + +00:44:46.880 --> 00:44:48.060 +Like if it suggests, + +00:44:48.200 --> 00:44:50.940 +oh, you're just going to like spread it across Lambda by breaking. + +00:44:51.080 --> 00:44:52.460 +Like, whoa, no, I want something simple. + +00:44:52.640 --> 00:44:54.180 +Okay, give me FastAPI cloud, right? + +00:44:54.320 --> 00:44:55.860 +I think that that's a really good thing. 
+ +00:44:55.860 --> 00:44:57.800 +And then on the enterprise side, + +00:44:58.340 --> 00:45:03.220 +enterprise folks are notoriously not good at supporting open source + +00:45:03.440 --> 00:45:05.720 +in that they're not like paying for it. + +00:45:05.720 --> 00:45:11.000 +I know some companies are big supporters of the PSF and Python and open source. + +00:45:11.440 --> 00:45:15.240 +But in general, it's like, yeah, we have this project with 5,000 people working on it. + +00:45:15.240 --> 00:45:15.680 +It's all Python. + +00:45:16.340 --> 00:45:19.240 +And are we sponsoring this? + +00:45:19.640 --> 00:45:19.780 +Nope. + +00:45:20.140 --> 00:45:22.060 +We're just enjoying the money, right? + +00:45:22.180 --> 00:45:22.740 +And we're a bank. + +00:45:23.360 --> 00:45:24.140 +So we got the money. + +00:45:24.260 --> 00:45:24.860 +We got all the money. + +00:45:25.780 --> 00:45:29.520 +So they're just not good at paying for like a really great framework that they use a lot. + +00:45:29.860 --> 00:45:34.460 +But they got plenty of hosting, plenty of internal apps that they just need to make run and stuff. + +00:45:34.680 --> 00:45:40.220 +So I think both on like the low end and the high end, there's a lot of synergy between these things. + +00:45:40.340 --> 00:45:52.540 +That is not just, you know, slightly advanced, not to diminish it, but slightly advanced UI widgets that you could ask your AI to build or something or like cookie cutter templates for project starters. + +00:45:52.840 --> 00:45:59.120 +I think we are in a somewhat fortunate position of like, you know, like FastAPI. FastAPI has grown so much. + +00:45:59.560 --> 00:46:06.100 +Like, you know, like when you check the statistics about downloads or GitHub stars or entries in developer surveys, + +00:46:06.220 --> 00:46:08.680 +like it's at the top in like in each category. 
+ +00:46:08.860 --> 00:46:13.560 +It's like, you know, like the backend framework with the most GitHub stars across languages, + +00:46:14.040 --> 00:46:17.180 +even like, you know, like Java, Go, Ruby, JS, like whatever. + +00:46:17.280 --> 00:46:19.840 +It's like the top one, at least in GitHub stars. + +00:46:20.260 --> 00:46:24.620 +So like, you know, like FastAPI is like people are liking it, fortunately. + +00:46:25.480 --> 00:46:29.080 +And there's probably going to be people deploying things to FastAPI Cloud. + +00:46:29.100 --> 00:46:32.600 +So that's probably going to be like, we are probably going to be fine. + +00:46:33.180 --> 00:46:38.760 +I think, you know, like the, I guess it will be like a good point to ask people to go and + +00:46:38.880 --> 00:46:42.580 +check where the open source project is that they are using and check where is the bus ticket + +00:46:42.800 --> 00:46:44.920 +factor for those open source projects. + +00:46:45.040 --> 00:46:55.800 +You know, like if you are using Tailwind CSS, it would have been very cool if at some point you check if the premium things were useful for you and for your company or your project or something like that, you know? + +00:46:56.100 --> 00:46:59.400 +Yeah, because what is the thing that keeps that project going? + +00:46:59.560 --> 00:46:59.880 +Exactly. + +00:47:00.260 --> 00:47:12.820 +And I really personally admire if a project or something offers like more value, not just, hey, buy me a coffee, but here's a thing that you get way more of, you know? + +00:47:12.820 --> 00:47:14.840 +And in that regard, I think Tailwind was doing that, right? + +00:47:14.960 --> 00:47:17.240 +They were offering this suite of pre-built things. + +00:47:17.570 --> 00:47:18.940 +And I think that that's great. 
+ +00:47:19.280 --> 00:47:24.440 +But yeah, I do think you've got more of these crazy AI things + +00:47:24.470 --> 00:47:26.680 +are going to maybe recommend FastAPI Cloud more + +00:47:26.960 --> 00:47:28.340 +than they're just going to undercut it. + +00:47:28.350 --> 00:47:29.200 +So I think that's really great. + +00:47:29.460 --> 00:47:33.180 +And by the way, I was just looking for the GitHub Stars graph. + +00:47:33.530 --> 00:47:36.200 +Like there's a whole, I can't remember what the domain of that site is. + +00:47:36.840 --> 00:47:38.980 +And I ran across, by the way, I just want to give a quick shout out. + +00:47:39.130 --> 00:47:43.420 +Like your cult repo documentary on FastAPI was awesome. + +00:47:44.000 --> 00:47:44.080 +Right? + +00:47:44.420 --> 00:47:45.080 +That was so fun. + +00:47:45.440 --> 00:47:46.280 +They made me look good. + +00:47:46.280 --> 00:47:47.040 +I didn't see that coming. + +00:47:47.660 --> 00:47:50.060 +Yeah, it came right on the heels of the Python official + +00:47:50.700 --> 00:47:51.800 +documentary, the one hour one. + +00:47:51.900 --> 00:47:54.200 +This is the same group, and the production quality + +00:47:54.260 --> 00:47:54.660 +is really nice. + +00:47:54.750 --> 00:47:55.320 +So like-- + +00:47:55.440 --> 00:47:58.580 +When they released the trailer for the Python documentary, + +00:47:58.800 --> 00:48:00.940 +before releasing the documentary, when they released the trailer, + +00:48:01.180 --> 00:48:02.980 +they contacted me and said, hey, we're + +00:48:03.670 --> 00:48:06.080 +doing these mini documentaries about different frameworks, + +00:48:06.230 --> 00:48:08.360 +different tools, and we want to include FastAPI there. + +00:48:08.410 --> 00:48:09.260 +I was like, oh, nice. + +00:48:10.040 --> 00:48:12.220 +But then I was just trying to stay excited, + +00:48:12.440 --> 00:48:13.200 +but super excited. + +00:48:13.760 --> 00:48:17.740 +Oh, that's so cool. Yeah, I watched it as soon as it came out. 
So I'll link to that. People should

00:48:17.880 --> 00:48:22.500
definitely check it out. It's only like 10 minutes or something, but it's worth it. So it's not

00:48:22.500 --> 00:48:26.900
a huge investment of time. People can watch it, I suppose. It's not TikTok. I mean, it's not like,

00:48:27.440 --> 00:48:34.900
oh, I saw the documentary, but it doesn't demand a huge amount of you.

00:48:34.960 --> 00:48:37.840
You have to listen for 10 minutes to an overly excited Colombian.

00:48:37.940 --> 00:48:41.320
I don't understand what's happened to the attention span of society. I'm really,

00:48:41.560 --> 00:48:45.600
honestly a little concerned. I used to, when I would create my courses, people would say,

00:48:45.760 --> 00:48:49.460
you know, like a four hour course and there'd be like a 10, 15 minute sort of, hey, here's how you

00:48:49.580 --> 00:48:52.280
set up your computer. And here's all the introduction, and people said, oh, that's so awesome.

00:48:52.400 --> 00:48:55.200
I loved how you kind of set the stage. I'm really motivated to take the course.

00:48:55.840 --> 00:48:59.800
Nowadays, I just get messages like, why are you still talking? This is five minutes long. Do you

00:48:59.960 --> 00:49:06.500
understand? I'm like, this is your job. You can't spend five minutes? Oh my gosh. Anyway, that's,

00:49:06.620 --> 00:49:11.520
that's sort of the origin of my comment there. It's all right. So we're kind of getting short on

00:49:11.540 --> 00:49:17.380
time, I think I want to talk about a couple of things. Let's talk a little bit about internals.

00:49:17.460 --> 00:49:22.820
Like what, I don't know who wants to take this one, but let's talk about just how, when I say

00:49:23.080 --> 00:49:30.140
FastAPI deploy, then what? It's just a uv pip install and it just goes and it's magic and it's

00:49:30.240 --> 00:49:34.320
easy, right? We have a nickname for Jonathan. Can we say it or no?
I don't know. It's so funny.

00:49:34.380 --> 00:49:38.660
This happened because I told my friends, I'm so concerned about being at the podcast because

00:49:38.680 --> 00:49:42.000
everyone here is a visionary, and then I'm the back-end guy.

00:49:42.300 --> 00:49:46.300
I think the things I could contribute to this conversation, I should probably keep to myself.

00:49:47.540 --> 00:49:50.020
But you're just leaking your internals, right?

00:49:50.180 --> 00:49:52.940
There are some things that are not really secret.

00:49:53.340 --> 00:50:01.300
Like, as Sebastian said earlier, Kubernetes won in the infrastructure and deployment field, to some extent.

00:50:01.500 --> 00:50:03.600
So that's somewhere in there, right?

00:50:03.880 --> 00:50:07.260
But it's all the way deep down, so no one has to worry about it.

00:50:07.420 --> 00:50:08.500
But it's still a foundation,

00:50:08.580 --> 00:50:09.580
which is a good foundation.

00:50:09.950 --> 00:50:11.300
I think one thing that's,

00:50:11.610 --> 00:50:12.500
you might have guessed it,

00:50:12.800 --> 00:50:15.240
but FastAPI Cloud is built on FastAPI,

00:50:15.480 --> 00:50:16.440
which kind of makes sense, right?

00:50:16.760 --> 00:50:17.940
And that also has an effect

00:50:18.200 --> 00:50:20.500
on like recent patches, updates and stuff.

00:50:20.720 --> 00:50:22.560
Because if we find something internally

00:50:23.000 --> 00:50:23.860
which we're not happy with,

00:50:24.040 --> 00:50:25.240
then we just fix it.

00:50:25.440 --> 00:50:27.080
And that's how some releases

00:50:27.600 --> 00:50:29.460
came out faster than in months before.

00:50:30.220 --> 00:50:31.420
Power of dogfooding.

00:50:31.820 --> 00:50:32.480
Yeah, that's awesome.

00:50:32.620 --> 00:50:33.080
Dogfooding a lot.
+ +00:50:33.280 --> 00:50:35.100 +Also all the related libraries + +00:50:35.540 --> 00:50:38.360 +like SQL model and, well, others. + +00:50:38.840 --> 00:50:39.940 +they experience the same thing. + +00:50:40.160 --> 00:50:41.500 +New library is coming out. + +00:50:42.020 --> 00:50:43.680 +Patrick will announce at some point soon. + +00:50:43.860 --> 00:50:45.680 +It's not just FastJPay and friends. + +00:50:45.980 --> 00:50:46.980 +We're like really open. + +00:50:47.260 --> 00:50:49.160 +Like recently, Patrick just open-sourced + +00:50:49.540 --> 00:50:51.680 +everything we use for authentication authorization, + +00:50:52.160 --> 00:50:52.520 +for example. + +00:50:52.840 --> 00:50:53.700 +Is it open-source yet? + +00:50:53.960 --> 00:50:54.840 +Did they just leak something? + +00:50:55.000 --> 00:50:56.940 +It will be announced soon at some point. + +00:50:57.920 --> 00:50:59.720 +We build stuff internally in the moment really. + +00:51:00.520 --> 00:51:01.820 +Like we build it in a way, + +00:51:01.960 --> 00:51:02.800 +like in a separate package, + +00:51:03.260 --> 00:51:04.420 +just like an open-source library. + +00:51:04.540 --> 00:51:05.740 +And if we feel like the time is ripe, + +00:51:05.860 --> 00:51:07.260 +it's just getting open-sourced + +00:51:07.280 --> 00:51:08.860 +because a lot of things are reusable. + +00:51:09.420 --> 00:51:10.460 +And that's in all departments. + +00:51:10.900 --> 00:51:11.700 +That happens a lot. + +00:51:12.320 --> 00:51:14.660 +When I started there, I already realized that. + +00:51:14.760 --> 00:51:15.880 +Everyone was building open source, + +00:51:16.030 --> 00:51:17.600 +but now I joined in myself as well. 
+ +00:51:17.720 --> 00:51:22.020 +I open source the library for compressing + +00:51:22.120 --> 00:51:24.240 +and decompressing archives in Python + +00:51:24.860 --> 00:51:27.440 +because the internal top high thing is just slow + +00:51:27.520 --> 00:51:28.660 +and we needed it to be faster + +00:51:28.840 --> 00:51:30.620 +because we're staring at the deployment process + +00:51:30.690 --> 00:51:33.160 +and we're like, hey, we could probably shave off a few seconds here. + +00:51:33.760 --> 00:51:35.740 +Then that's just open source for everyone to use. + +00:51:35.800 --> 00:51:38.900 +So we're contributing to the old Python ecosystem as well. + +00:51:38.900 --> 00:51:39.940 +You have to say the name. + +00:51:40.140 --> 00:51:40.820 +It's so good. + +00:51:41.500 --> 00:51:42.080 +Is it good? + +00:51:42.240 --> 00:51:46.280 +No, it's just, it's faster because it's, you know, faster than just tar. + +00:51:47.260 --> 00:51:47.880 +Fast tar? + +00:51:48.160 --> 00:51:48.500 +I love it. + +00:51:48.510 --> 00:51:49.140 +Fast tar, yes. + +00:51:49.760 --> 00:51:53.300 +And you can say with that very, very German accent, fast tar. + +00:51:53.420 --> 00:51:54.140 +I'll go star it. + +00:51:54.200 --> 00:51:55.820 +We'll get you some stars. + +00:51:56.520 --> 00:51:57.060 +This is going to happen. + +00:51:57.200 --> 00:51:58.300 +That's the irony about it. + +00:51:58.440 --> 00:51:59.960 +Like, it literally has no stars. + +00:52:00.090 --> 00:52:01.660 +But if you scroll down, you see the downloads. + +00:52:01.880 --> 00:52:04.100 +That's going to prove we're actually using it. + +00:52:04.180 --> 00:52:04.740 +Yeah, I like it. + +00:52:05.060 --> 00:52:06.100 +It's a little context manager. + +00:52:07.660 --> 00:52:12.700 +It's almost working the same as the TAL file in the standard library. + +00:52:13.000 --> 00:52:16.060 +Like the same, like almost similar API to that. + +00:52:16.360 --> 00:52:18.720 +It's basically a drop-in replacement, more or less. 
+ +00:52:19.440 --> 00:52:21.400 +But they know they need everything to happen in Rust. + +00:52:21.680 --> 00:52:22.080 +Because Rust. + +00:52:22.420 --> 00:52:23.340 +Because Rust, yeah. + +00:52:23.560 --> 00:52:27.180 +Well, as soon as it becomes infrastructure and you've got to run it a million times, + +00:52:27.740 --> 00:52:29.100 +that starts to make sense, right? + +00:52:29.220 --> 00:52:29.280 +Yeah. + +00:52:29.420 --> 00:52:33.180 +Python is one of the fastest programming languages in the world. + +00:52:33.680 --> 00:52:36.960 +when you think about human time to build the things, right? + +00:52:37.140 --> 00:52:39.480 +Like that's one of its real superpowers is like, + +00:52:39.800 --> 00:52:44.060 +I mean, there's the whole story of Google Video and YouTube, right? + +00:52:44.160 --> 00:52:46.920 +And Google Video was written in C++ with 100 engineers + +00:52:47.160 --> 00:52:49.220 +and YouTube was a small team in Python + +00:52:49.460 --> 00:52:51.100 +and they just couldn't keep up with the features. + +00:52:51.200 --> 00:52:53.000 +So they bought this little old thing, YouTube, + +00:52:53.560 --> 00:52:55.000 +and see if we're going to make something with it. + +00:52:55.280 --> 00:52:57.220 +And last I checked, it was still in Python. + +00:52:57.420 --> 00:53:00.260 +I'm sure some of it isn't, but a few years ago it was, which is wild. + +00:53:00.760 --> 00:53:02.560 +Anyway, there's different ways of fast, + +00:53:02.680 --> 00:53:04.520 +But when it's down to like little utilities, yeah. + +00:53:04.620 --> 00:53:06.660 +I know some people that are trying to make Python fast. + +00:53:06.710 --> 00:53:07.220 +I know a couple. + +00:53:07.440 --> 00:53:07.620 +Yeah. + +00:53:08.200 --> 00:53:11.540 +And honestly, massive success in the last five years, right? + +00:53:11.740 --> 00:53:17.400 +Like since 3.11, since the specializing adaptive interpreter, there's been pretty big improvements. 
+ +00:53:17.720 --> 00:53:20.360 +3.9 and 3.11 did a lot of like foundational work. + +00:53:20.410 --> 00:53:24.020 +And then 3.9 onward really just uncorked a lot of innovation there. + +00:53:24.340 --> 00:53:25.260 +Yeah, that's pretty awesome. + +00:53:26.000 --> 00:53:26.260 +All right. + +00:53:26.560 --> 00:53:29.320 +It sounds like, Sebastian, you've talked a lot about Kubernetes. + +00:53:29.740 --> 00:53:31.360 +So I imagine Kubernetes is happening. + +00:53:31.960 --> 00:53:34.820 +Do we get to pick what data centers it runs on? + +00:53:34.840 --> 00:53:37.720 +Do we get to pick what clouds it runs on? + +00:53:37.920 --> 00:53:40.560 +You're going to get to pick some of these things. + +00:53:41.340 --> 00:53:41.800 +Not yet. + +00:53:42.100 --> 00:53:43.000 +It's not released yet. + +00:53:43.600 --> 00:53:45.320 +But, you know, like it's top, of course. + +00:53:45.620 --> 00:53:48.680 +Like we have like regular cloud providers on the MIT + +00:53:48.720 --> 00:53:49.720 +and there's a bunch of Kubernetes. + +00:53:49.920 --> 00:53:52.140 +Then there's a bunch of additional stuff + +00:53:52.360 --> 00:53:53.560 +that needs to run on top. + +00:53:53.760 --> 00:53:55.599 +Then there's like custom Kubernetes controllers + +00:53:56.320 --> 00:53:58.560 +and things that Jonathan was saying + +00:53:58.580 --> 00:54:00.399 +that he's having to write in Go + +00:54:00.420 --> 00:54:04.180 +so that people in Python can be happy to be able to, you know, + +00:54:04.200 --> 00:54:07.040 +like manage all the Kubernetes shenanigans that need to happen + +00:54:07.160 --> 00:54:09.700 +because there's so much complexity that needs to be handled. + +00:54:10.340 --> 00:54:11.440 +There's a lot of that. + +00:54:11.760 --> 00:54:14.860 +We do a lot of advanced tricks also. + +00:54:15.380 --> 00:54:17.440 +Jonathan was recently doing a bunch of advanced tricks + +00:54:17.580 --> 00:54:20.160 +to handle the caches for the builds. 
+ +00:54:20.520 --> 00:54:22.120 +So the way that we handle caches, + +00:54:22.260 --> 00:54:25.220 +and we also like tap into uv and how things work + +00:54:25.340 --> 00:54:28.399 +so that builds can be super, super fast + +00:54:28.420 --> 00:54:30.080 +because it's like something is, + +00:54:30.540 --> 00:54:33.280 +we are, you know, like we are very much targeted + +00:54:33.560 --> 00:54:35.940 +at FastAPI and Python in general. + +00:54:36.100 --> 00:54:38.440 +So we can take advantage of knowing + +00:54:39.060 --> 00:54:40.260 +how things run internally, + +00:54:40.720 --> 00:54:41.700 +how things are installed, + +00:54:42.220 --> 00:54:43.280 +how to optimize everything. + +00:54:43.380 --> 00:54:45.000 +So everything is just like super fast, + +00:54:45.140 --> 00:54:47.640 +super fast to install, to run, to like do everything. + +00:54:47.700 --> 00:54:50.400 +I imagine you all have base Docker images + +00:54:50.960 --> 00:54:53.700 +that are like just one layer away + +00:54:54.080 --> 00:54:55.880 +from whoever's code is running. + +00:54:56.440 --> 00:54:58.060 +You know, like you've got it all optimized, + +00:54:58.120 --> 00:55:02.020 +already pre-built with FastAPI and whatever settings of Python you want. + +00:55:02.140 --> 00:55:03.460 +A bunch of things and tricks. + +00:55:04.100 --> 00:55:05.340 +But there are also different things, + +00:55:05.560 --> 00:55:08.440 +like the different ways that we do to actually build the things + +00:55:08.540 --> 00:55:13.860 +and install things and put them inside of the actual build application. + +00:55:14.220 --> 00:55:18.060 +There's a lot of sourcing that we do there, + +00:55:18.120 --> 00:55:20.840 +and Jonathan has been working on a lot of that. + +00:55:21.340 --> 00:55:23.380 +And there's also all the logic and all the stuff. 
+ +00:55:23.500 --> 00:55:26.840 +We have a bunch of stuff on top of that to handle out of scaling, + +00:55:27.040 --> 00:55:31.160 +which is something that is actually not that easy to find in different providers. + +00:55:31.820 --> 00:55:37.220 +We have auto-scaling based on requests, including scaling down to zero, which saves costs. + +00:55:37.840 --> 00:55:40.080 +But this is not Lambdas. + +00:55:40.280 --> 00:55:41.340 +It's not AWS Lambdas. + +00:55:42.119 --> 00:55:46.860 +It's like the full deployed application, the full container or whatever it is, + +00:55:47.300 --> 00:55:49.900 +which is the full thing with all the dependencies. + +00:55:50.100 --> 00:55:54.060 +It's running for whenever it has to run, but we can scale based on requests. + +00:55:54.740 --> 00:56:07.280 +So I guess it's like the type of thing that you will have if you have this giant cluster for a huge enterprise with a bunch of infra people making sure everything just works perfectly. + +00:56:08.000 --> 00:56:10.320 +But you just pay us to do that for you. + +00:56:10.420 --> 00:56:15.360 +This is also a good time for us to probably say lots of stuff is coming and we're in private beta. + +00:56:15.540 --> 00:56:22.660 +And so you should sign up for the wait list so that you can get admitted and try out these very cool things we've been talking about. + +00:56:22.660 --> 00:56:22.840 +Absolutely. + +00:56:23.240 --> 00:56:29.020 +And I think I'll let Tech Insider out in the audience sort of lean into it. + +00:56:29.020 --> 00:56:29.520 +Public release one. + +00:56:30.019 --> 00:56:30.900 +Sebastian, when? + +00:56:31.180 --> 00:56:32.140 +Public release one. + +00:56:32.440 --> 00:56:34.960 +My final topic, which is just what's the roadmap? + +00:56:35.560 --> 00:56:36.320 +When is this stuff? + +00:56:36.660 --> 00:56:38.540 +Like, how do we get into it here? + +00:56:38.820 --> 00:56:40.180 +Why would they like Litestar? 
+ +00:56:40.400 --> 00:56:43.940 +We have the, right now we have the waiting list and we are onboarding people. + +00:56:44.080 --> 00:56:46.460 +We already have like a bunch of people in the private beta. + +00:56:46.920 --> 00:56:51.720 +We're going to keep onboarding people from the waiting list and like, you know, like ramp that up. + +00:56:51.780 --> 00:56:56.060 +But it will be like, you know, like through the waiting list is the main place where we are onboarding. + +00:56:56.140 --> 00:56:58.440 +People will want to make sure that everything is super fine-tuned. + +00:56:58.820 --> 00:57:00.300 +And we're going to keep it that way for a while. + +00:57:00.540 --> 00:57:06.100 +So like people that are on the waiting list are going to be like the ones that are going to be able to start using it the soonest. + +00:57:06.420 --> 00:57:10.160 +At some point, we'll probably have ways for people to invite others and things like that. + +00:57:10.500 --> 00:57:17.920 +About the things that we are building, we want to, you know, like we are super focused on FastAPI and then Python in general. + +00:57:18.060 --> 00:57:20.880 +at some point will probably support different tools, + +00:57:21.340 --> 00:57:24.060 +different ways to run, also like Python code in general, + +00:57:24.480 --> 00:57:25.360 +probably different frameworks. + +00:57:25.900 --> 00:57:30.240 +It will also depend a lot on what the users are asking for, + +00:57:30.360 --> 00:57:33.300 +whether like the tools, the frameworks, the use cases, + +00:57:33.460 --> 00:57:34.700 +the things that they need to build. + +00:57:34.740 --> 00:57:37.860 +And like, we're going to evolve the platform and the system + +00:57:38.500 --> 00:57:40.780 +based on what people need out of it. 
+ +00:57:40.800 --> 00:57:42.900 +We have like a GitHub repo where we have issues, + +00:57:43.060 --> 00:57:45.780 +but we also have like a Slack that once people are admitted, + +00:57:45.820 --> 00:57:50.760 +they can talk directly to us and that feedback is really, really valuable for shaping the roadmap + +00:57:50.970 --> 00:57:53.500 +and figuring out all the fun things you want us to support. + +00:57:53.560 --> 00:57:58.620 +Awesome. Of course, you're going to charge money for it. It runs on servers and you guys are not + +00:57:59.170 --> 00:58:04.640 +a charity, but can you give any sense of what you're thinking about that kind of stuff or + +00:58:05.060 --> 00:58:06.080 +join the waitlist and see? + +00:58:06.240 --> 00:58:11.920 +Well, first join the waitlist and see, but we don't have like that predefined yet, + +00:58:12.220 --> 00:58:16.080 +but they will be on the ballpark of what you could get from different cloud providers. + +00:58:16.280 --> 00:58:23.920 +So different similar-ish providers will be on the ballpark of what you will get. + +00:58:24.300 --> 00:58:26.980 +But it's not written in stone yet. + +00:58:27.320 --> 00:58:32.060 +It's still a little bit different because we can auto-scale based on requests. + +00:58:32.340 --> 00:58:36.580 +So we can increase the amount of replicas of your application automatically, + +00:58:36.960 --> 00:58:40.000 +and then we can decrease them automatically, and we can scale down to zero. + +00:58:40.060 --> 00:58:49.940 +So you can probably handle all the load that you need and in the end spend a lot less because you don't have to have a bunch of instances constantly running or things like that, you know. + +00:58:50.420 --> 00:58:55.840 +So it will probably work a little bit different than what it will be for other providers. + +00:58:56.360 --> 00:58:58.380 +But in the end, it should be roughly similar. + +00:58:58.560 --> 00:59:18.360 +Okay. 
And given the fact that you all handle so much of it as a platform as a service type of thing, you don't have to have a cloud expert on hand or a DevOps expert necessarily, right? As soon as a company hires somebody to be an AWS cloud architect or something like that, it's no longer just what is your AWS bill. + +00:59:18.580 --> 00:59:22.380 +It's also a little bit of pain that we are swallowing so you don't have to take it. + +00:59:22.740 --> 00:59:24.580 +Exactly. It's part of taking one for the team, right? + +00:59:24.740 --> 00:59:24.920 +Yes. + +00:59:25.140 --> 00:59:25.200 +Yes. + +00:59:27.160 --> 00:59:27.340 +Indeed. + +00:59:27.960 --> 00:59:30.960 +All right, so I had one or two things specifically + +00:59:31.610 --> 00:59:32.360 +that I was seeking. + +00:59:32.550 --> 00:59:33.880 +It's like custom domains. + +00:59:34.410 --> 00:59:35.580 +How far off are custom domains? + +00:59:35.700 --> 00:59:38.100 +I was like, oh, I could put some cool things on there. + +00:59:39.420 --> 00:59:41.140 +I could tell Jonathan is psyched about this. + +00:59:41.280 --> 00:59:44.320 +It'd be really fun to put one of my really small + +00:59:44.799 --> 00:59:46.100 +FastAPI projects over there, + +00:59:46.740 --> 00:59:48.360 +something I set up for some of my courses or something, + +00:59:48.520 --> 00:59:50.100 +and then I can point people to go, + +00:59:50.160 --> 00:59:51.400 +look, it's running on FastAPI Cloud. + +00:59:51.400 --> 00:59:53.620 +How neat, you guys can check that out over there. + +00:59:54.060 --> 00:59:55.360 +And I'm like, but it's on its own domain, + +00:59:55.580 --> 01:00:00.380 +that domain is baked into the course videos, you know what I mean? And it's written in stone. + +01:00:00.580 --> 01:00:01.140 +It's marketing. + +01:00:01.520 --> 01:00:07.880 +Yeah, exactly. So I can't really move it because it has, you know, some subdomain of Talk Python, + +01:00:08.060 --> 01:00:08.120 +right? 
+ +01:00:08.240 --> 01:00:12.120 +I was working on it. And then I got the notification by Google Calendar that I should + +01:00:12.440 --> 01:00:14.140 +join a certain podcast. So... + +01:00:14.140 --> 01:00:18.920 +Are you telling me we don't have custom domains? Because I'm here asking you about custom domains. + +01:00:19.100 --> 01:00:19.700 +How meta is that? + +01:00:19.720 --> 01:00:23.540 +You got it. It could be here already. But no, you have to wait a bit more. + +01:00:23.640 --> 01:00:24.400 +Okay. But soon? + +01:00:24.520 --> 01:00:29.660 +Yeah. As soon as broad enough, but I'm actively working on it. Let's put it like that. + +01:00:29.740 --> 01:00:34.320 +Okay. That sounds great. And then, I mean, just, it's never simple. You know, I just, + +01:00:34.780 --> 01:00:39.140 +I set up some stuff and it's like, you get the pop-up. Oh, you got to put this, you know, + +01:00:39.280 --> 01:00:45.320 +this TXT record or this CNAME or whatever record into your DNS and then we're checking it. Oh, + +01:00:45.780 --> 01:00:50.239 +it might take three days for your DNS to propagate. So hang in there and just, I can imagine like + +01:00:50.560 --> 01:00:56.560 +you're having fun yeah i guess you're kidding me that's like wow i i thought i'm almost off work + +01:00:56.780 --> 01:01:00.260 +but no you're bringing it all back but yeah that's a that's a thing i'm sure the company + +01:01:00.760 --> 01:01:04.540 +could support therapy to like work work through the issues and the trauma that you've suffered + +01:01:04.760 --> 01:01:10.340 +from the dns it's always dns that's right i mean you got that's an yes it's always dns yes + +01:01:10.460 --> 01:01:15.699 +i guess one of our goals with custom domains also to make it super simple for you to set up them + +01:01:15.940 --> 01:01:23.040 +Like, for example, if you're using one of the providers that support OAuth, we can also just do one click and then it's going to be automatically. 
+
+01:01:23.160 --> 01:01:24.960
+Oh, that's cool. Yeah, that's really nice.
+
+01:01:25.120 --> 01:01:27.300
+But unfortunately, it depends on the platform you're using.
+
+01:01:27.460 --> 01:01:29.300
+Not all of them support this.
+
+01:01:29.480 --> 01:01:32.720
+This is said by the person in charge of most of the integrations.
+
+01:01:32.860 --> 01:01:37.340
+So Patrick has built, we have integrations for a bunch of database providers and things like that.
+
+01:01:38.060 --> 01:01:41.900
+I think by now Patrick knows the OpenID specification by memory.
+
+01:01:42.260 --> 01:01:42.720
+I don't know.
+
+01:01:44.300 --> 01:01:48.600
+Yeah, the other thing I wanted to talk a bit about was just integrations, like what kind of stuff you guys have coming in.
+
+01:01:48.600 --> 01:01:51.940
+I saw that Hugging Face is going to be integrated soon.
+
+01:01:52.120 --> 01:01:55.320
+You've got Supabase, which is kind of Postgres as a service.
+
+01:01:56.040 --> 01:01:59.860
+There's a lot of those things out there that theoretically could be added.
+
+01:02:00.100 --> 01:02:01.440
+Someone also asked for MongoDB.
+
+01:02:01.660 --> 01:02:03.640
+Maybe that's one that we're going to take a look into.
+
+01:02:04.070 --> 01:02:06.320
+It really depends on the provider.
+
+01:02:06.370 --> 01:02:11.360
+So at the moment, we don't want to host databases for you because that's also another kind of rabbit hole.
+
+01:02:11.780 --> 01:02:13.200
+Jonathan is probably not ready for that.
+
+01:02:14.180 --> 01:02:15.880
+But yeah, definitely databases.
+
+01:02:16.040 --> 01:02:21.260
+But I guess we can say that we're also talking with the people from Pydantic
+
+01:02:21.360 --> 01:02:24.720
+so we can integrate maybe Logfire automatically, that kind of stuff.
+
+01:02:25.120 --> 01:02:25.220
+Yeah.
+
+01:02:25.440 --> 01:02:28.600
+And also things like Redis, which is also another kind of database.
+
+01:02:29.300 --> 01:02:30.200
+That's also coming soon.
+
+01:02:30.360 --> 01:02:33.580
+Yeah, there's a couple of database as a service type things
+
+01:02:33.740 --> 01:02:37.700
+that don't require too much other than just connecting API keys
+
+01:02:37.920 --> 01:02:38.740
+and something like that, right?
+
+01:02:38.940 --> 01:02:40.180
+Those seem like low-hanging fruit.
+
+01:02:40.300 --> 01:02:43.940
+Like the kind of goal with the integration is not just done.
+
+01:02:44.000 --> 01:02:46.620
+Like, yeah, right now it's just setting up an environment variable.
+
+01:02:47.080 --> 01:02:51.240
+But the idea is also to more, I don't know, like the proper integration, I would say.
+
+01:02:52.320 --> 01:02:56.040
+Like, for example, for things like Supabase, yeah, I think they support branching.
+
+01:02:56.080 --> 01:03:00.500
+Like, for example, once we support pull request previews for GitHub, like we can also create
+
+01:03:00.500 --> 01:03:03.680
+a branch automatically for you if you have the Supabase integration enabled.
+
+01:03:04.180 --> 01:03:05.920
+And we can do this kind of stuff as well.
+
+01:03:06.020 --> 01:03:09.620
+Or even we could show like some information about the database.
+
+01:03:10.000 --> 01:03:12.200
+I don't know, like load or like memory usage,
+
+01:03:12.320 --> 01:03:13.920
+things like that, directly from our dashboard.
+
+01:03:14.080 --> 01:03:14.860
+So you don't have to go there.
+
+01:03:15.080 --> 01:03:17.920
+That's the main reason why we're building this infrastructure
+
+01:03:18.180 --> 01:03:19.160
+for the integration.
+
+01:03:19.480 --> 01:03:21.620
+Well, people can sign up to the waiting list
+
+01:03:21.820 --> 01:03:24.420
+and hopefully get on the private beta.
+
+01:03:24.880 --> 01:03:26.180
+We actually check the waiting list.
+
+01:03:26.220 --> 01:03:29.080
+We actually check the use cases, team sizes,
+
+01:03:29.440 --> 01:03:31.440
+like what are people building with it?
+
+01:03:31.540 --> 01:03:33.400
+Like we actually go and check it
+
+01:03:33.480 --> 01:03:36.580
+and we bring in people from the waiting list.
+
+01:03:37.020 --> 01:03:39.220
+Nice. You know, I didn't join the waiting list directly.
+
+01:03:39.460 --> 01:03:42.820
+I was added by some guy I know who was very kind
+
+01:03:42.850 --> 01:03:44.560
+to help me get some behind-the-scenes look.
+
+01:03:45.200 --> 01:03:46.360
+So I don't know what the process is.
+
+01:03:46.460 --> 01:03:48.560
+Do you actually say what you want to do with it?
+
+01:03:48.590 --> 01:03:51.360
+And you evaluate that a little bit as well based on, like,
+
+01:03:51.360 --> 01:03:53.340
+hey, this would be a cool use case for us to support?
+
+01:03:53.430 --> 01:03:54.960
+There are many types of applications
+
+01:03:55.150 --> 01:03:57.320
+and many types of different team sizes,
+
+01:03:57.600 --> 01:04:00.520
+many types of things that people might want to build.
+
+01:04:00.720 --> 01:04:03.020
+And we try to see, like, okay, where is a case
+
+01:04:03.160 --> 01:04:06.360
+where we could be a good fit and we can provide a great service?
+
+01:04:06.980 --> 01:04:08.880
+And where are the things that people are trying to build?
+
+01:04:09.220 --> 01:04:14.860
+Also, it also helps us see, like, you know, like, what are people trying to do with FastAPI
+
+01:04:14.980 --> 01:04:17.080
+Cloud so that we know what we have to provide?
+
+01:04:17.560 --> 01:04:23.800
+But we actually go and check those, you know, like those submissions. There are, like, actually thousands
+
+01:04:24.010 --> 01:04:30.480
+of people in the waiting list, but we still go and check and approve kind of manually still
+
+01:04:30.780 --> 01:04:34.840
+to bring a bunch of people on board in the different ways that we have been bringing people.
+
+01:04:35.240 --> 01:04:39.180
+So if people go and join the waiting list and actually tell us what they are, what is their
+
+01:04:39.200 --> 01:04:41.900
+use case, their team, what are they planning on doing?
+
+01:04:42.440 --> 01:04:45.160
+There's a much higher chance that we are going to go
+
+01:04:45.340 --> 01:04:46.280
+on to bring them up.
+
+01:04:46.280 --> 01:04:46.500
+Awesome.
+
+01:04:47.100 --> 01:04:49.280
+So everyone, go join the waitlist.
+
+01:04:49.360 --> 01:04:52.980
+If you're doing FastAPI, I'll link to it in the show notes,
+
+01:04:53.080 --> 01:04:53.420
+of course.
+
+01:04:53.860 --> 01:04:57.040
+Thank you all for being here and sharing the story.
+
+01:04:57.340 --> 01:05:00.900
+And I, for one, am very excited to see FastAPI Cloud exist
+
+01:05:01.000 --> 01:05:04.180
+and just one more way to make FastAPI stronger
+
+01:05:04.420 --> 01:05:06.400
+and more resilient and so on.
+
+01:05:06.400 --> 01:05:07.340
+Thank you very much.
+
+01:05:07.580 --> 01:05:08.400
+Thank you for having us.
+
+01:05:08.580 --> 01:05:09.480
+Yeah, it's super fun.
+
+01:05:09.480 --> 01:05:10.260
+Thanks for having us.
+
+01:05:10.460 --> 01:05:10.840
+Yeah, you bet.
+
+01:05:11.220 --> 01:05:11.500
+Bye, everyone.
+
+01:05:11.640 --> 01:05:12.080
+Bye, folks.
+
+01:05:12.380 --> 01:05:12.680
+Bye-bye.
+
+01:05:12.940 --> 01:05:13.060
+Bye.
+
+01:05:14.300 --> 01:05:16.540
+This has been another episode of Talk Python To Me.
+
+01:05:16.820 --> 01:05:17.620
+Thank you to our sponsors.
+
+01:05:17.860 --> 01:05:19.120
+Be sure to check out what they're offering.
+
+01:05:19.340 --> 01:05:20.700
+It really helps support the show.
+
+01:05:21.560 --> 01:05:23.720
+This episode is brought to you by CommandBook,
+
+01:05:23.980 --> 01:05:26.080
+a native macOS app that I built
+
+01:05:26.240 --> 01:05:28.780
+that gives long-running terminal commands a permanent home.
+
+01:05:29.180 --> 01:05:31.180
+No more juggling six terminal tabs every morning.
+
+01:05:31.620 --> 01:05:33.040
+Carefully craft a command once,
+
+01:05:33.300 --> 01:05:34.700
+run it forever with auto-restart,
+
+01:05:34.900 --> 01:05:36.420
+URL detection, and a full CLI.
+
+01:05:36.880 --> 01:05:39.940
+Download it for free at talkpython.fm/command book app.
+
+01:05:41.040 --> 01:05:43.660
+And it's brought to you by the Talk Python in Production Book,
+
+01:05:44.190 --> 01:05:48.820
+an inside look at 10 years of the real-world DevOps behind the Talk Python sites and apps.
+
+01:05:49.160 --> 01:05:51.860
+Check it out at talkpython.fm/DevOps book.
+
+01:05:52.720 --> 01:05:54.580
+If you or your team needs to learn Python,
+
+01:05:54.790 --> 01:05:59.620
+we have over 270 hours of beginner and advanced courses on topics ranging from
+
+01:05:59.960 --> 01:06:04.860
+complete beginners to async code, Flask, Django, HTML, and even LLMs.
+
+01:06:05.140 --> 01:06:07.360
+Best of all, there's no subscription in sight.
+
+01:06:08.020 --> 01:06:09.680
+Browse the catalog at talkpython.fm.
+
+01:06:10.420 --> 01:06:12.360
+And if you're not already subscribed to the show
+
+01:06:12.600 --> 01:06:13.740
+on your favorite podcast player,
+
+01:06:14.400 --> 01:06:15.040
+what are you waiting for?
+
+01:06:15.720 --> 01:06:17.480
+Just search for Python in your podcast player.
+
+01:06:17.640 --> 01:06:18.460
+We should be right at the top.
+
+01:06:18.880 --> 01:06:20.420
+If you enjoy that geeky rap song,
+
+01:06:20.500 --> 01:06:21.720
+you can download the full track.
+
+01:06:21.860 --> 01:06:23.760
+The link is actually in your podcast player show notes.
+
+01:06:24.600 --> 01:06:25.920
+This is your host, Michael Kennedy.
+
+01:06:26.320 --> 01:06:27.380
+Thank you so much for listening.
+
+01:06:27.600 --> 01:06:28.400
+I really appreciate it.
+
+01:06:28.820 --> 01:06:29.540
+I'll see you next time.
+
+01:06:41.180 --> 01:06:43.980
+I'm out.
+ From 8526a02708ddfc43b079f430d8d6a42eb335e12b Mon Sep 17 00:00:00 2001 From: Michael Kennedy Date: Wed, 11 Feb 2026 07:05:51 -0800 Subject: [PATCH 02/16] transcripts --- .../536-fly-inside-fastapi-cloud-youtube.vtt | 3377 ++++++++++++++++ .../538-digital-humanities-original.vtt | 3299 ++++++++++++++++ ...ith-the-python-typing-council-original.vtt | 3380 +++++++++++++++++ ...dern-python-monorepo-timeline-original.vtt | 2978 +++++++++++++++ 4 files changed, 13034 insertions(+) create mode 100644 youtube_transcripts/536-fly-inside-fastapi-cloud-youtube.vtt create mode 100644 youtube_transcripts/538-digital-humanities-original.vtt create mode 100644 youtube_transcripts/539-catching-up-with-the-python-typing-council-original.vtt create mode 100644 youtube_transcripts/540-modern-python-monorepo-timeline-original.vtt diff --git a/youtube_transcripts/536-fly-inside-fastapi-cloud-youtube.vtt b/youtube_transcripts/536-fly-inside-fastapi-cloud-youtube.vtt new file mode 100644 index 0000000..d4580d4 --- /dev/null +++ b/youtube_transcripts/536-fly-inside-fastapi-cloud-youtube.vtt @@ -0,0 +1,3377 @@ +WEBVTT + +00:00:01.659 --> 00:00:06.040 +Hello everyone, Sebastian, Patrick, Savannah, and Jonathan. + +00:00:06.830 --> 00:00:07.860 +Awesome to have you all here. + +00:00:08.680 --> 00:00:10.200 +Excited to talk about FastAPI Cloud. + +00:00:10.920 --> 00:00:11.140 +Welcome. + +00:00:11.260 --> 00:00:11.660 +Yes. + +00:00:12.240 --> 00:00:12.460 +Sorry. + +00:00:16.619 --> 00:00:17.720 +What a project. + +00:00:17.930 --> 00:00:19.540 +It's been going on for a while. + +00:00:20.060 --> 00:00:26.580 +I've heard stuff from Sebastian that maybe something was brewing and all these things, + +00:00:26.820 --> 00:00:29.180 +but not too long ago you all announced it. + +00:00:30.340 --> 00:00:34.780 +And I heard that FastAPI, some people have been using it recently. + +00:00:35.080 --> 00:00:38.580 +You know, some of the surveys show that some people use it for websites. 
+
+00:00:38.760 --> 00:00:39.280
+I'm not sure.
+
+00:00:42.180 --> 00:00:42.660
+Rumors.
+
+00:00:43.660 --> 00:00:44.140
+Rumors.
+
+00:00:44.940 --> 00:00:45.420
+Rumors.
+
+00:00:45.820 --> 00:00:46.220
+Oh, my gosh.
+
+00:00:46.420 --> 00:00:48.400
+I mean, congratulations on that.
+
+00:00:48.420 --> 00:00:53.880
+But before we dive into FastAPI and FastAPI Cloud, let's just do a quick introduction.
+
+00:00:54.140 --> 00:00:55.260
+Who are you?
+
+00:00:55.360 --> 00:00:59.560
+We'll just go around the Brady Bunch squares of our live stream here and start with Sebastian.
+
+00:01:00.860 --> 00:01:01.820
+You've been on the show a few times.
+
+00:01:01.980 --> 00:01:06.580
+In fact, you've been on the show just recently for a really fun episode, Sebastian.
+
+00:01:07.230 --> 00:01:07.580
+Who are you?
+
+00:01:07.910 --> 00:01:08.740
+That was super fun.
+
+00:01:09.010 --> 00:01:10.200
+So hello, everyone.
+
+00:01:10.270 --> 00:01:12.380
+I'm Sebastian Ramirez or tiangolo.
+
+00:01:12.720 --> 00:01:14.380
+I created FastAPI.
+
+00:01:14.620 --> 00:01:20.660
+That is this Python framework for building web APIs and backends, in case you don't know.
+
+00:01:20.940 --> 00:01:24.100
+In case you've been living in a hole and haven't done any Python for 10 years.
+
+00:01:24.260 --> 00:01:32.640
+You also are famous for really pointing out the ridiculousness of modern tech recruiting.
+
+00:01:33.960 --> 00:01:35.140
+You know what I'm talking about?
+
+00:01:35.920 --> 00:01:37.500
+Yeah, you know, like it's fun.
+
+00:01:37.780 --> 00:01:44.800
+This is probably the thing that I am known for is for writing a tweet saying, yeah, that
+
+00:01:44.800 --> 00:01:53.659
+I saw a job post asking for five years of experience with FastAPI and I only had 2.5 since
+
+00:01:53.680 --> 00:01:54.520
+I created the thing.
+
+00:01:55.260 --> 00:01:58.040
+So you didn't qualify for the FastAPI job.
+
+00:01:58.040 --> 00:01:59.280
+I didn't qualify for it, yeah.
+
+00:01:59.720 --> 00:02:06.220
+And then the funny thing is, you know, like people sometimes, even people in Python itself
+
+00:02:06.660 --> 00:02:09.820
+and tell me like, oh, wait, like you're, and I said like, oh, yeah, I created this thing
+
+00:02:09.940 --> 00:02:10.399
+called FastAPI.
+
+00:02:10.440 --> 00:02:12.100
+Oh, wait, okay, so what is FastAPI?
+
+00:02:12.420 --> 00:02:15.540
+Oh, wait, you are the guy from the meme.
+
+00:02:16.799 --> 00:02:17.800
+Are you serious?
+
+00:02:18.800 --> 00:02:23.640
+Yeah, you know, like, suddenly that is super important that I am the guy from the meme
+
+00:02:23.680 --> 00:02:27.220
+about FastAPI. Not the guy from FastAPI, the guy from the meme.
+
+00:02:28.200 --> 00:02:30.720
+Israel, oh my gosh, I saw you on TikTok. It was amazing.
+
+00:02:31.640 --> 00:02:34.480
+It was my achievement. I wrote a viral tweet.
+
+00:02:35.280 --> 00:02:39.160
+You know what? Sometimes your moment in the sun is not the one you expected. No.
+
+00:02:39.580 --> 00:02:41.620
+Congratulations on how good FastAPI was.
+
+00:02:41.620 --> 00:02:41.900
+On the tweet.
+
+00:02:43.000 --> 00:02:46.140
+Exactly. You really nailed it. Patrick, welcome to the show.
+
+00:02:47.200 --> 00:02:50.180
+Hello. Thank you. It's nice to be here. Yeah, I'm Patrick.
+
+00:02:50.680 --> 00:02:55.220
+I guess the main thing I'm kind of known for in the community is like this library called
+
+00:02:55.380 --> 00:02:59.400
+Strawberry, which is pretty similar to FastAPI, but instead of REST it's for GraphQL.
+
+00:03:00.540 --> 00:03:06.240
+Other than that, I help organize PyCon Italy and I used to also do EuroPython as well,
+
+00:03:06.360 --> 00:03:08.860
+but I stopped because of way too many things.
+
+00:03:10.360 --> 00:03:11.980
+Yeah, that's pretty much me.
+
+00:03:12.200 --> 00:03:12.560
+Okay.
+
+00:03:13.120 --> 00:03:13.740
+Yeah, that's awesome.
+
+00:03:15.260 --> 00:03:16.920
+How do you see GraphQL these days?
+
+00:03:17.600 --> 00:03:18.540
+Is it still popular?
+
+00:03:20.140 --> 00:03:24.340
+I think it's mostly popular in the enterprises, unfortunately.
+
+00:03:25.910 --> 00:03:29.700
+To be honest, I'm a bit annoyed about the companies that do tooling around GraphQL because
+
+00:03:30.879 --> 00:03:32.960
+they're not really pushing it forward.
+
+00:03:33.180 --> 00:03:36.520
+They're just trying to work with enterprises and that's it.
+
+00:03:36.760 --> 00:03:38.720
+Or maybe they're pivoting to AI.
+
+00:03:41.480 --> 00:03:43.600
+Yeah, it feels a little bit like SOAP.
+
+00:03:44.270 --> 00:03:48.120
+The modern version of SOAP with the XML.
+
+00:03:49.160 --> 00:03:49.240
+Savannah.
+
+00:03:50.980 --> 00:03:51.380
+Yeah.
+
+00:03:52.060 --> 00:03:58.000
+I was just going to say that I think you, like, tapping out of being an organizer for EuroPython
+
+00:03:58.240 --> 00:04:03.460
+is like, you know, the classic open source oversubscribed doing all the things.
+
+00:04:04.000 --> 00:04:04.540
+Very relatable.
+
+00:04:07.580 --> 00:04:07.780
+Yeah.
+
+00:04:08.500 --> 00:04:08.600
+Yeah.
+
+00:04:09.420 --> 00:04:10.100
+But yeah, I'm Savannah.
+
+00:04:12.060 --> 00:04:13.060
+What can I say?
+
+00:04:13.960 --> 00:04:18.560
+I am on the Python Steering Council for 2026, which is very exciting.
+
+00:04:19.120 --> 00:04:26.980
+I am also the release manager for the upcoming version of Python, Python 3.16.
+
+00:04:28.410 --> 00:04:32.320
+And so that'll kick off later this year, which is really cool and very exciting.
+
+00:04:33.250 --> 00:04:40.380
+I work on CPython stuff, the JIT, argparse, basically whatever needs help is kind of where
+
+00:04:40.380 --> 00:04:40.880
+you'll find me.
+
+00:04:41.680 --> 00:04:42.280
+Awesome.
+
+00:04:43.240 --> 00:04:44.420
+Congratulations on the steering council.
+
+00:04:44.680 --> 00:04:47.900
+And yeah, that's a lot of cool stuff.
+
+00:04:49.600 --> 00:04:53.980
+Hopefully we don't get a Python 4.0 right after 3.16
+
+00:04:54.810 --> 00:04:57.260
+because then your job will never end is what I've learned.
+
+00:04:57.800 --> 00:04:58.260
+Yeah, yeah.
+
+00:04:58.760 --> 00:05:02.580
+Benjamin Peterson, Python 2.7 forever kind of situation.
+
+00:05:03.200 --> 00:05:03.520
+Yeah, yeah.
+
+00:05:03.860 --> 00:05:07.620
+I mean, release management is still, I mean, it's still quite a commitment.
+
+00:05:07.860 --> 00:05:11.799
+It's like seven-ish years when you think about all the staggered releases
+
+00:05:11.820 --> 00:05:12.940
+to your release management too.
+
+00:05:13.340 --> 00:05:15.520
+And then you have the five year maintenance cycle.
+
+00:05:15.920 --> 00:05:20.740
+So yeah, it's Python forever is what I can really say.
+
+00:05:21.440 --> 00:05:22.840
+- Yeah, it's probably not a fad.
+
+00:05:22.840 --> 00:05:24.520
+It's probably gonna stick around this Python thing.
+
+00:05:27.380 --> 00:05:28.620
+No, that's awesome. Congratulations.
+
+00:05:29.020 --> 00:05:31.160
+Also cool with argparse.
+
+00:05:31.360 --> 00:05:34.100
+I feel like that's making a strong comeback now
+
+00:05:34.340 --> 00:05:39.299
+that we have these AI things that can just
+
+00:05:40.460 --> 00:05:42.560
+put stuff together for us instead of like,
+
+00:05:42.570 --> 00:05:44.400
+oh, I need to depend on this library and that library.
+
+00:05:44.560 --> 00:05:47.140
+I just need to take a few arguments
+
+00:05:47.370 --> 00:05:48.380
+and have a little help text.
+
+00:05:48.450 --> 00:05:50.300
+And it's like, well, you've already got this built-in thing.
+
+00:05:50.420 --> 00:05:51.880
+Oh, who knew?
+
+00:05:52.240 --> 00:05:53.480
+People were like, oh, I didn't even know.
+
+00:05:53.530 --> 00:05:56.380
+I thought I used Typer or Click or something, right?
+
+00:05:57.140 --> 00:05:59.480
+- Yeah, I mean, I was gonna say,
+
+00:06:00.320 --> 00:06:02.440
+there's the Typers and Clicks of the world,
+
+00:06:02.640 --> 00:06:04.820
+but sometimes you just want the simplest thing
+
+00:06:04.980 --> 00:06:06.800
+and argparse is pretty great at that,
+
+00:06:07.080 --> 00:06:09.580
+although it has many quirks that are probably
+
+00:06:09.980 --> 00:06:12.160
+and most definitely unfixable at this point
+
+00:06:12.460 --> 00:06:14.980
+because bugs are features when you have things
+
+00:06:15.140 --> 00:06:16.320
+that have been around as long as Python.
+
+00:06:18.200 --> 00:06:20.720
+But yeah, no, I mean, AI loves to write Python.
+
+00:06:22.000 --> 00:06:24.340
+I think it's like the language used the most
+
+00:06:24.940 --> 00:06:26.240
+in AI generated code, so.
+
+00:06:26.460 --> 00:06:29.180
+- Yeah, I'll just say we live in weird times.
+
+00:06:29.480 --> 00:06:30.320
+We live in very weird times.
+
+00:06:30.820 --> 00:06:33.160
+- I would love a precedented time at some time.
+
+00:06:33.220 --> 00:06:33.960
+- Exactly, yeah.
+
+00:06:34.130 --> 00:06:35.460
+Can we just get the boring times?
+
+00:06:35.860 --> 00:06:37.440
+Nothing interesting, please.
+
+00:06:38.180 --> 00:06:38.400
+Yeah.
+
+00:06:38.980 --> 00:06:39.280
+All right.
+
+00:06:39.950 --> 00:06:45.680
+Oh, also, what I said about GraphQL may sound like a bit of a bash,
+
+00:06:45.860 --> 00:06:49.760
+but I didn't mean it in a negative, super negative way anyway.
+
+00:06:50.050 --> 00:06:54.240
+Like, it used to be all the enterprises were all about SOAP and WSDL
+
+00:06:54.460 --> 00:06:55.760
+and, like, subscribing your tooling.
+
+00:06:57.240 --> 00:06:57.980
+Please don't write me.
+
+00:06:58.030 --> 00:06:59.520
+I'm not trying to bash on your technology.
+
+00:07:01.860 --> 00:07:02.220
+All right.
+
+00:07:02.980 --> 00:07:04.160
+Jonathan, also, welcome.
+
+00:07:04.410 --> 00:07:04.520
+Hi.
+
+00:07:05.380 --> 00:07:05.600
+Hi.
+
+00:07:06.560 --> 00:07:10.060
+Yeah, I'm not nearly as famous as everyone else in this call.
+
+00:07:10.940 --> 00:07:17.040
+I'm more infamous internally at FastAPI Cloud, I would say, for a bunch of things.
+
+00:07:17.580 --> 00:07:20.300
+I've heard of emojis or something along those lines.
+
+00:07:20.300 --> 00:07:20.660
+One meme away.
+
+00:07:20.900 --> 00:07:21.980
+You're just one meme away.
+
+00:07:22.300 --> 00:07:23.380
+Just one meme away.
+
+00:07:23.680 --> 00:07:24.420
+Yeah, that's true.
+
+00:07:25.040 --> 00:07:26.500
+We keep piling them up internally.
+
+00:07:27.840 --> 00:07:32.440
+But yeah, I used to work with Patrick together for years, also on GraphQL.
+
+00:07:33.140 --> 00:07:33.720
+Same like we assume.
+
+00:07:34.260 --> 00:07:35.540
+That's how I know him.
+
+00:07:35.820 --> 00:07:40.060
+And that's why I, well, made a weird sound when you said so.
+
+00:07:42.320 --> 00:07:47.780
+Yeah, and I've been with FastAPI Cloud since EuroPython, actually, the last one.
+
+00:07:48.440 --> 00:07:52.100
+I promised Sebastian I would implement some server-sent events in FastAPI,
+
+00:07:52.100 --> 00:07:58.180
+and I haven't started with it yet at all, but somehow I'm still here.
+
+00:07:58.240 --> 00:07:58.760
+So that's great.
+
+00:08:00.900 --> 00:08:05.300
+Well, actually, and that's actually like a sneak peek, I guess.
+
+00:08:05.340 --> 00:08:10.160
+We already started, like, having a bunch of chats and, like, discussing what we do. Should we do it
+
+00:08:10.320 --> 00:08:14.460
+here? Should we do it there? What should we do? So, like, yeah, it's something that is coming to Fast
+
+00:08:14.520 --> 00:08:20.480
+API probably soonish. Like, there was a lot of things that needed to happen before that. Like,
+
+00:08:20.900 --> 00:08:27.300
+Patrick is slightly smiley, like, oh no, this is pressure. There were some things that needed to
+
+00:08:27.380 --> 00:08:31.480
+happen in FastAPI, like, you know, dropping support for Pydantic version one or things like that,
+
+00:08:31.500 --> 00:08:33.960
+that just made the code, the internal code, so complex.
+
+00:08:34.500 --> 00:08:37.740
+And now that it's over, we can actually work more
+
+00:08:37.940 --> 00:08:39.800
+on improving performance, adding features,
+
+00:08:39.969 --> 00:08:40.680
+and things like that.
+
+00:08:40.880 --> 00:08:41.200
+So yeah.
+
+00:08:43.000 --> 00:08:44.080
+Yeah, very exciting.
+
+00:08:44.240 --> 00:08:51.060
+I definitely want to dive into how FastAPI Cloud
+
+00:08:51.320 --> 00:08:56.660
+has sort of influenced the whole FastAPI side of things.
+
+00:08:57.360 --> 00:09:03.000
+But I'm made aware that there is, in fact, an entire website.
+
+00:09:05.980 --> 00:09:08.120
+An entire website dedicated to the meme.
+
+00:09:08.940 --> 00:09:09.480
+All right.
+
+00:09:10.080 --> 00:09:13.560
+Yeah, and out of the audience we get, hey, everyone, is that the guy from the meme?
+
+00:09:13.840 --> 00:09:16.580
+And the meme is greater than Nobel Prize.
+
+00:09:17.080 --> 00:09:19.020
+So, you know what?
+
+00:09:19.480 --> 00:09:20.280
+It may be true.
+
+00:09:20.620 --> 00:09:21.160
+It may be true.
+
+00:09:21.220 --> 00:09:24.660
+I recognize the person saying, this is the guy from the meme.
+
+00:09:24.840 --> 00:09:25.980
+He might be my husband.
+
+00:09:31.300 --> 00:09:38.760
+Incredible, incredible. All right, well, let's start with FastAPI Cloud and then we'll bring it back
+
+00:09:38.980 --> 00:09:45.540
+around to the FastAPI. Let's, let's talk origin story. So what is this FastAPI Cloud?
+
+00:09:47.380 --> 00:09:51.899
+Nice. So, uh, here we're looking at the FastAPI Labs website that doesn't really show
+
+00:09:52.100 --> 00:09:56.700
+that much. If you click on the join the waiting list that takes you to the website for FastAPI
+
+00:09:56.840 --> 00:10:01.780
+Cloud, there we can see like this is what we are building. This is the thing that we are doing.
+
+00:10:02.620 --> 00:10:07.280
+It's actually super simple. The funny thing is that the pitch, the explanation of the product
+
+00:10:07.540 --> 00:10:13.920
+is so short. So it's one command. It's FastAPI Deploy. And you have a FastAPI app, you just hit
+
+00:10:13.980 --> 00:10:18.380
+FastAPI Deploy, and then it's on the cloud. We take care of everything. We build a thing,
+
+00:10:19.420 --> 00:10:23.180
+deploy it, handle HTTPS, auto-scaling, all this stuff.
+
+00:10:23.700 --> 00:10:27.460
+And then you can just focus on building the application, building apps.
+
+00:10:28.420 --> 00:10:30.500
+The funny thing is that it's super short to explain,
+
+00:10:30.500 --> 00:10:32.560
+but then building it is so complex.
+
+00:10:35.200 --> 00:10:38.500
+I feel, well, first of all, I'm glad it's so short.
+
+00:10:38.580 --> 00:10:39.460
+So thanks for being here.
+
+00:10:39.480 --> 00:10:40.440
+That was a great show, y'all.
+
+00:10:40.720 --> 00:10:41.540
+Well, I'm just kidding.
+
+00:10:44.360 --> 00:10:49.240
+No, I think it's a little bit like Jupyter Notebooks.
+
+00:10:49.320 --> 00:10:56.000
+in that sense that, like, you all are taking one for the team so that other people can have a simple
+
+00:10:56.340 --> 00:11:01.580
+experience. Whereas, you know, it's like those, those Jupyter folks, they write tons of TypeScript and
+
+00:11:01.580 --> 00:11:06.140
+do all sorts of things that nobody wants to necessarily do in the data science space so that
+
+00:11:06.160 --> 00:11:14.319
+you can just drag your widgets around, you know what I mean? Um, exactly. Yeah, yeah, yeah, thanks. I feel
+
+00:11:14.340 --> 00:11:22.500
+like, no, I was just, I feel like the deployment space, it's a bit of a mixed bag. And it's, my, I've
+
+00:11:22.580 --> 00:11:28.220
+been really frustrated to the point such that I wrote a book about it, that I think about an
+
+00:11:28.380 --> 00:11:36.180
+alternative, that I think over the last five plus years it's just trended towards a little more
+
+00:11:36.480 --> 00:11:41.219
+complex, a little more complex. Oh, could we just add one of these things? And oh, now we got these
+
+00:11:41.240 --> 00:11:45.160
+through, we need one more thing to, like, make sure those things are doing, you know what I mean? And
+
+00:11:45.220 --> 00:11:52.080
+it's just like, wow, why are there 200 choices in my console to use this? Which is like kind of funny,
+
+00:11:52.260 --> 00:11:57.000
+right? Because I feel like a lot of these companies started with this, like, I don't want to understand
+
+00:11:57.260 --> 00:12:01.380
+all the ins and outs of all the infrastructure that comes with the cloud service provider, and
+
+00:12:01.480 --> 00:12:04.759
+that's really complicated to understand because I'm an app dev and I don't know anything about
+
+00:12:05.020 --> 00:12:11.060
+servers, right? And now we're like, I don't know, kind of slowly accumulating complexity.
But I
+
+00:12:11.070 --> 00:12:15.620
+think one of the cool things about what we're building, and I like I've worked on cloud tooling
+
+00:12:15.790 --> 00:12:20.540
+before is like, this is like, just bespoke for Python developers. And I think that's like,
+
+00:12:20.880 --> 00:12:26.540
+quite, like unique in that, like, we are really trying to like, bring the bleeding edge and like
+
+00:12:26.620 --> 00:12:34.740
+all the new tooling that people are using and making sure that we play well with like uv. And
+
+00:12:34.740 --> 00:12:37.480
+there's care put into that by the team.
+
+00:12:37.660 --> 00:12:39.240
+Yeah. That's a super good point.
+
+00:12:39.300 --> 00:12:43.400
+I mean, I remember Azure came out with like,
+
+00:12:43.480 --> 00:12:44.640
+here's your platform as a service.
+
+00:12:44.860 --> 00:12:47.440
+You just upload your web app and we'll just take it and go.
+
+00:12:47.660 --> 00:12:50.960
+And now that thing is so complicated along with many, many others.
+
+00:12:51.180 --> 00:12:51.940
+It's not just them.
+
+00:12:52.040 --> 00:12:55.120
+It's you've got AWS, you've got Vercel.
+
+00:12:55.220 --> 00:12:57.440
+There's lots of things we could point at for,
+
+00:12:58.540 --> 00:12:59.380
+there's a lot of options here.
+
+00:13:00.820 --> 00:13:02.620
+And then there are a lot of tools and like,
+
+00:13:02.800 --> 00:13:07.500
+You know, like many tools and many companies are also doing a great job at many of the things that they are doing.
+
+00:13:08.100 --> 00:13:09.840
+But in many cases, it's just so complex.
+
+00:13:10.000 --> 00:13:10.700
+It's so complicated.
+
+00:13:10.920 --> 00:13:19.340
+You know, like I was I have always been so adamant, I think is the word, to just teaching people how to use the tools.
+
+00:13:19.920 --> 00:13:25.980
+I think I have the most documentation about how to deploy things on your own than any other framework.
+
+00:13:26.160 --> 00:13:27.440
+I have so much information.
+
+00:13:27.960 --> 00:13:29.520
+I hear that all the time from people.
+
+00:13:29.800 --> 00:13:32.060
+They say one of the reasons they chose FastAPI
+
+00:13:32.080 --> 00:13:34.360
+is because of how clear the documentation was, you know?
+
+00:13:35.319 --> 00:13:37.840
+- Yeah, and then the thing is, you know,
+
+00:13:37.960 --> 00:13:39.780
+like just learning all those concepts
+
+00:13:40.000 --> 00:13:42.520
+and like learning all this stuff that needs to be learned
+
+00:13:42.820 --> 00:13:43.700
+just to deploy something
+
+00:13:43.820 --> 00:13:45.860
+and then just you barely have like the minimum.
+
+00:13:46.500 --> 00:13:48.660
+It's like, this is, you know, like it's just too much.
+
+00:13:48.780 --> 00:13:49.800
+It's too much complexity.
+
+00:13:50.480 --> 00:13:54.560
+I think for me, like I guess like personally,
+
+00:13:54.880 --> 00:13:58.719
+my analogy is that FastAPI Cloud is the equivalent
+
+00:13:58.740 --> 00:14:02.600
+of what FastAPI is to building web APIs and backend.
+
+00:14:03.000 --> 00:14:04.340
+You know, like you could do the same
+
+00:14:04.600 --> 00:14:05.340
+with any other framework.
+
+00:14:05.400 --> 00:14:07.920
+You could validate data, you could generate OpenAPI,
+
+00:14:08.080 --> 00:14:09.240
+you could have automatic docs,
+
+00:14:09.840 --> 00:14:12.560
+but you will probably have to do a lot of the wiring yourself
+
+00:14:13.300 --> 00:14:14.640
+and making sure that it's actually correct
+
+00:14:14.880 --> 00:14:16.640
+and that it doesn't explode, all the stuff.
+
+00:14:17.360 --> 00:14:22.540
+That is, you know, like we are trying to do a lot of that work
+
+00:14:23.019 --> 00:14:24.280
+for the final users.
+
+00:14:26.339 --> 00:14:27.640
+- Yeah, and I think it's great.
+ +00:14:28.280 --> 00:14:33.220 +I think it's really nice to just provide this on-ramp + +00:14:33.420 --> 00:14:36.220 +because as you said at the opening, + +00:14:36.360 --> 00:14:38.120 +when I asked the origin story, + +00:14:38.280 --> 00:14:42.660 +it's just FastAPI deploy, right? + +00:14:43.300 --> 00:14:45.000 +That solves so many stories. + +00:14:45.120 --> 00:14:46.360 +And I'm sure behind the scenes, + +00:14:46.460 --> 00:14:49.740 +what happens is just about as simple as that. + +00:14:51.520 --> 00:14:52.180 +Oh, my gosh. + +00:14:52.840 --> 00:14:53.520 +About that. + +00:14:56.800 --> 00:15:00.400 +Some of us don't even get to write Python anymore to make all of this happen. + +00:15:01.920 --> 00:15:03.560 +I'm taking one for the team. + +00:15:05.540 --> 00:15:07.100 +Yeah, that is taking one for our team, right? + +00:15:07.920 --> 00:15:08.440 +It is. + +00:15:09.240 --> 00:15:09.320 +Yeah. + +00:15:11.560 --> 00:15:14.000 +Let's save the internals for a little bit later. + +00:15:14.360 --> 00:15:19.000 +Maybe what we could do right now, maybe we could do a bit of a walkthrough + +00:15:19.940 --> 00:15:26.660 +of just kind of what it's like to set up an app from scratch, right? + +00:15:27.700 --> 00:15:28.020 +Nice. + +00:15:28.120 --> 00:15:31.280 +I see that uv is here, which is, + +00:15:32.080 --> 00:15:39.000 +I've been certainly an advocate for uv in all sorts of deployment, + +00:15:39.020 --> 00:15:42.860 +but especially when you have like repeated build type of scenarios + +00:15:43.100 --> 00:15:46.900 +for like Docker, Docker Compose or Kubernetes or whatever. + +00:15:47.400 --> 00:15:50.620 +UV makes that stuff so much faster and so on. + +00:15:50.740 --> 00:15:54.460 +So who would like to be my guide that just kind of talks us through what it + +00:15:54.580 --> 00:15:56.100 +means to set up a new project here? + +00:15:58.320 --> 00:15:58.680 +Patrick. + +00:15:59.480 --> 00:15:59.620 +Yeah. 
+ +00:16:00.180 --> 00:16:01.460 +I feel like it should be Patrick. + +00:16:02.440 --> 00:16:04.520 +That was great to suggest, Lavana, but I can. + +00:16:05.620 --> 00:16:06.400 +I've talked enough. + +00:16:06.560 --> 00:16:07.060 +I feel like. + +00:16:07.760 --> 00:16:08.880 +Patrick has the best mic. + +00:16:10.460 --> 00:16:10.800 +Oh, yeah. + +00:16:11.200 --> 00:16:11.580 +That's the reason. + +00:16:11.740 --> 00:16:12.180 +That could be. + +00:16:13.420 --> 00:16:13.580 +Yeah. + +00:16:14.380 --> 00:16:15.600 +I mean, do you want to go from scratch? + +00:16:15.760 --> 00:16:19.680 +I mean, there is this really nice command that Savannah built, + +00:16:19.920 --> 00:16:23.200 +which is FastAPI-new, which I think is something, I don't know, + +00:16:25.010 --> 00:16:25.740 +super helpful. + +00:16:27.740 --> 00:16:30.020 +Yeah, so what does FastAPI-new do? + +00:16:31.000 --> 00:16:35.020 +Is that kind of a cookie-cutter-esque experience, or what is it? + +00:16:35.800 --> 00:16:36.360 +Yes, exactly. + +00:16:36.580 --> 00:16:41.320 +At the moment, Onesco holds a super basic FastAPI application using uv. + +00:16:41.660 --> 00:16:44.400 +It also installs dependencies, creates a folder, everything that you need. + +00:16:46.220 --> 00:16:49.040 +In future, I think we're going to plan support for templates + +00:16:49.300 --> 00:16:51.880 +so you can build multiple kind of things as well. + +00:16:52.530 --> 00:16:55.180 +But for now, it's basically just uv FastAPI new, + +00:16:55.560 --> 00:16:56.980 +sorry, uvx FastAPI new, + +00:16:57.420 --> 00:16:59.040 +and then that scaffolds the project for you. + +00:17:00.560 --> 00:17:03.360 +I don't know if you want to try it live or... + +00:17:04.040 --> 00:17:04.800 +No, go ahead. + +00:17:05.380 --> 00:17:06.260 +Let's just... + +00:17:06.579 --> 00:17:08.240 +I think it might have disrupted you. + +00:17:08.240 --> 00:17:09.280 +Just let's talk us through it. 
+ +00:17:12.140 --> 00:17:13.180 +It could work. + +00:17:13.240 --> 00:17:15.000 +I'm just going to put that out there. + +00:17:15.180 --> 00:17:15.760 +I'm sure. + +00:17:16.439 --> 00:17:16.819 +I'm sure. + +00:17:17.160 --> 00:17:18.699 +I've had a few times recently. + +00:17:19.939 --> 00:17:21.079 +I started to do... + +00:17:22.160 --> 00:17:23.620 +I'll tell you the most insane, + +00:17:23.959 --> 00:17:25.939 +like let's do that live on the podcast experience. + +00:17:26.140 --> 00:17:26.860 +I'm pretty sure, + +00:17:27.600 --> 00:17:29.160 +yeah, this is definitely the most insane. + +00:17:29.740 --> 00:17:31.760 +I had Matthew Rocklin on from Coiled + +00:17:32.320 --> 00:17:34.100 +and those guys are all about like, + +00:17:34.100 --> 00:17:35.080 +hey, we're going to scale up + +00:17:35.140 --> 00:17:37.660 +like a bunch of available servers for you, right? + +00:17:38.160 --> 00:17:39.440 +So that you can do your data science. + +00:17:39.500 --> 00:17:41.140 +Like I want to do some ML thing + +00:17:41.140 --> 00:17:42.520 +and it needs 500 servers. + +00:17:43.160 --> 00:17:47.020 +So during the podcast, he said, oh, let me just spin up 2,000 EC2 instances. + +00:17:47.300 --> 00:17:47.500 +Hold on. + +00:17:48.480 --> 00:17:50.520 +And then we ran some code on it during the show. + +00:17:50.600 --> 00:17:51.720 +And he's like, oh, let's try that on ARM. + +00:17:51.800 --> 00:17:54.560 +And then spin up another 2,000 on ARM Linux machines. + +00:17:54.700 --> 00:17:55.700 +I'm like, okay, that's nuts. + +00:17:57.180 --> 00:17:57.940 +But let's just... + +00:17:58.320 --> 00:17:58.980 +That's a lot of time. + +00:18:00.820 --> 00:18:01.280 +I was impressed. + +00:18:01.620 --> 00:18:03.860 +But Patrick, sorry. + +00:18:04.200 --> 00:18:04.960 +I had a debrief there. + +00:18:05.540 --> 00:18:06.020 +Let's talk through it. + +00:18:07.300 --> 00:18:10.160 +Yeah, so you do uvx FastAPI new. 
+ +00:18:10.600 --> 00:18:12.580 +Then you specify the name of the application. + +00:18:13.160 --> 00:18:14.480 +and that's almost there. + +00:18:14.480 --> 00:18:16.580 +You just need one more command to deploy, + +00:18:16.940 --> 00:18:17.700 +which is FastAPI deploy. + +00:18:18.780 --> 00:18:20.220 +The first time it's going to ask you to log in + +00:18:20.660 --> 00:18:22.900 +or join the waiting list if you haven't been invited yet. + +00:18:23.320 --> 00:18:24.320 +So you're still in beta. + +00:18:25.980 --> 00:18:27.400 +And then you follow the steps. + +00:18:27.690 --> 00:18:29.760 +So like FastAPI deploy, log in, + +00:18:31.000 --> 00:18:33.580 +decide the team, if you have multiple teams, + +00:18:34.740 --> 00:18:35.860 +decide the application name, + +00:18:36.780 --> 00:18:38.200 +and then you wait a few seconds + +00:18:38.460 --> 00:18:39.580 +and the application is going to be live. + +00:18:40.740 --> 00:18:42.100 +And just to be clear, + +00:18:42.220 --> 00:18:46.320 +FastAPI new is not required if you already have a FastAPI app. + +00:18:46.570 --> 00:18:49.640 +Like if you've already written your own code and you have your application, + +00:18:50.280 --> 00:18:53.020 +you can just go right into like logging in and deploying. + +00:18:53.540 --> 00:18:56.160 +This is just so that if you're starting something new, + +00:18:56.600 --> 00:18:59.980 +you don't have to do any thinking about all the right things that need to be there. + +00:19:00.150 --> 00:19:02.060 +So this is more of a green field application. + +00:19:02.500 --> 00:19:04.140 +I'm bootstrapping a project. + +00:19:04.640 --> 00:19:07.800 +Right, right, because you want to have the best structure. + +00:19:08.620 --> 00:19:10.120 +Now it uses uv. + +00:19:11.340 --> 00:19:13.420 +So do I have to use-- + +00:19:13.420 --> 00:19:14.020 +You can use required. 
+ +00:19:14.720 --> 00:19:17.220 +Yeah, I was going to say, do I have to use the uv project + +00:19:17.540 --> 00:19:18.420 +management type of thing? + +00:19:18.700 --> 00:19:23.540 +Do I have to use the uv.lock files and uv add uv sync? + +00:19:23.800 --> 00:19:24.920 +Can I do requirements.txt? + +00:19:25.060 --> 00:19:25.820 +What's the story there? + +00:19:26.559 --> 00:19:29.740 +Yeah, so we support uv with uv lock. + +00:19:29.880 --> 00:19:31.300 +We also support the-- + +00:19:31.380 --> 00:19:33.300 +forget the name-- the PyLock file. + +00:19:33.700 --> 00:19:36.140 +And we also support plain requirements.txt. + +00:19:36.740 --> 00:19:38.980 +And maybe something else I don't know Jonathan can use. + +00:19:39.060 --> 00:19:40.020 +PyLock's pretty new, right? + +00:19:40.200 --> 00:19:42.820 +I think Brett Cannon just got that out pretty recently, right? + +00:19:43.400 --> 00:19:43.620 +- Yeah. + +00:19:44.730 --> 00:19:46.220 +- Brett was pretty excited, I think. + +00:19:46.420 --> 00:19:46.720 +- I know. + +00:19:47.080 --> 00:19:48.460 +- He implemented that. + +00:19:48.780 --> 00:19:49.180 +- Oh, was he? + +00:19:49.370 --> 00:19:50.600 +Okay, I'm sure he was, that's awesome. + +00:19:50.780 --> 00:19:52.300 +He put years of work into that. + +00:19:53.020 --> 00:19:53.500 +- Sure did. + +00:19:53.550 --> 00:19:53.640 +- Yeah. + +00:19:55.100 --> 00:19:57.100 +I can also say that one of the motivations + +00:19:57.400 --> 00:19:59.660 +was also like, you know, like cloud providers. + +00:20:00.100 --> 00:20:02.440 +So it's like, yes, we did it. + +00:20:03.240 --> 00:20:03.520 +- Okay. + +00:20:03.580 --> 00:20:04.760 +- The other thing is like, you know, + +00:20:04.900 --> 00:20:07.000 +if you use other different package managers, + +00:20:08.000 --> 00:20:10.980 +if they use the standard PI project, the Tomo format, + +00:20:11.360 --> 00:20:12.240 +that will also be supported. 
+ +00:20:13.760 --> 00:20:16.400 +That means that, you know, like if you use PDM + +00:20:16.520 --> 00:20:19.660 +or if you use poetry with one of the recent versions, + +00:20:20.080 --> 00:20:20.820 +like that will work. + +00:20:21.120 --> 00:20:23.040 +If you use a very old version of poetry + +00:20:23.360 --> 00:20:25.660 +or like you use some other strange package manager + +00:20:25.720 --> 00:20:27.660 +or something that will probably be problematic. + +00:20:28.220 --> 00:20:30.820 +But for like most of the use cases + +00:20:30.960 --> 00:20:33.840 +that use the standard package formats, it will just work. + +00:20:34.640 --> 00:20:35.460 +And if you use uv, + +00:20:35.820 --> 00:20:43.820 +like you're gonna have the best experience because we are fans of uv and astra yeah they've definitely + +00:20:44.240 --> 00:20:49.800 +put a dent in the way that sort of python gets started and making that a lot easier so it totally + +00:20:49.940 --> 00:20:56.880 +makes sense and also i noticed speaking of uv that there's at least in the recommended way + +00:20:57.780 --> 00:21:04.160 +or over the way in the docs let's say it doesn't say here's how you install fast api you just + +00:21:04.880 --> 00:21:13.140 +here's how you run fast api dash new leveraging uv which then will you know silently install + +00:21:13.660 --> 00:21:19.320 +and manage all right that's pretty neat that that helps you guys tell a simpler story + +00:21:20.500 --> 00:21:25.220 +right yeah here's how you create the virtual environment to install our thing and so exactly + +00:21:25.980 --> 00:21:30.759 +yeah the idea is to make it like as someone i was saying just super simple for people just to + +00:21:30.820 --> 00:21:36.080 +start from scratch, like no idea how to create an app, how to start, how to create an environment. + +00:21:36.200 --> 00:21:41.020 +It's just you run this command and you're off to go. After the races, I'm missing. 
+ +00:21:42.520 --> 00:21:49.780 +Anyway, that's what Colombians do. But then if you already have an app, + +00:21:50.260 --> 00:21:55.620 +you have anything with FastAPI standard installed, then that also just works. + +00:21:56.700 --> 00:22:03.100 +Yeah. Okay. And Savannah, you pointed out that it doesn't have to be a new project. If you want to + +00:22:03.130 --> 00:22:09.240 +start from an existing one, that's totally fine. But what do I got to do if I'm starting from, + +00:22:09.620 --> 00:22:14.060 +if I'm migrating an existing one? Like how easy or hard is this? + +00:22:15.000 --> 00:22:21.799 +Yeah. I mean, honestly, like I have some like legacy project demo apps I've built at other + +00:22:21.820 --> 00:22:28.180 +companies I've worked with that have used FastAPI and I literally just ran like FastAPI login and + +00:22:28.260 --> 00:22:34.600 +then FastAPI deploy and it just worked which felt really magical right like I think that's like I + +00:22:34.600 --> 00:22:39.040 +don't know like having worked on cloud products for quite a while like I think one of the biggest + +00:22:39.320 --> 00:22:44.160 +gaps is like the just I don't know like the disparity between like my local dev environment + +00:22:44.380 --> 00:22:50.519 +and what is actually like lives up in the cloud somewhere and so being able to just run one command + +00:22:50.540 --> 00:22:55.440 +and then having the project as it exists on my machine go and work somewhere without having to + +00:22:55.440 --> 00:23:00.320 +think about like the infrastructure. And of course, like, you know, we want to be like amenable to + +00:23:00.480 --> 00:23:05.940 +folks who do want a little bit, you know, like higher touch. But we also want to work for people + +00:23:05.950 --> 00:23:11.760 +who are like learning FastAPI and Python, right? Like educators and people that are teaching Python. 
+ +00:23:11.970 --> 00:23:16.540 +I think this is like something that we've had some interest in as well from most folks. So, + +00:23:18.260 --> 00:23:37.760 +Yeah, I was just listening to the Teaching Python podcast folks just the other day and thinking, you know, like this, when I look at this, I know this is not necessarily your focus, but certainly people who are trying to teach a class, be it college class or high school class or whatever. + +00:23:39.100 --> 00:23:44.440 +And if you build anything on the web, the next question is, this is cool. + +00:23:44.660 --> 00:23:45.780 +How do I share it with people? + +00:23:46.100 --> 00:23:49.460 +And then like, oh, no, no, no, hold on. + +00:23:49.920 --> 00:23:53.400 +Are you like in like coding boot camps, right? + +00:23:53.540 --> 00:23:58.820 +Like if you're teaching someone how to write Python or how to write, build an API with + +00:23:58.960 --> 00:24:04.020 +FastAPI, like actually setting up the environment for them to deploy is not part of it. + +00:24:04.640 --> 00:24:04.720 +Right. + +00:24:04.960 --> 00:24:06.900 +Like that's not actually part of the curriculum. + +00:24:07.060 --> 00:24:11.240 +It's like this peripheral thing that ends up eating up a bunch of the educators time or + +00:24:11.240 --> 00:24:15.000 +the students time trying to understand both like how to write code and then also understand + +00:24:15.040 --> 00:24:15.620 +cloud stuff. + +00:24:15.920 --> 00:24:18.700 +And that's like a lot to ask people when they're fresh out the gate. + +00:24:19.500 --> 00:24:23.560 +Yeah. I mean, I feel the same way about like tutorials and stuff at conferences. + +00:24:24.360 --> 00:24:27.540 +Yeah. Or, yeah. Or training sessions. + +00:24:27.950 --> 00:24:31.000 +If you're doing like corporate training or like, they're all like, Oh, well, + +00:24:31.180 --> 00:24:35.720 +let's get everybody's machine work. There goes an hour, whatever. 
+ +00:24:36.040 --> 00:24:39.040 +But yeah, if you can just say, look, I think when you're, + +00:24:40.320 --> 00:24:41.900 +either when you're trying to learn something, + +00:24:42.360 --> 00:24:48.900 +be it through school or on your own or through these like more structured ways like boot camps and + +00:24:49.740 --> 00:24:56.360 +training and so on i think if it's not the main purpose i feel so often there's like we're going + +00:24:56.360 --> 00:25:03.080 +to do 20 steps for four hours before you get any sort of reward of what you've done and if you can + +00:25:03.140 --> 00:25:07.180 +go okay do you have it running okay now you run this command look now it's on the internet like + +00:25:07.210 --> 00:25:11.699 +oh wait awesome i got an app on the internet everybody look at me you know what i mean and + +00:25:11.820 --> 00:25:16.920 +I think shortening that cycle to where people can have that aha moment. + +00:25:17.070 --> 00:25:19.820 +And then later they can dive into like, well, how is it really working? + +00:25:20.160 --> 00:25:21.380 +And what do we really need to understand? + +00:25:21.980 --> 00:25:28.900 +But that quick iteration cycle, especially in the early parts of learning new tech, it's really important. + +00:25:30.060 --> 00:25:35.480 +But also, you know, like down the line as well, I think, like, I don't know, there are so many things that I have been wanting to build. + +00:25:36.140 --> 00:25:41.540 +And I don't, but I didn't because it was just so complex to deploy stuff. + +00:25:41.630 --> 00:25:44.000 +You know, like knowing, knowing how to do the whole thing, + +00:25:44.220 --> 00:25:48.440 +how to set up the clusters, the machines, install the Linux systems, + +00:25:49.100 --> 00:25:51.160 +deploy the cluster, whatever, like all the stuff, + +00:25:51.680 --> 00:25:54.420 +deploy the things, handling load balancers and HTTPS. + +00:25:54.420 --> 00:25:57.120 +And like, you know, like I know how to do that. 
+ +00:25:57.280 --> 00:26:01.800 +I built one of the most popular websites teaching how to use Docker Swarm, + +00:26:02.000 --> 00:26:04.940 +which was like the contender before Kubernetes won everything. + +00:26:06.680 --> 00:26:09.460 +But still, it's just so complicated doing all those steps. + +00:26:09.580 --> 00:26:12.660 +I'll just not do it. + +00:26:15.100 --> 00:26:18.500 +Now I can just play around and do random stuff and just deploy, + +00:26:18.660 --> 00:26:19.400 +and it just works. + +00:26:20.620 --> 00:26:21.260 +I really like that. + +00:26:21.940 --> 00:26:26.140 +I guess coming back to that taking one for the team point + +00:26:26.300 --> 00:26:28.520 +earlier, I feel like building Python tooling + +00:26:28.820 --> 00:26:30.560 +is kind of like taking one for the team sometimes, + +00:26:31.420 --> 00:26:34.760 +because you have these folks that are brand new to Python. + +00:26:35.440 --> 00:26:38.700 +Python is an extremely approachable language for people who are new to writing code. + +00:26:38.780 --> 00:26:44.540 +But then, you know, we also want to make FastAPI cloud work for someone that's building like an enterprise grade application. + +00:26:44.900 --> 00:26:45.040 +Right. + +00:26:45.540 --> 00:26:57.220 +And so like pretty wide spectrum of folks with like a million different use cases and different types of applications they want to deploy with different constraints and like security stuff. + +00:26:57.620 --> 00:27:01.240 +And like, so, yeah, I think, I don't know, maybe that's just like Python tooling. + +00:27:02.240 --> 00:27:05.900 +It's a lot of work, I guess, to build something that works for the masses. + +00:27:06.920 --> 00:27:10.260 +Yeah, well, it's certainly tough to make something that feels simple, + +00:27:11.000 --> 00:27:14.780 +but it's not overly simplistic, you know, that can actually solve the problems. + +00:27:15.140 --> 00:27:17.860 +Has the right knobs for the right users too, right? 
+ +00:27:18.920 --> 00:27:22.900 +I would argue we're not only trying to do it simple and easy. + +00:27:23.180 --> 00:27:27.620 +I feel like we're choosing a particular flavor of simple, which is... + +00:27:28.220 --> 00:27:30.080 +We have this discussion a few times. + +00:27:30.340 --> 00:27:36.320 +like if you make a cloud how do we make it feel pythonic like what does that mean in a cloud + +00:27:36.540 --> 00:27:41.620 +setting like we talk about pythonic libraries by them a coding style in the community a lot + +00:27:42.160 --> 00:27:46.700 +and now we kind of try to transfer that like that flavor that feeling to the cloud and make + +00:27:47.100 --> 00:27:52.120 +everything around that feel just like we want our libraries to feel so you feel at home as a + +00:27:52.220 --> 00:27:57.599 +pipeline developer and it just feels right so that's an extra step on top of making it simple + +00:27:57.640 --> 00:28:00.760 +and we discuss that a lot. At least that's how I feel about it. + +00:28:02.360 --> 00:28:04.280 +Yeah. I love it. + +00:28:05.200 --> 00:28:07.540 +I think it's one of the... Sorry, go ahead. + +00:28:07.780 --> 00:28:08.080 +No, go ahead. + +00:28:08.580 --> 00:28:13.380 +I was going to say that I think it's one of the coolest things about this thing. + +00:28:15.500 --> 00:28:22.600 +People are being able to hear a few of us. There are a bunch of others. But each one of us is so + +00:28:22.580 --> 00:28:28.480 +passionate about the things that we are working on. So like, you know, like, each one of us is + +00:28:28.580 --> 00:28:33.340 +trying to make the best out of the things that we are building. And then like, we are so passionate + +00:28:33.500 --> 00:28:39.340 +about the thing that we care about, and we are building that I think that that ends up in an + +00:28:39.400 --> 00:28:47.080 +amazing result. 
For example, the CLI, we wanted to have some specific, you know, like, behavior,

00:28:47.300 --> 00:28:53.020
some look and feel, and like we wanted to be able to have like the best kind of CLIs. So Patrick

00:28:53.160 --> 00:28:58.280
went ahead and built this whole bunch of tooling that we needed to be able to have it, and like

00:28:58.400 --> 00:29:04.280
made it open source and everything, so we could have this great experience when working with CLIs.

00:29:04.880 --> 00:29:10.700
Like, Jonathan recently was doing so much stuff about the handling of the caches and

00:29:10.760 --> 00:29:16.640
handling security, making sure that everything was super secure, super fast, super snappy. You know,

00:29:16.500 --> 00:29:22.920
like, Alejandra is super careful about all the UI. Martin is super careful about all the infra.

00:29:23.280 --> 00:29:28.880
You know, it's like this, this, this passionate-ness, which is a word I just made up.

00:29:31.700 --> 00:29:36.600
Alejandra goes and says like, this thing doesn't have the proper margins, we need to

00:29:36.760 --> 00:29:41.100
increase this a little bit. I don't like it. She just goes and fixes it. The same with Martin.

00:29:41.140 --> 00:29:44.880
He says like, we need to have like this, this other thing in infrastructure.
And it's like,

00:29:45.320 --> 00:29:51.640
just comes and tells me, hey, we are doing this. Like, yes sir. You know, it's like this with the

00:29:51.740 --> 00:29:56.800
work of the team. Like, Yuri, for example, that is mainly focused on the open source, is constantly

00:29:57.140 --> 00:30:03.080
looking at all the discussions, PRs, conversations, making sure that everything that we do, that,

00:30:03.180 --> 00:30:09.039
that's also why, you know, like there have been like recently way more releases of FastAPI and friends,

00:30:09.100 --> 00:30:15.700
of the open source projects, and very fast bug fixes, very fast responses to handle everything

00:30:15.880 --> 00:30:20.620
for the community. And now we actually like have people that is paying attention constantly to

00:30:20.680 --> 00:30:25.040
what is happening, what are the things that we have to do, and that really care about

00:30:25.700 --> 00:30:31.760
that part as well. So I think this extreme care about what we are, you know, like, Savannah is

00:30:31.780 --> 00:30:42.320
making Python and, hello, I don't know. I think this detail that each one of us cares so, so much about

00:30:42.540 --> 00:30:47.240
each one of the things that we build, hey, that helps a lot making sure that the product is

00:30:47.400 --> 00:30:54.620
actually amazing. It's as good as it can be, and we can all feel at home when doing it. I don't know,

00:30:54.920 --> 00:31:01.740
I get, I get so excited because I really enjoy the end result of the product and being able to use it

00:31:01.760 --> 00:31:04.920
and how it works in the end, how it is to work with it,

00:31:05.280 --> 00:31:06.200
it's super simple.

00:31:07.940 --> 00:31:08.600
Yeah, that's awesome.

00:31:08.760 --> 00:31:09.980
Here, let me adjust your mic real quick.

00:31:10.380 --> 00:31:13.800
I think it was like ducking, ducking out a little bit.
+ +00:31:13.800 --> 00:31:17.220 +We just went through a lot of content and a lot of sweating + +00:31:17.560 --> 00:31:21.660 +because your microphone went through like six different stages. + +00:31:24.100 --> 00:31:24.940 +Yeah, that's, I think, + +00:31:27.440 --> 00:31:36.300 +So I think that really leads to something I wanted to talk about is just what impact has this had on FastAPI? + +00:31:36.300 --> 00:31:53.820 +And before you jump in and answer that question, everyone, there's, especially I think with Astral, because they've had so much success, there's been an undercurrent of concern of like, oh my gosh, commercialism is getting into our open source. + +00:31:54.240 --> 00:31:57.740 +What if it pollutes it and causes these negative aspects? + +00:31:59.299 --> 00:32:03.640 +But just hearing all of the energy around FastAPI + +00:32:03.670 --> 00:32:07.220 +with so many people because of FastAPI Cloud, + +00:32:07.550 --> 00:32:09.120 +I think that's super neat. + +00:32:09.400 --> 00:32:10.720 +So I wanted to throw out to you all, + +00:32:12.720 --> 00:32:17.180 +how is this building FastAPI Cloud and the existence of FastAPI + +00:32:17.380 --> 00:32:21.140 +Cloud been giving back to FastAPI, I guess? + +00:32:28.860 --> 00:32:31.380 +I'm always the one that's speaking the most. + +00:32:33.040 --> 00:32:34.680 +I mean, it might be your project. + +00:32:35.060 --> 00:32:36.440 +You might have started the project. + +00:32:37.679 --> 00:32:38.780 +Yeah, maybe so. + +00:32:39.660 --> 00:32:47.100 +No, but you know, like last year, I had a few keynotes in some PyCons in different places. + +00:32:47.320 --> 00:32:57.940 +And like one of the key points that I wanted to bring was this idea that I'm trying to show that in many cases, people worry about the boss factor. + +00:32:58.340 --> 00:33:00.100 +Yes, yes, I've heard this. Yes. 
+ +00:33:01.060 --> 00:33:07.000 +Yeah, you know, like the boss factor is the idea that, oh, what happens if like there's one person doing this work? + +00:33:07.060 --> 00:33:09.960 +What happens if a boss runs over this person? + +00:33:10.480 --> 00:33:14.200 +And there's so much worry about this boss factor. + +00:33:14.860 --> 00:33:18.620 +It's sort of a morbid analogy, but I understand, right? + +00:33:18.700 --> 00:33:24.000 +It's like, what will happen to the open source project if the maintainer vanishes for some reason, right? + +00:33:24.240 --> 00:33:24.560 +Exactly. + +00:33:25.380 --> 00:33:28.400 +But, you know, like it also applies to products and to many other different things. + +00:33:29.320 --> 00:33:36.480 +But what I think is that is a disproportionate amount of attention to this detail of the boss factor. + +00:33:37.320 --> 00:33:44.760 +And I think every time people talk about the boss factor, you know, like one of my points in what I was trying to say + +00:33:44.780 --> 00:33:51.320 +these talks was I would like people to think about the bus ticket factor who is paying for those + +00:33:51.540 --> 00:33:57.340 +tickets it doesn't matter how big is the team you know like you have seen google amazon meta all the + +00:33:57.400 --> 00:34:03.420 +big ones they don't have a small boost factor they have a lot of people in their payroll and still + +00:34:03.900 --> 00:34:11.899 +they they finish products they just cancel them open source or private or whatever it's not + +00:34:11.919 --> 00:34:18.340 +that the main factor defining the success of a project being it commercial or open source of + +00:34:18.419 --> 00:34:26.060 +any type is not really how many people are behind it it's more of what is the value that whoever + +00:34:26.060 --> 00:34:31.399 +is putting the effort to keep it alive is getting from putting all that effort it could be just + +00:34:31.639 --> 00:34:36.139 +satisfaction you know it could be like open source like i 
feel, I feel so good that I'm contributing

00:34:36.159 --> 00:34:41.600
into society, and that is valid. It doesn't pay the rent, but it is still valid. It might

00:34:41.690 --> 00:34:46.260
last for a while. But then also, like, when you see, there are so many Python projects,

00:34:46.340 --> 00:34:52.540
so many open source projects, that can do well or can do bad, and it doesn't really depend

00:34:52.700 --> 00:34:57.740
on how many people they have. And when you are using a project, when you are using an

00:34:57.820 --> 00:35:03.240
open source project or when you are using a product of any type, I will encourage you

00:35:03.260 --> 00:35:08.780
to think about what is the bus ticket factor of this project. What are the

00:35:08.900 --> 00:35:14.540
things that whoever is building this is receiving in exchange for giving it away?

00:35:15.460 --> 00:35:20.440
So like, you know, like, what are they expecting to sell you at some point, or

00:35:20.560 --> 00:35:26.380
what are they receiving in exchange? You know, for example, Bun, the JavaScript

00:35:26.560 --> 00:35:29.820
runtime, like, it was like, we don't know what they are going to sell,

00:35:29.840 --> 00:35:36.840
but now, you know, Claude and Anthropic really want to have like this thing keep working because they

00:35:36.860 --> 00:35:41.520
are using it internally. So you can say like, okay, I'm gonna use it, I'm gonna use it for free. I know

00:35:41.640 --> 00:35:47.360
that what they receive for me using it is just like, that they just really want it. So I can just, like,

00:35:47.480 --> 00:35:52.060
whenever you are using Bun, you are getting, now you are getting free services from Anthropic. That's

00:35:52.140 --> 00:35:58.460
it. But, you know, like every time you are using a project, you can think about what are people receiving

00:35:58.480 --> 00:36:03.740
in exchange for giving this away for me. I think this is... sorry, go ahead, Sonia.
+ +00:36:04.000 --> 00:36:05.060 +No, no, no, you finish it. + +00:36:06.460 --> 00:36:11.560 +No, no, that I think I... this is like the thing that I would like people to think about, you know, + +00:36:11.640 --> 00:36:17.560 +like, also like how can they give back? Maybe they can also actually contribute to that community or + +00:36:17.660 --> 00:36:22.400 +to the project. There are many ways and in many cases, the thing that is needed the most is just + +00:36:22.400 --> 00:36:24.940 +like help and work just answering questions and issues. + +00:36:26.020 --> 00:36:26.400 +So yeah. + +00:36:27.420 --> 00:36:30.000 +Yeah, I was just going to say that, like kind of related to what you're saying, + +00:36:30.080 --> 00:36:34.520 +I think one of the angles that I really appreciate about the way we think about + +00:36:35.000 --> 00:36:39.380 +FastAPI and FastAPI Cloud is like we're like a lot of our team was involved in open + +00:36:39.520 --> 00:36:44.960 +source before coming to work at FastAPI Cloud on various projects around the Python ecosystem + +00:36:45.160 --> 00:36:45.680 +outside of Python. + +00:36:46.440 --> 00:36:51.640 +And I think like all of us have a deep appreciation and understanding of the value + +00:36:51.660 --> 00:36:58.460 +of open source and really, really try and build in a way that is like, I mean, Sebastian, + +00:36:58.560 --> 00:37:03.100 +you've talked about this a lot, but like solving a real problem for folks, right? And so like + +00:37:03.260 --> 00:37:07.700 +FastAPI cloud is sort of this like extension of this open source ecosystem people would + +00:37:07.700 --> 00:37:13.200 +be using. We, you know, FastAPI cloud may be an option, maybe someone picks some other + +00:37:13.320 --> 00:37:21.620 +cloud for some reason. I don't think like, I think we're all very mindful of that. 
But
+
+00:37:21.940 --> 00:37:26.560
+we all work at FastAPI Cloud, like I know that I personally have time, more time for my open source
+
+00:37:26.760 --> 00:37:32.700
+work and my employer understands the value of my open source work, which is net positive for the
+
+00:37:32.740 --> 00:37:38.700
+open source community. Like I get to work on CPython sometimes and I have, you know, the bandwidth to
+
+00:37:38.720 --> 00:37:44.480
+go and do my steering council work or upcoming release management work. So I think like, I
+
+00:37:44.660 --> 00:37:50.360
+understand like this sort of like, I don't know, like tempering, like open source commercial bad,
+
+00:37:50.860 --> 00:37:54.440
+all bad. It's not all bad. It's actually like really good in a lot of cases for
+
+00:37:55.040 --> 00:37:59.900
+look at uv for an example of that, right? Yeah. Astral. Yeah, yeah, totally. Yeah. Yeah.
+
+00:38:00.620 --> 00:38:06.220
+I think there are some really good examples of this. So I think like, that's another angle that
+
+00:38:06.220 --> 00:38:10.680
+I mean, I really, I get a lot of energy out of our team because we all, I don't have to,
+
+00:38:10.960 --> 00:38:15.660
+I don't have to fight the open source battle at FastAPI Cloud. I think that's really cool.
+
+00:38:17.160 --> 00:38:18.660
+I do think that's super cool as well.
+
+00:38:18.660 --> 00:38:23.100
+Let me put out two examples for you here that I think everyone will be aware of.
+
+00:38:24.580 --> 00:38:27.920
+As sort of to add to what Sebastian was saying is,
+
+00:38:28.960 --> 00:38:33.600
+look how much Apple freaked out when Steve Jobs died and how many people work at Apple.
+
+00:38:34.280 --> 00:38:34.360
+Right.
+
+00:38:34.940 --> 00:38:37.360
+Like that was still like, oh, my gosh.
+
+00:38:37.960 --> 00:38:40.820
+But, you know, I think there's they're hanging in there.
+
+00:38:40.940 --> 00:38:42.080
+They're going to be probably making it. 
+
+00:38:44.540 --> 00:38:44.820
+And then.
+
+00:38:44.940 --> 00:38:46.260
+They are not enough people.
+
+00:38:47.200 --> 00:38:48.800
+I tell you what, they got some of my money.
+
+00:38:49.340 --> 00:38:49.800
+That's for sure.
+
+00:38:51.920 --> 00:38:54.460
+But also, you know, look at Flask, right?
+
+00:38:54.660 --> 00:38:56.740
+Armin Ronacher worked on it for a long time,
+
+00:38:57.580 --> 00:38:59.360
+drifted away, which is totally fine.
+
+00:39:00.040 --> 00:39:03.860
+And David Lord and Pallets picked it up and kept running, right?
+
+00:39:04.040 --> 00:39:09.140
+Like, it's still one of the most popular frameworks out there, right?
+
+00:39:09.340 --> 00:39:10.880
+So it's...
+
+00:39:12.000 --> 00:39:21.620
+I think the bus factor is over, overblown a bit, but also looking at the team of folks here, I think it's, it's even more obvious that there's a bunch of people on the inside.
+
+00:39:22.280 --> 00:39:37.060
+But, but you know, like, for example, for example, Flask, you know, like I learned so many things from Flask and like, the thing is, I feel like sometimes, sometimes people go and complain about the tool and say like, oh, this is not working for this or for that.
+
+00:39:37.280 --> 00:39:42.260
+and in many cases it is in this insensitive way towards the people that are working on that.
+
+00:39:42.900 --> 00:39:47.320
+And it's like, you know, like in the end, realize that there's actually people behind the scenes doing the work.
+
+00:39:47.600 --> 00:39:51.480
+And like, in many cases, it's just like one or two people doing a lot of work.
+
+00:39:51.480 --> 00:39:52.860
+In many cases, it's just for free.
+
+00:39:53.200 --> 00:39:57.720
+And, you know, like I think it's worth calling that out.
+
+00:39:57.740 --> 00:40:01.600
+Like all the work that David Lord does for Flask, it's just like so much work.
+
+00:40:01.860 --> 00:40:05.780
+And yeah, I think it deserves a lot of respect.
+
+00:40:06.820 --> 00:40:06.980
+Yeah. 
+
+00:40:07.060 --> 00:40:09.400
+Yeah, for sure. I totally agree.
+
+00:40:11.000 --> 00:40:19.600
+Sorry, the other thing that I forgot to mention is that there were so many ideas of potential products that I could build over the years,
+
+00:40:19.680 --> 00:40:27.800
+and I never did, and I never started a company, because I didn't have clarity of what will be a good thing to actually sell and will have a good alignment.
+
+00:40:28.300 --> 00:40:33.620
+The cloud product has such a good alignment with the open source side because
+
+00:40:34.400 --> 00:40:34.960
+the
+
+00:40:36.020 --> 00:40:37.820
+you know like as
+
+00:40:38.460 --> 00:40:42.120
+the more successful FastAPI is, the more successful
+
+00:40:42.700 --> 00:40:47.860
+FastAPI Cloud has the potential to be. The more people using Python
+
+00:40:48.500 --> 00:40:54.060
+effectively, the more people might end up checking out FastAPI and the more people might end up checking out the product.
+
+00:40:54.360 --> 00:40:59.300
+So if FastAPI does well, if the open source does well, if Python does well, that's better for the company.
+
+00:40:59.390 --> 00:41:05.420
+So, you know, like, it doesn't really depend on my personal principles and values or something like that.
+
+00:41:05.760 --> 00:41:10.100
+It's aligned with, it's financially aligned with the company.
+
+00:41:10.370 --> 00:41:16.740
+So, you know, like, it's just going to be beneficial in the end because it doesn't depend on good intentions.
+
+00:41:17.640 --> 00:41:19.020
+And FastAPI is open source.
+
+00:41:19.190 --> 00:41:21.120
+It has, like, 7,000 forks or something.
+
+00:41:21.200 --> 00:41:23.380
+So if I was running, so right.
+
+00:41:24.240 --> 00:41:26.300
+With 7,000 forks, it's not going away.
+
+00:41:27.360 --> 00:41:27.760
+Yeah.
+
+00:41:27.760 --> 00:41:27.820
+Yeah.
+
+00:41:28.860 --> 00:41:29.260
+Yeah.
+
+00:41:29.560 --> 00:41:31.640
+I, I definitely agree with you on that. 
+
+00:41:31.680 --> 00:41:36.160
+I think, I feel like I should maybe give a little bit of a,
+
+00:41:36.820 --> 00:41:41.060
+I tell a little bit of the story of what's going on with, where did I put it?
+
+00:41:42.560 --> 00:41:46.020
+I don't think I pasted it over here, is what's going on with Tailwind right now.
+
+00:41:46.500 --> 00:41:50.380
+And I think Tailwind is having a tough time, Tailwind CSS.
+
+00:41:51.839 --> 00:42:03.540
+Traffic to Tailwind is up six times year over year on npm downloads but the revenue of Tailwind
+
+00:42:04.010 --> 00:42:11.960
+is down five times you know i mean these are completely out of whack things because instead
+
+00:42:11.960 --> 00:42:16.700
+of people going to docs to learn about it it's just like well when you go to the docs you learn
+
+00:42:16.640 --> 00:42:20.480
+they also have premium offerings right and I think you guys are different
+
+00:42:20.780 --> 00:42:26.060
+because it's not just oh here's a little bit nicer of a thing right I feel like
+
+00:42:26.060 --> 00:42:29.160
+it would be a little bit as if you were selling cookie cutter templates for
+
+00:42:29.300 --> 00:42:34.960
+FastAPI apps you know it's like well the AI can make the shape of the thing
+
+00:42:35.060 --> 00:42:38.440
+that comes out of the cookie cutter to be honest you know I mean but but you're
+
+00:42:38.580 --> 00:42:45.300
+offering something that has ongoing value that it costs more and is more
+
+00:42:45.320 --> 00:42:57.640
+complex in other places. And so I think maybe just thinking about how this just keeps the team
+
+00:42:57.900 --> 00:43:02.740
+going for FastAPI is really awesome. And I think it's got a nice flywheel effect there.
+
+00:43:03.920 --> 00:43:10.100
+I'll link to this, I guess, audio track. I don't know what I call it. 
I think blog post that has
+
+00:43:10.100 --> 00:43:15.180
+one sentence, but a 30 minute audio you can check out from the guy, Adam, who's one of
+
+00:43:15.180 --> 00:43:17.700
+the founders of Tailwind talking about going into this.
+
+00:43:18.540 --> 00:43:19.400
+It's kind of rough.
+
+00:43:20.960 --> 00:43:24.520
+I think I don't necessarily want to go into a deep AI, what it means for the industry.
+
+00:43:24.660 --> 00:43:26.660
+Like, let's stay focused on what you guys are doing.
+
+00:43:26.660 --> 00:43:30.540
+But I think it's going to be, it's going to be, I think it's going to be its own series.
+
+00:43:30.700 --> 00:43:37.060
+I mean, Stack Overflow had as many questions asked this month as they did in the first
+
+00:43:37.260 --> 00:43:38.880
+month of their existence.
+
+00:43:39.820 --> 00:43:39.940
+Right.
+
+00:43:40.040 --> 00:43:44.620
+Three or four thousand, whereas at their peak, they were 200,000 questions a month.
+
+00:43:45.080 --> 00:43:50.620
+There's like real turmoil that's coming from some of these things, which is tricky.
+
+00:43:51.460 --> 00:43:57.400
+But yeah, I'm really excited to see you all doing this because I'm a big fan of FastAPI.
+
+00:43:57.790 --> 00:44:04.800
+And I think this is just sustaining and more for FastAPI, right?
+
+00:44:04.830 --> 00:44:05.480
+Like, what do you all think?
+
+00:44:07.440 --> 00:44:08.980
+I'm thinking about it.
+
+00:44:09.680 --> 00:44:10.440
+That's what we hope.
+
+00:44:14.860 --> 00:44:17.420
+I thought about Tailwind for a second, right?
+
+00:44:17.580 --> 00:44:19.780
+It's not like we're immune to what happened to them.
+
+00:44:19.920 --> 00:44:22.000
+Like, we also have a lot of documentation online.
+
+00:44:22.400 --> 00:44:23.560
+AI could train on that.
+
+00:44:23.760 --> 00:44:26.800
+And if it's good enough, it could maintain your infrastructure and stuff.
+
+00:44:26.940 --> 00:44:28.480
+It's just too hard at the moment. 
+
+00:44:28.600 --> 00:44:33.660
+And there's an additional thing we're kind of selling, which is like, I guess, responsibility.
+
+00:44:33.820 --> 00:44:42.020
+like and you're shifting the risk from like letting your AI or your infra team um maintain your
+
+00:44:42.240 --> 00:44:48.540
+infrastructure to us so we're staying up at night and worrying about it that's that that has a lot of
+
+00:44:48.620 --> 00:44:54.920
+value as well and that's probably not going to get removed yeah and i also think here here here's a
+
+00:44:54.940 --> 00:45:04.240
+very common Claude Code, Cursor, whatever conversation. Hey, build me something with Python and it
+
+00:45:04.300 --> 00:45:09.700
+needs an API. Okay, we built it with FastAPI. How do I host it? Right? It's not just going to,
+
+00:45:10.220 --> 00:45:15.840
+it won't build a cloud for you, right? It's going to recommend something out there. And a real
+
+00:45:16.040 --> 00:45:22.140
+natural way to how do I host FastAPI is FastAPI Cloud, right? 
Like, if it suggests, oh,
+
+00:45:22.220 --> 00:45:26.140
+you're just going to like spread it across Lambda by breaking it up, like whoa no I want something
+
+00:45:26.280 --> 00:45:30.360
+simple okay give me FastAPI Cloud right I think that that's a really good thing and then on the
+
+00:45:31.000 --> 00:45:40.200
+the enterprise side you know enterprise folks are notoriously not good at supporting open source and
+
+00:45:40.240 --> 00:45:47.260
+that they're not like paying for it I know some companies are big supporters of the PSF and Python
+
+00:45:47.280 --> 00:45:52.620
+and open source but in general it's like yeah we have this project with 5,000 people working on it's
+
+00:45:52.660 --> 00:45:59.840
+all Python and how are we sponsoring this nope just we're just enjoying the money right and we're
+
+00:45:59.840 --> 00:46:05.960
+a bank so we got the money we got all the money um so they they're just not good at paying for like
+
+00:46:05.960 --> 00:46:11.820
+a really great framework that they use a lot but they got plenty of hosting plenty of internal apps
+
+00:46:11.860 --> 00:46:16.460
+that they just need to make run and stuff so i think both on like the low end and the high end
+
+00:46:17.200 --> 00:46:24.480
+there's a lot of synergy between these things that is not just, you know, slightly advanced, not to
+
+00:46:26.420 --> 00:46:31.860
+diminish it, but slightly advanced UI widgets that you could ask your AI to build or,
+
+00:46:32.180 --> 00:46:34.740
+or something or like cookie cutter templates for project starters.
+
+00:46:36.060 --> 00:46:43.940
+Yeah. Yeah, yeah. That's true. 
Like, I think we are in a somewhat fortunate position of like,
+
+00:46:43.960 --> 00:46:49.260
+you know like FastAPI has grown so much in like you know like when you check the statistics about
+
+00:46:49.660 --> 00:46:55.840
+downloads or GitHub stars or entries in developer surveys like it's at the top in like in each
+
+00:46:55.960 --> 00:47:00.760
+category it's like you know like the backend framework with the most GitHub stars across
+
+00:47:00.920 --> 00:47:07.140
+languages even like you know like Java, Go, Ruby, JS, like whatever it's like the top one at least
+
+00:47:07.200 --> 00:47:14.060
+in GitHub stars so like you know like FastAPI at least is like people are liking it fortunately
+
+00:47:14.500 --> 00:47:19.480
+and there's probably gonna be people deploying things to FastAPI Cloud so that's probably gonna
+
+00:47:19.620 --> 00:47:27.020
+be like we are probably gonna be fine i think did you know like the i guess it will be like a good
+
+00:47:27.060 --> 00:47:32.560
+point to to ask people to go and check what are the open source projects that they are using and check
+
+00:47:32.440 --> 00:47:37.300
+what is the bus ticket factor for those open source projects? You know, like if you are using
+
+00:47:37.740 --> 00:47:43.100
+Tailwind CSS, it would have been very cool if at some point you check if the premium things were
+
+00:47:43.180 --> 00:47:46.640
+useful for you and for your company or your project or something like that, you know?
+
+00:47:49.039 --> 00:47:52.100
+Because what is the thing that keeps that project going?
+
+00:47:53.980 --> 00:48:02.400
+Right, exactly. And I really personally admire if a project or something
+
+00:48:02.620 --> 00:48:06.200
+offers like more value, not just, hey, buy me a coffee,
+
+00:48:07.000 --> 00:48:09.920
+but here's a thing that you get way more of, you know,
+
+00:48:10.000 --> 00:48:11.980
+and in that regard, I think Tailwind was doing that, right? 
+
+00:48:11.980 --> 00:48:14.480
+They were offering this suite of pre-built things,
+
+00:48:14.810 --> 00:48:16.160
+and I think that that's great.
+
+00:48:17.320 --> 00:48:20.880
+But, yeah, I do think you've got more of a,
+
+00:48:22.380 --> 00:48:24.400
+these crazy AI things are going to maybe recommend
+
+00:48:24.779 --> 00:48:27.280
+FastAPI Cloud more than they're just going to undercut it.
+
+00:48:27.290 --> 00:48:28.140
+So I think that's really great.
+
+00:48:28.460 --> 00:48:32.380
+And by the way, I was just looking for the GitHub Stars graph
+
+00:48:32.400 --> 00:48:36.540
+Like there's a whole, I can't remember what the domain of that site is, but, and I
+
+00:48:36.680 --> 00:48:38.500
+ran across by the, I just want to give a quick shout out.
+
+00:48:38.650 --> 00:48:43.380
+Like your CultRepo documentary on FastAPI was awesome.
+
+00:48:44.500 --> 00:48:44.860
+Right.
+
+00:48:45.000 --> 00:48:45.680
+That was so fun.
+
+00:48:46.400 --> 00:48:47.140
+I didn't see that coming.
+
+00:48:47.920 --> 00:48:48.060
+Yeah.
+
+00:48:48.270 --> 00:48:52.460
+It came right on the heels of the Python official documentary, the one hour one.
+
+00:48:52.470 --> 00:48:55.600
+This is the same group and the production quality was really nice.
+
+00:48:55.760 --> 00:48:57.520
+So like, oh, yeah, it was super fun.
+
+00:48:58.040 --> 00:49:01.880
+When they released the trailer for the Python documentary,
+
+00:49:02.160 --> 00:49:04.220
+before releasing the documentary, when they released the trailer,
+
+00:49:04.460 --> 00:49:08.360
+they contacted me and said, "Hey, we're doing these mini documentaries
+
+00:49:08.780 --> 00:49:10.220
+about different frameworks, different tools,
+
+00:49:10.300 --> 00:49:11.920
+and we want to include FastAPI."
+
+00:49:11.920 --> 00:49:13.120
+I was like, "Oh, nice."
+
+00:49:14.320 --> 00:49:17.260
+I was just trying to stay silent, but super excited.
+
+00:49:17.840 --> 00:49:19.840
+I'm sure. 
That's so cool.
+
+00:49:20.060 --> 00:49:21.520
+Yeah, I watched it as soon as it came out.
+
+00:49:22.320 --> 00:49:23.100
+I'll link to that.
+
+00:49:23.220 --> 00:49:25.280
+People should definitely-- it's only like 10 minutes or something,
+
+00:49:25.480 --> 00:49:30.540
+but it's worth it, worth checking out. So it's not a huge investment of time. People can watch it,
+
+00:49:30.540 --> 00:49:35.520
+I suppose. It's not TikTok. I mean, it's not like, Oh, I saw the documentary.
+
+00:49:40.080 --> 00:49:42.940
+It doesn't take you on huge about it. Yeah, people.
+
+00:49:44.540 --> 00:49:49.840
+It is such a weird time, you guys. I don't understand what's happened to the
+
+00:49:50.200 --> 00:49:55.440
+attention span of society. I'm really honestly a little concerned. When I create my courses,
+
+00:49:55.480 --> 00:49:59.260
+people would say, you know, like a four hour course, and there'd be like a 10, 15 minute sort
+
+00:49:59.260 --> 00:50:02.300
+of, hey, here's how you set up your computer. And here's all the introduction. And people, oh,
+
+00:50:02.440 --> 00:50:05.840
+that's so awesome. I loved how you kind of set the stage. I'm really motivated to take the course.
+
+00:50:06.420 --> 00:50:10.360
+Nowadays, I just get messages like, why are you still talking? This is five minutes long. Do you
+
+00:50:10.560 --> 00:50:17.080
+understand? I'm like, this is your job. You can't spend five minutes. Oh my gosh. Anyway, that's
+
+00:50:17.180 --> 00:50:21.700
+that's sort of the origin of my comment there. It's all right. 
So we're kind of getting somewhat
+
+00:50:21.720 --> 00:50:30.420
+short on time so i think i think i want to talk about a couple of things let's let's talk a little
+
+00:50:30.540 --> 00:50:36.020
+bit about internals like what i don't know who wants to take this one but let's talk about just
+
+00:50:36.240 --> 00:50:41.800
+how when i say fastapi deploy then what
+
+00:50:46.040 --> 00:50:52.000
+It's just a uv pip install and it just goes and it's magic and it's easy, right?
+
+00:50:52.550 --> 00:50:55.200
+We haven't had anything before, Jonathan, can we say or no?
+
+00:50:55.480 --> 00:50:58.960
+Yeah, I was going to say, I was going to say, Jonathan, you should take this one.
+
+00:51:00.760 --> 00:51:02.820
+Oh my God, it's so funny, this happened.
+
+00:51:03.260 --> 00:51:06.180
+Because I told my friends, I'm so concerned about being on the podcast
+
+00:51:06.490 --> 00:51:10.060
+because everyone here is a visionary and then I'm the backend guy.
+
+00:51:10.780 --> 00:51:14.020
+And I think the things I could contribute to this conversation,
+
+00:51:14.090 --> 00:51:15.220
+I should probably keep to myself. 
+
+00:51:16.480 --> 00:51:24.600
+but it's just leaking internals right um there are some things that are like not really a secret
+
+00:51:25.000 --> 00:51:33.300
+like as Sebastian said earlier like Kubernetes won, in the infrastructure and deployment field
+
+00:51:33.540 --> 00:51:40.000
+to some extent so that's somewhere in there right uh but it's it's all the way deep down so no one
+
+00:51:40.000 --> 00:51:45.400
+has to worry about it but it's still a foundation which is a good foundation um I think one thing
+
+00:51:45.420 --> 00:51:50.860
+that's you you might have guessed it but FastAPI Cloud is built on FastAPI which kind of makes
+
+00:51:50.980 --> 00:51:57.760
+sense right um and that also has an effect on like recent patches updates and stuff because if we find
+
+00:51:57.860 --> 00:52:03.920
+something internally uh which we're not happy with then we just fix it and that's how some releases
+
+00:52:04.250 --> 00:52:11.960
+uh came out faster than ones before dogfooding yeah that's awesome putting a lot yeah also all
+
+00:52:11.980 --> 00:52:16.420
+all the related libraries like SQLModel and others,
+
+00:52:17.400 --> 00:52:18.500
+they experience the same thing.
+
+00:52:18.560 --> 00:52:19.820
+- We're actually using everything now.
+
+00:52:20.580 --> 00:52:21.780
+- Oh yeah, okay, that's cool.
+
+00:52:22.360 --> 00:52:24.100
+- New library is coming out.
+
+00:52:25.440 --> 00:52:26.240
+- That's also a thing.
+
+00:52:26.680 --> 00:52:29.560
+Yeah, it's not just FastAPI and friends.
+
+00:52:30.460 --> 00:52:31.360
+We're like really open.
+
+00:52:32.360 --> 00:52:34.940
+Recently, Patrick just open-sourced everything we use
+
+00:52:35.040 --> 00:52:37.380
+for authentication authorization, for example.
+
+00:52:37.720 --> 00:52:38.500
+Is it open-source yet?
+
+00:52:39.100 --> 00:52:40.000
+Did they just fix something?
+
+00:52:40.960 --> 00:52:42.380
+No, it is. Yeah, but no, no. 
+
+00:52:44.079 --> 00:52:44.820
+It will be.
+
+00:52:45.120 --> 00:52:45.860
+Yeah, it doesn't make better.
+
+00:52:46.100 --> 00:52:49.560
+But yeah, we build stuff internally in the moment really.
+
+00:52:49.670 --> 00:52:52.880
+I like we build it in a way, like in a separate package,
+
+00:52:53.320 --> 00:52:54.460
+just like an open source library.
+
+00:52:54.560 --> 00:52:55.760
+And if we feel like the time is right,
+
+00:52:55.860 --> 00:52:58.900
+it's just getting open sourced because a lot of things are reusable.
+
+00:53:00.300 --> 00:53:01.460
+And that's in all departments.
+
+00:53:01.840 --> 00:53:02.720
+Like that happens a lot.
+
+00:53:02.840 --> 00:53:06.700
+Like when I started there, I already realized that,
+
+00:53:06.710 --> 00:53:07.920
+like everyone was building open source,
+
+00:53:08.100 --> 00:53:09.560
+but now I joined in myself as well.
+
+00:53:09.680 --> 00:53:13.000
+I open sourced the library for compression,
+
+00:53:13.540 --> 00:53:18.760
+like compressing and decompressing archives in Python.
+
+00:53:19.860 --> 00:53:22.480
+Because the internal tarfile thingy is just slow,
+
+00:53:22.540 --> 00:53:24.240
+and we needed it to be faster because we're
+
+00:53:24.440 --> 00:53:25.620
+speeding up the deployment process.
+
+00:53:25.820 --> 00:53:26.740
+We're like, hey, we could probably
+
+00:53:27.060 --> 00:53:28.180
+shave off a few seconds here.
+
+00:53:29.220 --> 00:53:31.840
+And then that's just open source for everyone to use.
+
+00:53:32.060 --> 00:53:35.060
+So we're contributing to the whole Python ecosystem as well.
+
+00:53:35.100 --> 00:53:36.460
+You have to say the name.
+
+00:53:36.680 --> 00:53:37.320
+It's so good.
+
+00:53:38.420 --> 00:53:43.460
+Is it good? It's just, it's fastar because it's faster than just tar.
+
+00:53:44.680 --> 00:53:46.200
+Fastar. I love it.
+
+00:53:46.400 --> 00:53:51.220
+Fastar. You can say it with that very German accent.
+
+00:53:51.540 --> 00:53:51.760
+With the. 
+ +00:53:55.960 --> 00:53:57.680 +I could have, but you did it better. + +00:53:58.830 --> 00:53:58.980 +Yeah. + +00:53:59.860 --> 00:54:00.700 +I'm afraid to see. + +00:54:01.420 --> 00:54:03.740 +I'm scared of leaking stuff from our deployments. + +00:54:04.780 --> 00:54:06.180 +That's why I didn't say much. + +00:54:06.220 --> 00:54:08.380 +Is that the FASTA or is this different? + +00:54:08.680 --> 00:54:12.280 +No, it's literally the same name, but it's so fresh that, yeah, + +00:54:12.500 --> 00:54:14.200 +I guess Google showed you this one instead. + +00:54:14.820 --> 00:54:14.960 +Yeah. + +00:54:15.500 --> 00:54:16.100 +Let's see. + +00:54:16.400 --> 00:54:17.800 +Dr. John FASTA. + +00:54:18.700 --> 00:54:18.860 +Yeah. + +00:54:19.040 --> 00:54:19.340 +Okay. + +00:54:20.820 --> 00:54:20.960 +Yeah. + +00:54:21.460 --> 00:54:23.300 +But it's also like, you know, like we were- + +00:54:23.440 --> 00:54:25.540 +Doctor, full name. + +00:54:25.900 --> 00:54:26.240 +Full story. + +00:54:26.360 --> 00:54:27.400 +Yeah, yeah, yeah. + +00:54:29.700 --> 00:54:30.660 +There you go. + +00:54:31.160 --> 00:54:31.520 +One word. + +00:54:31.520 --> 00:54:31.860 +Wow. + +00:54:31.860 --> 00:54:33.420 +That's a lot of zeros. + +00:54:34.220 --> 00:54:37.100 +You need to like SEO-ify this package, man. + +00:54:37.260 --> 00:54:39.020 +I'm searching. I know. I'm searching on GitHub. + +00:54:40.120 --> 00:54:41.680 +Why is that pull request now? + +00:54:43.440 --> 00:54:44.520 +Am I that hard to find? + +00:54:45.020 --> 00:54:45.640 +Oh my gosh. + +00:54:46.120 --> 00:54:47.380 +Hold on, hold on, hold on. + +00:54:48.200 --> 00:54:48.860 +It's in the Chelsea. + +00:54:52.720 --> 00:54:53.840 +Yes, there we go. + +00:54:54.140 --> 00:54:54.660 +Okay, okay. + +00:54:55.840 --> 00:54:57.740 +The detective is at work. Here we go faster. + +00:54:58.000 --> 00:54:59.240 +Yeah, it's already gone. + +00:54:59.380 --> 00:55:01.200 +Well, I'm not logged in on this machine. 
+
+00:55:01.320 --> 00:55:02.820
+This is just my streaming, not my dev machine.
+
+00:55:03.240 --> 00:55:05.900
+I'll go star it. We'll get you some stars.
+
+00:55:08.200 --> 00:55:09.620
+That's the irony about it.
+
+00:55:10.160 --> 00:55:11.280
+It literally has no stars,
+
+00:55:11.500 --> 00:55:12.860
+but if you scroll down and you see the downloads,
+
+00:55:13.200 --> 00:55:15.440
+that's going to prove we're actually using it.
+
+00:55:16.220 --> 00:55:18.380
+Yeah, awesome. Super cool.
+
+00:55:20.440 --> 00:55:24.340
+This is pretty neat. Yeah, I like it.
+
+00:55:25.180 --> 00:55:26.220
+It's a little context manager.
+
+00:55:29.360 --> 00:55:34.880
+It's almost working the same as the tarfile in the standard library,
+
+00:55:35.180 --> 00:55:38.760
+like the same, like almost similar API to that.
+
+00:55:39.340 --> 00:55:41.620
+It's basically a drop-in replacement, more or less.
+
+00:55:42.820 --> 00:55:42.900
+Cool.
+
+00:55:43.120 --> 00:55:44.960
+But then everything happens in Rust.
+
+00:55:45.360 --> 00:55:45.760
+Because Rust.
+
+00:55:46.360 --> 00:55:47.300
+Because it's fast.
+
+00:55:47.580 --> 00:55:48.480
+Because Rust, yeah.
+
+00:55:48.640 --> 00:55:50.240
+Well, as soon as it becomes infrastructure
+
+00:55:51.260 --> 00:55:52.640
+and you've got to run it a million times,
+
+00:55:53.200 --> 00:55:54.580
+that starts to make sense, right?
+
+00:55:55.220 --> 00:55:55.280
+Yeah.
+
+00:55:55.460 --> 00:55:58.760
+I mean, Python is one of the fastest programming languages
+
+00:55:59.480 --> 00:56:04.120
+in the world when you think about human time
+
+00:56:04.250 --> 00:56:05.400
+to build the things, right?
+
+00:56:05.540 --> 00:56:07.360
+Like that's one of its real superpowers.
+
+00:56:07.490 --> 00:56:09.700
+It's like, I mean, there's the whole story
+
+00:56:11.850 --> 00:56:14.600
+of Google Video and YouTube, right? 
+
+00:56:14.690 --> 00:56:17.460
+And Google Video is written in C++ with 100 engineers,
+
+00:56:17.690 --> 00:56:19.780
+and YouTube was a small team in Python.
+
+00:56:19.990 --> 00:56:21.680
+And they just couldn't keep up with the features.
+
+00:56:21.840 --> 00:56:23.900
+So they bought this little thing YouTube
+
+00:56:24.040 --> 00:56:28.020
+and see if we're going to make something with it. And last I checked, it was still in Python. I'm
+
+00:56:28.080 --> 00:56:34.000
+sure some of it isn't but a few years ago it was, which is wild. Anyway, there's different ways of
+
+00:56:34.140 --> 00:56:38.920
+fast but when it's down to like little utilities. Yeah. I know. They're trying to make Python fast.
+
+00:56:39.120 --> 00:56:47.720
+Yeah, honestly, massive success in the last five years, right? Yeah, yeah. Like since 3.11,
+
+00:56:48.220 --> 00:56:53.880
+since the specializing adaptive interpreter has been pretty big. Yeah, 3.9 and 3.11 did a lot of
+
+00:56:53.900 --> 00:56:58.320
+foundational work and then 3.9 onward really just uncorked a lot of innovation there
+
+00:56:58.740 --> 00:57:04.960
+yeah yeah that's pretty awesome all right um it sounds like Sebastian you've talked a lot about
+
+00:57:05.140 --> 00:57:12.340
+Kubernetes so i imagine Kubernetes is happening uh do we get to pick what data centers it runs on do
+
+00:57:12.380 --> 00:57:20.680
+we get to pick what clouds it runs on like are you going to get to pick some of these things not yet
+
+00:57:20.760 --> 00:57:27.760
+it's not released yet. But you know, like it's, so of course, like we have like regular cloud providers
+
+00:57:27.920 --> 00:57:32.360
+in the middle and there's a bunch of Kubernetes, then there's a bunch of additional stuff that needs
+
+00:57:32.500 --> 00:57:36.760
+to run on top. 
Then there's like custom Kubernetes controllers and things that
+
+00:57:37.600 --> 00:57:43.260
+Jonathan was saying that he's having to write in Go so people in Python can be happy to be able to
+
+00:57:43.840 --> 00:57:47.160
+you know, like manage all the Kubernetes shenanigans that need to happen because
+
+00:57:47.180 --> 00:57:53.460
+there's so much complexity that needs to be handled and there's a lot of that we do a lot of
+
+00:57:54.000 --> 00:57:58.800
+advanced tricks also Jonathan was recently doing a bunch of advanced tricks to handle the caches
+
+00:57:59.660 --> 00:58:05.720
+for the builds so the way that we handle caches and we also like tap into uv and how things work
+
+00:58:05.900 --> 00:58:12.440
+so that builds can be super super fast because it's like something it's we are you know like we
+
+00:58:12.460 --> 00:58:20.140
+are very much targeted at FastAPI and Python in general. So we can take advantage of knowing how
+
+00:58:20.380 --> 00:58:25.160
+things run internally, how things are installed, how to optimize everything. So everything is just
+
+00:58:25.240 --> 00:58:31.920
+like super fast, super fast. Yeah. I imagine you all have base Docker images that are like just
+
+00:58:32.860 --> 00:58:39.020
+one one layer away from whoever's code is running. You know, like you've got it all
+
+00:58:39.020 --> 00:58:43.360
+already pre-built with FastAPI and whatever settings of Python you want.
+
+00:58:43.900 --> 00:58:50.280
+Yeah. Like a bunch of things and tricks. Yeah. But also like different things,
+
+00:58:50.500 --> 00:58:55.260
+like the different ways that we do to actually build the things and install things and put them
+
+00:58:55.560 --> 00:59:01.980
+inside of the actual, you know, like built application. There's like a lot of sorcery
+
+00:59:02.720 --> 00:59:08.100
+that we do there and like Jonathan has been like working on a lot of that. 
And there's also all the
+
+00:59:07.900 --> 00:59:12.760
+logic and all the stuff we have a bunch of stuff on top of that to handle auto scaling which is
+
+00:59:12.860 --> 00:59:18.700
+something that is actually not that easy to find in different providers we have auto scaling based
+
+00:59:18.840 --> 00:59:25.180
+on requests including scaling down to zero which saves costs but it's you know like this is not
+
+00:59:25.300 --> 00:59:30.980
+Lambdas it's not AWS Lambdas it's we are like it's like the full uh deployed application the full
+
+00:59:31.220 --> 00:59:36.680
+container or whatever it is uh which is you know like the full thing with all the dependencies is
+
+00:59:36.700 --> 00:59:42.240
+running for whenever it has to run, but we can scale based on requests. So it's like, I guess
+
+00:59:42.360 --> 00:59:49.700
+it's like the type of thing that you will have if you have this giant cluster for a huge enterprise
+
+00:59:50.220 --> 00:59:57.720
+with a bunch of infra people making sure everything just works perfectly, but you just pay us to do
+
+00:59:57.840 --> 01:00:05.780
+that for you. Yeah. But I think this is also a good time for us to probably say lots of stuff is
+
+01:00:05.740 --> 01:00:07.260
+is coming and we're in private beta.
+
+01:00:07.460 --> 01:00:09.300
+And so you should sign up for the wait list
+
+01:00:10.760 --> 01:00:12.140
+so that you can get admitted
+
+01:00:12.220 --> 01:00:14.460
+and try out these very cool things we've been talking about.
+
+01:00:14.460 --> 01:00:15.260
+Yeah, absolutely.
+
+01:00:15.560 --> 01:00:19.900
+And I think I'll let Tech Insider out in the audience
+
+01:00:21.280 --> 01:00:22.120
+sort of lead us to this.
+
+01:00:22.120 --> 01:00:22.760
+Public release when?
+
+01:00:23.440 --> 01:00:27.260
+Yeah, lead us into my final topic,
+
+01:00:27.460 --> 01:00:28.640
+which is just what's the roadmap?
+
+01:00:29.220 --> 01:00:29.940
+When is this stuff? 
+
+01:00:30.440 --> 01:00:32.820
+How do we get into it here?
+
+01:00:34.260 --> 01:00:45.820
+So right now we have the waiting list and we are onboarding people.
+
+01:00:45.860 --> 01:00:48.400
+We already have like a bunch of people in the private beta.
+
+01:00:48.820 --> 01:00:53.960
+We are going to keep onboarding people from the waiting list and like, you know, like wrap that up.
+
+01:00:54.060 --> 01:00:58.520
+But it will be like, you know, like through the waiting list is the main place where we are onboarding people.
+
+01:00:58.640 --> 01:01:02.600
+We want to make sure that everything is super fine tuned and we're going to keep it that way for a while.
+
+01:01:02.660 --> 01:01:09.160
+So people that are on the waiting list are going to be the ones that are going to be able to start using it the soonest.
+
+01:01:09.490 --> 01:01:13.180
+At some point, we'll probably have ways for people to invite others and things like that.
+
+01:01:15.340 --> 01:01:23.780
+About the things that we are building, we are super focused on FastAPI and then Python in general.
+
+01:01:23.920 --> 01:01:27.320
+at some point we'll probably support different tools,
+
+01:01:27.810 --> 01:01:31.320
+different ways to run, like Python code in general,
+
+01:01:31.560 --> 01:01:32.600
+probably different frameworks.
+
+01:01:33.120 --> 01:01:37.980
+It will also depend a lot on what the users are asking for,
+
+01:01:38.080 --> 01:01:41.020
+where are the tools, the frameworks, the use cases,
+
+01:01:41.190 --> 01:01:42.420
+the things that they need to build.
+
+01:01:42.580 --> 01:01:45.620
+And we're gonna evolve the platform and the system
+
+01:01:46.200 --> 01:01:48.840
+based on what people need out of it.
+
+01:01:49.420 --> 01:01:53.040
+- Yeah, and we have a GitHub repo where we have issues,
+
+01:01:53.240 --> 01:01:57.540
+but we also have like a Slack that once people are admitted, they can talk directly to us. 
+ +01:01:57.660 --> 01:02:04.080 +And that feedback is really, really valuable for shaping the roadmap and figuring out all the fun things you want us to support. + +01:02:04.680 --> 01:02:05.120 +Yeah. + +01:02:05.790 --> 01:02:05.900 +Awesome. + +01:02:07.700 --> 01:02:09.080 +Of course, you're going to charge money for it. + +01:02:09.630 --> 01:02:13.640 +It runs on the servers and you guys are not a charity. + +01:02:16.420 --> 01:02:17.500 +But I don't know. + +01:02:17.950 --> 01:02:22.320 +Can you give any sense of what you're thinking about that kind of stuff? + +01:02:22.520 --> 01:02:23.960 +or join the waitlist and see? + +01:02:25.220 --> 01:02:30.580 +Well, first join the waitlist and see, but we don't have like that predefined yet, + +01:02:31.440 --> 01:02:34.980 +but it will be on the ballpark of what you could get from different cloud providers. + +01:02:35.080 --> 01:02:44.100 +So like, you know, like different similar-ish providers will be on the ballpark of what you will get. + +01:02:45.200 --> 01:02:48.240 +But it's not written in stone yet. + +01:02:49.000 --> 01:02:53.780 +It's still a little bit different because we can auto scale based on requests. + +01:02:54.080 --> 01:02:58.140 +So we can increase the amount of replicas of your application automatically. + +01:02:59.100 --> 01:03:02.240 +And then we can decrease them automatically and we can scale down to zero. + +01:03:02.860 --> 01:03:08.840 +So you can probably handle all the load that you need and in the end spend a lot less because + +01:03:08.860 --> 01:03:12.420 +you don't have to have a bunch of instances constantly running or things like that. + +01:03:13.220 --> 01:03:19.340 +So it will probably work a little bit different than what it will be for other providers. + +01:03:19.880 --> 01:03:22.140 +But in the end, it should be roughly similar. + +01:03:22.840 --> 01:03:23.220 +Okay. 
+
+01:03:23.900 --> 01:03:29.840
+And given the fact that you all handle so much of it as a platform as a service type of thing,
+
+01:03:30.860 --> 01:03:36.280
+you don't have to have a cloud expert on hand or a DevOps expert necessarily, right?
+
+01:03:36.500 --> 01:03:43.800
+As soon as a company has to hire somebody to be an AWS cloud architect or something like that.
+
+01:03:44.580 --> 01:03:51.500
+It's no longer just what is your AWS bill, right? Yeah. Yeah. And it's, yeah, it's also,
+
+01:03:52.060 --> 01:03:55.480
+it's also a lot of the pain that we are swallowing. So you don't have to.
+
+01:03:56.330 --> 01:03:58.880
+Exactly. It's part of taking one for the team, right? Yes.
+
+01:04:00.740 --> 01:04:08.120
+Indeed. All right. So I had one or two things specifically that I was seeing is like
+
+01:04:09.620 --> 01:04:13.880
+custom domains. How far off are custom domains? I was like, oh, I could put some cool things
+
+01:04:13.960 --> 01:04:21.600
+on there. I could tell Jonathan is psyched about this. I was like, it would be really
+
+01:04:21.800 --> 01:04:28.380
+fun to put one of my really small FastAPI projects over there. You know, something I
+
+01:04:28.340 --> 01:04:31.440
+set up for some of my courses or something that I can like point people to go
+
+01:04:31.520 --> 01:04:34.880
+look, it's running FastAPI Cloud on it, you guys can check that out over there.
+
+01:04:35.660 --> 01:04:36.960
+And I like, but it's on its own domain.
+
+01:04:37.100 --> 01:04:41.500
+And those that domain is baked into the course videos, you know what I mean?
+
+01:04:41.580 --> 01:04:42.820
+And it's, it's written in stone.
+
+01:04:42.980 --> 01:04:43.360
+It's marketing.
+
+01:04:44.260 --> 01:04:45.380
+Yeah, exactly.
+
+01:04:45.740 --> 01:04:52.440
+So, so I can't really move it because it has, you know, some sub domain
+
+01:04:52.560 --> 01:04:53.220
+of Talk Python, right?
+
+01:04:55.380 --> 01:04:55.640
+Jonathan. 
+
+01:04:57.880 --> 01:04:59.160
+I was going to say that work.
+
+01:04:59.900 --> 01:05:03.100
+See, I was working on it and then I got the notification by Google
+
+01:05:03.300 --> 01:05:05.540
+Calendar that I should join a certain podcast.
+
+01:05:06.080 --> 01:05:08.560
+Are you telling me we don't have custom domains?
+
+01:05:08.960 --> 01:05:10.980
+Because I'm here asking you about custom domains.
+
+01:05:11.030 --> 01:05:11.780
+How meta is that?
+
+01:05:11.790 --> 01:05:12.180
+You got it.
+
+01:05:12.610 --> 01:05:12.720
+Yeah.
+
+01:05:14.460 --> 01:05:18.420
+It could be here already, but no, you have to wait a bit more.
+
+01:05:19.160 --> 01:05:20.160
+OK, but soon?
+
+01:05:21.020 --> 01:05:21.460
+Yes.
+
+01:05:21.710 --> 01:05:21.840
+Yeah.
+
+01:05:22.200 --> 01:05:22.360
+Yeah.
+
+01:05:23.220 --> 01:05:26.640
+Soon is broad enough, but I'm actively working on it.
+
+01:05:27.120 --> 01:05:28.300
+Let's put it like that.
+
+01:05:28.800 --> 01:05:28.900
+Awesome.
+
+01:05:30.420 --> 01:05:31.180
+Okay, that sounds great.
+
+01:05:31.420 --> 01:05:35.500
+And then, I mean, I just, it's never simple.
+
+01:05:35.840 --> 01:05:39.240
+You know, I just, I set up some stuff and it's like, you get the pop-up.
+
+01:05:39.560 --> 01:05:47.220
+Oh, you got to put this, you know, this TXT record, this CNAME or whatever record into your DNS and then we're checking it.
+
+01:05:47.260 --> 01:05:50.060
+Oh, it might take three days for your DNS to propagate.
+
+01:05:50.280 --> 01:05:53.200
+So hang in there and just, I can imagine like you're having fun.
+
+01:05:54.040 --> 01:05:54.780
+Yeah, I guess.
+
+01:05:55.180 --> 01:05:55.580
+Yeah.
+
+01:05:56.320 --> 01:05:58.700
+Yeah, that's like, wow.
+
+01:05:58.920 --> 01:06:02.240
+I thought I'm almost off work, but no, you're bringing it all back.
+
+01:06:02.420 --> 01:06:03.060
+I'm sorry. 
+
+01:06:04.560 --> 01:06:09.180
+I'm sure the company could support therapy to work through the issues
+
+01:06:09.260 --> 01:06:11.080
+and the trauma that you've suffered from the DNS.
+
+01:06:12.080 --> 01:06:12.880
+It's always DNS.
+
+01:06:13.160 --> 01:06:13.340
+That's right.
+
+01:06:13.420 --> 01:06:14.320
+It's always DNS.
+
+01:06:14.860 --> 01:06:15.700
+Yes, it's always DNS.
+
+01:06:16.420 --> 01:06:20.460
+One of our goals with custom domains is also to make it super simple
+
+01:06:20.500 --> 01:06:21.600
+for you to set up them.
+
+01:06:22.260 --> 01:06:27.340
+Like, for example, if you're using one of the providers that support OAuth,
+
+01:06:27.370 --> 01:06:30.120
+we can also just do one click, and then it's going to be automatically.
+
+01:06:30.190 --> 01:06:31.840
+Oh, that's cool. Yeah, yeah, that's really nice.
+
+01:06:32.440 --> 01:06:34.580
+But unfortunately, it depends on the platform you're using,
+
+01:06:34.660 --> 01:06:36.600
+because not all of them support this.
+
+01:06:39.000 --> 01:06:40.700
+Yeah, just a shout-out for people out there.
+
+01:06:40.920 --> 01:06:43.000
+This is made by the person in charge of most of the integrations.
+
+01:06:43.110 --> 01:06:47.500
+So Patrick has built, we have integrations for a bunch of database providers
+
+01:06:47.670 --> 01:06:48.260
+and things like that.
+
+01:06:48.520 --> 01:06:53.360
+Patrick, I think now Patrick knows by memory the OpenID specification.
+
+01:06:53.720 --> 01:06:54.120
+I don't know.
+
+01:06:57.180 --> 01:07:00.020
+Yeah, the other thing I wanted to talk a bit about was just integrations,
+
+01:07:00.180 --> 01:07:01.660
+like what kind of stuff you guys have coming.
+
+01:07:01.700 --> 01:07:05.200
+I saw that Hugging Face is going to be integrated soon.
+
+01:07:05.400 --> 01:07:09.320
+You've got Supabase, which is kind of Postgres as a service.
+
+01:07:10.240 --> 01:07:15.480
+There's a lot of those things out there that theoretically could be added. 
+
+01:07:16.620 --> 01:07:19.900
+Yeah, I mean, someone also asked for MongoDB.
+
+01:07:20.000 --> 01:07:22.100
+Maybe that's one that we're going to take a look into.
+
+01:07:22.300 --> 01:07:24.780
+It really depends on the provider.
+
+01:07:24.920 --> 01:07:27.640
+So at the moment, we don't want to ask databases for you
+
+01:07:27.800 --> 01:07:29.800
+because that's also another kind of rabbit hole.
+
+01:07:30.920 --> 01:07:32.300
+Jonathan is probably not ready for that.
+
+01:07:34.180 --> 01:07:36.180
+But yeah, definitely like databases.
+
+01:07:36.320 --> 01:07:38.220
+But I guess we can say that we're also
+
+01:07:38.640 --> 01:07:41.640
+talking with the people from Pydantic
+
+01:07:41.740 --> 01:07:44.080
+so we can integrate maybe Logfire automatically,
+
+01:07:44.500 --> 01:07:45.040
+that kind of stuff.
+
+01:07:46.980 --> 01:07:47.200
+Yeah.
+
+01:07:47.420 --> 01:07:50.880
+And also like Redis, which is also another kind of database
+
+01:07:51.820 --> 01:07:52.740
+that's also coming soon.
+
+01:07:54.060 --> 01:07:56.740
+- Yeah, there's a couple of database as a service
+
+01:07:58.160 --> 01:08:00.020
+type things that don't require too much
+
+01:08:00.080 --> 01:08:04.160
+other than just connecting API keys
+
+01:08:04.260 --> 01:08:05.200
+and something like that, right?
+
+01:08:05.440 --> 01:08:06.640
+Those seem like low hanging fruit.
+
+01:08:07.400 --> 01:08:11.680
+- Yeah, so like the kind of goal with the integration
+
+01:08:11.940 --> 01:08:13.780
+is not just like, yeah, right now,
+
+01:08:13.860 --> 01:08:15.360
+it's just setting up an environment variable.
+
+01:08:16.279 --> 01:08:19.100
+But the idea is also to do more--
+
+01:08:21.150 --> 01:08:23.500
+I don't know-- the proper integration, I would say.
+
+01:08:24.460 --> 01:08:26.500
+For example, for things like Supabase,
+
+01:08:27.470 --> 01:08:28.640
+yeah, I think they support branching. 
+
+01:08:29.020 --> 01:08:32.359
+For example, once we support pull request previews for GitHub,
+
+01:08:32.700 --> 01:08:34.580
+we can also create a branch automatically for you
+
+01:08:34.670 --> 01:08:36.620
+if you have the Supabase integration enabled.
+
+01:08:37.020 --> 01:08:38.799
+And we can do this kind of stuff as well.
+
+01:08:38.980 --> 01:08:42.680
+Or even we could show some information about database,
+
+01:08:42.960 --> 01:08:45.560
+I don't know, like load or like memory usage,
+
+01:08:45.660 --> 01:08:47.279
+things like that directly from our dashboard.
+
+01:08:47.339 --> 01:08:48.220
+So you don't have to go there.
+
+01:08:48.920 --> 01:08:50.920
+That's the main reason why we're building this
+
+01:08:52.140 --> 01:08:53.880
+infrastructure for the integration.
+
+01:08:54.900 --> 01:08:55.400
+- All right.
+
+01:08:56.960 --> 01:08:59.120
+Well, people can sign up to the waiting list
+
+01:08:59.319 --> 01:09:02.339
+and hopefully get on the private beta and...
+
+01:09:03.660 --> 01:09:05.140
+- We actually check the waiting list.
+
+01:09:05.140 --> 01:09:08.040
+We actually check the use cases, team sizes,
+
+01:09:08.299 --> 01:09:10.380
+like what are people building with it.
+
+01:09:10.400 --> 01:09:12.419
+Like we actually go and check it
+
+01:09:12.440 --> 01:09:16.520
+and we bring in people from the waiting list.
+
+01:09:16.859 --> 01:09:19.120
+Nice. I didn't join the waiting list directly.
+
+01:09:19.359 --> 01:09:25.319
+I was added by some guy I know who was very kind to help me get some behind the scene look.
+
+01:09:26.140 --> 01:09:27.220
+I don't know what the process is.
+
+01:09:27.290 --> 01:09:32.140
+Do you actually say what you want to do with it and you evaluate that a little bit as well based on like,
+
+01:09:32.150 --> 01:09:34.200
+hey, this would be a cool use case for us to support?
+
+01:09:35.440 --> 01:09:36.359
+Yes, yes, of course. 
+ +01:09:36.450 --> 01:09:41.199 +Because there are many types of applications and + +01:09:41.220 --> 01:09:46.560 +many types of different team sizes, many types of things that people might want to build. + +01:09:46.700 --> 01:09:52.400 +And we try to see, like, okay, where is the case where we could be, like, a good fit and we can provide a great service? + +01:09:53.000 --> 01:09:54.880 +And what are the things that people are trying to build? + +01:09:55.440 --> 01:10:03.420 +Also, it helps us see, like, you know, like, what are people trying to do with FastAPI Cloud so that we know what we have to provide? + +01:10:04.440 --> 01:10:11.020 +But we actually go and check those submissions. + +01:10:12.000 --> 01:10:20.760 +It's actually thousands of people in the waiting list, but we still go and check and approve kind of manually. + +01:10:21.940 --> 01:10:22.060 +Yeah. + +01:10:22.260 --> 01:10:26.600 +To bring a bunch of people on board in the different ways that we have been bringing people. + +01:10:26.960 --> 01:10:32.220 +So if people go and join the waiting list and actually tell us what is their use case, their team, + +01:10:32.360 --> 01:10:37.700 +like what are they planning on doing, there's a much higher chance that we are going to + +01:10:38.020 --> 01:10:39.400 +go and just like bring them up. + +01:10:39.960 --> 01:10:40.400 +Okay. + +01:10:41.310 --> 01:10:41.420 +Awesome. + +01:10:42.040 --> 01:10:45.880 +So, everyone, go join the waitlist if you're doing FastAPI. + +01:10:46.560 --> 01:10:48.880 +I'll link to it in the show notes, of course. + +01:10:51.040 --> 01:10:54.680 +Thank you all for being here and sharing the story. + +01:10:55.000 --> 01:10:58.960 +And I, for one, am very excited to see FastAPI Cloud exist. + +01:11:00.380 --> 01:11:05.380 +just one more way to make FastAPI stronger and more resilient and so on. + +01:11:06.780 --> 01:11:08.700 +Thank you very much. Thank you for having us. 
+ +01:11:09.120 --> 01:11:10.060 +Yeah, thanks for having us. + +01:11:11.020 --> 01:11:12.520 +Yeah, you bet. Bye, everyone. + +01:11:13.200 --> 01:11:13.720 +Bye, folks. + +01:11:13.960 --> 01:11:14.040 +Bye. + diff --git a/youtube_transcripts/538-digital-humanities-original.vtt b/youtube_transcripts/538-digital-humanities-original.vtt new file mode 100644 index 0000000..e1c6915 --- /dev/null +++ b/youtube_transcripts/538-digital-humanities-original.vtt @@ -0,0 +1,3299 @@ +WEBVTT + +00:00:01.540 --> 00:00:03.560 +Hello, David. Welcome to Talk Python To Me. + +00:00:04.080 --> 00:00:04.700 +Amazing to have you here. + +00:00:06.200 --> 00:00:06.800 +I'm glad to be here. + +00:00:07.320 --> 00:00:10.060 +Talk Python has been part of my story up to this point. + +00:00:10.360 --> 00:00:11.760 +Has it? Okay. Well, + +00:00:12.700 --> 00:00:15.180 +you are about to write the next chapter in the story. + +00:00:15.350 --> 00:00:16.500 +So that's pretty excellent. + +00:00:17.840 --> 00:00:19.000 +I have a sense of what's coming. + +00:00:19.530 --> 00:00:22.640 +We planned out what we're going to talk about and that sort of thing. + +00:00:23.500 --> 00:00:26.280 +And I'm really excited about this topic. + +00:00:26.700 --> 00:00:29.640 +So it's going to be a good one. + +00:00:30.580 --> 00:00:34.600 +Honestly, I think one of the real powers of the Python community + +00:00:34.640 --> 00:00:37.320 +and the reason the language has such staying power + +00:00:37.320 --> 00:00:44.900 +is there's such a diversity of use cases, technology standpoints, right? + +00:00:44.960 --> 00:00:48.640 +Like I build software for this group or I build these types of apps + +00:00:48.640 --> 00:00:50.680 +and it's not just, you know, like Ruby on Rails, + +00:00:51.040 --> 00:00:53.200 +which, you know, it's been very popular and stuff, + +00:00:53.220 --> 00:00:54.840 +but it's for websites, right? + +00:00:54.960 --> 00:00:55.260 +You know what I mean? 
+
+00:00:58.280 --> 00:01:07.880
+Yeah, absolutely. I mean, web development has dominated my use of it, but my entry into it,
+
+00:01:08.030 --> 00:01:12.180
+which I suppose I'll mention in a moment, was through all those little tools.
+
+00:01:13.260 --> 00:01:21.540
+Was it? Okay. Yeah. Well, let's hear it. Who are you, David Flood? Introduce yourself real quick
+
+00:01:21.560 --> 00:01:23.300
+and tell us about how you got into it.
+
+00:01:24.140 --> 00:01:24.200
+Yep.
+
+00:01:25.500 --> 00:01:30.500
+So my background is in music and the humanities.
+
+00:01:31.280 --> 00:01:34.820
+I mean, in 2019, I didn't know what Python was
+
+00:01:35.120 --> 00:01:36.560
+or the name of any programming language.
+
+00:01:38.800 --> 00:01:41.060
+And I've been doing textual criticism,
+
+00:01:41.780 --> 00:01:46.060
+which is, you know, there's lots of criticisms in the academy.
+
+00:01:46.640 --> 00:01:49.320
+This is the one where if you have lots and lots of versions
+
+00:01:49.340 --> 00:01:56.620
+of the same text you are comparing them to work out what the initial text was and like how
+
+00:01:56.670 --> 00:02:03.980
+it changed over time okay give it give us an example okay so uh one of the famous examples
+
+00:02:04.090 --> 00:02:09.880
+hope I can remember it off the top of my head um is from Shakespeare we're all familiar with the
+
+00:02:10.020 --> 00:02:16.280
+line to be or not to be that is the question that is the question uh well there's a there's a variant
+
+00:02:16.300 --> 00:02:23.840
+of it. There's a variant of it. One of the early copies written by Shakespeare himself has...
+
+00:02:25.720 --> 00:02:29.420
+Somebody's going to be able to type into the chat exactly what it is. They'll know this anecdote,
+
+00:02:29.800 --> 00:02:38.180
+but it's something more like to be or not to be. I, that's the question. And so which one is the
+
+00:02:38.340 --> 00:02:42.860
+original one? Why did he change it? That's kind of one example. 
I work mainly in the New Testament, + +00:02:43.360 --> 00:02:53.600 +which is especially complicated because no other corpus from ancient history has as many copies of the same text as that corpus does. + +00:02:53.790 --> 00:02:56.740 +So it's quite complicated. + +00:02:57.050 --> 00:03:03.140 +And our techniques have grown because of that and perhaps become more advanced than other fields. + +00:03:03.760 --> 00:03:21.400 +I mean, that many variations over that huge span of time over different groups with different, maybe not intentions, but certainly colored by different worldviews and philosophies and so on. + +00:03:21.410 --> 00:03:23.420 +And yeah, I see the travel. + +00:03:24.600 --> 00:03:24.980 +No, yeah. + +00:03:25.200 --> 00:03:26.980 +And they were people of the book. + +00:03:27.660 --> 00:03:30.420 +So copying it is something that happened a lot. + +00:03:30.500 --> 00:03:34.680 +And they copied the monks, like the medieval monks copied everything. + +00:03:35.110 --> 00:03:38.640 +They copied, you know, our Greek classics. + +00:03:39.420 --> 00:03:41.900 +So that's what I was interested in. + +00:03:42.430 --> 00:03:48.980 +And because of the wealth of data that we have, computer tools are more and more important in that field. + +00:03:49.540 --> 00:03:55.780 +So when I started my PhD in 2019, I knew that I wanted to use some of these cutting edge tools. + +00:03:56.320 --> 00:03:57.940 +Some of them may be surprising. + +00:03:58.400 --> 00:04:03.640 +For example, we've been using phylogenetic software. + +00:04:04.300 --> 00:04:15.460 +This is software that evolutionary biologists are using or computational biologists are using to track, for example, how COVID strains like mutate over time. + +00:04:16.030 --> 00:04:16.680 +Oh, interesting. + +00:04:17.519 --> 00:04:20.380 +What they're comparing are the DNA letters. + +00:04:20.890 --> 00:04:24.320 +And so you have the sequence of letters and you're comparing how those change over time. 
+ +00:04:24.860 --> 00:04:37.120 +Well, you can swap in textual variants for DNA letters, and now we can track how texts change over time and group them into families, things like that. + +00:04:38.500 --> 00:04:41.980 +It's like a time series, but of words or letters or something. + +00:04:42.140 --> 00:04:42.720 +Yeah, how interesting. + +00:04:42.980 --> 00:04:43.540 +Yeah, I mean, yeah. + +00:04:44.100 --> 00:04:50.720 +There's lots of important algorithms for comparing sequences of things. + +00:04:51.540 --> 00:04:58.360 +And so if we can just swap in Greek, you know, Greek words and Greek text instead, then we can maybe apply it to textual criticism. + +00:04:58.870 --> 00:05:00.380 +So I was pretty interested in those things. + +00:05:00.430 --> 00:05:07.660 +That wasn't actually the method that brought me into it, but something like that kind of computer intensive tools. + +00:05:09.240 --> 00:05:13.700 +What I learned is that these tools were not like they weren't actually available to me. + +00:05:14.200 --> 00:05:15.840 +They weren't desktop applications. + +00:05:16.050 --> 00:05:18.780 +And for the most part, they weren't public web applications. + +00:05:19.050 --> 00:05:20.660 +They were software libraries. + +00:05:22.139 --> 00:05:25.780 +I see. So something on PyPI or something like that, right? + +00:05:26.280 --> 00:05:33.600 +Yeah, exactly. Exactly. Or Java. And I needed to glue them together. So the long story short on + +00:05:33.720 --> 00:05:39.300 +that is during the first year of my PhD, I was picking up Python, watching YouTube videos while + +00:05:39.300 --> 00:05:45.360 +I was doing the dishes. 
And then the pandemic hit while I was living in Edinburgh in Scotland,
+
+00:05:45.420 --> 00:05:53.760
+probably not far from Will McGugan um and uh so the pandemic gave me the excuse to spend even a
+
+00:05:53.880 --> 00:06:00.500
+few more hours um you know each day um picking up these new these new technical skills uh but so I
+
+00:06:00.620 --> 00:06:06.780
+did it I was able to use these advanced tools in my in my work but what was really important to me
+
+00:06:06.880 --> 00:06:12.519
+was sharing like making that available to my colleagues as I had to I had to move from writing
+
+00:06:12.580 --> 00:06:17.800
+these like bad top to bottom Python scripts into things that could be reused by other people.
+
+00:06:18.470 --> 00:06:23.140
+And that led me into the web because the web is where that's how I can share with anybody.
+
+00:06:24.000 --> 00:06:33.020
+It's the easiest way. And it's really wild how much the web is kind of the last bastion of
+
+00:06:34.220 --> 00:06:40.639
+app freedom. It's so bizarre because, you know, I've many times told the stories of the insane
+
+00:06:40.660 --> 00:06:46.820
+battles of just getting our apps that just playback video of content that's already on the web
+
+00:06:47.420 --> 00:06:53.560
+into the app store. I mean, weeks of fighting about the weirdest, most nonsensical things with
+
+00:06:53.700 --> 00:07:01.280
+both Google and Apple. But we also now have the Mac platform and the Windows platform very
+
+00:07:01.580 --> 00:07:07.380
+aggressively looking for digital code certificates and all sorts of signing and other kinds of,
+
+00:07:07.460 --> 00:07:10.540
+You can't even just send somebody an executable anymore.
+
+00:07:11.150 --> 00:07:12.640
+It won't run. It's crazy.
+
+00:07:12.900 --> 00:07:15.840
+It's down to like, okay, put it on the web, I guess.
+
+00:07:17.080 --> 00:07:21.200
+- That's right. I played the game of distributing desktop apps. 
+ +00:07:21.340 --> 00:07:21.880 +That's how I did it. + +00:07:22.980 --> 00:07:24.420 +That's how I initially distributed things. + +00:07:25.760 --> 00:07:28.840 +And at this point, I just require people to install Python + +00:07:29.200 --> 00:07:31.560 +and then install my desktop app from PyPI + +00:07:31.780 --> 00:07:34.660 +because it's too hard otherwise for me. + +00:07:34.980 --> 00:07:38.960 +I mean, I could pay for the code signing from Apple and do all of that, + +00:07:39.020 --> 00:07:42.200 +but it's just too much work for the time that I have. + +00:07:42.920 --> 00:07:44.500 +Yeah, I'm about to do another round of it. + +00:07:44.500 --> 00:07:48.060 +I'm working on an app, and my developer account is still active, + +00:07:48.260 --> 00:07:50.060 +so we might have a fresh round of fun. + +00:07:50.320 --> 00:07:51.660 +Hopefully it goes through this time. + +00:07:52.640 --> 00:07:54.540 +Anyway, I do think it's such a challenge. + +00:07:54.740 --> 00:07:55.920 +And are you leveraging? + +00:07:56.320 --> 00:07:57.560 +I don't know if the timing was right. + +00:07:57.700 --> 00:07:58.700 +Maybe this was too early. + +00:07:59.620 --> 00:08:03.980 +But these days, are you leveraging things like uvx to run, + +00:08:04.340 --> 00:08:07.840 +or are you just pip install this thing and then run it? + +00:08:08.899 --> 00:08:11.120 +Yeah, I haven't updated the readme in a while, + +00:08:11.180 --> 00:08:13.180 +so I think it just asks for pip. + +00:08:13.860 --> 00:08:17.340 +But certainly, if somebody asked me today, + +00:08:17.980 --> 00:08:20.100 +I would say, "Yeah, just install this with uv." + +00:08:20.640 --> 00:08:20.940 +Yeah, yeah. + +00:08:21.100 --> 00:08:22.380 +Because then they don't even need Python. + +00:08:23.300 --> 00:08:23.560 +Exactly. 
+
+00:08:23.900 --> 00:08:24.700
+If they get it,
+
+00:08:25.020 --> 00:08:25.620
+and that's a really,
+
+00:08:25.820 --> 00:08:28.340
+it is another barrier reduced
+
+00:08:28.480 --> 00:08:29.420
+in distributing these applications.
+
+00:08:30.700 --> 00:08:33.719
+If you can get uv installed on a machine,
+
+00:08:34.080 --> 00:08:37.700
+then you don't even have to say install just the way you run it is uvx my thing
+
+00:08:38.419 --> 00:08:44.980
+and it's all transparent to you right which is beautiful so what was it like yeah so what was
+
+00:08:44.980 --> 00:08:55.920
+it like coming from what sounds like a not super screen focus super techie aspect and having to
+
+00:08:56.100 --> 00:09:01.019
+dive into this world and someday you're probably like how is it that I'm publishing stuff to PyPI
+
+00:09:01.040 --> 00:09:08.160
+what has happened to me yeah uh well yeah I I remember when I when I first signed up for GitHub
+
+00:09:08.500 --> 00:09:12.140
+because you know whatever YouTube tutorial I was working through at the time
+
+00:09:13.200 --> 00:09:21.700
+um you know said that I need that that I needed to do that um you know I I think it all started
+
+00:09:21.860 --> 00:09:28.379
+making a lot of sense um I didn't have any technical background but the uh the world
+
+00:09:29.920 --> 00:09:36.560
+the world kind of open source software it it just it kind of made sense it felt like it fit really
+
+00:09:36.740 --> 00:09:43.740
+well into my into my academic um you know circle yeah I think a lot of the attitudes are similar
+
+00:09:44.540 --> 00:09:48.820
+I agree I think they are actually and I think that's I think that's a pretty neat thing
+
+00:09:51.040 --> 00:09:58.379
+yeah very cool all right well let's talk about what you're doing with digital humanities you're
+
+00:09:58.400 --> 00:10:03.960
+actually at a really interesting project or organization, I guess that does many projects,
+
+00:10:04.240 --> 00:10:09.700
+right? 
Yeah, yeah. So fast, fast forwarding, I did, I finished my PhD in the humanities.
+
+00:10:09.880 --> 00:10:15.500
+Yeah, sorry, I cut you off there, didn't I? That's fine. I had so much fun writing like these tools
+
+00:10:15.680 --> 00:10:20.640
+and then just get it solving the distribution problem to share them with other scholars.
+
+00:10:21.960 --> 00:10:27.120
+That was so fun that I was open to this kind of opportunity where now I'm doing this full time.
+
+00:10:28.080 --> 00:10:36.940
+And so, yes, I so I'm on the we call it affectionately Darth, which is Digital Arts and Humanities at Harvard.
+
+00:10:37.840 --> 00:10:42.380
+Awesome. There has to be a lot of Star Wars memes and references, I'm sure.
+
+00:10:43.060 --> 00:10:46.360
+If you can pull up a 404, I think there will be a Darth Vader.
+
+00:10:47.240 --> 00:10:48.860
+Oh, seriously, I'm here for it.
+
+00:10:54.000 --> 00:10:57.020
+Yes, page not found. I find your lack of nav disturbing.
+
+00:10:59.660 --> 00:11:05.020
+you know what I think that is beautiful and I really I really think that people should embrace
+
+00:11:05.540 --> 00:11:14.300
+the 404 the fun 404 page you know more right there should really be something going on that
+
+00:11:14.440 --> 00:11:19.600
+like makes it you know something hasn't worked out but you can just you can make people laugh
+
+00:11:20.420 --> 00:11:21.500
+but I appreciate that
+
+00:11:23.200 --> 00:11:25.120
+I've heard people push back against it
+
+00:11:25.240 --> 00:11:26.020
+like if you're on
+
+00:11:27.560 --> 00:11:29.080
+your medical website
+
+00:11:29.660 --> 00:11:31.320
+and you're maybe about to get bad news
+
+00:11:31.460 --> 00:11:33.120
+and then you get like a picture of a kitten
+
+00:11:35.780 --> 00:11:37.720
+Dr. 
Kitten doesn't know where your results went + +00:11:37.820 --> 00:11:39.140 +I get that that's not funny + +00:11:40.740 --> 00:11:42.840 +but I mean most things are not that serious + +00:11:44.280 --> 00:11:44.760 +mostly + +00:11:45.840 --> 00:11:47.560 +okay so what kind of things + +00:11:47.960 --> 00:11:48.899 +does Darth do + +00:11:49.120 --> 00:11:54.160 +You've described this as kind of a web or tech agency within Harvard. + +00:11:55.020 --> 00:11:56.260 +Maybe more within the digital art group. + +00:11:56.420 --> 00:11:57.060 +Yeah, it is very much. + +00:11:57.060 --> 00:11:59.520 +So, you know, Harvard has a gigantic IT group. + +00:12:00.040 --> 00:12:06.800 +I don't know how many hundreds of people work, but more than 500 people in IT. + +00:12:08.480 --> 00:12:12.720 +We are a small team, and we operate very much like a small agency. + +00:12:13.540 --> 00:12:22.160 +So usually what happens is a faculty member has a funded research project that's going to last for an amount of time. + +00:12:23.410 --> 00:12:25.820 +And then we consult with them to build it. + +00:12:27.240 --> 00:12:35.760 +And most of the time, I kind of think of these as I kind of have these different categories of these kinds of projects that I think of. + +00:12:36.500 --> 00:12:36.900 +Mm-hmm. + +00:12:43.520 --> 00:12:46.640 +I lost in my notes what I called them, but they are there. + +00:12:46.780 --> 00:12:50.480 +You have like a one is like a virtual research environment. + +00:12:51.460 --> 00:12:57.560 +So the focus is this is this is a platform that we're building for the research to be done. 
+
+00:12:58.080 --> 00:13:05.720
+Like the reason the research should be done in like a web app would be because you have access to visualization, to Postgres, +
+00:13:06.740 --> 00:13:12.620
+to pandas. So we can kind of build up this platform to do the actual research on, +
+00:13:13.040 --> 00:13:19.220
+and some of the data entry. All right, so like a full-on research application. Yeah, and I guess +
+00:13:19.240 --> 00:13:27.920
+you can also kind of see your work through the different stages of research projects and academic +
+00:13:28.100 --> 00:13:34.479
+research and so on. And we'll get to maybe end of life, in a sense, further down in the +
+00:13:34.500 --> 00:13:41.500
+conversation. But so this would be, we have a grant or we just work here and we're going to work on +
+00:13:41.920 --> 00:13:47.260
+some form of research. What do you give them? Right? Like, I think that's a super interesting +
+00:13:47.560 --> 00:13:55.140
+challenge because one of the real common answers would be Jupyter, JupyterLab, Marimo, whatever. +
+00:13:56.420 --> 00:14:03.940
+But that's still pretty code heavy for people who are possibly philosophers or something, you know, +
+00:14:04.420 --> 00:14:16.220
+Oh, exactly. That's why in digital humanities, I won't even, maybe I won't even attempt to define it in any narrow sense, because I'll get in trouble with somebody. +
+00:14:17.640 --> 00:14:33.700
+But you have two groups that are interfacing with each other. And one is digital humanities as a field, like as a subfield, all of its own. And these are people who have humanities domain, like knowledge, and technical skills. +
+00:14:34.140 --> 00:14:35.140
+and they're bringing them together. +
+00:14:35.720 --> 00:14:38.200
+And in a lot of cases, the audience for that kind of work +
+00:14:38.540 --> 00:14:41.060
+is other people working in the digital humanities.
+
+00:14:42.320 --> 00:14:45.060
+But far more common, and this is what we work with, +
+00:14:45.240 --> 00:14:49.280
+is people who have humanities domain expertise +
+00:14:49.820 --> 00:14:52.840
+and they want to publish or do research or share +
+00:14:53.040 --> 00:14:57.060
+with other people who have that same humanities domain expertise +
+00:14:57.560 --> 00:15:01.400
+and they are now interested in adding a technical component to it. +
+00:15:01.640 --> 00:15:03.920
+How can we supercharge what they have? +
+00:15:06.360 --> 00:15:08.280
+Maybe just take a moment and speak to, +
+00:15:10.140 --> 00:15:14.640
+maybe I don't know if this venue will actually speak directly to anybody who I was imagining here, +
+00:15:14.760 --> 00:15:16.520
+but people who work with folks, +
+00:15:17.380 --> 00:15:23.560
+what would you tell somebody who works with a group who have some technical skill, +
+00:15:23.700 --> 00:15:25.920
+who could create some of these things that we're going to talk about, +
+00:15:26.600 --> 00:15:30.960
+but the people they do this for don't necessarily think they need it or know that they need it. +
+00:15:31.000 --> 00:15:38.460
+Like, you know, I've gone often on rants about how programming is a superpower, not a replacement for your job, right? +
+00:15:41.880 --> 00:15:51.240
+Yeah, that's a problem for a lot of people, especially because you might use some new computer tools to supercharge your research. +
+00:15:51.980 --> 00:15:59.200
+But the article that you publish or the research output of that, the audience, they may not be interested in hearing about that at all. +
+00:15:59.820 --> 00:16:10.380
+And so for most people who are working in this space, the tools, you have to use them in such a way that you can talk about the research output without talking about the tool. +
+00:16:11.360 --> 00:16:16.700
+And we have other venues to talk about the tools themselves, like the Journal of Open Source Software.
+ +00:16:17.180 --> 00:16:19.100 +And you can kind of get some of it out there. + +00:16:20.200 --> 00:16:26.500 +But that's the significant challenge is convincing people that it could be useful and then convincing + +00:16:26.560 --> 00:16:31.620 +the audience that they should be interested in the methods behind how some of the new research + +00:16:31.700 --> 00:16:32.080 +comes out. + +00:16:32.980 --> 00:16:38.780 +Also, I think I'm a big believer that presenting stuff in the right order is really, really + +00:16:39.180 --> 00:16:39.320 +important. + +00:16:40.100 --> 00:16:44.240 +If you present your research and it's beautiful and powerful, and oh, look, we've also, by + +00:16:44.320 --> 00:16:47.600 +the way, covered 100 times more data than any prior research. + +00:16:48.160 --> 00:16:48.400 +Surprise. + +00:16:48.520 --> 00:16:49.420 +I wonder how I did that. + +00:16:50.120 --> 00:16:51.300 +And then people are like, this is amazing. + +00:16:52.640 --> 00:16:55.640 +Then after you kind of hook them with the inspiration and what's possible, + +00:16:55.740 --> 00:16:57.680 +then you're like, let me tell you about the tool. + +00:16:57.800 --> 00:16:59.000 +Don't say that's a cool tool, right? + +00:16:59.160 --> 00:17:03.080 +This is not just like geekery, like programmer, you know, + +00:17:03.320 --> 00:17:05.079 +a Charlie Brown speak, wah, wah, wah, wah, wah. + +00:17:05.319 --> 00:17:06.620 +You know, it's like, no, I'm listening. + +00:17:06.839 --> 00:17:07.420 +Tell me now. + +00:17:08.819 --> 00:17:09.280 +Yeah, exactly. + +00:17:09.680 --> 00:17:12.839 +I mean, one of the things I think that really opens people's eyes + +00:17:13.140 --> 00:17:15.579 +is a really powerful search interface. + +00:17:16.380 --> 00:17:17.860 +You have all of this research data. + +00:17:18.480 --> 00:17:22.180 +Just put it behind Elasticsearch with some really good filtering on it. 
+ +00:17:22.560 --> 00:17:27.880 +And all of a sudden you have fast, rapid access to the data in a way you never had before. + +00:17:28.110 --> 00:17:34.440 +Like you were never scrolling through the Excel spreadsheets and finding exactly what you wanted, like you were with this new search interface. + +00:17:35.420 --> 00:17:37.640 +And that by itself is like so simple. + +00:17:37.710 --> 00:17:42.320 +We're so used to that in web development that like everything needs to have a fantastic search now. + +00:17:43.320 --> 00:17:47.900 +But so many people have their data locked behind, you know, a terrible search interface. + +00:17:48.540 --> 00:17:57.140 +Yeah. Yeah. Just a few things to sort of expose that. So this, give us a sense of what these data + +00:17:57.300 --> 00:18:02.840 +exploration web apps might look like. These are probably kind of mostly stuck to the inside, + +00:18:03.140 --> 00:18:09.160 +kind of internal to the research lab research team groups and so on. These are probably not + +00:18:09.460 --> 00:18:15.499 +that public facing, right? Almost everything we work on does end up having a public facing + +00:18:15.520 --> 00:18:24.000 +component. So maybe the research itself is done locked behind a user login that's just for the + +00:18:24.240 --> 00:18:30.500 +researchers. But then they expose that research to the public, usually with a good search interface + +00:18:31.040 --> 00:18:37.360 +and different pages for exploring their data and visualizations and things like that. So + +00:18:38.420 --> 00:18:43.200 +yeah, everything we do ends up becoming a production public web app in the end. + +00:18:44.340 --> 00:18:45.000 +- Yeah, awesome. + +00:18:47.020 --> 00:18:50.100 +And then another one of your categories you put in + +00:18:50.180 --> 00:18:52.360 +was virtual research environments, + +00:18:52.500 --> 00:18:55.180 +like data entry, publishing, authoring, collaboration. + +00:18:55.540 --> 00:18:56.020 +Tell us about that. 
+
+00:18:57.800 --> 00:18:59.920
+- Yeah, so a good example of this maybe +
+00:19:00.100 --> 00:19:01.420
+is one of the projects that, +
+00:19:04.140 --> 00:19:05.320
+well, actually the best example of it +
+00:19:05.380 --> 00:19:09.460
+is the project I worked on during my PhD. +
+00:19:10.340 --> 00:19:12.400
+It's called Apatosaurus. +
+00:19:13.740 --> 00:19:18.060
+The short story behind the name is that it sounds like apparatus. In textual +
+00:19:18.360 --> 00:19:24.200
+criticism, when you are displaying and visualizing variant +
+00:19:24.520 --> 00:19:29.460
+readings against a base text, that form of visualizing it is a +
+00:19:29.680 --> 00:19:35.420
+critical apparatus. A critical apparatus is a pretty boring website name, but +
+00:19:35.540 --> 00:19:41.600
+Apatosaurus? Dinosaurs might make textual criticism sound fun. Yeah, I do love +
+00:19:41.540 --> 00:19:47.040
+dinosaurs. No, that's really cool. So this comes out as a web app. And I know you also have some, +
+00:19:47.300 --> 00:19:53.440
+you talked about some desktop apps as well. Yeah, yeah, that's right. So yeah, so +
+00:19:53.560 --> 00:19:57.460
+people upload their collation to this, and then they can visualize it. And +
+00:20:01.020 --> 00:20:07.280
+like there's a public component of this as well. But really the back end is editing a collation, +
+00:20:07.720 --> 00:20:10.040
+adding notes to all of the different readings and stuff. +
+00:20:10.740 --> 00:20:15.320
+So I could show what the back end looks like, +
+00:20:15.520 --> 00:20:16.740
+but we can also move on. +
+00:20:19.840 --> 00:20:24.480
+Let's move on, just because most people will not be able to see it. +
+00:20:24.530 --> 00:20:31.120
+But just give us a sense of what do you create for people +
+00:20:31.300 --> 00:20:34.040
+so that they're like, yeah, I can use this app.
+
+00:20:34.420 --> 00:20:36.500
+Give us a sense of some of the features, I guess, +
+00:20:36.680 --> 00:20:43.740
+what I'm getting at. Yeah, so another good example is we +
+00:20:43.740 --> 00:20:50.460
+have a project at Darth at Harvard called Mapping Color in History, and this is a +
+00:20:52.719 --> 00:20:59.040
+collaboration with a lab. This lab brings in pieces of artwork, and they do like +
+00:20:59.120 --> 00:21:04.400
+spectral analysis on the pigments so they can identify what was used to +
+00:21:04.360 --> 00:21:10.920
+make a particular color of this red, or what was used to make this color of blue. +
+00:21:11.910 --> 00:21:17.160
+And then the idea is tracking how did people make those pigments over time, +
+00:21:17.960 --> 00:21:23.500
+over time, and specifically in Asian art. +
+00:21:24.180 --> 00:21:25.780
+Is this the Dharmra of Puna? +
+00:21:27.720 --> 00:21:31.400
+No, this is Mapping Color in History. +
+00:21:31.430 --> 00:21:33.140
+I don't think it's up here. +
+00:21:33.230 --> 00:21:33.700
+Sorry about that. +
+00:21:34.040 --> 00:21:40.620
+somewhere. That's all right. I'll find it. Keep talking. Okay. So the front end is great, +
+00:21:40.900 --> 00:21:46.220
+you know, the public end, this is, people can explore by pigments and then +
+00:21:46.500 --> 00:21:51.940
+see the images that contain those pigments. Now in the back end, what the +
+00:21:52.480 --> 00:22:00.699
+researchers will be able to do is correlate exactly which point of a painting the analysis was done +
+00:22:00.720 --> 00:22:04.420
+at. So they have like this deep zoom image viewer where they'll zoom in and +
+00:22:04.480 --> 00:22:10.320
+they'll select the point where that was taken from.
And so, you know, how else +
+00:22:10.440 --> 00:22:15.760
+would you do that other than a digital interface to indicate on an image of a +
+00:22:15.820 --> 00:22:20.120
+painting where that spectral analysis was performed? +
+00:22:21.580 --> 00:22:24.700
+Sounds almost like astronomy in a weird way. +
+00:22:24.720 --> 00:22:24.960
+Oh yeah. +
+00:22:27.180 --> 00:22:32.740
+We zoomed into here and we took a different spectrum of the painting and +
+00:22:32.780 --> 00:22:36.280
+we realized that it's actually identical to this, something crazy like that, right? +
+00:22:36.650 --> 00:22:37.320
+Yeah, yeah. +
+00:22:39.840 --> 00:22:40.080
+Nice. +
+00:22:40.220 --> 00:22:42.820
+That's right, yeah, so it's essentially a pigments database. +
+00:22:45.320 --> 00:22:46.360
+Yeah, wild. +
+00:22:47.960 --> 00:22:54.039
+So the third category of these digital humanities projects that you put down was +
+00:22:54.060 --> 00:22:56.060
+like data extraction, transformation. +
+00:22:58.740 --> 00:23:00.480
+In data science, they often say, +
+00:23:00.920 --> 00:23:03.080
+80% of the work is the data wrangling, +
+00:23:03.280 --> 00:23:05.220
+which is like cleaning, organization, +
+00:23:05.960 --> 00:23:07.320
+just getting it so you could possibly +
+00:23:07.500 --> 00:23:09.040
+start asking questions about it. +
+00:23:09.460 --> 00:23:10.600
+I'm sure you all do a lot of that. +
+00:23:11.240 --> 00:23:11.800
+- Absolutely. +
+00:23:13.500 --> 00:23:17.600
+So often the very beginning of a project +
+00:23:18.020 --> 00:23:23.120
+might be an Excel sheet or several spreadsheets. +
+00:23:24.380 --> 00:23:28.900
+And the first task is to ingest these into a proper database. +
+00:23:29.580 --> 00:23:31.460
+Not so much MongoDB for us. +
+00:23:31.820 --> 00:23:34.420
+It's going into Postgres, we're a Django shop. +
+00:23:35.960 --> 00:23:37.120
+So it's going into Postgres.
+ +00:23:38.640 --> 00:23:44.280 +And yeah, no, that is probably the number one challenge of the early stage + +00:23:45.040 --> 00:23:47.800 +is figuring out what the right data model is, + +00:23:48.260 --> 00:23:51.700 +what the right relationships are to model the data. + +00:23:52.320 --> 00:24:03.360 +Doing that work is advantageous to everybody because, you know, it helps both the researchers who brought the data to think about it in a more organized way. + +00:24:03.810 --> 00:24:06.100 +I mean, they've been trying to do that and they have the spreadsheets. + +00:24:06.980 --> 00:24:14.860 +But now we're modeling out the data so that we can add it to database tables and then to use later. + +00:24:14.970 --> 00:24:16.400 +So that works out well for everybody. + +00:24:17.120 --> 00:24:21.200 +And yeah, absolutely cleaning the data, getting dates, + +00:24:21.460 --> 00:24:26.340 +working with fuzzy dates, being able to parse July of 2020 + +00:24:27.700 --> 00:24:31.680 +or summer of 2020 and handling kind of all of those cases + +00:24:31.750 --> 00:24:33.840 +so that we do get dates in the end. + +00:24:35.240 --> 00:24:38.800 +One of the crazy stories from data parsing history + +00:24:39.060 --> 00:24:40.980 +is one of the-- + +00:24:41.540 --> 00:24:43.040 +I can't remember exactly what it was. + +00:24:44.720 --> 00:24:47.800 +We talked about biology tools or genetics tools earlier. + +00:24:47.880 --> 00:24:50.520 +One of the groups that names genes + +00:24:50.940 --> 00:24:52.940 +had to change the name of a gene + +00:24:53.140 --> 00:24:55.560 +because it kept getting parsed by Excel into a date. + +00:24:56.980 --> 00:24:57.760 +- I remember that. + +00:24:57.840 --> 00:24:58.940 +I remember that. - So crazy. + +00:24:59.640 --> 00:25:00.040 +- Yes. + +00:25:00.660 --> 00:25:02.440 +- So like these are the weird edge cases + +00:25:02.640 --> 00:25:03.460 +I'm sure you run into. 
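The fuzzy-date handling described above, turning strings like "July of 2020" or "summer of 2020" into usable dates during ingest, can be sketched in plain Python. This is a minimal illustration, not Darth's actual ingest code; the function name and the season-to-month mapping are assumptions made for the example.

```python
import re
from datetime import date

# Hypothetical convention: pin each season to a representative start month.
SEASONS = {"winter": 1, "spring": 4, "summer": 7, "fall": 10, "autumn": 10}
MONTHS = {m: i for i, m in enumerate(
    ["january", "february", "march", "april", "may", "june", "july",
     "august", "september", "october", "november", "december"], start=1)}

def parse_fuzzy_date(text):
    """Parse fuzzy strings like 'July of 2020' or 'summer of 2020'.

    Returns the first day of the resolved month, or None if unparseable.
    """
    m = re.match(r"(?i)^\s*(\w+)\s+(?:of\s+)?(\d{4})\s*$", text)
    if not m:
        return None
    word, year = m.group(1).lower(), int(m.group(2))
    month = MONTHS.get(word) or SEASONS.get(word)
    return date(year, month, 1) if month else None
```

A library like dateparser covers the plain month-and-year cases, but season words generally need a project-specific convention like the mapping above, which is exactly the kind of decision a test suite around the ingest process pins down.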
+
+00:25:05.200 --> 00:25:06.440
+Like it's not even supposed to be a date. +
+00:25:06.580 --> 00:25:07.240
+Why is this a date? +
+00:25:07.420 --> 00:25:08.640
+I don't know, why is it a date? +
+00:25:08.720 --> 00:25:09.540
+Help me out here. +
+00:25:10.380 --> 00:25:11.160
+The code keeps crashing. +
+00:25:11.420 --> 00:25:13.820
+Like pandas parsed it as a date and it's not, or whatever. +
+00:25:15.600 --> 00:25:16.360
+Yeah, absolutely. +
+00:25:16.680 --> 00:25:16.820
+Yeah. +
+00:25:16.940 --> 00:25:17.120
+Yeah. +
+00:25:17.260 --> 00:25:22.200
+So yeah, usually lots of test suites around that ingest process until we've got it. +
+00:25:22.420 --> 00:25:28.100
+Now, once we've got it in, usually the research is ongoing and then we're able to provide them +
+00:25:28.220 --> 00:25:33.800
+now a new cleaned interface to do the additional data entry as the project is going. +
+00:25:34.280 --> 00:25:35.660
+And that's usually a win-win for everybody. +
+00:25:36.200 --> 00:25:36.520
+Sure. +
+00:25:36.760 --> 00:25:43.380
+And so this sort of ETL ingestion side of everything is it's like, don't worry. +
+00:25:43.760 --> 00:25:47.300
+Darth has got it for you. And then we'll provide you like a +
+00:25:47.960 --> 00:25:51.620
+database connection to start working or do you give them the +
+00:25:51.780 --> 00:25:55.160
+tools and then they kind of iterate on them? And how much? +
+00:25:55.790 --> 00:25:58.340
+How much is this you and how much is this you providing like +
+00:25:58.400 --> 00:26:00.680
+CLI tools and stuff or notebooks over to people? +
+00:26:02.580 --> 00:26:06.660
+I'd say most of the people that we're working with +
+00:26:06.820 --> 00:26:10.620
+are aware of the technical tools, but they don't want a +
+00:26:10.480 --> 00:26:15.640
+database connection. So we are giving them, we're doing the ingest and then building a platform +
+00:26:15.990 --> 00:26:21.980
+where they can begin interacting with their data.
Yeah, I'm sure they don't want one. + +00:26:23.470 --> 00:26:28.420 +Maybe you give them an app though, right, with like Elasticsearch and other things that they can... + +00:26:28.470 --> 00:26:31.020 +No, absolutely. Yeah, that's what we do. Yeah, okay. + +00:26:31.370 --> 00:26:36.860 +Yeah, we give them a web platform to begin exploring, to begin publishing. + +00:26:38.960 --> 00:26:46.420 +So I was thinking that you said you're a Django shop, which is cool. + +00:26:47.820 --> 00:26:51.760 +It sounds, though, to me like describing what you're doing, just imagining how this is. + +00:26:52.400 --> 00:26:55.300 +You're probably creating these projects often. + +00:26:56.720 --> 00:27:00.460 +How often does one of these projects actually last? + +00:27:02.240 --> 00:27:04.240 +Or how many of them do you iterate? + +00:27:05.740 --> 00:27:06.440 +I'm trying to get a sense. + +00:27:06.680 --> 00:27:09.680 +Do you work on stuff for a year or is it like every two weeks we're on a new project? + +00:27:11.100 --> 00:27:21.920 +It's why I think of us as like an agency because we get to work on greenfield projects fairly often, like you're imagining, which would not be the case normally at a big university IT department. + +00:27:23.420 --> 00:27:28.860 +So, you know, maybe two or three projects a year, two or three big ones a year. + +00:27:29.510 --> 00:27:35.960 +And then we have to put to bed a few a year as well because these things, they're funded with grant money. + +00:27:36.440 --> 00:27:38.940 +and then the grant money runs out, and it's time. + +00:27:39.180 --> 00:27:41.280 +And then we have to figure out, what do we do with it now? + +00:27:41.340 --> 00:27:46.620 +We don't want to lose the data and this way of presenting it. + +00:27:46.680 --> 00:27:48.700 +But we can't keep paying for Elasticsearch. + +00:27:49.560 --> 00:27:50.200 +Yeah, of course. + +00:27:50.800 --> 00:27:51.680 +Certainly, we're going to dive into that. 
+ +00:27:53.020 --> 00:27:54.360 +But let's save that for the end. + +00:27:54.400 --> 00:27:57.500 +It seems like that's the arc of the story of these things. + +00:27:57.600 --> 00:28:02.400 +But I certainly think it's something that you don't think about that much, right? + +00:28:02.680 --> 00:28:06.240 +Like you said, it was only $100 a month for this, and we got a big grant. + +00:28:06.380 --> 00:28:07.180 +There's a bunch of, no big deal. + +00:28:07.210 --> 00:28:08.480 +But like when the grant's out, + +00:28:08.680 --> 00:28:10.160 +who's on the hook for a hundred dollars a month + +00:28:11.240 --> 00:28:13.260 +and making sure it survives upgrades + +00:28:13.700 --> 00:28:15.260 +and all that kind of business. + +00:28:16.660 --> 00:28:17.080 +No, that's right. + +00:28:17.700 --> 00:28:17.780 +Yeah. + +00:28:17.970 --> 00:28:21.240 +So my original question when I started on this path + +00:28:21.400 --> 00:28:22.660 +was thinking like, do you, + +00:28:25.380 --> 00:28:26.560 +how do you get started on these? + +00:28:26.560 --> 00:28:27.900 +Do you have like a big framework + +00:28:28.760 --> 00:28:30.440 +or a cookie cutter sort of thing or something? + +00:28:30.530 --> 00:28:31.700 +Like this is how we do it + +00:28:31.810 --> 00:28:33.520 +because it plugs into all this other automation + +00:28:33.750 --> 00:28:35.920 +and tools we built for the last 10 projects. + +00:28:36.400 --> 00:28:37.700 +You know, that's kind of a unique position. + +00:28:39.400 --> 00:28:47.420 +A lot of companies build one website for themselves and that's their app or they're an agency that goes across so much variation they can't do that kind of stuff, right? + +00:28:48.680 --> 00:28:49.060 +That's right. + +00:28:49.620 --> 00:28:50.000 +That's right. + +00:28:51.640 --> 00:28:52.240 +That's a good question. + +00:28:52.840 --> 00:28:56.300 +We have things that we reuse. 
+ +00:28:56.540 --> 00:29:06.080 +Some of them are open source, different, you know, like search components and things that we maintain that we'll use across projects. + +00:29:06.570 --> 00:29:09.900 +And we have tried to do the cookie cutter Django project. + +00:29:10.720 --> 00:29:13.200 +The truth is each project is different enough. + +00:29:13.640 --> 00:29:13.980 +Yeah. + +00:29:14.440 --> 00:29:24.120 +Really, we like to evaluate it from first principles as we're evaluating it and thinking, what is the best technology to use? + +00:29:26.920 --> 00:29:32.280 +Yeah, so we don't have a cookie cutter. + +00:29:32.320 --> 00:29:38.520 +We don't have a kind of a meta framework for bootstrapping them because they're sufficiently different from each other. + +00:29:40.639 --> 00:29:41.680 +I find that too. + +00:29:42.820 --> 00:29:47.780 +The idea of how we could just grab this cookie cutter or copier. + +00:29:47.880 --> 00:29:48.880 +Are you familiar with copier? + +00:29:49.320 --> 00:29:50.740 +People out there might be familiar with that. + +00:29:50.840 --> 00:29:56.360 +It's a little bit like cookie cutter with the bonus that you can update it later + +00:29:56.800 --> 00:29:58.220 +if you change your mind about something, + +00:29:58.300 --> 00:30:01.840 +like actually change this project to use Postgres rather than SQLite or something, + +00:30:03.800 --> 00:30:04.480 +which is pretty cool. + +00:30:04.640 --> 00:30:08.759 +But every time that I do, every time I try to work with one of those projects, + +00:30:09.600 --> 00:30:11.900 +even ones that I've created for myself, I'm not hating on anyone. + +00:30:12.380 --> 00:30:16.660 +I'm like, oh, it's like 75% awesome and 25% I just got to take this stuff out. + +00:30:17.720 --> 00:30:19.600 +You know, I'll just do it from scratch. + +00:30:19.700 --> 00:30:20.700 +It's not how hard is this? 
+
+00:30:20.780 --> 00:30:23.240
+I'll just create a few folders and put a few things in there, +
+00:30:23.740 --> 00:30:27.040
+and I'll copy the pyproject.toml, like the one thing that's like, +
+00:30:27.120 --> 00:30:27.900
+how do I do this again? +
+00:30:27.940 --> 00:30:29.620
+I'll just copy that, and we're good to go. +
+00:30:33.880 --> 00:30:35.460
+Yeah, I mean, that's what I find. +
+00:30:35.660 --> 00:30:36.220
+That's what I find. +
+00:30:36.440 --> 00:30:40.080
+I find it seems like a really brilliant idea, but in practice, +
+00:30:41.380 --> 00:30:42.920
+it hasn't saved us time yet. +
+00:30:44.080 --> 00:30:45.480
+No, I mean, maybe it's a case study. +
+00:30:45.620 --> 00:30:47.480
+Like, okay, let's see what they're doing for this one. +
+00:30:47.520 --> 00:30:49.360
+Oh, that is interesting how they're integrating +
+00:30:49.780 --> 00:30:52.580
+this other thing maybe, but as a true foundation, +
+00:30:52.720 --> 00:30:55.060
+I find it in theory awesome. +
+00:30:55.480 --> 00:30:58.080
+In practice, I just end up not doing it for various reasons. +
+00:30:58.440 --> 00:30:58.780
+Don't know why. +
+00:31:04.720 --> 00:31:05.680
+I'm going to save this for later. +
+00:31:06.460 --> 00:31:07.820
+Because the question I'm about to ask you +
+00:31:07.780 --> 00:31:10.760
+is going to send us just down a rat hole. +
+00:31:11.460 --> 00:31:17.300
+So instead, before we go down the rat hole, maybe we could, not that one, maybe we could +
+00:31:17.380 --> 00:31:23.340
+talk about, I mean, you talked about something, but let's maybe just feature some of the projects +
+00:31:23.520 --> 00:31:25.520
+that are maybe more well-known that you guys have done. +
+00:31:26.380 --> 00:31:26.480
+Sure. +
+00:31:27.060 --> 00:31:27.420
+Yeah, good. +
+00:31:28.020 --> 00:31:31.600
+So yeah, one of them is called the Amendments Project. +
+00:31:33.519 --> 00:31:37.380
+And this is, I didn't know this until I started working on this project.
+
+00:31:37.600 --> 00:31:45.260
+that there have been thousands of, I think it's at least 22,000 proposed amendments +
+00:31:45.930 --> 00:31:49.600
+to the United States Constitution that never went anywhere. +
+00:31:50.530 --> 00:31:55.840
+And so kind of the goal of this project is to show that there have been lots of attempts +
+00:31:56.660 --> 00:32:01.140
+to amend the Constitution, but actually the Constitution is frozen. +
+00:32:01.990 --> 00:32:06.940
+I mean, it's not actually amendable anymore, at least not in the politics of any time recently. +
+00:32:07.740 --> 00:32:16.080
+Yeah, yeah. So this is a database. I cannot imagine a situation where the U.S. Constitution +
+00:32:16.400 --> 00:32:22.000
+gets amended. It has to be unanimous across all the states, right? Is that right? I can't remember, +
+00:32:22.160 --> 00:32:25.400
+it's been a long time. I can't remember off the top of my head if it has to be unanimous, but it certainly +
+00:32:25.580 --> 00:32:30.519
+has to be, it's got to be, yeah, it's got to be pretty darn close, if not +
+00:32:30.540 --> 00:32:36.480
+all the states are in. Yeah, it's like, you know, time travel, or traveling at the speed of light, +
+00:32:36.960 --> 00:32:39.200
+could be theoretically possible, probably not going to happen. +
+00:32:40.840 --> 00:32:44.640
+No, it's hard to see. It's hard to see. Yeah. So this is from a historian, +
+00:32:45.700 --> 00:32:51.560
+historian at Harvard. Interesting. Okay. So it's a database of, with the full text +
+00:32:52.740 --> 00:33:00.180
+from all of these amendments. And, you know, from the public's point of view, +
+00:33:00.480 --> 00:33:12.340
+it's a Postgres full-text vector search interface for finding and filtering through all of the different amendments that have been proposed. +
+00:33:13.980 --> 00:33:14.200
+Awesome. +
+00:33:14.480 --> 00:33:14.660
+Okay. +
+00:33:18.080 --> 00:33:18.640
+Let's see.
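They describe the amendments search as Postgres full-text search behind the web app. Purely as a self-contained sketch of the same index-and-rank idea, here is an analogue using SQLite's FTS5 module; the table and column names are made up for illustration, and this is not the project's actual code.

```python
import sqlite3

# In production this would be Postgres full-text search (tsvector/tsquery);
# SQLite's FTS5 stands in here only so the sketch runs on its own.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE amendments USING fts5(title, full_text)")
conn.executemany(
    "INSERT INTO amendments (title, full_text) VALUES (?, ?)",
    [
        ("Term limits", "An amendment proposing term limits for Congress."),
        ("Balanced budget", "An amendment requiring a balanced federal budget."),
    ],
)

# Rank matches for a user query; bm25() scores better matches lower.
rows = conn.execute(
    "SELECT title FROM amendments WHERE amendments MATCH ? "
    "ORDER BY bm25(amendments)",
    ("budget",),
).fetchall()
print(rows)
```

In Postgres the equivalent building blocks are `to_tsvector`/`to_tsquery` with `ts_rank` for ordering, which also bring stemming and language-aware tokenization.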
+ +00:33:20.660 --> 00:33:21.220 +I love it. + +00:33:23.600 --> 00:33:24.700 +Yeah, this is a nice looking site. + +00:33:26.000 --> 00:33:27.240 +We work with a designer. + +00:33:27.860 --> 00:33:28.280 +Okay. + +00:33:29.060 --> 00:33:29.240 +Okay. + +00:33:29.240 --> 00:33:29.320 +Yeah. + +00:33:29.400 --> 00:33:31.800 +Yeah. Of course, like an agency would, right? + +00:33:32.580 --> 00:33:33.040 +Yep. Yep. + +00:33:36.840 --> 00:33:42.380 +Nice. So we'll get a really pretty rich search interface and then off you go. + +00:33:42.660 --> 00:33:44.280 +I have no idea even what I would search for. + +00:33:45.720 --> 00:33:47.460 +Yeah. Well, you can always search for something religious, + +00:33:47.900 --> 00:33:50.600 +something abortion related. There's going to be lots of things there. + +00:33:50.860 --> 00:33:53.520 +I thought all those also like guns, but like, I don't want to go down. + +00:33:53.640 --> 00:33:56.020 +I'm not sure I even want to go down that route. + +00:33:57.020 --> 00:33:58.320 +Awesome. This looks super useful. + +00:33:59.440 --> 00:34:02.940 +Maybe someday we'll have a functional government again. + +00:34:03.200 --> 00:34:03.460 +We'll see. + +00:34:05.400 --> 00:34:06.120 +Let's change it. + +00:34:06.520 --> 00:34:07.780 +Or maybe we'll go down in its folklore. + +00:34:08.610 --> 00:34:09.300 +Like, look, it used to work. + +00:34:09.300 --> 00:34:09.460 +All right. + +00:34:10.190 --> 00:34:10.520 +So, yeah. + +00:34:10.530 --> 00:34:21.440 +So another really great project, at least from a content point of view, that's interesting, the research that it's doing, is the Finn Folklore Database. + +00:34:23.419 --> 00:34:39.940 +So in Celtic storytelling, you know, moms have been telling stories to daughters and people have been telling stories for a very long time, hundreds or a thousand years. 
+
+00:34:42.280 --> 00:34:49.540
+Finn MacCumhaill, who is a hero from Irish mythology, some of it +
+00:34:49.700 --> 00:34:56.820
+based in, you know, historical events, but it goes back so far. So there +
+00:34:56.919 --> 00:35:03.020
+are many hundreds or thousands of these stories that have been spread, +
+00:35:03.160 --> 00:35:06.640
+and versions of these stories that have been told. And so some of them +
+00:35:06.540 --> 00:35:12.220
+are audio recordings, where some researcher has gone out to an island +
+00:35:12.250 --> 00:35:18.180
+off the coast of Scotland and recorded somebody telling, you know, their version of the hero of Finn +
+00:35:18.360 --> 00:35:25.000
+and his band of heroes, you know, they defend Scotland and Ireland from +
+00:35:25.480 --> 00:35:31.959
+invaders and attackers. Very exciting stories and stuff, and a team of characters. +
+00:35:33.580 --> 00:35:37.440
+So there's audio recordings and then there's documents, +
+00:35:37.830 --> 00:35:39.920
+like written documents that contain these. +
+00:35:39.920 --> 00:35:44.280
+And so this is a database of kind of all of those all in one place with, +
+00:35:45.550 --> 00:35:51.980
+on the public side, a nice search interface for discovering them, +
+00:35:52.330 --> 00:35:54.700
+you know, either using the map view or searching. +
+00:35:55.340 --> 00:35:56.140
+Yeah, that's cool. +
+00:35:56.540 --> 00:35:59.320
+I got my map view for some random thing I searched about here. +
+00:36:00.380 --> 00:36:00.820
+Amazing. +
+00:36:02.980 --> 00:36:06.720
+Yeah, but this is pretty interesting, all these different tellings and stuff. +
+00:36:07.610 --> 00:36:14.020
+Oh, and yeah, one of the big challenges with this project is that it's fully internationalized. +
+00:36:14.330 --> 00:36:16.000
+So it's available in English.
+
+00:36:16.220 --> 00:36:21.040
+Everything is available in English, Scottish Gaelic, and Irish Gaelic. +
+00:36:21.370 --> 00:36:22.960
+But that extends into the database. +
+00:36:23.330 --> 00:36:26.900
+So usually people have multiple names recorded for them. +
+00:36:28.180 --> 00:36:33.100
+And so, yeah, you may have one person with any number of names in different languages, +
+00:36:33.260 --> 00:36:36.260
+sometimes more than one Scottish name, that kind of thing. +
+00:36:36.280 --> 00:36:43.180
+And so the data model on this one is quite messy, but sensible. +
+00:36:44.400 --> 00:36:47.100
+But yeah, it's quite a lot of different kinds of data to wrangle. +
+00:36:47.320 --> 00:36:49.220
+And then with all of the translations for each thing. +
+00:36:49.960 --> 00:36:50.460
+Yeah, that's wild. +
+00:36:50.540 --> 00:36:56.520
+It's not just, we need the user interface of this thing to be translated. +
+00:36:57.920 --> 00:36:58.720
+That's way more, right? +
+00:36:59.420 --> 00:37:04.180
+Right. Yeah. Yeah. It is that. It is that. And then it is also, yes, all the items in the database +
+00:37:04.820 --> 00:37:06.380
+have a translation or can. +
+00:37:06.450 --> 00:37:11.940
+Okay. Yeah. I can see how that makes a lot of sense. You want to work in the native language +
+00:37:12.460 --> 00:37:15.180
+of the people who did that part of the folklore or whatever, right? +
+00:37:17.520 --> 00:37:21.280
+Yeah. Well, and people are still speaking those languages. So people who would use this +
+00:37:21.700 --> 00:37:25.499
+to, you know, like somebody may have heard a story from their mom or dad +
+00:37:25.680 --> 00:37:28.000
+and now would like to find other versions of that story. +
+00:37:28.430 --> 00:37:32.200
+And they live in a part of Scotland where they speak Scottish Gaelic as their first language. +
+00:37:32.760 --> 00:37:33.700
+Right, right. Very cool. +
+00:37:33.700 --> 00:37:34.540
+They can still access the site.
+
+00:37:35.230 --> 00:37:39.960
+Yeah. And then that Mapping Color in History one, that's another one of the public ones that you
+
+00:37:40.740 --> 00:37:41.940
+said is pretty major.
+
+00:37:43.620 --> 00:37:51.339
+Yeah, that's right. Yeah. So yeah, that's a pigments database. You can search by either
+
+00:37:51.360 --> 00:37:57.520
+English color names like blue and find all of these Asian paintings that have blue or a particular kind
+
+00:37:57.520 --> 00:38:05.440
+of pigment of how they made the blue. Yeah, nice. So what's the open source story? You're creating
+
+00:38:05.440 --> 00:38:11.740
+all these apps, maybe some of these frameworks, there's got to be some tools. Is there a big
+
+00:38:12.540 --> 00:38:19.579
+desire or already an effort to have a lot of these things open source, or is it too niche, or is it just
+
+00:38:19.600 --> 00:38:26.560
+like, this is the advantage Harvard has, other universities don't get this? No, it's, it's
+
+00:38:26.820 --> 00:38:32.720
+something we talk about quite a bit. Usually these things start, usually they start closed source
+
+00:38:33.100 --> 00:38:38.460
+during development, and then we, and then we work with the faculty and we talk about
+
+00:38:39.200 --> 00:38:45.099
+how we can take, you know, like the repo for the, for the web app, how we can take that public.
+
+00:38:46.000 --> 00:38:48.640
+And so we've done that for a number of projects.
+
+00:38:48.760 --> 00:38:49.620
+Not all of them are.
+
+00:38:50.720 --> 00:38:54.420
+But the ideal is that they all make their way into the open,
+
+00:38:55.440 --> 00:38:57.220
+and especially when they become archived.
+
+00:38:58.580 --> 00:38:58.740
+Sure.
+
+00:38:59.220 --> 00:39:01.500
+Yeah, that's a good way to help them live on.
+
+00:39:02.320 --> 00:39:08.800
+And they might even go into GitHub's Arctic Vault, which is crazy.
+
+00:39:08.840 --> 00:39:10.180
+I don't know if people know about that out there,
+
+00:39:10.360 --> 00:39:17.240
+but GitHub has quite a while ago started taking copies of all of the repos and putting them,
+
+00:39:18.200 --> 00:39:20.780
+backing them up and storing them in the Arctic Vault.
+
+00:39:21.700 --> 00:39:22.140
+It's kind of cool.
+
+00:39:23.150 --> 00:39:26.280
+I really, really, really hope we never need that, but it's kind of neat.
+
+00:39:26.920 --> 00:39:27.380
+Yeah, me too.
+
+00:39:28.380 --> 00:39:31.140
+Usually universities have their own archival system,
+
+00:39:31.440 --> 00:39:38.680
+so any important research data is usually part of that system as well.
+
+00:39:39.080 --> 00:39:39.460
+I see.
+
+00:39:39.590 --> 00:39:39.700
+Okay.
+
+00:39:39.930 --> 00:39:40.060
+Yeah.
+
+00:39:40.840 --> 00:39:48.960
+Obviously, right? Like, I'm, I can't remember where it was, it was somewhere, I think it was South Korea
+
+00:39:49.040 --> 00:39:55.180
+or Taiwan, where like seven years of government data got lost or something like that. It was really,
+
+00:39:55.340 --> 00:39:59.280
+really bad. Recently there was a fire, and I think they had backups, but maybe just in the building.
+
+00:39:59.750 --> 00:40:02.979
+You know, like, we'll put that out, we'll back it up to the hard drive over here.
+
+00:40:05.920 --> 00:40:06.280
+Not good.
+
+00:40:06.880 --> 00:40:07.280
+No.
+
+00:40:09.180 --> 00:40:09.620
+Not good.
+
+00:40:09.640 --> 00:40:11.060
+You definitely want this stuff to survive.
+
+00:40:11.060 --> 00:40:14.460
+I mean, academia has this history of, like,
+
+00:40:15.520 --> 00:40:17.040
+tomes that have survived the past
+
+00:40:17.680 --> 00:40:21.500
+and really, really long-lived information, right?
+
+00:40:23.400 --> 00:40:26.160
+Besides the Library of Alexandria or something like that, maybe.
+
+00:40:26.680 --> 00:40:27.460
+That's what we want.
+
+00:40:27.700 --> 00:40:28.160
+That's what we want.
+
+00:40:28.220 --> 00:40:30.400
+We want it to, yeah, we want it to last.
+
+00:40:32.020 --> 00:40:32.300
+Absolutely.
+
+00:40:32.680 --> 00:40:37.820
+So maybe that's a good time to sort of talk about the trailing end.
+
+00:40:37.820 --> 00:40:40.300
+I think there's a lot of interesting things going on here.
+
+00:40:43.180 --> 00:40:49.840
+It's like you've run out of money, not because you actually ran out of money.
+
+00:40:50.380 --> 00:40:53.620
+The grant is done, and you've either spent or given back or whatever
+
+00:40:53.810 --> 00:40:55.580
+with the remaining little bits of money.
+
+00:40:56.800 --> 00:40:58.260
+It's always a weird balance with research.
+
+00:40:58.620 --> 00:41:01.800
+It's like, oh, we got $3,000 left on this research grant.
+
+00:41:01.900 --> 00:41:02.780
+What are we going to do with it?
+
+00:41:02.840 --> 00:41:03.900
+It's not like, well, we're going to give it back.
+
+00:41:04.040 --> 00:41:04.720
+We just didn't need it.
+
+00:41:04.960 --> 00:41:09.380
+It's like, we're going to find a way to like fund a student to do a little more work or
+
+00:41:09.460 --> 00:41:09.540
+whatever.
+
+00:41:09.760 --> 00:41:11.880
+But eventually the grant is over.
+
+00:41:14.440 --> 00:41:18.380
+Then you've got some expensive app, access to a big database, because it needs a big search
+
+00:41:18.640 --> 00:41:20.740
+or a lot of compute or something.
+
+00:41:21.760 --> 00:41:22.040
+That's right.
+
+00:41:22.460 --> 00:41:27.720
+Everything during like, I mean, anything, anything that's a, that's a Django app.
+
+00:41:28.520 --> 00:41:38.600
+We deploy to AWS using containers, which isn't the cheapest way to host anything.
+
+00:41:40.699 --> 00:41:43.700
+But that's for the most part the Harvard way.
+
+00:41:45.680 --> 00:41:47.780
+And it is robust and is reliable.
+
+00:41:48.540 --> 00:41:58.460
+And we don't have a DevOps person on call on the weekend to rescue one of these
+
+00:41:58.480 --> 00:42:04.780
+apps, so having, having them reliable is good. Okay, so it's, so it's on AWS and
+
+00:42:05.120 --> 00:42:09.320
+paying, you know, paying for the containers, paying for that Elasticsearch cluster,
+
+00:42:09.800 --> 00:42:16.640
+our RDS Postgres database. Okay, well, even if somebody wants to start paying
+
+00:42:16.740 --> 00:42:19.740
+for that out-of-pocket, all of those little services, they add up, they add up
+
+00:42:19.780 --> 00:42:26.359
+to enough that we need to do something when the project hits end of life. And so
+
+00:42:26.420 --> 00:42:53.720
+So our gold standard that we've developed so far is asking, can this become a static website? Can we bake this out into all HTML files and acknowledge that there will be some trade-offs? We will trade off some searching. It's not going to have Elasticsearch. It doesn't mean that it won't have any search, though. So we'll trade out Elasticsearch, and it'll be very difficult to add new data.
+
+00:42:54.360 --> 00:43:20.360
+But that's okay because it's being archived. So can we get it into a static site? And that's challenging depending on how you've set it up. So we now have projects where we set them up from the beginning to be archivable like this. And one of them is called Water Stories. And it was a companion to an art installation at the Radcliffe Institute on the Harvard campus.
+
+00:43:21.440 --> 00:43:30.420
+And so this was this live site during the duration of the art installation where people could come in and add stories that they had about water onto an iPad.
+
+00:43:31.960 --> 00:43:33.200
+And then those went up to our database.
+
+00:43:35.500 --> 00:43:43.980
+We built that with something called Django Bakery, which if you opt in and you use all of their class-based views the way that they're meant to be used,
+
+00:43:45.680 --> 00:43:48.520
+then you can bake this out into static files when you're done.
+
+00:43:49.000 --> 00:43:49.700
+Very low effort.
+
+00:43:50.310 --> 00:43:50.880
+That was perfect.
+
+00:43:51.320 --> 00:44:15.740
+That is such a cool idea. And mad props to them for ASCII art logos. Come on now. I feel like that should be in the view source if it's not. But this is such a cool idea because you can just take a working site. You guys are a Django shop, so a lot of your sites are written in Django and you just go, make it static, right? More or less?
+
+00:44:15.800 --> 00:44:20.900
+Yes. And what's really great about it is if they wanted to make a change, and they have,
+
+00:44:21.040 --> 00:44:24.620
+they have asked since we, since we made it static, they've asked for a couple of changes.
+
+00:44:25.300 --> 00:44:31.220
+So locally, I just Docker compose up this whole application, make the change in the Django admin
+
+00:44:31.840 --> 00:44:32.740
+and rebake the site.
+
+00:44:34.020 --> 00:44:34.620
+Yeah, that's cool.
+
+00:44:34.980 --> 00:44:35.820
+It can still be updated.
+
+00:44:36.540 --> 00:44:40.660
+Something, if you've never tried this, like something like, hey, can we just add one more
+
+00:44:40.840 --> 00:44:44.720
+menu item? And you're like, no, no, no, we're not adding the menu item because you know what that
+
+00:44:44.700 --> 00:44:51.180
+means, we're changing 7,300 pages because they all bake in the whole HTML, right?
+
+00:44:51.680 --> 00:44:52.060
+Exactly.
+
+00:44:52.530 --> 00:44:53.140
+Yeah, exactly.
+
+00:44:53.230 --> 00:44:58.060
+But if that's in my Django database, in my SQLite file, then no problem at all.
+
+00:44:58.570 --> 00:44:59.840
+Because then I just rebake it.
+
+00:45:00.270 --> 00:45:00.440
+Yeah.
+
+00:45:00.780 --> 00:45:02.240
+Yeah, absolutely.
+
+00:45:03.960 --> 00:45:07.480
+So I think this is super neat.
+
+00:45:07.540 --> 00:45:14.720
+There's also Frozen Flask, if I could get rid of all the ads.
+
+00:45:14.750 --> 00:45:16.540
+I do not need a Yeti thing, whatever that is.
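The bake-out pattern described above, rendering every record in the database into a plain HTML file and rebaking whenever the data changes, can be sketched with nothing but the standard library. This is a toy illustration of the idea, not Django Bakery's actual API; the `stories` table, the `bake` function, and the page template are all made up for the example:

```python
import sqlite3
from pathlib import Path
from string import Template

# A stand-in for a real template engine: one page per database row.
PAGE = Template("<html><body><h1>$title</h1><p>$body</p></body></html>")


def bake(db_path: str, out_dir: str) -> list:
    """Render every row of a hypothetical 'stories' table to a static HTML file."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    con = sqlite3.connect(db_path)
    baked = []
    for slug, title, body in con.execute("SELECT slug, title, body FROM stories"):
        page = out / f"{slug}.html"
        page.write_text(PAGE.substitute(title=title, body=body))
        baked.append(page.name)
    con.close()
    return baked
```

When the data changes, you run the bake again and redeploy the folder of HTML files, which is essentially the "rebake the site" workflow described in the conversation.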
+
+00:45:17.700 --> 00:45:20.320
+The glass, not the mythical thing.
+
+00:45:21.260 --> 00:45:27.680
+But Frozen Flask, which does a similar thing for Flask apps,
+
+00:45:27.840 --> 00:45:30.740
+if you're a Flask person, probably would work with Quart.
+
+00:45:31.040 --> 00:45:32.460
+Don't know for sure, but probably.
+
+00:45:33.600 --> 00:45:35.720
+So that's a pretty interesting idea as well.
+
+00:45:36.490 --> 00:45:37.060
+Throw that in there.
+
+00:45:38.200 --> 00:45:42.140
+But also, what else?
+
+00:45:42.900 --> 00:45:46.420
+Also, you talked about search, right?
+
+00:45:46.660 --> 00:45:49.920
+That can be such a problem.
+
+00:45:50.560 --> 00:45:56.040
+And I'm a huge fan of your recommendation here with PageFind.
+
+00:45:58.620 --> 00:45:59.760
+Tell us about PageFind.
+
+00:46:00.140 --> 00:46:04.299
+So this has been, I think it's been a bit of a game changer in how
+
+00:46:04.320 --> 00:46:10.640
+functional one of these archived sites can remain. So we're actually, with that amendments
+
+00:46:11.200 --> 00:46:18.640
+website that searches across 22,000 full texts of amendments, we are in the process of sunsetting
+
+00:46:18.800 --> 00:46:24.620
+that, and that will become a static site. And for that search, we already have an internal demo
+
+00:46:24.840 --> 00:46:33.020
+that proves that we can replace that Postgres full-text search with PageFind. You lose vector search.
+
+00:46:33.980 --> 00:46:38.200
+Yeah. You kind of just get really true keyword matching.
+
+00:46:39.670 --> 00:46:44.820
+Yeah. Yeah, that's right. But you still get filtering. I mean, and really faceting and filtering
+
+00:46:45.520 --> 00:46:52.460
+is when it comes to discovery of things, I mean, I find that's really what's useful.
So filtering
+
+00:46:52.800 --> 00:46:59.500
+these amendments by state, or by the Congress that was active at the time, or by the person who,
+
+00:47:02.280 --> 00:47:09.660
+who co-wrote it, all of those are totally great in PageFind. And the, the keyword search is just
+
+00:47:09.780 --> 00:47:14.240
+fine in PageFind. One of the things I really like about it is that it, it takes your index
+
+00:47:15.300 --> 00:47:20.520
+and it chops it up into lots of little files that can just fly across the network. So it's a very fast
+
+00:47:20.740 --> 00:47:28.140
+search. It's, it's not a huge network load, even if your index is initially very large. And it,
+
+00:47:28.160 --> 00:47:30.580
+it essentially cuts it up somewhat alphabetically.
+
+00:47:31.110 --> 00:47:36.800
+So if your search starts with T, or I should say a better word for audio,
+
+00:47:36.920 --> 00:47:43.400
+if it starts with W, then it will load up the index for words that start with W
+
+00:47:43.970 --> 00:47:46.180
+and fly that over the network instead of the whole thing.
+
+00:47:46.270 --> 00:47:49.400
+So it's pretty slick, and it has a great Python API.
+
+00:47:50.280 --> 00:47:53.960
+So to do the proof of concept for the amendments search,
+
+00:47:54.200 --> 00:48:00.860
+I just took a database dump and then manually indexed with a Python script into PageFind.
+
+00:48:01.060 --> 00:48:03.880
+Wait, there's a Python API for PageFind?
+
+00:48:04.820 --> 00:48:04.900
+Yeah.
+
+00:48:05.080 --> 00:48:13.100
+So the way PageFind works, I should have said that, is the way most people will use it: normally, PageFind consumes HTML.
+
+00:48:13.780 --> 00:48:16.880
+So you give it access to your dist folder.
+
+00:48:18.280 --> 00:48:18.880
+Oh, okay.
+
+00:48:20.040 --> 00:48:22.700
+And then it crawls through all of your HTML files.
+
+00:48:22.840 --> 00:48:31.800
+And you can do great things like adding little HTML tags that are just for PageFind that give it the filtering ability or that you want to sort by something.
+
+00:48:32.260 --> 00:48:33.300
+And so that's great.
+
+00:48:34.100 --> 00:48:41.600
+Or you can just call PageFind from Python or from TypeScript and just build that index manually.
+
+00:48:42.400 --> 00:48:43.200
+Well, thanks a lot, David.
+
+00:48:43.320 --> 00:48:44.680
+I have another thing I've got to go research.
+
+00:48:44.900 --> 00:48:45.380
+This is awesome.
+
+00:48:46.300 --> 00:48:48.060
+I'm a huge fan of PageFind, as I said.
+
+00:48:48.080 --> 00:48:53.540
+My personal website, mkennedy.codes, is just pure static.
+
+00:48:53.570 --> 00:48:56.020
+It starts in Markdown and ends up in HTML.
+
+00:48:56.440 --> 00:48:59.100
+But if you add PageFind in, you get a super rich,
+
+00:48:59.320 --> 00:49:00.740
+if you want to just know, you want to talk about,
+
+00:49:00.860 --> 00:49:06.480
+like what was said about Docker, it shows you really nice results,
+
+00:49:07.500 --> 00:49:10.620
+pulling out the different parts of the page and sections that talk about it,
+
+00:49:10.720 --> 00:49:12.720
+like the headers and then what is said.
+
+00:49:12.760 --> 00:49:17.400
+And it even does like sub-word, you know,
+
+00:49:17.520 --> 00:49:20.300
+like if you just type doc, it finds all the words that match that.
+
+00:49:20.560 --> 00:49:23.520
+And what I really like about it is a couple of things is it's instant.
+
+00:49:23.800 --> 00:49:26.580
+It basically is like nearly instant.
+
+00:49:26.760 --> 00:49:29.800
+If you type a few things, it gets way faster because it's pulling down.
+
+00:49:29.910 --> 00:49:33.560
+And if you go and look in the network console here
+
+00:49:33.740 --> 00:49:35.000
+and you type something,
+
+00:49:35.210 --> 00:49:39.520
+you can see that it's actually pulling in these little tiny fragments,
+
+00:49:40.440 --> 00:49:43.120
+which this one's coming off disk cache in three milliseconds, right?
+
+00:49:43.240 --> 00:49:48.940
+But it breaks your index into a bunch of very small PageFind fragments
+
+00:49:50.600 --> 00:49:54.000
+that I think it starts with anything that starts with the word DO.
+
+00:49:54.260 --> 00:49:56.680
+These are all the prebuilt results and stuff like that, right?
+
+00:49:57.420 --> 00:49:58.360
+That's right. That's right.
+
+00:49:58.840 --> 00:49:59.820
+Yeah, that's super cool.
+
+00:50:00.620 --> 00:50:00.700
+Yeah.
+
+00:50:01.320 --> 00:50:09.740
+One of our open source projects that we maintain is a Vue.js component library for PageFind
+
+00:50:10.120 --> 00:50:13.780
+so that we can style it and reuse it across different projects.
+
+00:50:15.500 --> 00:50:16.460
+Oh, that's awesome.
+
+00:50:16.850 --> 00:50:17.280
+I love it.
+
+00:50:18.300 --> 00:50:19.760
+Yeah, I think this really unlocks it.
+
+00:50:19.760 --> 00:50:24.980
+And when you go to so many sites, like their documentation
+
+00:50:25.620 --> 00:50:28.760
+or just their web app, and the search is so bad.
+
+00:50:29.000 --> 00:50:32.500
+You type something, and it's like thinking, spinning, spinning,
+
+00:50:33.020 --> 00:50:34.040
+spinning, spinning.
+
+00:50:34.420 --> 00:50:37.800
+And then five seconds later, it gives you kind of janky results.
+
+00:50:38.240 --> 00:50:39.960
+And if you just throw a PageFind in there,
+
+00:50:41.740 --> 00:50:44.520
+you can't type fast enough to outrun the results.
+
+00:50:44.540 --> 00:50:44.980
+You know what I mean?
+
+00:50:45.280 --> 00:50:45.860
+No, that's right.
+
+00:50:45.860 --> 00:50:46.000
+Yeah.
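The fragment trick visible in the network console, splitting the index by word prefix into lots of small files so the browser only fetches the chunk matching what the user typed, can be sketched in a few lines. This is a toy illustration of the concept, not PageFind's real on-disk format; the function names and the two-character prefix scheme are made up:

```python
import json
from collections import defaultdict
from pathlib import Path


def write_fragments(index: dict, out_dir: str, prefix_len: int = 2) -> None:
    """Bucket a word -> pages index by word prefix and write one JSON file per bucket."""
    buckets = defaultdict(dict)
    for word, pages in index.items():
        buckets[word[:prefix_len]][word] = pages
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for prefix, words in buckets.items():
        (out / f"{prefix}.json").write_text(json.dumps(words))


def search(query: str, out_dir: str, prefix_len: int = 2) -> list:
    """Load only the fragment whose prefix matches the query, then scan just that chunk."""
    frag = Path(out_dir) / f"{query[:prefix_len]}.json"
    if not frag.exists():
        return []
    words = json.loads(frag.read_text())
    hits = []
    for word, pages in words.items():
        if word.startswith(query):  # sub-word matching, like typing "doc"
            hits.extend(p for p in pages if p not in hits)
    return hits
```

The payoff is the same as described on the show: each query touches one small file rather than the whole index, so results stay fast even when the full index is large.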
+
+00:50:46.900 --> 00:50:48.880
+Too many static site search solutions,
+
+00:50:49.040 --> 00:50:53.060
+they use like a JSON blob that you have to pull down
+
+00:50:53.700 --> 00:50:54.520
+and then iterate through.
+
+00:50:55.100 --> 00:50:57.060
+You know what's worse, and I see this a lot,
+
+00:50:57.200 --> 00:51:03.780
+would be if you go to google.com, and then you would say,
+
+00:51:04.400 --> 00:51:05.780
+effectively, site, colon, whatever,
+
+00:51:05.960 --> 00:51:06.880
+and then you search Docker.
+
+00:51:07.780 --> 00:51:10.040
+They basically pull that.
+
+00:51:11.080 --> 00:51:13.040
+They just say, search this, and you just
+
+00:51:13.180 --> 00:51:15.840
+get Google results for your site.
+
+00:51:15.930 --> 00:51:18.840
+And obviously, Google's fine, but it's just--
+
+00:51:18.840 --> 00:51:20.580
+No, I find that unusable, really.
+
+00:51:20.720 --> 00:51:21.100
+I do, too.
+
+00:51:21.210 --> 00:51:22.720
+It really-- you're like, ah, jeez.
+
+00:51:23.500 --> 00:51:25.500
+But now I'm super excited to realize
+
+00:51:25.650 --> 00:51:28.500
+I can do that from my dynamic content as well.
+
+00:51:29.740 --> 00:51:31.580
+So with the Python integration.
+
+00:51:32.070 --> 00:51:32.180
+OK.
+
+00:51:34.000 --> 00:51:34.100
+Nice.
+
+00:51:36.880 --> 00:51:38.660
+What about something truly static?
+
+00:51:38.780 --> 00:51:41.620
+Have you looked at Hugo and some of the other type of things?
+
+00:51:42.100 --> 00:51:42.340
+Sure.
+
+00:51:42.570 --> 00:51:47.220
+So, I see you've even got the tab up for the Tsumeb project,
+
+00:51:47.560 --> 00:51:49.040
+which is--
+
+00:51:50.260 --> 00:51:57.400
+that's essentially a database of many, many specimens taken
+
+00:51:57.620 --> 00:51:58.540
+from the Tsumeb mine.
+
+00:52:00.140 --> 00:52:00.580
+OK.
+
+00:52:01.440 --> 00:52:02.080
+Looks beautiful.
+
+00:52:02.920 --> 00:52:03.540
+Oh, it is.
+
+00:52:03.550 --> 00:52:03.660
+Yeah.
+
+00:52:03.770 --> 00:52:04.240
+Yeah, it is.
+
+00:52:04.700 --> 00:52:06.240
+So if you click on Minerals database,
+
+00:52:06.740 --> 00:52:07.920
+you open up that search interface
+
+00:52:08.100 --> 00:52:10.700
+and that's powered by PageFind.
+
+00:52:10.880 --> 00:52:11.840
+- Oh, this is?
+
+00:52:12.320 --> 00:52:12.620
+- Yes.
+
+00:52:16.160 --> 00:52:18.260
+I forget what I was...
+
+00:52:19.220 --> 00:52:19.560
+- I see.
+
+00:52:19.940 --> 00:52:21.620
+You've even like hooked into,
+
+00:52:21.980 --> 00:52:25.240
+I was thinking just like pure static, like Hugo, like...
+
+00:52:25.800 --> 00:52:27.200
+- Oh yes, yes, yes.
+
+00:52:27.460 --> 00:52:28.620
+So this is an Astro site.
+
+00:52:29.420 --> 00:52:32.680
+So for this website, we have this as an Astro site
+
+00:52:32.700 --> 00:52:34.080
+so that we have a little bit,
+
+00:52:34.260 --> 00:52:37.960
+because with Astro, they make it so easy to pull in Vue components.
+
+00:52:38.720 --> 00:52:39.280
+I see.
+
+00:52:39.420 --> 00:52:44.180
+So our PageFind is a custom Vue.js component library.
+
+00:52:44.920 --> 00:52:46.900
+With Astro, you can use React components.
+
+00:52:47.000 --> 00:52:48.620
+You can use Vue components.
+
+00:52:48.820 --> 00:52:51.640
+But what it does is it's just a static site generator.
+
+00:52:51.740 --> 00:52:51.880
+Yeah.
+
+00:52:52.460 --> 00:52:52.560
+Yeah.
+
+00:52:53.780 --> 00:52:54.180
+Fantastic.
+
+00:52:54.820 --> 00:52:58.780
+So a little bit more designable than Hugo or something.
+
+00:52:59.040 --> 00:52:59.800
+Here's your Markdown file.
+
+00:52:59.880 --> 00:53:00.420
+Good luck with that.
+
+00:53:00.480 --> 00:53:01.320
+Yeah, I love Hugo, though.
+
+00:53:01.560 --> 00:53:05.240
+Yeah, I use Hugo for different personal sites here and there.
+
+00:53:05.560 --> 00:53:07.780
+And it's just so fast and easy to get up and running.
+
+00:53:08.000 --> 00:53:08.940
+But it's great.
+
+00:53:08.940 --> 00:53:09.140
+I do too.
+ +00:53:09.620 --> 00:53:11.320 +That's what my website's written in, is in Hugo. + +00:53:13.540 --> 00:53:15.660 +But if I'm integrating with anything else, + +00:53:15.780 --> 00:53:17.120 +I used to kind of split it up. + +00:53:17.160 --> 00:53:19.300 +Like this part's Hugo and this part's a Python app. + +00:53:20.300 --> 00:53:22.300 +And it's pretty easy to get something + +00:53:22.440 --> 00:53:23.920 +that will take a bunch of Markdown files + +00:53:24.060 --> 00:53:27.160 +and just turn them into HTML and just put a page template + +00:53:27.300 --> 00:53:27.720 +around that. + +00:53:27.900 --> 00:53:31.740 +So I've kind of stepped away from mixing and matching that + +00:53:31.840 --> 00:53:32.700 +as much as I used to. + +00:53:32.940 --> 00:53:35.960 +So now if I've got a static section of a dynamic site. + +00:53:36.340 --> 00:53:40.500 +But that has nothing to do with the archival side of things, + +00:53:40.700 --> 00:53:40.860 +right? + +00:53:41.480 --> 00:53:43.780 +Because the idea is that the thing that I'm describing + +00:53:43.980 --> 00:53:44.900 +is gone on purpose. + +00:53:45.220 --> 00:53:45.620 +That's right. + +00:53:47.800 --> 00:53:48.280 +OK. + +00:53:51.560 --> 00:53:53.100 +So you've got some-- + +00:53:54.540 --> 00:53:55.760 +we've got Django Bakery. + +00:53:56.340 --> 00:54:02.780 +I threw out Frozen Flask, and I'm sure there's a ton more that neither of us are aware of at the moment. + +00:54:03.260 --> 00:54:10.800 +So Django Bakery was really good for that purpose, and we're keeping our eyes open for projects that it's a good fit for. + +00:54:11.730 --> 00:54:13.740 +But that was a pretty simple website. + +00:54:13.950 --> 00:54:16.580 +It needed a dynamic backend, but it was quite straightforward. + +00:54:17.420 --> 00:54:22.080 +And for Django Bakery, you have to opt into inheriting from their class-based views. + +00:54:22.190 --> 00:54:22.440 +I see. 
+
+00:54:23.060 --> 00:54:25.120
+So you've got to think ahead of it.
+
+00:54:25.920 --> 00:54:27.260
+Yeah, absolutely.
+
+00:54:27.920 --> 00:54:30.080
+Hard to add retroactively, probably impossible.
+
+00:54:31.680 --> 00:54:34.500
+Now our other websites, like the Fionn example
+
+00:54:34.740 --> 00:54:40.080
+and the Mapping Color example, those are APIs.
+
+00:54:40.640 --> 00:54:42.940
+That's a Django API, Django REST framework for one,
+
+00:54:43.960 --> 00:54:45.360
+GraphQL for the other.
+
+00:54:46.160 --> 00:54:48.460
+One has a Vue front end, one has a React front end.
+
+00:54:48.940 --> 00:54:52.480
+Okay, well Django Bakery just isn't gonna work very well
+
+00:54:52.700 --> 00:54:54.080
+for like serializing JSON.
+
+00:54:54.700 --> 00:54:57.980
+It's like, awesome, here's your unrendered JavaScript
+
+00:54:58.220 --> 00:55:00.220
+front-end code and it's just gonna look empty or something.
+
+00:55:01.340 --> 00:55:04.360
+- Yeah, so it is a good reason to consider using
+
+00:55:05.230 --> 00:55:07.140
+like vanilla Django templates when possible,
+
+00:55:08.110 --> 00:55:11.300
+like for that reason, but those were inherited
+
+00:55:12.700 --> 00:55:15.580
+from the vendors, those two sites,
+
+00:55:15.610 --> 00:55:17.080
+and we've made a lot of progress on those.
+
+00:55:17.700 --> 00:55:21.020
+So, you know, what to do in that,
+
+00:55:21.580 --> 00:55:24.380
+like in that situation, Django Bakery isn't an option.
+
+00:55:26.130 --> 00:55:28.120
+And those projects are not end of life yet.
+
+00:55:28.310 --> 00:55:32.080
+So we have some time, but we're, so what we're doing is strategizing.
+
+00:55:32.280 --> 00:55:33.740
+Okay, how will we rescue them?
+
+00:55:33.790 --> 00:55:39.160
+How will we keep them alive once somebody needs to stop paying for hosting?
+
+00:55:40.220 --> 00:55:41.600
+And we have ideas.
+
+00:55:41.990 --> 00:55:45.160
+We have, I think there's clever, interesting things out there.
+
+00:55:46.160 --> 00:55:47.400
+We'll have to keep looking into it.
+
+00:55:51.320 --> 00:55:55.760
+There are some pretty interesting ideas and I know that you had been thinking about them.
+
+00:55:56.640 --> 00:55:59.720
+What if instead of having a back-end that ran in a container,
+
+00:56:00.440 --> 00:56:03.440
+you could just have WebAssembly,
+
+00:56:04.070 --> 00:56:05.520
+but still have it go,
+
+00:56:05.730 --> 00:56:07.460
+just a local loopback type of thing?
+
+00:56:08.000 --> 00:56:14.159
+Yeah, I'm really interested in this one because it enables essentially
+
+00:56:14.180 --> 00:56:21.260
+the full functionality of the live site to exist as what is just a static site.
+
+00:56:22.440 --> 00:56:30.300
+So because of Pyodide and projects like PyScript, we can run Python in the browser.
+
+00:56:31.760 --> 00:56:33.620
+And we can run SQLite in the browser.
+
+00:56:34.240 --> 00:56:37.960
+And now we can even run Postgres in the browser with PGlite.
+
+00:56:39.540 --> 00:56:44.280
+So if we can run all those things in the browser, then couldn't we run, couldn't we have Django
+
+00:56:44.900 --> 00:56:46.620
+hosted right in the browser?
+
+00:56:47.500 --> 00:56:47.880
+And you can.
+
+00:56:48.730 --> 00:56:49.300
+You can.
+
+00:56:49.310 --> 00:56:54.860
+So there's a proof of concept that proves it's possible called Django WebAssembly.
+
+00:56:56.660 --> 00:57:02.860
+And if you load up this, if you load this up, it'll let you log in to the Django admin
+
+00:57:03.100 --> 00:57:04.900
+and you're not logging into anybody's backend.
+
+00:57:05.240 --> 00:57:11.120
+You're logging into your own browser, where this is running in a service worker.
+
+00:57:11.960 --> 00:57:12.400
+Awesome.
+
+00:57:12.980 --> 00:57:13.380
+Look at that.
+
+00:57:14.200 --> 00:57:14.700
+Oh, hold on.
+
+00:57:14.760 --> 00:57:15.640
+It told you where the password was.
+
+00:57:17.280 --> 00:57:17.820
+Very secure.
+
+00:57:18.420 --> 00:57:18.800
+Matt?
+
+00:57:19.960 --> 00:57:20.400
+Password.
+
+00:57:20.500 --> 00:57:25.600
+Well, it can be entirely insecure because, yeah, it's running right in your own browser.
+
+00:57:25.920 --> 00:57:26.500
+Yeah, that's awesome.
+
+00:57:26.570 --> 00:57:27.980
+And here we are, Django admin.
+
+00:57:28.520 --> 00:57:28.940
+Incredible.
+
+00:57:30.220 --> 00:57:31.360
+Yeah, so I'm pretty interested in this.
+
+00:57:32.140 --> 00:57:38.160
+You've got to convert an RDS Postgres database into either SQLite or something like PGlite.
+
+00:57:38.440 --> 00:57:39.400
+But I think that's all doable.
+
+00:57:39.820 --> 00:57:41.800
+So I think it's an exciting possibility.
+
+00:57:42.160 --> 00:57:42.760
+Yeah, for sure.
+
+00:57:42.820 --> 00:57:49.680
+I do think, so maybe you have a rich query system that's powered by your database
+
+00:57:50.020 --> 00:57:50.580
+that's really heavy.
+
+00:57:51.560 --> 00:57:51.940
+Exactly.
+
+00:57:52.240 --> 00:57:56.420
+And it's got a bunch of data that's like, here's all of our working data that you might ask
+
+00:57:56.580 --> 00:57:57.080
+questions about.
+
+00:57:57.760 --> 00:58:01.240
+Maybe you just convert that to PageFind to help you find the pieces,
+
+00:58:01.360 --> 00:58:03.200
+and just keep the operational data
+
+00:58:03.330 --> 00:58:06.000
+and maybe like even a SQLite with like the Django ORM,
+
+00:58:06.000 --> 00:58:09.000
+you can just switch the connection, keep talking to it.
+
+00:58:09.110 --> 00:58:12.780
+I mean, there's possibilities to just get something not too terrible
+
+00:58:12.920 --> 00:58:14.560
+that's not the same, but not that far off.
+
+00:58:15.560 --> 00:58:15.960
+Yeah, exactly.
+
+00:58:16.680 --> 00:58:19.900
+And then it goes on GitHub Pages and it can live hopefully forever.
+
+00:58:20.050 --> 00:58:22.520
+I mean, it feels like GitHub will last forever,
+
+00:58:23.160 --> 00:58:25.020
+but it'll last longer than funding will anyways.
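The connection switch mentioned here is, in Django terms, just a different `DATABASES` setting: the same ORM code keeps working when the archived build points at a bundled SQLite file instead of RDS Postgres. A hypothetical sketch of that swap, where the database and host names are made up but the `ENGINE` paths are Django's real backends:

```python
import os


def database_config(archive_mode: bool) -> dict:
    """Return a Django DATABASES setting for a live vs. an archived deployment."""
    if archive_mode:
        # Archived build: a SQLite file shipped alongside the static assets.
        return {
            "default": {
                "ENGINE": "django.db.backends.sqlite3",
                "NAME": "archive/app.sqlite3",
            }
        }
    # Live deployment: the RDS Postgres instance (hypothetical names).
    return {
        "default": {
            "ENGINE": "django.db.backends.postgresql",
            "NAME": "amendments",
            "HOST": "example.rds.amazonaws.com",
            "PORT": "5432",
        }
    }


# In settings.py, one environment variable would pick the mode:
DATABASES = database_config(archive_mode=bool(os.environ.get("ARCHIVE_MODE")))
```

Because the ORM abstracts the backend, views and queries are untouched; only the data has to be exported from Postgres into the SQLite file first.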
+
+00:58:26.400 --> 00:58:34.920
+It's definitely going to last longer than just something that we can't pay for anymore.
+
+00:58:36.320 --> 00:58:41.900
+I don't know how long GitHub is going to be around for, I think a while, but you never know.
+
+00:58:42.430 --> 00:58:43.740
+It seems like stuff is going to last forever.
+
+00:58:43.830 --> 00:58:46.920
+Then it gets changed.
+
+00:58:47.050 --> 00:58:47.740
+We had Subversion.
+
+00:58:49.380 --> 00:58:50.880
+Now it's completely gone, right?
+
+00:58:51.430 --> 00:58:53.100
+Just 20 years, 15 years later.
+
+00:58:53.320 --> 00:58:55.440
+But still, I think 100% there.
+
+00:58:56.340 --> 00:58:59.120
+- Yeah, but if something ever happened,
+
+00:58:59.540 --> 00:59:02.360
+somebody just needs to copy that folder of HTML,
+
+00:59:03.200 --> 00:59:06.740
+CSS and JavaScript files and dump it into an S3 bucket
+
+00:59:07.400 --> 00:59:10.140
+or somewhere else and then it can continue living there.
+
+00:59:10.840 --> 00:59:11.900
+So it's a good option.
+
+00:59:12.640 --> 00:59:13.420
+- It's a great option.
+
+00:59:13.740 --> 00:59:14.820
+It's a really, really good option.
+
+00:59:14.950 --> 00:59:21.040
+I mean, I guess one of the long-term concerns might be
+
+00:59:22.100 --> 00:59:25.440
+what if the WebAssembly standard changes so much
+
+00:59:25.460 --> 00:59:32.020
+it's not supported anymore? But you could probably bytewise convert it if you had to, you know, like
+
+00:59:32.140 --> 00:59:37.300
+somebody would probably be able to create one. Yeah, that would, that would be unfortunate. So
+
+00:59:37.720 --> 00:59:43.820
+I suppose if that happens, I mean, if that happens, yeah, we're, booting up one of these
+
+00:59:44.360 --> 00:59:50.680
+projects is like booting up an emulator for some old DOS game. Right, right. Well, I mean, I guess,
+
+00:59:51.940 --> 00:59:59.180
+let's think about this for a second. Somebody got, oh gosh, what was the chain? This is the whole
+
+01:00:02.360 --> 01:00:09.820
+YavaScript, the PyCon talk, where they got like Firefox compiled into,
+
+01:00:11.060 --> 01:00:18.960
+not Wasm, into asm.js or something like that. So it was run, like Chrome was running Firefox,
+
+01:00:19.060 --> 01:00:25.820
+which was running, I think, Doom, which was also asm.js. If we can do that, we could get something
+
+01:00:25.960 --> 01:00:30.800
+that would run, that would read old WebAssembly into new WebAssembly, if it really mattered to the world.
+
+01:00:33.440 --> 01:00:39.780
+Absolutely. Yeah. Especially if it's in a public repo that people who care about the data can,
+
+01:00:40.260 --> 01:00:47.000
+can rescue it somehow. Yeah. What about like a virtual machine? You know, I agree. Yeah,
+
+01:00:47.020 --> 01:00:53.960
+I'm gonna save me some, take a snapshot of Ubuntu LTS,
+
+01:00:54.100 --> 01:00:55.880
+some version and just, what are we gonna do?
+
+01:00:56.680 --> 01:00:58.560
+- Everything we do is Dockerized.
+
+01:00:59.100 --> 01:00:59.980
+Everything is in a container.
+
+01:01:00.450 --> 01:01:01.560
+So in the worst case scenario,
+
+01:01:02.010 --> 01:01:03.240
+we could give somebody the image
+
+01:01:03.970 --> 01:01:05.300
+and they could run it if they have Docker.
+
+01:01:06.500 --> 01:01:10.060
+I think that's a nice peace of mind to know that
+
+01:01:10.170 --> 01:01:13.360
+no matter what, something will be able to run this container.
+
+01:01:13.620 --> 01:01:15.660
+And even in, I don't know if you've used GitHub,
+
+01:01:16.340 --> 01:01:17.440
+what is it called? Codespaces.
+
+01:01:18.740 --> 01:01:18.940
+- Yeah.
+
+01:01:19.760 --> 01:01:22.080
+- I've, I archived one project.
+
+01:01:23.480 --> 01:01:25.220
+It was kind of dramatic and all of a
+
+01:01:25.440 --> 01:01:27.320
+sudden that it needed to be archived.
+
+01:01:27.500 --> 01:01:29.260
+So without much time to do anything.
+
+01:01:29.320 --> 01:01:31.660
+And it was a Ruby on Rails project
+
+01:01:31.980 --> 01:01:34.200
+and I'm not a Rails developer,
+
+01:01:34.380 --> 01:01:37.880
+but I was able to get it archived in a way that anybody could
+
+01:01:38.480 --> 01:01:41.320
+with one command, go to the repo on GitHub
+
+01:01:41.640 --> 01:01:45.220
+and boot it up in Codespaces and then have it live,
+
+01:01:46.120 --> 01:01:47.980
+running from their Codespace.
+
+01:01:48.040 --> 01:01:49.340
+And so that works too.
+
+01:01:51.240 --> 01:01:54.040
+Very cool. I think as WebAssembly grows,
+
+01:01:54.520 --> 01:01:57.700
+there'll be more possibilities for these types of things.
+
+01:01:59.780 --> 01:02:03.280
+Yeah, amazing. I'm pretty excited
+
+01:02:03.460 --> 01:02:05.520
+about PageFind having a Python API.
+
+01:02:05.620 --> 01:02:06.400
+I didn't realize that.
+
+01:02:06.540 --> 01:02:09.020
+So I'm going to be doing something with that for sure.
+
+01:02:12.300 --> 01:02:13.700
+Let me ask you one more thing
+
+01:02:13.700 --> 01:02:19.180
+before I kind of let you wrap up with some final thoughts here.
What about AI?
+
+01:02:20.700 --> 01:02:27.640
+Oh, that's a good question. So AI, I mean, there's like, in my story, there's like one
+
+01:02:27.840 --> 01:02:33.900
+interesting part of AI, which is that I got started and self-learned everything I needed
+
+01:02:34.000 --> 01:02:41.319
+to about software development to begin doing this right before ChatGPT really came on and was able
+
+01:02:41.340 --> 01:02:48.420
+to do real programming. Yeah, you're like four years of legit programming before, right? So I think, I
+
+01:02:48.520 --> 01:02:52.340
+mean, so I was thinking, I was thinking about how I got into it. I thought, what
+
+01:02:52.500 --> 01:03:02.120
+if I was four years later starting my PhD and wanting to do these tools? Um, I would have been
+
+01:03:02.160 --> 01:03:08.760
+able to accomplish what I needed to for my research without acquiring the technical skills. I know, and
+
+01:03:08.780 --> 01:03:12.020
+I don't know that's a good thing. I'm not sure if that's a good or bad thing. It could be both.
+
+01:03:12.170 --> 01:03:29.260
+I would have thought it was a good thing. I would have thought it's a good thing. But in my hands now, like a software engineer, AI is more powerful in my hands now than it would have been then.
+
+01:03:30.240 --> 01:03:31.720
+Even the same model, same everything.
+
+01:03:31.840 --> 01:03:36.380
+for me. Yeah, I can make it work for me in a way that I couldn't have been able to then. So I'm
+
+01:03:36.560 --> 01:03:41.860
+thankful for that. But it's something I think of, you know, I don't want to say it's necessarily
+
+01:03:41.990 --> 01:03:47.580
+a bad thing. But it definitely marks a difference, a difference in time between other people who are
+
+01:03:48.220 --> 01:03:53.140
+maybe wanting to get into digital humanities. They're humanities researchers, they want to add
+
+01:03:53.150 --> 01:03:58.660
+some digital tools. 
You know, I think this will kind of this will probably knock people off of + +01:03:58.680 --> 01:03:59.980 +the more technical path. + +01:04:00.680 --> 01:04:01.700 +I think it will too. + +01:04:01.850 --> 01:04:03.800 +And I think that that might be a negative. + +01:04:04.080 --> 01:04:07.000 +When you were telling me your story originally, + +01:04:07.630 --> 01:04:12.520 +I was thinking kind of like how neat is it that you didn't sign up for + +01:04:12.570 --> 01:04:15.900 +and the people you were working with probably didn't intend to sign you up + +01:04:16.020 --> 01:04:18.960 +for learning true software development. + +01:04:19.610 --> 01:04:22.940 +But look at this cool and interesting job that you now have + +01:04:23.560 --> 01:04:24.900 +that you never would have imagined. + +01:04:25.180 --> 01:04:26.780 +I'm sure when you signed up for your PhD, + +01:04:26.810 --> 01:04:28.440 +you're like, you know what I'm going to do when I get my PhD? + +01:04:28.600 --> 01:04:31.260 +I'm going to go X, Y, like, I'm going to join the Darth program. + +01:04:31.380 --> 01:04:34.020 +Like, no, probably not. Right. But here you are. + +01:04:34.740 --> 01:04:38.260 +And I think that's actually a really interesting knock on effect for a lot of + +01:04:38.560 --> 01:04:41.300 +researchers and people in grad schools. + +01:04:41.660 --> 01:04:44.960 +They're kind of put into this programming adjacent type of thing. + +01:04:46.100 --> 01:04:49.760 +You know, and a lot of folks sort of, actually, that's pretty interesting. + +01:04:49.960 --> 01:04:53.160 +I'm going to kind of lean into that. And I think AI might knock, + +01:04:54.580 --> 01:04:56.560 +not like you said, knock people off that path to some degree. + +01:04:57.800 --> 01:05:22.000 +Yeah, definitely. So that's just one part of the AI story. The other one is how we use it. 
It's great for data extraction, pulling data out of different, to make these search interfaces more powerful, to extract different data from them. That's just one example where it's been handy.
+
+01:05:22.780 --> 01:05:27.700
+We're looking for ways that it can really empower faculty.
+
+01:05:28.380 --> 01:05:30.840
+You know, we're still very much in the exploration phase
+
+01:05:31.040 --> 01:05:34.220
+of like how we can use it and provide it to faculty
+
+01:05:34.620 --> 01:05:37.520
+as a like digital humanities tool.
+
+01:05:38.380 --> 01:05:40.140
+- Sure, I was thinking pretty much
+
+01:05:40.780 --> 01:05:41.800
+when I asked the question of it,
+
+01:05:41.820 --> 01:05:44.800
+it's just like two parts, like one, how does it,
+
+01:05:45.020 --> 01:05:48.080
+are you guys using it to help take a project,
+
+01:05:48.260 --> 01:05:49.100
+well, that would have been a month,
+
+01:05:49.180 --> 01:05:50.140
+no, actually it's three days?
+
+01:05:50.540 --> 01:05:51.060
+You know what I mean?
+
+01:05:52.200 --> 01:05:57.880
+That, and then if people are asking, you know, a professor comes along and says, and we want our
+
+01:05:57.930 --> 01:06:05.280
+own custom AI thing, or we're using Harvard's internal one that we're allowed to use,
+
+01:06:06.880 --> 01:06:09.400
+but we won't be able to use it once the grant runs out. You know what I mean?
+
+01:06:11.120 --> 01:06:16.340
+Yeah, yeah. I think, um, one good example of this type of thing is that, um, what we're starting
+
+01:06:16.280 --> 01:06:20.540
+to get is faculty who are vibe coding.
+
+01:06:21.160 --> 01:06:23.300
+And now we're going to teach them.
+
+01:06:23.780 --> 01:06:25.020
+We're going to teach them how to do it.
+
+01:06:25.860 --> 01:06:25.960
+Yeah.
+
+01:06:26.600 --> 01:06:27.120
+Instead of--
+
+01:06:27.120 --> 01:06:27.780
+It is a skill.
+
+01:06:28.520 --> 01:06:28.860
+Yeah.
+
+01:06:28.940 --> 01:06:30.120
+It's absolutely a skill.
+
+01:06:30.940 --> 01:06:31.140
+Yeah.
+
+01:06:31.220 --> 01:06:31.780
+No, it is.
+
+01:06:31.900 --> 01:06:32.260
+It is.
+
+01:06:33.020 --> 01:06:37.980
+Instead of copy and pasting from ChatGPT into VS Code,
+
+01:06:39.400 --> 01:06:42.060
+having them learn Copilot, maybe even having them download
+
+01:06:42.300 --> 01:06:42.460
+Cursor.
+
+01:06:43.100 --> 01:06:45.660
+Download some real dedicated tools
+
+01:06:45.680 --> 01:06:47.880
+to get this done to make them more productive.
+
+01:06:48.320 --> 01:06:52.820
+So yeah, educating about how to do it is one thing.
+
+01:06:53.740 --> 01:06:54.760
+You asked if we're using it.
+
+01:06:55.920 --> 01:07:02.640
+We have access to Copilot and that's great.
+
+01:07:04.820 --> 01:07:06.540
+I can't say that we've shipped anything in three days
+
+01:07:06.940 --> 01:07:10.500
+instead of a month yet, but one anecdote is that
+
+01:07:11.880 --> 01:07:13.620
+right now I'm doing some really interesting
+
+01:07:15.400 --> 01:07:22.420
+processing of music audio files, and somebody asked, they have a beatboxer, if I could chop that file up
+
+01:07:22.860 --> 01:07:29.100
+so that all of the individual sounds that the beatboxer makes are identified in a file. And so, okay.
+
+01:07:30.120 --> 01:07:34.400
+So I'm using some music library, a Python library called Librosa.
+
+01:07:35.000 --> 01:07:38.400
+There's some complicated math in there. It's a little bit too much for me.
+
+01:07:39.080 --> 01:07:45.360
+It's no problem for Claude. Claude knows how to do that math, and then I use my expertise to string it together to get
+
+01:07:45.600 --> 01:07:53.420
+good output. Yeah, awesome. You got time for one more quick question before we wrap things up?
+
+01:07:53.420 --> 01:07:58.320
+For sure. Raymond out there, Raymond Yee, asks, it says, it'd be good to hear how Harvard uses
+
+01:07:59.240 --> 01:08:04.160
+containers on AWS and its reliability. It's reliable, not the cheapest way to host things. 
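The beatbox-chopping task described above is, at heart, onset detection: find the points where a new sound starts, then split the audio at those points. The episode's actual code used the Librosa library; the sketch below is a hypothetical, dependency-free illustration of the idea only. The function names and the toy amplitude envelope are invented for this example, not taken from the show.

```python
# Toy onset detection: an "onset" is where the amplitude jumps above a
# threshold after a quieter frame. Real tools like Librosa use spectral
# flux and other signal-processing math; this is only the basic shape
# of the splitting step.

def detect_onsets(envelope, threshold=0.5):
    """Return the indices where amplitude crosses `threshold` upward."""
    onsets = []
    prev = 0.0
    for i, amp in enumerate(envelope):
        if amp >= threshold and prev < threshold:
            onsets.append(i)
        prev = amp
    return onsets

def split_segments(envelope, onsets):
    """Pair consecutive onsets into (start, end) chunks to cut out."""
    bounds = onsets + [len(envelope)]
    return [(bounds[i], bounds[i + 1]) for i in range(len(onsets))]

# A fake envelope with three "hits" separated by quieter frames.
env = [0.0, 0.9, 0.8, 0.1, 0.0, 0.7, 0.2, 0.0, 0.95, 0.3]
print(detect_onsets(env))                      # [1, 5, 8]
print(split_segments(env, detect_onsets(env))) # [(1, 5), (5, 8), (8, 10)]
```

Each (start, end) pair would then be used to slice the original waveform into one file per beatbox sound.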
+
+01:08:05.280 --> 01:08:11.420
+Are you thinking about moving that or is it not that much, you know, not that big of an expense?
+
+01:08:11.440 --> 01:08:21.839
+about a failed experiment. We were using ECS and we're still using ECS. So that's AWS's main,
+
+01:08:22.759 --> 01:08:28.859
+it's not Kubernetes, but it's one step down with their horizontal scaling container clusters.
+
+01:08:30.859 --> 01:08:36.100
+And I wanted to move us onto a single EC2 instance because our projects are popular,
+
+01:08:36.319 --> 01:08:39.900
+but they're not so popular that we actually have to worry about horizontal scaling.
+
+01:08:40.080 --> 01:08:46.600
+Right. It's not like it's front page in the New York Times. I guess it probably could be. But even so,
+
+01:08:46.630 --> 01:08:48.759
+for the static sites, they probably still can take it.
+
+01:08:50.049 --> 01:08:55.640
+Yeah. So I priced it out and I got an example deployed, an example project deployed,
+
+01:08:56.420 --> 01:09:03.199
+and was able to confirm that it would indeed be much cheaper. And it was deployed in a similar way
+
+01:09:03.220 --> 01:09:10.319
+using AWS CDK. So it's all infrastructure as code all the way down. But it turns out there's all
+
+01:09:10.500 --> 01:09:16.180
+kinds of compliance when you are in charge of the VM at like a big university, or I'm sure
+
+01:09:16.620 --> 01:09:23.620
+any corporate setting, if you are in charge of the VM and the OS on it, then you have to know that you
+
+01:09:23.700 --> 01:09:28.720
+have the latest patches in. You have to know that you have the latest Ubuntu. 
And then there's other
+
+01:09:28.740 --> 01:09:35.759
+things of different observability things that you have to have in place that are not usually
+
+01:09:35.980 --> 01:09:41.859
+required if you're running in a container cluster like ECS. I see. So it ends up being
+
+01:09:42.540 --> 01:09:48.020
+a lot less work and much easier to achieve compliance if we run containers or some other
+
+01:09:48.180 --> 01:09:54.960
+serverless thing. If I run my, all my personal projects, they all run in
+
+01:09:55.020 --> 01:10:01.500
+a single virtual machine, but, um, sure, we're running in containers. Yeah, yeah. And you've got all the
+
+01:10:01.640 --> 01:10:08.520
+SOC 2 stuff and all those different things, right? Like, there's layers. Yeah, that's right. Awesome. Okay,
+
+01:10:09.000 --> 01:10:16.360
+yeah, very cool. Final thoughts? You want to talk to people who may be doing similar? All right, go ahead,
+
+01:10:16.660 --> 01:10:20.760
+you've got something more to add for you. Yeah, I mean, I'll mention that, um, but what I didn't say is
+
+01:10:20.660 --> 01:10:28.060
+that in that 2019 when I started learning Python, I discovered Talk Python almost immediately, and
+
+01:10:28.180 --> 01:10:33.780
+one of the first episodes that I listened to was the other digital humanities one, um, the one with
+
+01:10:34.260 --> 01:10:39.140
+Cornelis van Lit. He was an awesome guest. Yeah, that's right. Yeah, and I thought that
+
+01:10:39.140 --> 01:10:44.620
+was great, and that was also a bit about manuscripts, a little bit more on the image side than the text
+
+01:10:44.800 --> 01:10:50.620
+side, and, um, I didn't understand everything that everybody was saying, but I just, I kept tuning
+
+01:10:50.640 --> 01:10:56.180
+in. And I think because of that, because Talk Python was like this, you know, I've been remote
+
+01:10:56.590 --> 01:11:03.480
+working for most of my time. 
And Talk Python has been kind of like that conversation with + +01:11:03.610 --> 01:11:08.340 +the open source community that's been always in my ear. And I think that made, you know, + +01:11:08.510 --> 01:11:14.820 +a difference making me feel like I understood the software landscape and like the developer + +01:11:14.940 --> 01:11:19.660 +culture and what was going on. And then the different Python libraries and what was possible. + +01:11:20.400 --> 01:11:32.520 +So to people who are interested in taking things in a more technical direction, I think it's helpful just to find a few things like that, that give you an insight into that world. + +01:11:32.980 --> 01:11:43.320 +And the more you listen to it, the more you start to hear the same acronyms and the same things said enough that you start to feel like, okay, now you're part of the club. + +01:11:43.820 --> 01:11:45.060 +I really appreciate that. + +01:11:45.940 --> 01:11:46.300 +That's cool. + +01:11:46.880 --> 01:11:52.140 +I've certainly had people reach out to me and say things that at first didn't make any sense to me. + +01:11:52.260 --> 01:11:56.440 +Like, I've been listening for six weeks now, and it's starting to make sense what you're talking about. + +01:11:56.500 --> 01:11:58.720 +Like, why have you been listening for six months when it made no sense? + +01:11:58.860 --> 01:11:59.200 +That's insane. + +01:11:59.820 --> 01:12:06.640 +But a lot of people use listening to the podcast, mine and others, as language immersion, right? + +01:12:06.880 --> 01:12:13.000 +Like, I could get Duolingo, and I could learn Portuguese, or I could move to Brazil for a month. + +01:12:13.340 --> 01:12:13.800 +You know what I mean? + +01:12:14.100 --> 01:12:15.060 +Then I would really learn it. + +01:12:15.120 --> 01:12:15.220 +Yeah, exactly. + +01:12:15.580 --> 01:12:15.840 +Right. + +01:12:16.600 --> 01:12:23.820 +Exactly. I think there's truth to that. 
Some of the things I did was search the word deployment
+
+01:12:24.440 --> 01:12:28.340
+because I'm trying to get my head around how to deploy for the first time. I just want to hear
+
+01:12:28.480 --> 01:12:32.280
+people talk about it. I could read about it. I could read the tutorial, but I just want to hear
+
+01:12:32.400 --> 01:12:37.560
+people talk about deployment to get a sense of what actual deployment sounds like.
+
+01:12:37.860 --> 01:12:42.940
+There's something really different when you're learning or trying, even you're maybe an experienced
+
+01:12:42.960 --> 01:12:48.820
+programmer but not in this particular area, to hear a human side of it, not just the docs, not a sterile
+
+01:12:49.260 --> 01:12:55.160
+these are the four steps, but like a human. I love it. I mean, it's probably why I created the show,
+
+01:12:55.420 --> 01:13:00.840
+because, like, I didn't hear those stories. We've got to tell those stories. Awesome. I appreciate that. So,
+
+01:13:02.100 --> 01:13:06.980
+super cool. All right, so if other people are listening, maybe one of your pieces of advice is
+
+01:13:08.000 --> 01:13:15.480
+keep listening, you'll get there. Yeah, and if anybody's in, is like in the humanities and somehow
+
+01:13:15.620 --> 01:13:22.120
+found their way onto this episode with no technical, uh, experience, I just would give the caution
+
+01:13:22.280 --> 01:13:29.840
+of, like, you know, the anecdote that if, um, AI coding had been around the way it is now when I
+
+01:13:29.940 --> 01:13:36.620
+was learning, I wouldn't be doing digital humanities at Harvard. I wouldn't have been able to get into
+
+01:13:36.640 --> 01:13:39.240
+this field, I wouldn't have known about it.
+
+01:13:39.980 --> 01:13:45.280
+So I guess just think about that when you're learning and applying new tools.
+
+01:13:46.960 --> 01:13:49.740
+I don't really know what the right fix for that is.
+
+01:13:49.840 --> 01:13:51.080
+That's a very challenging problem. 
+ +01:13:51.300 --> 01:13:54.780 +I mean, you can say I'm just literally not going to fire it up. + +01:13:54.940 --> 01:13:59.440 +But I mean, we used to hunt through Stack Overflow and the web and over and over. + +01:13:59.560 --> 01:14:03.680 +And if you're really stuck or you really don't understand, like they're good at explaining + +01:14:03.780 --> 01:14:04.260 +stuff to you. + +01:14:04.800 --> 01:14:10.220 +really stay in a learner's mindset, not just press the easy button and make this thing and move on. + +01:14:13.440 --> 01:14:18.940 +Easier said than done. Easier and said than done. So yeah, I just I want to leave this with kind of a + +01:14:19.800 --> 01:14:29.239 +thought about how much things like Python and these, these tools and technology can really empower + +01:14:29.940 --> 01:14:31.460 +stuff that you wouldn't think is even related, + +01:14:31.980 --> 01:14:33.400 +like understanding old manuscripts + +01:14:33.800 --> 01:14:40.480 +and how painting is connected or changed over time and stuff, right? + +01:14:40.680 --> 01:14:43.720 +Those sound very much disjointed from tech and software, + +01:14:44.200 --> 01:14:47.800 +but they really are superpowers that you can bring to your work, + +01:14:48.460 --> 01:14:50.000 +whatever your industry is. + +01:14:50.020 --> 01:14:51.540 +I know our field of study, + +01:14:51.560 --> 01:14:54.540 +I know there's some sociologists out in the audience + +01:14:54.720 --> 01:14:55.700 +and I'm sure others as well. + +01:14:58.960 --> 01:15:01.500 +All right. Final thoughts, David, close it out. + +01:15:09.800 --> 01:15:10.400 +That's it. + +01:15:10.820 --> 01:15:11.400 +That's it. All right. + +01:15:12.280 --> 01:15:12.800 +You said it great. + +01:15:12.990 --> 01:15:18.600 +I mean, just applying these technical tools to old questions, + +01:15:19.170 --> 01:15:21.080 +that is the core of digital humanities. + +01:15:21.940 --> 01:15:22.520 +Awesome. Yeah. 
+ +01:15:24.080 --> 01:15:25.120 +When I first started hearing about this, + +01:15:25.120 --> 01:15:26.900 +I thought I really don't know how this ties together. + +01:15:27.120 --> 01:15:30.760 +and after seeing it a few times, I definitely see the power of it. + +01:15:31.680 --> 01:15:32.980 +Thank you for your time coming on. + +01:15:33.220 --> 01:15:37.460 +Thank you for sharing your look and the look inside of your team + +01:15:37.800 --> 01:15:40.200 +and inside of a small piece of Harvard. + +01:15:41.240 --> 01:15:43.020 +I really like these kinds of episodes + +01:15:43.320 --> 01:15:46.000 +because it's hard to see this from the outside. + +01:15:47.260 --> 01:15:48.720 +You just see the results, + +01:15:48.830 --> 01:15:51.000 +but you don't see the inner workings of the team + +01:15:51.070 --> 01:15:51.980 +and the motivation and stuff. + +01:15:52.840 --> 01:15:54.460 +Thank you so much for being here. + +01:15:55.760 --> 01:15:56.420 +Bye, everyone. + +01:15:56.860 --> 01:15:57.000 +Yep. + diff --git a/youtube_transcripts/539-catching-up-with-the-python-typing-council-original.vtt b/youtube_transcripts/539-catching-up-with-the-python-typing-council-original.vtt new file mode 100644 index 0000000..6993e95 --- /dev/null +++ b/youtube_transcripts/539-catching-up-with-the-python-typing-council-original.vtt @@ -0,0 +1,3380 @@ +WEBVTT + +00:00:02.040 --> 00:00:04.000 +Hello, Rebecca and Carl. + +00:00:04.780 --> 00:00:07.700 +Welcome to all of you type loving Pythonistas. + +00:00:08.480 --> 00:00:09.640 +Awesome to have you here on the show. + +00:00:11.639 --> 00:00:12.680 +Thanks for being here. + +00:00:14.040 --> 00:00:15.340 +We're going to talk Python typing, + +00:00:16.200 --> 00:00:19.880 +especially from the perspective of the Python typing council, + +00:00:20.660 --> 00:00:24.360 +which honestly, I am a huge fan of Python typing. + +00:00:24.620 --> 00:00:26.840 +It's still something I learned about not too long ago. 
+
+00:00:26.980 --> 00:00:33.040
+So I'm going to be learning along with everyone else what it is you all do and so on.
+
+00:00:33.220 --> 00:00:36.560
+So I'm really excited to be diving into this.
+
+00:00:36.950 --> 00:00:42.520
+I think since types came to Python, I think it's made it a little bit more rigorous.
+
+00:00:43.100 --> 00:00:47.880
+You know, for all those people out there like, oh, it's not a real language without any form of static typing.
+
+00:00:47.930 --> 00:00:49.040
+We can't use it on real projects.
+
+00:00:49.090 --> 00:00:52.820
+I don't know how true that was, but certainly it's less true now.
+
+00:00:53.120 --> 00:00:54.560
+You know, you can pick per project.
+
+00:00:54.630 --> 00:00:55.380
+So it's super cool.
+
+00:00:56.600 --> 00:01:00.040
+Before we get into all that, though, let's just go around for quick introductions.
+
+00:01:01.420 --> 00:01:03.440
+Jelle, welcome to the show. Awesome to have you here.
+
+00:01:04.480 --> 00:01:08.900
+Hi, yeah. I'm Jelle. I've been on the Python typing council since the beginning.
+
+00:01:09.360 --> 00:01:15.440
+I helped set it up a couple years ago. Outside of my typing work, I currently work at OpenAI,
+
+00:01:15.740 --> 00:01:19.859
+where I work on developer productivity, which means things like running CI for people
+
+00:01:21.319 --> 00:01:24.080
+and generally helping people be productive.
+
+00:01:25.920 --> 00:01:28.140
+I've been working with Python for more than a decade.
+
+00:01:28.740 --> 00:01:31.460
+I started out because my previous job was mostly in Python
+
+00:01:31.920 --> 00:01:34.140
+and then got more and more involved with the language.
+
+00:01:35.700 --> 00:01:35.900
+Awesome.
+
+00:01:36.590 --> 00:01:37.860
+So let me get this right. 
+
+00:01:38.110 --> 00:01:41.440
+At OpenAI, you're basically helping developers there
+
+00:01:41.620 --> 00:01:45.040
+have better developer tooling and common packages
+
+00:01:45.500 --> 00:01:46.720
+and workflows and stuff like that.
+
+00:01:47.000 --> 00:01:47.580
+That's right.
+
+00:01:48.450 --> 00:01:48.600
+Yeah.
+
+00:01:48.680 --> 00:01:54.480
+Yeah, mostly around things that happen in CI, like running tests efficiently, figuring
+
+00:01:54.550 --> 00:01:58.820
+out the right tests to run, getting the right CI workers up.
+
+00:01:59.800 --> 00:02:01.320
+Yeah, that sounds very exciting.
+
+00:02:01.920 --> 00:02:06.300
+Right in the epicenter of all the big tech stuff these days.
+
+00:02:06.440 --> 00:02:06.820
+Super cool.
+
+00:02:07.700 --> 00:02:09.240
+Rebecca, hello, welcome.
+
+00:02:10.500 --> 00:02:12.160
+Hey, thanks for having me.
+
+00:02:13.120 --> 00:02:13.540
+I'm Rebecca.
+
+00:02:14.280 --> 00:02:19.360
+I've been on the typing council also for about three years, I think,
+
+00:02:19.360 --> 00:02:21.940
+since the, less than three, since the beginning.
+
+00:02:22.480 --> 00:02:27.260
+But my day job, I work at Meta on Python typing,
+
+00:02:27.840 --> 00:02:31.980
+on Pyrefly, which is a new type checker and language server
+
+00:02:32.060 --> 00:02:33.680
+written in Rust, still in beta.
+
+00:02:34.500 --> 00:02:37.940
+Prior to that, I was at Google for eight years,
+
+00:02:38.280 --> 00:02:39.480
+also on the Python team.
+
+00:02:39.520 --> 00:02:40.920
+I just, I really like Python.
+
+00:02:42.240 --> 00:02:42.560
+Awesome.
+
+00:02:43.580 --> 00:02:50.040
+Yeah, super neat. I'm a big fan of both Pyrefly and ty, which will both have representatives here,
+
+00:02:50.150 --> 00:02:55.660
+I know. And I think it's just a super exciting time for Python types. And certainly that's one
+
+00:02:55.740 --> 00:03:03.580
+of the reasons. So very cool. Carl, welcome back. Thank you. Great to be here. Yeah, Carl Meyer. 
+
+00:03:04.540 --> 00:03:10.919
+I currently work at Astral, where I work on ty, which is a Python type checker and language server
+
+00:03:12.180 --> 00:03:18.600
+written in Rust, also in beta. And yeah, I guess how did I get into typing? Or I've been on the
+
+00:03:18.680 --> 00:03:23.860
+typing council not since the beginning. I think it's been a year and a half.
+
+00:03:28.000 --> 00:03:36.199
+And yeah, I got into Python typing at the time in 2016, 2017. I was working at Instagram, and
+
+00:03:37.160 --> 00:03:39.320
+that was in the very early days of Python typing.
+
+00:03:41.360 --> 00:03:46.780
+The PEP 484, PEP 483, the early Python typing PEPs had recently come out within the last couple of years.
+
+00:03:47.610 --> 00:03:53.680
+And one of the co-authors of some of those PEPs, Łukasz Langa, was actually sitting at a desk right next to me at the time.
+
+00:03:56.459 --> 00:04:02.420
+And at some point we started to think that we should try this Python typing stuff on the Instagram server monolith.
+
+00:04:03.420 --> 00:04:10.140
+And so I took that on as a side project, and then it eventually became the main project, and then it took like three years.
+
+00:04:10.640 --> 00:04:13.160
+So a lot of Python typing experience there.
+
+00:04:13.900 --> 00:04:15.000
+Wow, there absolutely is.
+
+00:04:15.220 --> 00:04:17.859
+You know, I think a couple of things I'd like to touch on there.
+
+00:04:20.239 --> 00:04:25.800
+First of all, Instagram, is it maybe the biggest Django deployment in the world?
+
+00:04:25.960 --> 00:04:27.360
+It's certainly one of the bigger ones, right?
+
+00:04:27.420 --> 00:04:32.860
+And I think a lot of people don't necessarily know that a core chunk of Instagram is actually Python, right? 
+
+00:04:36.980 --> 00:04:42.600
+I don't, I mean, I don't know if we have any way to know how big, uh, the Django deployment in the
+
+00:04:42.760 --> 00:04:47.340
+wild might be, but it's certainly a big one. Yeah, it's definitely a big one. There was some talks
+
+00:04:47.660 --> 00:04:54.300
+about, um, dismissing the garbage collector from the Instagram folks. Uh, that wasn't you giving the talk,
+
+00:04:54.440 --> 00:05:00.699
+but at PyCon. So that was pretty interesting. But I think actually that work that you're talking
+
+00:05:00.720 --> 00:05:07.140
+about, especially with Łukasz, really kind of opened a lot of people's eyes about Python typing.
+
+00:05:08.260 --> 00:05:14.780
+He gave a couple of PyCon talks, showed real metrics of how much of the code base is typed,
+
+00:05:16.400 --> 00:05:23.520
+how much it's changed, like error detection, that kind of stuff. So let me ask you,
+
+00:05:23.960 --> 00:05:29.679
+do you feel like it would be different? Would it have gone different now if tools like ty and
+
+00:05:29.700 --> 00:05:36.000
+Pyrefly existed back then? Is Python typing different now than it was then?
+
+00:05:41.420 --> 00:05:44.240
+Uh, certainly yes. I mean, there's been, the type system has gotten
+
+00:05:45.220 --> 00:05:49.780
+more complex over time, so it is both more expressive and more complex.
+
+00:05:52.380 --> 00:05:56.159
+And yeah, we have more type checkers available now.
+
+00:06:08.060 --> 00:06:08.400
+Hold on.
+
+00:06:09.700 --> 00:06:11.780
+I think for some reason my audio just cut out.
+
+00:06:11.820 --> 00:06:12.580
+Can you all still hear me?
+
+00:06:13.680 --> 00:06:14.340
+I can hear you.
+
+00:06:15.500 --> 00:06:16.260
+Okay, yeah, it came back.
+
+00:06:16.640 --> 00:06:17.000
+I don't know.
+
+00:06:19.420 --> 00:06:19.800
+These AirPods.
+
+00:06:20.100 --> 00:06:21.240
+You know how AirPods switch? 
+
+00:06:23.940 --> 00:06:28.160
+I got a text which made a noise on my iPad and I think my sound switched to my iPad.
+
+00:06:28.280 --> 00:06:28.900
+Anyway, sorry about that.
+
+00:06:29.670 --> 00:06:32.600
+I do agree that it's more complicated and I don't know how to feel about that.
+
+00:06:33.400 --> 00:06:40.440
+It is more expressive, but I feel like it's starting to get, I mean, we're not at C++
+
+00:06:41.360 --> 00:06:49.400
+ATL-like templates of templates of templates, but still, it's getting more serious.
+
+00:06:49.560 --> 00:06:57.220
+But I guess one of the really nice parts is that you can just take as much as you want of the complexity and you can just leave the rest, right?
+
+00:06:57.780 --> 00:07:02.860
+That's part of the magic of Python typing is that it's a gradual typing system, right?
+
+00:07:03.300 --> 00:07:07.720
+That's a choice people get to make.
+
+00:07:07.720 --> 00:07:08.380
+It can be none.
+
+00:07:08.900 --> 00:07:12.920
+It can be quite a bit and anywhere in between.
+
+00:07:14.440 --> 00:07:16.660
+So I guess that's probably one of the decisions.
+
+00:07:17.760 --> 00:07:19.160
+Let's talk about the typing council.
+
+00:07:19.240 --> 00:07:23.780
+So when did the typing council come along?
+
+00:07:25.940 --> 00:07:30.520
+And did the typing council exist to create all these PEPs and make this happen?
+
+00:07:30.610 --> 00:07:31.680
+Or was it afterwards?
+
+00:07:32.600 --> 00:07:35.420
+What's the history of the typing council and its purpose, folks?
+
+00:07:36.040 --> 00:07:37.860
+Yeah, it postdates most of the PEPs.
+
+00:07:38.370 --> 00:07:42.400
+So initially, the type system was created just through the regular PEP process,
+
+00:07:42.610 --> 00:07:43.960
+which means that something gets submitted.
+
+00:07:45.140 --> 00:07:47.820
+At first, still to Guido as the BDFL,
+
+00:07:48.500 --> 00:07:49.580
+later the steering council. 
+ +00:07:50.960 --> 00:07:54.720 +But that meant that it's very hard to make changes to like this specification. + +00:07:54.990 --> 00:07:58.520 +Like anytime we want to make change something about how the type systems would work, + +00:07:59.400 --> 00:08:00.960 +we had to go through this PEP procedure, + +00:08:01.800 --> 00:08:03.640 +talk to the steering council who are very busy people + +00:08:03.900 --> 00:08:06.900 +who deal with a lot of other aspects of the language other than typing. + +00:08:07.980 --> 00:08:13.000 +So, Shantanu and I came up with this idea of creating a separate council + +00:08:13.780 --> 00:08:15.279 +to specifically in charge of typing + +00:08:16.160 --> 00:08:17.360 +that would be in a specification + +00:08:17.680 --> 00:08:19.440 +where we can make small changes ourselves + +00:08:19.740 --> 00:08:21.440 +without having to go through this whole PEP process. + +00:08:23.800 --> 00:08:26.060 +And this way, when all the type checkers agree + +00:08:26.200 --> 00:08:27.680 +that something needs to go a certain way + +00:08:27.820 --> 00:08:29.240 +and it's not exactly what's in the PEPs, + +00:08:29.740 --> 00:08:32.780 +we can change it and have a place to record that + +00:08:32.940 --> 00:08:34.020 +and people can refer to it + +00:08:34.280 --> 00:08:36.940 +and new type checkers can also try to follow those decisions. + +00:08:39.419 --> 00:08:39.900 +Very interesting. + +00:08:40.120 --> 00:08:41.520 +I didn't realize that it was sort of, + +00:08:42.340 --> 00:08:46.660 +was there to allow for small changes to be made to make that much easier. + +00:08:46.770 --> 00:08:49.120 +But of course that makes sense because the PEP process is, + +00:08:50.060 --> 00:08:51.560 +it's pretty serious and drawn out. + +00:08:51.570 --> 00:08:58.880 +And we've seen even small language changes have quite passionate folks, + +00:08:58.910 --> 00:09:00.640 +I guess we should say. + +00:09:05.180 --> 00:09:06.620 +So yeah, yeah, very nice. 
+
+00:09:09.940 --> 00:09:18.560
+Do you have any examples of the types of changes that you all have that have happened over the years that maybe were typing council only?
+
+00:09:23.660 --> 00:09:28.580
+One was the specification for how overloads work, which is perhaps not really a small change.
+
+00:09:28.700 --> 00:09:35.920
+But one of the most complicated features in the type system really is the overloads, where you can give multiple signatures for a function,
+
+00:09:37.000 --> 00:09:39.800
+and type checkers sort of select which one to use
+
+00:09:40.040 --> 00:09:42.140
+based on the arguments when the function is called.
+
+00:09:43.560 --> 00:09:44.840
+And when it was initially created,
+
+00:09:45.840 --> 00:09:46.460
+I, from what I recall,
+
+00:09:46.860 --> 00:09:48.280
+there just wasn't really a specification.
+
+00:09:48.740 --> 00:09:50.860
+It's just like you use the signatures
+
+00:09:51.180 --> 00:09:52.260
+in a way that makes sense.
+
+00:09:53.820 --> 00:09:55.820
+And Eric Traut, who's currently on the council,
+
+00:09:56.180 --> 00:09:58.800
+came up with a pretty specific procedure
+
+00:09:58.920 --> 00:10:00.680
+for exactly how overloads should work
+
+00:10:01.580 --> 00:10:03.840
+to make it so that type checkers have,
+
+00:10:04.840 --> 00:10:06.419
+both sort of users can understand how it works,
+
+00:10:06.440 --> 00:10:10.300
+type checkers can have something to work towards to make sure that they will implement overloads
+
+00:10:10.300 --> 00:10:16.780
+in the same way. Yeah. Maybe a smaller example that is an example of something that would have been
+
+00:10:17.160 --> 00:10:23.040
+too small for a PEP and hard to accomplish before the typing council existed. And this
+
+00:10:23.040 --> 00:10:28.860
+is actually a change that I pushed through before I was on the typing council, but the typing
+
+00:10:29.060 --> 00:10:36.400
+council approved it, was a clarification around the interpretation of data class fields. 
If a Final
+
+00:10:36.440 --> 00:10:43.740
+annotation is applied to a data class field, does that mean, so if you apply a Final annotation to
+
+00:10:43.740 --> 00:10:48.840
+a regular class attribute, since it can't be changed, that implies that it's a class variable.
+
+00:10:49.620 --> 00:10:53.040
+And there was a question of if that should be the interpretation with the data class or not.
+
+00:10:53.640 --> 00:11:00.100
+So we discussed that and made a clarification to the spec. Okay, interesting. I've never really
+
+00:11:00.100 --> 00:11:04.820
+thought about Final being applied to a class field,
+
+00:11:05.060 --> 00:11:07.180
+but I've always used them sort of just for constants.
+
+00:11:08.220 --> 00:11:09.920
+But maybe people out there don't know,
+
+00:11:10.280 --> 00:11:13.960
+like typing.Final[type], right?
+
+00:11:15.460 --> 00:11:18.280
+That's kind of the way you can do constants in Python, right?
+
+00:11:20.860 --> 00:11:22.200
+Constant for the type checker.
+
+00:11:22.720 --> 00:11:24.340
+Nothing in the runtime will stop you from editing it.
+
+00:11:25.220 --> 00:11:25.700
+But--
+
+00:11:25.700 --> 00:11:26.500
+Not there.
+
+00:11:26.880 --> 00:11:27.380
+Not there.
+
+00:11:27.580 --> 00:11:29.320
+I have some examples coming up, and I'm
+
+00:11:30.240 --> 00:11:31.380
+interested to hear your thoughts on them.
+
+00:11:31.480 --> 00:11:35.400
+But for sure, there is this tension, right?
+
+00:11:35.860 --> 00:11:38.120
+I think that's probably worth touching on as well
+
+00:11:38.320 --> 00:11:40.600
+is this is a tension for Python in general
+
+00:11:40.860 --> 00:11:43.340
+is you can write all the types you want.
+
+00:11:43.920 --> 00:11:48.060
+And then when you run your code, it just doesn't care.
+
+00:11:48.180 --> 00:11:52.120
+There's a few instances, Pydantic, FastAPI, a few others.
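The Final behavior being discussed can be shown in a tiny sketch (the constant name is invented for illustration): the type checker treats the name as a constant, but the runtime does not enforce it.

```python
from typing import Final

MAX_RETRIES: Final[int] = 3

# A type checker (mypy, Pyright, ty, Pyrefly, ...) reports an error
# on this reassignment, but nothing at runtime stops it:
MAX_RETRIES = 5
print(MAX_RETRIES)  # prints 5; Final is a type-checker-only constant
```

So Final gives you constants for static analysis, while the interpreter itself stays as dynamic as ever.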
+
+00:11:52.260 --> 00:11:55.120
+But generally speaking, it's there
+
+00:11:55.260 --> 00:11:57.300
+for the editors and the type checkers and the linters.
+
+00:11:57.520 --> 00:12:00.980
+And not for runtime, right?
+
+00:12:04.330 --> 00:12:04.960
+- Yeah, that's right.
+
+00:12:05.380 --> 00:12:07.680
+There's many exceptions to that.
+
+00:12:07.940 --> 00:12:10.320
+There's a product like mypyc,
+
+00:12:11.000 --> 00:12:13.060
+which comes with mypy, that uses those types
+
+00:12:13.260 --> 00:12:16.340
+to compile your code into more efficient machine code.
+
+00:12:17.760 --> 00:12:19.100
+Maybe there's gonna be more products like that
+
+00:12:19.100 --> 00:12:20.040
+in the future, I don't know.
+
+00:12:20.880 --> 00:12:23.260
+But yes, in general, it's separate from the runtime.
+
+00:12:24.140 --> 00:12:26.599
+Sort of a similar model to TypeScript
+
+00:12:26.620 --> 00:12:28.740
+where TypeScript gets compiled into JavaScript
+
+00:12:28.980 --> 00:12:29.940
+and types just go away.
+
+00:12:30.580 --> 00:12:31.980
+Here we don't do a compilation step,
+
+00:12:32.140 --> 00:12:33.620
+but still the same idea of the types,
+
+00:12:33.840 --> 00:12:35.100
+just not influencing the runtime.
+
+00:12:36.400 --> 00:12:38.960
+- Although we do make them available for introspection
+
+00:12:39.340 --> 00:12:40.780
+via the __annotations__ attribute,
+
+00:12:41.180 --> 00:12:44.140
+which is what has enabled projects like Pydantic
+
+00:12:44.420 --> 00:12:46.540
+and other sort of runtime checkers
+
+00:12:46.740 --> 00:12:49.280
+to make use of type annotations at runtime also.
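A minimal sketch of the runtime introspection mentioned here, which is the mechanism that libraries like Pydantic build on (the class and function are made up):

```python
from typing import get_type_hints

class User:
    id: int
    name: str

def greet(user: User) -> str:
    return f"hello {user.name}"

# Annotations survive at runtime and can be read back:
print(User.__annotations__)   # {'id': <class 'int'>, 'name': <class 'str'>}
print(get_type_hints(greet))  # maps 'user' to User and 'return' to str
```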
+
+00:12:49.860 --> 00:12:51.520
+- Yeah, I don't know if the typing council
+
+00:12:51.720 --> 00:12:52.460
+was around for this,
+
+00:12:52.560 --> 00:12:55.460
+but there was a proposal,
+
+00:12:55.620 --> 00:13:01.500
+I don't remember the exact details, but something to the effect of for type checking, not actually
+
+00:13:01.780 --> 00:13:06.880
+doing some of the full imports or something along those lines, right?
+
+00:13:06.980 --> 00:13:11.720
+Where the runtime behavior would have made it hard for tools like Pydantic and others
+
+00:13:11.820 --> 00:13:12.420
+to get that.
+
+00:13:12.920 --> 00:13:14.720
+And there was some kind of compromise, right?
+
+00:13:14.760 --> 00:13:15.720
+I don't remember the details.
+
+00:13:16.220 --> 00:13:16.600
+Anyone does?
+
+00:13:17.240 --> 00:13:18.120
+Future import?
+
+00:13:18.960 --> 00:13:19.740
+Yes, I think so.
+
+00:13:20.000 --> 00:13:23.180
+Yeah, it's like PEP 563 and 649 or something.
+
+00:13:24.120 --> 00:13:24.840
+And PEP 749.
+
+00:13:25.520 --> 00:13:31.600
+Yeah, what happened was that there was going to be a change. That's what the from __future__
+
+00:13:31.740 --> 00:13:37.200
+import annotations import does. It changes all annotations into raw strings. So the default
+
+00:13:37.360 --> 00:13:43.440
+behavior before recently was that annotations are regular code. If you write def f() ->
+
+00:13:43.530 --> 00:13:48.259
+int and you import the module, it just looks up the name int and puts that in the annotations
+
+00:13:48.260 --> 00:13:55.480
+dictionary, which makes introspection easy, but it has some costs on performance
+
+00:13:55.880 --> 00:14:00.160
+because memory usage sometimes is high and also made things harder to use sometimes because
+
+00:14:01.180 --> 00:14:06.060
+if you use a name that's not defined yet at runtime, you get an error.
That often comes
+
+00:14:06.080 --> 00:14:14.379
+up if you have like a class that has a reference in an annotation to the class itself or circular
+
+00:14:14.400 --> 00:14:21.280
+dependency classes, right? The circular imports, because you want to say this
+
+00:14:21.340 --> 00:14:27.240
+class is created by that thing and it returns one, but you know, somehow you've
+
+00:14:27.300 --> 00:14:33.180
+got to import the other one, and that's such a hassle. Yeah, it's, yeah, even out in
+
+00:14:33.180 --> 00:14:40.359
+the audience, we have circular imports. Oh yeah, for sure. What about lazy imports?
+
+00:14:40.380 --> 00:14:43.040
+That just recently got accepted and will be in 3.15,
+
+00:14:44.080 --> 00:14:45.700
+which I'm super excited about
+
+00:14:45.870 --> 00:14:48.480
+because I think it'll make app startup a lot faster
+
+00:14:49.060 --> 00:14:50.680
+for many use cases.
+
+00:14:51.560 --> 00:14:53.600
+But does that have knock-on effects for typing?
+
+00:14:57.480 --> 00:14:58.560
+Not that directly,
+
+00:14:58.980 --> 00:15:00.860
+because I think for a type checker,
+
+00:15:01.000 --> 00:15:03.380
+lazy imports mostly just look like regular imports.
+
+00:15:03.800 --> 00:15:05.560
+I guess I should maybe leave that to the people
+
+00:15:05.560 --> 00:15:07.300
+who are actually working on type checkers
+
+00:15:07.360 --> 00:15:08.300
+that are being written right now.
+
+00:15:10.320 --> 00:15:17.460
+Yeah, yeah. Rebecca, do you see this making any difference for you, lazy imports? Yeah, uh,
+
+00:15:18.680 --> 00:15:25.520
+to be honest, it's not something we've, uh, looked at too carefully yet. 3.15, it seems a little
+
+00:15:26.400 --> 00:15:32.500
+far in the future, but I don't think it's likely to make a huge difference.
+
+00:15:36.880 --> 00:15:37.040
+Carl?
+
+00:15:38.860 --> 00:15:43.600
+Yeah, I've thought about it briefly, and I think that it, I think the type checkers really won't need to care.
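The from __future__ import annotations compromise discussed a moment ago can be seen directly: annotations are stored as raw strings, so a forward reference like a class referring to itself no longer blows up (example invented):

```python
from __future__ import annotations

def f() -> int:
    return 2

# With the future import, the annotation is kept as the string 'int',
# not the int type object itself:
print(f.__annotations__)  # {'return': 'int'}

class Node:
    # Under the old eager behavior (pre-PEP 649, no future import),
    # this self-reference raised NameError at class-creation time,
    # because Node isn't defined yet; as a string it's fine:
    def child(self) -> Node: ...
```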
+
+00:15:44.560 --> 00:15:48.080
+Maybe there will be some edge cases that will come up that I haven't thought of, but it shouldn't be a big deal.
+
+00:15:48.940 --> 00:15:50.260
+Yeah, that's what I thought as well.
+
+00:15:50.440 --> 00:15:58.720
+The one variation that I can certainly see is, if you have a type, if you have something specified
+
+00:15:58.940 --> 00:16:05.880
+in a type, like say for a field of a class or a Pydantic model or something, that would otherwise
+
+00:16:06.310 --> 00:16:13.320
+not trigger the lazy import to become imported, would potentially having types specified
+
+00:16:14.410 --> 00:16:20.280
+cause more importing to happen sooner in the runtime? Yeah, there's actually an issue related
+
+00:16:20.300 --> 00:16:25.460
+that I think we may need to resolve before 3.15, but I don't know how yet.
+
+00:16:26.800 --> 00:16:32.220
+In data classes, if you have, if you use a type in a data class annotation that's lazily
+
+00:16:32.380 --> 00:16:35.780
+imported, actually creating a data class will de-lazy the import.
+
+00:16:36.140 --> 00:16:41.720
+It will try to resolve the import and actually make it not lazy.
+
+00:16:42.460 --> 00:16:46.620
+This is because data classes doesn't really need to look at all of the annotations in
+
+00:16:46.580 --> 00:16:51.720
+your class, but it looks at them enough to trigger reification of the imports.
+
+00:16:53.690 --> 00:16:57.560
+I shared this with some of the people on the lazy imports team, but we haven't yet come up
+
+00:16:57.560 --> 00:16:58.520
+with a good way around it.
+
+00:16:58.650 --> 00:17:02.640
+I think this might end up being a bit of a footgun, so I feel like we should ideally
+
+00:17:02.880 --> 00:17:05.339
+find a workaround, but I don't know what it would be yet.
+
+00:17:05.820 --> 00:17:11.920
+Yeah, I don't know that it's wrong that it converts it to an eager import, which it needs
+
+00:17:12.069 --> 00:17:13.839
+to know what it is potentially, right?
+
+00:17:14.079 --> 00:17:17.980
+It actually doesn't. Data classes just need to know whether it's a ClassVar or not.
+
+00:17:19.680 --> 00:17:23.560
+I think that's pretty much all. I guess there's InitVar also, but it doesn't really need to
+
+00:17:23.600 --> 00:17:29.080
+know anything else. In theory, it should be possible to just say, hey, this is not a ClassVar,
+
+00:17:29.300 --> 00:17:33.780
+so don't bother importing it. Interesting. That's for data classes,
+
+00:17:34.040 --> 00:17:40.600
+but say if I specify a parameter type on a function. Yeah, then it should be fine. I guess,
+
+00:17:40.840 --> 00:17:46.180
+again, unless something, uh, if it does annotations. So if you have something like a decorator that
+
+00:17:46.340 --> 00:17:54.000
+looks at annotations in your function, it might reify those imports. Okay, interesting. There is one
+
+00:17:54.070 --> 00:18:00.020
+other potentially interesting thing for type checkers. It's already, uh, difficult for type
+
+00:18:00.200 --> 00:18:10.780
+checkers to figure out when, uh, like a submodule should be considered to be an attribute of the
+
+00:18:10.800 --> 00:18:17.260
+parent module, because importing it anywhere will attach that submodule as an attribute on the parent module. But at runtime,
+
+00:18:18.180 --> 00:18:21.680
+that could literally happen anywhere. It could happen in totally unrelated code outside of the
+
+00:18:21.830 --> 00:18:26.860
+module, and a type checker probably won't be able to see that. So type checkers already have sort
+
+00:18:26.870 --> 00:18:31.920
+of complex sets of rules around where they look for these submodule imports and when they consider
+
+00:18:31.930 --> 00:18:39.520
+a submodule import to be reliably happening enough that the type checker should consider
+
+00:18:39.540 --> 00:18:47.280
+this submodule to exist as an attribute. And lazy imports may make that even...
We'll add one more
+
+00:18:47.460 --> 00:18:52.760
+wrinkle to those bits of heuristics in that we'll have to decide, if you have a lazy import of a
+
+00:18:52.940 --> 00:18:59.420
+submodule in your __init__.py, and it's lazy. So should the type checker consider that submodule to
+
+00:19:00.160 --> 00:19:06.160
+be imported or not be imported? I think it'll be another case where there's no clear right answer
+
+00:19:06.180 --> 00:19:09.440
+and we'll just have to make a decision one way or the other.
+
+00:19:10.180 --> 00:19:10.300
+Right.
+
+00:19:11.160 --> 00:19:13.180
+Yeah, there's some variations across type checkers,
+
+00:19:13.270 --> 00:19:15.240
+which we'll get to later.
+
+00:19:16.020 --> 00:19:18.080
+I think, though, before we move off this,
+
+00:19:18.680 --> 00:19:21.740
+while we're still on introducing the typing council,
+
+00:19:21.900 --> 00:19:24.560
+I think we should point out that there's two other folks
+
+00:19:24.590 --> 00:19:27.460
+who couldn't be here who are also on the typing council,
+
+00:19:27.730 --> 00:19:32.360
+Eric Traut and Jukka Lehtosalo.
+
+00:19:33.020 --> 00:19:34.360
+Sorry, Jukka.
+
+00:19:35.540 --> 00:19:39.120
+But I want to make sure that we point out there's actually five people, not just the three of you, right?
+
+00:19:41.420 --> 00:19:42.940
+How do you get on the council?
+
+00:19:43.800 --> 00:19:44.540
+Is there an election?
+
+00:19:44.940 --> 00:19:46.200
+Do you just apply?
+
+00:19:51.880 --> 00:19:53.400
+Vacancies are filled by the members themselves.
+
+00:19:53.680 --> 00:20:00.460
+So when somebody declares the intention to leave the council, we basically ask for people who are interested and then make a selection.
+
+00:20:01.840 --> 00:20:05.040
+Generally, we try to get people who have experience in the type system.
+
+00:20:05.840 --> 00:20:09.060
+We try to get a good cross-representation of people working on different type checkers.
+
+00:20:11.200 --> 00:20:17.040
+So we have Carl and Rebecca here who work on two type checkers, ty and Pyrefly.
+
+00:20:18.020 --> 00:20:19.380
+Jukka works on mypy.
+
+00:20:19.530 --> 00:20:22.920
+Eric Traut works on Pyright, which are two of the most widely used type checkers.
+
+00:20:25.580 --> 00:20:30.000
+Yeah, so we try to get wider representation of people working on those parts of the ecosystem.
+
+00:20:30.180 --> 00:20:31.400
+Yeah, that's really cool about it.
+
+00:20:31.740 --> 00:20:34.760
+You know, it's got a bias towards finding people actually doing the work.
+
+00:20:36.400 --> 00:20:41.020
+So let's talk about the specification project at typing.python.org.
+
+00:20:42.300 --> 00:20:43.880
+What is, what is this here?
+
+00:20:50.320 --> 00:20:51.520
+I mean, I can.
+
+00:20:51.980 --> 00:20:52.500
+Yeah, go for it.
+
+00:20:53.060 --> 00:20:58.879
+Talk a bit about, I guess it's a specification for how the type system
+
+00:20:59.320 --> 00:21:04.500
+is supposed to work. The way it started was that, Jelle, you basically took all the typing PEPs and like
+
+00:21:04.900 --> 00:21:12.240
+stapled them together, right, to make like one long doc. And since then, we've been iterating on it,
+
+00:21:12.620 --> 00:21:19.180
+filling in parts that were missing, like overload evaluation and making other changes as well.
+
+00:21:20.260 --> 00:21:28.219
+Yeah, it's tricky, right? Because traditionally, the typing system is kind of defined across a
+
+00:21:28.220 --> 00:21:29.480
+series of PEPs, right?
+
+00:21:30.940 --> 00:21:33.280
+And so what is the document that tells you how it works, right?
+
+00:21:34.800 --> 00:21:38.760
+Yeah, that is hard because often those PEPs build on top of each other.
+
+00:21:39.900 --> 00:21:45.320
+So then in the extreme, you might see like one thing in one PEP and then another PEP that
+
+00:21:45.420 --> 00:21:47.640
+adds an aspect of it, another one that adds another aspect.
+
+00:21:49.080 --> 00:21:50.500
+And overall it makes it very hard to follow.
+
+00:21:51.210 --> 00:21:54.340
+One of the things I did recently was rewrite the TypedDict spec.
+
+00:21:55.180 --> 00:22:01.860
+TypedDict is a feature of the Python type system that has been added to a lot from one PEP to another
+
+00:22:02.720 --> 00:22:07.280
+and I ended up rewriting the whole thing to basically put all those features together in
+
+00:22:07.740 --> 00:22:12.800
+a coherent whole rather than just having them all copy-pasted one after the other.
+
+00:22:14.220 --> 00:22:20.519
+Very nice. Okay, so if somebody really wants a good understanding of the Python type system,
+
+00:22:21.360 --> 00:22:23.160
+They go to typing.python.org.
+
+00:22:24.300 --> 00:22:26.440
+You know, one thing I think maybe is worth touching on,
+
+00:22:26.660 --> 00:22:30.100
+it's just kind of out of the blue a bit,
+
+00:22:30.260 --> 00:22:32.080
+but I think it's a really interesting aspect
+
+00:22:32.480 --> 00:22:35.260
+of the Python typing system is the,
+
+00:22:35.630 --> 00:22:37.940
+what is it called, the numerical tower or the number tower,
+
+00:22:39.080 --> 00:22:42.020
+where it's like, if I have a number,
+
+00:22:42.490 --> 00:22:44.360
+I could specify it as an int,
+
+00:22:44.840 --> 00:22:46.440
+or I could specify it as a float,
+
+00:22:47.320 --> 00:22:48.280
+and those kinds of things.
+
+00:22:49.760 --> 00:22:55.460
+But do you really need to say it's an int pipe float
+
+00:22:55.640 --> 00:22:58.340
+or a union of int and float if it could be either, right?
+
+00:22:58.460 --> 00:22:59.700
+And what is it called?
+
+00:22:59.880 --> 00:23:01.220
+It's the numerical tower, right?
+
+00:23:04.240 --> 00:23:05.660
+There are many different-- yeah, there
+
+00:23:05.660 --> 00:23:06.640
+are different towers too.
+
+00:23:06.800 --> 00:23:09.080
+In Python, there's also this thing called a numbers module
+
+00:23:09.360 --> 00:23:11.280
+that you have there.
+
+00:23:11.920 --> 00:23:13.820
+That's just basically ignored by the type system.
+
+00:23:15.160 --> 00:23:16.640
+I think it's--
+
+00:23:17.539 --> 00:23:18.160
+I don't know.
+
+00:23:19.000 --> 00:23:20.160
+It's been useful for some people.
+
+00:23:20.220 --> 00:23:23.860
+I feel like in general that module just hasn't worked out very well as being very useful.
+
+00:23:25.420 --> 00:23:29.860
+Yeah, I think the item, for some reason I'm failing to find this in real time.
+
+00:23:29.880 --> 00:23:35.400
+But I think the interesting aspect is that you can say it's a float,
+
+00:23:35.500 --> 00:23:41.000
+and that's basically equivalent to union of integer and float and so on, right?
+
+00:23:41.200 --> 00:23:44.320
+I think the typing of numbers in Python is pretty interesting.
+
+00:23:46.160 --> 00:23:52.640
+I think every type checker has a different interpretation of what a float annotation actually means.
+
+00:23:53.520 --> 00:23:54.100
+Oh, really?
+
+00:23:54.360 --> 00:23:57.660
+It's an area of some lack of clarity in the spec.
+
+00:23:59.300 --> 00:24:00.520
+Yeah, and a lot of contentiousness.
+
+00:24:01.600 --> 00:24:05.760
+If we could go back in time, I would...
+
+00:24:05.760 --> 00:24:09.460
+Knowing what I know now, I'd probably advocate for things being done differently.
+
+00:24:09.720 --> 00:24:16.120
+Because like in the beginning, you know, like there were multiple things like with similar flavor,
+
+00:24:16.360 --> 00:24:22.020
+like there was also one where you could put, give some, give a parameter a non-None annotation
+
+00:24:22.270 --> 00:24:24.420
+and default it to None for convenience.
+
+00:24:24.730 --> 00:24:28.820
+And we've largely like moved away from stuff like that in favor of explicitness.
+
+00:24:29.600 --> 00:24:33.460
+Yeah, what the current spec says is that basically if you have a function that takes a float,
+
+00:24:34.040 --> 00:24:35.540
+you're also allowed to pass an int.
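What that special case means in practice (a small sketch; the function is invented): annotating a parameter as float implicitly accepts int too, so you rarely need to spell out int | float.

```python
def scale(value: float, factor: float) -> float:
    # Per the typing spec's numeric special case, a parameter annotated
    # as float also accepts int arguments; no int | float union needed.
    return value * factor

print(scale(2, 3))      # int arguments type-check fine where float is expected
print(scale(2.5, 4.0))
```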
+
+00:24:36.560 --> 00:24:39.160
+But that's not really enough.
+
+00:24:39.400 --> 00:24:41.940
+It doesn't tell you how these things work in all cases.
+
+00:24:43.640 --> 00:24:47.560
+And we've had some attempts to try to come up with a way
+
+00:24:47.780 --> 00:24:52.540
+to specify that special case in a way that makes more sense,
+
+00:24:52.880 --> 00:24:53.840
+at least makes more sense to me.
+
+00:24:54.900 --> 00:24:55.820
+It's been very contentious.
+
+00:24:56.060 --> 00:24:57.700
+People have very strong opinions about this.
+
+00:24:59.200 --> 00:24:59.640
+Yeah.
+
+00:25:02.300 --> 00:25:03.760
+It is a little weird, isn't it?
+
+00:25:04.420 --> 00:25:05.020
+It's quite weird.
+
+00:25:05.320 --> 00:25:07.860
+I guess non-obvious is what I'd like to say, really,
+
+00:25:08.200 --> 00:25:19.360
+honestly. So I'd like to get the official council's thoughts on this. When, when is too much,
+
+00:25:19.600 --> 00:25:25.960
+when is typing too much typing, right? I made the joke about C++ ATL. If you've ever worked with
+
+00:25:26.080 --> 00:25:31.520
+that, it's like a parameter, you know, a template class where templated classes are part of the
+
+00:25:31.700 --> 00:25:36.319
+concrete type of the template. Like it's just off the hook. There's certainly places where
+
+00:25:36.800 --> 00:25:45.800
+typing can be too much. And a lot of the purity of Python or the readability of Python is the fact
+
+00:25:45.940 --> 00:25:52.380
+that it's got so few symbols. And so adding types adds context, but it also takes away,
+
+00:25:53.080 --> 00:25:58.560
+you know, makes it a little harder to read. When is too much typing? When do you recommend typing?
+
+00:25:58.860 --> 00:26:03.640
+You know, maybe Rebecca, I'll let you go first, but like, what are your thoughts in the sort of,
+
+00:26:03.860 --> 00:26:06.120
+How much typing should I use in Python?
+
+00:26:10.200 --> 00:26:14.600
+I'll give you what is sort of my official stance,
+
+00:26:14.720 --> 00:26:17.240
+which is that if you want your type checker to work well,
+
+00:26:17.900 --> 00:26:20.700
+you should type annotate your API boundaries.
+
+00:26:21.320 --> 00:26:23.520
+So like parameters and returns in public functions,
+
+00:26:24.040 --> 00:26:26.460
+public class attributes, things like that.
+
+00:26:26.720 --> 00:26:28.680
+And like even things that seem trivial,
+
+00:26:28.900 --> 00:26:31.500
+like, oh, this function returns None,
+
+00:26:32.180 --> 00:26:37.500
+better to annotate it because you know someone else might be depending on your library and
+
+00:26:37.780 --> 00:26:46.400
+consuming that type information. I will say personally what I tend to do is like I annotate
+
+00:26:46.460 --> 00:26:53.980
+things that I think are non-trivial because I want to see that as documentation. And if something
+
+00:26:54.640 --> 00:26:59.119
+like, you know, a function that does return None, to be honest, I will probably forget to annotate it
+
+00:26:59.180 --> 00:27:03.240
+half the time, because I'll be like, I honestly don't need to see it. So
+
+00:27:06.580 --> 00:27:14.300
+Yeah, you know, one of the interesting features of the Pyrefly VS Code extension, that's the only one
+
+00:27:14.340 --> 00:27:18.860
+I can speak of at the moment, and Carl, you've got to tell me if the ty one does this as well,
+
+00:27:19.480 --> 00:27:27.780
+is it will sort of overlay its belief of what types are. Like if there's, you say x equals
+
+00:27:27.780 --> 00:27:30.060
+a function return value and it knows what the function returns,
+
+00:27:30.260 --> 00:27:33.540
+it'll have a gray like colon int if it returned an int or something.
+
+00:27:33.940 --> 00:27:37.980
+So you can read the code and see what the types are
+
+00:27:38.160 --> 00:27:40.980
+without actually putting it into the text of the code.
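Rebecca's guideline, annotating parameters and returns at API boundaries, even a "trivial" None return, might look like this (function names and data invented for illustration):

```python
# Public API boundary: parameters and return type fully annotated,
# including the "trivial" None return, so callers' type checkers
# can rely on them.
def save_user(name: str, age: int) -> None:
    ...

def load_users(path: str) -> list[dict[str, str]]:
    # Locals inside the body usually don't need annotations;
    # the type checker infers this as list[dict[str, str]].
    users = [{"name": "sam"}]
    return users

print(load_users("users.csv"))
```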
+
+00:27:41.160 --> 00:27:42.100
+It's only within the editor.
+
+00:27:43.400 --> 00:27:44.580
+Does ty do something like that, Carl?
+
+00:27:45.440 --> 00:27:47.360
+Yes, we also have inlay type hints.
+
+00:27:47.660 --> 00:27:48.500
+Yeah, inlay type hints.
+
+00:27:48.560 --> 00:27:49.000
+That's what it's called.
+
+00:27:49.240 --> 00:27:51.040
+So yeah, I don't know.
+
+00:27:51.140 --> 00:27:53.620
+That also brings an interesting challenge-- not a challenge,
+
+00:27:53.780 --> 00:27:56.980
+like a wrinkle to the recommendation of,
+
+00:27:59.470 --> 00:28:02.220
+should I put types on like the return value
+
+00:28:02.350 --> 00:28:03.960
+because I want to know that's a list of user,
+
+00:28:04.180 --> 00:28:06.900
+not a list of user IDs or whatever, for example,
+
+00:28:07.080 --> 00:28:08.100
+like a list of UUID.
+
+00:28:09.920 --> 00:28:12.680
+But if it's going to show up anyway in the editor,
+
+00:28:13.460 --> 00:28:15.020
+maybe I don't have to write that, right?
+
+00:28:15.070 --> 00:28:18.560
+And so that becomes sort of somewhere
+
+00:28:18.610 --> 00:28:20.200
+where you could debate again, I think.
+
+00:28:20.640 --> 00:28:23.360
+However, I do 100% agree with you, Rebecca,
+
+00:28:23.420 --> 00:28:25.260
+that put it on your API boundaries.
+
+00:28:25.540 --> 00:28:30.260
+If this is the place that people get into some part of your code
+
+00:28:30.420 --> 00:28:33.020
+and they don't know or want to know about the inside of it,
+
+00:28:33.820 --> 00:28:36.580
+having types there is really helpful both for editors,
+
+00:28:37.660 --> 00:28:39.760
+for type checkers, and just for reading code,
+
+00:28:39.940 --> 00:28:42.120
+and even for AI, which is a crazy world.
+
+00:28:43.100 --> 00:28:43.180
+Yeah.
+
+00:28:43.630 --> 00:28:45.240
+Carl, what are your thoughts here?
+
+00:28:45.320 --> 00:28:46.540
+How much typing is too much typing?
+
+00:28:47.160 --> 00:28:48.460
+What's the guidelines here?
+ +00:28:49.600 --> 00:28:52.840 +I think I agree with Rebecca's answer. + +00:28:52.980 --> 00:29:02.260 +I mean, that one place you definitely want to have explicit type annotations is that API boundaries, public API of a library, et cetera. + +00:29:03.260 --> 00:29:22.460 +In terms of what's too much typing, I mean, there's certainly patterns that have historically been used in Python that we still can't express well in the type system or that require extremely complex type annotations to express well. + +00:29:22.540 --> 00:29:25.760 +And I think there it becomes a judgment call. + +00:29:27.380 --> 00:29:31.300 +If it's like a core widely used API, + +00:29:31.790 --> 00:29:33.340 +you may get a lot of benefit + +00:29:33.960 --> 00:29:36.360 +from some very complex and verbose annotations. + +00:29:37.280 --> 00:29:40.320 +And so then it's worth sort of going through that pain + +00:29:40.480 --> 00:29:43.400 +and the pain of adding them and of reading them + +00:29:43.610 --> 00:29:46.320 +in order to get that additional typing coverage + +00:29:46.600 --> 00:29:47.680 +everywhere you use that API. + +00:29:48.860 --> 00:29:51.579 +If it's much less frequently used code + +00:29:51.600 --> 00:29:54.660 +that's highly dynamic, maybe it's not worth it in that case. + +00:29:55.290 --> 00:29:56.740 +I think there's a lot of judgment calls here. + +00:29:57.180 --> 00:29:59.600 +What about one-off scripts? + +00:30:01.260 --> 00:30:04.080 +I'm going to write this thing to just move this data from here to there. + +00:30:04.220 --> 00:30:05.520 +Once it's moved, I don't need it again. + +00:30:06.920 --> 00:30:07.900 +We're done with that old system. + +00:30:07.990 --> 00:30:08.860 +We're going to the new one. + +00:30:09.900 --> 00:30:10.580 +Maybe less typing. + +00:30:12.680 --> 00:30:14.440 +Yeah, I think that's what's useful for you. + +00:30:15.440 --> 00:30:17.640 +Often I feel like one-off scripts are not really one-off. 
+
+00:30:17.780 --> 00:30:19.680
+Maybe you want to move some similar data later,
+
+00:30:20.000 --> 00:30:22.080
+And then it's useful if you can understand your code again,
+
+00:30:22.230 --> 00:30:23.880
+if you want to read what you did.
+
+00:30:24.600 --> 00:30:26.020
+You thought you didn't need it again.
+
+00:30:26.180 --> 00:30:27.660
+And all of a sudden, it's six months old,
+
+00:30:27.710 --> 00:30:28.620
+you don't understand it.
+
+00:30:29.310 --> 00:30:30.560
+And the types would help a lot, right?
+
+00:30:31.420 --> 00:30:31.480
+Yeah.
+
+00:30:32.180 --> 00:30:32.360
+Yeah.
+
+00:30:33.120 --> 00:30:35.020
+Jelle, what's your advice?
+
+00:30:35.640 --> 00:30:38.140
+Yeah, I think what Carl and Rebecca said
+
+00:30:38.340 --> 00:30:38.980
+makes sense to me too.
+
+00:30:39.930 --> 00:30:44.380
+I think types have advantages in terms of documenting for human
+
+00:30:44.560 --> 00:30:48.279
+readers, what is going on, and in terms of catching mistakes
+
+00:30:48.300 --> 00:30:50.700
+that otherwise would not be caught until runtime, perhaps.
+
+00:30:51.660 --> 00:30:53.840
+They have costs in maybe making your code harder to read
+
+00:30:53.890 --> 00:30:55.020
+if there's too much going on.
+
+00:30:56.020 --> 00:30:58.760
+So add types as long as those benefits outweigh the costs.
+
+00:31:01.080 --> 00:31:01.600
+Yeah.
+
+00:31:02.210 --> 00:31:07.980
+I mean, do you recommend to anyone that they just 100% go full
+
+00:31:08.370 --> 00:31:12.540
+like C++, C# on it and just type every single thing?
+
+00:31:13.720 --> 00:31:16.580
+Is there an advantage for static type checkers,
+
+00:31:17.140 --> 00:31:20.740
+like mypy type stuff you can run across and get that.
+
+00:31:20.740 --> 00:31:23.200
+I mean, you could do that with Pyrefly or ty in the CLI
+
+00:31:23.200 --> 00:31:23.540
+as well.
+
+00:31:23.720 --> 00:31:27.360
+But thinking more mypy is like kind of being real strict
+
+00:31:27.520 --> 00:31:28.200
+on some of that stuff.
+
+00:31:29.220 --> 00:31:32.000
+Personally, I do tend to annotate almost all function
+
+00:31:32.260 --> 00:31:35.440
+parameters, and class attributes if I make a class.
+
+00:31:37.900 --> 00:31:39.340
+Sometimes it's not as necessary.
+
+00:31:39.640 --> 00:31:41.900
+Like you don't need to annotate your tests, perhaps,
+
+00:31:42.200 --> 00:31:44.240
+or you don't need to annotate internal functions as much.
+
+00:31:45.520 --> 00:31:48.460
+But for my own coding, I usually find it helpful to do that.
+
+00:31:50.680 --> 00:31:53.520
+But sometimes I see people annotating even local variables
+
+00:31:53.860 --> 00:31:55.920
+where it's very obvious to the type checker what the type is,
+
+00:31:56.220 --> 00:31:57.420
+and they can just infer it reliably.
+
+00:31:58.190 --> 00:31:59.960
+And then it really just adds noise,
+
+00:32:00.180 --> 00:32:01.000
+and you shouldn't do it.
+
+00:32:02.140 --> 00:32:02.840
+Yeah, exactly.
+
+00:32:02.980 --> 00:32:05.380
+If you've got a function that's annotated with a return value,
+
+00:32:05.430 --> 00:32:07.760
+and you say x equals the function call,
+
+00:32:08.380 --> 00:32:09.900
+then the type checkers can infer that.
+
+00:32:10.100 --> 00:32:14.880
+And you're just causing extra noise, I guess.
+
+00:32:17.920 --> 00:32:21.420
+So suppose you all want to change something.
+
+00:32:22.820 --> 00:32:25.860
+What's the process of actually going through and making some changes?
+
+00:32:30.020 --> 00:32:32.500
+Yeah, there's mostly sort of two levels of this.
+
+00:32:34.460 --> 00:32:35.600
+Well, maybe there's even three levels.
+
+00:32:35.740 --> 00:32:40.160
+The first one is if it's something that's so small that's just like a wording clarification or something,
+
+00:32:40.160 --> 00:32:45.140
+we just make a PR to the repo and a few of us look at it and we change it.
+
+00:32:45.920 --> 00:32:51.680
+The second level is when it's sort of a smaller change that doesn't really introduce a new
+
+00:32:51.880 --> 00:32:58.600
+feature and then we make a PR to the typing spec repo and we formally have all of us sign
+
+00:32:58.600 --> 00:32:59.020
+off on it.
+
+00:32:59.900 --> 00:33:04.480
+That's what happens like with what Carl mentioned earlier of the Final change in data classes.
+
+00:33:06.660 --> 00:33:07.100
+Nice.
+
+00:33:10.720 --> 00:33:12.520
+I just had to merge this one.
+
+00:33:14.980 --> 00:33:15.780
+I love it.
+
+00:33:16.260 --> 00:33:19.140
+Yeah, I guess this repo itself doesn't have anything.
+
+00:33:20.740 --> 00:33:24.080
+It's the Python typing repo where the decisions are made.
+
+00:33:24.840 --> 00:33:28.780
+The typing council just has some documentation.
+
+00:33:29.880 --> 00:33:30.340
+Maybe this one.
+
+00:33:30.350 --> 00:33:30.480
+Yeah.
+
+00:33:30.990 --> 00:33:32.200
+And then the third level is PEPs.
+
+00:33:32.270 --> 00:33:33.540
+Like really big new changes.
+
+00:33:33.650 --> 00:33:34.700
+You can still write a PEP.
+
+00:33:35.420 --> 00:33:38.780
+And then we make a recommendation to the steering council so it makes a decision eventually.
+
+00:33:39.800 --> 00:33:40.140
+Okay.
+
+00:33:42.840 --> 00:33:50.240
+So if I wanted to suggest something, I could come up here and I could open up an issue,
+
+00:33:50.500 --> 00:33:55.080
+maybe start a conversation on typing, python/typing.
+
+00:33:56.120 --> 00:33:58.300
+And you can make a pull request to change the spec.
+
+00:33:59.560 --> 00:33:59.660
+Okay.
+
+00:33:59.920 --> 00:34:07.600
+And so the pull request would not be to change the code, like how Python maybe interprets code that
+
+00:34:07.740 --> 00:34:12.899
+has this new thing, but to suggest that the spec has it, which then would start a process that
+
+00:34:13.260 --> 00:34:18.480
+ultimately might make CPython understand it, right? Well, CPython itself probably doesn't do anything
+
+00:34:18.620 --> 00:34:25.020
+with it. Um, I guess most of the things that go directly here are changes to how to interpret
+
+00:34:25.040 --> 00:34:26.780
+the things that are already in CPython.
+
+00:34:27.659 --> 00:34:29.659
+If it's adding something new, it will usually
+
+00:34:29.899 --> 00:34:32.440
+need to go through a PEP, except if it's something very small.
+
+00:34:34.380 --> 00:34:37.200
+Yeah, I guess maybe let's talk about that for a minute.
+
+00:34:37.300 --> 00:34:42.340
+We got two representatives here of the newer breed of tools.
+
+00:34:44.020 --> 00:34:48.780
+What's the story for inconsistencies
+
+00:34:50.100 --> 00:34:51.760
+across interpretations of the spec?
+
+00:34:51.919 --> 00:34:53.280
+I know that there are slight variations.
+
+00:34:54.520 --> 00:34:58.480
+I've also, you know, not putting either of you on the spot,
+
+00:34:58.760 --> 00:35:04.320
+but like using, say, PyCharm and like writing code
+
+00:35:04.320 --> 00:35:09.120
+so its type checker is happy, and then using something like Pyright.
+
+00:35:09.560 --> 00:35:14.140
+And so it has a real different interpretation of what you should let slide
+
+00:35:14.140 --> 00:35:14.760
+and what you shouldn't.
+
+00:35:14.780 --> 00:35:23.040
+I feel like Pyright is much more focused on like enforcing the nullability
+
+00:35:23.040 --> 00:35:27.720
+or the lack thereof, and it warns of inconsistencies there
+
+00:35:27.900 --> 00:35:29.460
+where PyCharm doesn't seem to care as much.
+
+00:35:30.060 --> 00:35:32.120
+I don't know which one I like better, but I know they're different.
+
+00:35:32.400 --> 00:35:34.160
+And if I write code in one, then I open the other.
+
+00:35:34.230 --> 00:35:35.680
+I'm like, huh, why is it upset?
+
+00:35:35.920 --> 00:35:37.040
+It seemed like it was fine.
+
+00:35:40.020 --> 00:35:40.960
+How do you all navigate this?
+
+00:35:41.240 --> 00:35:41.300
+Yeah.
+
+00:35:41.740 --> 00:35:45.260
+I think one thing useful to say about the spec there
+
+00:35:45.290 --> 00:35:48.000
+is that the spec covers a lot of things.
+
+00:35:48.250 --> 00:35:50.440
+In particular, it tends to cover sort of the details
+
+00:35:50.760 --> 00:35:52.500
+of more advanced type system features.
+
+00:35:53.420 --> 00:35:58.940
+But there's a lot of very fundamental stuff about how a type checker works in terms of
+
+00:35:59.620 --> 00:36:04.840
+how it does inference and how it does type narrowing. And even in some cases, like you
+
+00:36:04.960 --> 00:36:10.240
+mentioned, what it chooses to emit errors on, that isn't really covered by the spec,
+
+00:36:11.320 --> 00:36:18.540
+partly maybe because we haven't gotten to it, and also partly intentionally, in that there may be
+
+00:36:18.860 --> 00:36:22.640
+room in some of those cases for different type checkers to work differently if they're serving
+
+00:36:22.660 --> 00:36:29.400
+different needs. Like if PyCharm is primarily concerned about being a useful kind of IDE and providing
+
+00:36:29.580 --> 00:36:36.160
+go-to definition and that sort of thing, maybe emitting lots of warnings or errors and all kinds
+
+00:36:36.160 --> 00:36:41.260
+of things where your code might be doing something wrong isn't as high a priority, and another type
+
+00:36:41.420 --> 00:36:50.640
+checker might have a different priority. One thing I want to mention is that it may not
+
+00:36:50.660 --> 00:36:58.280
+seem like it, but things are already much better than they used to be. 
Like previously I worked
+
+00:36:58.660 --> 00:37:05.000
+on a different type checker called pytype. And at that time it was, you know, sort of the wild west.
+
+00:37:05.130 --> 00:37:10.480
+Like we want to know how other type checkers like do something. Well, you know, like open up the mypy
+
+00:37:10.560 --> 00:37:17.539
+playground, open up the Pyright playground, see what it tells you. Now we at least have the spec and
+
+00:37:17.560 --> 00:37:27.740
+conformance tests. Yeah, that's really cool. How much would you say that your two type checkers,
+
+00:37:28.380 --> 00:37:34.980
+maybe bring in mypy as well, how much do they agree versus disagree? You only see the differences.
+
+00:37:35.100 --> 00:37:39.400
+You don't see in which ways that they are the same as a consumer of them so much, right? You're
+
+00:37:39.480 --> 00:37:45.920
+like, why is this one squiggly when it wasn't squiggly before? But how similar or different are they?
+
+00:37:48.280 --> 00:37:53.060
+I don't know how we would quantify that there. I think there's a lot that is the same just because
+
+00:37:53.820 --> 00:37:59.380
+it's based on how Python actually works. We're both trying to model the same language. And then
+
+00:37:59.440 --> 00:38:03.020
+there's certainly also plenty of differences or things that we handle differently. So
+
+00:38:04.880 --> 00:38:15.900
+it's hard to quantify that. Yeah. I agree. It's hard to quantify. I suppose I can talk a bit abstractly about
+
+00:38:15.920 --> 00:38:20.320
+various type checkers' philosophies. Like with Pyrefly,
+
+00:38:20.380 --> 00:38:23.100
+you know, like we really try to do a lot of type inference.
+
+00:38:23.400 --> 00:38:24.840
+So in that way in which, you know,
+
+00:38:24.940 --> 00:38:27.400
+like we intentionally diverge a bit from mypy,
+
+00:38:27.640 --> 00:38:30.200
+but like other than like that deliberate decision,
+
+00:38:30.440 --> 00:38:32.660
+if we see ways in which we are like accidentally different,
+
+00:38:32.940 --> 00:38:37.000
+you know, we do try to fix that because otherwise people
+
+00:38:37.060 --> 00:38:39.440
+would have a hard time like running multiple type checkers
+
+00:38:39.520 --> 00:38:40.000
+or migrating.
+
+00:38:41.180 --> 00:38:45.200
+- Yeah, differences obviously cause pain for users
+
+00:38:45.220 --> 00:38:46.520
+who are using multiple type checkers
+
+00:38:46.680 --> 00:38:48.240
+or writing libraries that need to support
+
+00:38:48.480 --> 00:38:49.260
+multiple type checkers.
+
+00:38:49.400 --> 00:38:51.020
+So like Rebecca said, it's like,
+
+00:38:51.560 --> 00:38:53.780
+if we are different from other type checkers,
+
+00:38:53.780 --> 00:38:55.160
+we wanna be sure that there's a good reason
+
+00:38:55.800 --> 00:38:56.280
+for that difference.
+
+00:39:02.400 --> 00:39:03.640
+Oh, lost your audio, Michael.
+
+00:39:05.120 --> 00:39:05.760
+- Right, sorry.
+
+00:39:06.620 --> 00:39:11.100
+The difference should be because of philosophical choice,
+
+00:39:11.420 --> 00:39:14.800
+not just you happen to have chosen slightly differently, right?
+
+00:39:18.520 --> 00:39:22.560
+Yeah, and it's not just people who run different type checkers.
+
+00:39:23.640 --> 00:39:26.340
+Like you pointed out, Carl, a lot of times it is if I have a library
+
+00:39:27.660 --> 00:39:30.500
+and then different people want to consume that library,
+
+00:39:30.620 --> 00:39:33.160
+then their type checker may or may not warn them
+
+00:39:33.400 --> 00:39:36.640
+about how my library declares its types and so on.
+
+00:39:37.560 --> 00:39:39.520
+I'll give you a real quick example.
+
+00:39:41.080 --> 00:39:43.800
+I have a, I can't remember which one it was,
+
+00:39:43.880 --> 00:39:50.960
+I have three or four different open source libraries that I've created that somehow work with creating,
+
+00:39:51.980 --> 00:39:55.780
+basically passing data to templates in web apps.
+
+00:39:56.420 --> 00:40:02.220
+So one is like, I want to use the Chameleon web template framework, but with FastAPI or with Flask.
+
+00:40:02.380 --> 00:40:04.680
+And there's some other variations like partials and so on.
+
+00:40:04.920 --> 00:40:06.900
+I can't remember which one, but it doesn't really matter.
+
+00:40:07.000 --> 00:40:10.000
+One of them decorated a Flask,
+
+00:40:10.520 --> 00:40:12.900
+I think it was Flask, but that's basically irrelevant,
+
+00:40:13.040 --> 00:40:21.220
+a Flask endpoint, and Pyright was really upset. Like, the error message filled the entire page of how it
+
+00:40:21.280 --> 00:40:27.200
+was inconsistent with what it expected for the definition of the Flask view method. I'm like,
+
+00:40:27.620 --> 00:40:33.500
+no one is going to call this. Like, what does it even matter what this type is? Like, it still runs
+
+00:40:33.740 --> 00:40:40.020
+fine. The runtime is fine. You know, there's no problem with this decorator. It worked fine. But something
+
+00:40:40.040 --> 00:40:47.360
+about the way that the Flask @get returned the type versus what my thing returned varied in like
+
+00:40:47.360 --> 00:40:54.560
+a really slight way. I didn't care. But somebody was using some editor that used Pyright and they're
+
+00:40:54.620 --> 00:40:58.880
+like, you have to help fix this. I can't take all these warnings that are huge and they're everywhere.
+
+00:40:59.120 --> 00:41:06.560
+Like, okay, I'll go fix it. Right. 
And I went and I put way more effort than was justified into a
+
+00:41:06.560 --> 00:41:12.300
+function type that no one ever calls just to make the errors on some type checker I didn't use
+
+00:41:12.900 --> 00:41:16.220
+go away, right? And that's the kind of thing where it becomes just a headache.
+
+00:41:19.580 --> 00:41:24.000
+I don't know. I wish I remembered. I probably got that written down in an issue somebody filed,
+
+00:41:24.180 --> 00:41:28.880
+but it was a gnarly error. Or if you're working on an open source project, you know,
+
+00:41:28.930 --> 00:41:33.840
+you can't make everybody who wants to contribute on a big project use the same editor.
+
+00:41:34.380 --> 00:41:36.200
+And so you might run into this variation as well.
+
+00:41:36.260 --> 00:41:37.060
+So there's a lot of cases.
+
+00:41:40.220 --> 00:41:43.100
+Yeah, it can be really difficult to make these decisions
+
+00:41:43.440 --> 00:41:49.660
+about what sorts of errors people want their type checker
+
+00:41:49.660 --> 00:41:51.620
+to catch or what's too pedantic.
+
+00:41:53.380 --> 00:41:55.080
+Because you want your type checker
+
+00:41:55.080 --> 00:41:59.120
+to catch non-obvious errors, not just the obvious ones
+
+00:41:59.180 --> 00:42:00.800
+that you probably would have seen by looking at the code
+
+00:42:00.900 --> 00:42:01.020
+yourself.
+
+00:42:03.780 --> 00:42:06.120
+But then there'll be cases where somebody says,
+
+00:42:06.120 --> 00:42:07.340
+well, I don't care, that's too pedantic.
+
+00:42:07.560 --> 00:42:10.120
+And it is difficult to make everyone happy.
+
+00:42:11.060 --> 00:42:11.700
+- Yeah, exactly.
+
+00:42:12.760 --> 00:42:13.960
+When I was seeing it, I'm like, well,
+
+00:42:15.520 --> 00:42:18.480
+who decides what the right signature
+
+00:42:18.720 --> 00:42:21.440
+of a Flask view endpoint should be, like?
+
+00:42:21.620 --> 00:42:24.500
+If the framework can call it, it should be okay.
+
+00:42:24.920 --> 00:42:26.900
+There's not, just 'cause it had a decorator before,
+
+00:42:26.960 --> 00:42:28.820
+that doesn't mean that's the official structure.
+
+00:42:28.980 --> 00:42:29.300
+But anyway,
+
+00:42:32.120 --> 00:42:34.800
+I do think one of the bigger philosophical differences
+
+00:42:35.160 --> 00:42:39.560
+has to do around this concept of nullability.
+
+00:42:39.910 --> 00:42:42.320
+Do you guys call it nullability or noneability?
+
+00:42:42.600 --> 00:42:44.820
+Like, nullability comes from the other languages.
+
+00:42:44.950 --> 00:42:48.340
+And by that, I mean I can specify that I have an integer.
+
+00:42:48.770 --> 00:42:52.280
+And in the Python type system, it cannot be set to none,
+
+00:42:52.470 --> 00:42:53.580
+even though in the runtime it can.
+
+00:42:54.220 --> 00:42:57.560
+It has to be a concrete int type, unless you
+
+00:42:57.680 --> 00:43:01.240
+make it an optional int or an int | None,
+
+00:43:01.560 --> 00:43:02.880
+or one of those type things, right?
+
+00:43:03.120 --> 00:43:05.460
+And how strong that gets enforced
+
+00:43:05.760 --> 00:43:09.000
+seems to be one of the biggest differences of opinion
+
+00:43:09.700 --> 00:43:10.520
+that I've seen around.
+
+00:43:10.900 --> 00:43:11.980
+How do you all think about that?
+
+00:43:15.200 --> 00:43:17.620
+- That's interesting to me that that's your experience
+
+00:43:17.900 --> 00:43:20.640
+because my experience has been that that's actually an area
+
+00:43:20.710 --> 00:43:24.060
+where everyone seems to agree as far as I can tell
+
+00:43:24.660 --> 00:43:26.080
+that these are- - Interesting.
+
+00:43:26.080 --> 00:43:27.420
+- An important source of bugs
+
+00:43:27.420 --> 00:43:28.500
+and it's better to catch them.
+
+00:43:28.500 --> 00:43:29.460
+So I think all of the type checkers,
+
+00:43:30.080 --> 00:43:31.500
+maybe you said PyCharm doesn't?
+
+00:43:31.640 --> 00:43:33.640
+I don't think PyCharm does that.
+
+00:43:33.840 --> 00:43:38.740
+I'm pretty sure it doesn't because I
+
+00:43:38.980 --> 00:43:40.840
+agree that it's an important thing to check,
+
+00:43:41.940 --> 00:43:44.540
+but it's also a point of a lot of friction.
+
+00:43:45.580 --> 00:43:48.400
+And by that, I mean, let's suppose
+
+00:43:48.700 --> 00:43:52.740
+I'm going to have a class that I need to create an instance of
+
+00:43:53.060 --> 00:43:54.600
+and then put values into.
+
+00:43:55.380 --> 00:43:57.540
+And I know once I put the values into it,
+
+00:43:57.560 --> 00:43:58.860
+let's say it has a user ID.
+
+00:43:59.900 --> 00:44:02.840
+I know for certain that that's going to be an integer.
+
+00:44:03.560 --> 00:44:08.400
+So I'd like to say user_id: int because everywhere I use that object later,
+
+00:44:09.440 --> 00:44:13.740
+if it's a function that takes an int and I specify it as optional int,
+
+00:44:15.060 --> 00:44:19.540
+I will get a type check warning at every single call site when I try to pass that.
+
+00:44:20.060 --> 00:44:26.460
+But I know from the semantics of the behavior that it's going to always be an int
+
+00:44:27.320 --> 00:44:28.760
+unless it's not initialized,
+
+00:44:29.500 --> 00:44:32.220
+like in this short period where I want to create it.
+
+00:44:32.270 --> 00:44:33.800
+So I can't set the type to int.
+
+00:44:33.810 --> 00:44:35.960
+I have to set it to optional int until I've loaded it.
+
+00:44:36.870 --> 00:44:38.500
+But there's like this, I don't know,
+
+00:44:38.620 --> 00:44:41.380
+that's the part where I see a lot of it show up
+
+00:44:42.319 --> 00:44:46.320
+is inconsistencies and then warnings all over the place.
+
+00:44:46.720 --> 00:44:49.060
+So I'm like, well, but that function is actually checking
+
+00:44:49.130 --> 00:44:52.240
+if it's none and it'll return null, you know, none
+
+00:44:52.410 --> 00:44:53.120
+or something like that.
+
+00:44:53.280 --> 00:44:54.460
+So I don't know.
+
+00:44:54.920 --> 00:44:56.200
+I totally agree with you.
+
+00:44:56.480 --> 00:45:02.540
+It's just somewhere I've seen the most inconsistencies across maybe PyCharm versus others.
+
+00:45:03.940 --> 00:45:09.640
+mypy also has a legacy mode for not checking none things called non-strict optional.
+
+00:45:10.860 --> 00:45:15.700
+We are trying to get rid of that from mypy because strict optional,
+
+00:45:16.000 --> 00:45:18.000
+like being strict about it, is the more sensible thing to do.
+
+00:45:18.860 --> 00:45:19.000
+Sure.
+
+00:45:19.180 --> 00:45:21.120
+But it's possible that you've seen that too.
+
+00:45:21.580 --> 00:45:22.060
+Yeah, I agree.
+
+00:45:22.320 --> 00:45:29.200
+So what you mentioned is maybe sort of a special case of the case where you pass something to a class and there's initialization that changes the types.
+
+00:45:30.120 --> 00:45:31.540
+It doesn't necessarily have to deal with none.
+
+00:45:31.570 --> 00:45:34.480
+It could also just be like the attribute doesn't exist at all beforehand or something.
+
+00:45:35.220 --> 00:45:35.360
+Yeah.
+
+00:45:35.500 --> 00:45:35.720
+Yeah.
+
+00:45:36.040 --> 00:45:37.660
+We don't have a good solution for that.
+
+00:45:37.820 --> 00:45:41.400
+Maybe there's room for something to support that use case better.
+
+00:45:41.720 --> 00:45:42.880
+I don't know what it would look like.
+
+00:45:44.200 --> 00:45:44.980
+Yeah, I don't know.
+
+00:45:45.240 --> 00:45:54.320
+In some cases, these things can sometimes nudge you towards a different design that is actually safer and will avoid errors.
+
+00:45:54.540 --> 00:46:03.640
+Like in the kind of case you're talking about, you know, is it actually necessary that an uninitialized object and an initialized one are represented by the same type?
+
+00:46:04.400 --> 00:46:11.660
+Or is there a way to adjust the API so that those are actually different types, and then you solve the problem and your code is safer?
+
+00:46:11.760 --> 00:46:16.620
+So yeah, no, I totally hear you.
+
+00:46:16.730 --> 00:46:21.560
+You know, I'm thinking like you submit a web form and before you,
+
+00:46:22.000 --> 00:46:22.820
+before you parse it,
+
+00:46:22.860 --> 00:46:24.480
+you've got to create the instance to set the values.
+
+00:46:26.300 --> 00:46:28.800
+And I don't know, it's, it's not worth diving into,
+
+00:46:28.880 --> 00:46:33.780
+but I do find this differentiation between like the strict enforcement of none
+
+00:46:34.030 --> 00:46:36.000
+versus not none. I think it's powerful.
+
+00:46:36.050 --> 00:46:39.020
+And I do think you all are right that it does catch a lot of errors. It's just,
+
+00:46:39.120 --> 00:46:42.700
+it's just a difference, and it's just an interesting choice.
+
+00:46:43.100 --> 00:46:46.240
+But I didn't get a concrete answer from the official council.
+
+00:46:47.160 --> 00:46:48.440
+Nullable or noneable?
+
+00:46:49.180 --> 00:46:49.540
+What is it?
+
+00:46:51.140 --> 00:46:53.200
+I feel like you just don't really even talk about it
+
+00:46:53.220 --> 00:46:53.980
+as a term, mostly.
+
+00:46:55.860 --> 00:46:59.260
+None is special in the type system in how you represent it,
+
+00:46:59.440 --> 00:47:02.520
+but it's not really special in other ways.
+
+00:47:04.180 --> 00:47:06.720
+So you don't have a term for int | None?
+
+00:47:08.120 --> 00:47:13.780
+Int or None. Historically, the term was optional, although I think that term has problems and
+
+00:47:14.540 --> 00:47:20.620
+we're sort of moving away from it, because specifically one problem is that optional can
+
+00:47:20.840 --> 00:47:26.960
+mean you don't have to pass it in, like I say, as a function parameter. Right. And that's not
+
+00:47:27.400 --> 00:47:32.240
+the intention. Yeah. It has two totally different meanings there. Interesting.
+
+00:47:34.580 --> 00:47:37.680
+Let's talk a little bit about TypeShed.
+
+00:47:38.110 --> 00:47:39.840
+I think TypeShed is pretty interesting.
+
+00:47:40.020 --> 00:47:42.980
+Maybe people don't know too much about it.
+
+00:47:43.520 --> 00:47:46.240
+So I'm sure you all are familiar with this project,
+
+00:47:46.350 --> 00:47:50.220
+that you can basically add type information
+
+00:47:50.500 --> 00:47:53.180
+that the libraries didn't bother to include for you, right?
+
+00:47:57.240 --> 00:47:58.600
+What are your thoughts on TypeShed?
+
+00:47:58.800 --> 00:48:03.900
+How much do you all lean on this to round out
+
+00:48:03.920 --> 00:48:04.580
+missing types?
+
+00:48:07.660 --> 00:48:11.720
+I mean, there are two parts of TypeShed, right?
+
+00:48:11.980 --> 00:48:18.380
+There's the standard library type stubs, which I think are invaluable, like all the type
+
+00:48:19.300 --> 00:48:20.100
+checkers use those.
+
+00:48:20.540 --> 00:48:24.960
+And I mean, will the standard library itself ever have inline types?
+
+00:48:25.860 --> 00:48:27.480
+Who knows, this might be around forever.
+
+00:48:28.300 --> 00:48:32.720
+Then there are also the third party stubs.
+
+00:48:33.080 --> 00:48:34.860
+And I think that's what you're describing.
+
+00:48:35.040 --> 00:48:36.700
+They're libraries that for whatever reason
+
+00:48:36.940 --> 00:48:38.640
+don't ship with stubs themselves.
+
+00:48:39.200 --> 00:48:41.260
+Those are in TypeShed.
+
+00:48:42.880 --> 00:48:46.260
+And it's been like, for a while,
+
+00:48:46.420 --> 00:48:48.260
+this has been a question of like what we want to do
+
+00:48:49.060 --> 00:48:51.380
+with like TypeShed third party stubs, right?
+
+00:48:51.680 --> 00:48:54.560
+'Cause like, ideally like libraries would ship
+
+00:48:54.590 --> 00:48:56.220
+with their own types,
+
+00:48:56.480 --> 00:48:58.040
+but there are various obstacles to that.
+
+00:49:00.020 --> 00:49:04.260
+- You know, the obstacles that I know of used to be like,
+
+00:49:04.920 --> 00:49:07.500
+we want this to run on Python 2 and Python 3,
+
+00:49:08.100 --> 00:49:10.940
+or we want it to run on Python 3.3 still.
+
+00:49:12.600 --> 00:49:14.020
+But it's been a long time
+
+00:49:14.740 --> 00:49:17.680
+since any non-type supporting version of Python
+
+00:49:20.920 --> 00:49:24.080
+was a real, you know, a supported type of thing, right?
+
+00:49:25.500 --> 00:49:27.800
+I mean, even 3.9 became deprecated.
+
+00:49:30.180 --> 00:49:33.880
+So on one hand, I feel like they could be merged in,
+
+00:49:33.880 --> 00:49:40.180
+but there's also a lot of other areas that are maybe we don't--
+
+00:49:40.580 --> 00:49:42.360
+they're not common, right?
+
+00:49:42.540 --> 00:49:49.120
+Like other libraries like--
+
+00:49:49.700 --> 00:49:49.880
+I don't know.
+
+00:49:49.920 --> 00:49:51.860
+Just pick some-- let's say Pyramid.
+
+00:49:52.200 --> 00:49:56.180
+I don't think the Pyramid web framework really ever got types added to it.
+
+00:49:56.980 --> 00:50:00.080
+Somebody could go and create a TypeShed stub or
+
+00:50:01.100 --> 00:50:05.180
+a types_pyramid you could pip install and then we'll add the types, right?
+
+00:50:06.430 --> 00:50:09.920
+I certainly see it being really valuable for third party things that are just not
+
+00:50:10.020 --> 00:50:12.380
+going to get the type attention they need.
+
+00:50:20.020 --> 00:50:20.680
+Jelle, what do you think?
+
+00:50:21.680 --> 00:50:22.620
+- Yeah, I think TypeShed is great.
+
+00:50:22.880 --> 00:50:25.360
+I've spent a lot of time on improving it.
+
+00:50:26.660 --> 00:50:28.460
+As Rebecca said, especially with the standard library,
+
+00:50:28.800 --> 00:50:29.760
+it's irreplaceable.
+
+00:50:30.880 --> 00:50:31.800
+For third party libraries,
+
+00:50:32.170 --> 00:50:34.960
+I think it's become less needed over time.
+
+00:50:36.640 --> 00:50:38.900
+It used to be that very few third party libraries
+
+00:50:39.200 --> 00:50:39.960
+had any types.
+
+00:50:40.680 --> 00:50:41.940
+Now that's obviously changed.
+
+00:50:41.950 --> 00:50:44.680
+A lot of libraries ship their own types.
+
+00:50:46.580 --> 00:50:49.900
+But still, there are quite a few libraries left
+
+00:50:49.920 --> 00:50:54.520
+where there aren't inline types and TypeShed can provide useful types.
+
+00:50:55.040 --> 00:50:59.780
+I think TypeShed also provides a service because it has a really great framework for testing these types.
+
+00:51:00.560 --> 00:51:11.540
+We have tools like stubtest and various type checkers that help to make sure these types are good and meet a high standard.
+
+00:51:12.760 --> 00:51:14.820
+So yeah, I think they're still useful for many libraries.
+
+00:51:15.980 --> 00:51:19.820
+Yeah, I was just looking at the types-Flask package,
+
+00:51:20.180 --> 00:51:22.940
+and I guess it must be gone
+
+00:51:23.090 --> 00:51:25.180
+because now Flask must have it internally.
+
+00:51:25.580 --> 00:51:28.120
+So it's kind of an interim sort of thing.
+
+00:51:28.220 --> 00:51:28.640
+That's pretty cool.
+
+00:51:28.670 --> 00:51:30.740
+Yeah, in general, TypeShed has a policy
+
+00:51:30.810 --> 00:51:32.380
+that we remove the stubs from TypeShed
+
+00:51:32.410 --> 00:51:34.460
+if they are in the library itself.
+
+00:51:35.359 --> 00:51:37.280
+Yeah, yeah, very cool.
+
+00:51:37.610 --> 00:51:39.600
+Okay, I find these super valuable
+
+00:51:39.780 --> 00:51:42.540
+because if there's a library I want to work with
+
+00:51:42.540 --> 00:51:44.720
+and it just doesn't have types for whatever reason,
+
+00:51:45.100 --> 00:51:48.620
+you can install stuff from here, and all of a sudden,
+
+00:51:48.630 --> 00:51:50.480
+your editor's way happier.
+
+00:51:50.800 --> 00:51:55.940
+I mean, I know you all agreed on the API boundaries,
+
+00:51:56.230 --> 00:51:57.340
+and I did as well.
+
+00:51:57.350 --> 00:51:58.780
+It's like that's one of the really cool things.
+
+00:51:58.870 --> 00:52:01.100
+The other thing that really makes me excited about types
+
+00:52:01.130 --> 00:52:04.660
+is if I hit dot in my editor, I get
+
+00:52:04.760 --> 00:52:08.260
+a meaningful list of real information
+
+00:52:08.540 --> 00:52:09.380
+about what I'm working on.
+
+00:52:09.380 --> 00:52:11.700
+And so adding these types of things
+
+00:52:11.920 --> 00:52:12.840
+is pretty interesting.
+
+00:52:13.400 --> 00:52:18.980
+I want to ask you all about these rogue tools
+
+00:52:19.340 --> 00:52:23.380
+that do stuff with Python typing that maybe you all didn't intend.
+
+00:52:23.580 --> 00:52:25.380
+Like we all mentioned Pydantic.
+
+00:52:26.540 --> 00:52:28.260
+We've got Typer and FastAPI.
+
+00:52:30.380 --> 00:52:33.180
+But even a little farther out there is BearType.
+
+00:52:33.280 --> 00:52:34.440
+Are you familiar with BearType?
+
+00:52:36.560 --> 00:52:37.000
+Yeah.
+
+00:52:37.440 --> 00:52:38.800
+Yeah, BearType's interesting.
+
+00:52:40.340 --> 00:52:44.440
+You can import-- they have fun.
+
+00:52:45.540 --> 00:52:50.560
+They have fun with their import names and stuff.
+
+00:52:53.720 --> 00:52:57.140
+But basically, you can put either a decorator
+
+00:52:57.300 --> 00:53:00.360
+onto some sort of call site or something,
+
+00:53:00.460 --> 00:53:03.820
+or you can just do it to an entire package--
+
+00:53:04.020 --> 00:53:04.880
+or entire modules, rather.
+
+00:53:05.800 --> 00:53:09.720
+So just from beartype.claw import beartype_this_package,
+
+00:53:09.840 --> 00:53:13.180
+and then it actually turns into runtime type checks.
+
+00:53:15.060 --> 00:53:17.600
+Good idea, bad idea, interesting.
+
+00:53:18.260 --> 00:53:18.860
+What do you all think?
+
+00:53:20.600 --> 00:53:23.380
+So un-Pythonic, you won't even open the web page?
+
+00:53:24.100 --> 00:53:27.460
+I think people should feel free to write whatever code helps
+
+00:53:27.860 --> 00:53:29.380
+them make better software.
+
+00:53:30.410 --> 00:53:32.140
+I haven't really used BearType much myself,
+
+00:53:32.500 --> 00:53:34.040
+but clearly it's useful for some people.
+
+00:53:35.560 --> 00:53:37.700
+And I think in general, in designing a type system,
+
+00:53:37.900 --> 00:53:40.660
+we should try to accommodate all users who do useful things
+
+00:53:40.760 --> 00:53:41.340
+with the type system.
+
+00:53:42.900 --> 00:53:44.960
+And that includes things like Pydantic or BearType.
+
+00:53:48.020 --> 00:53:49.500
+Yeah, it's pretty fast.
+
+00:53:49.720 --> 00:53:53.920
+It's not as big of a hit as you would imagine.
+
+00:53:54.940 --> 00:53:55.320
+Let me see.
+
+00:53:55.380 --> 00:54:00.900
+What are they-- somewhere they had a really fun saying in here.
+
+00:54:01.060 --> 00:54:02.340
+But here we go.
+
+00:54:03.000 --> 00:54:06.380
+BearType brings Rust- and C++-inspired zero-cost abstractions
+
+00:54:06.400 --> 00:54:09.100
+into the lawless world of dynamically typed Python
+
+00:54:09.240 --> 00:54:12.160
+by enforcing type safety at the granular level of functions
+
+00:54:12.420 --> 00:54:15.200
+and methods against type hints standardized
+
+00:54:15.540 --> 00:54:16.480
+by the Python community,
+
+00:54:16.660 --> 00:54:19.520
+in O(1) non-amortized worst-case time
+
+00:54:19.660 --> 00:54:20.860
+with negligible constant factors.
+
+00:54:21.080 --> 00:54:21.660
+Like how about that?
+
+00:54:24.880 --> 00:54:26.740
+No, it's a pretty neat library and it's pretty fast.
+
+00:54:27.120 --> 00:54:28.780
+Honestly, I've never used it in production.
+
+00:54:30.260 --> 00:54:33.880
+Having type hints and squigglies in the editors
+
+00:54:33.880 --> 00:54:37.000
+or in the linters has always been enough for me.
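The runtime checking being discussed can be approximated in plain Python. This is only a toy sketch of the idea, validating annotated arguments when the function is called, not BearType's actual implementation, which handles generics, return types, and much more, far faster:

```python
from functools import wraps
from inspect import signature
from typing import get_type_hints

def checked(func):
    """Toy runtime type checker: verifies annotated parameters on every
    call. Only handles plain classes as annotations; a real tool like
    BearType does vastly more."""
    hints = get_type_hints(func)
    hints.pop("return", None)  # only validate parameters in this sketch
    sig = signature(func)

    @wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            expected = hints.get(name)
            if isinstance(expected, type) and not isinstance(value, expected):
                raise TypeError(
                    f"{func.__name__}() argument {name!r}: expected "
                    f"{expected.__name__}, got {type(value).__name__}"
                )
        return func(*args, **kwargs)

    return wrapper

@checked
def double(n: int) -> int:
    return n * 2
```

Calling `double(3)` succeeds, while `double("x")` raises a `TypeError` at call time instead of silently returning `"xx"`.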
+
+00:54:37.140 --> 00:54:40.480
+But I can see using this if it's really critical
+
+00:54:40.740 --> 00:54:41.800
+and you're having issues, maybe you
+
+00:54:41.880 --> 00:54:43.300
+want to catch some runtime errors.
+
+00:54:43.840 --> 00:54:44.160
+I don't know.
+
+00:54:45.900 --> 00:54:48.240
+It's not quite an endorsement, but it sure is like, huh,
+
+00:54:48.540 --> 00:54:50.700
+that's a different type of thing.
+
+00:54:51.020 --> 00:54:56.940
+I definitely think that the extent to which type checkers
+
+00:54:59.260 --> 00:55:01.580
+may have a different understanding of your code
+
+00:55:01.590 --> 00:55:02.640
+from what happens at runtime,
+
+00:55:03.100 --> 00:55:06.100
+and there isn't anything built in to catch that,
+
+00:55:06.260 --> 00:55:07.800
+is sometimes a pain point.
+
+00:55:08.230 --> 00:55:10.380
+And so the desire to have your type annotations,
+
+00:55:10.910 --> 00:55:13.180
+to find out at runtime if your type annotations
+
+00:55:13.230 --> 00:55:14.140
+are telling you a lie,
+
+00:55:14.880 --> 00:55:17.940
+it makes a lot of sense why people would like that.
+
+00:55:18.660 --> 00:55:20.780
+I mean, it's something you get used to from other languages
+
+00:55:20.870 --> 00:55:22.580
+where the type checker is built into the compiler.
+
+00:55:22.660 --> 00:55:23.020
+- Right, right.
+
+00:55:23.160 --> 00:55:24.800
+You get like a runtime type cast error.
+
+00:55:24.980 --> 00:55:27.600
+Like cannot cast, you know, we kind of get that
+
+00:55:27.700 --> 00:55:30.860
+if you try to parse a thing, you know,
+
+00:55:31.000 --> 00:55:33.440
+like put the int() parens around a string,
+
+00:55:33.550 --> 00:55:35.920
+and it's not really parsable as an int.
+
+00:55:36.660 --> 00:55:39.100
+But for real type information, I think personally,
+
+00:55:39.250 --> 00:55:42.560
+I would use this as, like, I might apply
+
+00:55:43.510 --> 00:55:49.460
+runtime type checking to a module for debugging and development
+
+00:55:49.550 --> 00:55:50.680
+for a minute and just see what happens,
+
+00:55:50.810 --> 00:55:51.680
+and then turn it back off.
+
+00:55:52.000 --> 00:55:54.700
+I don't know that I'd just ship production code that way.
+
+00:55:55.940 --> 00:55:58.740
+But anyway, I got a couple more questions.
+
+00:55:58.940 --> 00:56:00.240
+We're getting shorter on time here.
+
+00:56:02.240 --> 00:56:03.800
+What was one of the harder questions
+
+00:56:04.700 --> 00:56:07.120
+that you all-- harder decisions you all
+
+00:56:07.120 --> 00:56:10.340
+had to address on the council?
+
+00:56:12.420 --> 00:56:16.880
+I think the most contentious one was PEP 724,
+
+00:56:17.520 --> 00:56:18.740
+if I remember the number correctly.
+
+00:56:19.820 --> 00:56:22.440
+It was around a feature called type guards,
+
+00:56:22.640 --> 00:56:25.200
+which is around user-defined type narrowing functions.
+
+00:56:27.340 --> 00:56:31.720
+We initially defined that in a way that later was found to be somewhat problematic,
+
+00:56:31.980 --> 00:56:35.980
+and we basically came up with a better set of proposed semantics
+
+00:56:37.440 --> 00:56:40.700
+that maybe we should have done the first time around.
+
+00:56:41.860 --> 00:56:44.940
+And what this PEP proposed, and as you can see, I sponsored it,
+
+00:56:45.420 --> 00:56:48.720
+is that we basically changed the meaning of the existing type guards
+
+00:56:49.040 --> 00:56:52.900
+under certain conditions, which, yeah.
+
+00:56:53.920 --> 00:56:55.220
+What is a type guard?
+
+00:56:55.720 --> 00:56:58.720
+A type guard is a function, like there's a good example there, the is_iterable.
+
+00:56:59.280 --> 00:57:02.860
+It's a function that tells you how to narrow something.
+
+00:57:03.840 --> 00:57:06.960
+So in this example, there's an is_iterable type guard,
+
+00:57:07.340 --> 00:57:09.960
+which narrows an object to an iterable of anything.
+
+00:57:11.080 --> 00:57:15.500
+And then inside the func there, you can see if is_iterable(file),
+
+00:57:16.180 --> 00:57:18.540
+it knows that it's an iterable.
+
+00:57:19.140 --> 00:57:23.620
+And in this case, yeah, I guess it just narrows exactly to iterable of any.
+
+00:57:24.140 --> 00:57:26.000
+That's one of the ways that type guards work.
+
+00:57:26.780 --> 00:57:27.400
+I see.
+
+00:57:27.540 --> 00:57:30.960
+And the type that it returns kind of communicates to the type system
+
+00:57:31.230 --> 00:57:36.380
+like that this function ensures that the thing that came in
+
+00:57:36.430 --> 00:57:38.960
+as an arbitrary object, in fact, is one of these.
+
+00:57:39.420 --> 00:57:39.640
+Yeah.
+
+00:57:40.300 --> 00:57:41.040
+Okay, interesting.
+
+00:57:42.800 --> 00:57:44.520
+Yeah, so that was a tricky one, huh?
+
+00:57:45.760 --> 00:57:45.980
+Yeah.
+
+00:57:48.000 --> 00:57:49.800
+Any other standout, Rebecca or Carl?
+
+00:57:50.960 --> 00:57:52.259
+Well, the current discussion around
+
+00:57:52.280 --> 00:57:53.880
+what is the meaning of a float annotation?
+
+00:57:54.960 --> 00:57:57.060
+Still an unresolved, contentious topic.
+
+00:57:57.540 --> 00:57:58.520
+Okay, gotcha.
+
+00:58:00.520 --> 00:58:07.980
+I mean, this one, PEP 724, is also what came to my mind immediately as well
+
+00:58:08.140 --> 00:58:10.440
+because this was a challenging discussion
+
+00:58:10.880 --> 00:58:15.660
+because there were very conflicting considerations at play.
+
+00:58:15.840 --> 00:58:20.079
+It's like what semantics did we want in the long term
+
+00:58:20.100 --> 00:58:22.740
+and what did we want the type system to look like,
+
+00:58:22.900 --> 00:58:24.280
+you know, say like 10 years from now
+
+00:58:24.580 --> 00:58:25.820
+versus backwards compatibility
+
+00:58:26.880 --> 00:58:28.700
+and what the migration story would look like.
+
+00:58:28.740 --> 00:58:31.120
+It was quite tricky.
+
+00:58:31.360 --> 00:58:32.900
+Yeah, I guess that's something
+
+00:58:32.980 --> 00:58:35.920
+you will always have to be cognizant of
+
+00:58:36.000 --> 00:58:38.640
+is like every change,
+
+00:58:38.940 --> 00:58:39.980
+even if it's an improvement,
+
+00:58:40.380 --> 00:58:45.700
+has to justify the fact that now you have challenges
+
+00:58:46.120 --> 00:58:48.680
+with the version history over time.
+
+00:58:49.320 --> 00:58:55.100
+I'm thinking like dict of string comma int with a capital or lowercase d.
+
+00:58:56.480 --> 00:59:00.440
+I've got people, I did a YouTube video showing something with the lowercase version
+
+00:59:00.530 --> 00:59:03.780
+because I was using something super modern like Python 3.11.
+
+00:59:05.160 --> 00:59:09.600
+And I got a message like, hey, Michael, you don't know how to write Python.
+
+00:59:09.710 --> 00:59:10.600
+Your code is broken.
+
+00:59:11.480 --> 00:59:13.640
+This code that you wrote just doesn't even run.
+
+00:59:13.710 --> 00:59:14.700
+I don't know how this is.
+
+00:59:14.840 --> 00:59:17.600
+I'm like, what version of Python are you on?
+
+00:59:17.750 --> 00:59:18.160
+3.8.
+
+00:59:18.620 --> 00:59:19.920
+Nope, you can't use 3.8 for that.
+
+00:59:20.020 --> 00:59:21.020
+You're going to need to get a newer one.
+
+00:59:21.050 --> 00:59:21.580
+You know what I mean?
+
+00:59:21.990 --> 00:59:26.120
+But those are complexities that get added to Python
+
+00:59:26.470 --> 00:59:27.440
+because of that.
+
+00:59:27.520 --> 00:59:30.480
+Now you've got two ways to specify what a dict is.
+
+00:59:30.820 --> 00:59:32.860
+There's a preferred new way, but there's still the old way.
+
+00:59:32.890 --> 00:59:35.740
+And it just sort of piles up.
+
+00:59:36.299 --> 00:59:38.720
+And it's very hard to ever actually get rid of the old way,
+
+00:59:38.870 --> 00:59:40.420
+even if there's no good reason to use it anymore.
+
+00:59:40.960 --> 00:59:41.320
+Exactly.
+
+00:59:41.620 --> 00:59:44.600
+Once it's there, it's written in ink pretty much, right?
+
+00:59:44.780 --> 00:59:47.520
+Like we have five or six different ways to format strings.
+
+00:59:47.760 --> 00:59:49.620
+Maybe with t-strings it's six now.
+
+00:59:51.900 --> 00:59:53.480
+They're all gonna still be there, right?
+
+00:59:53.500 --> 00:59:55.520
+So every change, every decision you make
+
+00:59:55.540 --> 00:59:58.780
+is not just a matter of, is it the right decision, right?
+
+00:59:58.900 --> 01:00:01.980
+It's the, is it worth it, I'm sure.
+
+01:00:06.760 --> 01:00:08.920
+Yeah, I don't know, how do you all balance that?
+
+01:00:09.060 --> 01:00:10.560
+Like, that's tricky.
+
+01:00:16.040 --> 01:00:20.880
+With things like the Dict change, at least we know we're moving towards better states.
+
+01:00:21.200 --> 01:00:24.860
+And there's two things, but they mean exactly the same thing.
+
+01:00:25.160 --> 01:00:27.660
+So the confusion is not as bad.
+
+01:00:28.700 --> 01:00:33.720
+The problem with type guard is that we're going to change how some existing thing works, like
+
+01:00:33.840 --> 01:00:34.320
+what it meant.
+
+01:00:35.700 --> 01:00:38.400
+And I think there are good reasons that maybe that was the right thing to do.
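To make the dict-of-string-comma-int discussion concrete: on Python 3.9+ you can subscript the built-in dict directly, while typing.Dict is the older spelling that also runs on 3.8; both mean exactly the same thing to a type checker. The word-counting function is just an illustration, not code from the episode:

```python
from typing import Dict  # legacy spelling, needed on Python 3.8 and earlier


def count_words(text: str) -> dict[str, int]:  # built-in generic, Python 3.9+
    counts: dict[str, int] = {}
    for word in text.split():
        counts[word] = counts.get(word, 0) + 1
    return counts


# The same annotation in the pre-3.9 style; identical meaning.
def count_words_legacy(text: str) -> Dict[str, int]:
    return count_words(text)


print(count_words("to be or not to be"))  # {'to': 2, 'be': 2, 'or': 1, 'not': 1}
```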
+
+01:00:38.740 --> 01:00:43.699
+But it would also have been pretty confusing for people if their existing types suddenly
+
+01:00:43.720 --> 01:00:52.740
+started meaning something completely different. Yeah, absolutely. Hence float. Okay, all right, let's,
+
+01:00:52.880 --> 01:01:02.180
+let's, uh, do what's coming next, like 3.15, 3.16. Do you all have things that are in the works that
+
+01:01:02.180 --> 01:01:10.220
+you think are going to come, or debates that are brewing? I think for 3.15, there's a TypedDict
+
+01:01:10.200 --> 01:01:16.260
+feature coming, extra items. You can already use it in typing_extensions if you want to use it,
+
+01:01:16.600 --> 01:01:23.360
+but it will be in CPython as of 3.15. It's likely we'll have a small thing I added called
+
+01:01:23.560 --> 01:01:28.120
+disjoint bases, which is very technical, but helps type narrowing in some cases.
+
+01:01:30.670 --> 01:01:36.140
+Yeah, I think those are the things that are likely to make it. We can only speculate about
+
+01:01:36.120 --> 01:01:37.280
+what else people can propose,
+
+01:01:37.450 --> 01:01:39.840
+we're sort of bound by what people actually write up as PEPs.
+
+01:01:41.680 --> 01:01:42.940
+We have to wait for people to write the PEPs
+
+01:01:43.030 --> 01:01:43.840
+before we can approve them.
+
+01:01:45.780 --> 01:01:48.360
+I think there's PEP 747 for TypeForm,
+
+01:01:48.660 --> 01:01:52.080
+which I think we recommended its acceptance,
+
+01:01:52.320 --> 01:01:54.320
+but I don't think the steering council accepted it yet,
+
+01:01:54.480 --> 01:01:55.640
+or it hasn't been accepted formally.
+
+01:01:56.220 --> 01:01:57.160
+I think that's right.
+
+01:01:57.730 --> 01:01:59.160
+I think that's on their plate.
+
+01:01:59.270 --> 01:02:02.680
+Yeah, so that's also pretty likely to make it into 3.15.
+
+01:02:04.640 --> 01:02:04.800
+Okay.
+
+01:02:05.460 --> 01:02:10.920
+So this is one example of a case that will be pretty useful to people working with type
+
+01:02:11.140 --> 01:02:17.820
+annotations at runtime because it'll allow you to... It's sort of a meta thing where you can annotate,
+
+01:02:18.700 --> 01:02:21.400
+have a type annotation that describes another type annotation.
+
+01:02:22.670 --> 01:02:25.940
+So it's useful if you're writing code that works with type annotations.
+
+01:02:26.920 --> 01:02:31.560
+Sure. Okay. Make the Pydantics of the world very happy.
+
+01:02:33.600 --> 01:02:37.580
+I am actually pretty excited about TypeForm
+
+01:02:38.000 --> 01:02:39.640
+because I feel like there's a gap
+
+01:02:39.680 --> 01:02:41.260
+in what we can express in the type system.
+
+01:02:41.860 --> 01:02:42.080
+Yeah.
+
+01:02:42.380 --> 01:02:43.280
+And we good.
+
+01:02:43.800 --> 01:02:46.840
+And there are cases in the existing type system,
+
+01:02:47.040 --> 01:02:48.520
+like for instance, the cast function
+
+01:02:49.140 --> 01:02:53.100
+and some other cases where something takes any type
+
+01:02:53.300 --> 01:02:54.520
+expression as an argument.
+
+01:02:54.560 --> 01:02:57.100
+We actually don't have a good way to annotate that today.
+
+01:02:57.140 --> 01:02:59.040
+And this will provide a nice way to express that.
+
+01:03:00.400 --> 01:03:00.880
+Yeah.
+
+01:03:01.480 --> 01:03:02.020
+OK, cool.
+
+01:03:03.240 --> 01:03:05.860
+Let me pull up one thing really quick.
+
+01:03:10.460 --> 01:03:11.940
+Quick shout out to Will McGugan here.
+
+01:03:11.990 --> 01:03:15.800
+He just released his Toad project, which is the new--
+
+01:03:16.840 --> 01:03:19.420
+takes Textual and Rich and all that kind of stuff
+
+01:03:19.620 --> 01:03:21.460
+and applies it to like, what if we had a better
+
+01:03:21.760 --> 01:03:23.880
+Claude Code type of experience, which is pretty interesting.
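A small sketch of the existing typing.cast case mentioned in the TypeForm discussion above: cast's first argument is a type expression passed as a runtime value, and before PEP 747 there was no precise way to annotate that kind of parameter in your own code. The load_config function here is a hypothetical stand-in:

```python
from typing import cast


def load_config() -> object:
    # Pretend this came from parsed JSON, so statically it is just `object`.
    return {"debug": True, "retries": 3}


# cast() takes a type expression as a runtime argument and returns the
# value unchanged; it only affects what static type checkers believe.
config = cast(dict[str, object], load_config())
print(config["debug"])  # True
```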
+
+01:03:26.020 --> 01:03:27.960
+So the reason I bring this up is--
+
+01:03:29.160 --> 01:03:29.820
+final question.
+
+01:03:30.220 --> 01:03:39.400
+What about, do you all even worry about the role of like how types interact with AI and agentic coding tools?
+
+01:03:40.940 --> 01:03:46.640
+I know that if you have some code that has types on it and you give it to an AI,
+
+01:03:47.100 --> 01:03:52.120
+it's got a better chance of understanding what's happening than if you give it purely untyped code and say,
+
+01:03:53.180 --> 01:03:54.140
+tell me about this, right?
+
+01:03:54.250 --> 01:03:56.380
+It doesn't even know necessarily what's being passed to it.
+
+01:03:57.360 --> 01:03:59.540
+But is that anything you all think about
+
+01:03:59.610 --> 01:04:01.800
+or what are your thoughts on this?
+
+01:04:05.220 --> 01:04:06.460
+- Certainly think about it some.
+
+01:04:06.570 --> 01:04:09.620
+I mean, I think overall my feeling is that
+
+01:04:10.400 --> 01:04:12.280
+these coding agents seem to do better
+
+01:04:12.390 --> 01:04:15.800
+the more kind of the tighter feedback loops
+
+01:04:15.830 --> 01:04:17.140
+you can give them to work with.
+
+01:04:17.380 --> 01:04:19.740
+And so typing is another useful source of feedback
+
+01:04:20.000 --> 01:04:23.480
+where you can say add type annotations
+
+01:04:23.510 --> 01:04:25.920
+and make sure the type checker passes, and so on.
+
+01:04:26.260 --> 01:04:28.520
+So it still seems pretty useful in that world.
+
+01:04:29.000 --> 01:04:31.420
+Yeah, you can easily write rules that say,
+
+01:04:31.860 --> 01:04:34.140
+when you are done on anything I've asked you to do,
+
+01:04:34.820 --> 01:04:37.580
+always run ty or always run Pyrefly
+
+01:04:37.680 --> 01:04:40.900
+and make sure that there's no new errors or at least--
+
+01:04:41.280 --> 01:04:42.360
+ideally, zero errors, right?
+
+01:04:43.100 --> 01:04:44.340
+But nothing has been introduced.
+
+01:04:45.440 --> 01:04:45.940
+Yeah, pretty interesting.
+
+01:04:47.360 --> 01:04:48.320
+Other folks?
+
+01:04:48.500 --> 01:04:49.400
+Rebecca, Jelle?
+
+01:04:53.780 --> 01:04:55.239
+Yeah, I guess in general, I think
+
+01:04:55.260 --> 01:04:57.740
+typing will remain useful for AI.
+
+01:04:58.260 --> 01:04:59.840
+We are probably rapidly moving to a world
+
+01:05:00.000 --> 01:05:03.080
+where a large proportion of all code is written by AI.
+
+01:05:05.440 --> 01:05:05.840
+And--
+
+01:05:06.600 --> 01:05:08.120
+Not everybody likes that opinion.
+
+01:05:08.260 --> 01:05:09.060
+Not everybody likes that.
+
+01:05:09.460 --> 01:05:12.140
+I guess maybe my current line of work makes me
+
+01:05:12.320 --> 01:05:13.380
+think that's more likely to happen.
+
+01:05:14.160 --> 01:05:16.400
+Though, I mean, it's also--
+
+01:05:17.860 --> 01:05:20.300
+you don't have to like the fact it's going to be night soon.
+
+01:05:20.500 --> 01:05:21.200
+But it's going to be night--
+
+01:05:21.200 --> 01:05:21.600
+you know what I mean?
+
+01:05:23.220 --> 01:05:25.640
+I just think there's so much momentum on this,
+
+01:05:25.760 --> 01:05:27.600
+at least in the next five years or something,
+
+01:05:27.760 --> 01:05:28.680
+that it's going to be really--
+
+01:05:28.740 --> 01:05:31.700
+it's a truth of how many people are writing code
+
+01:05:31.920 --> 01:05:34.660
+regardless of whether individuals want to write code that way.
+
+01:05:34.860 --> 01:05:35.240
+You know what I mean?
+
+01:05:35.320 --> 01:05:36.280
+So I think it's a consideration.
+
+01:05:37.000 --> 01:05:37.180
+Yeah.
+
+01:05:38.220 --> 01:05:40.960
+Yeah, I forgot that you worked at OpenAI, so of course--
+
+01:05:42.180 --> 01:05:44.880
+I should pull up a Codex example or something, shouldn't I?
+
+01:05:45.940 --> 01:05:47.060
+Yeah, Codex is great.
+
+01:05:47.260 --> 01:05:47.520
+Use it.
+
+01:05:50.640 --> 01:05:55.720
+No, but I mean, do you have any further insight into the role of types and coding agents?
+
+01:05:56.400 --> 01:05:58.120
+I know that's not exactly what you work on, right?
+
+01:05:58.340 --> 01:06:00.200
+Yeah, it's not too much, really.
+
+01:06:00.340 --> 01:06:06.560
+I think, as Carl said, types can also be helpful for AI to understand code better and to get a better feedback loop.
+
+01:06:08.920 --> 01:06:11.480
+I feel like the better we make AI, the more it is like humans.
+
+01:06:11.940 --> 01:06:19.460
+And if typing makes humans better at writing and understanding this code, it'll probably also make AI better at it.
+
+01:06:20.080 --> 01:06:21.740
+Yeah, it's the locality of information.
+
+01:06:21.850 --> 01:06:26.440
+You can read the function and know everything you need to know about what's going into it
+
+01:06:26.620 --> 01:06:30.940
+without bouncing around and trying to understand blocks of code and like what might have been
+
+01:06:31.110 --> 01:06:31.880
+created that's getting passed.
+
+01:06:32.980 --> 01:06:35.580
+It's good for humans and also good for AI, right?
+
+01:06:36.300 --> 01:06:36.560
+Rebecca?
+
+01:06:38.500 --> 01:06:46.479
+I mean, I guess I don't have much new to add. I'll say I am maybe a little more skeptical
+
+01:06:46.500 --> 01:06:51.960
+than most of my coworkers about the quality of AI-generated code.
+
+01:06:52.220 --> 01:06:56.200
+But that means I think I am particularly gung-ho about,
+
+01:06:56.440 --> 01:07:00.460
+you know, like get AI to use types, type checkers,
+
+01:07:00.960 --> 01:07:02.960
+keep the guardrails there.
+
+01:07:03.240 --> 01:07:04.020
+I think that'd be very important.
+
+01:07:04.020 --> 01:07:05.100
+- Yeah, if it's gonna make a mistake,
+
+01:07:05.440 --> 01:07:08.380
+don't let it at least like make the type system become
+
+01:07:08.840 --> 01:07:10.320
+disconnected and not working.
+
+01:07:10.560 --> 01:07:12.620
+Like it has to keep the types hanging together
+
+01:07:12.820 --> 01:07:13.920
+as a minimum bar, right?
+
+01:07:13.920 --> 01:07:15.880
+And you can easily set that up as an automation.
+
+01:07:18.340 --> 01:07:21.700
+Yeah, interesting to think of it as guardrails rather than an accelerant.
+
+01:07:21.880 --> 01:07:23.080
+But yeah, 100% it is.
+
+01:07:26.000 --> 01:07:26.580
+All right, folks.
+
+01:07:26.940 --> 01:07:29.080
+I think that's it for all the time that we have.
+
+01:07:29.640 --> 01:07:29.860
+Thank you.
+
+01:07:29.960 --> 01:07:30.680
+Thank you for being here.
+
+01:07:31.900 --> 01:07:33.680
+Final thoughts before we go.
+
+01:07:35.380 --> 01:07:36.180
+Carl, I'll let you go first.
+
+01:07:36.620 --> 01:07:38.560
+Final thoughts for people out there interested in Python typing.
+
+01:07:39.580 --> 01:07:42.200
+Yeah, well, first of all, thanks for having us on the podcast.
+
+01:07:42.460 --> 01:07:43.200
+Really appreciate it.
+
+01:07:44.580 --> 01:07:46.000
+And thoughts for people out there.
+
+01:07:46.960 --> 01:08:00.900
+I guess if you have ideas of how Python typing could be improved, discuss.python.org is a good place to bring up ideas and discuss them with the typing community and see what positive changes we can make.
+
+01:08:03.079 --> 01:08:04.280
+Awesome. Rebecca?
+
+01:08:06.400 --> 01:08:10.260
+Yeah, first, thank you, Michael. This is a lot of fun.
+
+01:08:11.980 --> 01:08:25.940
+Last thoughts. So, you know, like, I look at the Typing Council and sometimes think, oh, you know, like the PEP has like governance in its name, but I wouldn't say we're really a governing body or anything.
+
+01:08:26.299 --> 01:08:32.040
+It's like people who are using the type system, like users, they're the ones who come up with,
+
+01:08:32.080 --> 01:08:35.200
+you know, like all the best ideas, propose them, discuss them.
+
+01:08:35.400 --> 01:08:43.380
+And we're just here to sort of be like, hey, you know, like we have some background and
+
+01:08:43.460 --> 01:08:47.279
+like how type checkers work and maybe some of the history and we can provide input.
+
+01:08:47.680 --> 01:08:51.240
+But I just encourage people, if there's a change you want to see in the type system,
+
+01:08:52.040 --> 01:08:54.060
+you know, like propose it yourself.
+
+01:08:54.240 --> 01:08:56.339
+It's a very friendly and open community.
+
+01:08:56.720 --> 01:09:00.640
+Yeah, now people who have listened know a little bit more about how to do so.
+
+01:09:00.970 --> 01:09:01.060
+Awesome.
+
+01:09:01.400 --> 01:09:01.460
+Thanks.
+
+01:09:02.080 --> 01:09:02.819
+Jelle, final word.
+
+01:09:03.520 --> 01:09:05.380
+Also, again, thank you for having me here.
+
+01:09:06.060 --> 01:09:07.240
+It's been great talking to all of you.
+
+01:09:08.370 --> 01:09:10.640
+I guess what I want to say is similar to what Carl and Rebecca just said.
+
+01:09:11.759 --> 01:09:16.339
+If you want to see some changes to the type system, I'd really encourage you to sign
+
+01:09:17.190 --> 01:09:20.259
+up for discuss.python.org, make a proposal, go through the process.
+
+01:09:20.799 --> 01:09:23.420
+It can be somewhat daunting perhaps, especially if you have to
+
+01:09:24.140 --> 01:09:27.380
+write a PEP, but it is doable. There are several recent typing
+
+01:09:27.580 --> 01:09:31.720
+PEPs that have just been community members who saw something they
+
+01:09:31.839 --> 01:09:35.960
+wanted to improve, proposed a PEP and saw it to completion. If
+
+01:09:36.060 --> 01:09:38.560
+there's something you want to see in the type system, then you
+
+01:09:38.560 --> 01:09:43.940
+can do it too. Okay, awesome. Well, thank you all for keeping
+
+01:09:44.100 --> 01:09:47.020
+Python typing going strong. Really appreciate your time on
+
+01:09:47.020 --> 01:09:48.380
+the show. See you later.
+ +01:09:51.960 --> 01:09:52.080 +Bye. + +01:09:52.580 --> 01:09:53.060 +Bye. + diff --git a/youtube_transcripts/540-modern-python-monorepo-timeline-original.vtt b/youtube_transcripts/540-modern-python-monorepo-timeline-original.vtt new file mode 100644 index 0000000..3b053fd --- /dev/null +++ b/youtube_transcripts/540-modern-python-monorepo-timeline-original.vtt @@ -0,0 +1,2978 @@ +WEBVTT + +00:00:00.639 --> 00:00:06.800 +Hello, hello, Jarek, Amogh. Welcome to Talk Python To Me. Awesome to have Amogh, you here, + +00:00:07.260 --> 00:00:14.900 +and Jarek, you back. Yep. Welcome. Very nice to be again at Talk Python To Me. That's one of my + +00:00:15.170 --> 00:00:21.480 +favorite podcasts I listen to all the time. Thank you. Thank you. It's my first, but yeah, + +00:00:22.260 --> 00:00:27.080 +thanks for having me here, Michael. Yeah. Happy to have you here. You all have built, + +00:00:27.720 --> 00:00:35.760 +You and a team of people, given the scale of this project, have built an amazing product with Apache Airflow. + +00:00:37.240 --> 00:00:38.960 +It's going to be really fun to dive into it. + +00:00:38.980 --> 00:00:46.480 +And specifically, we're going to focus on not building workflows exactly, although I'm sure we'll talk about that somewhat. + +00:00:47.140 --> 00:00:53.400 +The real goal, the thing that we're going to focus on is how do you manage such a big project + +00:00:53.640 --> 00:00:57.600 +with so many different little inner internal packages + +00:00:57.940 --> 00:00:59.720 +that all depend upon each other and so on. + +00:01:00.500 --> 00:01:02.300 +And monorepos and that. + +00:01:02.520 --> 00:01:06.280 +I've touched on monorepos before, but two things. + +00:01:06.340 --> 00:01:08.300 +I think this makes a really interesting discussion + +00:01:08.520 --> 00:01:09.640 +for listeners out there. + +00:01:09.960 --> 00:01:14.140 +One, this is going to be very concrete with exact steps. 
+
+00:01:14.700 --> 00:01:15.420
+And it's even open source.
+
+00:01:15.560 --> 00:01:17.280
+You can go check it out and play with it.
+
+00:01:17.800 --> 00:01:21.740
+And two, the tooling and the standards
+
+00:01:21.880 --> 00:01:23.160
+have changed significantly
+
+00:01:23.180 --> 00:01:25.060
+since I talked about this three or four years ago,
+
+00:01:25.300 --> 00:01:27.360
+making much of what we're going to talk about possible, right?
+
+00:01:28.840 --> 00:01:29.640
+Yes, absolutely.
+
+00:01:30.500 --> 00:01:30.580
+Yeah.
+
+00:01:31.580 --> 00:01:33.300
+Now, before we dive into that, of course,
+
+00:01:34.260 --> 00:01:35.820
+let's do quick introductions.
+
+00:01:36.740 --> 00:01:38.220
+Jarek, it's been a while since you've been on the show.
+
+00:01:38.220 --> 00:01:38.640
+Who are you?
+
+00:01:38.920 --> 00:01:39.640
+Tell people who you are.
+
+00:01:40.920 --> 00:01:43.340
+I'm an Apache Airflow maintainer,
+
+00:01:43.380 --> 00:01:45.420
+one of the PMC members as well,
+
+00:01:45.600 --> 00:01:48.960
+and also one of the Apache Software Foundation members.
+
+00:01:49.120 --> 00:01:51.920
+I've got this nice pin, the new logo of the Apache Software Foundation
+
+00:01:51.940 --> 00:02:00.940
+that we got at FOSDEM. I'm also an Apache Airflow security committee member, which is an important
+
+00:02:01.190 --> 00:02:06.140
+aspect for what we are discussing today because of supply chain and dependencies and
+
+00:02:08.460 --> 00:02:15.100
+lots of potential security issues these dependencies bring. Yeah, I'm one of
+
+00:02:15.100 --> 00:02:20.800
+the few lucky people to get to contribute to open source full time and get paid for it, which is like
+
+00:02:20.880 --> 00:02:26.500
+amazing. Maybe another podcast one day about that because I think that's also an interesting one.
+
+00:02:26.800 --> 00:02:30.700
+Yeah, I have something like that, a topic somewhat like that brewing. 
So yeah, potentially,

+00:02:31.239 --> 00:02:34.920
+I'll have to have you back for that. Amogh?
+
+00:02:35.850 --> 00:02:41.860
+Yep. So, hey, I'm Amogh Desai. Again, similar to Jarek, I'm a PMC member and
+
+00:02:42.420 --> 00:02:51.200
+committer at Apache Airflow. And I'm also a part of, I'm one of the top 10 contributors to the project,
+
+00:02:51.760 --> 00:02:54.400
+top 10 all-time contributors to the project, Jarek being number one.
+
+00:02:57.100 --> 00:03:03.040
+So I work at Astronomer as a senior software engineer, where I get to live in both worlds.
+
+00:03:03.500 --> 00:03:10.080
+One is contributing to Airflow's core development, and also supporting the companies that are trying
+
+00:03:10.020 --> 00:03:17.540
+to run Airflow at scale, right? Awesome. What is Astronomer? Tell people about that. Yeah, it's a,
+
+00:03:17.760 --> 00:03:23.480
+it's a company where, most of our, uh, we're a company which is almost one of the leading
+
+00:03:23.920 --> 00:03:29.760
+contributors to Apache Airflow and also the leading consumer of it. We supply, uh, and we provide a
+
+00:03:29.860 --> 00:03:35.680
+managed distribution of, uh, corporate managed distribution of Apache Airflow inside Astro.
+
+00:03:36.680 --> 00:03:44.380
+And yeah, I think we have a data platform as well to try and make your life easier to use Airflow at
+
+00:03:44.460 --> 00:03:50.980
+scale. Yeah, incredible. Let me add to it two comments. So Airflow has a number of stakeholders, and the
+
+00:03:51.760 --> 00:03:57.040
+commercial stakeholders who are hosting Airflow as a service as well, and, you know, like using Airflow,
+
+00:03:57.420 --> 00:04:01.939
+and we have contributions from all over the place, Astronomer by far like the biggest 
I'm always amazed how well this works. + +00:04:16.560 --> 00:04:22.740 +And the second thing, the number one, I'm cheating a bit. I do a lot of small PRs. This is how you get + +00:04:22.740 --> 00:04:31.820 +the number one. I guess it depends how you measure it. You could always just do one ginormous AI + +00:04:32.380 --> 00:04:36.600 +PR that's like 100,000 lines of code in your PR and people would love you for it. + +00:04:36.640 --> 00:04:37.940 +You'd be a mega contributor. + +00:04:38.560 --> 00:04:39.680 +Oh, yeah. + +00:04:40.180 --> 00:04:40.500 +Well, no. + +00:04:40.920 --> 00:04:40.980 +No. + +00:04:40.980 --> 00:04:41.460 +He does both. + +00:04:42.100 --> 00:04:42.460 +Well, no. + +00:04:43.340 --> 00:04:44.820 +The funny part is Jarek does both. + +00:04:45.140 --> 00:04:48.460 +It's his velocity amazes me or, I don't know, shocks me sometimes. + +00:04:49.360 --> 00:04:53.420 +He does massive PRs and also like a lot of tiny ones. + +00:04:53.520 --> 00:04:55.900 +And by the time I'm looking, there are like three more out of it. + +00:04:55.940 --> 00:04:56.780 +I don't know how he does it. + +00:04:57.800 --> 00:05:05.040 +Yeah, we're going to get to a bit of how much traffic there is on Airflow in terms of like open source activity. + +00:05:06.320 --> 00:05:07.520 +It's some, it's a little bit. + +00:05:09.940 --> 00:05:14.240 +Before we move on though, Jarek, what is the Apache Software Foundation? + +00:05:14.340 --> 00:05:19.680 +What is this Apache thing that you're just talking about and why is Airflow part of it? + +00:05:20.220 --> 00:05:24.660 +Okay, so very quickly, it's a foundation, one of the oldest foundations, open source foundation in the world. + +00:05:25.560 --> 00:05:28.040 +25, 26, seven years now, I think. + +00:05:29.400 --> 00:05:31.760 +The main thing about Apache Software Foundation + +00:05:31.760 --> 00:05:33.120 +is that it's individual driven. 
+
+00:05:33.960 --> 00:05:36.200
+So every member is an individual, not a corporate,
+
+00:05:36.720 --> 00:05:38.160
+as opposed to like Linux Software Foundation
+
+00:05:38.360 --> 00:05:39.760
+where members are corporates.
+
+00:05:40.560 --> 00:05:43.920
+And people make decisions in both foundation and projects
+
+00:05:44.960 --> 00:05:48.200
+or in PMCs, so-called, or project management committees.
+
+00:05:49.120 --> 00:05:50.960
+And Airflow is one of the PMCs.
+
+00:05:51.300 --> 00:05:53.720
+So one of the project management committees
+
+00:05:53.740 --> 00:05:55.420
+which has PMC members.
+
+00:05:56.260 --> 00:05:59.940
+We are both PMC members and we have like 50 other individuals
+
+00:06:00.400 --> 00:06:02.240
+or 60, I can't remember, like the number changes,
+
+00:06:02.420 --> 00:06:04.160
+like we are inviting new ones all the time.
+
+00:06:05.460 --> 00:06:08.540
+And we make decisions as humans, as individuals,
+
+00:06:08.960 --> 00:06:11.820
+not the corporates who are employing us, for example,
+
+00:06:12.000 --> 00:06:15.320
+because it's a meritocracy-based system
+
+00:06:16.060 --> 00:06:20.160
+where people have merit and the merit doesn't expire
+
+00:06:20.280 --> 00:06:23.040
+and the merit belongs to individuals,
+
+00:06:23.320 --> 00:06:25.580
+not to the corporates.
+
+00:06:26.960 --> 00:06:29.880
+And yeah, so that's one of the big,
+
+00:06:31.700 --> 00:06:34.380
+like pretty much all the open source software out there
+
+00:06:34.520 --> 00:06:38.420
+like has some Apache foundation or Apache component in it.
+
+00:06:38.420 --> 00:06:42.220
+It started with the Apache server 28, 30 years ago almost,
+
+00:06:42.860 --> 00:06:46.060
+but now we have more than 300, 200 PMCs.
+
+00:06:47.020 --> 00:06:51.940
+We just passed the 10,000 committers mark two months ago, I think.
+
+00:06:52.620 --> 00:06:56.420
+So like lots of individuals, lots of people contributing to the foundation.
+
+00:06:56.660 --> 00:06:59.760
+And the main thing about the foundation is community over code.
+
+00:06:59.800 --> 00:07:02.980
+So we value building communities more than actually producing code.
+
+00:07:03.060 --> 00:07:07.800
+We believe producing code is just a byproduct of great communities working together.
+
+00:07:10.100 --> 00:07:15.200
+And the ASF is a charity, a public-good charity in the US, registered in Delaware.
+
+00:07:15.920 --> 00:07:17.340
+So we actually cannot be sold.
+
+00:07:17.400 --> 00:07:18.540
+We cannot change our license.
+
+00:07:18.840 --> 00:07:21.340
+Nothing like that can happen because of the status of the foundation.
+
+00:07:22.040 --> 00:07:25.880
+How interesting. And a really positive force for open source, right?
+
+00:07:26.440 --> 00:07:37.560
+Oh, absolutely. Absolutely. And when I first got into, like, learning how the ASF works, I said, like, that it has no chance to work, like there is no way it works.
+
+00:07:38.240 --> 00:07:39.680
+It's too idealistic. There's no way.
+
+00:07:39.680 --> 00:07:44.260
+It's too. Yeah, absolutely. And nobody in the foundation who makes decisions gets any money.
+
+00:07:44.500 --> 00:07:47.120
+Like everyone is a volunteer, all the PMC members,
+
+00:07:47.220 --> 00:07:50.160
+all the committers, all the board members,
+
+00:07:50.400 --> 00:07:52.800
+the president, all the VPs,
+
+00:07:53.120 --> 00:07:55.100
+those are all volunteer-driven roles.
+
+00:07:55.440 --> 00:07:56.880
+And those are the people who make decisions.
+
+00:07:57.400 --> 00:08:00.000
+We just pay a few people in infrastructure and security.
+
+00:08:00.280 --> 00:08:00.840
+That's basically it.
+
+00:08:01.140 --> 00:08:01.380
+- Right.
+
+00:08:02.500 --> 00:08:02.980
+Awesome.
+
+00:08:03.340 --> 00:08:03.880
+Well, very cool.
+
+00:08:04.520 --> 00:08:06.400
+So let's dive into it.
+
+00:08:06.420 --> 00:08:09.680
+What is, let's start by just talking about
+
+00:08:10.300 --> 00:08:11.620
+high-level abstract.
+
+00:08:12.780 --> 00:08:13.860
+What is a monorepo?
+
+00:08:14.700 --> 00:08:19.120
+I think it's so easy to make that sound
+
+00:08:19.320 --> 00:08:20.900
+like the same thing as a monolith.
+
+00:08:21.020 --> 00:08:23.140
+You're like, oh yeah, monorepo, monolith, same thing, right?
+
+00:08:23.760 --> 00:08:25.280
+And yet you're shaking your head.
+
+00:08:25.680 --> 00:08:26.800
+- No, not even close.
+
+00:08:27.380 --> 00:08:29.900
+Yeah, the first time I personally met a monorepo,
+
+00:08:30.040 --> 00:08:31.460
+maybe I can continue with that,
+
+00:08:31.480 --> 00:08:32.719
+but that was like at Google,
+
+00:08:33.300 --> 00:08:34.880
+where I worked at Google years ago,
+
+00:08:35.099 --> 00:08:37.960
+and I was surprised coming to Google
+
+00:08:38.140 --> 00:08:41.159
+that all the code there is in a single monorepo,
+
+00:08:41.719 --> 00:08:43.580
+even though we have hundreds of products
+
+00:08:43.760 --> 00:08:45.520
+and all the stuff you see.
+
+00:08:45.840 --> 00:08:47.420
+- It's gotta be a lot of code, right?
+
+00:08:47.500 --> 00:08:49.740
+Like a giant, giant repo.
+
+00:08:50.170 --> 00:08:52.460
+- It's like now they have like maybe four.
+
+00:08:52.840 --> 00:08:54.620
+I don't know, I've heard some stories.
+
+00:08:55.790 --> 00:08:57.220
+I haven't worked there for a long time now.
+
+00:08:57.780 --> 00:08:59.980
+But for me, that was a sign that
+
+00:09:00.780 --> 00:09:03.120
+you don't really have to split and dice and slice
+
+00:09:03.340 --> 00:09:06.620
+your repositories into many, many small ones,
+
+00:09:06.790 --> 00:09:09.580
+even if you have like a non-monolithic product,
+
+00:09:10.080 --> 00:09:14.420
+that it all can be kept in a single repository,
+
+00:09:15.280 --> 00:09:17.060
+separate source trees maybe, separate like,
+
+00:09:18.040 --> 00:09:19.740
+we'll talk about how we do it in Airflow,
+
+00:09:20.190 --> 00:09:24.120
+but it's a way how you can bind it together
+
+00:09:24.480 --> 00:09:28.000
+and have it tested together and have it developed together,
+
+00:09:28.640 --> 00:09:30.880
+even though each piece is pretty much separate
+
+00:09:31.140 --> 00:09:32.700
+and you can work on them separately.
+
+00:09:33.080 --> 00:09:35.160
+So that's the monorepo, as opposed to multirepo,
+
+00:09:35.280 --> 00:09:38.679
+which is like when you have multiple repositories
+
+00:09:38.700 --> 00:09:41.360
+consisting of whatever comes up as a product.
+
+00:09:43.720 --> 00:09:48.360
+Yep. I agree with everything that Jarek said, plus just a small addition,
+
+00:09:48.560 --> 00:09:54.440
+which is each of the components or the tiny bits of a monorepo
+
+00:09:55.000 --> 00:09:58.000
+can have its own build artifacts, its dependencies.
+
+00:09:58.900 --> 00:10:03.480
+And it can also have its own release cycle or a release vehicle.
+
+00:10:05.320 --> 00:10:06.820
+That's the only addition,
+
+00:10:06.980 --> 00:10:11.220
+but everything is put together as a big puzzle, just to keep the right— It could have it, if it's
+
+00:10:11.400 --> 00:10:16.380
+in Python terms, you know, not every monorepo is Python, but in Python terms, it could have its own
+
+00:10:16.560 --> 00:10:22.540
+pyproject.toml, potentially its own virtual environment. Yes, right. Yes, exactly. Yeah, yeah.
+
+00:10:22.720 --> 00:10:31.919
+I think one of the nomenclature ironies of this is often the monorepo, I think, makes more sense
+
+00:10:31.940 --> 00:10:37.140
+when you are working with lots of small parts, right? Where the monolith, maybe it has a couple
+
+00:10:37.140 --> 00:10:41.800
+of things, but it doesn't depend real deeply. The more interconnections you have and the harder
+
+00:10:41.900 --> 00:10:48.120
+it is to manage those versions, the more something like this makes sense, right? Absolutely. So the
+
+00:10:48.140 --> 00:10:54.200
+main thing is, like, people really make a connection between isolated work on part of the system
+
+00:10:54.960 --> 00:11:00.260
+into having to have a separate repository for that, which is completely not the case. Like, you can
+
+00:11:00.260 --> 00:11:05.080
+actually have an isolated subpart of the repository, even if it's Git. Git doesn't have—
+
+00:11:05.080 --> 00:11:09.700
+like, Git has submodules and subrepos and all the stuff, but even, like, in a single Git
+
+00:11:09.820 --> 00:11:15.500
+repository you can easily, like, start working and focusing on a small part of the whole
+
+00:11:15.760 --> 00:11:21.040
+monorepo and only care about that, and that's what the monorepo is.
+
+00:11:23.680 --> 00:11:27.920
+Yeah, you know, I'm gonna go ahead and put it out there: I'm not a big fan
+
+00:11:27.940 --> 00:11:37.260
+of microservice architectures. I kind of find it's trading code complexity for DevOps and
+
+00:11:37.380 --> 00:11:41.160
+deployment complexity. And I think we have better tools to manage code complexity than DevOps
+
+00:11:41.540 --> 00:11:49.860
+complexity. But something like this does help you manage those kinds of deployments as well,
+
+00:11:50.340 --> 00:11:50.800
+better, right?
+
+00:11:51.440 --> 00:11:57.900
+Yes, yes. I use the term mini services, not microservices. Microservices is just too
+
+00:11:57.920 --> 00:12:02.360
+much. But then you can have a lot, a number of mini services, but not micro.
+
+00:12:02.780 --> 00:12:05.480
+It's like micro was just too much of a mixed feed.
+
+00:12:05.520 --> 00:12:07.440
+Yeah, I can get on board with that. Amogh, what do you think?
+
+00:12:08.320 --> 00:12:12.040
+Yeah, I like that idea as well. Mini services. Maybe you should coin that.
+
+00:12:12.880 --> 00:12:16.540
+Yeah, it's like it's the microservices are too small. It feels to me like the
+
+00:12:16.840 --> 00:12:22.620
+equivalent of when you're trying to write unit tests and you're like, oh, what if I
+
+00:12:23.020 --> 00:12:26.580
+get a customer and I set their first name and then I check that their first name is
+
+00:12:26.600 --> 00:12:29.600
+set, like, what are you doing? You don't need to check that assignment works. This is too,
+
+00:12:29.960 --> 00:12:31.840
+you're just too much in the weeds. You know what I mean?
+
+00:12:31.840 --> 00:12:34.320
+This is what AI agents do now all the time.
+
+00:12:38.280 --> 00:12:41.240
+Yeah, think of the code coverage. Just think of the code coverage. Come on.
+
+00:12:43.880 --> 00:12:47.700
+You've got some goals to hit. You said 80% code coverage. Get on top of it.
+
+00:12:49.100 --> 00:12:53.880
+All right. So that's, that sets the stage. Let's talk a little bit about
+
+00:12:54.700 --> 00:12:56.940
+specifically how Apache Airflow
+
+00:12:58.600 --> 00:13:01.120
+has come to need this, basically.
+
+00:13:03.300 --> 00:13:07.160
+You shared with me the GitHub pulse for Apache Airflow.
+
+00:13:08.240 --> 00:13:13.020
+And I think it's kind of worth looking at just how much
+
+00:13:14.560 --> 00:13:17.120
+open source interest and traffic there is.
+
+00:13:17.330 --> 00:13:20.800
+Who wants to kind of summarize this weekly pulse here?
+
+00:13:21.020 --> 00:13:21.380
+I'm okay.
+
+00:13:23.280 --> 00:13:24.100
+No, Mark, do it.
+
+00:13:25.260 --> 00:13:33.700
+So, yep. This is not the best week in terms of the number of commits. We have had even more, right,
+
+00:13:33.920 --> 00:13:43.160
+but this is just one of those weeks. Yeah, one of the usual weeks. So between Feb 3 and Feb 10,
+
+00:13:43.220 --> 00:13:53.440
+we've had about 310 active pull requests. So you can imagine, that's, uh, about 40-plus pull requests
+
+00:13:53.600 --> 00:14:00.620
+a day. A lot of, a lot of them are being assisted by the, uh, you know, the AI revolution going on, but
+
+00:14:00.720 --> 00:14:07.120
+yeah, again, that's a lot of pull requests, and we have merged about 200 of them, and
+
+00:14:08.400 --> 00:14:14.280
+about 100 are open. And similarly with issues, right? Uh, 35 new issues, uh, five issues per day.
+
+00:14:14.920 --> 00:14:21.580
+So that's a lot of traffic. So you can imagine the amount of review pressure each of,
+
+00:14:21.800 --> 00:14:29.080
+uh, each of the maintainers has here. There's 300 pull requests spread across, I don't know,
+
+00:14:29.660 --> 00:14:37.519
+120, 130, maybe 140 distributions, and each of, each of the distributions having like a swim lane owner
+
+00:14:38.660 --> 00:14:42.060
+who is actively trying to take a look at these pull requests.
+
+00:14:42.220 --> 00:14:45.780
+So it's just another week, to be very honest.
+
+00:14:46.860 --> 00:14:50.040
+And it's more than 25 PRs a day, including weekends.
+
+00:14:51.860 --> 00:14:52.340
+Incredible.
+
+00:14:52.620 --> 00:14:57.580
+And how many of these PRs are high value?
+
+00:14:58.380 --> 00:14:59.800
+I guess I'm trying to get the sense of,
+
+00:14:59.920 --> 00:15:01.140
+how much does this get accepted?
+
+00:15:01.280 --> 00:15:03.420
+Are these just people throwing stuff out there
+
+00:15:03.440 --> 00:15:05.320
+that doesn't make sense for the direction of Airflow?
+
+00:15:05.440 --> 00:15:13.060
+So well, those merged all make sense because they are reviewed and merged by Airflow maintainers.
+
+00:15:13.240 --> 00:15:14.520
+And we are very serious about that.
+
+00:15:14.670 --> 00:15:18.300
+So like we don't merge anything that doesn't pass our bar, which is like very high,
+
+00:15:18.780 --> 00:15:19.580
+like extremely high.
+
+00:15:19.670 --> 00:15:28.720
+Like we have 170 prek hooks which are checking if the code is doing what it was supposed to be doing
+
+00:15:28.830 --> 00:15:30.320
+and if it's architected properly.
+
+00:15:30.600 --> 00:15:38.200
+On top of that, we have, uh, individuals, people like Amogh, myself, and then maybe 50 other, uh, PMC
+
+00:15:38.320 --> 00:15:43.880
+members and committers, uh, who are reviewing it and making their comments, and know the system, uh,
+
+00:15:44.280 --> 00:15:50.040
+enough to direct people so they make sense. We do have recently, and that was a recurring theme at
+
+00:15:50.040 --> 00:15:55.860
+the FOSDEM, uh, conference last week when I was there, about, like, the AI-, uh, generated contributions.
+
+00:15:55.920 --> 00:16:01.760
+And many of the AI-generated contributions are not the best quality. It's not like AI is bad
+
+00:16:01.960 --> 00:16:07.440
+quality, it's that, like, many of those are easier to produce and they might have bad quality. So we are
+
+00:16:07.540 --> 00:16:14.460
+now learning how to filter them out and how to handle them quickly. But those are
+
+00:16:14.470 --> 00:16:23.940
+the actual high-value PRs that we merged. Nice. In terms of numbers, if I may, it would be
+
+00:16:23.960 --> 00:16:28.640
+maybe a third of the open pull requests that are nice, as a general trend.
+
+00:16:29.200 --> 00:16:30.320
+Yeah, that's pretty good, honestly.
+
+00:16:31.770 --> 00:16:32.380
+Yep. That's pretty good.
+
+00:16:32.390 --> 00:16:36.920
+We have some guidelines published very recently, and due to that we have seen a dip in
+
+00:16:37.960 --> 00:16:45.840
+such low-quality PRs. We published some guidelines in our contribution guides about what will be the
+
+00:16:45.860 --> 00:16:54.220
+action taken if, uh, you know, bad-quality PRs are raised, or, uh, PRs raised where the author
+
+00:16:54.340 --> 00:17:01.340
+does not know the context but the AI does. Yeah, sure. Yeah, yeah. I don't want to go down this rat
+
+00:17:01.340 --> 00:17:07.260
+hole, people hear this enough lately, but it's been in the news lately that, um, open source
+
+00:17:07.500 --> 00:17:14.800
+projects have been kind of getting a barrage of AI submissions, and I think that comes in a couple of
+
+00:17:14.819 --> 00:17:22.079
+flavors. One, people who just want to get their name listed as a contributor. Maybe it helps them with
+
+00:17:22.079 --> 00:17:27.120
+their job or whatever, so there's like a small incentive there. But it's been really bad for bug
+
+00:17:27.180 --> 00:17:32.480
+bounties. Like, curl closed its bug bounty program because people were trying to make the 50 or 250
+
+00:17:32.780 --> 00:17:38.740
+dollars by finding some issue with AI. Is that a problem for you all, just taking the pulse of a big
+
+00:17:38.740 --> 00:17:44.120
+project like that? It is. I actually had a talk about that at the Global Vulnerability
+
+00:17:44.560 --> 00:17:49.920
+Intelligence Platform Summit just before FOSDEM, so that was exactly, like, I even quoted Daniel
+
+00:17:50.060 --> 00:17:56.640
+Stenberg, and I met him there at FOSDEM, like, it was really cool, uh, that, uh. But there are some
+
+00:17:56.860 --> 00:18:02.280
+different motivations of people who are submitting those AI issues, and we should fight with them
+
+00:18:02.960 --> 00:18:07.920
+in different ways, with different, uh, approaches, or, like, you know, respond to those motivations
+
+00:18:07.940 --> 00:18:13.760
+somehow. We have some ideas. We have an open discussion, uh, in the GitHub maintainers, uh, list
+
+00:18:13.980 --> 00:18:19.340
+right now, and GitHub is trying to, uh, to address it by, like, just discussing what they can
+
+00:18:19.370 --> 00:18:24.200
+do right now, and that's the highest priority for them. Also we have a discussion with OpenSSF
+
+00:18:24.810 --> 00:18:31.760
+for security, uh, kind of guidelines or policies for open source maintainers, how to deal with those
+
+00:18:31.800 --> 00:18:38.160
+issues, and I'm sure we will work out some ways and toolings and, most of all, uh, processes.
+
+00:18:38.380 --> 00:18:42.840
+And, like, being assertive is one thing, like just saying no when the report doesn't meet all the bars
+
+00:18:43.740 --> 00:18:48.480
+immediately. And, you know, directing people to the description is good enough of a, you know,
+
+00:18:48.740 --> 00:18:55.760
+barrier for, uh, you know, getting kind of completely broken PRs, because we have to just make it more
+
+00:18:55.760 --> 00:19:01.860
+expensive for the reporters than for the maintainers to diagnose the issues, or, yeah, decide if the
+
+00:19:01.920 --> 00:19:06.840
+issues are bad or good. Yeah, and I'm not necessarily saying that there's something inherently bad
+
+00:19:07.100 --> 00:19:11.940
+because AI wrote some of the code than a person. AI can write really good code, better than a lot of
+
+00:19:12.040 --> 00:19:19.080
+people I've seen. But it has this sort of shotgun effect often, of just, like, I'm going to change all
+
+00:19:19.180 --> 00:19:24.780
+these files, and it's not as focused and clear a lot of times. It just, it doesn't, it doesn't get
+
+00:19:24.660 --> 00:19:25.560
+the Zen of it, you know?
+
+00:19:26.100 --> 00:19:26.800
+Amogh, what do you think?
+
+00:19:27.420 --> 00:19:28.420
+- Yeah, I agree with that.
+
+00:19:29.540 --> 00:19:32.080
+It'll generate code, which it thinks is good,
+
+00:19:32.240 --> 00:19:34.380
+but we don't really know the ripple effects.
+
+00:19:35.060 --> 00:19:37.700
+And we want to avoid such things.
+
+00:19:38.260 --> 00:19:43.760
+- Yeah, yeah, you have such a long-living app
+
+00:19:43.820 --> 00:19:45.440
+with lots of complexity, right?
+
+00:19:46.020 --> 00:19:48.440
+- We all are using AI for generating the code,
+
+00:19:48.900 --> 00:19:49.220
+to be honest.
+
+00:19:49.560 --> 00:19:51.020
+So like, most of my--
+
+00:19:51.020 --> 00:19:52.520
+- And you should, you should, yeah.
+
+00:19:53.100 --> 00:19:53.300
+- Yeah.
+
+00:19:55.100 --> 00:19:55.560
+It's incredible.
+
+00:19:56.880 --> 00:20:00.200
+I pulled up this graphic here, and I'll link to it in the show notes.
+
+00:20:00.980 --> 00:20:05.160
+Just giving people a sense, I got this little utility that I released this week
+
+00:20:05.230 --> 00:20:09.420
+called Tallyman, which analyzes code and gives you more of a breakdown
+
+00:20:09.610 --> 00:20:11.640
+than just this many lines or whatever.
+
+00:20:12.630 --> 00:20:16.460
+So I want to just highlight, maybe you all can riff on this a little bit
+
+00:20:16.460 --> 00:20:17.160
+to give a sense.
+
+00:20:17.320 --> 00:20:23.860
+So 1.2 million lines of Python, 918,000 excluding comments.
+
+00:20:24.320 --> 00:20:30.140
+Maybe a little overcounting the way this thing works, but still 200,000 lines of reStructuredText.
+
+00:20:31.250 --> 00:20:38.420
+The one that really stood out to me, 81,000 lines of YAML and 16,000 lines of TOML, you guys.
+
+00:20:41.280 --> 00:20:41.940
+That's impressive.
+
+00:20:42.230 --> 00:20:43.180
+And you know what?
+
+00:20:43.680 --> 00:20:49.940
+Hat tip to just a sprinkle, just a hint of Java at 42 lines of Java.
+
+00:20:51.020 --> 00:20:54.520
+But you know, a hundred, almost a million,
+
+00:20:54.980 --> 00:20:57.080
+just over a million lines of code without comments.
+
+00:20:57.960 --> 00:20:59.440
+That's a big project.
+
+00:21:00.400 --> 00:21:01.340
+What do you think?
+
+00:21:03.640 --> 00:21:05.440
+- Amogh, what happened when you joined?
+
+00:21:07.400 --> 00:21:08.220
+- I don't know.
+
+00:21:08.500 --> 00:21:10.920
+I think it was much lesser, but yeah.
+
+00:21:11.420 --> 00:21:12.880
+- Yes, you did contribute a lot.
+
+00:21:13.600 --> 00:21:14.680
+- It's a lot of code.
+
+00:21:15.200 --> 00:21:18.720
+And you can imagine so, because of the number of packages
+
+00:21:18.980 --> 00:21:20.980
+we have in the monorepo discussion,
+
+00:21:21.000 --> 00:21:26.440
+we have a lot of packages, and the YAML might surprise you at first, but
+
+00:21:26.940 --> 00:21:32.000
+if you actually go and see why the YAML, it's mostly for our providers.
+
+00:21:32.790 --> 00:21:35.380
+So integrations with other systems is something we call as providers.
+
+00:21:36.240 --> 00:21:38.680
+And the spec of the providers is written in YAML.
+
+00:21:39.500 --> 00:21:42.400
+And TOML, I'm sure we'll come to it very, very soon.
+
+00:21:42.420 --> 00:21:50.960
+Yeah, yes, that's kind of why I pulled this up actually, is the TOML aspect. Stay
+
+00:21:50.980 --> 00:21:52.460
+with us on that number as we move on.
+
+00:21:53.280 --> 00:21:54.340
+16,000 lines of TOML.
+
+00:21:54.440 --> 00:21:57.720
+That's a lot of pyproject.toml going on right there, folks.
+
+00:21:58.340 --> 00:21:59.000
+Oh, yes.
+
+00:21:59.040 --> 00:21:59.640
+Oh, yes.
+
+00:22:00.320 --> 00:22:01.740
+And lots of it is generated, actually.
+
+00:22:01.900 --> 00:22:05.580
+So because we actually generate quite a lot of the YAML
+
+00:22:05.600 --> 00:22:07.760
+and TOML that we have, and we keep it in the repo,
+
+00:22:08.120 --> 00:22:10.280
+because we don't want to regenerate every time.
+
+00:22:10.560 --> 00:22:14.960
+So it's a lot-- we don't write YAML by hand, no.
+
+00:22:16.019 --> 00:22:17.100
+Yeah, of course.
+
+00:22:18.140 --> 00:22:18.480
+Very cool.
+
+00:22:18.660 --> 00:22:21.300
+- Okay, so now, you know, let's,
+
+00:22:22.320 --> 00:22:24.460
+maybe we can start by introducing this
+
+00:22:24.460 --> 00:22:27.940
+by just giving a shout out to this series
+
+00:22:28.100 --> 00:22:31.940
+that you wrote over here on Medium, Jarek.
+
+00:22:32.960 --> 00:22:33.140
+- Yeah.
+
+00:22:33.340 --> 00:22:35.760
+- Modern Python repo for Apache Airflow,
+
+00:22:36.140 --> 00:22:37.300
+parts one through four.
+
+00:22:38.100 --> 00:22:38.460
+- Yes.
+
+00:22:39.620 --> 00:22:42.920
+Yes, I initially started discussing this blog post idea
+
+00:22:43.260 --> 00:22:44.020
+with a few people.
+
+00:22:45.280 --> 00:22:51.000
+And people are busy and I couldn't get people to write it.
+
+00:22:51.220 --> 00:22:52.560
+So I decided to write it myself.
+
+00:22:52.900 --> 00:22:55.800
+Well, with a lot of AI help, of course.
+
+00:22:56.060 --> 00:22:58.500
+It's not that everything is written by hand.
+
+00:22:59.900 --> 00:23:03.180
+And when I wrote it, I realized it's too big.
+
+00:23:03.580 --> 00:23:05.320
+And I had to split it into four.
+
+00:23:06.500 --> 00:23:08.460
+But the idea was to document what we've done.
+
+00:23:09.180 --> 00:23:10.260
+Because I think that a lot of people
+
+00:23:10.260 --> 00:23:13.500
+are struggling with monorepo versus multirepo,
+
+00:23:13.520 --> 00:23:15.720
+or like how they should do their repository
+
+00:23:16.140 --> 00:23:18.380
+when their project grows.
+
+00:23:19.500 --> 00:23:22.600
+And there were lots of discussions in the past,
+
+00:23:22.760 --> 00:23:26.260
+including one of the podcasts of yours,
+
+00:23:26.460 --> 00:23:28.100
+where it was monorepo versus multirepo.
+
+00:23:28.100 --> 00:23:29.580
+And I can't remember who that was,
+
+00:23:29.580 --> 00:23:31.940
+but there was discussion about like going back and forth
+
+00:23:32.160 --> 00:23:34.420
+and like finding that people sometimes go back
+
+00:23:34.600 --> 00:23:37.640
+and then go forth and like in different directions
+
+00:23:37.900 --> 00:23:41.520
+because there are different problems with both approaches.
+
+00:23:42.180 --> 00:23:44.780
+So I just wanted to document the reasoning why we are doing it,
+
+00:23:45.880 --> 00:23:49.680
+like why it's possible now because of the packaging ecosystem
+
+00:23:50.700 --> 00:23:56.140
+maturing for Python, and uv and other tools coming into the space.
+
+00:23:57.080 --> 00:24:01.160
+And then the last part was like really the kind of a little bit innovative
+
+00:24:01.330 --> 00:24:05.000
+approach that we do, where the tooling is still not catching up with what we need
+
+00:24:05.100 --> 00:24:06.900
+and what we did.
+
+00:24:07.120 --> 00:24:11.460
+So yeah, so those are the kind of history,
+
+00:24:11.840 --> 00:24:14.840
+why we are doing it, the packaging,
+
+00:24:16.779 --> 00:24:19.300
+the automated verification with prek,
+
+00:24:19.470 --> 00:24:20.360
+that was the third part.
+
+00:24:20.370 --> 00:24:23.000
+And the fourth part was about like this shared
+
+00:24:24.600 --> 00:24:28.040
+libraries, innovative concept that we added for Airflow.
+
+00:24:28.520 --> 00:24:28.960
+- Nice.
+
+00:24:29.090 --> 00:24:31.980
+Yeah, I'll link to the series as well as to a talk
+
+00:24:31.980 --> 00:24:34.580
+that you gave at FOSDEM that just got published, right?
+
+00:24:35.600 --> 00:24:36.740
+Yes, a few days ago, yes.
+
+00:24:37.240 --> 00:24:38.320
+Yeah, that's a good talk.
+
+00:24:38.400 --> 00:24:41.880
+They have an amazing system of recording and publishing stuff.
+
+00:24:42.180 --> 00:24:45.460
+Like for the volunteer-driven conference, 1,000 speakers.
+
+00:24:46.080 --> 00:24:47.240
+Oh, that's amazing.
+
+00:24:47.440 --> 00:24:48.480
+That works like brilliant.
+
+00:24:50.160 --> 00:24:51.740
+Probably some automation going on there.
+
+00:24:52.340 --> 00:24:53.480
+Oh, a lot.
+
+00:24:55.980 --> 00:25:00.760
+Yeah, so let's talk a little bit about, I guess, the problems that you ran into,
+
+00:25:02.600 --> 00:25:06.820
+because initially there were some challenges with the standards and tooling
+
+00:25:06.940 --> 00:25:11.080
+not being there. And you actually, one of the takeaways, if people read the series or
+
+00:25:11.120 --> 00:25:16.600
+watch the talk, is you actually had to work with some of the tool providers to
+
+00:25:16.760 --> 00:25:21.020
+make this possible. So not only is it like, well, the tools have changed, we
+
+00:25:21.180 --> 00:25:26.020
+could do this, it's you all have changed the tools a little bit through, you know,
+
+00:25:25.860 --> 00:25:33.860
+working closely, like, hey, we've got this 1-million-line project with 100 submodules or more, help,
+
+00:25:34.500 --> 00:25:39.600
+like, it's your tools to support this, help me make this work, right? What were some of the problems?
+
+00:25:41.740 --> 00:25:46.140
+Okay, so let me start with this cooperation, and maybe, you know, Amogh can also explain, like, what
+
+00:25:46.180 --> 00:25:51.480
+was before and after, because, like, he experienced that firsthand as a user, kind of, this kind of
+
+00:25:51.300 --> 00:25:57.620
+repository structure. But for me, the idea was, like, I was working on it for years, like, uh, when
+
+00:25:57.740 --> 00:26:03.920
+we went to Airflow 2, five years ago, uh, or four years ago, I can't remember, that's a long time,
+
+00:26:04.420 --> 00:26:09.000
+we didn't have all the tooling, and we had to do pretty much everything that we do now with the,
+
+00:26:09.140 --> 00:26:16.880
+with monorepo and uv, uh, by hand, uh, by Bash scripts by that time. Crazy. So like, if you ran it three years ago, your code, you would see more than 10,000 lines of Bash code, which I wrote. Oh, wow. We— since
+
+00:26:24.620 --> 00:26:30.120
+that is not joyful, that doesn't spark joy. No, no, no, that's why we removed it, with some Outreachy
+
+00:26:30.360 --> 00:26:36.200
+internship actually, and shout out to Edith and Bowrna, who were our Outreachy mentees, who helped us to
+
+00:26:36.879 --> 00:26:44.480
+convert it to Python, which was really helpful. Uh, so that's how it started: no tooling, uh, a need,
+
+00:26:44.540 --> 00:26:50.500
+because we grew, we wanted to have more providers, more integrations, and it already was
+
+00:26:50.980 --> 00:26:55.760
+quite difficult to manage if they all were part of a single distribution, so we had to split into
+
+00:26:56.240 --> 00:27:03.880
+many distributions, 60 I think at the beginning, now we have more than 100, uh, and, uh, when we
+
+00:27:03.890 --> 00:27:09.760
+did that, we had to do it all manually, and, like, working with that was, like, really cumbersome.
+
+00:27:09.820 --> 00:27:14.940
+And maybe, you know, like, I can switch to Amogh so he can say, like, the past experience, the new
+
+00:27:15.060 --> 00:27:21.680
+experience, because, like, he experienced the change himself. Yeah, the past experience was
+
+00:27:21.780 --> 00:27:28.960
+scary, to say the least. Uh, whenever I, uh, switched branches or had to rebase for whatever
+
+00:27:29.140 --> 00:27:35.680
+reason, I had a nightmare, a very bad time, trying to, you know, package things together and try
+
+00:27:35.700 --> 00:27:36.300
+to run something.
+
+00:27:36.440 --> 00:27:38.200
+And I think Jarek found me often,
+
+00:27:39.220 --> 00:27:40.680
+ranting on the Slack channels that,
+
+00:27:40.700 --> 00:27:42.380
+hey, this doesn't work, hey, that doesn't work,
+
+00:27:42.580 --> 00:27:42.940
+what do we do?
+
+00:27:44.740 --> 00:27:46.300
+Now it's very easy.
+
+00:27:48.059 --> 00:27:50.320
+It's effortless, almost effortless compared to
+
+00:27:51.060 --> 00:27:53.040
+what we had years, maybe like five years ago,
+
+00:27:53.140 --> 00:27:53.640
+four years ago.
+
+00:27:54.340 --> 00:27:54.960
+- Yeah, amazing.
+
+00:27:55.580 --> 00:27:56.680
+How does GitHub deal?
+
+00:27:57.780 --> 00:27:58.840
+- Just to add to that,
+
+00:27:59.060 --> 00:27:59.240
+- Go ahead.
+
+00:27:59.360 --> 00:28:00.320
+- Before we go on.
+
+00:28:00.700 --> 00:28:02.900
+So like, the part of it was also that I've
+
+00:28:05.240 --> 00:28:09.500
+been listening. So like, I was the only one who actually managed the whole thing for years, and
+
+00:28:09.550 --> 00:28:14.960
+I was, like, overwhelmed as well when people have problems, of course. So then the change that we've
+
+00:28:15.020 --> 00:28:20.020
+done was not only with the tooling. And as you mentioned, we were actually cooperating with Charlie
+
+00:28:20.500 --> 00:28:27.000
+from Astral, Charlie Marsh, and with Jo from prek, uh, because we had this need, we had it implemented
+
+00:28:27.220 --> 00:28:31.700
+ourselves, and then they could look at how we've done that and they could implement it properly in
+
+00:28:31.720 --> 00:28:37.280
+their tooling. And we've been, like, exchanging the, you know, like, Charlie was even interviewing me at
+
+00:28:37.360 --> 00:28:43.340
+some point of time about what our needs are. Uh, so, I have, for a long time I have this
+
+00:28:43.960 --> 00:28:50.960
+motto that, uh, the best way to foresee the future is to shape it. And, like, so we did shape the future
+
+00:28:51.160 --> 00:28:55.760
+by, you know, talking to those tool providers, or builders, so that they could build
+
+00:28:55.770 --> 00:29:00.400
+it for us and work with us, and we helped them to test and everything like that. But also it was, like,
+
+00:29:00.360 --> 00:29:07.120
+listening to Amogh and other contributors, like, all the problems they had, or, like, and then when I
+
+00:29:07.180 --> 00:29:12.240
+solved it, I wouldn't only solve it with the new tooling, but we also engaged all the
+
+00:29:12.540 --> 00:29:17.800
+more people from the team, like Amogh and a few other active contributors, and they were
+
+00:29:17.980 --> 00:29:22.800
+actually part of the whole process of conversion, and they are now part of the team. And now we can
+
+00:29:22.840 --> 00:29:27.539
+have this podcast while things are being broken in Airflow right now, and somebody is probably
+
+00:29:27.560 --> 00:29:28.800
+fixing it as we speak.
+
+00:29:28.980 --> 00:29:30.360
+So not me anymore.
+
+00:29:30.540 --> 00:29:33.600
+So all those things are really great.
+
+00:29:34.860 --> 00:29:36.060
+- That's really, really good.
+
+00:29:37.260 --> 00:29:37.740
+Yeah, incredible.
+
+00:29:38.660 --> 00:29:41.600
+How does GitHub deal with so many files
+
+00:29:41.860 --> 00:29:42.700
+and such a big project?
+
+00:29:42.940 --> 00:29:45.760
+Is it fine or is it a challenge?
+
+00:29:46.160 --> 00:29:49.140
+- Except yesterday, where half of the time-
+
+00:29:49.140 --> 00:29:50.780
+- Yeah, yeah, except yesterday.
+
+00:29:50.780 --> 00:29:52.380
+Yeah, for people who don't know,
+
+00:29:52.460 --> 00:29:55.100
+yesterday morning, at least morning US time,
+
+00:29:55.100 --> 00:29:55.480
+GitHub was having a moment.
+
+00:29:55.720 --> 00:29:58.380
+Like it was, I couldn't clone stuff.
+
+00:29:58.840 --> 00:30:04.180
+I pulled up a random page on GitHub and got the 503 Unicorn.
+
+00:30:04.480 --> 00:30:05.520
+It was not good, right?
+
+00:30:05.720 --> 00:30:07.480
+Besides that, excluding that time.
+
+00:30:08.040 --> 00:30:11.900
+So the Unicorn is actually a little bit like looking kind of angry at you.
+
+00:30:12.360 --> 00:30:14.340
+That's one of the observations I had from yesterday.
+
+00:30:14.870 --> 00:30:18.260
+I saw it so many times that it doesn't look nice.
+
+00:30:18.840 --> 00:30:19.520
+I agree.
+
+00:30:19.580 --> 00:30:20.940
+That's not a great error page.
+
+00:30:21.080 --> 00:30:23.420
+Like some error pages are amazing where it's like,
+
+00:30:23.940 --> 00:30:25.780
+you know, the coyote fell off of a cliff.
+
+00:30:26.000 --> 00:30:26.140
+Woo!
+
+00:30:26.760 --> 00:30:28.680
+That one just looks like it's angry back at you.
+
+00:30:29.380 --> 00:30:30.140
+- Exactly, exactly.
+
+00:30:30.460 --> 00:30:32.100
+So no, besides that, it's perfect.
+
+00:30:32.360 --> 00:30:33.720
+It's like, it works like seamlessly,
+
+00:30:33.960 --> 00:30:36.100
+no problems whatsoever with the size, with the numbers.
+
+00:30:36.340 --> 00:30:38.000
+Like we are very, very happy in general.
+
+00:30:38.660 --> 00:30:40.300
+And of course, like things like that happen.
+
+00:30:40.420 --> 00:30:41.260
+There is nothing wrong.
+
+00:30:41.660 --> 00:30:42.900
+Like there is something wrong,
+
+00:30:42.980 --> 00:30:45.020
+but like it's not like that it happens all the time.
+
+00:30:45.200 --> 00:30:45.440
+Not really.
+
+00:30:45.660 --> 00:30:47.740
+- No, it's actually, it's super rare.
+
+00:30:47.900 --> 00:30:49.100
+GitHub is an incredible service.
+
+00:30:49.400 --> 00:30:56.800
+I mean, I know there's been some grief about the GitHub Actions, but that's a different conversation, right?
+
+00:30:57.060 --> 00:30:57.200
+Yeah.
+
+00:30:59.340 --> 00:31:00.040
+All right.
+
+00:31:00.300 --> 00:31:09.120
+Let's talk next about how the package standards have changed and how basically some of those things have made it possible.
+
+00:31:09.500 --> 00:31:12.580
+So in your talk, you pulled up a bunch of different PEPs,
+
+00:31:13.080 --> 00:31:14.180
+nine of them or something like that,
+
+00:31:15.480 --> 00:31:17.400
+that were about packaging,
+
+00:31:18.780 --> 00:31:21.160
+recently packaging standards and different things like that,
+
+00:31:21.300 --> 00:31:25.060
+that have made basically the structure that you're working with
+
+00:31:25.120 --> 00:31:26.440
+and the tools that do it possible.
+
+00:31:27.500 --> 00:31:29.060
+Do you want to maybe highlight, either of you,
+
+00:31:29.100 --> 00:31:31.300
+some of these things that stand out as like,
+
+00:31:31.740 --> 00:31:33.460
+actually this one is really important.
+
+00:31:37.400 --> 00:31:41.760
+Well, for me, the one which is maybe not super related to monorepo,
+
+00:31:41.760 --> 00:31:44.460
+but it actually helped us a lot, like PEP 723,
+
+00:31:45.180 --> 00:31:49.180
+the last but one, inline script metadata,
+
+00:31:50.060 --> 00:31:52.200
+which is like one of the biggest successes
+
+00:31:52.500 --> 00:31:57.160
+and the biggest kind of usages I see from a PEP implemented.
+
+00:31:57.380 --> 00:31:59.460
+It caught up very, very quickly.
+
+00:32:00.080 --> 00:32:04.560
+It allows you to embed inline script metadata into the Python scripts,
+
+00:32:04.760 --> 00:32:07.460
+which is like something that we've been dreaming of for years,
+
+00:32:08.280 --> 00:32:11.200
+especially for this kind of tooling, the CI environment, etc.
+
+00:32:11.520 --> 00:32:12.860
+This is really, really helpful.
+
+00:32:12.940 --> 00:32:14.760
+So that's the one that I would like to highlight.
+
+00:32:15.520 --> 00:32:18.480
+But I read all of them many times, all the PEPs,
+
+00:32:18.580 --> 00:32:20.980
+and they are difficult things to read and understand.
+
+00:32:23.020 --> 00:32:26.560
+But they were like, we actually did all that we could
+
+00:32:26.800 --> 00:32:28.780
+to be fully compliant,
+
+00:32:29.580 --> 00:32:31.540
+not only with the specification of those PEPs,
+
+00:32:31.900 --> 00:32:34.400
+but also with the kind of spirit of the specification.
+
+00:32:34.740 --> 00:32:39.020
+Sometimes things are not very precisely described and there are some interpretations and stuff,
+
+00:32:39.700 --> 00:32:45.700
+so we just made sure, and this is our goal as well, like, we just make sure that all the
+
+00:32:45.800 --> 00:32:51.640
+PEP standards that are being published are actually very, very meticulously followed, and we just try
+
+00:32:51.640 --> 00:32:56.460
+to adapt to any changes that are coming in the environment. So we know how difficult it is if
+
+00:32:56.580 --> 00:33:00.879
+people are sticking to the old ways, and, like, that makes it difficult for Python maintainers.
+
+00:33:05.080 --> 00:33:06.380
+- Amogh, any other thoughts?
+
+00:33:07.480 --> 00:33:13.960
+- Yeah, this one is a particularly very important one for us also because it simplifies our
+
+00:33:13.990 --> 00:33:20.960
+pre-commit configurations where earlier we had to specify the dependencies as requires
+
+00:33:21.210 --> 00:33:25.960
+or whatever the particular version was, but now it's all in the script.
+
+00:33:27.540 --> 00:33:34.000
+The pre-commit remains as clean as it could, just with the hook name and, you know, the regexes for
+
+00:33:34.100 --> 00:33:42.180
+the file filter, and minimal configuration for it to work well. And dependency groups is also the other
+
+00:33:42.380 --> 00:33:48.280
+PEP. I don't recall the name, but I recall the number. I think it's six-oh... I can't remember all the
+
+00:33:48.400 --> 00:33:56.340
+numbers, but one of that would be 735, folks. 735, yeah. So that's also particularly nice for us. We can
+
+00:33:57.380 --> 00:34:05.400
+define the dependency groups in our pyprojects, and it's really
+
+00:34:05.410 --> 00:34:12.280
+nice how it works with uv. So we are very happy with this particular dependency groups PEP, as
+
+00:34:12.280 --> 00:34:18.080
+well as the inline scripts. I think, right, the inline scripts are cool. You know, especially
+
+00:34:18.260 --> 00:34:26.799
+with uv these days, it really makes running some kind of Python code so much easier. It's almost
+
+00:34:26.820 --> 00:34:32.980
+as if everything is standard library. You know, I can give somebody a file, I can say the way you run
+
+00:34:33.100 --> 00:34:38.320
+it. No, no, no, don't... I know it looks like you say python, but don't say that. You say uv run this,
+
+00:34:38.560 --> 00:34:43.540
+and then that's it. Like, they don't have to have Python. They might need 10 dependencies and
+
+00:34:43.600 --> 00:34:49.340
+so on, but it doesn't matter, right? Yeah, yeah. And being standard, it makes it also, you know, like,
+
+00:34:49.419 --> 00:34:54.879
+other tools are doing the same, or hatch run, that's the same. That's like, yeah. And then there,
+
+00:34:54.899 --> 00:34:57.060
+there is even support for inline script metadata
+
+00:34:57.460 --> 00:35:00.160
+just released in latest pip 26.
+
+00:35:00.960 --> 00:35:04.280
+So it's all good because of the standards
+
+00:35:04.380 --> 00:35:06.140
+and not because a single particular tool
+
+00:35:06.200 --> 00:35:07.620
+does it in an opinionated way.
+
+00:35:07.740 --> 00:35:09.680
+So this is really, really, really cool.
+
+00:35:09.860 --> 00:35:13.780
+And there is one big benefit of those kinds of PEPs,
+
+00:35:14.100 --> 00:35:16.060
+and particularly inline script metadata,
+
+00:35:16.660 --> 00:35:18.880
+is we have less YAML because of that.
+
+00:35:19.180 --> 00:35:19.480
+- Yeah.
+
+00:35:21.600 --> 00:35:22.760
+- You already have a lot of YAML.
+
+00:35:22.860 --> 00:35:23.680
+- Less is better.
+
+00:35:24.840 --> 00:35:26.740
+We have a lot still, we can't complain about that.
+
+00:35:28.380 --> 00:35:29.480
+- It's better than it was.
+
+00:35:30.020 --> 00:35:32.020
+Yeah, and so the dependency groups are like,
+
+00:35:32.740 --> 00:35:37.060
+you know, for dev or for test or something like that, right?
+
+00:35:37.060 --> 00:35:44.640
+So you can say like uv sync or uv pip install
+
+00:35:44.820 --> 00:35:47.160
+and you can say like thing bracket dev
+
+00:35:47.200 --> 00:35:48.060
+or something like that, right?
+
+00:35:48.540 --> 00:35:51.380
+- Or actually the nice thing about uv sync
+
+00:35:51.520 --> 00:35:53.920
+is that it syncs the dev dependencies automatically
+
+00:35:53.940 --> 00:35:55.220
+without you even specifying that,
+
+00:35:55.790 --> 00:35:57.860
+which is like the best thing for development
+
+00:35:58.080 --> 00:35:59.460
+because you actually always want to have
+
+00:35:59.580 --> 00:36:02.120
+development tools with you.
+
+00:36:02.600 --> 00:36:03.520
+- That's a good point, yeah.
+
+00:36:04.540 --> 00:36:04.940
+Totally agree.
+
+00:36:05.170 --> 00:36:06.200
+Okay, fantastic.
+
+00:36:06.200 --> 00:36:07.600
+- That's really cool, that's really cool.
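The dependency groups being discussed look roughly like this in a pyproject.toml; this is a hypothetical fragment with illustrative group names, not Airflow's actual configuration.

```toml
# Hypothetical pyproject.toml fragment using PEP 735 dependency groups.
# Groups are named lists of requirements, kept outside the package's
# runtime dependencies.
[dependency-groups]
dev = [
    "pytest>=8",
    { include-group = "lint" },  # a group can include another group
]
lint = ["ruff"]
```

As mentioned above, `uv sync` installs the `dev` group by default; other groups can be selected explicitly, for example with a flag such as `uv sync --group lint`.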
+
+00:36:10.300 --> 00:36:13.540
+- So that was the changes to Python itself
+
+00:36:13.690 --> 00:36:15.960
+through the PEPs, but there's also tools,
+
+00:36:16.240 --> 00:36:18.820
+and you've already mentioned some of them, both of them,
+
+00:36:18.960 --> 00:36:20.460
+but tools that make this possible,
+
+00:36:21.140 --> 00:36:24.420
+which I mean, I think uv has to be number one
+
+00:36:24.510 --> 00:36:25.620
+that goes on this list, right?
+
+00:36:25.720 --> 00:36:29.380
+Like uv has really done some powerful stuff here, right?
+
+00:36:29.950 --> 00:36:30.900
+- Yeah, absolutely.
+
+00:36:31.820 --> 00:36:33.820
+Yeah, so maybe again, Amogh can say, like,
+
+00:36:34.580 --> 00:36:37.360
+I introduced it, but Amogh was the one to switch
+
+00:36:37.410 --> 00:36:38.920
+us to uv at some point of time.
+
+00:36:40.560 --> 00:36:42.720
+- Yup, uv has been a game changer.
+
+00:36:42.790 --> 00:36:45.100
+I think we were using Poetry before this, or?
+
+00:36:45.860 --> 00:36:47.280
+- No, no, no, not even that, just pip.
+
+00:36:48.210 --> 00:36:48.620
+Just pip, just pip.
+
+00:36:48.840 --> 00:36:49.360
+- Just pip, right?
+
+00:36:50.280 --> 00:36:53.220
+It's so good. I don't even remember the last thing.
+
+00:36:53.900 --> 00:36:55.440
+Yeah. So, yep.
+
+00:36:55.580 --> 00:36:58.220
+I think the main, you know,
+
+00:36:58.220 --> 00:37:02.060
+game changing aspect that uv brought in was this notion of workspaces.
+
+00:37:04.259 --> 00:37:08.680
+It's something very, you can compare it very similar to, you know,
+
+00:37:09.160 --> 00:37:13.240
+a coworking space or something similar, where it's a unified environment where
+
+00:37:14.920 --> 00:37:19.020
+multiple interconnected pieces coexist and they're very easy to manage.
+
+00:37:19.960 --> 00:37:22.360
+And that's something that eventually led us
+
+00:37:22.650 --> 00:37:26.880
+to splitting the whole repository across our distributions.
+
+00:37:26.990 --> 00:37:29.080
+And that's the reason you see so many TOML files.
+
+00:37:29.800 --> 00:37:32.240
+So everything has a pyproject.toml,
+
+00:37:32.440 --> 00:37:35.680
+everything defines the dependency groups it needs,
+
+00:37:36.400 --> 00:37:39.700
+and development of a particular package
+
+00:37:40.420 --> 00:37:42.020
+is restricted only to its dependencies.
+
+00:37:43.360 --> 00:37:46.040
+So you develop it, you run uv sync,
+
+00:37:46.840 --> 00:37:50.940
+you can run your pytest using uv,
+
+00:37:51.260 --> 00:37:53.360
+and everything that is supposed to run with it
+
+00:37:53.560 --> 00:37:57.240
+is running with it, and any bad or cross imports
+
+00:37:57.320 --> 00:37:58.520
+are caught really easily.
+
+00:37:58.780 --> 00:38:02.780
+So I think the workspace feature, at least,
+
+00:38:02.940 --> 00:38:05.020
+was the most important one for me,
+
+00:38:05.880 --> 00:38:07.640
+and obviously the speed that it brings with it.
+
+00:38:07.820 --> 00:38:10.380
+So that's very impressive as well.
+
+00:38:11.140 --> 00:38:12.020
+It is.
+
+00:38:12.080 --> 00:38:15.920
+And I think this workspace concept--
+
+00:38:15.940 --> 00:38:16.560
+It's new to me.
+
+00:38:16.640 --> 00:38:17.560
+I'll say it's new to me.
+
+00:38:17.580 --> 00:38:20.180
+I don't know how new it is to other folks.
+
+00:38:22.080 --> 00:38:26.120
+So you've got this giant monorepo.
+
+00:38:27.020 --> 00:38:29.720
+And how many different, conceptually,
+
+00:38:29.920 --> 00:38:34.540
+different packages or projects are in there right now?
+
+00:38:35.160 --> 00:38:35.960
+120 plus.
+
+00:38:36.700 --> 00:38:39.440
+It changes by the day because Amogh is doing a lot
+
+00:38:39.580 --> 00:38:41.360
+to increase the number very, very quickly,
+
+00:38:41.880 --> 00:38:44.560
+because we are just now in the middle of finishing
+
+00:38:44.580 --> 00:38:46.960
+some isolation kind of restructuring.
+
+00:38:47.540 --> 00:38:48.360
+And Amogh is the one that,
+
+00:38:48.520 --> 00:38:50.900
+that's why he's here also, to lead the introduction
+
+00:38:51.120 --> 00:38:52.840
+of new packages that we,
+
+00:38:52.900 --> 00:38:54.580
+or new distributions that we have,
+
+00:38:54.740 --> 00:38:57.400
+like the shared libraries that we will talk about later.
+
+00:38:58.080 --> 00:39:00.000
+So we have a lot of those, yes.
+
+00:39:00.500 --> 00:39:01.180
+Yeah, amazing.
+
+00:39:01.880 --> 00:39:04.060
+So I think this is super important to dive into,
+
+00:39:04.360 --> 00:39:05.840
+and how uv makes this possible.
+
+00:39:05.960 --> 00:39:08.200
+And I think you said also Hatch,
+
+00:39:08.420 --> 00:39:09.460
+you talked with Ofek,
+
+00:39:10.020 --> 00:39:11.860
+who runs Hatch, as well about this, right?
+
+00:39:12.320 --> 00:39:12.760
+Yes, yes.
+
+00:39:12.980 --> 00:39:14.260
+Hatch is also supporting workspaces,
+
+00:39:14.340 --> 00:39:20.920
+which are modeled mainly after what uv has done. We haven't tried it yet,
+
+00:39:21.140 --> 00:39:25.660
+but I've heard it's very, very similar, or even, like, you can use it as a one-to-one replacement
+
+00:39:26.100 --> 00:39:32.020
+in some cases, or maybe even in all. But generally, I would love this eventually to become some kind
+
+00:39:32.020 --> 00:39:37.360
+of standard so that multiple tools are supporting this. But yes, there are a few other tools
+
+00:39:37.480 --> 00:39:43.060
+that we were considering before, but uv is by far the kind of, like, yeah, well, we worked together, we
+
+00:39:42.980 --> 00:39:47.540
+shaped it together with the uv team. So it definitely works well.
+
+00:39:47.600 --> 00:39:53.620
+Amazing. Yeah, amazing. So let me describe this a little bit. And then you all can, can actually
+
+00:39:53.860 --> 00:39:59.700
+introduce it.
So the idea is we've got this monorepo with a bunch of different folders for the
+
+00:40:00.660 --> 00:40:07.460
+sections, right, like airflow dash CLI or CTL, and airflow dash core and so on. And you'd like to be
+
+00:40:07.440 --> 00:40:13.120
+able to kind of just jump into one section and treat it as a top level project, right? It's got
+
+00:40:13.120 --> 00:40:18.180
+a pyproject.toml, it's got a source file, tests, and so on. But the challenge is you can't just
+
+00:40:18.180 --> 00:40:24.240
+have a bunch of disconnected pieces. Like maybe Airflow core depends on five other parts of it
+
+00:40:24.340 --> 00:40:31.240
+that also themselves have their own pyproject.toml and different things. And you've got to set up,
+
+00:40:31.520 --> 00:40:34.780
+you know, set up. If you jump into the Airflow core, you got to set up the environment just
+
+00:40:34.820 --> 00:40:40.240
+right to be working on those other parts, right? It sounds, um, it sounds pretty tricky. So how does,
+
+00:40:40.690 --> 00:40:48.300
+how does that work? Who wants to make sense of this for us? Yeah, okay. So, like, it works perfectly,
+
+00:40:48.730 --> 00:40:52.880
+like it's super, it's super simple, actually. You know, like, the whole thing about uv is
+
+00:40:53.080 --> 00:40:58.380
+like its simplicity. Not of the concept, the implementation is actually
+
+00:40:58.560 --> 00:41:02.900
+quite tricky, but the way you use it is very simple. You just go to the directory and run
+
+00:41:02.880 --> 00:41:08.840
+uv sync. That's basically it. So, like, you go to the directory you want to work on, and
+
+00:41:08.840 --> 00:41:14.440
+it does exactly what you would expect it to do, which means that it syncs, it actually updates or
+
+00:41:15.220 --> 00:41:19.260
+recreates, basically, the virtual environment that you're using with all the dependencies that this
+
+00:41:19.500 --> 00:41:26.000
+particular distribution needs, and anything that it
needs as a transitive dependency as well.
+
+00:41:26.370 --> 00:41:30.780
+So if it refers to another project inside the workspace, it will also use it from there,
+
+00:41:30.860 --> 00:41:35.120
+not from, like, installed from PyPI. So we can immediately start working on this, because
+
+00:41:35.760 --> 00:41:39.400
+after uv sync, everything is exactly as you expect for this particular
+
+00:41:40.040 --> 00:41:44.560
+subset of the repository that you work on. And that's basically it. This is all,
+
+00:41:44.710 --> 00:41:49.760
+like, there is nothing more, basically. That's it. It works. And when
+
+00:41:49.860 --> 00:41:55.580
+you're done, you run uv sync, pytest run, it will do exactly what you want in this folder,
+
+00:41:55.600 --> 00:41:58.200
+because it will just, oh, sorry, uv run pytest.
+
+00:41:58.280 --> 00:41:58.680
+- uv run pytest.
+
+00:41:58.680 --> 00:41:59.760
+- You can do exactly what you want,
+
+00:41:59.940 --> 00:42:02.060
+because even uv run will automatically sync
+
+00:42:03.180 --> 00:42:05.660
+the virtual env very, very quickly
+
+00:42:06.160 --> 00:42:08.060
+to the one that your project needs,
+
+00:42:08.420 --> 00:42:10.260
+and then it will just run pytest
+
+00:42:10.420 --> 00:42:11.340
+in this virtual environment,
+
+00:42:11.620 --> 00:42:13.120
+and it will run all the tests in your project,
+
+00:42:13.220 --> 00:42:14.560
+and that's basically it.
+
+00:42:15.020 --> 00:42:17.180
+So it's like, conceptually for the users,
+
+00:42:17.380 --> 00:42:19.040
+it's like, you don't have to do much,
+
+00:42:19.180 --> 00:42:20.640
+just uv sync, and that's it.
+
+00:42:24.520 --> 00:42:30.000
+Yeah, how interesting. I think one of the big challenges here is, you know, how do
+
+00:42:30.170 --> 00:42:37.460
+different parts of the project know about each other, right? Yeah, yeah. You said that it, uh, it sym-
+
+00:42:37.780 --> 00:42:44.200
+links the different elements in. Well, okay, so, like, the basic kind of workspace
+
+00:42:44.460 --> 00:42:48.580
+implementation is just a workspace definition. So you have to have the definition of the workspace
+
+00:42:48.920 --> 00:42:54.400
+in the top-level pyproject.toml. So there you have all of them listed, you have links to it, you
+
+00:42:54.420 --> 00:43:00.780
+describe where they are, and uv will read the pyproject.toml from the top level and will
+
+00:43:01.260 --> 00:43:03.520
+know where to look for particular distributions.
+
+00:43:04.190 --> 00:43:10.680
+So that's the simple discovery, and the way we know that we are using it from the sources
+
+00:43:10.900 --> 00:43:13.780
+and not from PyPI.
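The workspace definition just described boils down to a couple of TOML fragments. Here is a hypothetical minimal layout (member and package names are illustrative, not Airflow's actual configuration):

```toml
# Top-level pyproject.toml: declares which directories belong to the workspace.
[tool.uv.workspace]
members = ["airflow-core", "airflow-ctl", "providers/*"]

# In a member's own pyproject.toml, a sibling dependency is marked as coming
# from the workspace sources rather than from PyPI:
#
# [project]
# dependencies = ["apache-airflow-core"]
#
# [tool.uv.sources]
# apache-airflow-core = { workspace = true }
```

With that in place, running `uv sync` inside a member directory resolves sibling packages from their source trees in the repo, which is the "from the sources, not from PyPI" behavior mentioned above.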
+
+00:43:14.920 --> 00:43:19.840
+But then, like, the shared libraries is something that we added on top of it, and the
+
+00:43:19.780 --> 00:43:25.400
+symlinks are on top of it, and this is a kind of extra innovative thing that we are doing for
+
+00:43:25.580 --> 00:43:29.680
+something else that we need. But you know, we can talk about that now, or, like, Amogh can
+
+00:43:29.780 --> 00:43:37.640
+talk about that. Sure, okay. Yeah, good. Yeah, we can talk about it when we're talking about it.
+
+00:43:39.700 --> 00:43:41.480
+Okay, sounds good. So
+
+00:43:43.860 --> 00:43:51.180
+this is really cool. One of the things that happens here is these different slices or
+
+00:43:51.500 --> 00:43:56.640
+subsections of the monorepo have a pyproject.toml. That pyproject.toml
+
+00:43:56.860 --> 00:44:03.540
+defines its true dependencies and its dev dependencies and so on. So when you go and jump
+
+00:44:03.740 --> 00:44:12.320
+into a section, uv will basically realign the virtual environment with whatever
+
+00:44:13.440 --> 00:44:17.080
+dependencies are supposed to be there from those things, right? So that means installing stuff,
+
+00:44:17.400 --> 00:44:22.160
+obviously, but actually what surprised me a little bit, not a lot, but, oh yeah, I guess it does do that,
+
+00:44:22.280 --> 00:44:28.820
+that's cool, is it actually uninstalls stuff that's not explicitly put there. Which, I can imagine,
+
+00:44:29.300 --> 00:44:35.900
+before that, you could be like, well, this one part way down here depends on this weird library,
+
+00:44:37.060 --> 00:44:41.640
+and somehow I used to be over there, then I went back to this other piece, then I came back,
+
+00:44:41.560 --> 00:44:43.640
+and I forgot where that even came from.
+
+00:44:43.720 --> 00:44:45.260
+Like, why is that in my virtual environment?
+
+00:44:45.500 --> 00:44:47.080
+And like, how do I specify that?
+
+00:44:47.360 --> 00:44:49.140
+Probably juggling that was a big problem, right?
+
+00:44:49.260 --> 00:44:52.700
+This, like, loading and unloading dependencies
+
+00:44:52.920 --> 00:44:55.020
+based on what part of the monorepo you're in.
+
+00:44:55.020 --> 00:44:58.880
+And I think that actually makes it really much easier
+
+00:44:58.940 --> 00:45:01.000
+to deal with like this type of code structure.
+
+00:45:01.740 --> 00:45:02.260
+Yeah, absolutely.
+
+00:45:02.540 --> 00:45:04.300
+And let me add to that one more thing,
+
+00:45:04.400 --> 00:45:06.340
+because it's also not only the dependencies
+
+00:45:06.520 --> 00:45:07.960
+that you might have from somewhere else,
+
+00:45:08.700 --> 00:45:14.940
+but also it's cross-dependencies between different distributions inside. So, for example, if airflow
+
+00:45:15.880 --> 00:45:21.040
+ctl does not use airflow core, if you go there and run uv sync, you will not be able to import and
+
+00:45:21.140 --> 00:45:26.160
+use any of the source code which is in airflow core, because it's not a dependency of airflow ctl.
+
+00:45:26.660 --> 00:45:30.540
+So uv sync will not only uninstall the dependencies that you have, but also, I mean, uninstall the
+
+00:45:30.680 --> 00:45:35.700
+source code that you have from other parts of your repo, which is a fantastic thing for us.
+
+00:45:35.580 --> 00:45:37.220
+And that was exactly what was missing before,
+
+00:45:37.440 --> 00:45:38.860
+a kind of isolation between those.
+
+00:45:39.490 --> 00:45:41.660
+You only actually can, from your source,
+
+00:45:41.730 --> 00:45:45.000
+you only can refer to the source code
+
+00:45:45.090 --> 00:45:46.980
+of those distributions that you depend on,
+
+00:45:47.490 --> 00:45:48.960
+and nothing else from the monorepo.
+
+00:45:49.150 --> 00:45:51.220
+So this means that it's like,
+
+00:45:51.720 --> 00:45:54.560
+you can slice and dice your repository as you want.
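The "realign" behavior being described, where syncing computes removals as well as installs so the environment exactly matches the member's declared dependencies, can be sketched in a few lines. This is a toy model for illustration, not uv's actual implementation, and the package names are hypothetical.

```python
# Toy model of an exact sync: given what is currently installed and what the
# current workspace member declares (directly plus transitively), compute
# both the packages to add and the leftovers to remove.
def plan_sync(installed: set[str], wanted: set[str]) -> tuple[set[str], set[str]]:
    """Return (to_install, to_uninstall) so the env ends up equal to `wanted`."""
    return wanted - installed, installed - wanted

# E.g. moving from working on airflow-core (which had pulled in weird-lib)
# to working on airflow-ctl, which uses neither:
to_install, to_uninstall = plan_sync(
    installed={"airflow-core", "weird-lib"},
    wanted={"airflow-ctl"},
)
```

The point of the subtraction in both directions is exactly the surprise mentioned above: anything not reachable from the member's declared dependencies gets removed, so stale packages from a previous working directory never linger.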
+
+00:45:55.120 --> 00:45:58.020
+So depending on in which directory you are
+
+00:45:58.020 --> 00:46:00.240
+when you run uv sync, you will have, like, a subset,
+
+00:46:00.720 --> 00:46:03.800
+like the actual useful and the used subset
+
+00:46:03.930 --> 00:46:04.660
+of your repository.
+
+00:46:04.940 --> 00:46:07.560
+And it can be completely different if you go to another directory.
+
+00:46:07.920 --> 00:46:09.000
+Some of that can be overlapping.
+
+00:46:09.280 --> 00:46:11.040
+Some of that can be completely different.
+
+00:46:11.520 --> 00:46:13.740
+It depends, like, which dependencies are defined.
+
+00:46:13.850 --> 00:46:16.300
+And this is like, this all magically happens,
+
+00:46:16.960 --> 00:46:20.340
+like, by just defining the dependency in pyproject.toml, nothing else.
+
+00:46:20.480 --> 00:46:22.780
+I mean, this thing will handle it for you in the workspace.
+
+00:46:23.460 --> 00:46:27.420
+This is like exactly the reason why it's so useful for development.
+
+00:46:29.280 --> 00:46:32.520
+- One thing, one more thing to add to that is, yeah, exactly.
+
+00:46:32.780 --> 00:46:37.580
+As I said, it helped us in our vision to actually, you know,
+
+00:46:37.740 --> 00:46:42.800
+decompose the project into multiple parts and avoid the classic
+
+00:46:43.300 --> 00:46:44.700
+problem of coupling,
+
+00:46:45.910 --> 00:46:49.320
+which every monorepo faces at some point in its life cycle,
+
+00:46:49.740 --> 00:46:52.740
+because everything is out there, why don't we just, you know,
+
+00:46:53.260 --> 00:46:57.500
+have code leaks all over the place? So this helps us prevent that.
+
+00:46:57.720 --> 00:47:01.580
+And I cannot imagine how we did it earlier, before uv.
+
+00:47:02.420 --> 00:47:04.500
+I don't know if we did it, but if we did it,
+
+00:47:04.500 --> 00:47:06.500
+it would have been a really tough thing to do.
+
+00:47:07.240 --> 00:47:07.420
+Yeah.
+
+00:47:08.000 --> 00:47:09.820
+There's a bunch of tools that you can,
+
+00:47:10.700 --> 00:47:12.440
+like linters and code analysis,
+
+00:47:12.580 --> 00:47:15.060
+things you can run on your code that break down,
+
+00:47:15.540 --> 00:47:17.400
+for these different modules and these layers,
+
+00:47:17.740 --> 00:47:20.380
+here's like a directed graph of how this thing goes,
+
+00:47:21.020 --> 00:47:22.440
+and you can set up rules to say,
+
+00:47:22.540 --> 00:47:24.280
+this should never cross that boundary,
+
+00:47:24.820 --> 00:47:29.680
+but these are just very vague things.
+
+00:47:29.940 --> 00:47:35.580
+And this setup actually makes it so it's not accessible to your code if you didn't say it
+
+00:47:35.680 --> 00:47:35.860
+should be.
+
+00:47:36.360 --> 00:47:36.480
+Yes.
+
+00:47:36.940 --> 00:47:37.040
+Yes.
+
+00:47:37.110 --> 00:47:42.200
+So it's just built into exactly the definition of your distribution, which you anyhow have
+
+00:47:42.340 --> 00:47:45.420
+to do, because you have to define what the dependencies are.
+
+00:47:45.980 --> 00:47:47.540
+And yes, we did something like that before.
+
+00:47:47.670 --> 00:47:49.960
+So we got a number of ruff rules or whatever.
+
+00:47:50.520 --> 00:47:51.840
+Don't import here.
+
+00:47:53.000 --> 00:47:56.960
+We still have them for shared libraries, which we can talk about now, because I think this
+
+00:47:56.900 --> 00:48:02.980
+is an important modification of the concept. So we do have some automated checks for
+
+00:48:03.260 --> 00:48:10.340
+quality and for imports with prek, our pre-commit hook implementation. But before that, it was
+
+00:48:10.460 --> 00:48:16.900
+just completely, like, handwritten and unmaintainable. People were not actually
+
+00:48:17.240 --> 00:48:23.359
+updating it with all the distributions. You couldn't really, you know, follow when things change. With
+
+00:48:23.380 --> 00:48:28.960
+pyproject.toml for each distribution being the single source of truth, you don't have to
+
+00:48:29.020 --> 00:48:33.880
+do anything, because the dependency is declared there. And this is like the best part of
+
+00:48:34.520 --> 00:48:40.000
+uv understanding that, and doing everything that is, like, reasonable in this case. Yeah.
+
+00:48:42.160 --> 00:48:51.680
+Yeah, yeah. So the other major tool involved here was prek, which is a pre-commit framework for
+
+00:48:51.700 --> 00:48:58.120
+running hooks, many languages, but especially Python, relevant here. Written in Rust, so it pairs well
+
+00:48:58.250 --> 00:49:04.680
+with uv, I suppose. Oh yeah, it was inspired by uv as well, and Joe mentions that
+
+00:49:06.180 --> 00:49:14.960
+he was actually contributing to uv. Yeah, cool. All right, so where's prek, how's prek show up here?
+
+00:49:15.450 --> 00:49:18.800
+Amogh, I feel like this is leading towards what you were hinting at earlier.
+
+00:49:19.740 --> 00:49:29.960
+Yeah. Okay, yeah, yep. So I think it started with, uh, it was called prefligit earlier, I believe.
+
+00:49:30.420 --> 00:49:35.360
+That's the name, prefligit, that was the original name. Oh yeah, it was like terrible.
+
+00:49:37.440 --> 00:49:43.200
+It's a new name, prek. So yep, this
allows us to do things which, uh,
+
+00:49:44.520 --> 00:49:52.740
+pre-commit did not do, or, you know, did not accept as suggestions. So one certain thing that prek
+
+00:49:52.840 --> 00:50:00.880
+offers is, obviously, it's written in Rust, so speed is the obvious advantage that we get. But apart
+
+00:50:00.940 --> 00:50:05.980
+from that, we also get this notion of it pairing well with uv in terms of modularized hooks.
+
+00:50:06.800 --> 00:50:13.200
+Earlier, we had all the hooks in one place, in the top-level pre-commit YAML, right? And
+
+00:50:14.060 --> 00:50:22.840
+it was a big fight, you can imagine. So yeah, so prek allowed us to
+
+00:50:23.670 --> 00:50:30.240
+break it up. Again, you know, it consumed the concept of workspaces here, I would say. So it allowed you to
+
+00:50:30.680 --> 00:50:38.480
+define pre-commit hooks, or prek hooks, within a module itself. And this paired well with uv
+
+00:50:38.500 --> 00:50:41.660
+in the sense that when you have to run hooks
+
+00:50:41.980 --> 00:50:43.820
+that are bound to a certain distribution,
+
+00:50:44.400 --> 00:50:48.640
+all you have to do is change into the, you know,
+
+00:50:48.670 --> 00:50:50.400
+the submodule and just do a prek run.
+
+00:50:50.450 --> 00:50:54.060
+It will run the relevant hooks for that particular module.
+
+00:50:54.700 --> 00:50:58.540
+And the other thing that I really love about prek
+
+00:50:58.620 --> 00:51:03.680
+is auto-completion, which is not something pre-commit had.
+
+00:51:03.750 --> 00:51:06.800
+So you can imagine that something fails in the CI, you
+
+00:51:06.980 --> 00:51:13.860
+have to copy that and copy the ID and try to kind of backtrack it in your repo as to which one is
+
+00:51:13.900 --> 00:51:20.200
+failing. So it used to be a nightmare, but now with the tab completion, it's amazing.
+
+00:51:20.700 --> 00:51:24.680
+Nice. Are you talking about like shell autocomplete integration?
+
+00:51:24.680 --> 00:51:24.800
+Yes.
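To make the modular-hooks idea concrete, here is a hypothetical per-module config in the familiar pre-commit YAML shape. The ids, paths, and script name are illustrative, not Airflow's actual setup, and the exact per-directory discovery is a detail of the tool's workspace support.

```yaml
# Hypothetical airflow-core/.pre-commit-config.yaml: hooks that only make
# sense for this distribution live next to its code, so running the hook
# tool from this directory runs just these.
repos:
  - repo: local
    hooks:
      - id: core-import-check
        name: check airflow-core cross-distribution imports
        entry: uv run scripts/check_imports.py   # script can carry its deps inline
        language: system
        files: ^src/.*\.py$
```

This is the pairing discussed above: the hook's `entry` delegates to `uv run`, so any dependencies the check script needs can be declared in the script itself rather than in the YAML.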
+
+00:51:25.540 --> 00:51:28.260
+Yeah. Yeah. So, okay. Nice. I've seen...
+
+00:51:28.260 --> 00:51:31.160
+So I have a story about that, a very short one. So, like,
+
+00:51:31.560 --> 00:51:36.900
+we actually tried to get autocompletion for hook names with pre-commit, which was the predecessor
+
+00:51:36.900 --> 00:51:43.680
+of prek. Like, prek was largely based on pre-commit, but somehow the author of it didn't accept
+
+00:51:44.020 --> 00:51:50.420
+even the idea of us contributing it, or actually had some very, very excessive expectations for that.
+
+00:51:51.120 --> 00:51:55.200
+And we, you know, discussed, and like, there were other people also trying to
+
+00:51:55.640 --> 00:52:01.460
+convince the author to do that, but they refused, he refused basically, and refused to accept
+
+00:52:01.540 --> 00:52:07.200
+contributions even. So when we spoke to Joe, that was like a completely different story. Like,
+
+00:52:07.480 --> 00:52:13.060
+we need that, and next day it was there. It's like a completely different approach. So this
+
+00:52:13.140 --> 00:52:17.800
+is, uh, and then we said, like, we need workspaces, and a few weeks later, because it took a little
+
+00:52:17.940 --> 00:52:22.220
+bit of time, it was there. And we worked together and we tested that, and, like, I raised I don't know
+
+00:52:22.220 --> 00:52:28.740
+how many issues in the initial kind of pre-release version when we wanted to use it. So the,
+
+00:52:28.760 --> 00:52:33.800
+I think, the collaboration and being, you know, working together, listening to your users and
+
+00:52:33.920 --> 00:52:39.040
+responding to them, and actually working as open source maintainers together, this
+
+00:52:39.220 --> 00:52:43.700
+actually worked perfectly well here, both in uv and in prek. And this is why we love
+
+00:52:43.920 --> 00:52:49.240
+prek, actually, because we know we can rely, if something is not working, that it's gonna be
+
+00:52:49.320 
--> 00:52:56.560
+like we can discuss and either submit a fix, or, you know, Joe will do this, or even, like, lots of
+
+00:52:56.360 --> 00:53:01.440
+other people can do it. Because I think there were a few features that we wanted,
+
+00:53:01.620 --> 00:53:06.820
+and somebody else implemented it, and that wasn't Joe. They contributed to prek because of this
+
+00:53:06.880 --> 00:53:13.780
+openness and, you know, being able to accept the needs of the users. So that was, yeah, that was a very,
+
+00:53:13.900 --> 00:53:18.980
+very important part of why we moved to prek. Yeah, I think Airflow was also one of the initial
+
+00:53:19.300 --> 00:53:25.620
+case studies for prek. It's a project of that scale, and if you kind of satisfy that project's
+
+00:53:25.720 --> 00:53:33.520
+needs, you're pretty good with most use cases. I think that's exactly it, both prek and uv.
+
+00:53:34.500 --> 00:53:39.420
+Yeah, right there at the top of the prek repo, it says, although prek is pretty new, it's already
+
+00:53:39.480 --> 00:53:45.460
+powering real projects, you know, little things like CPython, Apache Airflow, and FastAPI. So yeah, you
+
+00:53:45.460 --> 00:53:51.540
+know, I know Hugo van Kemenade, the release manager of Python, so we met at FOSDEM as
+
+00:53:51.480 --> 00:53:56.000
+well. And like, he was actually listening to our prek discussion, and he converted, you know,
+
+00:53:56.540 --> 00:54:00.980
+CPython to use prek because of the needs they had. So like, it was all about, you know,
+
+00:54:01.100 --> 00:54:04.500
+people talking to each other, word of mouth and things like that.
+
+00:54:05.240 --> 00:54:12.780
+Yeah. You know, there's a feature listed here that just makes me jealous. One of the features
+
+00:54:12.960 --> 00:54:17.540
+of prek is a single binary with no dependencies that doesn't require Python or any other runtime
+
+00:54:17.560 --> 00:54:24.680
+to be installed.
Like, how incredible would it be with Python if we had a, you know, a python, dash,
+
+00:54:24.870 --> 00:54:30.300
+dash, build app or something, you know what I mean? You point it at your thing and you get something
+
+00:54:30.600 --> 00:54:35.940
+you could distribute. I know uv solves a lot, but you still got to have uv installed. And then, you
+
+00:54:35.970 --> 00:54:41.520
+know, like this, that is a huge advantage of things like Rust and Go and some other languages.
+
+00:54:42.500 --> 00:54:45.000
+It's both good and bad in some cases.
+
+00:54:45.140 --> 00:54:46.900
+So it's like there are always trade-offs.
+
+00:54:47.140 --> 00:54:49.160
+It's a different choice made by Python here.
+
+00:54:49.600 --> 00:54:52.860
+I don't think it's like the best choice for Python.
+
+00:54:53.100 --> 00:54:54.440
+I think, Python being a scripting language,
+
+00:54:55.360 --> 00:54:57.480
+it's okay to have, you know, like, dependencies,
+
+00:54:57.940 --> 00:54:59.920
+and especially, like, inline script metadata
+
+00:55:00.600 --> 00:55:02.960
+almost did it, because you just, you know,
+
+00:55:03.420 --> 00:55:04.900
+can install stuff, and uv also,
+
+00:55:05.280 --> 00:55:07.660
+and the kind of tooling is also doing all the stuff,
+
+00:55:07.820 --> 00:55:09.640
+like uv install or uv tool install,
+
+00:55:11.080 --> 00:55:13.780
+whatever, and it will not only install the project
+
+00:55:14.320 --> 00:55:15.780
+and its dependencies, but also install the Python
+
+00:55:16.000 --> 00:55:17.320
+that is needed to run it.
+
+00:55:17.440 --> 00:55:20.100
+So, like, all this is really a matter of two weeks,
+
+00:55:20.330 --> 00:55:22.340
+and it has improved dramatically over the last few years.
+
+00:55:22.340 --> 00:55:23.220
+- It definitely has.
+
+00:55:24.060 --> 00:55:27.000
+Yeah, I was pining for an option,
+
+00:55:27.240 --> 00:55:29.900
+not an only-binary thing.
+
+00:55:30.600 --> 00:55:32.120
+All right, so one thing I actually wanna talk about,
+
+00:55:32.240 --> 00:55:34.720
+going back to this workspaces thing real quick is,
+
+00:55:36.660 --> 00:55:43.800
+what does it look like from an IDE or editor experience to work on this, all right? Like,
+
+00:55:44.520 --> 00:55:51.200
+you've got PyCharm projects, you've got maybe VS Code workspaces where you can pull in different
+
+00:55:51.420 --> 00:56:00.800
+pieces. How do you all manage that? Amogh? Yeah, so I cannot talk for VS Code, I'm a PyCharm user
+
+00:56:00.820 --> 00:56:09.940
+here, but we had to do a little bit of hacking, I would say, or more like a helper script for the
+
+00:56:10.180 --> 00:56:18.140
+IDEs, right? Because, so, we have an IDE helper script right in the repo, and we recommend the users to
+
+00:56:18.140 --> 00:56:26.320
+run it so that the IDE knows what is where in terms of maintaining things, right? Because in normal
+
+00:56:26.340 --> 00:56:33.420
+projects, there's usually just one source, one tests at the top level, but we have 120 plus, and
+
+00:56:34.520 --> 00:56:39.140
+the helper script does a pretty simple thing. It just auto-discovers all the packages
+
+00:56:39.210 --> 00:56:48.600
+in the monorepo and adds... There's a, so IntelliJ and PyCharm both have a .idea within each,
+
+00:56:49.790 --> 00:56:54.340
+a hidden folder within each of the projects that it opens, and it has a,
+
+00:56:55.060 --> 00:57:01.160
+and it supports an XML-like format, IML, where you can define certain things. So this essentially
+
+00:57:01.500 --> 00:57:07.820
+does a very simple thing. For each package, it adds the module/source as the source
+
+00:57:07.980 --> 00:57:14.940
+root and the module/tests as the tests. So it's as if you went through all 120 things
+
+00:57:14.970 --> 00:57:18.360
+and right-clicked and said mark as sources root or something like that. Yeah.
You wouldn't want

+
+00:57:18.180 --> 00:57:21.960
+to do that. We did that actually. We did that for some time.
+
+00:57:22.520 --> 00:57:22.960
+- Initially.
+
+00:57:25.780 --> 00:57:30.300
+Yeah, we had this PyCharm script and then we have the same approach for VS Code. So we have another
+
+00:57:30.500 --> 00:57:35.240
+script for VS Code as well, which was contributed by someone who uses VS Code because neither me nor
+
+00:57:35.720 --> 00:57:44.060
+Amogh are VS Code users. But the community is also there, and somebody said, "Okay, I'll do it." And there it was.
+
+00:57:44.100 --> 00:57:48.840
+and they tested it and you know like that's that was super cool actually so so yeah it works well
+
+00:57:49.240 --> 00:57:54.720
+also the you know a little bit of words uh i would probably we don't talk we won't talk too much
+
+00:57:55.360 --> 00:58:00.560
+about like the we don't have too much time but the shared libraries concept a little bit might maybe
+
+00:58:01.100 --> 00:58:06.500
+it's the right time to introduce the yeah yeah because we like one thing that Amogh mentioned
+
+00:58:06.740 --> 00:58:12.160
+is like the uh we had we solved this coupling problem but also we wanted to solve the DRY
+
+00:58:12.180 --> 00:58:17.540
+problem. And those two are always kind of mixture, like you get DRY and then you get more DRY and
+
+00:58:17.660 --> 00:58:22.540
+less coupling and like, like more DRY and more coupling and like, all these things are complex
+
+00:58:22.720 --> 00:58:27.960
+when you have lots of code. DRY being the architectural philosophy of do not repeat
+
+00:58:28.140 --> 00:58:32.800
+yourself. But if you're not repeating yourself, everything, if it exists in one place,
+
+00:58:33.000 --> 00:58:37.100
+everything's got to depend on that one place. And it starts to become more linked together, right? 

+
+00:58:37.540 --> 00:58:44.060
+precisely so so it's a little bit of like uh uh you know you can have it both ways like the the
+
+00:58:44.300 --> 00:58:49.000
+we want to have DRY code and not to repeat it for like common utilities like logging
+
+00:58:49.600 --> 00:58:53.480
+configuration whatever all the things that that are kind of common between all the different
+
+00:58:53.640 --> 00:58:59.240
+distributions but also we didn't want to depend on a single version of those because if we do
+
+00:58:59.820 --> 00:59:04.000
+then it means that we have to make sure that the backwards compatibility is maintained because like
+
+00:59:04.020 --> 00:59:08.120
+when we install different version of different distributions coming from different time
+
+00:59:08.940 --> 00:59:14.540
+of repository they might use different version of those shared libraries and like how to make sure
+
+00:59:14.620 --> 00:59:18.580
+that they do not have breaking changes and stuff like so this is all the whole level
+
+00:59:18.590 --> 00:59:23.740
+of complexity between like how to manage the dependencies there and manage versions especially
+
+00:59:24.020 --> 00:59:29.600
+manage the backwards compatibility so we figured out that with some very simple approach we tried
+
+00:59:29.620 --> 00:59:35.160
+a few different approaches but like one of the approaches was using the vendoring library from
+
+00:59:35.320 --> 00:59:42.640
+pip and from Python uh not from people from pip and the second one and that's the one we came up with and
+
+00:59:42.690 --> 00:59:47.200
+finally implemented with like using symlinks to share the code between different distributions
+
+00:59:47.490 --> 00:59:52.880
+and that's a very innovative approach that i hope uh will make it into some kind of standard
+
+00:59:53.160 --> 00:59:58.000
+eventually so like we came up with this approach where we actually have our cake and eat it too like
+
+00:59:58.020 --> 01:00:02.740
+which is like pretty amazing if you fought 
with like for years with this kind of

+
+01:00:03.360 --> 01:00:08.720
+common dependency issues and and backwards compatibility so in our case like the symlink
+
+01:00:08.880 --> 01:00:14.880
+approach we have and it needs some pre-commit kind of pre-processing of pyproject.toml so some parts of
+
+01:00:14.940 --> 01:00:19.720
+the pyproject.toml are generated to make it actually work but this is all automated with
+
+01:00:19.920 --> 01:00:24.240
+pre-commit which is like you don't have to think about that even and once we do that and once we create
+
+01:00:24.260 --> 01:00:30.260
+some symlinks between different parts of code like one library one distribution is
+
+01:00:30.360 --> 01:00:35.340
+symlinking code from the shared distribution the end result is that this code gets automatically
+
+01:00:35.780 --> 01:00:41.300
+vendored in during the building of the package which means that we actually have the same library
+
+01:00:41.520 --> 01:00:46.920
+in a different package in a different version in different distributions so a distribution released
+
+01:00:47.520 --> 01:00:53.600
+a week ago will have a shared configuration from a week ago but this other distribution will have
+
+01:00:53.620 --> 01:00:59.280
+the same shared configuration code from today if it's released today and we can install them
+
+01:00:59.520 --> 01:01:06.200
+together and all of them have effectively uh like if they had a different version of the
+
+01:01:06.360 --> 01:01:13.400
+same library installed i see it's as if the airflow-ctl said it had a dependency on
+
+01:01:13.600 --> 01:01:18.840
+core and it pinned that version to something but a different part of the repo pinned it to a
+
+01:01:18.940 --> 01:01:22.980
+different version right they can both kind of coexist that's but it's actually all within the
+
+01:01:22.900 --> 01:01:28.800
+same code file. That's insane. Okay. And this is like largely like, it's nothing new. 
It's largely

+
+01:01:30.380 --> 01:01:35.940
+inspired by how the, you know, libraries work in C and like traditional kind of building code. Like
+
+01:01:35.940 --> 01:01:41.880
+you have dynamic libraries and static libraries. So this is like essentially equivalent of static
+
+01:01:42.000 --> 01:01:47.280
+libraries where you take the code of the version that you compile the stuff in and put it
+
+01:01:47.220 --> 01:01:52.960
+inside the final binary and then it results like in Rust the kind of single binary thing so
+
+01:01:53.200 --> 01:01:59.120
+it's a little bit like so this we have a little bit of this single binary by doing that in like
+
+01:01:59.240 --> 01:02:03.600
+in the sense that we automatically vendor in all the you know shared dependencies that we have
+
+01:02:04.220 --> 01:02:10.820
+in the same distribution so so it's it's kind of hybrid but it's always like it's also like so Rust
+
+01:02:10.880 --> 01:02:14.900
+is a little bit too far because everything is single binary in our case we have a bit of both
+
+01:02:14.900 --> 01:02:20.360
+like we can use libraries dynamically but we can also embed libraries as shared inside the
+
+01:02:21.140 --> 01:02:33.180
+single distribution. That's very cool. Amogh? Sounds like you were instrumental in this part.
+
+01:02:34.450 --> 01:02:44.860
+Yeah, so that's the nice thing about the approach that was chosen, right? 
We all came together as a

+
+01:02:44.980 --> 01:02:50.360
+we had one email uh dev list discussion one fine day that hey we want to achieve something like this
+
+01:02:51.080 --> 01:02:57.420
+which more or less was a uh was something everyone agreed upon so people started chiming in and we
+
+01:02:57.480 --> 01:03:01.920
+started trying different things out the first one obviously using the vendoring tool from
+
+01:03:02.650 --> 01:03:08.980
+uh somebody did a POC on that but it felt like it's going to be difficult to achieve that over
+
+01:03:08.880 --> 01:03:13.920
+long term and and also it would it could be brittle so Jarek came up with this particular
+
+01:03:14.940 --> 01:03:20.640
+option with symlinks which again was discussed within the community few of us a few of us picked
+
+01:03:20.640 --> 01:03:26.980
+this PR up tested it locally played around and gave the feedback so i don't think
+
+01:03:27.050 --> 01:03:34.060
+this would be possible with AI in in the sense that this has never been done before yeah so some
+
+01:03:34.120 --> 01:03:40.140
+something like this where a community comes together and solves a rather difficult problem
+
+01:03:40.440 --> 01:03:47.680
+is something that makes me really happy and also something that all of us are working towards a
+
+01:03:47.720 --> 01:03:55.860
+common goal while while also bound by our corporate hats right is something that is again really nice
+
+01:03:55.920 --> 01:04:04.660
+to see we have about uh 11 i think at this point we have about 11 to 12 shared libraries where
+
+01:04:05.420 --> 01:04:12.700
+the main notion here is to reimagine Airflow as an independent server and more like a control plane
+
+01:04:12.700 --> 01:04:19.240
+and execution plane what we did with Airflow 3 and these shared libraries is helping us achieve
+
+01:04:20.560 --> 01:04:27.020
+that model. And we have about 11 to 12 of them. And I think a few more coming very soon. 
But yeah,

+
+01:04:28.860 --> 01:04:32.720
+it's been nice working on the shared libraries. It's been great.
+
+01:04:33.300 --> 01:04:38.200
+That's cool. Is this something that people can take and adopt into their monorepo if they want
+
+01:04:38.200 --> 01:04:44.540
+to live that life? Absolutely. It's really like one or two kind of
+
+01:04:44.400 --> 01:04:46.840
+of pre-commit hooks, which are maintaining the consistency.
+
+01:04:47.590 --> 01:04:51.080
+And like, so that you don't forget to add this symlink here
+
+01:04:51.240 --> 01:04:54.480
+and that kind of pyproject.toml definition here
+
+01:04:54.700 --> 01:04:57.560
+and the, or that hatch definition for the hatch build
+
+01:04:57.820 --> 01:05:01.320
+to actually embed your symlinked code
+
+01:05:01.490 --> 01:05:02.660
+into the final distribution.
+
+01:05:03.000 --> 01:05:04.380
+So like there are like a few pieces
+
+01:05:04.580 --> 01:05:07.420
+that have to be put together from existing libraries.
+
+01:05:08.320 --> 01:05:09.180
+So that's basically it. 
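A toy demonstration of the symlink-vendoring mechanism being described (the real setup wires it into pre-commit hooks and the hatch build configuration; every path and name below is made up): a distribution symlinks the shared code instead of depending on a published version, and a build step that resolves symlinks while copying freezes that day's version into the artifact.

```python
import os
import shutil
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())

# One shared library, "owned" in a single place in the monorepo.
shared = root / "shared" / "logging_config.py"
shared.parent.mkdir(parents=True)
shared.write_text("LEVEL = 'INFO'\n")

# A distribution symlinks to it rather than pinning a released version.
dist_src = root / "dist_a" / "src"
dist_src.mkdir(parents=True)
os.symlink(shared, dist_src / "logging_config.py")

# "Building" dist_a with symlinks resolved vendors today's shared code in
# (copytree follows symlinks by default, copying the file contents).
build_a = root / "build_a"
shutil.copytree(dist_src, build_a, symlinks=False)

# The shared library can keep evolving without touching the built artifact.
shared.write_text("LEVEL = 'DEBUG'\n")
print((build_a / "logging_config.py").read_text())  # still LEVEL = 'INFO'
```

A distribution built today would capture the `DEBUG` version, so two distributions released at different times can carry different frozen copies of the same shared library and still be installed side by side, which is the cake-and-eat-it-too effect described above.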

+
+01:05:09.210 --> 01:05:11.340
+And once you do it, it's just, those are,
+
+01:05:11.730 --> 01:05:13.840
+the funny thing is like those shared libraries
+
+01:05:13.860 --> 01:05:19.760
+are just standalone distributions you can actually build them separately as a library as well we could
+
+01:05:19.920 --> 01:05:24.560
+we could potentially even you know like just use them as library as well no problem whatsoever
+
+01:05:25.000 --> 01:05:29.880
+because they are just standard plain distributions or any other we just happen to take the source code
+
+01:05:29.980 --> 01:05:36.180
+of it and then embed it into the target distribution that wants to use it rather than you
+
+01:05:36.200 --> 01:05:41.120
+know link to it by dependency so that's basically it other than that it's a kind of
+
+01:05:41.140 --> 01:05:46.320
+completely standard library uh or standard distribution and one more thing that is really
+
+01:05:46.400 --> 01:05:52.440
+important to add here is like this also has a side effect but i think a very nice one and Amogh can
+
+01:05:52.560 --> 01:05:57.380
+confirm that because he has been doing a lot of that is like we actually came up with like way
+
+01:05:57.560 --> 01:06:03.020
+better internal architecture because of that because a lot of those that the shared libraries
+
+01:06:03.180 --> 01:06:08.000
+they depended on each other sometimes in a circular fashion sometimes it really depended
+
+01:06:08.020 --> 01:06:10.060
+on, like, which import you did first, like what happened,
+
+01:06:10.260 --> 01:06:11.480
+like what was initialized?
+
+01:06:11.960 --> 01:06:13.920
+And it was like complete spaghetti of dependencies
+
+01:06:14.360 --> 01:06:17.500
+between generally independent pieces of functionality. 
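A contrived sketch of the import-order trap just described, next to the explicit style the team moved to (all names here are invented): with import-time capture, behavior depends on what shared state happened to exist when a module loaded; with injection, a single entry point decides, once and visibly.

```python
# --- implicit: configuration captured as a side effect of module loading ----
_DEFAULTS = {"log_level": "INFO"}


class ImplicitLogger:
    # Frozen at class-definition (import) time; mutating _DEFAULTS later, or
    # importing modules in a different order, silently changes behavior.
    level = _DEFAULTS["log_level"]


# --- explicit: the entry point injects configuration at startup -------------
class ExplicitLogger:
    def __init__(self, level: str) -> None:
        self.level = level


def main(config: dict) -> ExplicitLogger:
    # Everything the component needs is passed in; nothing is implied.
    return ExplicitLogger(level=config["log_level"])


_DEFAULTS["log_level"] = "DEBUG"           # too late for ImplicitLogger
logger = main({"log_level": "DEBUG"})
print(ImplicitLogger.level, logger.level)  # INFO DEBUG
```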
+ +01:06:18.240 --> 01:06:20.140 +Right now by having shared libraries, + +01:06:20.860 --> 01:06:24.380 +we are actually forcing ourselves to make them isolated + +01:06:24.800 --> 01:06:28.380 +and we are changing the way how we initialize them. + +01:06:28.420 --> 01:06:30.540 +For example, we are injecting all the configuration + +01:06:30.940 --> 01:06:32.900 +rather than using them from inside the library + +01:06:33.380 --> 01:06:35.360 +because like configuration library is another library, + +01:06:35.400 --> 01:06:37.320 +so you don't want to depend on the other library. + +01:06:37.560 --> 01:06:39.660 +So it's really nice. + +01:06:39.820 --> 01:06:46.040 +I think it comes, the result is that really the architecture of Airflow internally is + +01:06:46.160 --> 01:06:47.460 +so much better because of that. + +01:06:47.680 --> 01:06:53.660 +So less surprises and explicit initialization is like something that we'll have to do rather + +01:06:53.680 --> 01:07:00.200 +than implicit initialization during imports, which has always been plaguing as a big issue. + +01:07:02.320 --> 01:07:21.180 +Certainly. It also allows you to imagine each component having an entry point, per se, where you have an initial starting point and it initializes everything it needs by injecting and calling certain factories, which makes a very clean for anyone visiting the project also. + +01:07:21.560 --> 01:07:23.660 +they look at something and they know the entry point very clearly. + +01:07:24.180 --> 01:07:26.640 +But here, this is how it starts, this is what it initializes. + +01:07:27.660 --> 01:07:34.200 +Reminds me of Golang or Java projects where they have a nice main where in Python, + +01:07:35.200 --> 01:07:38.720 +Python it's not really the same way, but yeah. + +01:07:39.480 --> 01:07:40.500 +Yeah, very cool. + +01:07:42.180 --> 01:07:45.280 +All right, well, I think that's about it for all the time we have. 

+
+01:07:46.580 --> 01:07:50.600
+I guess let's close it out with one final thought here.
+
+01:07:50.610 --> 01:07:54.080
+It's just people who are maybe inspired by your design,
+
+01:07:54.660 --> 01:07:56.860
+by the way you put together Airflow
+
+01:07:56.930 --> 01:08:00.460
+and this monorepo concept, especially Python people,
+
+01:08:00.780 --> 01:08:01.900
+what do you say to them?
+
+01:08:02.690 --> 01:08:03.280
+Final thoughts here.
+
+01:08:04.660 --> 01:08:06.780
+- Final thoughts, just do it.
+
+01:08:06.790 --> 01:08:10.840
+I mean, like I initially, there was always discussion.
+
+01:08:10.950 --> 01:08:12.720
+Like we had lots of discussions internally,
+
+01:08:12.890 --> 01:08:14.760
+even some of the PMC members in Airflow,
+
+01:08:14.920 --> 01:08:19.620
+they said let's split the repository into smaller ones like let's make more of them because it's going
+
+01:08:19.620 --> 01:08:26.100
+to make things easier i was always the monorepo fan and i did a lot of work to make it possible
+
+01:08:26.400 --> 01:08:30.900
+but that was a very very difficult thing that's changed so like the reasons why you would
+
+01:08:31.020 --> 01:08:36.700
+like to have multiple repos are gone now if you're using the right tooling and only the benefits or
+
+01:08:36.940 --> 01:08:41.060
+mostly the benefits from having it in one place where you can test everything together and work
+
+01:08:41.100 --> 01:08:48.060
+on it together remain all the rest is basically gone so for me uh like uh the discussion
+
+01:08:48.080 --> 01:09:00.580
+on monorepo versus multi-repo is already solved nice yeah just do it uh we it's it's not even uh
+
+01:09:01.060 --> 01:09:07.779
+so personally i've been using uh using the README that we have present in the
+
+01:09:08.680 --> 01:09:15.359
+shared libraries as a context for my uh IDE so it's it's turning out to be very nice uh for the
+
+01:09:15.880 --> 01:09:21.220
+shared library split for example all all i have to do is 
just provide it the context and tell + +01:09:21.240 --> 01:09:26.920 +it hey just just construct the structure for me and i can do everything else so it's that easy + +01:09:27.839 --> 01:09:33.700 +we have all the things in place we are in the right era to do it so just just do it okay + +01:09:34.859 --> 01:09:44.480 +very inspiring thank you for being here awesome for this look inside and it's apache airflow it's + +01:09:44.480 --> 01:09:50.680 +on github people can go look and see it's not just a talking vaguely about some internal project right + +01:09:50.839 --> 01:09:56.740 +so people can go check it out yeah all right see you later thanks thank you thanks bye + From 2a9605fa2732d7112ece3ad78907aeefd1cd0ce9 Mon Sep 17 00:00:00 2001 From: Michael Kennedy Date: Fri, 27 Feb 2026 14:26:14 -0800 Subject: [PATCH 03/16] datastar episode transcripts --- transcripts/537-datastar.txt | 2588 +++++++++++++++++++ transcripts/537-datastar.vtt | 4586 ++++++++++++++++++++++++++++++++++ 2 files changed, 7174 insertions(+) create mode 100644 transcripts/537-datastar.txt create mode 100644 transcripts/537-datastar.vtt diff --git a/transcripts/537-datastar.txt b/transcripts/537-datastar.txt new file mode 100644 index 0000000..5f49e54 --- /dev/null +++ b/transcripts/537-datastar.txt @@ -0,0 +1,2588 @@ +00:00:00 You love building web apps with Python, and HTMX got you excited about the hypermedia approach. + +00:00:05 Let the server drive the HTML, skip the JavaScript build step, keep things simple, right? + +00:00:11 But then you hit that last 10%. You need AlpineJS for interactivity, or your state gets out of sync, + +00:00:16 and suddenly you're juggling two unrelated libraries that weren't really designed to work + +00:00:20 together. 
What if there was a single 11-kilobyte framework that gave you everything HTMX and

+
+00:00:26 AlpineJS did, and more with real-time updates, multiplayer collaboration out of the box,
+
+00:00:31 and performance so fast, you're actually bottlenecked by your monitor's refresh rate.
+
+00:00:37 That's Datastar.
+
+00:00:38 On this episode, I sit down with its creator, Delaney Gillilan, core maintainer, Ben Croker,
+
+00:00:43 and Datastar convert, Chris May, to help explore how this backend-driven, server-sent event-first
+
+00:00:50 framework is changing the way full-stack developers think about the modern web.
+
+00:00:55 This is Talk Python To Me, episode 537, recorded January 15th, 2026.
+
+00:01:19 Welcome to Talk Python To Me, the number one Python podcast for developers and data scientists.
+
+00:01:24 This is your host, Michael Kennedy.
+
+00:01:26 I'm a PSF fellow who's been coding for over 25 years.
+
+00:01:30 Let's connect on social media.
+
+00:01:32 You'll find me and Talk Python on Mastodon, BlueSky, and X.
+
+00:01:35 The social links are all in your show notes.
+
+00:01:38 You can find over 10 years of past episodes at talkpython.fm.
+
+00:01:42 And if you want to be part of the show, you can join our recording live streams.
+
+00:01:45 That's right.
+
+00:01:46 We live stream the raw uncut version of each episode on YouTube.
+
+00:01:50 Just visit talkpython.fm/youtube to see the schedule of upcoming events.
+
+00:01:54 Be sure to subscribe there and press the bell so you'll get notified anytime we're recording.
+
+00:01:59 This episode is brought to you by Sentry.
+
+00:02:01 Don't let those errors go unnoticed.
+
+00:02:02 Use Sentry like we do here at Talk Python.
+
+00:02:04 Sign up at talkpython.fm/sentry.
+
+00:02:08 And it's brought to you by CommandBook, a native macOS app that I built that gives long-running
+
+00:02:13 terminal commands a permanent home.
+
+00:02:15 No more juggling six terminal tabs every morning. 
+ +00:02:18 Carefully craft a command once, run it forever with auto restart, URL detection, and a full CLI. + +00:02:23 Download it for free at talkpython.fm/command book app. + +00:02:27 Ben, Delaney, Chris, welcome to you all. + +00:02:30 Thanks for being here on Talk Python To Me. + +00:02:31 Thanks for having us. + +00:02:32 Hey, how are you doing? + +00:02:33 Doing well, doing well. + +00:02:34 Very excited to talk about Datastar and some cool web frameworks for Python people and beyond, of course. + +00:02:42 But, you know, most people listening doing Python web frameworks. + +00:02:45 So talk about how that all integrates. + +00:02:47 And if you like the HTMX vibe, which we've talked a lot about on the show, I think there's + +00:02:52 going to be a lot to like here as well. + +00:02:54 And maybe more. + +00:02:55 We'll see. + +00:02:55 A case to be made. + +00:02:57 But, you know, before we get into all of that, though, let's just talk about a quick introduction + +00:03:02 for everyone here and like go around the squares of Ben, I'll let you go first. + +00:03:07 Who are you, Ben? + +00:03:08 Based in Costa Rica at the moment. + +00:03:10 I'm based in Europe most of the year, but half of the year my wife and I spend here. + +00:03:14 In terms of background, I've been primarily working with PHP for well over 20 years and got involved with Delaney and Datastar, been a core maintainer on that project ever since. + +00:03:26 And I looked at my commit history for last year, and it turns out now I write more Go code than PHP, so I don't want to call myself a PHP developer anymore. + +00:03:37 I'm just a web developer, a backend web developer, primarily that also writes TypeScript and maintains a front end. + +00:03:44 framework. There's a lot of stuff going on and ways in which you can write code for the web these days. + +00:03:50 Well, thanks. Awesome to have you here. Delaney, hello. + +00:03:53 Hi, how you doing? 
Yeah, I have kind of a weird checkered background into web development. I was + +00:03:58 originally in the circus, then I became a 3D artist, then I became an engineer. I've worked in + +00:04:04 games, video games, slot machines, military applications, all kinds of crazy things. + +00:04:11 I tend to work on really highly optimized, fast things. + +00:04:14 I love the ideas of the web, but I got really tired of how you actually implement things in that. + +00:04:20 And I was doing very large applications with millions of updates a second. + +00:04:24 And the tools that were out there just weren't good enough. + +00:04:27 So I ended up going down many, many rabbit holes and finally found something to make it better for everybody else. + +00:04:33 So yeah, that's really cool. + +00:04:34 And wow, what a really interesting issue. + +00:04:36 I know you got some crazy stories. + +00:04:38 Yes, I do. + +00:04:39 I always have a funny, weird outcome of something. + +00:04:42 Ironically, people talk about things being a circus, + +00:04:45 but like circuses are very well run logistic machines + +00:04:48 compared to most developer situations. + +00:04:50 So it's kind of funny. + +00:04:50 Yeah, it's an insult to circuses. + +00:04:52 Yes, it is. + +00:04:52 It really is. + +00:04:54 Amazing. + +00:04:55 Okay. + +00:04:55 And what we're going to talk about, Datastar has this amazing ability to update many things + +00:05:02 really quickly in real time, which we'll get into, but yeah, sort of foreshadowing there. + +00:05:07 And Chris May, welcome to the show. + +00:05:10 I've known you for a long time and I'm really happy to have you here. + +00:05:12 Great to be here. Thank you so much. + +00:05:14 Yeah. + +00:05:14 Yeah. So about me, I started writing websites back in 1995 and then picked up Python about 10 or so years later and just have really enjoyed the ride since then. 
+ +00:05:25 Picked along the way, became technical coach and just loved making single page applications. + +00:05:30 I loved, I just love the web. + +00:05:32 You know, I love that we can publish something from our computer and anybody around the world can see it. + +00:05:36 And then what, maybe a little over a year ago, I, oh no, it was more than that. + +00:05:40 I remember I was on a trip and I was listening to a podcast of HXPod, the HTMX podcast, + +00:05:46 and heard about this crazy, cool tool, Datastar. + +00:05:50 And I was like, I even put in my DjangoCon presentation, like you should, everybody else + +00:05:54 should try it out. + +00:05:55 And finally I did and I'm converted. + +00:05:56 I love it. + +00:05:57 So I'm excited that the three of us get to talk about it. + +00:06:00 The reason that we're having this podcast is because I read your article about switching + +00:06:05 to Datastar. And I'm like, okay, this is interesting. You made the case very well. Of + +00:06:10 course, I'll link to the article. And so I thought, hey, I need to have Chris here as my Tony Romo to + +00:06:18 my Al Michaels or Nico Rosberg to my Crofty or whatever, right? So I'm happy to have you here. + +00:06:24 Exactly. Awesome to have you here. So let's just start with what is Datastar, right? I mean, + +00:06:30 we've hinted that it has some similarities to htmx but also not so ben and delaney give us the + +00:06:37 overview what is datastar so i can give a little bit of history and then ben's probably better at + +00:06:42 saying what it is now i have a background in like low-level stuff um even though i was a 3d artist + +00:06:47 first i'm much more comfortable in like shader development and that kind of thing like so glsl + +00:06:52 web thing like i'm a c guy that knows some other things but the thing is that i was working on + +00:06:57 some military applications where I needed really fast updates of a browser. 
And the reason why you

+
+00:07:03 in this military situation is that getting things approved is really hard, like executables to go
+
+00:07:09 into deployment. But having a browser means that you have this nice little sandbox that things can
+
+00:07:13 go in. So it's actually more of a deployment platform in my background than, you know, just
+
+00:07:17 the regular web. But I was doing things that were pushing the browser really, really far. I was using
+
+00:07:21 Vue and SPAs. And I basically was like, well, these are the smartest people out here, but it's not
+
+00:07:26 fast enough. So I was using crazy WebSocket stuff, all this binary stuff. And then I tried doing,
+
+00:07:31 you had someone on last week talking about LiveView and like they have a Python version of that. I went
+
+00:07:37 hard in making a binary version of that, like going down to the protocol level, changing,
+
+00:07:41 optimizing that 10 different ways. I had an entire framework for doing this. And basically,
+
+00:07:45 in my opinion, that's a complete dead end. It is untenable. We can go into the reasons why,
+
+00:07:49 but the thing is, long story short, I ended up seeing what was happening in HTMX in the hyper
+
+00:07:53 media space. And I completely discounted all of that because I said, like, I'm doing low level
+
+00:07:58 binary stuff. There's no way this other approach can be faster. And then my thing is always check
+
+00:08:02 the metrics, always don't take your assumptions and do the work. And the thing is, there's things
+
+00:08:07 that are wrong in the implementation, but there's things that are 100% right in the overall ideas of
+
+00:08:11 how to use that. 
So I went and I took a year and a half work and just threw it in the trash

+
+00:08:16 and said, okay, I'm starting over and like ended up doing some basic things would probably get into
+
+00:08:22 and ended up with this thing that is a backend-agnostic backend framework that has a 10
+
+00:08:27 kilobyte shim that is the fastest, smallest thing out there by orders of magnitude. So it's not just
+
+00:08:33 a slightly different thing. It is literally a different paradigm shift. It's a crazy shift.
+
+00:08:37 So the difference between React to something like HTMX is different from HTMX to the Data-
+
+00:08:42 star way. So I'll let Ben actually explain what that is. But the thing is from a low level C
+
+00:08:46 guy's point of view, it is one of the fastest things in your stack now, which is crazy to think
+ +00:09:34 And they get to be a full stack developer in whatever language they choose. + +00:09:37 And I'll let everyone else talk from there. + +00:09:40 Awesome. Ben? + +00:09:41 Yeah, my version is going to be quite different to Delaney's + +00:09:43 because we care about different things. + +00:09:45 Fortunately, we do care about some of the same things. + +00:09:49 We work well together because I think we complement each other. + +00:09:52 But coming from a PHP background, I want the backend to be driving the front end. + +00:09:57 And it naturally does, right? + +00:09:58 Because even your HTML is being produced by your backend. + +00:10:02 And that's what's being served to the front end. + +00:10:05 I describe Datastar as a hypermedia framework. + +00:10:09 And some people get tripped up on what hypermedia is, but it's essentially hypertext with other media like images and CSS and that kind of thing. + +00:10:16 And everybody should know what hypertext is because it's the H in HTTP and HTML. + +00:10:22 There is an expectation for people coming into Datastar that you have a basic understanding of the web and web browsers and the web browser API because we lean as heavily as possible on the browser API. + +00:10:34 We get a lot of people coming into the Discord asking us, you know, how should I do this the Datastar way? + +00:10:39 And it got to the point where I'd heard that question so often I decided, OK, I'm going to write a page in the Datastar docs. + +00:10:45 We call it the tau of datastar. + +00:10:47 So it's kind of like the way of datastar. + +00:10:49 And if there's one thing to take from that, it's use as little datastar as possible. + +00:10:54 Like leverage the browser, because the browser is an incredible thing, right? + +00:10:58 Like it's basically an operating system, our operating system as web developers. + +00:11:02 So, and everything happens at the C level, super optimized. 
+ +00:11:06 We're not going to be able to build something faster. + +00:11:08 So leverage the browser as much as possible on the browser APIs. + +00:11:12 And where HTML kind of lacks or where there are some gaps, that's essentially what Datastar is trying to fill. + +00:11:19 So I did a lot of work. + +00:11:21 So just to relate this, I guess, to something that other people might be familiar with, which is HTMLX. + +00:11:27 I was an early contributor to HTMLX, actually, and I was sold on the idea of hypermedia from the very beginning. + +00:11:33 So HTML is the language of the web. + +00:11:36 Why are we trying to replace it with JavaScript? + +00:11:39 And the problem that I ran into after several years of thinking HTMX is all I need is that last 10%, right? + +00:11:46 Because it'll get you 90% of what you're trying to do. + +00:11:50 But that last 10%, which we all know is the hardest piece that takes the most work, just isn't covered. + +00:11:56 So with HTMX, for example, you will very often reach for another library like AlpineJS, + +00:12:02 or you'll start writing vanilla JS perhaps to fill in those gaps to interactivity to the page, + +00:12:09 because HTMX is really just going to the back end, replacing the DOM. + +00:12:13 But now you have two dependencies. + +00:12:15 Now you have HTMX and Alpine, for example, and you're trying to make those play well together. + +00:12:20 And because I think that might be a little bit of the missing sauce from HTMX. + +00:12:24 I've had Carson Gross on and I really admire HTMX. + +00:12:28 But as I've worked with it over a couple of years, I feel like it's really good as salt or seasoning, + +00:12:35 something you sprinkle on to really make a website better. + +00:12:38 But if you try to make a meal out of salt, you're not going to want to eat it. 
+

+00:12:41 And what I mean is, you have three different disjointed parts of the page,

+

+00:12:47 and you're like, this is so amazing to update this with HTML and partials, and so is that.

+

+00:12:51 But then you start talking about AlpineJS and connecting different things,

+

+00:12:56 and then the JavaScript gets out of sync with this server response.

+

+00:12:59 And it just, you start to feel constrained by it.

+

+00:13:03 And I think you all have a really nice solution.

+

+00:13:05 It's something a little bit like how you, we're going to talk about it,

+

+00:13:08 but sort of how you specify the HTML to be updated by the server,

+

+00:13:13 but then also connecting different parts of the pages.

+

+00:13:16 Chris put it in his article that like the problem is AlpineJS and HTMX

+

+00:13:20 are just two unrelated different things that happen to go together a lot.

+

+00:13:24 And so they're not cohesive in a sense, right?

+

+00:13:26 Well, and that's one thing that's definitely an issue.

+

+00:13:28 Like, for example, this was my thing because I actually tried to fix HTMX back in the day.

+

+00:13:33 And like the things that I wanted to fix were the problem that I see at least

+

+00:13:37 is that you have HTMX, you can add, it has extensions, so you can add stuff to it.

+

+00:13:41 But it fundamentally was built to be like, here's our way of doing it.

+

+00:13:45 And then you can do your own stuff on top of it.

+

+00:13:47 The problem is that I thought that's broken.

+

+00:13:51 I've done enough game development to know that you need to be agile.

+

+00:13:53 I need to be able to, like, move quickly.

+

+00:13:55 So I wanted it so that nothing was, basically, like the core of Datastar is like 300 lines

+

+00:14:01 long.

+

+00:14:01 And it is basically setting up data-star elements, hooking up plugins, and then everything

+

+00:14:06 else is a plugin.
+

+00:14:07 So if you don't agree with us, or if someone's better than I am, great, that's wonderful.

+

+00:14:12 We will be able to just pop that part out, put the new part in.

+

+00:14:15 But plugins can now depend on each other.

+

+00:14:17 They can understand.

+

+00:14:17 It's an ecosystem.

+

+00:14:18 Ironically, that's what happens under the hood.

+

+00:14:21 But the ideas of that make it so much more powerful.

+

+00:14:23 And the irony is that if you build it in that kind of plugin style way, in the more game developer style way, we are smaller than HTMX and Alpine alone, let alone combined, let alone Hyperscript and all these other things.

+

+00:14:34 So it's just a different way of thinking about the problem.

+

+00:14:36 When I first encountered Datastar and looked at the source code, it looked very foreign to me because Delaney, coming from game development, built Datastar like a game engine.

+

+00:14:47 So you have this very thin core and then everything else pretty much is a plugin.

+

+00:14:52 And all Datastar core is a way for registering plugins and having Datastar attributes.

+

+00:14:59 And that's pretty much it.

+

+00:15:00 Everything else is a plugin that you can take away.

+

+00:15:03 So we even have a bundler on the site that allows you to just, well, you can just download

+

+00:15:08 Datastar core or you can just select what plugins you want.

+

+00:15:12 Now, that in and of itself is not that interesting because we're, at the end of the day, we're

+

+00:15:16 talking about a 10 kilobyte JavaScript file with all of the plugins.

+

+00:15:19 But it is open source, which we didn't mention.

+

+00:15:21 And so anybody can go just kind of look at it if you're interested.

+

+00:15:25 But that approach means that everything is modular and everything is there for a reason.

+

+00:15:30 And we'll get into this later, I guess.

+

+00:15:32 But like deciding what plugins go in and what stays out is one of the challenges.
+

+00:15:36 And we just try to keep it as lean as possible.

+

+00:15:39 My way of thinking about it is that Datastar gives you everything you need and nothing you don't.

+

+00:15:44 And that's how we try to kind of keep it lean and fast.

+

+00:15:48 This portion of Talk Python To Me is brought to you by Sentry.

+

+00:15:51 I've been using Sentry personally on almost every application and API that I've built for

+

+00:15:56 Talk Python and beyond over the last few years.

+

+00:15:59 They're a core building block for keeping my infrastructure solid.

+

+00:16:03 They should be for yours as well.

+

+00:16:04 Here's why.

+

+00:16:05 Sentry doesn't just catch errors.

+

+00:16:07 It catches all the stuff that makes your app feel broken.

+

+00:16:10 The random slowdown, the freeze you can't reproduce, that bug that only shows up once

+

+00:16:14 real users hit it.

+

+00:16:15 And when something goes wrong, Sentry gives you the whole chain of events in one place.

+

+00:16:19 Errors, traces, replays, logs, dots connected.

+

+00:16:22 You can see what's led to the issue without digging through five different dashboards.

+

+00:16:27 Seer, Sentry's AI debugging agent, builds on this data, taking the full context,

+

+00:16:32 explaining why the issue happened, pointing to the code responsible, drafting a fix,

+

+00:16:37 and even flagging if your PR is about to introduce a new problem.

+

+00:16:41 The workflow stays simple.

+

+00:16:43 Something breaks, Sentry alerts you, the dashboard shows you the full context.

+

+00:16:47 Seer helps you fix it and catch new issues before they ship. It's totally reasonable to go from an error occurred to fixed in production in

+

+00:16:55 just 10 minutes. I truly appreciate the support that Sentry has given me to help solve my bugs

+

+00:17:01 and issues in my apps, especially those tricky ones that only appear in production. I know you will

+

+00:17:06 too if you try them out. So get started today with Sentry.
Just visit talkpython.fm/sentry

+

+00:17:12 and get $100 in Sentry credits. Please use that link. It's in your podcast player show notes. Use

+

+00:17:19 our code talkpython26, all one word, talkpython26, to get $100 in credits.

+

+00:17:26 Thank you to Sentry for supporting the show.

+

+00:17:29 Cool.

+

+00:17:29 That's a super interesting philosophy to say you should be able to take, even take

+

+00:17:33 stuff out of what we're giving you by default, right?

+

+00:17:35 Now, before we move on from sort of introducing Datastar, I do want to point out at data-star.dev,

+

+00:17:42 which of course I'll link in the show notes, there's some cool examples on here.

+

+00:17:45 You've got a really nice 2001: A Space Odyssey sort of theme with HAL and all that, which is great.

+

+00:17:53 I like the aesthetic here, which is very fun.

+

+00:17:56 It's got a little bit of a retro gaming feel, which is nice.

+

+00:18:00 But what I want to point out is I want to encourage people to go watch your little video.

+

+00:18:03 Your video is fun.

+

+00:18:05 It's really fun.

+

+00:18:06 This video is all about how Datastar fits in the world of SPAs.

+

+00:18:11 And one thing we didn't really mention is that Datastar is a full-fledged SPA replacement.

+

+00:18:17 So again, like that last 10%, often people will think, oh, well, I need to go to React or Vue.js or some single page application framework.

+

+00:18:26 Whereas we're saying that, no, no, no, Datastar will not only, it's not like a subset or like SPAs are not a superset.

+

+00:18:34 it's on the contrary. I think Datastar, we think Datastar can do more than SPAs because we are

+

+00:18:42 driven by the backend and we are focused on hypermedia, which is the language of the web.

+

+00:18:46 So this, yeah, so this video is kind of throwing, yeah, anyway, everybody should watch it.
+

+00:18:51 I'd also like to, if you can scroll back up to the top of the page, the Starfield animation was

+

+00:18:57 one of the things, like when Delaney and everybody who worked on this published it,

+

+00:19:02 I didn't realize how amazing this was because if you, like, right click and inspect that thing,

+

+00:19:07 it's a web component.

+

+00:19:09 And so all the JavaScript that's required for making all the stars go faster and slower

+

+00:19:13 and tracking your mouse where, you know, wherever you do it,

+

+00:19:16 it's all within that web component.

+

+00:19:18 And Datastar is essentially subscribing to, like, where's the mouse pointer

+

+00:19:22 and passing it into the web component.

+

+00:19:24 Yeah, in fact, if you go to more examples, you will see that there's,

+

+00:19:29 and then go scroll down to, or use the hamburger thing.

+

+00:19:32 Yeah, go down to the rocket.

+

+00:19:35 There's the actual star field.

+

+00:19:38 So you can see the entire, the star field, the entire component is there.

+

+00:19:41 So if you scroll down from there, you'll see how it actually gets hooked up

+

+00:19:45 and the entire component, that's the whole thing, it's right there.

+

+00:19:48 - That's incredible.

+

+00:19:49 - And the thing is if you start moving around, like if you scroll up just a little bit more,

+

+00:19:52 so you can see the sliders, you'll see that they're live, everything's,

+

+00:19:56 if you move it around, like you move your mouse around the canvas,

+

+00:19:59 you'll see everything's live editing, everything's thing.

+

+00:20:02 It's the irony of Datastar.

+

+00:20:03 And this is the part that I don't think people quite get.

+

+00:20:06 And it's not that you're trying to, like, we love what Carson has done with HTMX.

+

+00:20:10 We love all the things they've done, but it does not do everything.

+

+00:20:13 It doesn't do enough.

+

+00:20:14 It is a library, not a framework.
+

+00:20:15 And the thing is, the irony is that Datastar actually has

+

+00:20:18 the fastest reactive signal, like reactive signals.

+

+00:20:22 We are the fastest thing out there.

+

+00:20:23 So it's not just like we did something that's kind of like VDOM,

+

+00:20:26 or we are like, we can compete with React.

+

+00:20:28 We demolish them with actual numbers.

+

+00:20:30 So we have the fastest morphing strategy and we also have the fastest signals, which means doing these kinds of things.

+

+00:20:35 It's just a non-issue.

+

+00:20:36 Like this star field thing is 1K.

+

+00:20:38 Like it's just, these are the kinds of things that are just a non-issue in this if you do things our way.

+

+00:20:42 And you're leaning into the web ecosystem by leveraging web components instead of having to, like, build, have a build time pipeline to, you know, do all the custom JavaScript.

+

+00:20:53 Like once I realized, like, you can do these things, it just made, it just clicked.

+

+00:20:57 And I just, I feel like it's so much more fun now to work on the web, now that I understand these things.

+

+00:21:04 Let's talk through some of the core examples.

+

+00:21:06 I feel like there's some similarities to the example section of the HTMX site.

+

+00:21:11 But, you know, HTMX doesn't have a star field, certainly.

+

+00:21:14 Best place to start is on the homepage.

+

+00:21:17 Before we get into those examples, just to kind of take a step back and say, OK, we've mentioned HTMX a few times and we don't even like to compare ourselves to HTMX.

+

+00:21:26 But it is a good maybe starting point for some people.

+

+00:21:29 We have a hello world example there, if you could find that.

+

+00:21:32 Yeah, let's scroll down just a little bit more.

+

+00:21:33 Yeah, you got it.

+

+00:21:34 One of the maybe differences between HTMX and Datastar is that Datastar can receive HTML responses,

+

+00:21:41 but also, by default, the recommendation is to use server-sent events.
+

+00:21:46 So if you hit start there, you're going to see kind of the network response tab,

+

+00:21:50 and those are server-sent events.

+

+00:21:51 And SSE, server-sent events, are an old technology that works just over HTTP.

+

+00:21:57 And essentially what happens is that the server holds a connection open to the browser

+

+00:22:02 and it's unidirectional.

+

+00:22:03 So you send a request to the server and then the server can stream events back down,

+

+00:22:08 which is what you're seeing here.

+

+00:22:10 Now, this is obviously a trivial example, right?

+

+00:22:12 We're sending one, or we're updating the message one character at a time.

+

+00:22:16 But when you see how simple this is, then you can perhaps see the potential for this, right?

+

+00:22:22 And SSE, or server-sent events, have had kind of a renaissance in recent years with all of the LLMs, right?

+

+00:22:29 All the chatbots are streaming the responses back to you.

+

+00:22:33 So this type of technology, while it's not old, sorry, it's not new, it's actually been around a long time, has kind of been underused.

+

+00:22:43 And Delaney kind of tapped into that and said, well, because I also always thought, well, if I want pure reactivity or true reactivity,

+

+00:22:50 I need two-way communication.

+

+00:22:53 So I need web sockets.

+

+00:22:53 You need web sockets.

+

+00:22:54 You need binary and all that kind of stuff.

+

+00:22:56 Yeah.

+

+00:22:57 Yeah.

+

+00:22:57 There's problems with those, which we can get into.

+

+00:23:00 SSE is much simpler.

+

+00:23:01 It works over HTTP 1, 2, and 3.

+

+00:23:05 And as you can see, it's just plain text.

+

+00:23:06 There is no complicated handshake.

+

+00:23:08 If you change the interval to zero and hit start, you're going to see a different type of response, which is,

+

+00:23:16 and I don't know if you saw the content type change, but the content type now is text/html.

+

+00:23:20 Oh, interesting.

+

+00:23:21 Yeah.
+

+00:23:22 So this is what HTMX would do by default.

+

+00:23:24 You send back HTML responses, whereas here the content type

+

+00:23:28 is text/event-stream.

+

+00:23:30 And this allows you to hold that connection open for as long as you want.

+

+00:23:34 It can be open and closed, or it can stay open until the words

+

+00:23:39 hello world have been spelled out.

+

+00:23:41 Or you can keep it open indefinitely.

+

+00:23:44 So we're going to see some more advanced examples where the SSE connection is held open for longer.

+

+00:23:50 So I think wrapping your head around this example taps you into the potential of Datastar.

+

+00:23:57 Yeah.

+

+00:23:58 And one of the things that--

+

+00:24:00 well, when I looked at Datastar, I'm like, OK, there's some interesting aspects here.

+

+00:24:05 And we'll get into them, how you can set up--

+

+00:24:08 when I click the Start button, it might replace a piece of the page-- hey, that sounds familiar--

+

+00:24:13 with HTML, not through JavaScript, right?

+

+00:24:15 But it didn't specify anywhere what part of the page to replace or not.

+

+00:24:21 Like, how does it know?

+

+00:24:22 And so with Datastar, you lean more on the server for many things,

+

+00:24:28 including deciding what part of the page that the server created in the first place to update.

+

+00:24:32 I really like that.

+

+00:24:33 I think that that's super neat.

+

+00:24:35 It lets you not just have sort of closer to one source of truth,

+

+00:24:39 but also just you can pass down multiple things.

+

+00:24:43 It's like, we need to update this pane on the right, this text, and this element all in one response.

+

+00:24:50 There's a lot of interesting aspects to what you're talking about here.

+

+00:24:54 Anyone who's familiar with out-of-band swaps

+

+00:24:56 in HTMX, well, guess what?

+

+00:24:59 Datastar is out-of-band by default.

+

+00:25:01 So it's matching currently based on the ID.
+

+00:25:04 So you see h3 id equals message.

+

+00:25:07 And every event that's coming back has an ID of message.

+

+00:25:10 But guess what?

+

+00:25:11 You can use any ID you want, right?

+

+00:25:13 So you can use actually any CSS selector you want.

+

+00:25:16 But yes, we put the onus more on the backend because that is where we believe state should live

+

+00:25:23 or that's the source of truth for state.

+

+00:25:27 And you send and you work with state on the front end only when and where it makes sense to,

+

+00:25:33 which is more the web component aspect.

+

+00:25:35 And I'll caveat what Ben said there in that, like, state mostly lives in the backend.

+

+00:25:40 And that's the thing, is that, like, state lives where it lives.

+

+00:25:43 Like if the user is actively able to move their mouse cursor,

+

+00:25:46 they own that state of the mouse cursor.

+

+00:25:48 You don't own that, but most of the state from your database should be in the backend.

+

+00:25:52 The one thing that's interesting about the SSE compared to how most people

+

+00:25:56 think about this stuff, I will say I fell into this trap too, right,

+

+00:25:59 because I did the LiveView crazy stuff, is that your job as a web developer is to

+

+00:26:03 get strings to the browser as efficiently, as fast as possible.
+

+00:26:06 Because, like, the browser is going to deal with turning that into HTML and all that. There's nothing faster than giving it HTML, right? So the thing that I

+

+00:26:14 know I was lost on for a long time is that SSE, I thought, oh, it's this big string thing, how is that better

+

+00:26:20 than binary? But the irony is that because it's so regular, because there's already things like

+

+00:26:25 compression built into the browser, there's streaming things, there's things that are so

+

+00:26:28 much easier to do here in an efficient way, that the irony is you don't have to care

+

+00:26:33 about all these things. But if you just follow our way of doing it, your Python app will be faster

+

+00:26:38 than most people's, like, compiled, you know, like low-level language thing,

+

+00:26:42 because you're getting orders of magnitude in the algorithms

+

+00:26:44 and how we're doing stuff under the hood.

+

+00:26:46 So I don't know if you're interested in like the deep down stuff

+

+00:26:49 or just like how you use it as a Python developer.

+

+00:26:51 But the irony is that you now have tapped into this.

+

+00:26:53 It seems so simple.

+

+00:26:54 You're like, oh, this is just a different text response.

+

+00:26:56 How can this be orders of magnitude faster?

+

+00:26:58 Like, again, I don't know how much you want to get into the weeds of that

+

+00:27:01 compared to just it's fun to use, right?

+

+00:27:03 Yeah, I really like the philosophy of having so much of it controlled by the server.

+

+00:27:08 It just felt disheartening.

+

+00:27:10 It's like, okay, so what you're going to do is you're just going to create some JSON responses

+

+00:27:14 on your server, and then everything is some crazy series of build steps

+

+00:27:19 to end up with, I don't know, Vue or React or something on the front end.

+

+00:27:23 And there's just so much power and flexibility to write really cool server code.
+

+00:27:29 But, you know, like a lot of the trends have been, yeah, that's kind of just there to support the rest of it,

+

+00:27:34 you know, and so I don't know, this really appeals to me.

+

+00:27:37 A question that comes up often is like, OK, well, how do I format this? Because it has its own syntax.

+

+00:27:44 Very simple to read, obviously, right? An event name and then these data lines. And you can just

+

+00:27:48 have as many data lines as you want. And that's your HTML. If you scroll up, though, we do have...

+

+00:27:54 So you do need to format this, but we essentially have all of these SDKs, including Python, you'll

+

+00:28:01 see there. And the Python SDK is actually, I would say, one of the most intricate ones we have.

+

+00:28:07 Spatuel King, he's a member of the community, or Chase, I believe is his first name, his real first

+

+00:28:13 name, and many other contributors did an amazing job on that. So lots and lots of Python frameworks

+

+00:28:18 are supported. You can maybe speak more to this, Chris. And really, the SDKs are very simple,

+

+00:28:24 because all they do is they take a function, a patch elements or patch signals function,

+

+00:28:30 and you just dump in the HTML that you want swapped into the DOM or the signals you want

+

+00:28:34 output on the page, and it just does the formatting for you. So it's really just, there's three

+

+00:28:39 functions, I think, in total that every SDK has to implement. And it's such a time saver, you know.

+

+00:28:45 I dove into server-sent events a lot with HTMX, and when you get the syntax wrong, it is so painful

+

+00:28:52 to debug, because it pretty much just doesn't work, you know, or whatever, it's harder to debug. And

+

+00:28:58 so to have the helper syntax, it's just a dream. Well, and also, just so people are aware, like,

+

+00:29:04 because I was originally going to try, the irony is I was trying to get server-sent events,

+

+00:29:08 like their plugin, up to snuff years ago.
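As a rough illustration of what those SDK helper functions are doing for you: a patch-elements helper just wraps your HTML in a named SSE event with one `data:` line per line of markup. The event name `datastar-patch-elements` and the `elements` data-line prefix below follow the Datastar SSE protocol as I understand it at time of writing; treat them as assumptions and prefer the official Python SDK, which keeps the framing in sync with the browser-side plugin.

```python
# Hand-rolled sketch of an SDK-style patch-elements helper (illustrative only;
# the real SDKs also handle options, signals, and protocol changes for you).

def patch_elements(html: str) -> str:
    # One "data: elements ..." line per line of markup, inside a named event.
    data_lines = "\n".join(f"data: elements {line}" for line in html.splitlines())
    return f"event: datastar-patch-elements\n{data_lines}\n\n"

event = patch_elements('<h3 id="message">Hello</h3>')
```

Getting this framing wrong by hand is exactly the painful-to-debug situation Chris describes, which is why the SDKs exist.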
+

+00:29:11 Like I would highly recommend not using SSE with HTMX

+

+00:29:14 because the problem is that the entire model of how you build things is very poll-based

+

+00:29:18 and it's built out of band.

+

+00:29:19 It's like a weird concept, like the idea of updating,

+

+00:29:22 like it is not built with that in mind.

+

+00:29:24 So I know that they're trying to move towards that in the future,

+

+00:29:26 but the whole way that you interact with it is based on polling.

+

+00:29:30 And the thing about our way is that not only are you doing push events,

+

+00:29:33 but the thing is that really does change the semantics of the language.

+

+00:29:36 So first of all, you get like 40X compression by doing it our way.

+

+00:29:40 But also you only send data when you need to instead of polling.

+

+00:29:43 So now you're using less resources.

+

+00:29:45 You're using less network.

+

+00:29:46 It changes the whole dynamic in a deeper way that you can literally save 5,000X in your network bandwidth.

+

+00:29:54 It sounds crazy, but it's just a reality.

+

+00:29:57 Right.

+

+00:29:57 Another thing, Delaney, that's really nice about that is the latency.

+

+00:30:00 That's something that drives me crazy about polling.
+

+00:30:03 In general, it's just like, okay, well, we don't want to hammer the server too hard, so let's make this, you

+

+00:30:09 know, one second, two seconds. But then it's like, well, I click this button and then it updates, and you're

+

+00:30:13 like, ah. If something happens on the server, it's sent right when it wants to. One of the things that,

+

+00:30:18 ironically, because I do a lot of, like, Go or low-level language stuff, is that I tend to put a debounce

+

+00:30:23 in my server at like five milliseconds so that I'm not updating more than, you know, 200 times

+

+00:30:30 a second, even on a monitor, because the browser will actually break after 500 FPS. So the

+

+00:30:36 interesting thing is that Datastar is basically no longer the issue in your thing. If you

+

+00:30:40 are on a low-powered battery device, like a mobile on 3G, this will just work. Like, it's

+

+00:30:47 stuff that you just don't have to worry about. So it does change the semantics of how you build things,

+

+00:30:52 just so that you're aware. Because even things like, for example, built into HTMX, they don't do

+

+00:30:57 automatic exponential backoff.

+

+00:30:59 It doesn't have all the verbs.

+

+00:31:01 There are caveats there, so I would recommend not doing it, honestly, if you're going to do it.

+

+00:31:05 It's crazy that you're talking about going below the monitor refresh rate.

+

+00:31:09 You're not going to see it. This is only 120 hertz.

+

+00:31:13 120 times a second.

+

+00:31:15 So why would you poll faster than that? That's wild.

+

+00:31:20 This portion of Talk Python To Me is brought to you by us.

+

+00:31:23 I'm thrilled to announce a brand new app built for developers

+

+00:31:26 created by yours truly.

+

+00:31:28 It's called Command Book.

+

+00:31:30 You know that thing you do every morning?

+

+00:31:32 Open up six terminal tabs, CD into this directory,

+

+00:31:35 activate that virtual environment, run the server with --reload.
+

+00:31:38 Now, CD somewhere else, start the background worker,

+

+00:31:41 another tab for Docker, another one to tail production logs.

+

+00:31:44 Every tab just says Python, Python, Python, Docker tail.

+

+00:31:48 And you're clicking through them going, which Python was that again?

+

+00:31:51 Where's my app running?

+

+00:31:52 Then sometime later, your dev server silently dies

+

+00:31:55 because it tried to reload while you're in the middle of a code edit,

+

+00:31:59 an unmatched brace, a half-written import, or something.

+

+00:32:02 Now you're hunting through tabs to figure out which process crashed

+

+00:32:05 and how to restart it.

+

+00:32:06 My app, CommandBook, gives all of these long-running commands a permanent home.

+

+00:32:11 You save a command once, the working directory, the environment,

+

+00:32:14 free commands like git pull, and from then on, you just click run.

+

+00:32:18 You can even group commands together to start and stop everything

+

+00:32:21 for a project with a single click.

+

+00:32:23 It also has what I call Honey Badger Mode, auto-restart on crash.

+

+00:32:27 So when your dev server goes down mid-reload, Command Book just brings it right back up and does so over and over until the code is fixed.

+

+00:32:35 It also detects URLs from your output, so you're never scrolling through thousands of lines of logs

+

+00:32:39 just to figure out how to reopen your web app.

+

+00:32:42 And it shows you uptime, memory usage, and all sorts of cool things about your process.

+

+00:32:46 The whole thing is a native macOS app.

+

+00:32:49 No Electron, no Chromium, just 21 megs.

+

+00:32:51 And it comes with a full CLI.

+

+00:32:53 So anything you've configured in the UI, you can fire off from your terminal with just a single command.

+

+00:32:58 Right now it's macOS only, but if there's enough interest,

+

+00:33:02 I'll build a Windows version too.

+

+00:33:03 So let me know.
+

+00:33:05 Please check it out at talkpython.fm/command book app,

+

+00:33:09 download it for free, level up your developer workflow.

+

+00:33:12 The link is in your podcast player show notes.

+

+00:33:14 That's talkpython.fm/command book.

+

+00:33:16 I really hope you enjoy this new app that I built.

+

+00:33:20 Yeah, on the topic of latency and all that, if you go to the examples, there's some we could look at that I think really demonstrate this.

+

+00:33:28 Well, maybe start with Bad Apple just because we're talking about refresh rates.

+

+00:33:33 OK.

+

+00:33:33 What's happening is that the back end is streaming down just a bunch of symbols, but it creates this animation.

+

+00:33:39 And if you were to open the network tab, you would see, it actually would be interesting to see.

+

+00:33:44 You probably have to refresh the page just to.

+

+00:33:46 Yeah.

+

+00:33:47 And you're going to see updates.

+

+00:33:49 That one there.

+

+00:33:49 Yeah, if you click that--

+

+00:33:51 This one?

+

+00:33:52 Yeah, that's the one.

+

+00:33:53 You click Event Stream.

+

+00:33:54 There's an Event Stream tab for Event Stream responses.

+

+00:33:57 You're going to see these streaming.

+

+00:33:58 I don't know what frames per second we have this set to, but you see it streaming past, right?

+

+00:34:03 Right.

+

+00:34:03 The first time many people see this, this is a surprise that the browser is capable of this.

+

+00:34:09 But the browser can stream video, so why can't it stream a bunch of text?

+

+00:34:12 I mean, it's not that big of a leap of faith.

+

+00:34:15 But you can see, it looks like it's about every 10, 20 milliseconds.

+

+00:34:19 I think we're doing like 30 frames a second, but again, we can do like, we're doing this on a,

+

+00:34:24 basically a free tier server. So like, this is just a non-issue and it's doing all the compression

+

+00:34:28 stuff.
So if you notice that your update, even though we're doing like full ASCII development at,

+

+00:34:32 you know, thousands of characters, your updates are actually not updating that. Like you see how

+

+00:34:37 it's transferring, but it's not transferring that much compared to how much it's actually coming out.

+

+00:34:41 I can see we got 1.9 megs for the whole page. Yeah. But do you see next to it? What, what was

+

+00:34:46 actually, like, the resources? So you see the compression. Well, yeah, we're probably not seeing

+

+00:34:51 it there, but in the bottom you'll see two megabytes have been transferred, but 10 megabytes

+

+00:34:55 of resources. And so that's, oh yeah, yeah, so it's a 5x compression. Yeah, it's going to be much more

+

+00:35:01 on the stream, I think, because it's streaming. Normally you can hover over the size

+

+00:35:07 and you'll see the uncompressed, but I guess it's changing too fast. That's pretty wild. And you know,

+

+00:35:13 in practical usage, like I have a status screen that I have from my production app at work. And

+

+00:35:19 it's just amazing to just constantly be seeing these things update. And I'm doing that by having

+

+00:35:24 the database tell my Python code, hey, refresh. I actually ask it to get all the entries from the

+

+00:35:31 database and send it down the pipe. And so it's not like I'm doing the optimized thing. I'm doing

+

+00:35:35 the simple thing and I get all these cool things just updating all the time. And it's just such a

+

+00:35:40 useful thing, especially for status screens, dashboards, stuff like that.

+

+00:35:44 Speaking of that, go to the DB Mon example. This is one of my favorites because when React

+

+00:35:50 first had their first conference, they said, look at what we're doing. We're able to update at a rate

+

+00:35:54 that no one else can compete with in how fast they could update the browser, right? If we

+

+00:35:59 actually, yeah, you're still there.
So the thing is, if you actually set the FPS to something like

+

+00:36:03 80, whatever. So that is how fast it's coming from the backend to you. So go ahead. Yeah,

+

+00:36:10 because we just don't want people blasting the server.

+

+00:36:12 Yeah, you don't want to walk away.

+

+00:36:14 Yeah, but the point is that this is coming.

+

+00:36:16 See, we're doing stuff in microseconds on a potato.

+

+00:36:19 Yeah, let me just describe this a little bit for people listening.

+

+00:36:22 So it's like a database monitoring table that shows you how many transactions and how the database loads.

+

+00:36:29 So it's updating a grid of maybe 10 or 12 databases with five or six elements,

+

+00:36:36 and it's doing that in microseconds, 80 times a second.

+

+00:36:38 A lot of people see these examples and they think, well, I'm not building this kind of stuff.

+

+00:36:43 And me included.

+

+00:36:44 I build CRUD apps most of the time.

+

+00:36:47 And there are plenty of examples here that are just cruddy things.

+

+00:36:51 They're kind of the more boring examples.

+

+00:36:53 But one example that might be worth looking at is the TodoMVC.

+

+00:36:58 And if you can figure out how to open that in split screen.

+

+00:37:01 Okay.

+

+00:37:02 What part do you want me to open up in split?

+

+00:37:04 Oh, just this, the example.

+

+00:37:05 Yeah, so I can do these two and then I can tile them.

+

+00:37:09 How's that?

+

+00:37:10 So this is a CRUD app, but what Datastar gives you is the ability to do multiplayer out of the box.

+

+00:37:15 And that is like real-time collaborative apps are not easy to do and not easy to scale as well.

+

+00:37:22 But as you'll see here, when you have like two sessions open, it's going to be near instant.

+

+00:37:27 You're going to basically be observing the latency on your network connection,

+

+00:37:31 which is going to be 50 milliseconds to 100, but barely perceptible.
+

+00:37:35 So just to describe to people, we've got this TodoMVC, which allows you to, well, it's like a

+

+00:37:40 to-do example, which is required for any legitimate JavaScript framework. But I've opened it in two

+

+00:37:46 tabs and I've used Vivaldi's tile. So these are legitimately two browsers. They just appear to be

+

+00:37:51 kind of in the same window. And when I enter stuff into it, it literally looks like they update in

+

+00:37:57 parallel, which is crazy. If you check a few of them, you'll see,

+

+00:38:00 you can barely tell which one's updating which.

+

+00:38:03 It happens almost instantly.

+

+00:38:04 Yeah, if I look at the other one and I click on one, it feels like that's responding to my click.

+

+00:38:10 I need to correct myself that it is happening instantly because when you click, when you check one of those,

+

+00:38:15 it's not, and this is an interesting thing we can get into.

+

+00:38:19 We're not doing optimistic updates.

+

+00:38:21 It's actually sending a request to the server and the server is simultaneously updating

+

+00:38:27 both of your tabs at the same time.

+

+00:38:29 Even if I had just one open, it's still going round trip to the server.

+

+00:38:33 That's why it looks like it's simultaneous, because it effectively is.

+

+00:38:35 This is a thing that you can, we can talk for like three hours

+

+00:38:38 and I will yell at most SPA developers because there's this weird thing

+

+00:38:43 that because it's easy, people will actively lie to users in the SPA world

+

+00:38:47 and they'll do optimistic updates, which means I'm going to make it

+

+00:38:50 so that I'm making this change.

+

+00:38:52 And then if there's a problem, then fix it.

+

+00:38:54 Whereas we say you should show an indicator saying, I'm trying to make a change to this

+

+00:38:58 and then fix it.

00:38:59 Because you don't want, like when you're playing a video game, you can do what's called dead

00:39:02 reckoning and you can do rollback netcode.

00:39:05 You can do some clever things to hide latency, but you don't want to hide latency when it

00:39:10 comes to like a bank transfer or did I buy that thing or did I get that theater ticket

00:39:14 or any of that stuff.

00:39:15 Like people just have the wrong mental model of how the web should work.

00:39:19 I'm actually going to send you another thing that this might blow your mind even more because

00:39:23 the three of us basically can play.

00:39:25 This is an example where all of us could be playing live with each other right now in an active shared state that's been at the top of Hacker News and again, runs on a potato.

00:39:35 I don't know if you just put that in your...

00:39:37 Yeah, let me drop it over.

00:39:38 Hold on.

00:39:38 I'm going to put it in the other tab.

00:39:39 So right now, don't touch anything.

00:39:42 I'm going to actively start.

00:39:43 I am purple.

00:39:43 I am literally starting to click right now.

00:39:45 All right.

00:39:46 So we're looking at a multiplayer Game of Life here.

00:39:48 I'm seeing that live here.

00:39:50 And if you open up that in the other tab, you would actively see the exact same state.

00:39:54 So everyone in the world, like if you open that up in the other tab,

00:39:58 you cannot get out of sync.

00:40:00 It's not faking it in the front end.

00:40:01 This is literally sending it.

00:40:02 What's even crazier about this, here's the crazy part.

00:40:04 It's actually a rendering demo.

00:40:06 The guy who wrote it is writing in a scripting language, Clojure,

00:40:09 and he's sending down 2,500 divs per frame styled, inline styled.

00:40:14 Now go to your network tab now and look at what's actually,

00:40:17 like look at your network tab and you'll see how little data we're actually sending over.

00:40:21 Even though he's updating 2,500 divs per frame, like if you go to wherever it's updated,

00:40:26 yeah, whichever one's the one that's updated, there you go, yeah.

00:40:29 So if you look here and look at how much is being sent

00:40:32 versus how much is actually, like, this is just a different paradigm

00:40:36 for how you build.

00:40:37 And the thing is, again, not everybody has to care about these low level things,

00:40:40 but the thing is that once you do this, the idea of CRUD kind of goes away

00:40:44 because in our opinion, you go to a multi, you make a multi-page app like you would normally do

00:40:49 in HTMX or anything else, but you keep an open stream

00:40:52 and you just update whatever's happening in your backend

00:40:54 as it's happening.

00:40:55 And it simplifies the world.

00:40:57 And what's also interesting is because of how we do compression and all that,

00:41:00 you just send your entire page.

00:41:01 You don't need like out of band.

00:41:03 It doesn't even really make sense because we're so fast that you can just,

00:41:06 you as a Python developer, you just give us your entire page

00:41:09 and let us deal with it.

00:41:10 And we will come up with the fast stuff.

00:41:12 So Chris should probably talk a lot more to that because the fast morph stuff,

00:41:15 it's a fundamental change in how you build web apps, I think.

00:41:18 Yeah, yeah.

00:41:19 especially the kind of the mental shift of like, because I kept thinking, okay, I need to like send one row at a time.

00:41:25 And I actually have one status screen that does that because we use Firestore, Google's Firestore, as our backend for this app.

00:41:32 But for some reason, sometimes it just doesn't send every update.

00:41:36 And so on another status screen, I actually, you know, query the whole database table or collection and send it down the pipe.

00:41:42 And because sometimes it doesn't send from Firestore, I get the entire latest state of all the things that are in flight and updated on my screen.

00:41:51 And it just makes things easier.

00:41:52 Yeah, it's amazing.

00:41:53 Sounds like a good opportunity to subscribe to database query changes.

00:41:57 I know some databases you can say, if this query updates, you know, trigger this event and then keep it flowing,

00:42:03 like straight from events on the database, straight to your front end.

00:42:06 Pretty cool.

00:42:07 I do want to go back and just put a little bit of commentary, Delaney.

00:42:12 When you said optimistic updates.

00:42:15 So one of the things that's really common in JavaScript is I click this thing, it changes.

00:42:20 I want to mark it as changed.

00:42:21 And then I'm going to tell the server, hey, we made this change.

00:42:25 It's very possible the server died, that you're not allowed to make that change or whatever.

00:42:29 And then you've got to come back and go actually undo that.

00:42:30 That leaves you, you know, in like a weird state.

00:42:33 So what you're saying is you don't have to worry about that kind of stuff.

00:42:35 We're a framework, not just a library.

00:42:37 The idea is that you have these indicators, and basically your indicators drive a signal.

00:42:41 Like, again, the details don't really matter.

00:42:43 But the idea is that you have instantaneous, like within the same frame updates of,

00:42:48 hey, I'm going off to do something.

00:42:50 Like usually you make a spinner or you say, I'm going to do this to gray out the field

00:42:53 or I'm going to do, like there's all kinds of things you can drive.

00:42:55 Because again, the state of what the local stuff is while the change is there, that lives in the client.

00:43:02 Like that is part of Datastar.

00:43:04 It has all the right tools to make it so that you can disable it or gray it out

00:43:08 or say, I'm going to put a spinner next to it.

00:43:10 Like you can do all those things.

00:43:11 But the thing is you're not lying to your user.

00:43:13 That's my whole thing.

00:43:14 And people say, well, that's not really a lie.

00:43:15 It's like, yes, it is.

00:43:16 You're literally lying to people.

00:43:18 Like, please stop.

00:43:18 It's a DX issue.

00:43:20 The reason why people do it is because it's convenient, not because it's correct.

00:43:24 And again, like you can do optimistic updates.

00:43:26 You can do SPA-like things using Datastar.

00:43:29 We don't recommend it because Datastar is more than just this tech.

00:43:34 It's also like a way of doing things.

00:43:37 What I wanted to point out here is that you might imagine that, you know, this is something that when you click edit,

00:43:43 it turns it into a form.

00:43:44 So you might like load the form into the page, hide the form,

00:43:47 and then just do a show hide approach.

00:43:49 But the hypermedia approach is kind of like the REST approach

00:43:53 where you can only take the next action at any given time.

00:43:57 So if you open the network tab, I just want to kind of walk you through this briefly.

00:44:00 If you cancel that, when you hit edit, you will see a network request to the server.

00:44:04 And what comes back is that form.

00:44:07 So it's real time as in like what you're seeing now is the actual state reflected on the backend.

00:44:13 And when you save, you're also going to see the same thing.

00:44:16 You're going to see a network request to the server and it gets the true current state

00:44:22 as it has been saved is now all comes back down.

00:44:24 So it's like you don't even need optimistic updates most of the time.

00:44:28 And when you do use it, it's because you're trying to cover up poor performance.

00:44:33 You're favoring perceived performance over true performance.

00:44:37 One of the things I hear a lot is people saying, but it's so much slower.

00:44:41 But I think people are used to or think it's much slower than it is

00:44:45 because the web, the SPA life that we see around us feels so slow.

00:44:51 But anytime I've seen people try to lean into just using the network,

00:44:55 it's so much faster than you expect.

00:44:57 Well, and also you have so much less on the wire.

00:45:00 Your usage of network can be easily 100x less, which means you have less contention,

00:45:04 which means when you do send something, it's there immediately.

00:45:07 And also because you're not doing polling. With polling, you have to send to the server and the server sends back. If you just send from server

00:45:13 when something updates, now you've just halved your RTT, right? Your round trip has just halved.

00:45:19 So you halve it and you're doing like a thousand less of something. All of a sudden, things opened

00:45:23 up for you in weird ways, right? It's a fundamentally different way of thinking about the problem.

00:45:29 Another example that we like to bring up a lot is there was a while back, someone did a million

00:45:33 checkbox demo and they had a whole write-up on it, right? And they basically had to take it down

00:45:38 because it was just too expensive to run.

00:45:39 We have a version that's not just checkboxes, but colored checkboxes.

00:45:43 So you can actually make ASCII art and stuff like that.

00:45:45 And it's a billion.

00:45:46 And it runs on the same server that was running that Game of Life demo.

00:45:49 It's on the same server.

00:45:50 It's actively, and it's been on top of Hacker News.

00:45:52 It's a $5 VPS as far as I know.

00:45:54 Yeah, it's a $5 one.

00:45:56 It runs all these demos all the time, active, top of Hacker News,

00:45:59 and it's never gone down.

00:46:01 What's really interesting about that demo is that it becomes a backend optimization challenge, right?

00:46:07 You're no longer trying to optimize the front end.

00:46:09 You rely on the browser and the browser API to take care of that for you.

00:46:14 And now you're doing, I don't know, you're optimizing your database.

00:46:17 You're optimizing your queries.

00:46:20 I actually threw the link to that in there.

00:46:22 Because it's a nice demo to look at when you realize there are a billion of these being stored in a SQLite database somewhere.

00:46:31 So you can scroll anywhere on the board.

00:46:33 And it's like a 30,000 by 30,000 something grid because the square root of a billion

00:46:39 is some weird number, as it turns out.

00:46:42 And obviously this is, or not obviously, but this is multiplayer.

00:46:45 So if I was to view this or you were to open a different browser tab,

00:46:49 then you would see the exact same thing.

00:46:51 The board is the same everywhere.

00:46:52 That is crazy.

00:46:54 The one thing that I will say that's hard, I don't know, Chris can really probably speak to this more,

00:46:58 and it sounds like a weird thing. Like, Datastar ends up in reality being like five or six things on your page, and it just

00:47:05 gets out of the way.

00:47:06 All of a sudden, like, with Datastar, you're going to get to a point where you try it,

00:47:09 and you're like, that's it?

00:47:10 Like, you will feel weird about every other approach once you really try it.

00:47:14 Like, just try it, and you will see every other approach is wrong.

00:47:18 Like, it's not because I made it.

00:47:20 Like, I wish someone would have made this because it just, it's so simple.

00:47:23 It feels like cheating in a weird way.

00:47:26 That's hard to explain.

00:47:27 It really, it's weird. Like, I don't know what we all were doing. I was part of the problem, right?

00:47:31 Like, I was like, oh well, Google and everyone, Facebook and all the other guys have this figured

00:47:36 out, like this has to be the best approach. So that's the weird thing, is it's so simple. I don't know what,

00:47:41 like, Chris will probably, it sounds like I'm selling it, but it's just, I don't know, it's weird. It's so

00:47:45 exciting, it's so amazing, and yet it's, it's using all these boring technologies. And like, yeah, like,

00:47:50 I remember I showed my wife this, my status board, and she's like, oh yeah, that looks really cool. And

00:47:55 I'm like, oh yeah, because you don't understand what everything is going on behind it, you know?

00:47:59 Yeah, exactly.

00:48:00 It's like, it used to be so complicated.

00:48:03 So let's do, we got a little bit of time left.

00:48:05 Let's do this.

00:48:06 I think it might be fun to talk through kind of some of the attributes and what it looks like,

00:48:13 kind of program with this a little bit and then what it looks like on the server.

00:48:15 How's that sound?

00:48:16 Would it make sense?

00:48:17 Like there's a good example of a kind of a meta framework for Python called Stario,

00:48:21 which just got its V2.

00:48:23 Okay.

00:48:23 Just launched.

00:48:24 I don't know if that is a more Python-esque way of doing it.

00:48:27 It depends on how you want to.

00:48:28 Let's start with some of the Datastar attributes, and then we could talk about that.

00:48:32 How's that sound?

00:48:33 Like, just, you know, what does it look like to say to,

00:48:36 you know, I want to connect a button to Datastar actions

00:48:41 on the back end or wired up and so on?

00:48:43 We've talked a lot about, you know, the back end driving the front end through patching elements,

00:48:48 which is kind of the lower half of what you're looking at.

00:48:51 To access that, you need to have a click listener or some sort of event listener to trigger that.

00:48:58 And Datastar, as the name suggests, uses data-* (data-star) attributes.

00:49:05 So these are part of the HTML spec's dataset.

00:49:09 And we just leverage that.

00:49:10 And we have a small grammar that you'd find on the reference page with all of the data attributes.

00:49:19 And data-on is just registering an event listener on the current element.

00:49:23 So data-on:click is just obviously registering a click event handler on the button.

00:49:30 And what's happening is that then Datastar also gives you actions.

00:49:34 So that @get is an action to send a GET request to the server.

00:49:38 You pass in the path, which is slash endpoint there.

00:49:42 And then the server takes care of the rest.

00:49:44 So what you're seeing is a div underneath with an ID.

00:49:48 IDs are obviously unique in HTML, so they're ideal.

00:49:51 And Datastar just uses that fact.

00:49:55 And what Datastar is going to do from the backend is it's going to, yeah, just send back down that div with some text content inside of it.

00:50:03 And then what Datastar does is it mutates the incoming DOM into the existing DOM.

00:50:10 I'm sorry, it morphs.

00:50:11 So it uses a morphing strategy.

00:50:14 So rather than doing a straight swap, which is what HTMX does, it will actually morph the incoming HTML into what's currently on the page.
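To make the button-to-backend wiring being described concrete, here's a minimal sketch. It is not from the show: the `/endpoint` path, the `result` id, and the helper function names are made up; the `data-on:click` / `@get` spellings follow the grammar as discussed in the episode, so check the Datastar reference for the exact syntax.

```python
# Hypothetical sketch of the wiring described above: a button whose
# click fires a GET action, and a target div identified by a unique id
# that the server can patch (morph) back into the page.

def render_page() -> str:
    """Render the initial page the backend would serve."""
    return """
    <button data-on:click="@get('/endpoint')">Load</button>
    <div id="result">Waiting...</div>
    """

def render_patch() -> str:
    """What the server sends back later: the same div, matched by id,
    which Datastar morphs into the existing DOM rather than swapping."""
    return '<div id="result">Hello from the server</div>'
```

Because the match is by id, the backend can resend the whole fragment and only the changed parts of the DOM actually mutate.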

00:50:23 That's kind of what opens up the door to these kind of broad, like where you send the entire document down, but only what changes get swapped in.

00:50:30 But in this case, it's more of a fine grained thing.

00:50:34 So only that div is going to get swapped out.

00:50:36 And the reason why that morph matters is because you, since you aren't replacing it, things like

00:50:41 focus and like where your input is and all that stays the same. So when you, even though you update

00:50:46 the whole page, you're actually not actually changing the state and that's really important.

00:50:51 So you do declarative development. You just say, I want it to look this way and it just does the

00:50:56 right things to do it. From a mental model, it's almost like having the VDOM in the backend.

00:51:00 You just say, here's what I want this page to look like. And it does all the work, but we don't do

00:51:03 VDOM. We don't do any of that stuff. We do the fast thing. So in terms of what your backend would

00:51:08 send, if you can just scroll back up, it's that text that you were looking at. Let's look at the

00:51:14 raw version because, yeah, so that's the HTML. If you scroll down to the next text, it's a code

00:51:19 block. Yeah. There's a section that has like event: datastar-patch-elements, and then what the

00:51:24 elements are and so on, right? This is like the SSE stream. Yeah. And that would be the raw events

00:51:29 that you would send down.

00:51:31 But if you look at the next one where we have a Python example,

00:51:35 you would see like, well, how do you do that in Python

00:51:38 without actually writing, you know, the raw format out?

00:51:41 And that's how you would do it there using the Python SDK.

00:51:44 Let's dive in a little bit to the SDK itself.

00:51:48 So I got so many things open.

00:51:51 Hold on.

00:51:52 We got another link for you.

00:51:53 No, I'm kidding.
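The raw SSE stream being looked at can be approximated with a small helper. This is my sketch of the wire shape as described on the show (an event named `datastar-patch-elements` carrying `data: elements ...` lines); the exact line prefixes are an assumption, so verify against the Datastar docs before relying on them.

```python
def patch_elements_event(html: str) -> str:
    """Format an HTML snippet as a datastar-patch-elements SSE event.

    Each line of the payload becomes its own `data: elements ...` line,
    and a blank line terminates the event, per standard SSE framing.
    (Prefix names are assumed from the episode's description.)
    """
    lines = ["event: datastar-patch-elements"]
    for line in html.strip().splitlines():
        lines.append(f"data: elements {line.strip()}")
    return "\n".join(lines) + "\n\n"

# The event for patching a single div back into the page:
event = patch_elements_event('<div id="result">Hello</div>')
```

The Python SDK discussed next exists precisely so you never hand-write this format; the helper just shows what travels over the wire.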

00:51:54 You know what?

00:51:55 I'm just going to go.

00:51:55 I'm going from the homepage.

00:51:56 There we go.

00:51:57 There you go.

00:51:57 Chris, maybe you could talk us through this.

00:51:59 I think before I throw it to you, though, yeah, there's a lot of framework support here.

00:52:03 So if you're a Django person, a FastAPI person, even fast HTML, it's interesting,

00:52:08 Litestar, Quart, Sanic, or Starlette.

00:52:10 There's a bunch of different ones here, but maybe just talk us through this, if you will, Chris.

00:52:16 I'm trying to remember.

00:52:16 I'm not as familiar with the example, but as you can see, this one method is, I think, where the magic happens.

00:52:22 I'm trying to remember which tool.

00:52:24 This is Quart.

00:52:25 Yeah, the example is in Quart.

00:52:27 So, you know, they first define a route, a home route slash, and it returns HTML and it's just in the string there.

00:52:34 Right. And then that.

00:52:35 This could be a Jinja or Chameleon or whatever template.

00:52:38 Like it's just whatever. It doesn't matter.

00:52:39 But somehow they get it. Yeah.

00:52:40 Makes the example easier to see in one go.

00:52:42 And obviously you see that it's pulling Datastar from the CDN, and then on load, it sends a request to the slash updates endpoint.

00:52:54 See what comes from that.

00:52:56 And so down below that, you have the slash updates endpoint, which has a decorator called datastar_response.

00:53:03 And that just does a couple of nice things like sets the HTTP headers and whatnot to be the server-sent event protocol.

00:53:10 And then, what I like, the first line says signals equals await read_signals.

00:53:16 And so that's another helper that essentially says when I have a request coming in,

00:53:21 Datastar has a specific way of sending the state of the front end to the back end.

00:53:25 So the back end can do whatever it needs.

00:53:26 Right. We haven't even talked about signals yet. They're kind of like a data-binding set of JavaScript data, you know, reactive data that loads on the front end, right?

00:53:35 In some ways, the Alpine.js kind of, I don't want to say equivalent,

00:53:39 but it covers similar functionality.

00:53:41 And so if you have data on the front end that the backend would like to know,

00:53:45 that's an easy way to get it.

00:53:47 And then essentially what happens is we get into this loop, this while true loop,

00:53:51 and Datastar will just start sending down server-sent events in text

00:53:56 by using the sse.patch_elements function, or I guess it's a method technically.

00:54:01 And all it's doing is sending a string that has the current datetime dot now in ISO format down.

00:54:07 And then we wait, we sleep for a second, or is it a second?

00:54:11 I guess it's a microsecond.

00:54:12 I keep forgetting which one.

00:54:13 Yeah, that's a millisecond.

00:54:14 No, no, it's seconds.

00:54:15 And sleep is seconds.

00:54:16 Sleep is seconds.

00:54:17 It takes a float.

00:54:18 So once it sleeps, it sends another server-sent event.

00:54:21 This time, instead of sending the HTML down, we're sending a signal.

00:54:27 So essentially changing, say, a JavaScript value, or signal data,

00:54:31 on the front end of the page.

00:54:32 Right, right.

00:54:33 So it's showing that you can send the HTML and let Datastar patch it, or you can basically

00:54:38 from the server set one of these signal things that will be reactive on the front end, right?

00:54:43 Yeah.

00:54:43 You said it much better than I did.

00:54:45 Thanks.

00:54:45 It's a long way of saying it's a clock, right?

00:54:48 Yeah.
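Stripped of any web framework, the clock loop being walked through, send an element patch, sleep, send a signal patch, repeat, can be sketched as a plain async generator. The event names mirror the spoken description, but this is a stand-in, not the real datastar-py SDK; in practice you'd use the SDK's helpers instead of formatting events by hand.

```python
import asyncio
import datetime
import json

async def clock_events(interval: float = 1.0):
    """Yield alternating SSE events, roughly the while-true loop from
    the example: an HTML patch with the current time, then a signal
    patch updating reactive front-end state. Names are illustrative."""
    while True:
        now = datetime.datetime.now().isoformat()
        # Element patch: the backend just re-sends the div it owns.
        yield ("event: datastar-patch-elements\n"
               f'data: elements <div id="clock">{now}</div>\n\n')
        await asyncio.sleep(interval)  # sleep takes seconds, as a float
        # Signal patch: update front-end signal data instead of HTML.
        yield ("event: datastar-patch-signals\n"
               f"data: signals {json.dumps({'time': now})}\n\n")
        await asyncio.sleep(interval)

async def collect(n: int, interval: float = 0.001):
    """Pull the first n events off the stream (for demonstration)."""
    out = []
    async for ev in clock_events(interval=interval):
        out.append(ev)
        if len(out) >= n:
            break
    return out
```

A framework endpoint would simply stream these yielded strings to the client as its SSE response body.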

00:54:48 Also the thing, just for people that aren't used to thinking about this way, especially

00:54:51 if you're doing Python, like all a signal is, is instead of saying that here's, I'm

00:54:56 setting a variable, you're saying I'm setting a relationship that says, like, kind of like

00:55:00 in an Excel document when you set a formula for a cell,

00:55:04 it's the same idea.

00:55:04 You're setting up a relationship saying, when this thing and this thing changes, update this.

00:55:08 And it does smart things to do that efficiently.

00:55:10 But the idea is it's a relationship.

00:55:12 It's declarative.

00:55:13 So kind of like with SQL, you think of SQL as a declarative language, right?

00:55:17 You don't care how it creates an index.

00:55:19 You just say, create index.

00:55:20 Same thing happens here.

00:55:21 You just say, hey, I want when this thing changes, this other thing to change.

00:55:25 And the problem is that declarativeness is not built into JavaScript.

00:55:28 It's not built into the browser, but we just made the web a little bit more declarative.

00:55:32 That's all we did, basically.

00:55:33 Right.

00:55:34 Declarative is generally pretty good.

00:55:36 It's a good way to work.

00:55:37 It keeps things simple and lets the underlying system have at it.

00:55:40 So a couple of things, well, we still got a little bit of time,

00:55:44 but to wrap things up a little bit.

00:55:46 Editors.

00:55:47 I think having good editor support is really important for adoption.

00:55:52 You know, drives me crazy when I go and try to work with JavaScript, CSS,

00:55:56 attributes or whatever, and I'm like, they're not here.

00:55:59 No help.

00:56:00 So you all have nice extensions and plugins for common editors Python people might use, right?

00:56:06 Yeah, we have VS Code, which you're seeing here, and PhpStorm.
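The spreadsheet-formula analogy, a signal as a declared relationship rather than a stored value, can be illustrated with a tiny observer sketch in Python. This is a toy to show the idea, not Datastar's implementation.

```python
class Signal:
    """A value that notifies subscribers when it changes,
    like a spreadsheet cell that other cells depend on."""
    def __init__(self, value):
        self._value = value
        self._subs = []

    @property
    def value(self):
        return self._value

    @value.setter
    def value(self, new):
        self._value = new
        for fn in self._subs:
            fn()  # re-run every declared relationship

    def subscribe(self, fn):
        self._subs.append(fn)

def computed(fn, *deps):
    """Declare a relationship: re-run fn whenever any dependency
    changes, like setting a formula for a cell."""
    out = Signal(fn())
    for dep in deps:
        dep.subscribe(lambda: setattr(out, "value", fn()))
    return out

# Declare price and quantity; total is a formula over both.
price = Signal(10)
qty = Signal(3)
total = computed(lambda: price.value * qty.value, price, qty)

price.value = 20  # total recomputes automatically to 60
```

The point is the declarative shape: you state the relationship once and never imperatively push updates into `total` yourself.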

00:57:12 Or sorry, I use PhpStorm, but all JetBrains editors.

00:57:16 PhpStorm, PyCharm, WebStorm, all of those things.

00:57:19 So it's in the JetBrains marketplace, so it'll work for, I believe, all JetBrains IDEs.

00:57:25 I believe so.

00:57:25 You also have the AI editors covered.

00:57:28 Do we?

00:57:29 In the OpenVSX registry, all the ones that have been kicked out from VS Code, this is where they all have to go to get their installs, right?

00:57:37 That explains why people requested this from me.

00:57:41 Yeah, if you're doing Cursor, Antigravity, Windsurf, like all those things, they were all kicked out of the VS Code registry.

00:57:50 That's not a complaint.

00:57:51 I mean, it's a Microsoft product.

00:57:53 They built it.

00:57:54 They don't have to build all the other ones.

00:57:56 But that's why they're here, right?

00:57:57 We keep those up to date.

00:57:58 We do those ourselves, the SDKs.

00:58:01 I mean, Delaney wrote the Go one, I wrote the PHP one, and the rest are just community contributions.

00:58:07 We've had contributions to these too, to the IDE extensions.

00:58:11 We maintain these primarily.

00:58:13 Yeah, and these are great.

00:58:14 These just, you know, save on typing, but more importantly, save on making typos.

00:58:20 You know, they show you all of the available data attributes.

00:58:23 Maybe Chris can speak more to is that the irony is, though,

00:58:26 you won't need that many tags to actually do your work.

00:58:29 So it's not like a Tailwind thing where you're like,

00:58:31 oh, I rely on it to autocomplete.

00:58:33 It's just...

00:58:33 Yeah, absolutely.

00:58:34 In fact, it's one of those things where I discover more things I can do with Datastar

00:58:38 because as I'm typing data dash and I'm like, oh, I didn't actually remember

00:58:42 that there's an attribute to do whatever it is.

00:58:45 I don't remember.

00:57:46 Like, I can't remember at this point.

00:57:48 And then I went to the documentation like, oh, check this out.

00:57:50 This is so much more I can do.

00:57:52 But yeah, I find I love the plugin, but I find I don't use it too much

00:57:55 just because I'm not writing as much HTML with it.

00:57:58 While I'm sitting here on this OpenVSX registry, do you all have advice for making Datastar work well

00:58:05 with agentic AI and Claude Code, Cursor, et cetera?

00:58:08 There's some active research going on like in Oslo at a college

00:58:13 that's doing, ironically, using Datastar to do some stuff around like how LLMs work with code bases.

00:58:22 And the reason why is because the entire code base fits in basically every context, even the nano ones, like the entire code base fits there.

00:58:28 And what they found, we've gone back and forth a bit, is that almost all of them are completely

00:58:33 overfitted. So if you just want to make a website with agentic stuff, go do React,

00:58:38 because that's what it's built for. And it's overfitted to such a degree that if you try to

00:58:43 use the spec correctly and to say, here's all the source of Datastar, go use it to build websites,

00:58:49 it will fall over almost in every regard.

00:58:51 So it's one of those things where you don't need that much,

00:58:54 but it will ironically show you how bad things like Claude and Codex and stuff

00:58:59 are at just using the current context to solve things.

00:59:03 Hopefully that gets better, but we have something around,

00:59:05 like we have a slash docs page that you can feed into your LLM,

00:59:08 but I'll say that we do not focus on that at all because you're basically fighting against what the training already happened.

00:59:15 So you're better off, like if you want to use, if you want to make better sites,

00:59:19 you want to be fast and efficient and all that stuff,

00:59:21 we're 100% the right thing to do.

00:59:22 If you just want to like one shot something, go use React and stay in that world.

00:59:26 You want to vibe code it?

00:59:28 Hey, I've got something.

00:59:29 I feel like this might resonate with you, Delaney, especially the way you just described it.

00:59:34 Have you all seen the Kai Lintit Senior Engineer Tries Vibe Coding?

00:59:39 This is an amazing video.

00:59:41 And like half of the video is like, no, no, no, not npm install.

00:59:46 What are you doing?

00:59:47 It reminds me very much of like, it's just like, nope, that's not what I told you to do.

00:59:52 I know that's what you think the most common thing is, please stop.

00:59:54 Yeah.

00:59:55 And the thing is that it's not that I actually like a lot of the stuff, but I treat it as

00:59:58 an autocomplete or like it can write code faster than I can when it comes to like, hey, change

01:00:03 this in 27 different places.

01:00:05 And I forget which files I did it like.

01:00:06 There's value to it, but people are trying to use it to learn.

01:00:10 It's completely, actively working against you.

01:00:13 Ben has done an amazing job with the guide.

01:00:15 Like, please, like, it's fine to use the LLMs to help, like, guide your process and to, like,

01:00:20 knock stuff out quickly once you have a baseline.

01:00:22 But you have to know when to say no.

01:00:24 And he has done it.

01:00:25 The guide takes half an hour, less than a half, like, probably 15, 20 minutes to read

01:00:29 and then, like, an hour to actually work through.

01:00:32 Please try it first before you try to throw it at the LLMs.

01:00:35 It's not because I hate them.

01:00:37 It's more that they are just overfit to the, like, the sea of badly written SPA code.

01:00:42 That's, unfortunately, that's the situation we're in.

01:00:45 Yeah, especially with JavaScript, the agentic AI is very trained.

01:00:49 It wants what it wants.

01:00:51 All right.

01:00:52 Let's talk, speaking of Ben and the guide, if I go over here to more, there's a pro section.

01:00:59 I'll let you all give a shout out to pro.

01:01:02 I know you have a really strong sales pitch here.

01:01:04 You were talking about earlier.

01:01:06 Now, what is this Datastar Pro?

01:01:08 It's been about a year since we released the beta one of Datastar.

01:01:14 We are taking our sweet ass time for a very good reason, which is that we want version one to be the last version, or like the last major version.

01:01:23 We don't really want to force people through breaking changes and major updates because that's really just a pain.

01:01:30 And I think like Python has done a great job with that and Go as well.

01:01:36 And like there are some ecosystems where you just don't make breaking changes.

01:01:39 That's the norm. And that's what we want to be.

01:01:41 And the JavaScript ecosystem is, you know, the antithesis to that.

01:01:45 They're like, here, hold my beer.

01:01:47 I'll show you breaking changes.

01:01:48 Yeah.

01:01:48 Have you heard of left-pad?

01:01:49 To give you an idea of how far we take that, we don't have npm.

01:01:53 Like we don't actually even submit to npm.

01:01:55 We have no package.json in our JavaScript framework.

01:01:59 We actually, like, there's none of that stuff.

01:02:02 It does not exist in our ecosystem.

01:02:03 So we take it very seriously when we say it's funny to have a JavaScript framework that

01:02:07 actively hates the JavaScript ecosystem.

01:02:09 And you guys, I think also it's worth pointing out that you don't have a strong build step,

01:02:14 tree shaking, web packing story, right?

01:02:17 You just dropped-

01:02:17 No, we do.

01:02:18 The thing is that, like for example, Vite is kind of the well-known way to do this stuff.

01:02:25 But guess what?

01:02:25 Under the hood, it uses esbuild.

01:02:27 And esbuild is a Go thing.

01:02:29 We build a lot of our stuff in Go.

01:02:30 So it's literally embedded in our, like we just use esbuild directly.

01:02:33 We don't need 20,000 things from npm.

01:02:36 We just use the Go tools inside of our binary because that's the fast thing to do.

01:02:40 So we don't need all of that.

01:02:41 So we have no dependencies, nothing, and we don't even use npm or any of that at all.

01:02:46 Yeah, so the reason I mentioned the beta is during the beta phase, which lasted about six months,

01:02:51 we, Datastar, gained a lot of traction, a lot of interest, and people had a lot of requests.

01:02:56 And we were like, yeah, we see, and because it's plugin-based,

01:03:00 you can always just add another plugin.

01:03:01 You can add it yourself, or we can build a plugin and add it to Datastar.

01:03:05 But we were very adamant about keeping the open source Datastar framework as tight as possible.

01:03:15 Like I said, it should do everything you need, but nothing you don't.

01:03:18 So how do we do that while adding plugins?

01:03:21 So during that beta phase, we started thinking about, well, do we have multiple versions of Datastar?

01:03:26 Do we have a marketplace of plugins?

01:03:28 Or how do we manage that?

01:03:30 And at the same time, we were also asking ourselves, because Delaney and I both, we have full-time things that we're doing.

01:03:35 And this is a side project, but we're almost doing it full time alongside our other full time things.
So how do we make this project sustainable? Because it doesn't stop at Datastar. You probably see Rocket and Stellar CSS on that page in the navigation sidebar. Those are like projects that build on top of Datastar. So Datastar is just the foundation. And Rocket kind of takes it to web components and Stellar CSS is a CSS framework that builds on top of these concepts.
+
+01:04:05 So we're trying to fix not only JavaScript web components, but also CSS.
+
+01:04:10 So we have a long-term vision.
+
+01:04:12 How do we make that sustainable when we're both busy people anyway, and this just takes
+
+01:04:18 so much of our time and the project appears to be growing?
+
+01:04:22 So at that point, we decided, well, how do we want to even run this?
+
+01:04:25 So we decided we don't want to found some company and do VC like we're, if anything,
+
+01:04:31 anti-VC funding.
+
+01:04:32 So we founded a nonprofit organization in the U.S. called Star Federation, and that's what backs this project, including Rocket and Stellar CSS.
+
+01:04:44 To help fund that organization, we decided let's have Datastar be the open source framework, but then something called Datastar Pro, which is like all those plugins that we think are good ideas, but that most people don't need.
+
+01:04:58 We'll put those into Datastar Pro, and that can kind of grow over time.
+
+01:05:01 It's a collection of plugins that you might want if you're using Datastar in a professional setting.
+
+01:05:07 But, you know, if you're just using Datastar, you don't actually need it.
+
+01:05:11 And so that's what we tell people.
+
+01:05:12 Most people don't need it.
+
+01:05:13 It's a collection of plugins and it's a Datastar inspector, which sits on your page.
+
+01:05:19 You get access to the bundler and now you get access to Rocket and Stellar CSS, which is a work in progress.
+
+01:05:26 Yeah, that was, I think, a good decision.
+
+01:05:28 Like there was definitely some uproar initially that, you know, some plugins were taken away, but those plugins were never taken away.
+
+01:05:35 They still exist in the repo if anybody needs them.
+
+01:05:38 What the result is, is that we have like some money coming into a bank account, which is not even used to pay maintainers.
+
+01:05:46 We use that for running costs and like for, you know, podcasting software.
+
+01:05:53 And if we need to travel to conferences, which we've yet to do.
+
+01:05:56 But essentially, it's like a way of having some money into the bank so that we can justify all of the work that we do in maintaining Datastar and pushing that forward.
+
+01:06:06 But the V1 thing.
+
+01:06:07 100% free sounds great until that means it becomes abandoned where, you know, and like people can't work on it anymore.
+
+01:06:14 And I think it's fair.
+
+01:06:16 There's one thing that's kind of interesting about the model, because especially with the Tailwind stuff that's been going on lately.
+
+01:06:21 One of the things that we talked about, and people get very angry about this, but for example,
+
+01:06:26 Rocket is a web component layer that you basically just write Datastar in a declarative way,
+
+01:06:31 and it dynamically generates web components for you on the fly.
+
+01:06:34 And it's a great way to build web components.
+
+01:06:36 It'll save you tons of hours.
+
+01:06:38 And people won't pay for features.
+
+01:06:39 They pay for convenience.
+
+01:06:40 So the thing is, people said, well, I want you to generate out the content and make that
+
+01:06:46 open and available.
+
+01:06:46 And we said no, because basically the way we look at it is that almost like PICO-8,
+
+01:06:51 or any kind of game engine, you pay for the game and then all the mods are free.
+
+01:06:55 So all the Rocket components and all this stuff is gonna be free, but the core engine is not free.
+
+01:07:01 It's a paid thing.
+
+01:07:02 And the reason why is if it becomes successful, if we do our job and we make it so it's easy for everybody,
+
+01:07:07 the Star Federation will do better over time.
+
+01:07:10 Whereas Tailwind's model of they're competing against every other person in that space,
+
+01:07:15 whereas it just does not work.
+
+01:07:16 So our thing is if we do get successful and we do get more people,
+
+01:07:20 then it's self-sustaining as in you paid for this little engine
+
+01:07:24 and now you get all the ecosystem around it of open source.
+
+01:07:27 So you can do open source in a way, but you have to fund a core engine that is not open source.
+
+01:07:32 Otherwise it will fail in the modern world.
+
+01:07:34 Let's close this thing out with two super quick things
+
+01:07:37 because I know we're over time.
+
+01:07:38 Roadmap.
+
+01:07:39 Ben, you talked about taking your sweet time to 1.0.
+
+01:07:42 Is there a forward-looking roadmap?
+
+01:07:44 Are you guys done or what are things?
+
+01:07:46 We released RC1, I think it was about six months ago,
+
+01:07:51 and like the releases have just been slowing down, slowing down.
+
+01:07:54 So that stagnation in like just releases with fixes is a good sign to me that we're very, very close.
+
+01:08:00 At this point, like the switch from release candidate to stable
+
+01:08:05 is just literally like just, you know, dropping the RC.
+
+01:08:09 There's no like features that are going into it.
+
+01:08:11 There's no big changes.
+
+01:08:13 We're taking our time because like I said, it's easy to put something up slightly prematurely
+
+01:08:18 and get some defaults wrong.
+
+01:08:20 I mean, that's what happened with HTMX 2 was just like fixing some defaults that they decided they got wrong in version one.
+
+01:08:27 So we're trying to avoid a situation like that.
+
+01:08:30 And the only way to do that is to just let it simmer, let people use it, let people dog food it.
+
+01:08:35 And us, too, we're actively, for many projects, using Datastar ourselves and discovering every now and then, oh, this default is probably... we're trying to avoid foot guns.
+
+01:08:46 So we're trying to make it so that the defaults give you the best possible experience that you need zero configuration, ideally, but you can configure as needed.
+
+01:08:55 But getting those defaults right is really the only thing stopping us from, not right, but locked down is the only thing stopping us from a V1 stable.
+
+01:09:05 I don't like to give timelines.
+
+01:09:07 In fact, it's one of our things that I tell Delaney, never promise a timeline.
+
+01:09:12 But I could see us in the first half of this year, just flipping the switch.
+
+01:09:18 But it sounds like you might be able to use the RC and you might more or less be safe, yeah.
+
+01:09:22 In fact, we recommend people rename the RC and they change the name to React-Foo
+
+01:09:28 so that they just drop it in their React projects because the entire framework is smaller than most components.
+
+01:09:34 Just start hiding it places.
+
+01:09:36 Yeah, don't even name it Datastar.
+
+01:09:38 A stealth takeover of the SPA world.
+
+01:09:41 Awesome. I love it.
+
+01:09:42 All right, let's wrap up the show with a final call to action
+
+01:09:46 for people who want to use Datastar, learn more, get started.
+
+01:09:50 Chris, I'll let you go first so Ben and Delaney can have the final word.
+
+01:09:54 The first thing I was thinking of is because I get asked so much about
+
+01:09:59 how long it takes to connect to the server and things like that,
+
+01:10:03 there is a portion in the DjangoCon talk I gave in, I think it was 2023,
+
+01:10:08 where I showed a video of five phones, five Android phones, trying to do the same thing,
+
+01:10:14 shopping for eggs, I believe it was.
And essentially one of them is an HTML-driven
+
+01:10:20 multi-page app and it smokes the single page applications, the native apps and everything.
+
+01:10:25 And so I put a link in our chat. Maybe it'll be a part of the show notes. It's a deep link to go
+
+01:10:30 straight to that portion of the talk because it is like that video reminded me like, this is what
+
+01:10:36 I want to build. I want to build websites that are fun for people to use. And, you know, Datastar
+
+01:10:43 enables me to use real-time interactions with way less complexity than I ever thought
+
+01:10:50 possible. So I guess the two things I would say is, one, check out the deep link if you're at all
+
+01:10:55 interested. And number two, definitely try out something, you know, just even clone the Python
+
+01:11:00 repo and just try some of the examples and see what it's like. Yeah, I'll definitely link to that.
+
+01:11:05 Cool. Thanks, Ben.
+
+01:11:06 I also gave a conference talk last year.
+
+01:11:08 There's a recording, so I'll send you the link to that, which really walks through my journey of Datastar
+
+01:11:13 and how Datastar has truly opened my eyes to what's possible.
+
+01:11:18 I feel like I talk a lot about how Datastar is a journey of unlearning
+
+01:11:23 old and bad patterns, deeply rooted ones in what I think web development is.
+
+01:11:29 And these days, as I mentioned, I never would have thought that I'd be developing in Go,
+
+01:11:35 but I see like all like the, like I think even Python
+
+01:11:39 is getting better concurrency support, right?
+
+01:11:41 So I think you talked about that recently, Michael, here.
+
+01:11:44 So now I'm seeing with Datastar, I can do so much more on the backend.
+
+01:11:49 I can be so much more creative on the backend and that's what interests me.
+
+01:11:53 So it's just fun.
+
+01:11:55 What can I say?
+
+01:11:56 It's fun.
+
+01:11:57 I'm still really jealous of that presentation too.
+
+01:11:59 Well done with it.
+
+01:12:00 Yeah, certainly send me the link.
+
+01:12:01 I'll put it in the show notes and well done.
+
+01:12:03 Delaney.
+
+01:12:04 The irony is that like, I don't consider myself a web dev at all.
+
+01:12:07 It just happens to be something I do a little bit of.
+
+01:12:09 The thing that is the, when I first started making this public, I was like, Hey, I think
+
+01:12:14 I'm onto something, like someone prove me wrong.
+
+01:12:16 I was, I'm a little bit more like kind of, I always say like in the jujitsu world type
+
+01:12:21 stuff, like you want someone to roll with you, not because you're trying to up them.
+
+01:12:24 It's that like, they're trying to help you find weaknesses in your game.
+
+01:12:27 Right.
+
+01:12:27 So I want there to be an active, like someone prove me wrong.
+
+01:12:31 And I'm at the point now where I feel so confident.
+
+01:12:34 I will put money on it.
+
+01:12:35 I've tried going out to people out in the dev Twitter and all that.
+
+01:12:38 I guarantee you, and I'm happy to put money up on this,
+
+01:12:41 if you could do it the Datastar way, whether you're using React or HTMX or any other approach,
+
+01:12:46 it will be less code.
+
+01:12:47 It'll be faster.
+
+01:12:48 It'll be cheaper.
+
+01:12:49 And it'll be simpler to understand.
+
+01:12:52 I will take up anybody anywhere on that thing.
+
+01:12:55 Basically, it's not a boast.
+
+01:12:56 It's just the facts on the table.
+
+01:12:59 And it's a paradigm shift that I want the world to know about
+
+01:13:02 just so that people understand, hey, there's going to be someone
+
+01:13:05 that comes up with something better than I did, right?
+
+01:13:06 Like I'm standing, the reason why we have the fastest signal library in the world
+
+01:13:10 is because we listen to the people that are really good at that.
+
+01:13:12 We use Alien Signals.
+
+01:13:13 The reason why we have the fastest morphing library is that we listen to people and said,
+
+01:13:16 hey, there's people that care about this stuff and are working towards it.
+
+01:13:20 It's not that there's anything special here.
+
+01:13:22 It's that it's trying to build an ecosystem of like people that care about performance
+
+01:13:25 and people care about the details.
+
+01:13:27 And if you do that, then everything gets better.
+
+01:13:30 So it's not just where are we at now, but if anyone thinks they can do better,
+
+01:13:34 please join us.
+
+01:13:34 We want to hear it.
+
+01:13:35 But like, I'm done having the vibe code, or not the vibe code,
+
+01:13:39 but like the vibes around like, well, this doesn't feel like a SPA
+
+01:13:42 or like a SPA has its place.
+
+01:13:44 A couple of episodes ago, there was a Cody from the Litestar stuff said,
+
+01:13:48 there's a time and place for HTMX or Datastar.
+
+01:13:50 And he's just, that's just not true.
+
+01:13:52 It's just like hypermedia is the way to build things
+
+01:13:55 for a hypermedia client, which is the browser.
+
+01:13:57 So I will, anyone that can show that it's wrong, please let us know.
+
+01:14:01 Like we're here to help.
+
+01:14:02 I would love to see this paired with some Electron JS apps
+
+01:14:06 to make your desktop apps a little better too.
+
+01:14:08 So anyway.
+
+01:14:09 Seriously, oh my God.
+
+01:14:10 That's a different episode.
+
+01:14:11 So I just want to say thank you, Delaney, Ben and Chris.
+
+01:14:15 Thank you all for being here.
+
+01:14:16 And congrats on Datastar.
+
+01:14:17 It looks like a super project.
+
+01:14:18 I'm looking at some projects or an app running right over there
+
+01:14:23 on my left that I kind of wish I'd built with Datastar.
+
+01:14:25 Well, and the thing is to make sure to not think that it's just used, like, yes, we talk
+
+01:14:29 about the real-time stuff, but it's better for even just normal CRUD stuff.
+
+01:14:32 And that's kind of hard to, like, it's not as sexy to talk about, but it's better at
+
+01:14:36 that too.
+
+01:14:36 Well, it's also 80% of what gets built.
+
+01:14:38 So it's important to like point it out, right?
+
+01:14:40 Yeah.
+
+01:14:41 All right.
+
+01:14:41 Bye everyone.
+
+01:14:42 Thanks for being here.
+
+01:14:42 Thank you.
+
+01:14:44 This has been another episode of Talk Python To Me.
+
+01:14:46 Thank you to our sponsors.
+
+01:14:47 Be sure to check out what they're offering.
+
+01:14:49 It really helps support the show.
+
+01:14:50 Take some stress out of your life.
+
+01:14:52 Get notified immediately about errors and performance issues in your web or mobile applications with Sentry.
+
+01:14:58 Just visit talkpython.fm/sentry and get started for free.
+
+01:15:03 Be sure to use our code talkpython26.
+
+01:15:06 That's Talk Python, the numbers two, six, all one word.
+
+01:15:11 This episode is brought to you by CommandBook, a native macOS app that I built that gives long-running terminal commands a permanent home.
+
+01:15:18 No more juggling six terminal tabs every morning.
+
+01:15:21 Carefully craft a command once, run it forever with auto restart,
+
+01:15:24 URL detection, and a full CLI.
+
+01:15:26 Download it for free at talkpython.fm/commandbookapp.
+
+01:15:30 If you or your team needs to learn Python, we have over 270 hours of beginner and advanced courses
+
+01:15:35 on topics ranging from complete beginners to async code,
+
+01:15:39 Flask, Django, HTMX, and even LLMs.
+
+01:15:42 Best of all, there's no subscription in sight.
+
+01:15:45 Browse the catalog at talkpython.fm.
+
+01:15:47 And if you're not already subscribed to the show on your favorite podcast player,
+
+01:15:51 what are you waiting for?
+
+01:15:53 Just search for Python in your podcast player.
+
+01:15:55 We should be right at the top.
+
+01:15:56 If you enjoyed that geeky rap song, you can download the full track.
+
+01:15:59 The link is actually in your podcast player show notes.
+
+01:16:02 This is your host, Michael Kennedy.
+
+01:16:03 Thank you so much for listening.
+
+01:16:05 I really appreciate it.
+
+01:16:06 I'll see you next time.
+
+01:16:18 I'm out.
+
diff --git a/transcripts/537-datastar.vtt b/transcripts/537-datastar.vtt
new file mode 100644
index 0000000..c21f5d7
--- /dev/null
+++ b/transcripts/537-datastar.vtt
@@ -0,0 +1,4586 @@
+WEBVTT
+
+00:00:00.020 --> 00:00:05.060
+You love building web apps with Python, and HTMX got you excited about the hypermedia approach.
+
+00:00:05.720 --> 00:00:10.220
+Let the server drive the HTML, skip the JavaScript build step, keep things simple, right?
+
+00:00:11.020 --> 00:00:16.780
+But then you hit that last 10%. You need AlpineJS for interactivity, or your state gets out of sync,
+
+00:00:16.810 --> 00:00:20.780
+and suddenly you're juggling two unrelated libraries that weren't really designed to work
+
+00:00:20.960 --> 00:00:26.660
+together. What if there was a single 11-kilobyte framework that gave you everything HTMX and
+
+00:00:26.680 --> 00:00:31.760
+AlpineJS did, and more with real-time updates, multiplayer collaboration out of the box,
+
+00:00:31.760 --> 00:00:36.480
+and performance so fast, you're actually bottlenecked by your monitor's refresh rate.
+
+00:00:37.120 --> 00:00:37.840
+That's Datastar.
+
+00:00:38.420 --> 00:00:43.240
+On this episode, I sit down with its creator, Delaney Gillilan, core maintainer, Ben Croker,
+
+00:00:43.780 --> 00:00:50.320
+and Datastar convert, Chris May, to help explore how this backend-driven, server-sent event-first
+
+00:00:50.720 --> 00:00:55.160
+framework is changing the way full-stack developers think about the modern web.
+
+00:00:55.820 --> 00:01:01.900
+This is Talk Python To Me, episode 537, recorded January 15th, 2026.
+
+00:01:19.600 --> 00:01:24.360
+Welcome to Talk Python To Me, the number one Python podcast for developers and data scientists.
+
+00:01:24.880 --> 00:01:26.280
+This is your host, Michael Kennedy.
+
+00:01:26.720 --> 00:01:30.200
+I'm a PSF fellow who's been coding for over 25 years.
+
+00:01:30.860 --> 00:01:31.960
+Let's connect on social media.
+
+00:01:32.280 --> 00:01:35.440
+You'll find me and Talk Python on Mastodon, BlueSky, and X.
+
+00:01:35.610 --> 00:01:37.560
+The social links are all in your show notes.
+
+00:01:38.320 --> 00:01:41.840
+You can find over 10 years of past episodes at talkpython.fm.
+
+00:01:42.020 --> 00:01:45.260
+And if you want to be part of the show, you can join our recording live streams.
+
+00:01:45.560 --> 00:01:46.040
+That's right.
+
+00:01:46.250 --> 00:01:49.500
+We live stream the raw uncut version of each episode on YouTube.
+
+00:01:50.060 --> 00:01:54.500
+Just visit talkpython.fm/youtube to see the schedule of upcoming events.
+
+00:01:54.730 --> 00:01:58.380
+Be sure to subscribe there and press the bell so you'll get notified anytime we're recording.
+
+00:01:59.380 --> 00:02:01.000
+This episode is brought to you by Sentry.
+
+00:02:01.380 --> 00:02:02.620
+Don't let those errors go unnoticed.
+
+00:02:02.900 --> 00:02:04.420
+Use Sentry like we do here at Talk Python.
+
+00:02:04.920 --> 00:02:07.780
+Sign up at talkpython.fm/sentry.
+
+00:02:08.300 --> 00:02:13.600
+And it's brought to you by CommandBook, a native macOS app that I built that gives long-running
+
+00:02:13.800 --> 00:02:15.240
+terminal commands a permanent home.
+
+00:02:15.640 --> 00:02:17.640
+No more juggling six terminal tabs every morning.
+
+00:02:18.020 --> 00:02:22.900
+Carefully craft a command once, run it forever with auto restart, URL detection, and a full CLI.
+
+00:02:23.260 --> 00:02:26.380
+Download it for free at talkpython.fm/commandbookapp.
+
+00:02:27.180 --> 00:02:29.900
+Ben, Delaney, Chris, welcome to you all.
+
+00:02:30.380 --> 00:02:31.560
+Thanks for being here on Talk Python To Me.
+
+00:02:31.680 --> 00:02:32.120
+Thanks for having us.
+
+00:02:32.320 --> 00:02:33.020
+Hey, how are you doing?
+
+00:02:33.060 --> 00:02:34.060
+Doing well, doing well.
+
+00:02:34.280 --> 00:02:42.040
+Very excited to talk about Datastar and some cool web frameworks for Python people and beyond, of course.
+
+00:02:42.220 --> 00:02:44.940
+But, you know, most people listening doing Python web frameworks.
+
+00:02:45.140 --> 00:02:46.980
+So talk about how that all integrates.
+
+00:02:47.160 --> 00:02:52.400
+And if you like the HTMX vibe, which we've talked a lot about on the show, I think there's
+
+00:02:52.500 --> 00:02:54.140
+going to be a lot to like here as well.
+
+00:02:54.190 --> 00:02:54.960
+And maybe more.
+
+00:02:55.140 --> 00:02:55.420
+We'll see.
+
+00:02:55.530 --> 00:02:56.860
+A case to be made.
+
+00:02:57.380 --> 00:03:02.280
+But, you know, before we get into all of that, though, let's just talk about a quick introduction
+
+00:03:02.660 --> 00:03:06.840
+for everyone here and like go around the squares. Ben, I'll let you go first.
+
+00:03:07.300 --> 00:03:07.760
+Who are you, Ben?
+
+00:03:08.320 --> 00:03:09.920
+Based in Costa Rica at the moment.
+
+00:03:10.260 --> 00:03:14.140
+I'm based in Europe most of the year, but half of the year my wife and I spend here.
+
+00:03:14.640 --> 00:03:26.280
+In terms of background, I've been primarily working with PHP for well over 20 years and got involved with Delaney and Datastar, been a core maintainer on that project ever since.
+
+00:03:26.840 --> 00:03:37.320
+And I looked at my commit history for last year, and it turns out now I write more Go code than PHP, so I don't want to call myself a PHP developer anymore.
+
+00:03:37.320 --> 00:03:44.220
+I'm just a web developer, a backend web developer, primarily that also writes TypeScript and maintains a front end
+
+00:03:44.560 --> 00:03:49.120
+framework. There's a lot of stuff going on and ways in which you can write code for the web these days.
+ +00:03:50.420 --> 00:03:52.900 +Well, thanks. Awesome to have you here. Delaney, hello. + +00:03:53.060 --> 00:03:58.180 +Hi, how you doing? Yeah, I have kind of a weird checkered background into web development. I was + +00:03:58.380 --> 00:04:04.280 +originally in the circus, then I became a 3D artist, then I became an engineer. I've worked in + +00:04:04.780 --> 00:04:10.720 +games, video games, slot machines, military applications, all kinds of crazy things. + +00:04:11.140 --> 00:04:13.880 +I tend to work on really highly optimized, fast things. + +00:04:14.480 --> 00:04:16.459 +I love the ideas of the web, + +00:04:16.820 --> 00:04:20.600 +but I got really tired of how you actually implement things in that. + +00:04:20.600 --> 00:04:24.420 +And I was doing very large applications with millions of updates a second. + +00:04:24.900 --> 00:04:27.120 +And the tools that were out there just weren't good enough. + +00:04:27.320 --> 00:04:29.580 +So I ended up going down many, many rabbit holes + +00:04:30.060 --> 00:04:32.920 +and finally found something to make it better for everybody else. + +00:04:33.470 --> 00:04:34.400 +So yeah, that's really cool. + +00:04:34.480 --> 00:04:36.200 +And wow, what a really interesting issue. + +00:04:36.300 --> 00:04:38.120 +I know you got some crazy stories. + +00:04:38.420 --> 00:04:38.920 +Yes, I do. + +00:04:39.080 --> 00:04:42.400 +I always have a funny, weird outcome of something. + +00:04:42.820 --> 00:04:44.860 +Ironically, people talk about things being a circus, + +00:04:45.140 --> 00:04:47.500 +but like circuses are very well run logistic machines + +00:04:48.100 --> 00:04:50.020 +compared to most developer situations. + +00:04:50.090 --> 00:04:50.700 +So it's kind of funny. + +00:04:50.800 --> 00:04:52.060 +Yeah, it's an insult to circuses. + +00:04:52.140 --> 00:04:52.640 +Yes, it is. + +00:04:52.640 --> 00:04:53.260 +It really is. + +00:04:54.220 --> 00:04:54.560 +Amazing. + +00:04:55.040 --> 00:04:55.200 +Okay. 
+
+00:04:55.540 --> 00:04:57.980
+And what we're going to talk about,
+
+00:04:58.100 --> 00:05:02.220
+Datastar has this amazing ability to update many things
+
+00:05:02.570 --> 00:05:05.020
+really quickly in real time, which we'll get into,
+
+00:05:05.260 --> 00:05:06.980
+but yeah, sort of foreshadowing there.
+
+00:05:07.160 --> 00:05:09.820
+And Chris May, welcome to the show.
+
+00:05:10.380 --> 00:05:12.600
+I've known you for a long time and I'm really happy to have you here.
+
+00:05:12.740 --> 00:05:13.980
+Great to be here. Thank you so much.
+
+00:05:14.180 --> 00:05:14.380
+Yeah.
+
+00:05:14.520 --> 00:05:25.120
+Yeah. So about me, I started writing websites back in 1995 and then picked up Python about 10 or so years later and just have really enjoyed the ride since then.
+
+00:05:25.700 --> 00:05:30.220
+Along the way, I became a technical coach and just loved making single page applications.
+
+00:05:30.560 --> 00:05:32.220
+I loved, I just love the web.
+
+00:05:32.400 --> 00:05:36.340
+You know, I love that we can publish something from our computer and anybody around the world can see it.
+
+00:05:36.620 --> 00:05:40.800
+And then what, maybe a little over a year ago, I, oh no, it was more than that.
+
+00:05:40.860 --> 00:05:46.300
+I remember I was on a trip and I was listening to a podcast of HXPod, the HTMX podcast,
+
+00:05:46.820 --> 00:05:50.220
+and heard about this crazy, cool tool, Datastar.
+
+00:05:50.220 --> 00:05:54.520
+And I was like, I even put in my DjangoCon presentation, like you should, everybody else
+
+00:05:54.640 --> 00:05:55.120
+should try it out.
+
+00:05:55.150 --> 00:05:56.820
+And finally I did and I'm converted.
+
+00:05:56.910 --> 00:06:00.200
+So I'm excited that the three of us get to talk about it.
+
+00:06:00.240 --> 00:06:05.919
+The reason that we're having this podcast is because I read your article about switching
+
+00:06:05.940 --> 00:06:10.760
+to Datastar. And I'm like, okay, this is interesting. You made the case very well. Of
+
+00:06:10.760 --> 00:06:18.000
+course, I'll link to the article. And so I thought, hey, I need to have Chris here as my Tony Romo to
+
+00:06:18.120 --> 00:06:24.520
+my Al Michaels or Nico Rosberg to my Crofty or whatever, right? So I'm happy to have you here.
+
+00:06:24.760 --> 00:06:30.380
+Exactly. Awesome to have you here. So let's just start with what is Datastar, right? I mean,
+
+00:06:30.420 --> 00:06:37.220
+we've hinted that it has some similarities to HTMX, but also not. So Ben and Delaney, give us the
+
+00:06:37.430 --> 00:06:41.920
+overview. What is Datastar? So I can give a little bit of history, and then Ben's probably better at
+
+00:06:42.140 --> 00:06:47.720
+saying what it is now. I have a background in like low-level stuff. Even though I was a 3D artist
+
+00:06:47.900 --> 00:06:52.660
+first, I'm much more comfortable in like shader development and that kind of thing, like, so GLSL,
+
+00:06:52.770 --> 00:06:57.939
+web thing. Like, I'm a C guy that knows some other things. But the thing is that I was working on
+
+00:06:57.960 --> 00:07:03.140
+some military applications where I needed really fast updates of a browser. And the reason why you
+
+00:07:03.300 --> 00:07:09.240
+in this military situation is that getting things approved is really hard, like executables to go
+
+00:07:09.400 --> 00:07:13.060
+into deployment. But having a browser means that you have this nice little sandbox that things can
+
+00:07:13.070 --> 00:07:17.160
+go in. So it's actually more of a deployment platform in my background than, you know, just
+
+00:07:17.160 --> 00:07:21.700
+the regular web. But I was doing things that were pushing the browser really, really far. I was using
+
+00:07:21.840 --> 00:07:26.300
+Vue and SPA.
And I basically was like, well, these are the smartest people out here, but it's not
+
+00:07:26.320 --> 00:07:31.160
+fast enough. So I was using crazy WebSocket stuff, all this binary stuff. And then I tried doing,
+
+00:07:31.620 --> 00:07:36.440
+you had someone on last week talking about LiveView and like they have a Python version of that. I went
+
+00:07:37.180 --> 00:07:40.820
+hard in making a binary version of that, like going down to the protocol level, changing,
+
+00:07:41.180 --> 00:07:44.600
+optimizing that 10 different ways. I had an entire framework for doing this. And basically,
+
+00:07:45.030 --> 00:07:48.840
+in my opinion, that's a complete dead end. It is untenable. We can go into the reasons why,
+
+00:07:49.080 --> 00:07:53.319
+but the thing is, long story short, I ended up seeing what was happening in HTMX in the hyper
+
+00:07:53.340 --> 00:07:58.060
+media space. And I completely discounted all of that because I said, like, I'm doing low level
+
+00:07:58.200 --> 00:08:02.420
+binary stuff. There's no way this other approach can be faster. And then my thing is always check
+
+00:08:02.430 --> 00:08:07.000
+the metrics, always don't take your assumptions and do the work. And the thing is, there's things
+
+00:08:07.070 --> 00:08:11.080
+that are wrong in the implementation, but there's things that are 100% right in the overall ideas of
+
+00:08:11.360 --> 00:08:15.520
+how to use that. So I went and I took a year and a half of work and just threw it in the trash
+
+00:08:16.100 --> 00:08:21.839
+and said, okay, I'm starting over, and like ended up doing some basic things we'll probably get into,
+
+00:08:22.160 --> 00:08:27.120
+and ended up with this thing that is a backend-agnostic backend framework that has a 10
+
+00:08:27.280 --> 00:08:33.140
+kilobyte shim that is the fastest, smallest thing out there by orders of magnitude. So it's not just
+
+00:08:33.190 --> 00:08:37.039
+a slightly different thing.
It is literally a different paradigm shift. It's a crazy shift.
+
+00:08:37.070 --> 00:08:41.940
+So the difference between React to something like HTMX is different from HTMX to the Data
+
+00:08:42.080 --> 00:08:46.620
+star way. So I'll let Ben actually explain what that is. But the thing is from a low-level C
+
+00:08:46.680 --> 00:08:51.120
+guy's point of view, it is one of the fastest things in your stack now, which is crazy to think
+
+00:08:51.120 --> 00:08:52.940
+Yeah, like a 10K shim can do that.
+
+00:08:53.000 --> 00:08:53.600
+That's incredible.
+
+00:08:54.180 --> 00:08:58.340
+And also, it sounds like your advice comes from somebody who's done a lot of profiling.
+
+00:08:59.120 --> 00:08:59.760
+Very much so.
+
+00:08:59.880 --> 00:09:00.660
+Like, that's the only thing.
+
+00:09:00.760 --> 00:09:02.900
+You got to measure, not guess.
+
+00:09:03.160 --> 00:09:03.260
+Yeah.
+
+00:09:03.370 --> 00:09:06.940
+In fact, there's funny things that we've had things on Twitter fighting with people and
+
+00:09:06.940 --> 00:09:08.880
+they're like, oh, this one situation was really slow.
+
+00:09:09.350 --> 00:09:14.420
+We actually looked at their flame graphs and it was a bug in the Safari GPU stuff.
+
+00:09:14.770 --> 00:09:17.500
+Because we were actually at the level where the JavaScript doesn't even show up.
+
+00:09:17.580 --> 00:09:20.580
+It's actually a GPU issue of it rendering fast stuff.
+
+00:09:20.940 --> 00:09:22.640
+in the browser, nothing to do with the JavaScript.
+
+00:09:23.030 --> 00:09:25.320
+Because the fastest JavaScript you can write is no JavaScript.
+
+00:09:25.860 --> 00:09:28.140
+So we really lean into what the browser can already do.
+
+00:09:28.540 --> 00:09:30.220
+And we're just making it so that that's easy to do
+
+00:09:30.270 --> 00:09:32.380
+so that the average person with the average website
+
+00:09:32.960 --> 00:09:34.540
+doesn't have to write any JavaScript at all.
+
+00:09:34.810 --> 00:09:36.320
+And they get to be a full stack developer
+
+00:09:36.370 --> 00:09:37.340
+in whatever language they choose.
+
+00:09:37.780 --> 00:09:39.980
+And I'll let everyone else talk from there.
+
+00:09:40.040 --> 00:09:40.760
+Awesome. Ben?
+
+00:09:41.020 --> 00:09:43.560
+Yeah, my version is going to be quite different to Delaney's
+
+00:09:43.800 --> 00:09:45.800
+because we care about different things.
+
+00:09:45.980 --> 00:09:48.260
+Fortunately, we do care about some of the same things.
+
+00:09:49.000 --> 00:09:51.620
+We work well together because I think we complement each other.
+
+00:09:52.270 --> 00:09:57.280
+But coming from a PHP background, I want the backend to be driving the front end.
+
+00:09:57.290 --> 00:09:58.640
+And it naturally does, right?
+
+00:09:58.840 --> 00:10:02.580
+Because even your HTML is being produced by your backend.
+
+00:10:02.610 --> 00:10:04.280
+And that's what's being served to the front end.
+
+00:10:05.100 --> 00:10:08.420
+I describe Datastar as a hypermedia framework.
+
+00:10:09.240 --> 00:10:16.660
+And some people get tripped up on what hypermedia is, but it's essentially hypertext with other media like images and CSS and that kind of thing.
+
+00:10:16.740 --> 00:10:21.440
+And everybody should know what hypertext is because it's the H in HTTP and HTML.
+
+00:10:22.060 --> 00:10:33.840
+There is an expectation for people coming into Datastar that you have a basic understanding of the web and web browsers and the web browser API because we lean as heavily as possible on the browser API.
+
+00:10:34.340 --> 00:10:39.060
+We get a lot of people coming into the Discord asking us, you know, how should I do this the Datastar way?
+
+00:10:39.080 --> 00:10:44.860
+And it got to the point where I'd heard that question so often I decided, OK, I'm going to write a page in the Datastar docs.
+
+00:10:45.280 --> 00:10:46.940
+We call it the Tao of Datastar.
+
+00:10:47.110 --> 00:10:49.160
+So it's kind of like the way of Datastar.
+
+00:10:49.520 --> 00:10:54.300
+And if there's one thing to take from that, it's use as little Datastar as possible.
+
+00:10:54.720 --> 00:10:58.100
+Like leverage the browser, because the browser is an incredible thing, right?
+
+00:10:58.100 --> 00:11:01.920
+Like it's basically an operating system, our operating system as web developers.
+
+00:11:02.580 --> 00:11:05.500
+And everything happens at the C level, super optimized.
+
+00:11:06.100 --> 00:11:08.120
+We're not going to be able to build something faster.
+
+00:11:08.400 --> 00:11:11.480
+So leverage the browser as much as possible and the browser APIs.
+
+00:11:12.280 --> 00:11:19.200
+And where HTML kind of lacks or where there are some gaps, that's essentially what Datastar is trying to fill.
+
+00:11:19.680 --> 00:11:21.380
+So I did a lot of work.
+
+00:11:21.580 --> 00:11:26.760
+So just to relate this, I guess, to something that other people might be familiar with, which is HTMX.
+
+00:11:27.380 --> 00:11:33.440
+I was an early contributor to HTMX, actually, and I was sold on the idea of hypermedia from the very beginning.
+
+00:11:33.800 --> 00:11:35.740
+So HTML is the language of the web.
+
+00:11:36.100 --> 00:11:38.340
+Why are we trying to replace it with JavaScript?
+
+00:11:39.000 --> 00:11:46.740
+And the problem that I ran into after several years of thinking HTMX is all I need is that last 10%, right?
+
+00:11:46.910 --> 00:11:49.980
+Because it'll get you 90% of what you're trying to do.
+
+00:11:50.290 --> 00:11:56.280
+But that last 10%, which we all know is the hardest piece that takes the most work, just isn't covered.
+ +00:11:56.680 --> 00:12:02.059 +So with HTMX, for example, you will very often reach for another library like AlpineJS, + +00:12:02.760 --> 00:12:09.300 +or you'll start writing vanilla JS perhaps to fill in those gaps to interactivity to the page, + +00:12:09.390 --> 00:12:12.860 +because HTMX is really just going to the back end, replacing the DOM. + +00:12:13.270 --> 00:12:15.240 +But now you have two dependencies. + +00:12:15.820 --> 00:12:19.920 +Now you have HTMX and Alpine, for example, and you're trying to make those play well together. + +00:12:20.420 --> 00:12:24.880 +And because I think that might be a little bit of the missing sauce from HTMX. + +00:12:24.980 --> 00:12:28.200 +I've had Carson Gross on and I really admire HTMX. + +00:12:28.700 --> 00:12:34.920 +But as I've worked with it over a couple of years, I feel like it's really good as salt or seasoning, + +00:12:35.020 --> 00:12:38.200 +something you sprinkle on to really make a website better. + +00:12:38.660 --> 00:12:41.300 +But if you try to make a meal out of salt, you're not going to want to eat it. + +00:12:41.480 --> 00:12:46.940 +And what I mean is, you have three different disjointed parts of the page, + +00:12:47.030 --> 00:12:51.280 +and you're like, this is so amazing to update this with HTML and partials, and so is that. + +00:12:51.980 --> 00:12:55.940 +But then you start talking about AlpineJS and connecting different things, + +00:12:56.140 --> 00:12:59.340 +and then the JavaScript gets out of sync with this server response. + +00:12:59.590 --> 00:13:02.880 +And it just, you start to feel constrained by it. + +00:13:03.260 --> 00:13:05.400 +And I think you all have a really nice solution. + +00:13:05.560 --> 00:13:08.560 +It's something a little bit like how you, we're going to talk about it, + +00:13:08.560 --> 00:13:13.040 +but sort of how you specify the HTML to be updated by the server, + +00:13:13.580 --> 00:13:16.380 +but then also connecting different parts of the pages. 
+
+00:13:16.940 --> 00:13:20.900
+Chris put it in his article that like the problem is AlpineJS and HTMX
+
+00:13:20.930 --> 00:13:24.220
+are just two unrelated different things that happen to go together a lot.
+
+00:13:24.280 --> 00:13:26.600
+And so they're not cohesive in a sense, right?
+
+00:13:26.760 --> 00:13:28.540
+Well, and that's one thing that's definitely an issue.
+
+00:13:28.880 --> 00:13:30.520
+Like, for example, this was my thing
+
+00:13:30.680 --> 00:13:33.500
+because I actually tried to fix HTMX back in the day.
+
+00:13:33.760 --> 00:13:35.040
+And like the things that I wanted to fix
+
+00:13:35.440 --> 00:13:37.280
+were the problem that I see at least
+
+00:13:37.400 --> 00:13:39.060
+is that you have HTMX, you can add,
+
+00:13:39.200 --> 00:13:41.340
+it has extensions, so you can add stuff to it.
+
+00:13:41.700 --> 00:13:43.900
+But it fundamentally was built to be like,
+
+00:13:44.200 --> 00:13:45.660
+here's our way of doing it.
+
+00:13:45.920 --> 00:13:47.540
+And then you can do your own stuff on top of it.
+
+00:13:47.740 --> 00:13:51.040
+The problem is that I thought that's broken.
+
+00:13:51.260 --> 00:13:52.180
+I've done enough game development
+
+00:13:52.240 --> 00:13:53.200
+to know that you need to be agile.
+
+00:13:53.300 --> 00:13:55.100
+I need to be able to, like, move quickly.
+
+00:13:55.560 --> 00:14:01.180
+So I wanted it so that nothing was basically like the core of Datastar is like 300 lines
+
+00:14:01.270 --> 00:14:01.380
+long.
+
+00:14:01.670 --> 00:14:06.580
+And it is basically setting up data-star elements, hooking up plugins, and then everything
+
+00:14:06.690 --> 00:14:07.220
+else is a plugin.
+
+00:14:07.530 --> 00:14:12.160
+So if you don't agree with us, or if someone's better than I am, great, that's wonderful.
+
+00:14:12.470 --> 00:14:15.120
+We will be able to just pop that part out, put the new part in.
+
+00:14:15.500 --> 00:14:17.060
+But plugins can now depend on each other.
+
+00:14:17.110 --> 00:14:17.520
+They can understand.
+
+00:14:17.800 --> 00:14:18.500
+It's an ecosystem.
+
+00:14:18.960 --> 00:14:20.660
+Ironically, that's what happens under the hood.
+
+00:14:21.080 --> 00:14:23.140
+But the ideas of that make it so much more powerful.
+
+00:14:23.200 --> 00:14:34.120
+And the irony is that if you build it in that kind of plugin style way, in the more game developer style way, we are smaller than HTMX and Alpine alone, let alone combined, let alone Hyperscript and all these other things.
+
+00:14:34.320 --> 00:14:36.700
+So it's just a different way of thinking about the problem.
+
+00:14:36.840 --> 00:14:46.720
+When I first encountered Datastar and looked at the source code, it looked very foreign to me because Delaney, coming from game development, built Datastar like a game engine.
+
+00:14:47.160 --> 00:14:52.420
+So you have this very thin core and then everything else pretty much is a plugin.
+
+00:14:52.480 --> 00:14:58.680
+And all Datastar core is a way for registering plugins and having Datastar attributes.
+
+00:14:59.130 --> 00:15:00.380
+And that's pretty much it.
+
+00:15:00.600 --> 00:15:03.580
+Everything else is an add-on, a plugin that you can take away.
+
+00:15:03.710 --> 00:15:08.020
+So we even have a bundler on the site that allows you to just, well, you can just download
+
+00:15:08.560 --> 00:15:11.660
+Datastar core or you can just select what plugins you want.
+
+00:15:12.060 --> 00:15:16.160
+Now, that in and of itself is not that interesting because we're, at the end of the day, we're
+
+00:15:16.200 --> 00:15:19.300
+talking about a 10 kilobyte JavaScript file with all of the plugins.
+
+00:15:19.840 --> 00:15:21.720
+But it is open source, which we didn't mention.
+
+00:15:21.760 --> 00:15:24.580
+And so anybody can go just kind of look at it if you're interested.
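The "thin core plus plugins" design they describe can be sketched in a few lines of Python. Everything here — the `PluginRegistry` name, the `requires` mechanism, the `apply` pipeline — is illustrative only, not Datastar's actual internals:

```python
# Illustrative sketch of a "thin core + plugins" architecture, loosely in
# the spirit described in the conversation. Hypothetical names throughout.

class PluginRegistry:
    """The 'core': it only knows how to register plugins and check their
    dependencies. All behavior lives in the plugins themselves."""

    def __init__(self):
        self._plugins = {}

    def register(self, name, plugin, requires=()):
        # Plugins may depend on each other, forming a small ecosystem.
        for dep in requires:
            if dep not in self._plugins:
                raise ValueError(f"plugin {name!r} requires {dep!r}")
        self._plugins[name] = plugin

    def unregister(self, name):
        # "Pop that part out, put the new part in."
        return self._plugins.pop(name)

    def apply(self, value):
        # Run every registered plugin over the input, in registration order.
        for plugin in self._plugins.values():
            value = plugin(value)
        return value


registry = PluginRegistry()
registry.register("upper", str.upper)
registry.register("exclaim", lambda s: s + "!", requires=("upper",))
print(registry.apply("hello"))  # HELLO!

# Swapping out a plugin doesn't touch the core or the other plugins.
registry.unregister("exclaim")
print(registry.apply("hello"))  # HELLO
```

The point of the pattern is that the core never grows: disagreeing with a default behavior means replacing one plugin, not forking the framework.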
+
+00:15:25.080 --> 00:15:29.700
+But that approach means that everything is modular and everything is there for a reason.
+
+00:15:30.320 --> 00:15:32.180
+And we'll get into this later, I guess.
+
+00:15:32.230 --> 00:15:36.760
+But like deciding what plugins go in and what stay out is one of the challenges.
+
+00:15:36.890 --> 00:15:38.940
+And we just try to keep it as lean as possible.
+
+00:15:39.400 --> 00:15:43.860
+My way of thinking about it is that Datastar gives you everything you need and nothing you don't.
+
+00:15:44.240 --> 00:15:47.400
+And that's how we try to kind of keep it lean and fast.
+
+00:15:48.680 --> 00:15:51.060
+This portion of Talk Python To Me is brought to you by Sentry.
+
+00:15:51.400 --> 00:15:56.500
+I've been using Sentry personally on almost every application and API that I've built for
+
+00:15:56.660 --> 00:15:59.120
+Talk Python and beyond over the last few years.
+
+00:15:59.480 --> 00:16:02.400
+They're a core building block for keeping my infrastructure solid.
+
+00:16:03.000 --> 00:16:04.300
+They should be for yours as well.
+
+00:16:04.580 --> 00:16:04.960
+Here's why.
+
+00:16:05.620 --> 00:16:07.060
+Sentry doesn't just catch errors.
+
+00:16:07.140 --> 00:16:09.840
+It catches all the stuff that makes your app feel broken.
+
+00:16:10.160 --> 00:16:14.580
+The random slowdown, the freeze you can't reproduce, that bug that only shows up once
+
+00:16:14.880 --> 00:16:15.660
+real users hit it.
+
+00:16:15.940 --> 00:16:19.440
+And when something goes wrong, Sentry gives you the whole chain of events in one place.
+
+00:16:19.680 --> 00:16:22.640
+Errors, traces, replays, logs, dots connected.
+
+00:16:22.980 --> 00:16:24.860
+You can see what's led to the issue
+
+00:16:25.100 --> 00:16:26.820
+without digging through five different dashboards.
+
+00:16:27.640 --> 00:16:29.660
+Seer, Sentry's AI debugging agent,
+
+00:16:30.180 --> 00:16:32.120
+builds on this data, taking the full context,
+
+00:16:32.800 --> 00:16:34.840
+explaining why the issue happened,
+
+00:16:35.360 --> 00:16:37.720
+pointing to the code responsible, drafting a fix,
+
+00:16:37.800 --> 00:16:40.780
+and even flagging if your PR is about to introduce a new problem.
+
+00:16:41.620 --> 00:16:42.660
+The workflow stays simple.
+
+00:16:43.100 --> 00:16:44.860
+Something breaks, Sentry alerts you,
+
+00:16:45.020 --> 00:16:46.820
+the dashboard shows you the full context.
+
+00:16:47.160 --> 00:16:49.640
+Seer helps you fix it and catch new issues
+
+00:16:49.660 --> 00:16:55.400
+before they ship. It's totally reasonable to go from an error occurred to fixed in production in
+
+00:16:55.600 --> 00:17:01.160
+just 10 minutes. I truly appreciate the support that Sentry has given me to help solve my bugs
+
+00:17:01.320 --> 00:17:06.620
+and issues in my apps, especially those tricky ones that only appear in production. I know you will
+
+00:17:06.640 --> 00:17:12.020
+too if you try them out. So get started today with Sentry. Just visit talkpython.fm/sentry
+
+00:17:12.540 --> 00:17:19.620
+and get $100 in Sentry credits. Please use that link. It's in your podcast player show notes. Or use
+
+00:17:19.640 --> 00:17:25.839
+our code talkpython26, all one word, talkpython26, to get $100 in credits.
+
+00:17:26.740 --> 00:17:28.300
+Thank you to Sentry for supporting the show.
+
+00:17:29.600 --> 00:17:29.760
+Cool.
+
+00:17:29.800 --> 00:17:33.240
+That's a super interesting philosophy to say you should be able to take, even take
+
+00:17:33.310 --> 00:17:35.580
+stuff out of what we're giving you by default, right?
+
+00:17:35.850 --> 00:17:42.100
+Now, before we move on from sort of introducing Datastar, I do want to point out at data-star.dev,
+
+00:17:42.220 --> 00:17:45.420
+which of course I'll link in the show notes, there's some cool examples on here.
+
+00:17:45.720 --> 00:17:53.260
+You've got a really nice 2001: A Space Odyssey sort of theme with HAL and all that, which is great.
+
+00:17:53.920 --> 00:17:56.260
+I like the aesthetic here, which is very fun.
+
+00:17:56.800 --> 00:18:00.620
+It's got a little bit of a retro gaming feel, which is nice.
+
+00:18:00.880 --> 00:18:03.400
+But what I want to point out is I want to encourage people to go watch your little video.
+
+00:18:03.960 --> 00:18:04.800
+Your video is fun.
+
+00:18:05.520 --> 00:18:06.360
+It's really fun.
+
+00:18:06.480 --> 00:18:11.160
+This video is all about how Datastar fits in the world of SPAs.
+
+00:18:11.920 --> 00:18:16.480
+And one thing we didn't really mention is that Datastar is a full-fledged SPA replacement.
+
+00:18:17.020 --> 00:18:26.140
+So again, like that last 10%, often people will think, oh, well, I need to go to React or Vue.js or some single page application framework.
+
+00:18:26.340 --> 00:18:34.640
+Whereas we're saying that, no, no, no, Datastar will not only, it's not like a subset, or like SPAs are not a superset.
+
+00:18:34.680 --> 00:18:41.620
+It's on the contrary. I think Datastar, we think Datastar can do more than SPAs because we are
+
+00:18:42.180 --> 00:18:46.300
+driven by the backend and we are focused on hypermedia, which is the language of the web.
+
+00:18:46.640 --> 00:18:51.480
+So this, yeah, so this video is kind of throwing, yeah, anyway, everybody should watch it.
+
+00:18:51.800 --> 00:18:56.840
+I'd also like to, if you can scroll back up to the top of the page, the Starfield animation was
+
+00:18:57.040 --> 00:19:02.460
+one of the things like when Delaney and when everybody who worked on this published it,
+
+00:19:02.780 --> 00:19:04.700
+like I didn't realize how amazing this was
+
+00:19:04.810 --> 00:19:07.460
+because if you like right click and inspect that thing,
+
+00:19:07.960 --> 00:19:08.740
+it's a web component.
+
+00:19:09.230 --> 00:19:11.100
+And so all the JavaScript that's required
+
+00:19:11.300 --> 00:19:13.540
+for making all the stars go faster and slower
+
+00:19:13.760 --> 00:19:16.180
+and tracking your mouse where, you know, wherever you do it,
+
+00:19:16.600 --> 00:19:18.280
+it's all within that web component.
+
+00:19:18.520 --> 00:19:21.180
+And Datastar is essentially subscribing to like,
+
+00:19:21.260 --> 00:19:22.020
+where's the mouse pointer
+
+00:19:22.440 --> 00:19:24.120
+and passing it into the web component.
+
+00:19:24.380 --> 00:19:26.520
+Yeah, in fact, if you go to more examples,
+
+00:19:27.220 --> 00:19:29.260
+you will see that there's,
+
+00:19:29.360 --> 00:19:31.240
+and then go scroll down to,
+
+00:19:31.720 --> 00:19:32.780
+or use the hamburger thing.
+
+00:19:32.850 --> 00:19:35.200
+Yeah, go down to the rocket.
+
+00:19:35.980 --> 00:19:37.740
+There's the actual star field.
+
+00:19:38.110 --> 00:19:39.940
+So you can see the entire, the star field,
+
+00:19:40.010 --> 00:19:41.700
+the entire component is there.
+
+00:19:41.790 --> 00:19:43.020
+So if you scroll down from there,
+
+00:19:43.320 --> 00:19:45.100
+you'll see how it actually gets hooked up
+
+00:19:45.460 --> 00:19:46.680
+and the entire component,
+
+00:19:46.900 --> 00:19:48.200
+that's the whole thing, it's right there.
+
+00:19:48.340 --> 00:19:48.840
+- That's incredible.
+
+00:19:49.200 --> 00:19:50.580
+- And the thing is if you start moving around,
+
+00:19:51.060 --> 00:19:52.820
+like if you scroll up just a little bit more,
+
+00:19:52.870 --> 00:19:54.280
+so you can see the sliders,
+
+00:19:54.690 --> 00:19:56.160
+you'll see that they're live, everything's,
+
+00:19:56.360 --> 00:19:57.080
+if you move it around,
+
+00:19:57.540 --> 00:19:59.240
+like you move your mouse around the canvas,
+
+00:19:59.460 --> 00:20:00.860
+you'll see everything's live editing,
+
+00:20:01.040 --> 00:20:01.820
+everything.
+
+00:20:02.040 --> 00:20:03.800
+It's the irony of Datastar.
+
+00:20:03.800 --> 00:20:05.840
+And this is the part that I don't think people quite get.
+
+00:20:06.260 --> 00:20:08.100
+And it's not that you're trying to like,
+
+00:20:08.500 --> 00:20:10.160
+we love what Carson has done with HTMX.
+
+00:20:10.320 --> 00:20:11.680
+We love all the things they've done,
+
+00:20:11.800 --> 00:20:13.120
+but it does not do everything.
+
+00:20:13.460 --> 00:20:13.980
+It doesn't do enough.
+
+00:20:14.080 --> 00:20:15.140
+It is a library, not a framework.
+
+00:20:15.560 --> 00:20:16.120
+And the thing is,
+
+00:20:16.360 --> 00:20:18.180
+the irony is that Datastar actually has
+
+00:20:18.520 --> 00:20:19.880
+the fastest reactive signal,
+
+00:20:21.060 --> 00:20:21.860
+like reactive signals.
+
+00:20:22.200 --> 00:20:23.700
+We are the fastest thing out there.
+
+00:20:23.980 --> 00:20:25.420
+So it's not just like we did something
+
+00:20:25.520 --> 00:20:26.300
+that's kind of like VDOM,
+
+00:20:26.360 --> 00:20:26.980
+or we are like,
+
+00:20:27.180 --> 00:20:28.160
+we can compete with React.
+
+00:20:28.220 --> 00:20:30.220
+We demolish them with actual numbers.
+
+00:20:30.860 --> 00:20:35.160
+So we have the fastest morphing strategy and we also have the fastest signals, which means doing these kinds of things.
+
+00:20:35.420 --> 00:20:35.980
+It's just a non-issue.
+
+00:20:36.010 --> 00:20:37.960
+Like this star field thing is 1K.
+
+00:20:38.460 --> 00:20:42.740
+Like it's just these are the kinds of things that are just a non-issue in this if you do things our way.
+
+00:20:42.860 --> 00:20:52.320
+And you're leaning into the web ecosystem by leveraging web components instead of having to like build, have a build time pipeline to, you know, do all the custom JavaScript.
+
+00:20:53.000 --> 00:20:57.380
+Like once I realized like you can do these things, it just made, it just clicked.
+
+00:20:57.540 --> 00:21:03.900
+And I just, I feel like it's so much more fun to work on the web now that I understand these things.
+
+00:21:04.180 --> 00:21:06.200
+Let's talk through some of the core examples.
+
+00:21:06.450 --> 00:21:11.600
+I feel like there's some similarities to the examples section of the HTMX site.
+
+00:21:11.700 --> 00:21:14.520
+But, you know, HTMX doesn't have a star field, certainly.
+
+00:21:14.880 --> 00:21:16.840
+Best place to start is on the homepage.
+
+00:21:17.470 --> 00:21:26.720
+Before we get into those examples, just to kind of take a step back and say, OK, we've mentioned HTMX a few times and we don't even like to compare ourselves to HTMX.
+
+00:21:26.880 --> 00:21:29.100
+But it is a good maybe starting point for some people.
+
+00:21:29.370 --> 00:21:31.220
+We have a hello world example there,
+
+00:21:31.640 --> 00:21:32.300
+if you could find that.
+
+00:21:32.360 --> 00:21:33.680
+Yeah, let's scroll down just a little bit more.
+
+00:21:33.840 --> 00:21:34.520
+Yeah, you got it.
+
+00:21:34.820 --> 00:21:38.200
+One of the maybe differences between HTMX and Datastar
+
+00:21:38.200 --> 00:21:40.900
+is that Datastar can receive HTML responses,
+
+00:21:41.050 --> 00:21:42.400
+but it also by default,
+
+00:21:42.720 --> 00:21:45.520
+or the recommendation is to use server-sent events.
+
+00:21:46.020 --> 00:21:47.260
+So if you hit start there,
+
+00:21:47.660 --> 00:21:50.240
+you're going to see kind of the network response tab,
+
+00:21:50.310 --> 00:21:51.580
+and those are server-sent events.
+
+00:21:51.840 --> 00:21:55.179
+And SSE, server-sent events, are an old technology
+
+00:21:55.200 --> 00:21:56.740
+that work just over HTTP.
+
+00:21:57.480 --> 00:22:00.120
+And essentially what happens is that the server
+
+00:22:00.360 --> 00:22:01.960
+holds a connection open to the browser
+
+00:22:02.140 --> 00:22:03.520
+and it's unidirectional.
+
+00:22:03.820 --> 00:22:05.720
+So you send a request to the server
+
+00:22:05.960 --> 00:22:08.820
+and then the server can stream events back down,
+
+00:22:08.980 --> 00:22:09.980
+which is what you're seeing here.
+
+00:22:10.300 --> 00:22:12.160
+Now, this is obviously a trivial example, right?
+
+00:22:12.240 --> 00:22:12.900
+We're sending one,
+
+00:22:13.360 --> 00:22:16.040
+or we're updating the message one character at a time.
+
+00:22:16.600 --> 00:22:19.300
+But when you see how simple this is,
+
+00:22:19.320 --> 00:22:22.600
+then you can perhaps see potential for this, right?
+
+00:22:22.660 --> 00:22:29.720
+And SSE, or server-sent events, have had kind of a renaissance in recent years with all of the LLMs, right?
+
+00:22:29.750 --> 00:22:33.740
+All the chatbots are streaming the responses back to you.
+
+00:22:33.970 --> 00:22:42.680
+So this type of technology, while it's not old, sorry, it's not new, it's actually been around a long time, has kind of been underused.
+
+00:22:43.660 --> 00:22:50.940
+And Delaney kind of tapped into that and said, well, because I also always thought, well, if I want pure reactivity or true reactivity,
+
+00:22:50.960 --> 00:22:52.840
+I need two-way communication.
+
+00:22:53.020 --> 00:22:53.720
+So I need web sockets.
+
+00:22:53.720 --> 00:22:54.560
+You need web sockets.
+
+00:22:54.650 --> 00:22:56.660
+You need binary and all that kind of stuff.
+
+00:22:56.820 --> 00:22:56.940
+Yeah.
+
+00:22:57.120 --> 00:22:57.220
+Yeah.
+
+00:22:57.460 --> 00:22:59.440
+There's problems with those, which we can get into.
+
+00:23:00.200 --> 00:23:01.240
+SSE is much simpler.
+
+00:23:01.470 --> 00:23:04.300
+It works over HTTP 1, 2, and 3.
+
+00:23:05.120 --> 00:23:06.620
+And as you can see, it's just plain text.
+
+00:23:06.780 --> 00:23:08.280
+There is no complicated handshake.
+
+00:23:08.700 --> 00:23:12.480
+If you change the interval to zero and hit start,
+
+00:23:13.260 --> 00:23:15.740
+you're going to see a different type of response, which is,
+
+00:23:16.060 --> 00:23:17.860
+and I don't know if you saw the content type change,
+
+00:23:18.160 --> 00:23:19.820
+but content type now is text/html.
+
+00:23:20.210 --> 00:23:20.580
+Oh, intro.
+
+00:23:20.700 --> 00:23:21.320
+Oh, interesting.
+
+00:23:21.520 --> 00:23:21.620
+Yeah.
+
+00:23:22.070 --> 00:23:24.300
+So this is what HTMX would do by default.
+
+00:23:24.410 --> 00:23:28.140
+You send back HTML responses, whereas here the content type
+
+00:23:28.320 --> 00:23:29.620
+is text/event-stream.
+
+00:23:30.250 --> 00:23:32.900
+And this allows you to hold that connection open
+
+00:23:33.370 --> 00:23:34.300
+for as long as you want.
+
+00:23:34.300 --> 00:23:39.400
+It can be open and closed, or it can stay open until the words
+
+00:23:39.540 --> 00:23:41.420
+hello world have been spelled out.
+
+00:23:41.940 --> 00:23:43.660
+Or you can keep it open indefinitely.
+
+00:23:44.100 --> 00:23:46.720
+So we're going to see some more advanced examples
+
+00:23:46.740 --> 00:23:50.400
+where the SSE connection is held open for longer.
+
+00:23:50.680 --> 00:23:53.880
+So I think wrapping your head around this example
+
+00:23:55.260 --> 00:23:57.600
+taps you into the potential of Datastar.
+
+00:23:57.620 --> 00:23:57.980
+Yeah.
+
+00:23:58.280 --> 00:23:59.880
+And one of the things that--
+
+00:24:00.020 --> 00:24:03.040
+well, when I looked at Datastar, I'm like, OK,
+
+00:24:03.740 --> 00:24:05.320
+there's some interesting aspects here.
+
+00:24:05.320 --> 00:24:08.220
+And we'll get into them, how you can set up--
+
+00:24:08.660 --> 00:24:10.860
+when I click the Start button, it might replace
+
+00:24:10.960 --> 00:24:12.900
+a piece of the page-- hey, that sounds familiar--
+
+00:24:13.280 --> 00:24:15.460
+with HTML, not through JavaScript, right?
+
+00:24:15.900 --> 00:24:21.140
+But it didn't specify anywhere what part of the page to replace or not.
+
+00:24:21.180 --> 00:24:22.480
+Like, how does it know?
+
+00:24:22.940 --> 00:24:26.960
+And so with Datastar, you lean more on the server for many things,
+
+00:24:28.000 --> 00:24:32.220
+including deciding what part of the page that the server created in the first place to update.
+
+00:24:32.300 --> 00:24:33.420
+I really like that.
+
+00:24:33.480 --> 00:24:35.220
+I think that that's super neat.
+
+00:24:35.580 --> 00:24:39.440
+It lets you not just have sort of closer to one source of truth,
+
+00:24:39.660 --> 00:24:43.400
+but also just you can pass down multiple things.
+
+00:24:43.480 --> 00:24:45.720
+It's like, we need to update this pane on the right,
+
+00:24:46.280 --> 00:24:50.100
+this text, and this element all in one response.
+
+00:24:50.640 --> 00:24:52.240
+There's a lot of interesting aspects
+
+00:24:52.480 --> 00:24:54.520
+to what you're talking about here.
+
+00:24:54.560 --> 00:24:56.860
+Anyone who's familiar with out-of-band swaps
+
+00:24:56.920 --> 00:24:59.000
+in HTMX, well, guess what?
+
+00:24:59.160 --> 00:25:00.640
+Datastar is out-of-band by default.
+
+00:25:01.480 --> 00:25:03.880
+So it's matching currently based on the ID.
+
+00:25:04.600 --> 00:25:06.920
+So you see h3 id equals message.
+
+00:25:07.280 --> 00:25:10.300
+And every event that's coming back has an ID of message.
+
+00:25:10.940 --> 00:25:11.820
+But guess what?
+
+00:25:11.960 --> 00:25:13.680
+You can use any ID you want, right?
+
+00:25:13.790 --> 00:25:16.760
+So you can use actually any CSS selector you want.
+
+00:25:16.790 --> 00:25:19.700
+But yes, we put the onus more on the backend
+
+00:25:20.120 --> 00:25:23.160
+because that is where we believe state should live
+
+00:25:23.500 --> 00:25:26.720
+or that's the source of truth for state.
+
+00:25:27.300 --> 00:25:30.320
+And you send and you work with state on the front end
+
+00:25:30.660 --> 00:25:32.680
+only when and where it makes sense to,
+
+00:25:33.040 --> 00:25:34.840
+which is more the web component aspect.
+
+00:25:35.080 --> 00:25:36.920
+And I'll caveat what Ben said there
+
+00:25:36.970 --> 00:25:39.960
+in that like state mostly lives in the backend.
+
+00:25:40.460 --> 00:25:43.380
+And that's the problem is that like state lives where it lives.
+
+00:25:43.500 --> 00:25:46.280
+Like if the user is actively able to move their mouse cursor,
+
+00:25:46.540 --> 00:25:48.420
+they own that state of the mouse cursor.
+
+00:25:48.560 --> 00:25:49.340
+You don't own that,
+
+00:25:49.620 --> 00:25:52.240
+but most of the state from your database should be in the backend.
+
+00:25:52.780 --> 00:25:56.660
+The one thing that's interesting about the SSE compared to how most people
+
+00:25:56.700 --> 00:25:59.060
+think about this stuff, I will say I fell into this trap too, right?
+
+00:25:59.060 --> 00:26:03.500
+'Cause I did the LiveView crazy stuff, is that your job as a web developer is to
+
+00:26:03.560 --> 00:26:06.700
+get strings to the browser as efficiently, as fast as possible.
+
+00:26:06.840 --> 00:26:09.160
+'Cause, like, the browser is going to deal with that,
+
+00:26:09.340 --> 00:26:14.640
+that into HTML and all that. There's nothing faster than giving it HTML, right? So the thing that I
+
+00:26:14.690 --> 00:26:20.440
+know I was lost on for a long time is that SSE, I thought, oh, it's this big string thing. How is that better
+
+00:26:20.580 --> 00:26:24.900
+than binary? But the irony is that because it's so regular, because there's already things like
+
+00:26:25.000 --> 00:26:28.560
+compression built into the browser, there's streaming things, there's things that are so
+
+00:26:28.660 --> 00:26:33.720
+much easier to do here in an efficient way, that the irony is, you don't have to care
+
+00:26:33.800 --> 00:26:38.520
+about all these things, but if you just follow our way of doing it, your Python app will be faster
+
+00:26:38.540 --> 00:26:40.200
+than most people's, like, compiled,
+
+00:26:40.660 --> 00:26:42.080
+you know, like low-level language thing,
+
+00:26:42.200 --> 00:26:43.800
+because you're getting orders of magnitude
+
+00:26:44.030 --> 00:26:44.640
+in the algorithms
+
+00:26:44.770 --> 00:26:46.280
+and how we're doing stuff under the hood.
+
+00:26:46.420 --> 00:26:47.620
+So I don't know if you're interested
+
+00:26:47.710 --> 00:26:48.980
+in like the deep down stuff
+
+00:26:49.180 --> 00:26:51.000
+or just like how you use it as a Python developer.
+
+00:26:51.030 --> 00:26:53.540
+But the irony is that you now have tapped into this.
+
+00:26:53.670 --> 00:26:54.420
+It seems so simple.
+
+00:26:54.580 --> 00:26:56.260
+You're like, oh, this is just a different text response.
+
+00:26:56.880 --> 00:26:58.480
+How can this be orders of magnitude faster?
+
+00:26:58.860 --> 00:26:59.960
+Like, again, I don't know how much
+
+00:26:59.970 --> 00:27:00.920
+you want to get into the weeds of that
+
+00:27:01.280 --> 00:27:03.180
+compared to just it's fun to use, right?
+ +00:27:03.340 --> 00:27:05.440 +Yeah, I really like the philosophy + +00:27:05.580 --> 00:27:07.780 +of having so much of it controlled by the server. + +00:27:08.280 --> 00:27:09.920 +It just felt disheartening. + +00:27:10.160 --> 00:27:11.480 +It's like, okay, so what you're going to do + +00:27:11.480 --> 00:27:13.780 +is you're just going to create some JSON responses + +00:27:14.780 --> 00:27:15.620 +on your server, + +00:27:15.730 --> 00:27:18.880 +and then everything is some crazy build series of steps + +00:27:19.420 --> 00:27:21.120 +to end up with, I don't know, + +00:27:21.400 --> 00:27:23.160 +Vue or React or something on the front end. + +00:27:23.250 --> 00:27:26.420 +And there's just so much power and flexibility + +00:27:26.900 --> 00:27:29.060 +to write really cool server code. + +00:27:29.260 --> 00:27:31.140 +But, you know, like a lot of the trends have been, + +00:27:31.680 --> 00:27:34.680 +yeah, that's kind of just there to support the rest of it, + +00:27:34.750 --> 00:27:37.120 +you know, and so I don't know, this really appeals to me. + +00:27:37.500 --> 00:27:43.600 +question that comes up often is like, OK, well, how do I format this? Because it has its own syntax. + +00:27:44.040 --> 00:27:48.620 +Very simple to read, obviously, right? An event name and then these data lines. And you can just + +00:27:48.670 --> 00:27:54.460 +have as many data lines as you want. And that's your HTML. If you scroll up, though, we do have... + +00:27:54.730 --> 00:28:01.240 +So you do need to format this, but we essentially have all of these SDKs, including Python, you'll + +00:28:01.320 --> 00:28:06.660 +see there. And the Python SDK is actually, I would say, one of the most intricate ones we have. + +00:28:07.240 --> 00:28:12.900 +Spatuel King, he's a member of the community, or Chase, I believe is his first name, his real first + +00:28:13.020 --> 00:28:18.340 +name, and many other contributors did an amazing job on that. 
So lots and lots of Python frameworks
+
+00:28:18.520 --> 00:28:24.120
+are supported. You can maybe speak more to this, Chris. And really, the SDKs are very simple,
+
+00:28:24.360 --> 00:28:29.780
+because all they do is they take a function, a patch elements or patch signals function,
+
+00:28:30.160 --> 00:28:34.840
+and you just dump in the HTML that you want swapped into the DOM or the signals you want
+
+00:28:34.820 --> 00:28:39.680
+output on the page, and it just does the formatting for you. So it's really just, there's three
+
+00:28:39.900 --> 00:28:45.120
+functions, I think, in total that every SDK has to implement, and it's such a time saver, you know.
+
+00:28:45.620 --> 00:28:52.260
+I dove into server-sent events a lot with HTMX, and when you get the syntax wrong, it is so painful
+
+00:28:52.680 --> 00:28:58.840
+to debug, because it pretty much just doesn't work, you know, or whatever. It's harder to debug, and
+
+00:28:58.920 --> 00:29:04.780
+so to have the helper syntax, it's just a dream. Well, and also, just so people are aware,
+
+00:29:04.800 --> 00:29:06.060
+because I was originally going to try,
+
+00:29:06.270 --> 00:29:07.980
+the irony is I was trying to get server-sent events,
+
+00:29:08.320 --> 00:29:10.760
+like their plugin, up to snuff, like, years ago.
+
+00:29:11.240 --> 00:29:14.580
+Like, I would highly recommend not using SSE with HTMX
+
+00:29:14.860 --> 00:29:16.800
+because the problem is that the entire model
+
+00:29:16.830 --> 00:29:18.260
+of how you build things is very poll-based
+
+00:29:18.720 --> 00:29:19.940
+and it's built out of band.
+
+00:29:19.970 --> 00:29:20.900
+It's like a weird concept,
+
+00:29:21.010 --> 00:29:21.940
+like the idea of updating,
+
+00:29:22.050 --> 00:29:23.800
+like it is not built with that in mind.
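The "patch elements" helper Ben describes can be sketched roughly like this: it frames one SSE event, an event name plus `data:` lines, and the HTML fragment carries its own `id` so the browser knows which element to morph. The event and line names below follow the pattern discussed on the show but are assumptions; the real SDKs handle the exact wire format for you:

```python
# Illustrative sketch of an SDK-style element-patch helper. The event
# name and "elements" line prefix are assumptions for illustration;
# consult the actual Datastar SDK docs for the real format.

def patch_elements(html: str, event_name: str = "datastar-patch-elements") -> str:
    """Frame an HTML fragment as one SSE event: an event name followed
    by one 'data:' line per line of HTML, then a blank line."""
    lines = [f"event: {event_name}"]
    for line in html.splitlines():
        lines.append(f"data: elements {line}")
    return "\n".join(lines) + "\n\n"

# The fragment carries its own id ("message"), so the element with that
# id on the page is what gets updated -- out-of-band by default, no
# target selector needed in the common case.
msg = patch_elements('<h3 id="message">Hello</h3>')
print(msg)
```

The value of the SDK is exactly this mechanical framing: get a `data:` line or the terminating blank line wrong by hand, and the stream silently fails to parse.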
+ +00:29:24.130 --> 00:29:25.680 +So I know that they're trying to move towards that + +00:29:25.770 --> 00:29:26.180 +in the future, + +00:29:26.440 --> 00:29:28.800 +but the whole way that you interact with it + +00:29:28.900 --> 00:29:29.700 +is based on polling. + +00:29:30.180 --> 00:29:31.780 +And the thing about our way + +00:29:31.900 --> 00:29:33.300 +is that not only are you doing push events, + +00:29:33.540 --> 00:29:36.700 +But the thing is that really does change the semantics of the language. + +00:29:36.900 --> 00:29:40.020 +So first of all, you get like 40X compression by doing our way. + +00:29:40.230 --> 00:29:43.040 +But also you only send data when you need to instead of polling. + +00:29:43.330 --> 00:29:44.600 +So now you're using less resources. + +00:29:45.060 --> 00:29:45.960 +You're using less network. + +00:29:46.520 --> 00:29:54.620 +It changes the whole dynamic in a deeper way that you can literally save 5,000X in your network bandwidth. + +00:29:54.900 --> 00:29:57.300 +It sounds crazy, but it's just a reality. + +00:29:57.540 --> 00:29:57.640 +Right. + +00:29:57.900 --> 00:30:00.200 +Another thing, Delaney, that's really nice about that is the latency. + +00:30:00.900 --> 00:30:03.060 +That's something that drives me crazy about polling. 
+ +00:30:03.400 --> 00:30:09.060 +general is just like okay well we don't want to hammer the server too hard so let's make this you + +00:30:09.060 --> 00:30:13.420 +know one second two second but then it's like well i click this button and then it updates and you're + +00:30:13.460 --> 00:30:18.000 +like ah if if something happens on the server it's sent right if it wants to one of the things that + +00:30:18.120 --> 00:30:23.040 +ironically because i do a lot of like go or low level language stuff is that i tend to put a debounce + +00:30:23.380 --> 00:30:29.780 +in my server to like five milliseconds so that i get i'm not updating more than you know 200 times + +00:30:30.100 --> 00:30:36.500 +a second even on a monitor because the browser actually break after 500 fps so like the + +00:30:36.700 --> 00:30:40.380 +interesting thing is not that it's basically data starts no longer the issue in your thing if you + +00:30:40.520 --> 00:30:47.460 +are on a low low powered battery device like a mobile on a 3g this is it will just work like it's + +00:30:47.700 --> 00:30:52.560 +stuff that you just don't have to worry about so it does change the semantics of how you build things + +00:30:52.980 --> 00:30:57.140 +just so that you're aware because even things like for example built into the htmx they don't do + +00:30:57.220 --> 00:30:59.300 +automatic exponential back off. + +00:30:59.500 --> 00:31:00.920 +It doesn't have all the verbs. + +00:31:01.320 --> 00:31:03.180 +There's caveats there that I would recommend + +00:31:03.250 --> 00:31:04.960 +not doing it, honestly, if you're going to do it. + +00:31:05.000 --> 00:31:07.260 +It's crazy that you're talking about going below + +00:31:07.330 --> 00:31:08.720 +the monitor refresh rate. + +00:31:09.020 --> 00:31:11.020 +You're not going to see it. This is only + +00:31:11.710 --> 00:31:12.160 +120 hertz. + +00:31:13.740 --> 00:31:14.800 +120 times a second. 
+ +00:31:15.150 --> 00:31:16.700 +So why would you pull + +00:31:17.020 --> 00:31:18.700 +faster than that? That's wild. + +00:31:20.480 --> 00:31:20.960 +This portion + +00:31:21.010 --> 00:31:22.980 +of Talk Python To Me is brought to you by us. + +00:31:23.320 --> 00:31:24.920 +I'm thrilled to announce a brand + +00:31:25.020 --> 00:31:26.560 +new app built for developers + +00:31:26.580 --> 00:31:27.840 +created by yours truly. + +00:31:28.260 --> 00:31:29.460 +It's called Command Book. + +00:31:30.280 --> 00:31:31.520 +You know that thing you do every morning? + +00:31:32.020 --> 00:31:33.460 +Open up six terminal tabs, + +00:31:33.860 --> 00:31:34.700 +CD into this directory, + +00:31:35.300 --> 00:31:36.340 +activate that virtual environment, + +00:31:36.820 --> 00:31:38.280 +run the server with --reload. + +00:31:38.580 --> 00:31:39.620 +Now, CD somewhere else, + +00:31:40.280 --> 00:31:41.280 +start the background worker, + +00:31:41.820 --> 00:31:42.540 +another tab for Docker, + +00:31:42.900 --> 00:31:44.260 +another one to tail production logs. + +00:31:44.720 --> 00:31:46.740 +Every tab just says Python, Python, Python, + +00:31:47.020 --> 00:31:47.440 +Docker tail. + +00:31:48.600 --> 00:31:49.480 +And you're clicking through them going, + +00:31:49.940 --> 00:31:51.100 +which Python was that again? + +00:31:51.660 --> 00:31:52.360 +Where my app is running? + +00:31:52.920 --> 00:31:53.860 +Then sometime later, + +00:31:54.140 --> 00:31:55.960 +your dev server silently dies + +00:31:55.980 --> 00:31:58.660 +because it tried to reload while you're in the middle of a code edit, + +00:31:59.320 --> 00:32:01.880 +unmatched brace, a half-written import, or something. + +00:32:02.640 --> 00:32:05.180 +Now you're hunting through tabs to figure out which process crashed + +00:32:05.270 --> 00:32:06.120 +and how to restart it. + +00:32:06.600 --> 00:32:11.140 +My app, CommandBook, gives all of these long-running commands a permanent home. 
+ +00:32:11.700 --> 00:32:14.500 +You save a command once, the working directory, the environment, + +00:32:14.860 --> 00:32:17.940 +free commands like git pull, and from then on, you just click run. + +00:32:18.500 --> 00:32:21.260 +You can even group commands together to start and stop everything + +00:32:21.370 --> 00:32:22.840 +for a project with a single click. + +00:32:23.360 --> 00:32:26.960 +It also has what I call Honey Badger Mode, auto restart on crash. + +00:32:27.540 --> 00:32:30.060 +So when your dev server goes down mid-reload, + +00:32:30.520 --> 00:32:34.720 +Command Book just brings it right back up and does so over and over until the code is fixed. + +00:32:35.340 --> 00:32:37.140 +It also detects URLs from your output, + +00:32:37.400 --> 00:32:39.720 +so you're never scrolling through thousands of lines of logs + +00:32:39.810 --> 00:32:41.860 +just to figure out how to reopen your web app. + +00:32:42.400 --> 00:32:46.280 +And it shows you uptime, memory usage, and all sorts of cool things about your process. + +00:32:46.980 --> 00:32:48.960 +The whole thing is a native macOS app. + +00:32:49.160 --> 00:32:51.520 +No electron, no Chromium, just 21 megs. + +00:32:51.900 --> 00:32:53.280 +And it comes with a full CLI. + +00:32:53.520 --> 00:32:55.360 +So anything you've configured in the UI, + +00:32:55.680 --> 00:32:58.220 +you can fire off from your terminal with just a single command. + +00:32:58.760 --> 00:33:01.900 +Right now it's macOS only, but if there's enough interest, + +00:33:02.140 --> 00:33:03.200 +I'll build a Windows version too. + +00:33:03.460 --> 00:33:04.160 +So let me know. + +00:33:05.000 --> 00:33:08.980 +Please check it out at talkpython.fm/command book app, + +00:33:09.460 --> 00:33:11.980 +download it for free, level up your developer workflow. + +00:33:12.440 --> 00:33:14.040 +The link is in your podcast player show notes. + +00:33:14.640 --> 00:33:16.620 +That's talkpython.fm/command book. 
+ +00:33:16.980 --> 00:33:18.820 +I really hope you enjoy this new app that I built. + +00:33:20.180 --> 00:33:28.140 +Yeah, on the topic of latency and all that, if you go to the examples, there's some we could look at that I think really demonstrate this. + +00:33:28.180 --> 00:33:32.740 +Well, maybe start with bad Apple just because we're talking about refresh rates. + +00:33:33.120 --> 00:33:33.180 +OK. + +00:33:33.600 --> 00:33:39.140 +What's happening is that the back end is streaming down just a bunch of symbols, but it creates this animation. + +00:33:39.520 --> 00:33:44.020 +And if you were to open the network tab, you would see it actually would be interesting to see. + +00:33:44.220 --> 00:33:46.100 +You probably have to refresh the page just to. + +00:33:46.880 --> 00:33:47.000 +Yeah. + +00:33:47.660 --> 00:33:48.660 +And you're going to see updates. + +00:33:49.160 --> 00:33:49.680 +That one there. + +00:33:49.980 --> 00:33:51.060 +Yeah, if you click that-- + +00:33:51.100 --> 00:33:51.600 +This one? + +00:33:52.160 --> 00:33:52.880 +Yeah, that's the one. + +00:33:53.100 --> 00:33:53.960 +You click Event Stream. + +00:33:54.500 --> 00:33:56.780 +There's an Event Stream tab for Event Stream responses. + +00:33:57.540 --> 00:33:58.500 +You're going to see these streaming. + +00:33:58.610 --> 00:34:01.800 +I don't know what frames per second we have this set to, + +00:34:01.920 --> 00:34:03.160 +but you see it streaming past, right? + +00:34:03.340 --> 00:34:03.460 +Right. + +00:34:03.640 --> 00:34:05.920 +The first time many people see this, + +00:34:05.980 --> 00:34:09.360 +this is a surprise that the browser is capable of this. + +00:34:09.460 --> 00:34:11.179 +But the browser can stream video, so why + +00:34:11.320 --> 00:34:12.659 +can't it stream a bunch of text? + +00:34:12.860 --> 00:34:14.679 +I mean, it's not that big of a leap of faith. 
+ +00:34:15.060 --> 00:34:18.280 +But you can see, it looks like it's about every 10, 20 + +00:34:18.500 --> 00:34:18.740 +milliseconds. + +00:34:19.100 --> 00:34:23.720 +I think we're doing like 30 frames a second, but again, we can do like, we're doing this on a, + +00:34:24.040 --> 00:34:28.419 +basically a free tier server. So like, this is just a non-issue and it's doing all the compression + +00:34:28.550 --> 00:34:32.620 +stuff. So if you notice that your update, even though we're doing like full ASCII development at, + +00:34:32.820 --> 00:34:36.820 +you know, thousands of characters, your updates are actually not updating that. Like you see how + +00:34:37.040 --> 00:34:41.200 +it's transferring, but it's not transferring that much compared to how much it's actually coming out. + +00:34:41.460 --> 00:34:46.420 +I can see we got 1.9 megs for the whole page. Yeah. But do you see next to it? What, what was + +00:34:46.440 --> 00:34:51.580 +actually like the resources so you see the compression well yeah we're probably not seeing + +00:34:51.580 --> 00:34:55.780 +it there but in the bottom you'll see two two megabytes have been transferred but 10 megabytes + +00:34:55.980 --> 00:35:01.080 +of resources and so that's oh yeah yeah so it's a 5x compression yeah it's going to be much more + +00:35:01.320 --> 00:35:06.760 +on the stream i think because it's streaming uh normally you can hover over the size + +00:35:07.520 --> 00:35:13.380 +and you'll see the uncompressed but i guess it's changing too fast that's pretty wild and you know + +00:35:13.240 --> 00:35:19.020 +in practical usage, like I have a status screen that I have from my production app at work. And + +00:35:19.100 --> 00:35:24.660 +it's just amazing to just constantly be seeing these things update. And I'm doing that by having + +00:35:24.860 --> 00:35:31.140 +the database tell my Python code, hey, refresh. 
I actually ask it to get all the entries from the

00:35:31.220 --> 00:35:35.520
database and send it down the pipe. And so it's not like I'm doing the optimized thing. I'm doing

00:35:35.640 --> 00:35:40.660
the simple thing and I get all these cool things just updating all the time. And it's just such a

00:35:40.720 --> 00:35:44.060
useful thing, especially for status screens, dashboards, stuff like that.

00:35:44.140 --> 00:35:49.880
Speaking of that, go to the DB Mon example. This is one of my favorites because when React

00:35:50.340 --> 00:35:54.680
first had their first conference, they said, look at what we're doing. We're able to update at a rate

00:35:54.760 --> 00:35:59.640
that no one else can compete with in how fast they could update the browser, right? If we

00:35:59.760 --> 00:36:03.460
actually, yeah, you're still there. So the thing is, if you actually set the FPS to something like

00:36:03.840 --> 00:36:10.360
80, whatever. So that is how fast it's coming from the backend to you. So go ahead. Yeah,

00:36:10.520 --> 00:36:12.620
because we just don't want people blasting the server.

00:36:12.740 --> 00:36:13.760
Yeah, you don't want to walk away.

00:36:14.240 --> 00:36:16.400
Yeah, but the point is that this is coming.

00:36:16.820 --> 00:36:19.520
See, we're doing stuff in microseconds on a potato.

00:36:19.780 --> 00:36:21.800
Yeah, let me just describe this a little bit for people listening.

00:36:22.000 --> 00:36:25.180
So it's like a database monitoring table

00:36:25.360 --> 00:36:29.200
that shows you how many transactions there are, whether the databases are overloaded.

00:36:29.580 --> 00:36:33.600
So it's updating a grid of maybe 10 or 12 databases

00:36:34.180 --> 00:36:35.880
with five or six elements,

00:36:36.280 --> 00:36:38.860
and it's doing that in microseconds, 80 times a second.
+ +00:36:38.960 --> 00:36:43.480 +A lot of people see these examples and they think, well, I'm not building this kind of stuff. + +00:36:43.880 --> 00:36:44.800 +And me included. + +00:36:44.960 --> 00:36:46.560 +I build crud apps most of the time. + +00:36:47.400 --> 00:36:51.500 +And there are plenty of examples here that are just cruddy things. + +00:36:51.940 --> 00:36:53.480 +They're kind of the more boring examples. + +00:36:53.940 --> 00:36:57.640 +But one example that might be worth looking at is the to-do MVC. + +00:36:58.000 --> 00:37:01.520 +And if you can figure out how to open that in split screen. + +00:37:01.960 --> 00:37:02.120 +Okay. + +00:37:02.560 --> 00:37:04.560 +What part do you want me to open up in split? + +00:37:04.660 --> 00:37:05.680 +Oh, just this, the example. + +00:37:05.880 --> 00:37:09.560 +Yeah, so I can do these two and then I can tile them. + +00:37:09.820 --> 00:37:10.140 +How's that? + +00:37:10.320 --> 00:37:15.440 +So this is a CRUD app, but what Datastar gives you is the ability to do multiplayer out of the box. + +00:37:15.710 --> 00:37:22.320 +And that is like real-time collaborative apps are not easy to do and not easy to scale as well. + +00:37:22.570 --> 00:37:27.620 +But as you'll see here, when you have like two sessions open, it's going to be near instant. + +00:37:27.820 --> 00:37:31.340 +You're going to basically be observing the latency on your network connection, + +00:37:31.620 --> 00:37:35.680 +which is going to be 50 milliseconds to 100, but barely perceivable. + +00:37:35.860 --> 00:37:40.360 +So just to describe to people, we've got this 2D MVC, which allows you to, well, it's like a + +00:37:40.440 --> 00:37:46.200 +to-do example, which is required to be a legitimate JavaScript framework. But I've opened it in two + +00:37:46.360 --> 00:37:51.000 +tabs and I've used Vivaldi's tile. So these are legitimately two browsers. They just appear to be + +00:37:51.140 --> 00:37:57.060 +kind of in the same window. 
And when I enter stuff into it, it literally looks like they update in

00:37:57.280 --> 00:38:00.320
parallel, which is crazy. If you check a few of them, you'll see,

00:38:00.420 --> 00:38:02.920
You can barely tell which one's updating which.

00:38:03.110 --> 00:38:04.520
It happens almost instantly.

00:38:04.920 --> 00:38:07.380
Yeah, if I look at the other one and I click on one,

00:38:07.600 --> 00:38:09.980
it feels like that's responding to my click.

00:38:10.420 --> 00:38:13.000
I need to correct myself that it is happening instantly

00:38:13.230 --> 00:38:15.500
because when you click, when you check one of those,

00:38:15.700 --> 00:38:19.420
it's not, and this is an interesting thing we can get into.

00:38:19.620 --> 00:38:21.260
We're not doing optimistic updates.

00:38:21.700 --> 00:38:23.680
It's actually sending a request to the server

00:38:24.160 --> 00:38:26.560
and the server is simultaneously updating

00:38:27.200 --> 00:38:29.400
both of your tabs at the same time.

00:38:29.480 --> 00:38:31.000
Even if I had just one open,

00:38:31.500 --> 00:38:33.140
it's still going round trip to the server.

00:38:33.480 --> 00:38:34.780
That's why it looks like it's simultaneous

00:38:34.990 --> 00:38:35.820
because it effectively is.

00:38:35.980 --> 00:38:36.940
This is a thing that you can,

00:38:37.360 --> 00:38:38.540
we can talk for like three hours

00:38:38.840 --> 00:38:40.640
and I will yell at most SPA developers

00:38:41.439 --> 00:38:43.020
because there's this weird thing

00:38:43.030 --> 00:38:43.980
that because it's easy,

00:38:44.460 --> 00:38:47.340
people will actively lie to users in the SPA world

00:38:47.590 --> 00:38:48.920
and they'll do optimistic updates,

00:38:49.030 --> 00:38:50.640
which means I'm going to make it

00:38:50.650 --> 00:38:52.020
so that I'm making this change.
+ +00:38:52.200 --> 00:38:54.300 +And then if there's a problem, then fix it. + +00:38:54.660 --> 00:38:56.700 +Whereas we say you should do indicator saying, + +00:38:56.880 --> 00:38:58.400 +I'm trying to make a change to this + +00:38:58.840 --> 00:38:59.280 +and then fix it. + +00:38:59.360 --> 00:39:02.600 +Because you don't want, like when you're playing a video game, you can do what's called dead + +00:39:02.780 --> 00:39:05.400 +reckoning and you can do some stuff to net rollback code. + +00:39:05.620 --> 00:39:10.540 +You can do some clever things to hide latency, but you don't want to hide latency when it + +00:39:10.540 --> 00:39:14.060 +comes to like a bank transfer or did I buy that thing or did I get that theater ticket + +00:39:14.280 --> 00:39:15.220 +or any of that stuff. + +00:39:15.620 --> 00:39:18.680 +Like people just have the wrong mental model of how the web should work. + +00:39:19.080 --> 00:39:23.360 +I'm actually going to send you another thing that this might blow your mind even more because + +00:39:23.640 --> 00:39:25.240 +the three of us basically can play. + +00:39:25.480 --> 00:39:35.060 +This is an example where all of us could be playing live with each other right now in an active shared state that's been at the top of the Hacker News and again, runs on a potato. + +00:39:35.170 --> 00:39:37.160 +I don't know if you just put that in your... + +00:39:37.180 --> 00:39:38.080 +Yeah, let me drop it over. + +00:39:38.190 --> 00:39:38.440 +Hold on. + +00:39:38.720 --> 00:39:39.620 +I'm going to put it in the other tab. + +00:39:39.860 --> 00:39:41.800 +So right now, don't touch anything. + +00:39:42.240 --> 00:39:43.180 +I'm going to actively start. + +00:39:43.240 --> 00:39:43.680 +I am purple. + +00:39:43.950 --> 00:39:45.740 +I am literally starting to click right now. + +00:39:45.840 --> 00:39:45.980 +All right. + +00:39:46.030 --> 00:39:48.740 +So we're looking at a multiplayer game of life here. 
+ +00:39:48.940 --> 00:39:50.100 +I'm seeing that live here. + +00:39:50.340 --> 00:39:54.500 +And if you open up that in the other tab, you would actively see the exact same state. + +00:39:54.880 --> 00:39:57.960 +So everyone in the world, like if you open that up in the other tab, + +00:39:58.360 --> 00:39:59.860 +you cannot get out of sync. + +00:40:00.020 --> 00:40:01.280 +It's not faking it in the front end. + +00:40:01.320 --> 00:40:02.300 +This is literally sending in. + +00:40:02.520 --> 00:40:04.700 +What's even crazier about this, here's the crazy part. + +00:40:04.760 --> 00:40:05.960 +It's actually a rendering demo. + +00:40:06.360 --> 00:40:09.000 +The guy who wrote it is writing in a scripting language, Clojure, + +00:40:09.400 --> 00:40:14.500 +and he's sending down 2,500 divs per frame styled, inline styled. + +00:40:14.840 --> 00:40:17.480 +Now go to your network tab now and look at what's actually, + +00:40:17.580 --> 00:40:21.720 +like look at your network tab and you'll see how little data we're actually sending over. + +00:40:21.860 --> 00:40:24.420 +even though he's updating 2,500 divs per frame, + +00:40:24.880 --> 00:40:26.380 +like if you go to wherever it's updated, + +00:40:26.600 --> 00:40:28.400 +yeah, whichever one's the one that's updated, + +00:40:28.450 --> 00:40:29.180 +there you go, yeah. + +00:40:29.540 --> 00:40:31.940 +So if you look here and look at how much is being sent + +00:40:32.070 --> 00:40:33.260 +versus how many is actually, + +00:40:33.870 --> 00:40:36.380 +like try to, like, this is just a different paradigm + +00:40:36.700 --> 00:40:37.380 +for how you build. 
+ +00:40:37.440 --> 00:40:38.040 +And the thing is, again, + +00:40:38.600 --> 00:40:40.380 +not everybody has to care about these low level things, + +00:40:40.430 --> 00:40:42.240 +but the thing is, is that once you do this, + +00:40:42.450 --> 00:40:43.980 +the idea of CRUD kind of goes away + +00:40:44.110 --> 00:40:46.520 +because in our opinion, you go to a multi, + +00:40:46.600 --> 00:40:49.240 +you make a multi-page app like you would normally do + +00:40:49.360 --> 00:40:50.220 +in HTMX or anything else, + +00:40:50.440 --> 00:40:51.620 +but you keep an open stream + +00:40:52.180 --> 00:40:54.360 +and you just update whatever's happening in your backend + +00:40:54.840 --> 00:40:55.400 +as it's happening. + +00:40:55.940 --> 00:40:56.980 +And it simplifies the world. + +00:40:57.080 --> 00:40:58.020 +And what's also interesting + +00:40:58.520 --> 00:40:59.880 +is because of how we do compression and all that, + +00:41:00.160 --> 00:41:01.480 +you just send your entire page. + +00:41:01.820 --> 00:41:03.100 +You don't need like out of band. + +00:41:03.360 --> 00:41:04.420 +It doesn't even really make sense + +00:41:04.680 --> 00:41:06.220 +because we're so fast that you can just, + +00:41:06.560 --> 00:41:07.680 +you as a Python developer, + +00:41:08.140 --> 00:41:09.300 +you just give us your entire page + +00:41:09.480 --> 00:41:10.420 +and let us deal with it. + +00:41:10.600 --> 00:41:11.900 +And we will come up with the fast stuff. + +00:41:12.320 --> 00:41:14.540 +So Chris should probably talk a lot more to that + +00:41:14.620 --> 00:41:15.620 +because the fat morpher stuff, + +00:41:15.720 --> 00:41:18.520 +it's a fundamental change in how you build web apps, I think. + +00:41:18.640 --> 00:41:18.980 +Yeah, yeah. + +00:41:19.180 --> 00:41:21.960 +especially the kind of the mental shift of like, + +00:41:22.270 --> 00:41:25.360 +because I kept thinking, okay, I need to like send one row at a time. 
+ +00:41:25.840 --> 00:41:27.680 +And I actually have one status screen that does that + +00:41:27.810 --> 00:41:32.180 +because we use a Firestore, Google's Firestore as our backend for this app. + +00:41:32.700 --> 00:41:35.600 +But for some reason, sometimes it just doesn't send every update. + +00:41:36.040 --> 00:41:38.040 +And so on another status screen, I actually, you know, + +00:41:38.320 --> 00:41:41.960 +query the whole database table or collection and send it down to Pipe. + +00:41:42.180 --> 00:41:45.640 +And because sometimes it doesn't send from Firestore, + +00:41:45.840 --> 00:41:50.760 +I get the entire latest state of all the things that are in flight and updated on my screen. + +00:41:51.200 --> 00:41:52.400 +And it just makes things easier. + +00:41:52.580 --> 00:41:53.220 +Yeah, it's amazing. + +00:41:53.680 --> 00:41:57.540 +Sounds like a good opportunity to subscribe to database query changes. + +00:41:57.780 --> 00:42:03.280 +I know some databases you can say, if this query updates, you know, trigger this event and then keep it flowing, + +00:42:03.600 --> 00:42:06.640 +like straight from events on the database, straight to your front end. + +00:42:06.980 --> 00:42:07.220 +Pretty cool. + +00:42:07.500 --> 00:42:12.060 +I do want to go back and just put a little bit of commentary, Delaney. + +00:42:12.240 --> 00:42:14.160 +Well, you said optimistic updates. + +00:42:15.040 --> 00:42:19.940 +So one of the things that's really common in JavaScript is I click this thing, it changes. + +00:42:20.310 --> 00:42:21.860 +I want to mark it as changed. + +00:42:21.990 --> 00:42:24.500 +And then I'm going to tell the server, hey, we made this change. + +00:42:25.060 --> 00:42:28.860 +It's very possible the server died, that you're not allowed to make that change or whatever. + +00:42:29.050 --> 00:42:30.800 +And then you've got to come back and go actually undo that. + +00:42:30.880 --> 00:42:32.760 +That really, you know, there's like a weird. 
+ +00:42:33.240 --> 00:42:35.400 +So what you're saying is you don't have to worry about that kind of stuff. + +00:42:35.540 --> 00:42:36.920 +We're a framework, not just a library. + +00:42:37.340 --> 00:42:41.700 +The idea is that you have these indicators that not only basically your indicators drive a signal. + +00:42:41.850 --> 00:42:43.660 +Like, again, the details don't really matter. + +00:42:43.760 --> 00:42:45.720 +But the idea is that you have instantaneous, + +00:42:46.320 --> 00:42:48.460 +like within the same frame updates of, + +00:42:48.760 --> 00:42:49.860 +hey, I'm going off to do something. + +00:42:50.020 --> 00:42:52.040 +Like usually you make a spinner or you say, + +00:42:52.360 --> 00:42:53.680 +I'm going to do this to gray out the field + +00:42:53.960 --> 00:42:54.420 +or I'm going to do, + +00:42:54.620 --> 00:42:55.800 +like there's all kinds of things you can drive. + +00:42:55.980 --> 00:42:59.240 +Because again, the state of what the local stuff is + +00:42:59.600 --> 00:43:02.060 +while the change is there, that lives in the client. + +00:43:02.360 --> 00:43:03.960 +Like that is part of Datastar. + +00:43:04.360 --> 00:43:06.180 +It has all the right tools to make it + +00:43:06.180 --> 00:43:08.080 +so that you can disable it or gray it out + +00:43:08.280 --> 00:43:10.200 +or say, I'm going to put a spinner next to it. + +00:43:10.280 --> 00:43:11.400 +Like you can do all those things. + +00:43:11.900 --> 00:43:13.320 +But the thing is you're not lying to your user. + +00:43:13.700 --> 00:43:14.340 +That's my whole thing. + +00:43:14.380 --> 00:43:15.780 +And people say, well, that's not really a lie. + +00:43:15.860 --> 00:43:16.440 +It's like, yes, it is. + +00:43:16.560 --> 00:43:17.560 +You're literally lying to people. + +00:43:18.060 --> 00:43:18.740 +Like, please stop. + +00:43:18.920 --> 00:43:20.240 +It's a DX issue. 
+ +00:43:20.720 --> 00:43:23.780 +The reason why people do it is because it's convenient, not because it's correct. + +00:43:24.020 --> 00:43:26.020 +And again, like you can do optimistic updates. + +00:43:26.200 --> 00:43:29.120 +You can do SBA-like things using Datastar. + +00:43:29.220 --> 00:43:34.500 +We don't recommend it because Datastar is more than just this tech. + +00:43:34.760 --> 00:43:36.300 +It's also like a way of doing things. + +00:43:37.060 --> 00:43:42.860 +What I wanted to point out here is that you might imagine that, you know, this is something that when you click edit, + +00:43:43.120 --> 00:43:44.520 +it turns it into a form. + +00:43:44.520 --> 00:43:47.240 +So you might like load the form into the page, hide the form, + +00:43:47.460 --> 00:43:48.880 +and then just do a show hide approach. + +00:43:49.920 --> 00:43:53.060 +But the hypermedia approach is kind of like the REST approach + +00:43:53.400 --> 00:43:56.380 +where you can only take the next action at any given time. + +00:43:57.060 --> 00:43:58.300 +So if you open the network tab, + +00:43:58.500 --> 00:44:00.480 +I just want to kind of walk you through this briefly. + +00:44:00.480 --> 00:44:02.280 +If you cancel that, when you hit edit, + +00:44:02.820 --> 00:44:04.680 +you will see a network request to the server. + +00:44:04.840 --> 00:44:06.900 +And what comes back is that form. + +00:44:07.220 --> 00:44:10.380 +So it's real time as in like what you're seeing now + +00:44:10.400 --> 00:44:12.960 +is the actual state reflected on the backend. + +00:44:13.570 --> 00:44:16.120 +And when you save, you're also going to see the same thing. + +00:44:16.480 --> 00:44:18.340 +You're going to see a network request to the server + +00:44:18.730 --> 00:44:21.440 +and it gets the true current state + +00:44:22.010 --> 00:44:24.340 +as it has been saved is now all comes back down. 
+ +00:44:24.800 --> 00:44:27.620 +So it's like you don't even need optimistic updates + +00:44:28.060 --> 00:44:28.600 +most of the time. + +00:44:28.880 --> 00:44:30.820 +And when you do use it, + +00:44:30.880 --> 00:44:33.000 +it's because you're trying to cover up poor performance. + +00:44:33.620 --> 00:44:35.520 +You're favoring perceived performance + +00:44:36.070 --> 00:44:36.980 +over true performance. + +00:44:37.280 --> 00:44:39.480 +One of the things I hear a lot is people saying, + +00:44:39.760 --> 00:44:40.820 +but it's so much slower. + +00:44:41.350 --> 00:44:43.160 +But I think people are used to + +00:44:43.250 --> 00:44:45.160 +or think it's much slower than it is + +00:44:45.420 --> 00:44:47.680 +because the web, the spa life + +00:44:47.880 --> 00:44:50.440 +that we see around us feels so slow. + +00:44:51.020 --> 00:44:53.640 +But anytime I've seen people try to lean + +00:44:53.900 --> 00:44:55.160 +into just using the network, + +00:44:55.660 --> 00:44:57.120 +it's so much faster than you expect. + +00:44:57.340 --> 00:44:59.520 +Well, and also you have so much less our way. + +00:45:00.000 --> 00:45:02.860 +Your usage of network can be easily 100x less, + +00:45:03.360 --> 00:45:04.460 +which means you have less contention, + +00:45:04.900 --> 00:45:06.160 +which means when you do send something, + +00:45:06.600 --> 00:45:07.460 +it's there immediately. + +00:45:07.490 --> 00:45:09.140 +And also because you're not doing polling + +00:45:09.420 --> 00:45:13.280 +with polling, you have to send to the server and the server sends back. If you just send from server + +00:45:13.600 --> 00:45:18.880 +when something updates, now you've just halved your RTT, right? Your round trip has just halved. + +00:45:19.060 --> 00:45:23.640 +So you half it and you're doing like a thousand less of something. All of a sudden, things opened + +00:45:23.760 --> 00:45:28.780 +up for you in weird ways, right? 
It's a fundamentally different way of thinking about the problem. + +00:45:29.259 --> 00:45:33.660 +Another example that we like to bring up a lot is there was a while back, someone did a million + +00:45:33.820 --> 00:45:37.980 +checkbox demo and they had a whole write-up on it, right? And they basically had to take it down + +00:45:38.000 --> 00:45:39.220 +because it was just too expensive to run. + +00:45:39.720 --> 00:45:41.720 +We have a version that's not just checkboxes, + +00:45:41.840 --> 00:45:43.060 +but color checkboxes. + +00:45:43.060 --> 00:45:44.660 +So you can actually make ASCII art and stuff like that. + +00:45:45.020 --> 00:45:45.680 +And it's a billion. + +00:45:46.160 --> 00:45:47.220 +And it runs on the same server + +00:45:47.600 --> 00:45:48.940 +that was running that Game of Life demo. + +00:45:49.280 --> 00:45:49.860 +It's on the same server. + +00:45:50.300 --> 00:45:52.380 +It's actively, and it's been on top of Hacker News. + +00:45:52.460 --> 00:45:54.640 +It's a $5 VPS as far as I know. + +00:45:54.700 --> 00:45:55.700 +Yeah, it's a $5 one. + +00:45:56.020 --> 00:45:57.760 +It runs all these demos all the time, + +00:45:58.020 --> 00:45:59.220 +active, top of Hacker News, + +00:45:59.680 --> 00:46:01.180 +and it's never gone down. + +00:46:01.300 --> 00:46:02.780 +What's really interesting about that demo + +00:46:03.020 --> 00:46:06.740 +is that it becomes a backend optimization challenge, right? + +00:46:07.040 --> 00:46:09.060 +You're no longer trying to optimize the front end. + +00:46:09.220 --> 00:46:14.100 +You rely on the browser and the browser API to take care of that for you. + +00:46:14.340 --> 00:46:17.460 +And now you're doing, I don't know, you're optimizing your database. + +00:46:17.740 --> 00:46:19.020 +You're optimizing your queries. + +00:46:20.420 --> 00:46:22.360 +I actually threw the link to that in there. 
+ +00:46:22.740 --> 00:46:30.380 +Because it's a nice demo to look at when you realize there are a billion of these being stored in a SQLite database somewhere. + +00:46:31.060 --> 00:46:32.700 +So you can scroll anywhere on the board. + +00:46:33.080 --> 00:46:37.180 +And it's like a 30,000 by 30,000 something grid + +00:46:37.500 --> 00:46:39.040 +because the square root of a billion + +00:46:39.580 --> 00:46:41.580 +is some weird number, as it turns out. + +00:46:42.240 --> 00:46:44.200 +And obviously this is, or not obviously, + +00:46:44.340 --> 00:46:45.220 +but this is multiplayer. + +00:46:45.540 --> 00:46:47.060 +So if I was to view this + +00:46:47.160 --> 00:46:49.020 +or you were to open a different browser tab, + +00:46:49.320 --> 00:46:50.980 +then you would see the exact same thing. + +00:46:51.060 --> 00:46:52.500 +The board is the same everywhere. + +00:46:52.820 --> 00:46:53.780 +That is crazy. + +00:46:54.120 --> 00:46:55.660 +The one thing that I will say that's hard, + +00:46:56.280 --> 00:46:58.720 +I don't know, Chris can really probably speak to this more + +00:46:58.820 --> 00:47:00.060 +and it sounds like a weird thing, + +00:47:00.280 --> 00:47:05.560 +Like, Datastar ends up in reality being like five or six things on your page, and it just + +00:47:05.760 --> 00:47:06.220 +gets out of the way. + +00:47:06.290 --> 00:47:09.720 +All of a sudden, like, most Datastar, you're going to get to a point where you try it, + +00:47:09.780 --> 00:47:10.600 +and you're like, that's it? + +00:47:10.910 --> 00:47:14.880 +Like, you will feel weird about every other approach once you really try it. + +00:47:14.890 --> 00:47:18.000 +Like, just try it, and you will see every other approach is wrong. + +00:47:18.290 --> 00:47:19.920 +Like, it's not because I made it. + +00:47:20.050 --> 00:47:23.280 +Like, I wish someone would have made this because it just, it's so simple. + +00:47:23.880 --> 00:47:26.060 +It feels like cheating in a weird way. 
+
+00:47:26.540 --> 00:47:27.320
+That's hard to explain.
+
+00:47:27.400 --> 00:47:31.680
+It really, it's a weird, like, I don't know what we all were doing. I was part of the problem, like, right?
+
+00:47:31.860 --> 00:47:35.940
+Like, I was like, oh, well, Google and everyone, Facebook and all the other guys have this figured
+
+00:47:36.120 --> 00:47:41.280
+out. Like, this has to be the best approach. So that's the weird thing is it's so simple. I don't know what,
+
+00:47:41.460 --> 00:47:45.000
+like, Chris will probably, it sounds like I'm selling it, but it's just, I don't know, it's weird. It's so
+
+00:47:45.180 --> 00:47:50.500
+exciting, it's so amazing, and yet it's, it's using all these boring technologies. And like, yeah, like,
+
+00:47:50.700 --> 00:47:55.619
+I remember I showed my wife this, my status board, and she's like, oh yeah, that looks really cool. And
+
+00:47:55.640 --> 00:47:59.220
+I'm like, oh yeah, because you don't understand what everything is going on behind it, you know?
+
+00:47:59.460 --> 00:48:00.180
+Yeah, exactly.
+
+00:48:00.660 --> 00:48:02.900
+It's like, it used to be so complicated.
+
+00:48:03.620 --> 00:48:05.780
+So let's do, we got a little bit of time left.
+
+00:48:05.960 --> 00:48:06.280
+Let's do this.
+
+00:48:06.330 --> 00:48:12.580
+I think it might be fun to talk through kind of some of the attributes and what it looks like,
+
+00:48:13.080 --> 00:48:15.520
+kind of program with this a little bit and then what it looks like on the server.
+
+00:48:15.940 --> 00:48:16.420
+How's that sound?
+
+00:48:16.680 --> 00:48:17.220
+Would it make sense?
+
+00:48:17.380 --> 00:48:21.240
+Like there's a good example of a kind of a meta framework for Python called Stario,
+
+00:48:21.740 --> 00:48:23.240
+which they just got their V2.
+
+00:48:23.480 --> 00:48:23.620
+Okay.
+
+00:48:23.860 --> 00:48:24.440
+Just launched.
+
+00:48:24.600 --> 00:48:27.320
+I don't know if that is a more Python-esque way of doing it.
+
+00:48:27.560 --> 00:48:28.680
+It depends on how you want to.
+
+00:48:28.820 --> 00:48:31.260
+Let's start with some of the Datastar attributes,
+
+00:48:31.530 --> 00:48:32.680
+and then we could talk about that.
+
+00:48:32.800 --> 00:48:33.160
+How's that sound?
+
+00:48:33.580 --> 00:48:36.360
+Like, just, you know, what does it look like to say to,
+
+00:48:36.870 --> 00:48:40.960
+you know, I want to connect a button to Datastar actions
+
+00:48:41.130 --> 00:48:43.380
+on the back end or wired up and so on?
+
+00:48:43.460 --> 00:48:45.380
+We've talked a lot about, you know,
+
+00:48:45.430 --> 00:48:48.340
+the back end driving the front end through patching elements,
+
+00:48:48.570 --> 00:48:51.080
+which is kind of the lower half of what you're looking at.
+
+00:48:51.720 --> 00:48:57.380
+To access that, you need to have a click listener or some sort of event listener to trigger that.
+
+00:48:58.080 --> 00:49:05.180
+And Datastar, as the name suggests, uses data-star or asterisk attributes.
+
+00:49:05.820 --> 00:49:09.120
+So these are part of the HTML spec data set.
+
+00:49:09.680 --> 00:49:10.800
+And we just leverage that.
+
+00:49:10.940 --> 00:49:18.920
+And we have a small grammar that you'd find on the reference page with all of the data-attributes.
+
+00:49:19.160 --> 00:49:23.400
+And data-on is just registering an event listener on the current element.
+
+00:49:23.820 --> 00:49:29.900
+So data-on:click is just obviously registering a click event handler on the button.
+
+00:49:30.520 --> 00:49:34.220
+And what's happening is that then Datastar also gives you actions.
+
+00:49:34.440 --> 00:49:38.340
+So that @get is an action to send a GET request to the server.
+
+00:49:38.680 --> 00:49:41.540
+You pass in the path, which is slash endpoint there.
+
+00:49:42.320 --> 00:49:44.620
+And then the server takes care of the rest.
+
+00:49:44.640 --> 00:49:47.520
+So what you're seeing is a div underneath with an ID.
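
The button-and-target wiring just described can be sketched in a few lines. The endpoint path, the `answer` id, and the exact attribute spelling below are illustrative stand-ins, not taken from the show; check the Datastar attribute reference for the exact grammar.

```python
# Illustrative sketch of the pattern described: a button whose data-on
# click attribute fires a @get action, plus a target div identified by id.
# The attribute spelling, endpoint path, and id here are assumptions.
BUTTON_HTML = """
<button data-on:click="@get('/endpoint')">Ask the server</button>
<div id="answer">Waiting...</div>
"""

def target_id(html: str) -> str:
    """Toy helper: pull out the id Datastar would match against."""
    import re
    match = re.search(r'id="([^"]+)"', html)
    return match.group(1) if match else ""
```

The id is what matters: the server's response carries an element with the same id, and the library matches on it to decide what to update, so no client-side rendering code is needed.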
+
+00:49:48.100 --> 00:49:50.940
+IDs are obviously unique in HTML, so they're ideal.
+
+00:49:51.470 --> 00:49:54.320
+And Datastar just uses that fact.
+
+00:49:55.020 --> 00:50:03.200
+And what Datastar is going to do from the backend is it's going to, yeah, just send back down that div with some text content inside of it.
+
+00:50:03.640 --> 00:50:09.860
+And then what Datastar does is it mutates the incoming DOM into the existing DOM.
+
+00:50:10.220 --> 00:50:11.300
+I'm sorry, it morphs.
+
+00:50:11.820 --> 00:50:13.740
+So it uses a morphing strategy.
+
+00:50:14.440 --> 00:50:22.580
+So rather than doing a straight swap, which is what HTMX does, it will actually morph the incoming HTML into what's currently on the page.
+
+00:50:23.160 --> 00:50:30.840
+That's kind of what opens up the door to these kind of broad, like where you send the entire document down, but only what changes get swapped in.
+
+00:50:30.860 --> 00:50:33.060
+But in this case, it's more of a fine grained thing.
+
+00:50:34.000 --> 00:50:35.960
+So only that div is going to get swapped out.
+
+00:50:36.120 --> 00:50:40.920
+And the reason why that morph matters is because you, since you aren't replacing it, things like
+
+00:50:41.110 --> 00:50:46.260
+focus and like where your input is and all that stays the same. So when you, even though you update
+
+00:50:46.370 --> 00:50:51.260
+the whole page, you're actually not actually changing the state and that's really important.
+
+00:50:51.370 --> 00:50:56.120
+So you do declarative development. You just say, I want it to look this way and it just does the
+
+00:50:56.130 --> 00:50:59.840
+right things to do it. From a mental model, it's almost like having the VDOM in the backend.
+
+00:51:00.010 --> 00:51:03.940
+You just say, here's what I want this page to look like. And it does all the work, but we don't do
+
+00:51:03.920 --> 00:51:08.720
+VDOM. We don't do any of that stuff. We do the fast thing.
So in terms of what your backend would + +00:51:08.920 --> 00:51:13.980 +send, if you can just scroll back up, it's that text that you were looking at. Let's look at the + +00:51:14.040 --> 00:51:19.780 +raw version because, yeah, so that's the HTML. If you scroll down to the next text, it's a code + +00:51:19.880 --> 00:51:24.260 +field. Yeah. There's a section that has like event, data star, patch elements, and then what the + +00:51:24.500 --> 00:51:29.880 +elements are and so on, right? This is like the SSE stream. Yeah. And that would be the raw events + +00:51:29.900 --> 00:51:30.720 +that you would send down. + +00:51:31.140 --> 00:51:32.860 +But if you look at the next one + +00:51:33.040 --> 00:51:34.980 +where we have a Python example, + +00:51:35.300 --> 00:51:35.940 +you would see like, + +00:51:36.340 --> 00:51:37.420 +well, how do you do that in Python + +00:51:38.260 --> 00:51:39.240 +without actually writing, + +00:51:39.720 --> 00:51:40.980 +you know, the raw format out? + +00:51:41.560 --> 00:51:43.000 +And that's how you would do it there + +00:51:43.200 --> 00:51:44.520 +using the Python SDK. + +00:51:44.660 --> 00:51:45.640 +Let's dive in a little bit + +00:51:45.760 --> 00:51:47.860 +to the SDK itself. + +00:51:48.260 --> 00:51:49.980 +So I got so many things open. + +00:51:51.820 --> 00:51:52.260 +Hold on. + +00:51:52.440 --> 00:51:53.240 +We got another link for you. + +00:51:53.360 --> 00:51:53.700 +No, I'm kidding. + +00:51:54.740 --> 00:51:54.980 +You know what? + +00:51:55.020 --> 00:51:55.640 +I'm just going to go. + +00:51:55.800 --> 00:51:56.500 +I'm going from the homepage. + +00:51:56.940 --> 00:51:57.260 +There we go. + +00:51:57.400 --> 00:51:57.640 +There you go. + +00:51:57.820 --> 00:51:59.240 +Chris, maybe you could talk us through this. + +00:51:59.360 --> 00:52:03.480 +I think before I throw it to you, though, yeah, there's a lot of framework support here. 
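
Before moving on to the SDKs: the raw stream being described can be sketched as a tiny formatter. The `datastar-patch-elements` event name and `elements` data prefix follow the on-screen example as described in the episode; treat the exact field names as something to verify against the Datastar SSE reference.

```python
def patch_elements_event(html: str) -> str:
    """Format one server-sent event asking Datastar to patch elements.

    Sketch of the wire format discussed above: an `event:` line naming the
    event type, a `data:` line per line of HTML, and a blank line ending
    the event. Field names are per the episode's description; verify them
    against the Datastar docs before relying on them.
    """
    out = ["event: datastar-patch-elements"]
    for row in html.strip().splitlines():
        out.append(f"data: elements {row}")
    # SSE events are terminated by a blank line.
    return "\n".join(out) + "\n\n"
```

An SDK helper like the Python one discussed next generates this kind of text for you, so you rarely write it by hand.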
+
+00:52:03.540 --> 00:52:08.020
+So if you're a Django person, a FastAPI person, even FastHTML, it's interesting,
+
+00:52:08.360 --> 00:52:10.240
+Litestar, Quart, Sanic, or Starlette.
+
+00:52:10.800 --> 00:52:15.520
+There's a bunch of different ones here, but maybe just talk us through this, if you will, Chris.
+
+00:52:16.000 --> 00:52:16.540
+I'm trying to remember.
+
+00:52:16.680 --> 00:52:20.180
+I'm not as familiar with the example, but as you can see,
+
+00:52:20.480 --> 00:52:22.560
+this one method is, I think, where the magic happens.
+
+00:52:22.940 --> 00:52:24.600
+I'm trying to remember which tool.
+
+00:52:24.780 --> 00:52:25.680
+This is Quart.
+
+00:52:25.900 --> 00:52:27.720
+Yeah, the Quart, the examples are in Quart.
+
+00:52:27.900 --> 00:52:34.860
+So, you know, they first define a route, a home route slash, and it returns HTML and it's just in the string there.
+
+00:52:34.970 --> 00:52:35.740
+Right. And then that.
+
+00:52:35.890 --> 00:52:37.940
+This could be a Jinja or Chameleon or whatever template.
+
+00:52:38.010 --> 00:52:39.220
+Like it's just whatever. It doesn't matter.
+
+00:52:39.480 --> 00:52:40.380
+But somehow they get it. Yeah.
+
+00:52:40.580 --> 00:52:42.340
+Makes the example easier to see in one go.
+
+00:52:42.820 --> 00:52:54.280
+And obviously you see that it's pulling Datastar from the CDN, and then on the load, it sends a request to the slash updates endpoint.
+
+00:52:54.840 --> 00:52:56.360
+See what comes from that.
+
+00:52:56.980 --> 00:53:02.840
+And so down below that, you have the slash updates endpoint, which has a decorator called datastar_response.
+
+00:53:03.420 --> 00:53:10.260
+And that just does a couple of nice things like sets the HTTP headers and whatnot to be the server-sent event protocol.
+
+00:53:10.720 --> 00:53:15.880
+And then, what I like is, the first line says signals equals await read_signals.
+
+00:53:16.440 --> 00:53:21.140
+And so that's another helper that essentially says when I have a request coming in,
+
+00:53:21.580 --> 00:53:25.260
+Datastar has a specific way of sending the state of the front end to the back end.
+
+00:53:25.280 --> 00:53:26.780
+So the back end can do whatever it needs.
+
+00:53:26.940 --> 00:53:35.460
+Right. We haven't even talked about signals yet. They're like kind of a data binding set of JavaScript data that loads, you know, reactive data loads on the front end, right?
+
+00:53:35.500 --> 00:53:37.680
+In some ways, the Alpine.js kind of,
+
+00:53:38.470 --> 00:53:39.160
+I don't want to say equivalent,
+
+00:53:39.390 --> 00:53:41.580
+but it covers similar functionality.
+
+00:53:41.990 --> 00:53:43.660
+And so if you have data on the front end
+
+00:53:44.100 --> 00:53:45.480
+that the backend would like to know,
+
+00:53:45.520 --> 00:53:47.260
+that's an easy way to get it.
+
+00:53:47.530 --> 00:53:48.940
+And then essentially what happens
+
+00:53:49.010 --> 00:53:50.940
+is we get into this loop, this while true loop,
+
+00:53:51.160 --> 00:53:53.300
+and Datastar will just start sending down
+
+00:53:53.410 --> 00:53:55.800
+server-sent events in text
+
+00:53:56.080 --> 00:53:59.460
+by using the sse.patch_elements function,
+
+00:53:59.680 --> 00:54:00.840
+or I guess it's a method technically.
+
+00:54:01.580 --> 00:54:03.880
+And all it's doing is sending a string
+
+00:54:03.880 --> 00:54:07.320
+that has the current datetime.now in ISO format down.
+
+00:54:07.730 --> 00:54:10.280
+And then we wait, we sleep for a second,
+
+00:54:10.460 --> 00:54:11.160
+or is it a second?
+
+00:54:11.470 --> 00:54:12.380
+I guess it's a microsecond.
+
+00:54:12.450 --> 00:54:13.100
+I keep forgetting which one.
+
+00:54:13.100 --> 00:54:14.400
+Yeah, that's a millisecond.
+
+00:54:14.540 --> 00:54:15.300
+No, no, it's second.
+
+00:54:15.440 --> 00:54:16.180
+And sleep is seconds.
+
+00:54:16.340 --> 00:54:16.880
+Sleep is seconds.
+
+00:54:17.280 --> 00:54:18.180
+It takes a float.
+
+00:54:18.320 --> 00:54:20.880
+So once it sleeps, it sends another server-sent event.
+
+00:54:21.640 --> 00:54:24.600
+This time, instead of sending the HTML down,
+
+00:54:24.980 --> 00:54:26.680
+we're sending a signal.
+
+00:54:27.030 --> 00:54:28.520
+So essentially changing, say, you can say,
+
+00:54:28.660 --> 00:54:31.320
+like the JavaScript value or signal data
+
+00:54:31.830 --> 00:54:32.620
+on the front end of the page.
+
+00:54:32.780 --> 00:54:33.080
+Right, right.
+
+00:54:33.200 --> 00:54:37.800
+So it's showing that you can send the HTML and let Datastar patch it, or you can basically
+
+00:54:38.240 --> 00:54:43.200
+from the server set one of these signal things that will be reactive on the front end, right?
+
+00:54:43.390 --> 00:54:43.520
+Yeah.
+
+00:54:43.520 --> 00:54:44.480
+You said it much better than I did.
+
+00:54:45.760 --> 00:54:45.900
+Thanks.
+
+00:54:45.900 --> 00:54:47.940
+It's a long way of saying it's a clock, right?
+
+00:54:48.120 --> 00:54:48.220
+Yeah.
+
+00:54:48.440 --> 00:54:51.720
+Also the thing, just for people that aren't used to thinking about this way, especially
+
+00:54:51.900 --> 00:54:56.100
+if you're doing Python, like all a signal is, is instead of saying that here's, I'm
+
+00:54:56.210 --> 00:55:00.099
+setting a variable, you're saying I'm setting a relationship that says like, kind of like
+
+00:55:00.120 --> 00:55:03.660
+in an Excel document when you set a formula for a cell,
+
+00:55:04.000 --> 00:55:04.540
+it's the same idea.
+
+00:55:04.590 --> 00:55:05.700
+You're setting up a relationship saying,
+
+00:55:05.770 --> 00:55:08.080
+when this thing and this thing changes, update this.
+
+00:55:08.170 --> 00:55:10.520
+And it does smart things to do that efficiently.
+
+00:55:10.890 --> 00:55:12.280
+But the idea is it's a relationship.
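
Pulling the clock walkthrough together: here is a framework-free sketch of what the endpoint's loop emits each tick, first an element patch, then a signal patch. In a real app, the datastar-py helpers mentioned above (the datastar_response decorator, read_signals, and sse.patch_elements) produce these events; the event names, the `time` id, and the `currentTime` signal below are illustrative assumptions, not quotes from the show.

```python
import datetime
import json

def clock_events(ticks: int):
    """Yield SSE text for the clock described above: each tick patches a
    div with the current time, then patches a reactive signal. Sketch only;
    a real endpoint would use the datastar-py SDK and sleep between ticks.
    """
    for _ in range(ticks):
        now = datetime.datetime.now().isoformat()
        # Element patch: Datastar morphs this into the div with the same id.
        yield (
            "event: datastar-patch-elements\n"
            f'data: elements <div id="time">{now}</div>\n\n'
        )
        # Signal patch: updates reactive state on the front end, no HTML.
        yield (
            "event: datastar-patch-signals\n"
            f"data: signals {json.dumps({'currentTime': now})}\n\n"
        )
        # A real endpoint would `await asyncio.sleep(1)` here; as discussed,
        # the argument is seconds, taken as a float.
```

Because a signal is a relationship rather than a plain variable, patching `currentTime` is enough for anything bound to it on the page to update.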
+ +00:55:12.580 --> 00:55:13.220 +It's declarative. + +00:55:13.670 --> 00:55:14.700 +So kind of like with SQL, + +00:55:15.280 --> 00:55:17.600 +you think of SQL as a declarative language, right? + +00:55:17.680 --> 00:55:18.980 +You don't care how it creates an index. + +00:55:19.070 --> 00:55:20.120 +You just say, create index. + +00:55:20.620 --> 00:55:21.500 +Same thing happens here. + +00:55:21.570 --> 00:55:23.880 +You just say, hey, I want when this thing changes, + +00:55:24.110 --> 00:55:24.820 +this other thing to change. + +00:55:25.250 --> 00:55:26.880 +And the problem is that declarativeness + +00:55:27.040 --> 00:55:28.100 +is not built into JavaScript. + +00:55:28.400 --> 00:55:29.380 +It's not built into the browser, + +00:55:29.440 --> 00:55:32.380 +but we just made the web a little bit more declarative. + +00:55:32.800 --> 00:55:33.580 +That's all we did, basically. + +00:55:33.700 --> 00:55:33.780 +Right. + +00:55:34.320 --> 00:55:36.020 +Declarative is generally pretty good. + +00:55:36.260 --> 00:55:37.160 +It's a good way to work. + +00:55:37.300 --> 00:55:38.040 +It keeps things simple + +00:55:38.240 --> 00:55:40.280 +and lets the underlying system have at it. + +00:55:40.760 --> 00:55:42.140 +So a couple of things, + +00:55:42.880 --> 00:55:44.260 +well, we still got a little bit of time, + +00:55:44.440 --> 00:55:46.260 +but to wrap things up a little bit. + +00:55:46.560 --> 00:55:46.800 +Editors. + +00:55:47.200 --> 00:55:49.080 +I think having good editor support + +00:55:49.840 --> 00:55:51.840 +is really important for adoption. + +00:55:52.100 --> 00:55:53.220 +You know, drives me crazy + +00:55:53.720 --> 00:55:56.220 +when I go and try to work with JavaScript, CSS, + +00:55:56.800 --> 00:55:57.520 +attributes or whatever, + +00:55:57.720 --> 00:55:59.120 +and I'm like, they're not here. + +00:55:59.200 --> 00:55:59.720 +No help. 
+
+00:56:00.260 --> 00:56:06.760
+So you all have nice extensions and plugins for common editors Python people might use, right?
+
+00:56:06.860 --> 00:56:12.040
+Yeah, we have VS Code, which you're seeing here, and PhpStorm.
+
+00:56:12.560 --> 00:56:16.140
+Or sorry, I use PhpStorm, but all JetBrains editors.
+
+00:56:16.580 --> 00:56:19.580
+PhpStorm, PyCharm, WebStorm, all of them things.
+
+00:56:19.740 --> 00:56:25.180
+So it's in the JetBrains marketplace, so it'll work for, I believe, all JetBrains IDEs.
+
+00:56:25.360 --> 00:56:25.740
+I believe so.
+
+00:56:25.850 --> 00:56:28.440
+You also have the AI editors covered.
+
+00:56:28.770 --> 00:56:29.000
+Do we?
+
+00:56:29.180 --> 00:56:37.620
+In the Open VSX registry, all the ones that have been kicked out from VS Code, this is where they all have to go to get their installs, right?
+
+00:56:37.740 --> 00:56:40.060
+That explains why people requested this from me.
+
+00:56:41.519 --> 00:56:49.060
+Yeah, if you're doing Cursor, Antigravity, Windsurf, like all those things, they were all kicked out of the VS Code registry.
+
+00:56:50.640 --> 00:56:51.480
+That's not a complaint.
+
+00:56:51.570 --> 00:56:53.060
+I mean, it's a Microsoft product.
+
+00:56:53.070 --> 00:56:53.560
+They built it.
+
+00:56:54.050 --> 00:56:55.880
+They don't have to build all the other ones.
+
+00:56:56.020 --> 00:56:57.460
+But that's why they're here, right?
+
+00:56:57.560 --> 00:56:58.580
+We keep those up to date.
+
+00:56:58.980 --> 00:57:01.320
+We do those ourselves, the SDKs.
+
+00:57:01.330 --> 00:57:04.080
+I mean, Delaney wrote the Go one, I wrote the PHP one,
+
+00:57:04.110 --> 00:57:06.660
+and the rest are just community contributions.
+
+00:57:07.280 --> 00:57:11.040
+We've had contributions to these too, to the IDE extensions.
+
+00:57:11.680 --> 00:57:13.340
+We maintain these primarily.
+
+00:57:13.760 --> 00:57:14.360
+Yeah, and these are great.
+
+00:57:14.590 --> 00:57:19.480
+These just, you know, save on typing, but more importantly, save on making typos.
+
+00:57:20.100 --> 00:57:22.760
+You know, they show you all of the available data attributes.
+
+00:57:23.120 --> 00:57:26.120
+Maybe Chris can speak more to is that the irony is, though,
+
+00:57:26.300 --> 00:57:29.220
+you won't need that many tags to actually do your work.
+
+00:57:29.330 --> 00:57:30.960
+So it's not like a Tailwind thing where you're like,
+
+00:57:31.140 --> 00:57:33.320
+oh, I rely on it to autocomplete.
+
+00:57:33.520 --> 00:57:33.860
+It's just...
+
+00:57:33.860 --> 00:57:34.220
+Yeah, absolutely.
+
+00:57:34.540 --> 00:57:35.800
+In fact, it's one of those things
+
+00:57:35.890 --> 00:57:38.140
+where I discover more things I can do with Datastar
+
+00:57:38.280 --> 00:57:40.500
+because as I'm typing data dash and I'm like,
+
+00:57:40.710 --> 00:57:42.600
+oh, I didn't actually remember
+
+00:57:42.780 --> 00:57:45.800
+that there's an attribute to do whatever it is.
+
+00:57:45.800 --> 00:57:46.240
+I don't remember.
+
+00:57:46.440 --> 00:57:47.620
+Like, I can't remember at this point.
+
+00:57:48.000 --> 00:57:49.860
+And then I went to the documentation like,
+
+00:57:49.960 --> 00:57:50.640
+oh, check this out.
+
+00:57:50.760 --> 00:57:51.740
+This is so much more I can do.
+
+00:57:52.100 --> 00:57:54.080
+But yeah, I find I love the plugin,
+
+00:57:54.280 --> 00:57:55.460
+but I find I don't use it too much
+
+00:57:55.480 --> 00:57:58.160
+just because I'm not writing as much HTML with it.
+
+00:57:58.280 --> 00:58:01.680
+While I'm sitting here on this Open VSX registry,
+
+00:58:02.000 --> 00:58:05.420
+do you all have advice for making Datastar work well
+
+00:58:05.580 --> 00:58:08.500
+with agentic AI and Claude Code, Cursor, et cetera?
+
+00:58:08.740 --> 00:58:10.280
+There's some active research going on
+
+00:58:10.440 --> 00:58:13.300
+like in Oslo at a college
+
+00:58:13.840 --> 00:58:15.900
+that's doing, ironically, using Datastar
+
+00:58:16.720 --> 00:58:21.940
+to do some stuff around like how LLMs work with code bases.
+
+00:58:22.120 --> 00:58:24.440
+And the reason why is because the entire code base
+
+00:58:24.380 --> 00:58:28.320
+fits in basically every context, even the nano ones, like the entire code base fits there.
+
+00:58:28.620 --> 00:58:33.340
+And what they found, we've gone back and forth a bit, is that almost all of them are completely
+
+00:58:33.540 --> 00:58:37.840
+overfitted. So if you just want to make a website with agentic stuff, go do React,
+
+00:58:38.260 --> 00:58:43.020
+because that's what it's built for. And it's overfitted to such a degree that if you try to
+
+00:58:43.420 --> 00:58:49.160
+use the spec correctly and to say, here's all the source of Datastar, go use it to build websites,
+
+00:58:49.380 --> 00:58:51.500
+it will fall over almost in every regard.
+
+00:58:51.840 --> 00:58:54.760
+So it's one of those things where you don't need that much,
+
+00:58:54.940 --> 00:58:59.420
+but it will ironically show you how bad things like Claude and Codex and stuff
+
+00:58:59.600 --> 00:59:02.680
+are at just using the current context to solve things.
+
+00:59:03.080 --> 00:59:05.480
+Hopefully that gets better, but we have something around,
+
+00:59:05.840 --> 00:59:08.440
+like we have a slash docs page that you can feed into your LLM,
+
+00:59:08.680 --> 00:59:11.500
+but I'll say that we do not focus on that at all
+
+00:59:11.740 --> 00:59:14.900
+because you're basically fighting against the training that already happened.
+
+00:59:15.160 --> 00:59:17.820
+So you're better off, like if you want to use,
+
+00:59:18.140 --> 00:59:19.060
+if you want to make better sites,
+
+00:59:19.100 --> 00:59:20.080
+you want to be fast and efficient
+
+00:59:20.140 --> 00:59:20.720
+and all that stuff,
+
+00:59:21.120 --> 00:59:22.200
+we're 100% the right thing to do.
+
+00:59:22.380 --> 00:59:24.360
+If you just want to like one shot something,
+
+00:59:24.780 --> 00:59:26.880
+go use React and stay in that world.
+
+00:59:26.940 --> 00:59:27.820
+You want to vibe code it?
+
+00:59:28.340 --> 00:59:29.120
+Hey, I've got something.
+
+00:59:29.220 --> 00:59:32.200
+I feel like this might resonate with you, Delaney,
+
+00:59:32.420 --> 00:59:33.640
+especially the way you just described it.
+
+00:59:34.040 --> 00:59:37.740
+Have you all seen the Kai Lintit Senior Engineer
+
+00:59:37.980 --> 00:59:38.660
+Tries Vibe Coding?
+
+00:59:39.080 --> 00:59:41.280
+This is an amazing video.
+
+00:59:41.660 --> 00:59:43.060
+And like half of the video is like,
+
+00:59:43.480 --> 00:59:45.400
+no, no, no, not npm install.
+
+00:59:46.100 --> 00:59:46.520
+What are you doing?
+
+00:59:47.180 --> 00:59:52.600
+It reminds me very much of like, it's just like, nope, that's not what I told you to do.
+
+00:59:52.660 --> 00:59:54.740
+I know that's what you think the most common thing is, please stop.
+
+00:59:54.860 --> 00:59:54.940
+Yeah.
+
+00:59:55.080 --> 00:59:58.880
+And the thing is that it's not that I actually like a lot of the stuff, but I treat it as
+
+00:59:58.900 --> 01:00:03.500
+an autocomplete or like it can write code faster than I can when it comes to like, hey, change
+
+01:00:03.500 --> 01:00:04.920
+this in 27 different places.
+
+01:00:05.400 --> 01:00:06.580
+And I forget which files I did it like.
+
+01:00:06.960 --> 01:00:10.080
+There's value to it, but people are trying to use it to learn.
+
+01:00:10.620 --> 01:00:12.800
+It's a complete, it actively is working against you.
+
+01:00:13.180 --> 01:00:15.300
+Ben has done an amazing job with the guide.
+
+01:00:15.780 --> 01:00:20.180
+Like, please, like, it's fine to use the LLMs to help, like, guide your process and to, like,
+
+01:00:20.660 --> 01:00:22.120
+knock stuff out quickly once you have a baseline.
+
+01:00:22.640 --> 01:00:24.160
+But you have to know when to say no.
+
+01:00:24.720 --> 01:00:25.580
+And he has done it.
+
+01:00:25.770 --> 01:00:29.760
+The guide takes half an hour, less than a half, like, probably 15, 20 minutes to read
+
+01:00:29.940 --> 01:00:31.840
+and then, like, an hour to actually work through.
+
+01:00:32.320 --> 01:00:35.580
+Please try it first before you try to throw it at the LLMs.
+
+01:00:35.680 --> 01:00:36.960
+It's not because I hate them.
+
+01:00:37.080 --> 01:00:42.300
+It's more that they are just overfit to the, like, the sea of badly written SPA code.
+
+01:00:42.700 --> 01:00:44.820
+That's, unfortunately, that's the situation we're in.
+
+01:00:45.020 --> 01:00:49.260
+Yeah, especially with JavaScript, the agentic AI is very trained.
+
+01:00:49.700 --> 01:00:50.600
+It wants what it wants.
+
+01:00:51.280 --> 01:00:51.560
+All right.
+
+01:00:52.220 --> 01:00:55.760
+Let's talk, speaking of being near the guide,
+
+01:00:55.760 --> 01:00:59.460
+if I go over here to more, there's a pro section.
+
+01:00:59.980 --> 01:01:01.740
+I'll let you all give a shout out to pro.
+
+01:01:02.300 --> 01:01:04.440
+I know you have a really strong sales pitch here.
+
+01:01:04.980 --> 01:01:06.440
+You were talking about earlier.
+
+01:01:06.940 --> 01:01:08.380
+Now, what is this data, Datastar Pro?
+
+01:01:08.840 --> 01:01:13.800
+It's been about a year since we released the beta one of Datastar.
+
+01:01:14.420 --> 01:01:23.180
+We are taking our sweet ass time for a very good reason, which is that we want version one to be the last version or like the last major version.
+
+01:01:23.820 --> 01:01:30.460
+We don't really want to force people through breaking changes and major updates because that's really just a pain.
+
+01:01:30.630 --> 01:01:35.820
+And I think like Python has done a great job with that and Go as well.
+
+01:01:36.060 --> 01:01:39.240
+And like there are some ecosystems where you just don't make breaking changes.
+
+01:01:39.560 --> 01:01:41.320
+That's the norm. And that's what we want to be.
+
+01:01:41.440 --> 01:01:44.840
+And the JavaScript ecosystem is, you know, the antithesis to that.
+
+01:01:45.800 --> 01:01:46.980
+They're like, here, hold my beer.
+
+01:01:47.100 --> 01:01:48.020
+I'll show you breaking changes.
+
+01:01:48.300 --> 01:01:48.520
+Yeah.
+
+01:01:48.660 --> 01:01:49.560
+Have you heard of left-pad?
+
+01:01:49.700 --> 01:01:53.240
+To give you an idea of how far we take that, we don't have npm.
+
+01:01:53.740 --> 01:01:55.680
+Like we don't actually even submit to npm.
+
+01:01:55.810 --> 01:01:59.380
+We have no package.json in our JavaScript framework.
+
+01:01:59.840 --> 01:02:01.900
+We actually, like, there's none of that stuff.
+
+01:02:02.090 --> 01:02:03.480
+It does not exist in our ecosystem.
+
+01:02:03.480 --> 01:02:07.460
+So we take it very seriously when we say it's funny to have a JavaScript framework that
+
+01:02:07.800 --> 01:02:09.560
+actively hates the JavaScript ecosystem.
+
+01:02:09.940 --> 01:02:11.620
+And you guys, I think also it's worth pointing out
+
+01:02:11.720 --> 01:02:13.540
+that you don't have a strong build step,
+
+01:02:14.120 --> 01:02:17.080
+tree shaking, web packing story, right?
+
+01:02:17.080 --> 01:02:17.840
+You just dropped-
+
+01:02:17.860 --> 01:02:18.940
+No, we do.
+
+01:02:18.940 --> 01:02:21.740
+The thing is that, like for example,
+
+01:02:21.740 --> 01:02:25.280
+Vite is kind of the well-known way to do this stuff.
+
+01:02:25.280 --> 01:02:25.820
+But guess what?
+
+01:02:25.820 --> 01:02:27.400
+Under the hood, it uses esbuild.
+
+01:02:27.400 --> 01:02:29.280
+And esbuild is a Go thing.
+
+01:02:29.280 --> 01:02:30.400
+We build a lot of our stuff in Go.
+
+01:02:30.400 --> 01:02:32.100
+So it's literally embedded in our,
+
+01:02:32.100 --> 01:02:33.960
+like we just use esbuild directly.
+
+01:02:33.960 --> 01:02:36.620
+We don't need 20,000 things from npm.
+
+01:02:36.620 --> 01:02:38.559
+We just use the Go tools inside of our binary
+
+01:02:38.580 --> 01:02:40.020
+because that's the fast thing to do.
+
+01:02:40.440 --> 01:02:41.860
+So we don't need all of that.
+
+01:02:41.870 --> 01:02:43.420
+So we have no dependencies, nothing,
+
+01:02:43.690 --> 01:02:46.360
+and we don't even use npm or any of that at all.
+
+01:02:46.720 --> 01:02:48.300
+Yeah, so the reason I mentioned the beta
+
+01:02:48.560 --> 01:02:51.200
+is during the beta phase, which lasted about six months,
+
+01:02:51.640 --> 01:02:53.920
+we, Datastar gained a lot of traction,
+
+01:02:54.250 --> 01:02:56.500
+a lot of interest, and people had a lot of requests.
+
+01:02:56.800 --> 01:02:58.520
+And we were like, yeah, we see,
+
+01:02:58.960 --> 01:02:59.960
+and because it's plugin-based,
+
+01:03:00.170 --> 01:03:01.660
+you can always just add another plugin.
+
+01:03:01.730 --> 01:03:02.600
+You can add it yourself,
+
+01:03:03.080 --> 01:03:05.520
+or we can build a plugin and add it to Datastar.
+
+01:03:05.960 --> 01:03:15.260
+But we were very adamant about keeping the open source Datastar framework as tight as possible.
+
+01:03:15.560 --> 01:03:18.280
+Like I said, it should do everything you need, but nothing you don't.
+
+01:03:18.420 --> 01:03:20.860
+So how do we do that while adding plugins?
+
+01:03:21.540 --> 01:03:26.720
+So during that beta phase, we started thinking about, well, do we have multiple versions of Datastar?
+
+01:03:26.780 --> 01:03:28.260
+Do we have a marketplace of plugins?
+ +01:03:28.470 --> 01:03:29.480 +Or how do we manage that? + +01:03:30.080 --> 01:03:35.880 +And at the same time, we were also asking ourselves, because Delaney and I both, we have full-time things that we're doing. + +01:03:35.920 --> 01:04:05.680 +And this is a side project, but we're almost doing it full time alongside our other full time things. So how do we make this project sustainable? Because it doesn't stop at Datastar. You probably see Rocket and Stellar CSS on that page in the navigation sidebar. Those are like projects that build on top of Datastar. So Datastar is just the foundation. And Rocket kind of takes it to web components and Stellar CSS is a CSS framework that builds on top of these concepts. + +01:04:05.820 --> 01:04:09.800 +So we're trying to fix not only JavaScript web components, but also CSS. + +01:04:10.560 --> 01:04:12.160 +So we have a long-term vision. + +01:04:12.820 --> 01:04:18.380 +How do we make that sustainable when we're both busy people anyway, and this just takes + +01:04:18.920 --> 01:04:21.900 +so much of our time and the project appears to be growing? + +01:04:22.600 --> 01:04:25.720 +So at that point, we decided, well, how do we want to even run this? + +01:04:25.880 --> 01:04:31.360 +So we decided we don't want to found some company and do VC like we're, if anything, + +01:04:31.830 --> 01:04:32.840 +anti-VC funding. + +01:04:32.920 --> 01:04:43.700 +So we founded a nonprofit organization in the U.S. called Star Federation, and that's what backs this project, including Rocket and Stellar CSS. + +01:04:44.780 --> 01:04:57.680 +To help fund that organization, we decided let's have Datastar be the open source framework, but then something called Datastar Pro, which is like all those plugins that we think are good ideas, but that most people don't need. + +01:04:58.010 --> 01:05:01.460 +We'll put those into Datastar Pro, and that can kind of grow over time. 
+
+01:05:01.740 --> 01:05:07.120
+It's a collection of plugins that you might want if you're using Datastar in a professional setting.
+
+01:05:07.520 --> 01:05:11.280
+But, you know, if you're just using Datastar, you don't actually need it.
+
+01:05:11.580 --> 01:05:12.520
+And so that's what we tell people.
+
+01:05:12.680 --> 01:05:13.700
+Most people don't need it.
+
+01:05:13.860 --> 01:05:19.380
+It's a collection of plugins, and there's a Datastar inspector, which sits on your page.
+
+01:05:19.900 --> 01:05:25.840
+You get access to the bundler and now you get access to Rocket and Stellar CSS, which is a work in progress.
+
+01:05:26.720 --> 01:05:28.220
+Yeah, that was, I think, a good decision.
+
+01:05:28.520 --> 01:05:35.400
+Like there was definitely some uproar initially that, you know, some plugins were taken away, but those plugins were never taken away.
+
+01:05:35.480 --> 01:05:38.080
+They still exist in the repo if anybody needs them.
+
+01:05:38.540 --> 01:05:46.040
+What the result is, is that we have like some money coming into a bank account, which is not even used to pay maintainers.
+
+01:05:46.760 --> 01:05:52.500
+We use that for running costs and like for, you know, podcasting software.
+
+01:05:53.000 --> 01:05:55.940
+And if we need to travel to conferences, which we've yet to do.
+
+01:05:56.120 --> 01:06:05.400
+But essentially, it's like a way of having some money into the bank so that we can justify all of the work that we do in maintaining Datastar and pushing that forward.
+
+01:06:06.300 --> 01:06:07.300
+But the V1 thing.
+
+01:06:07.600 --> 01:06:14.440
+100% free sounds great until that means it becomes abandoned where, you know, and like people can't work on it anymore.
+
+01:06:14.740 --> 01:06:16.660
+And I think it's fair.
+
+01:06:16.860 --> 01:06:21.380
+There's one thing that's kind of interesting about the model, because especially with the Tailwind stuff that's been going on lately.
+
+01:06:21.840 --> 01:06:26.260
+One of the things that we talked about, and people get very angry about this, but for example,
+
+01:06:26.660 --> 01:06:30.820
+Rocket is a web component layer that you basically just write Datastar in a declarative way,
+
+01:06:31.320 --> 01:06:34.300
+and it dynamically generates web components for you on the fly.
+
+01:06:34.370 --> 01:06:36.460
+And it's a great way to build web components.
+
+01:06:36.620 --> 01:06:38.080
+It'll save you tons of hours.
+
+01:06:38.540 --> 01:06:39.560
+And people won't pay for features.
+
+01:06:39.630 --> 01:06:40.320
+They pay for convenience.
+
+01:06:40.920 --> 01:06:45.860
+So the thing is, people said, well, I want you to generate out the content and make that
+
+01:06:46.060 --> 01:06:46.620
+open and available.
+
+01:06:46.960 --> 01:06:51.580
+And we said no, because basically the way we look at it is that almost like PICO-8,
+
+01:06:51.780 --> 01:06:54.000
+or any kind of game engine, you pay for the game
+
+01:06:54.140 --> 01:06:55.140
+and then all the mods are free.
+
+01:06:55.180 --> 01:06:57.940
+So all the Rocket components and all this stuff
+
+01:06:58.000 --> 01:07:01.060
+is gonna be free, but the core engine is not free.
+
+01:07:01.520 --> 01:07:02.340
+It's a paid thing.
+
+01:07:02.380 --> 01:07:04.340
+And the reason why is if it becomes successful,
+
+01:07:04.580 --> 01:07:06.820
+if we do our job and we make it so it's easy for everybody,
+
+01:07:07.300 --> 01:07:09.980
+the Star Federation will do better over time.
+
+01:07:10.100 --> 01:07:12.340
+Whereas Tailwind's model of they're competing
+
+01:07:12.480 --> 01:07:14.580
+against every other person in that space,
+
+01:07:15.240 --> 01:07:16.680
+whereas it just does not work.
+ +01:07:16.800 --> 01:07:18.480 +So our thing is if we do get successful + +01:07:18.980 --> 01:07:20.340 +and we do get more people, + +01:07:20.740 --> 01:07:23.600 +then it's self-sustaining as in you paid for this little engine + +01:07:24.100 --> 01:07:27.400 +and now you get all the ecosystem around it of open source. + +01:07:27.710 --> 01:07:28.940 +So you can do open source in a way, + +01:07:29.040 --> 01:07:32.320 +but you have to find a core engine that is not open source. + +01:07:32.520 --> 01:07:34.420 +Otherwise it will fail in the modern world. + +01:07:34.620 --> 01:07:37.500 +Let's close this thing out with two super quick things + +01:07:37.640 --> 01:07:38.360 +because I know we're over time. + +01:07:38.800 --> 01:07:39.140 +Roadmap. + +01:07:39.700 --> 01:07:42.400 +Ben, you talked about taking your sweet time to 1.0. + +01:07:42.880 --> 01:07:44.400 +Is there a forward-looking roadmap? + +01:07:44.740 --> 01:07:46.400 +Are you guys done or what are things? + +01:07:46.740 --> 01:07:50.380 +The release counted RC1, I think it was about six months ago. + +01:07:51.200 --> 01:07:54.240 +and like the release has just been slowing down, slowing down. + +01:07:54.240 --> 01:07:57.360 +So that stagnation in like just releases with fixes + +01:07:57.960 --> 01:08:00.380 +is a good sign to me that we're very, very close. + +01:08:00.800 --> 01:08:04.760 +At this point, like the switch from release candidate to stable + +01:08:05.000 --> 01:08:08.360 +is just literally like just, you know, dropping the RC. + +01:08:09.040 --> 01:08:11.520 +There's no like features that are going into it. + +01:08:11.640 --> 01:08:12.640 +There's no big changes. + +01:08:13.240 --> 01:08:15.060 +We're taking our time because like I said, + +01:08:15.320 --> 01:08:18.580 +it's easy to put something up slightly prematurely + +01:08:18.720 --> 01:08:19.839 +and get some defaults wrong. 
+
+01:08:20.160 --> 01:08:27.160
+I mean, that's what happened with HTMX 2 was just like fixing some defaults that they decided they got wrong in version one.
+
+01:08:27.359 --> 01:08:29.980
+So we're trying to avoid a situation like that.
+
+01:08:30.380 --> 01:08:35.500
+And the only way to do that is to just let it simmer, let people use it, let people dog food it.
+
+01:08:35.740 --> 01:08:46.359
+And us, too, we're actively using Datastar for many projects ourselves and discovering every now and then, oh, this default is probably wrong. We're trying to avoid foot guns.
+
+01:08:46.660 --> 01:08:55.480
+So we're trying to make it so that the defaults give you the best possible experience, that you need zero configuration, ideally, but you can configure as needed.
+
+01:08:55.910 --> 01:09:05.440
+But getting those defaults right is really the only thing stopping us from, not right, but locked down is the only thing stopping us from a V1 stable.
+
+01:09:05.960 --> 01:09:07.020
+I don't like to give timelines.
+
+01:09:07.380 --> 01:09:12.160
+In fact, it's one of our things that I tell Delaney, never promise a timeline.
+
+01:09:12.759 --> 01:09:16.160
+But I could see us in the first half of this year,
+
+01:09:16.540 --> 01:09:17.859
+just flipping the switch.
+
+01:09:18.040 --> 01:09:20.060
+But it sounds like you might be able to use the RC
+
+01:09:20.279 --> 01:09:22.299
+and you might more or less be safe, yeah.
+
+01:09:22.380 --> 01:09:25.200
+In fact, we recommend people rename the RC
+
+01:09:25.440 --> 01:09:28.480
+and they change the name to React-Foo
+
+01:09:28.880 --> 01:09:30.560
+so that they just drop it in their React projects
+
+01:09:30.920 --> 01:09:34.060
+because the entire framework is smaller than most components.
+
+01:09:34.580 --> 01:09:36.160
+Just start hiding it places.
+
+01:09:36.700 --> 01:09:37.900
+Yeah, don't even name it Datastar.
+
+01:09:38.180 --> 01:09:40.339
+A stealth takeover of the spa world.
+ +01:09:41.000 --> 01:09:42.020 +Awesome. I love it. + +01:09:42.380 --> 01:09:45.980 +All right, let's wrap up the show with a final call to action + +01:09:46.860 --> 01:09:50.259 +for people who want to use Datastar, learn more, get started. + +01:09:50.799 --> 01:09:54.460 +Chris, I'll let you go first so Ben and Delaney can have the final word. + +01:09:54.500 --> 01:09:58.460 +The first thing I was thinking of is because I get asked so much about + +01:09:59.420 --> 01:10:02.840 +how long it takes to connect to the server and things like that, + +01:10:03.240 --> 01:10:07.920 +there is a portion in the DjangoCon talk I gave in, I think it was 2023, + +01:10:08.480 --> 01:10:14.180 +where I showed a video of five phones, five Android phones, trying to do the same thing, + +01:10:14.380 --> 01:10:20.060 +shopping for eggs, I believe it was. And essentially one of them is an HTML driven + +01:10:20.740 --> 01:10:25.540 +multi-page app and it smokes the single page applications, the native apps and everything. + +01:10:25.920 --> 01:10:30.400 +And so I put a link in our chat. Maybe you'll be a part of the show notes. It's a deep link to go + +01:10:30.560 --> 01:10:36.639 +straight to that portion of the talk because it is like that video reminded me like, this is what + +01:10:36.660 --> 01:10:43.520 +I want to build. I want to build websites that are fun for people to use. And, you know, Datastar + +01:10:43.980 --> 01:10:50.260 +enables me to use real-time interactions with way less complexity than I ever thought + +01:10:50.680 --> 01:10:55.060 +possible. So I guess the two things I would say is, one, check out the deep link if you're at all + +01:10:55.220 --> 01:11:00.600 +interested. And number two, definitely try out something, you know, just even clone the Python + +01:11:00.920 --> 01:11:05.060 +repo and just try some of the examples and see what it's like. Yeah, I'll definitely link to that. + +01:11:05.240 --> 01:11:05.980 +Cool. Thanks, Ben. 
+ +01:11:06.080 --> 01:11:08.120 +I also gave a conference talk last year. + +01:11:08.380 --> 01:11:10.160 +There's a recording, so I'll send you the link to that, + +01:11:10.340 --> 01:11:13.760 +which really walks through my journey of Datastar + +01:11:13.770 --> 01:11:18.440 +and how Datastar has truly opened my eyes to what's possible. + +01:11:18.740 --> 01:11:23.120 +I feel like I talk a lot about how Datastar is a journey of unlearning + +01:11:23.800 --> 01:11:29.060 +old and bad patterns, deeply rooted ones in what I think web development is. + +01:11:29.300 --> 01:11:31.780 +And these days, as I mentioned, + +01:11:32.060 --> 01:11:34.880 +I never would have thought that I'd be developing in Go, + +01:11:35.160 --> 01:11:37.120 +but I see like all like the, + +01:11:37.580 --> 01:11:39.060 +like I think even Python + +01:11:39.240 --> 01:11:41.220 +is getting better concurrency support, right? + +01:11:41.360 --> 01:11:44.400 +So I think you talked about that recently, Michael, here. + +01:11:44.620 --> 01:11:47.220 +So now I'm seeing with Datastar, + +01:11:47.360 --> 01:11:49.280 +I can do so much more on the backend. + +01:11:49.420 --> 01:11:51.720 +I can be so much more creative on the backend + +01:11:51.840 --> 01:11:53.200 +and that's what interests me. + +01:11:53.960 --> 01:11:55.120 +So it's just fun. + +01:11:55.420 --> 01:11:56.180 +What can I say? + +01:11:56.500 --> 01:11:57.080 +It's fun. + +01:11:57.220 --> 01:11:59.160 +I'm still really jealous of that presentation too. + +01:11:59.420 --> 01:12:00.000 +Well done with it. + +01:12:00.080 --> 01:12:01.100 +Yeah, certainly send me the link. + +01:12:01.100 --> 01:12:03.120 +I'll put it in the show notes and well done. + +01:12:03.600 --> 01:12:03.900 +Delaney. + +01:12:04.140 --> 01:12:07.080 +The irony is that like, I don't consider myself a web dev at all. + +01:12:07.410 --> 01:12:09.380 +It just happens to be something I do a little bit of. 
+
+01:12:09.650 --> 01:12:13.940
+The thing that is the, when I first started making this public, I was like, Hey, I think
+
+01:12:14.000 --> 01:12:16.180
+I'm onto something like someone proved me wrong.
+
+01:12:16.670 --> 01:12:21.380
+I was, I'm a little bit more like kind of, I always say like in the jujitsu world type
+
+01:12:21.490 --> 01:12:24.880
+stuff, like you want someone to roll with you, not because you're trying to one-up them.
+
+01:12:24.980 --> 01:12:27.240
+It's that like, they're trying to help you find weaknesses in your game.
+
+01:12:27.520 --> 01:12:27.580
+Right.
+
+01:12:27.780 --> 01:12:30.780
+So I want there to be an active, like someone proved me wrong.
+
+01:12:31.080 --> 01:12:33.840
+And I'm at the point now where I feel so confident.
+
+01:12:34.240 --> 01:12:35.420
+I will put money on it.
+
+01:12:35.500 --> 01:12:38.440
+I've tried going out to people out in the dev Twitter and all that.
+
+01:12:38.760 --> 01:12:41.680
+I guarantee you, and I'm happy to put money up on this,
+
+01:12:41.980 --> 01:12:43.140
+if you could do it the Datastar way,
+
+01:12:43.500 --> 01:12:45.800
+whether you're using React or HTMX or any other approach,
+
+01:12:46.140 --> 01:12:47.020
+it will be less code.
+
+01:12:47.300 --> 01:12:47.940
+It'll be faster.
+
+01:12:48.360 --> 01:12:48.940
+It'll be cheaper.
+
+01:12:49.460 --> 01:12:51.620
+And it'll be simpler to understand.
+
+01:12:52.060 --> 01:12:54.640
+I will take up anybody anywhere on that thing.
+
+01:12:55.100 --> 01:12:56.580
+Basically, it's not a boast.
+
+01:12:56.840 --> 01:12:58.860
+It's just the facts on the table.
+
+01:12:59.220 --> 01:13:02.480
+And it's a paradigm shift that I want the world to know about
+
+01:13:02.900 --> 01:13:03.940
+just so that people understand,
+
+01:13:04.280 --> 01:13:05.220
+hey, there's going to be someone
+
+01:13:05.240 --> 01:13:06.760
+that comes up with something better than I did, right?
+ +01:13:06.880 --> 01:13:07.520 +Like I'm standing, + +01:13:07.800 --> 01:13:10.280 +the reason why we have the fastest signal library in the world + +01:13:10.420 --> 01:13:11.620 +is because we listen to the people + +01:13:11.640 --> 01:13:12.360 +that are really good at that. + +01:13:12.460 --> 01:13:13.400 +We use alien signals. + +01:13:13.800 --> 01:13:15.440 +The reason why we have the fastest morphing library + +01:13:15.640 --> 01:13:16.900 +is that we listen to people and said, + +01:13:16.960 --> 01:13:18.900 +hey, there's people that care about this stuff + +01:13:18.920 --> 01:13:19.880 +and are working towards it. + +01:13:20.140 --> 01:13:21.700 +It's not that there's anything special here. + +01:13:22.080 --> 01:13:23.900 +It's that it's trying to build an ecosystem + +01:13:24.180 --> 01:13:25.500 +of like people that care about performance + +01:13:25.560 --> 01:13:27.160 +and people care about the details. + +01:13:27.600 --> 01:13:28.240 +And if you do that, + +01:13:28.560 --> 01:13:29.840 +then everything gets better. + +01:13:30.200 --> 01:13:32.340 +So it's not just where are we at now, + +01:13:32.440 --> 01:13:33.900 +but if anyone thinks they can do better, + +01:13:34.320 --> 01:13:34.900 +please join us. + +01:13:34.910 --> 01:13:35.660 +We want to hear it. + +01:13:35.960 --> 01:13:38.500 +But like, I'm done having the vibe code, + +01:13:38.660 --> 01:13:39.500 +or not the vibe code, + +01:13:39.560 --> 01:13:41.160 +but like the vibes around like, + +01:13:41.580 --> 01:13:42.620 +well, this doesn't feel like a spa + +01:13:42.710 --> 01:13:43.780 +or like a spa has its place. + +01:13:44.220 --> 01:13:45.300 +A couple of episodes ago, + +01:13:45.880 --> 01:13:47.880 +there was a Cody from the Litestar stuff said, + +01:13:48.160 --> 01:13:50.020 +there's a time and place for HTMX or Datastar. + +01:13:50.350 --> 01:13:52.000 +And he's just, that's just not true. 
+
+01:13:52.400 --> 01:13:55.200
+It's just like hypermedia is the way to build things
+
+01:13:55.360 --> 01:13:57.160
+for a hypermedia client, which is the browser.
+
+01:13:57.600 --> 01:14:00.700
+So I will, anyone that can show that it's wrong,
+
+01:14:01.020 --> 01:14:01.640
+please let us know.
+
+01:14:01.980 --> 01:14:02.760
+Like we're here to help.
+
+01:14:02.860 --> 01:14:06.400
+I would love to see this paired with some Electron JS apps
+
+01:14:06.440 --> 01:14:08.120
+to make your desktop apps a little better too.
+
+01:14:08.280 --> 01:14:08.920
+So anyway.
+
+01:14:09.220 --> 01:14:10.320
+Seriously, oh my God.
+
+01:14:10.500 --> 01:14:11.500
+That's a different episode.
+
+01:14:11.740 --> 01:14:15.000
+So I just want to say thank you, Delaney, Ben and Chris.
+
+01:14:15.200 --> 01:14:15.940
+Thank you all for being here.
+
+01:14:16.480 --> 01:14:17.500
+And congrats on Datastar.
+
+01:14:17.500 --> 01:14:18.680
+It looks like a super project.
+
+01:14:18.940 --> 01:14:22.700
+I'm looking at some projects or an app running right over there
+
+01:14:23.160 --> 01:14:25.500
+on my left that I kind of wish I'd built in Datastar.
+
+01:14:25.800 --> 01:14:29.220
+Well, and the thing is to make sure to not think that it's just used, like, yes, we talk
+
+01:14:29.260 --> 01:14:32.180
+about the real-time stuff, but it's better for even just normal CRUD stuff.
+
+01:14:32.260 --> 01:14:35.920
+And that's kind of hard to, like, it's not as sexy to talk about, but it's better at
+
+01:14:36.020 --> 01:14:36.280
+that too.
+
+01:14:36.400 --> 01:14:38.200
+Well, it's also 80% of what gets built.
+
+01:14:38.320 --> 01:14:40.180
+So it's important to like point it out, right?
+
+01:14:40.560 --> 01:14:40.640
+Yeah.
+
+01:14:41.080 --> 01:14:41.220
+All right.
+
+01:14:41.640 --> 01:14:41.920
+Bye everyone.
+
+01:14:42.260 --> 01:14:42.580
+Thanks for being here.
+
+01:14:42.740 --> 01:14:42.980
+Thank you.
+ +01:14:44.180 --> 01:14:46.320 +This has been another episode of Talk Python To Me. + +01:14:46.700 --> 01:14:47.420 +Thank you to our sponsors. + +01:14:47.640 --> 01:14:48.960 +Be sure to check out what they're offering. + +01:14:49.140 --> 01:14:50.480 +It really helps support the show. + +01:14:50.940 --> 01:14:52.320 +Take some stress out of your life. + +01:14:52.540 --> 01:14:58.120 +Get notified immediately about errors and performance issues in your web or mobile applications with Sentry. + +01:14:58.620 --> 01:15:03.060 +Just visit talkpython.fm/sentry and get started for free. + +01:15:03.560 --> 01:15:06.020 +Be sure to use our code talkpython26. + +01:15:06.720 --> 01:15:10.340 +That's Talk Python, the numbers two, six, all one word. + +01:15:11.020 --> 01:15:18.240 +This episode is brought to you by CommandBook, a native macOS app that I built that gives long-running terminal commands a permanent home. + +01:15:18.640 --> 01:15:20.640 +No more juggling six terminal tabs every morning. + +01:15:21.240 --> 01:15:24.040 +Carefully craft a command once, run it forever with auto restart, + +01:15:24.360 --> 01:15:25.900 +URL detection, and a full CLI. + +01:15:26.400 --> 01:15:29.380 +Download it for free at talkpython.fm/command book app. + +01:15:30.140 --> 01:15:32.020 +If you or your team needs to learn Python, + +01:15:32.240 --> 01:15:35.640 +we have over 270 hours of beginner and advanced courses + +01:15:35.960 --> 01:15:39.380 +on topics ranging from complete beginners to async code, + +01:15:39.520 --> 01:15:42.300 +Flask, Django, HTMX, and even LLMs. + +01:15:42.520 --> 01:15:44.800 +Best of all, there's no subscription in sight. + +01:15:45.340 --> 01:15:47.120 +Browse the catalog at talkpython.fm. + +01:15:47.860 --> 01:15:49.839 +And if you're not already subscribed to the show + +01:15:49.900 --> 01:15:51.200 +on your favorite podcast player, + +01:15:51.840 --> 01:15:52.480 +what are you waiting for? 
+ +01:15:53.160 --> 01:15:54.900 +Just search for Python in your podcast player. + +01:15:55.080 --> 01:15:55.900 +We should be right at the top. + +01:15:56.340 --> 01:15:57.860 +If you enjoyed that geeky rap song, + +01:15:57.970 --> 01:15:59.140 +you can download the full track. + +01:15:59.310 --> 01:16:01.200 +The link is actually in your podcast blur show notes. + +01:16:02.040 --> 01:16:03.360 +This is your host, Michael Kennedy. + +01:16:03.760 --> 01:16:04.820 +Thank you so much for listening. + +01:16:05.050 --> 01:16:05.820 +I really appreciate it. + +01:16:06.260 --> 01:16:06.960 +I'll see you next time. + +01:16:18.840 --> 01:16:21.640 +I'm out. + From 814a011b7328016e3643d7054a9433c031824863 Mon Sep 17 00:00:00 2001 From: Michael Kennedy Date: Sun, 1 Mar 2026 08:25:43 -0800 Subject: [PATCH 04/16] transcripts --- .../538-python-in-digital-humanities.txt | 2282 ++++++++++ .../538-python-in-digital-humanities.vtt | 3713 +++++++++++++++++ 2 files changed, 5995 insertions(+) create mode 100644 transcripts/538-python-in-digital-humanities.txt create mode 100644 transcripts/538-python-in-digital-humanities.vtt diff --git a/transcripts/538-python-in-digital-humanities.txt b/transcripts/538-python-in-digital-humanities.txt new file mode 100644 index 0000000..67af577 --- /dev/null +++ b/transcripts/538-python-in-digital-humanities.txt @@ -0,0 +1,2282 @@ +00:00:00 Digital humanities sounds niche until you realize that it can mean a searchable archive of U.S. + +00:00:05 amendment proposals, Irish folklore, or pigment science in ancient art. Today I'm talking with + +00:00:11 David Flood from Harvard's DARTH team about an unglamorous problem. What happens when the grant + +00:00:18 ends? But the website can't. His answer? Static sites, client-side search, and sneaky Python. + +00:00:24 Let's dive in. This is Talk Python To Me, episode 538, recorded January 22nd, 2026. 
+ +00:00:48 Welcome to Talk Python To Me, the number one Python podcast for developers and data scientists. + +00:00:53 This is your host, Michael Kennedy. + +00:00:55 I'm a PSF fellow who's been coding for over 25 years. + +00:00:59 Let's connect on social media. + +00:01:00 You'll find me and Talk Python on Mastodon, BlueSky, and X. + +00:01:04 The social links are all in your show notes. + +00:01:06 You can find over 10 years of past episodes at talkpython.fm. + +00:01:10 And if you want to be part of the show, you can join our recording live streams. + +00:01:14 That's right. + +00:01:14 We live stream the raw uncut version of each episode on YouTube. + +00:01:18 Just visit talkpython.fm/youtube to see the schedule of upcoming events. + +00:01:23 Be sure to subscribe there and press the bell so you'll get notified anytime we're recording. + +00:01:27 This episode is brought to you by Sentry. + +00:01:29 Don't let those errors go unnoticed. + +00:01:31 Use Sentry like we do here at Talk Python. + +00:01:33 Sign up at talkpython.fm/sentry. + +00:01:37 And it's brought to you by CommandBook, a native macOS app that I built that gives long-running + +00:01:42 terminal commands a permanent home. + +00:01:44 No more juggling six terminal tabs every morning. + +00:01:46 Carefully craft a command once, run it forever with auto-restart, URL detection, and a full + +00:01:51 CLI. + +00:01:51 Download it for free at talkpython.fm/command book app. + +00:01:56 Hello, David. Welcome to Talk Python To Me. Amazing to have you here. + +00:01:59 I'm glad to be here. Talk Python has been part of my story up to this point. + +00:02:03 Has it? Okay. Well, you are about to write the next chapter in the story. So that's pretty excellent. + +00:02:10 I have a sense of what's coming. We planned out what we're going to talk about and that sort of thing. + +00:02:15 And I'm really excited about this topic. So it's going to be a good one. 
+ +00:02:21 Honestly, I think one of the real powers of the Python community and the reason the language has such staying power is there's such a diversity of use cases, technology, like technology standpoints, right? + +00:02:34 Like I build software for this group or I build these types of apps and it's not just, you know, like Ruby on Rails, which, you know, it's been very popular, but it's, it's for websites, right? + +00:02:44 You know what I mean? + +00:02:45 Yeah, absolutely. + +00:02:47 I mean, web development has dominated my use of it, but my entry into it, which I suppose I'll mention in a moment, was through all those little tools. + +00:02:57 Let's hear it. + +00:02:58 Who are you, David Flood? + +00:03:00 Tell us, introduce yourself real quick and tell us about how you got into it. + +00:03:04 So my background is in music and the humanities. + +00:03:09 I mean, in 2019, I didn't know what Python was or the name of any programming language. + +00:03:16 and I've been doing textual criticism, which is, you know, there's lots of criticisms in the academy. + +00:03:22 This is the one where if you have lots and lots of versions of the same text, + +00:03:27 you are comparing them to work out what the initial text was and like how it changed over time. + +00:03:33 Okay, give us an example. + +00:03:35 Okay, so one of the famous examples, hope I can remember it off the top of my head, + +00:03:40 is from Shakespeare. + +00:03:42 We're all familiar with the line to be or not to be. + +00:03:45 is the question. That is the question. Well, there's a variant of it. One of the early copies + +00:03:53 written by Shakespeare himself has... Somebody's going to be able to type into the chat exactly + +00:03:58 what it is. They'll know this anecdote. But it's something more like, "To be or not to be, I." + +00:04:04 That's the question. And so, which one is the original one? Why did he change it? 
That's kind
+
+00:04:09 of one example. I work mainly in the New Testament, which is especially complicated because
+
+00:04:15 no other corpus from ancient history has as many copies of the same text as that corpus does. So it's
+
+00:04:23 quite, um, quite complicated, and our techniques have grown because of that and perhaps
+
+00:04:29 become more advanced by now. I mean, that many variations over that huge span of time over
+
+00:04:37 different groups with different, maybe not intentions, but certainly colored by different
+
+00:04:43 worldviews and philosophies and so on. And yeah, I see the trouble.
+
+00:04:47 No, yeah. And they were people of the book. So copying it is something that happened a lot. And
+
+00:04:54 they copied, the monks, like the medieval monks copied everything. They copied our Greek classics.
+
+00:05:01 So that's what I was interested in. And because of the wealth of data that we have,
+
+00:05:07 computer tools are more and more important in that field.
+
+00:05:11 So when I started my PhD in 2019, I knew that I wanted to use some of these cutting-edge tools.
+
+00:05:17 Some of them may be surprising.
+
+00:05:19 For example, we've been using phylogenetic software.
+
+00:05:24 This is software that evolutionary biologists are using or computational biologists are using to track, for example, how COVID strains mutate over time.
+
+00:05:35 Oh, interesting.
+
+00:05:36 What they're comparing are the DNA letters.
+
+00:05:40 And so you have the sequence of letters and you're comparing how those change over time.
+
+00:05:44 Well, you can swap in textual variants for DNA letters.
+
+00:05:48 And now we can track how texts change over time and group them into families, things like that.
+
+00:05:56 It's like a time series, but of words or letters or something.
+
+00:05:59 Yeah, I mean, yeah, there's lots of important algorithms for comparing
+
+00:06:06 sequences of things. 
And so if we can just swap in Greek words and Greek text instead,
+
+00:06:12 then we can maybe apply it to textual criticism. So I was pretty interested in those things. That
+
+00:06:16 wasn't actually the method that brought me into it, but something like that, kind of computer
+
+00:06:21 intensive tools. What I learned is that these tools weren't actually available to me. They
+
+00:06:27 weren't desktop applications. And for the most part, they weren't public web applications. They
+
+00:06:36 PyPI or something like that, right?
+
+00:06:38 Yeah, exactly.
+
+00:06:39 Exactly.
+
+00:06:39 Or Java.
+
+00:06:41 And I needed to glue them together.
+
+00:06:43 So the long story short on that is during the first year of my PhD, I was picking up Python,
+
+00:06:50 watching YouTube videos while I was doing the dishes.
+
+00:06:52 And then the pandemic hit while I was living in Edinburgh in Scotland, probably not far
+
+00:06:57 from Will McGugan.
+
+00:06:59 And so the pandemic gave me the excuse to spend even a few more hours each day picking up these
+
+00:07:06 new, these new technical skills. And so I did it, I was able to use these advanced tools in my in my
+
+00:07:13 work. But what was really important to me was sharing, like making that available to my colleagues,
+
+00:07:18 is I had to I had to move from writing these like bad top to bottom Python scripts into things that
+
+00:07:23 could be reused by other people. And that led me into the web, because the web is where that's how
+
+00:07:29 I can share with anybody. It's really wild how much the web is kind of the last bastion of
+
+00:07:36 app freedom. It's so bizarre because, you know, I've many times told the stories of the insane
+
+00:07:42 battles of just getting our apps that just playback video of content that's already on the web
+
+00:07:48 into the app store. I mean, weeks of fighting about the weirdest, most nonsensical things with
+
+00:07:54 both Google and Apple. 
But we also now have the Mac platform and the Windows platform very
+
+00:08:01 aggressively looking for digital code certificates and all sorts of signing and other kinds of proof
+
+00:08:07 like it, you can't even just send somebody an executable anymore. It won't run. It's, it's crazy.
+
+00:08:13 It's, it's down to, like, okay, put it on the web, I guess. That's right. I, I played the game of
+
+00:08:19 distributing desktop apps. That's how I did it. That's why I initially distributed things, um,
+
+00:08:25 and at this point I just require people to install Python and then install my desktop app from PyPI
+
+00:08:30 because it's too hard otherwise for me.
+
+00:08:33 I mean, I could pay for the code signing from Apple and do all of that,
+
+00:08:37 but it's just, it's too much work for the time that I have.
+
+00:08:40 Yeah, I'm about to do another round of it.
+
+00:08:42 I'm working on an app and my developer account is still active.
+
+00:08:45 So we might have a fresh round of fun.
+
+00:08:47 Hopefully it goes through this time.
+
+00:08:50 Anyway, I do think it's such a challenge.
+
+00:08:52 And are you leveraging?
+
+00:08:53 I don't know if the timing was right.
+
+00:08:55 Like maybe this was too early, but these days, are you leveraging things like uvx
+
+00:09:00 to run, or are you just pip install this thing and then run it?
+
+00:09:04 Yeah, I haven't updated the readme in a while, so I think it just asks for pip.
+
+00:09:08 But certainly, if somebody asked me today, I would say, yeah, just install this with uv.
+
+00:09:14 Because then they don't even need Python.
+
+00:09:16 Exactly.
+
+00:09:17 And that's brilliant.
+
+00:09:18 And that's a really, it is another barrier reduced in distributing these applications,
+
+00:09:23 right?
+
+00:09:23 Like, if you can get uv installed on a machine, then you don't even have to say install, just
+
+00:09:28 The way you run it is uvx my thing and it's all transparent to you, right?
+
+00:09:33 Which is beautiful. 
+ +00:09:33 So what was it like? + +00:09:35 Yeah. + +00:09:35 So what was it like coming from what sounds like a not super screen focus, super + +00:09:43 techie aspect and having to dive into this world and someday you're probably + +00:09:47 like, how is it that I'm publishing stuff to PyPI? + +00:09:49 What has happened to me? + +00:09:51 Yeah. + +00:09:51 well, yeah, I remember when I, when I first signed up for GitHub, because + +00:09:56 you know, whatever YouTube tutorial I was working through at the time, you know, said that I needed + +00:10:02 to do that. You know, I think it all started making a lot of sense. I didn't have any technical + +00:10:08 background, but the world kind of open source software, it just kind of made sense. It felt + +00:10:17 like it fit really well into my academic, you know, circle. I think a lot of the attitudes are + +00:10:23 similar. I agree. I think they are actually. And I think that's, I think that's a pretty neat thing. + +00:10:28 Yeah. Very cool. All right. Well, let's talk about what you're doing with digital humanities. + +00:10:34 You're actually at a really interesting project or organization, I guess, that does many projects, + +00:10:40 right? Yeah. Yeah. So fast, fast forwarding, I did, I finished my PhD in the humanities. + +00:10:45 Sorry. I had so much fun. No, that's fine. That's fine. I had so much fun writing like these tools + +00:10:50 and then just solving the distribution problem to share them with other scholars. + +00:10:56 That was so fun that I was open to this kind of opportunity + +00:11:00 where now I'm doing this full time. + +00:11:02 And so, yes, so I'm on the, we call it affectionately Darth, + +00:11:07 which is digital arts and humanities at Harvard. + +00:11:11 There has to be a lot of Star Wars memes and references, + +00:11:14 I'm sure. + +00:11:14 If you can pull up a 404, I think there will be a Darth Vader reference. + +00:11:19 Seriously, I'm here for it. 
+ +00:11:22 Yes, page not found. I find your lack of nav disturbing. + +00:11:27 You know what? I think that is beautiful. And I really, I really think that people should embrace + +00:11:33 the 404, the fun 404 page, you know, more, right? There should really be something going on that + +00:11:40 like makes it, you know, something hasn't worked out, but you can just, you can make people laugh. + +00:11:46 Yeah. I appreciate that. + +00:11:48 I've heard people push back against it. + +00:11:50 Like if you're on a, if you're on like your medical website and you're maybe about to get bad news and then you get like a picture of a kitten. + +00:12:00 Dr. Kitten doesn't know where your results went. + +00:12:02 Like I get that. + +00:12:02 That's not funny. + +00:12:04 But I mean, most things are not that serious. + +00:12:06 Yeah. + +00:12:07 Mostly. + +00:12:08 Okay. + +00:12:09 So what kind of things does Darth do? + +00:12:12 You've described this as kind of a web or tech agency within Harvard. + +00:12:17 Yeah, it is very much. + +00:12:18 So, you know, Harvard has a gigantic IT group. + +00:12:21 I don't know how many hundreds of people work, but more than 500 people in IT. + +00:12:28 We are a small team and we operate very much like a small agency. + +00:12:33 So usually what happens is a faculty member has a funded research project that's going to last for an amount of time. + +00:12:42 And then we consult with them to build it. + +00:12:44 And most of the time, I kind of think of these as I kind of have these different categories of these kinds of projects that I think of. + +00:12:54 I lost in my notes what I call them. + +00:12:56 But they are there. + +00:12:57 You have like a one is like a virtual research environment. + +00:13:01 So the focus is this is this is a platform that we're building for the research to be done on. 
+ +00:13:07 Like the reason the research should be done in like a web app would be because you have access to visualization, to Postgres, to Pandas. + +00:13:17 So we can kind of build up this platform to do the actual research on and some of the data entry. + +00:13:23 So like a full on research application. + +00:13:26 Yeah, exactly. + +00:13:27 I guess you can also kind of see your work through the different stages of research projects and academic research and so on. + +00:13:36 And we'll get to maybe end of life in a sense further down in the conversation. + +00:13:42 But so this would be we have a grant or we just work here and we're going to work on some form of research. + +00:13:49 What do you give them? + +00:13:50 Right. And I think that's a super interesting challenge because one of the real common answers would be Jupyter, Jupyter Lab, Marimo, whatever. + +00:13:59 But that's still pretty code heavy for people who are possibly philosophers or something, you know. + +00:14:05 Oh, exactly. That's why in digital humanities, I won't even, maybe I won't even attempt to define + +00:14:13 it in any narrow sense, because I'll get in trouble with somebody. But you have two groups + +00:14:20 that are interfacing with each other. And one is digital humanities as a field, like as a subfield, + +00:14:26 all of its own. And these are people who have humanities domain, like knowledge, + +00:14:31 and technical skills, and they're bringing them together. And in a lot of cases, the audience for + +00:14:36 that kind of work is other people working in the digital humanities. But far more common, + +00:14:42 and this is what we work with, is people who have humanities domain expertise, and they want to + +00:14:49 publish or do research or share with other people who have that same humanities domain expertise, + +00:14:55 and they are now interested in adding a technical component to it. + +00:14:59 How can we supercharge what they have? 
+
+00:15:03 This portion of Talk Python is brought to you by Sentry.
+
+00:15:06 I've been using Sentry personally on almost every application
+
+00:15:10 and API that I've built for Talk Python and beyond over the last few years.
+
+00:15:14 They're a core building block for keeping my infrastructure solid.
+
+00:15:18 They should be for yours as well.
+
+00:15:19 Here's why.
+
+00:15:20 Sentry doesn't just catch errors.
+
+00:15:22 It catches all the stuff that makes your app feel broken,
+
+00:15:25 the random slowdown, the freeze you can't reproduce, that bug that only shows up once real users hit it.
+
+00:15:30 And when something goes wrong, Sentry gives you the whole chain of events in one place.
+
+00:15:34 Errors, traces, replays, logs, dots connected.
+
+00:15:38 You can see what's led to the issue without digging through five different dashboards.
+
+00:15:42 Seer, Sentry's AI debugging agent, builds on this data, taking the full context,
+
+00:15:47 explaining why the issue happened, pointing to the code responsible, drafting a fix,
+
+00:15:52 and even flagging if your PR is about to introduce a new problem.
+
+00:15:56 The workflow stays simple.
+
+00:15:58 Something breaks, Sentry alerts you, the dashboard shows you the full context,
+
+00:16:02 Seer helps you fix it and catch new issues before they ship.
+
+00:16:06 It's totally reasonable to go from an error occurred to fixed in production in just 10 minutes.
+
+00:16:12 I truly appreciate the support that Sentry has given me
+
+00:16:15 to help solve my bugs and issues in my apps, especially those tricky ones that only appear in production.
+
+00:16:21 I know you will too if you try them out.
+
+00:16:22 So get started today with Sentry.
+
+00:16:24 Just visit talkpython.fm/sentry and get $100 in Sentry credits.
+
+00:16:30 Please use that link.
+
+00:16:31 It's in your podcast player show notes.
+
+00:16:32 If you're signing up some other way, you can use our code talkpython26, all one word,
+
+00:16:38 talkpython26, to get $100 in credits.
+
+00:16:41 Thank you to Sentry for supporting the show.
+
+00:16:44 Maybe just take a moment and speak to, maybe, I don't know if this venue will actually speak
+
+00:16:49 directly to anybody who I was imagining here, but people who work with folks, what would you tell
+
+00:16:54 somebody who works with a group who have some technical skill, who could create some of these
+
+00:16:58 things that we're going to talk about, but the people who they've created for don't necessarily
+
+00:17:02 think they need it or know that they need it. I've often gone on rants about how programming is a
+
+00:17:09 superpower, not a replacement for your job, right? Yeah. That's a problem for a lot of people,
+
+00:17:15 especially because you might use some new computer tools to supercharge your research.
+
+00:17:20 But the article that you publish or the research output of that, the audience, they may not
+
+00:17:25 be interested in hearing about that at all.
+
+00:17:28 And so for most people who are working in this space, the tools, you have to use them
+
+00:17:33 in such a way that you can talk about the research output without talking about the
+
+00:17:37 tool.
+
+00:17:38 And we have other venues to talk about the tools themselves, like the Journal of Open
+
+00:17:43 Source Software, and you can kind of get some of it out there. But that is a, that's the significant
+
+00:17:48 challenge is convincing people that it, that it could be useful and then convincing the audience
+
+00:17:53 that they should be interested in kind of the methods behind how some of the new research comes
+
+00:17:57 up. Also, I think I'm a big believer that presenting stuff in the right order is really,
+
+00:18:03 really important.
If you present your research and it's beautiful and powerful and oh, look, + +00:18:07 we've also, by the way, covered a hundred times more data than any prior research. Surprise, + +00:18:12 I wonder how I did that. + +00:18:14 And then people are like, this is amazing. + +00:18:16 Then after you kind of hook them with the inspiration and what's possible, + +00:18:19 then you're like, let me tell you about the tool. + +00:18:21 And all of a sudden you're like, that's a cool tool, right? + +00:18:22 This is not just like geekery, like programmer, you know, + +00:18:26 Charlie Brown speak, wah, wah, wah, wah, wah. + +00:18:28 You know, it's like, no, I'm listening. + +00:18:29 Tell me now. + +00:18:30 Yeah, exactly. + +00:18:31 I mean, one of the things I think that really opens people's eyes + +00:18:35 is a really powerful search interface. + +00:18:38 You have all of this research data. + +00:18:40 just put it behind Elasticsearch with some really good filtering on it. And all of a sudden you have + +00:18:45 fast, rapid access to the data in a way you never had before. Like you were never scrolling through + +00:18:51 the Excel spreadsheets and finding exactly what you wanted, like you were with this new search + +00:18:55 interface. And that by itself is like so simple. We're so used to that in web development that + +00:19:00 like everything needs to have a fantastic search now. But so many people have their data locked + +00:19:05 behind, you know, a terrible search interface. + +00:19:07 Yeah, just a few things to sort of expose that. + +00:19:10 So this, give us a sense of what these data exploration web apps might look like. + +00:19:14 These are probably kind of mostly stuck to the inside, kind of internal to the research + +00:19:20 lab research team groups and so on. + +00:19:22 These are probably not that public facing, right? + +00:19:24 Almost everything we work on does end up having a public facing component. 
+
+00:19:28 So maybe the research itself is done, locked behind a user login.
+
+00:19:34 That's just for the researchers.
+
+00:19:36 But then they expose that research to the public, usually with a good search interface
+
+00:19:41 and different pages for exploring their data and visualizations and things like that.
+
+00:19:47 So yeah, everything we do ends up becoming a production public web app in the end.
+
+00:19:52 And then another one of your categories, you put it, was virtual research environments,
+
+00:19:57 like data entry, publishing, authoring, collaboration.
+
+00:20:00 Tell us about that.
+
+00:20:01 Yeah, so a good example of this maybe is one of the projects that... Well, actually, the best example of it is the project I worked on
+
+00:20:08 during my PhD. It's called Apatosaurus. The short story behind the name is that it sounds like
+
+00:20:16 apparatus. In textual criticism, when you are displaying and visualizing variant readings to
+
+00:20:24 a base text, that form of visualizing it is a critical apparatus. A critical apparatus is
+
+00:20:32 a pretty boring website name, but Apatosaurus, dinosaurs might make textual criticism sound fun.
+
+00:20:37 Yeah, I do love dinosaurs. No, that's really cool. So this, this comes out as a web app. And I know
+
+00:20:43 you also have some, you talked about some desktop apps as well.
+
+00:20:46 Yep. Yep. That's right. So, yeah. So, so there's this people, people upload their,
+
+00:20:50 their collation to this and then they can visualize it. And like there, there's a public
+
+00:20:55 component of this as well, but really the backend is editing, editing a collation,
+
+00:21:00 and adding notes to all of the different readings and stuff.
+
+00:21:03 So I could show what the backend looks like, but we can also move on.
+
+00:21:08 - Let's move on just because most people will not totally hear, but just give us a sense of like,
+
+00:21:14 like what do people, what do you create for people so that they're like, yeah, I can use this app, right?
+
+00:21:21 Like give us a sense of some of the features, I guess is what I'm getting to.
+
+00:21:25 - Yeah, so another good example is we have a project at Harvard called Mapping Color in History.
+
+00:21:33 And this is a collaboration with a lab.
+
+00:21:37 This lab brings in pieces of artwork and they do spectral analysis on the pigments
+
+00:21:42 so they can identify what was used to make a particular color of this red
+
+00:21:48 or what was used to make this color of blue.
+
+00:21:51 And then the idea is tracking how did people make those pigments over time,
+
+00:21:57 and specifically in Asian art.
+
+00:22:02 Is this the Dharmra, Puna, Puna?
+
+00:22:05 No, this is Mapping Color in History.
+
+00:22:08 I don't think it's up here.
+
+00:22:09 Sorry about that.
+
+00:22:10 Somewhere.
+
+00:22:10 That's all right.
+
+00:22:11 I'll find it.
+
+00:22:12 Keep talking.
+
+00:22:13 Okay.
+
+00:22:14 So the front end is great.
+
+00:22:16 You know, like the public end, this is where people can explore by pigments
+
+00:22:21 and then see the images that contain those pigments.
+
+00:22:24 Now in the back end, what the researchers will be able to do is correlate exactly which
+
+00:22:30 point of a painting the analysis was done on.
+
+00:22:34 So they have this deep zoom image viewer where they'll zoom in and they'll select the point
+
+00:22:39 where that was taken from.
+
+00:22:41 So how else would you do that other than a digital interface to indicate on an image of
+
+00:22:47 a painting where that spectral analysis was performed?
+
+00:22:52 Sounds almost like astronomy in a weird way.
+
+00:22:55 Oh, yeah.
+
+00:22:55 We zoomed into here and we took a different spectrum of the painting and we realized that it's actually identical to this, you know, something crazy like that, right?
+
+00:23:04 Yeah, yeah, yeah, that's right.
+
+00:23:06 Yeah, so it's essentially a pigments, like a pigments database.
+
+00:23:10 So the third category of these digital humanities projects that you put down was like data extraction, transformation.
+
+00:23:19 In data science, they often say, you know, 80% of the work is the data wrangling, which is like cleaning, organization, just getting it so you could possibly start asking questions about it.
+
+00:23:29 I'm sure you all do a lot of that.
+
+00:23:31 Absolutely.
+
+00:23:32 So often, the very beginning of a project might be an Excel sheet or several spreadsheets.
+
+00:23:41 And the first task is to ingest these into, you know, a proper database.
+
+00:23:46 Not so much MongoDB for us.
+
+00:23:48 It's going into Postgres.
+
+00:23:50 We're a Django shop.
+
+00:23:51 We're a Django shop.
+
+00:23:52 So it's going into Postgres.
+
+00:23:55 And yeah, no, that is probably the number one challenge of the early stage is figuring out what the right data model is, what the right relationships are to model the data.
+
+00:24:07 Doing that work is advantageous to everybody because, you know, it helps both the researchers who brought the data to think about it in a more organized way.
+
+00:24:17 I mean, they've been trying to do that.
+
+00:24:18 And they have the spreadsheets.
+
+00:24:20 But now we're modeling out the data so that we can add it to database tables and then to use later.
+
+00:24:27 So that works out well for everybody.
+
+00:24:30 And yeah, absolutely.
+
+00:24:31 Cleaning the data, getting dates, working with fuzzy dates, being able to parse July of 2020 or summer of 2020 and handling kind of all of those cases so that we do get dates in the end.
+ +00:24:45 One of the crazy stories from data parsing history is one of the, I can't remember exactly what it was, you talked about biology tools or genetics tools earlier. + +00:24:56 One of the groups that names genes had to change the name of a gene because it kept getting parsed by Excel into a date. + +00:25:04 Yeah, I remember that. + +00:25:05 I remember that. + +00:25:06 That's right. + +00:25:07 Yes. + +00:25:08 So these are the weird edge cases I'm sure you run into. + +00:25:11 Like it's not even supposed to be a date. + +00:25:13 Why is this a date? + +00:25:13 I don't know. + +00:25:14 Why is it helping out here? + +00:25:16 The code keeps crashing. + +00:25:18 Like pandas parsed it as a date and it's not or whatever. + +00:25:21 Absolutely. + +00:25:21 Yeah. + +00:25:22 Yeah. + +00:25:22 So yeah, usually lots of test suites around that ingest process until we've got it. + +00:25:27 Now, once we've got it in, usually the research is ongoing and then we're able to provide + +00:25:32 them now a new cleaned interface to do the additional data entry as the project is going. + +00:25:38 And that's usually a win-win for everybody. + +00:25:40 Sure. + +00:25:40 And so this sort of ETL ingestion side of everything is it's like, don't worry, + +00:25:46 Darth has got it for you. + +00:25:47 And then we'll provide you like a database connection to start working. + +00:25:51 Or do you give them the tools and then they kind of iterate on them? + +00:25:54 And how much is this you and how much is this you providing like CLI tools and stuff + +00:26:00 or notebooks over to people? + +00:26:03 I'd say most of the people that we're working with are aware of the technical tools, + +00:26:08 but they don't want a database connection. + +00:26:10 So we are giving them, we're doing the ingest and then building a platform where they can begin interacting with their data. + +00:26:17 Yeah, I'm sure they don't want one. + +00:26:20 Maybe you give them an app though, right? 
+ +00:26:22 With like Elasticsearch and other things that they can. + +00:26:25 No, absolutely. + +00:26:25 Yeah, that's what we do. + +00:26:26 Yeah, okay. + +00:26:27 Yeah, we give them a web platform to begin exploring, to begin publishing. + +00:26:34 So I was thinking that you said you're a Django shop, which is cool. + +00:26:38 It sounds, though, to me like describing what you're doing, just imagining how this is. + +00:26:43 You're probably creating these projects often. + +00:26:46 How often does one of these projects actually last? + +00:26:49 Or how many of them do you iterate? + +00:26:53 I'm trying to get a sense. + +00:26:53 Do you work on stuff for a year or is it like every two weeks we're on a new project? + +00:26:58 It's why I think of us as like an agency. + +00:27:00 Because we get to work on greenfield projects fairly often, like you're imagining. + +00:27:04 Which would not be the case normally at a big university IT department. + +00:27:09 So, you know, maybe two or three projects a year, two or three big ones a year. + +00:27:15 And then we have to put to bed a few a year as well. + +00:27:18 Because these things, they're funded with grant money. + +00:27:21 And then the grant money runs out and it's time. + +00:27:24 And then we have to figure out what do we do with it now? + +00:27:26 We don't want to lose the data and this way of presenting it. + +00:27:31 But we can't keep paying for Elasticsearch. + +00:27:33 Yeah, of course. + +00:27:34 I'm certainly, we're going to dive into that because that is, but let's save that for the + +00:27:37 end. + +00:27:37 It seems like that's the arc of the story of these things. + +00:27:40 But I certainly think it's something that you don't think about that much, right? + +00:27:44 Like you said, it was only a hundred dollars a month for this. + +00:27:47 And we got a big grant. + +00:27:48 There's a bunch of, no big deal. 
+
+00:27:49 But like when the grant's out, who's on the hook for a hundred dollars a month and making
+
+00:27:53 sure it survives upgrades and all that kind of business.
+
+00:27:56 No, that's right.
+
+00:27:57 Yeah.
+
+00:27:57 So my original question when I started on this path was thinking like, do you, how do you
+
+00:28:02 get started on these?
+
+00:28:03 Do you have like a big framework or a Cookiecutter sort of thing or something like this
+
+00:28:07 is how we do it because it plugs into all this other automation and tools we built for
+
+00:28:11 the last 10 projects.
+
+00:28:13 You know, that's kind of a unique position.
+
+00:28:14 A lot of companies build one website for themselves and that's their app or they're
+
+00:28:19 an agency that goes across so many, so much variation.
+
+00:28:21 They can't do that kind of stuff.
+
+00:28:22 Right.
+
+00:28:23 That's right.
+
+00:28:24 That's right.
+
+00:28:25 That's a good question.
+
+00:28:26 We have things that we reuse.
+
+00:28:29 Some of them are open source, different search components and things that we maintain that
+
+00:28:36 we'll use across projects.
+
+00:28:37 And we have tried to do the Cookiecutter Django project.
+
+00:28:41 The truth is, each project is different enough that really we like to evaluate it from first
+
+00:28:47 principles, thinking, what is the best technology to use?
+
+00:28:55 Yeah.
+
+00:28:55 Yeah.
+
+00:28:56 So yeah, we don't have a cookiecutter.
+
+00:28:59 We don't have a kind of a meta framework for bootstrapping them because they're sufficiently
+
+00:29:04 different from each other that we...
+
+00:29:05 I find that too.
+
+00:29:07 I find that too.
+
+00:29:08 The idea of how we could just grab this Cookiecutter or Copier.
+
+00:29:12 Are you familiar with Copier?
+
+00:29:14 People out there might be familiar with that.
+
+00:29:15 It's a little bit like Cookiecutter with the bonus that you can update it later if you
+
+00:29:21 change your mind about something, like actually change this project to use Postgres rather
+
+00:29:24 than SQLite or something, which is pretty cool. But every time that I do, every time I try to work
+
+00:29:30 with one of those projects, even ones that I've created for myself.
+
+00:29:34 I'm like, oh, it's like 75% awesome and 25%, I just got to take this stuff out. You know,
+
+00:29:39 I'll just, I'll just do it from scratch. How hard is this? I'll just create a few folders
+
+00:29:43 and put a few things in there and I'll copy the one, like the pyproject.toml or like the one thing
+
+00:29:48 that's like, how do I do this again? I'll just copy that and we're good to go. Yeah. I mean,
+
+00:29:52 That's what I find.
+
+00:29:53 That's what I find.
+
+00:29:53 I find it, it seems like a really brilliant idea, but in practice, it hasn't saved us time yet.
+
+00:30:00 No, I mean, maybe it's a case study.
+
+00:30:02 Like, okay, let's see what they're doing for this one.
+
+00:30:04 Oh, that is interesting how they're integrating this other thing maybe,
+
+00:30:07 but as a true foundation, I find it in theory awesome.
+
+00:30:11 In practice, I just end up not doing it for various reasons.
+
+00:30:14 Don't know why.
+
+00:30:14 I'm gonna save this for later.
+
+00:30:16 Because the question I'm about to ask you is gonna send us just down a rat hole.
+
+00:30:21 So instead, before we go down the rat hole, maybe we could, not that one, maybe we could
+
+00:30:27 talk about, I mean, you talked about some, but let's maybe just feature some of the projects
+
+00:30:32 that are maybe more well-known that you guys have done.
+
+00:30:35 Sure.
+
+00:30:35 Yeah, good.
+
+00:30:36 So yeah, one of them is called the Amendments Project.
+
+00:30:40 And this is, I didn't know this until I started working on this project, that there are, there
+
+00:30:46 have been thousands of, I think it's 22, at least 22,000 proposed amendments to
+
+00:30:52 the United States Constitution that never went anywhere.
+
+00:30:56 And so kind of the goal of this project is to show that there have been lots of attempts
+
+00:31:02 to amend the Constitution, but actually the Constitution is frozen.
+
+00:31:06 I mean, it's not actually amendable anymore, at least not in the politics of any time recently.
+
+00:31:12 So this is a database.
+
+00:31:14 I cannot imagine a situation where the U.S. Constitution gets amended.
+
+00:31:19 It has to be unanimous across all the states, right?
+
+00:31:21 Is that right?
+
+00:31:22 I can't remember.
+
+00:31:23 I don't know.
+
+00:31:23 I can't remember off the top of my head if it has to be unanimous,
+
+00:31:25 but it certainly has to be across party lines.
+
+00:31:28 Yeah, it's got to be pretty darn close if it's not at all.
+
+00:31:32 It's like time travel or travel at the speed of light.
+
+00:31:36 Could be theoretically possible.
+
+00:31:38 Probably not going to happen.
+
+00:31:40 No, it's hard to see.
+
+00:31:41 It's hard to see.
+
+00:31:42 Yeah.
+
+00:31:42 So this is from a historian at Harvard.
+
+00:31:46 And so it's a database of, and the full text from, all of these amendments.
+
+00:31:53 And, you know, it's from the public's point of view, it's a Postgres full-text vector search interface for finding and filtering through all of the different amendments that have been proposed.
+
+00:32:08 I love it.
+
+00:32:08 Yeah, this is a nice looking site.
+
+00:32:10 We work with a designer.
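A Postgres full-text search interface like the one described usually comes down to a `tsvector`/`tsquery` match with ranking. Here's a minimal, hypothetical sketch of the kind of SQL behind it; the `amendments` table and its columns are invented names, and in a Django shop this would normally go through `django.contrib.postgres.search` rather than raw SQL.

```python
# Hypothetical sketch of the SQL behind a Postgres full-text search interface.
# Table and column names are invented; in Django you would typically reach for
# django.contrib.postgres.search (SearchVector/SearchQuery/SearchRank) instead.
def amendment_search_sql(column: str = "full_text") -> str:
    """Build a parameterized query ranking rows whose tsvector matches the user's terms."""
    return (
        f"SELECT id, title, ts_rank(to_tsvector('english', {column}), q) AS rank "
        f"FROM amendments, websearch_to_tsquery('english', %s) AS q "
        f"WHERE to_tsvector('english', {column}) @@ q "
        f"ORDER BY rank DESC"
    )
```

`websearch_to_tsquery` accepts user-style queries (quoted phrases, `-` exclusions), which is a large part of why "just a search box" on research data feels so much better than scrolling spreadsheets.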
+
+00:32:12 She's very good. Yeah, of course, like an agency would, right? Yep. Yep. Nice. So we'll
+
+00:32:17 get a really pretty, rich search interface and then off you go. I have no idea even
+
+00:32:22 what I would search for, but yeah. Well, you can always search for something
+
+00:32:25 religious, something abortion related. There's gonna be lots of things there, I
+
+00:32:29 thought, all those, also like guns, but like, I don't want to go down, I'm not sure I
+
+00:32:32 even want to go down there, right? Awesome, though. This looks super useful. Maybe
+
+00:32:37 someday we'll have a functional government again. We'll see. Let's, let's
+
+00:32:41 change it. Or maybe we'll go down... ah, it's folklore. Like, look at you. So, all right. So, yeah. So another
+
+00:32:45 really great project, at least from a content point of view, that's interesting, the research
+
+00:32:51 that it's doing, is the Fionn Folklore Database. So in Celtic storytelling, you know,
+
+00:33:01 moms have been telling stories to daughters, and people have been
+
+00:33:08 telling stories for a very long time, hundreds or a thousand years, about Fionn mac Cumhaill, who is a
+
+00:33:14 hero, a hero from Irish mythology. Some of it, some of it based in, you know, historical events, but it
+
+00:33:21 goes back, it goes back so far. So there are, there's many hundreds or thousands of these
+
+00:33:29 stories that have been spread, and versions of these stories that have been told.
+
+00:33:33 And so some of them are audio recordings where somebody, like some researcher, has gone out to an island off the coast of Scotland and recorded somebody telling their version of the hero Fionn and his band of heroes.
+
+00:33:47 You know, they defend Scotland and Ireland from invaders and attackers.
+
+00:33:53 Very exciting stories and stuff and a team of characters.
+ +00:33:59 So there's audio recordings and then there's documents, like written documents that contain + +00:34:05 these. + +00:34:05 And so this is a database of kind of all of those all in one place with, on the public + +00:34:11 side, a nice search interface for discovering them, you know, either using the map view or + +00:34:18 searching. + +00:34:18 Yeah, that's cool. + +00:34:19 I got my map view for some random thing I searched about here. + +00:34:22 Amazing. + +00:34:23 But this is pretty interesting, all these different tellings and stuff. + +00:34:26 Oh, and yeah, one of the big challenges with this project is that it's fully internationalized. + +00:34:33 So it's available in English. + +00:34:35 Everything is available in English, Scottish Gaelic, and Irish Gaelic, but that extends + +00:34:40 into the database. + +00:34:41 So usually people have multiple names recorded for them. + +00:34:45 And so, yeah, you may have one person with any number of names in different languages, + +00:34:51 sometimes more than one Scottish name, that kind of thing. + +00:34:54 And so the data model on this one is quite messy, but sensible. + +00:35:00 But yeah, it's quite a lot of different kinds of data to wrangle. + +00:35:03 And then with all of the translations for each thing. + +00:35:05 Yeah, that's wild. + +00:35:06 It's not just, we need the user interface of this thing to translate about. + +00:35:12 That's way more, right? + +00:35:13 Yeah, yeah, it is that. + +00:35:14 It is that. + +00:35:15 And then it is also, yes, all the items in the database have a translation or can. + +00:35:22 This portion of Talk Python To Me is brought to you by us. + +00:35:25 I'm thrilled to announce a brand new app built for developers created by yours truly. + +00:35:30 It's called Command Book. + +00:35:32 You know that thing you do every morning? 
+ +00:35:34 Open up six terminal tabs, CD into this directory, activate that virtual environment, + +00:35:39 run the server with --reload. + +00:35:40 Now, CD somewhere else, start the background worker, another tab for Docker, + +00:35:45 another one to tail production logs. + +00:35:46 Every tab just says Python, Python, Python, Docker tail. + +00:35:50 and you're clicking through them going, which Python was that again? + +00:35:53 Where my app is running? + +00:35:55 Then sometime later, your dev server silently dies because it tried to reload + +00:35:59 while you're in the middle of a code edit, unmatched brace, a half-written import or something. + +00:36:04 Now you're hunting through tabs to figure out which process crashed + +00:36:07 and how to restart it. + +00:36:08 My app, CommandBook, gives all of these long-running commands a permanent home. + +00:36:13 You save a command once, the working directory, the environment, + +00:36:17 pre-commands like git pull, and from then on, you just click run. + +00:36:20 You can even group commands together to start and stop everything for a project + +00:36:24 with a single click. + +00:36:25 It also has what I call honey badger mode, auto restart on crash. + +00:36:29 So when your dev server goes down mid-reload, command book just brings it right back up + +00:36:34 and does so over and over until the code is fixed. + +00:36:37 It also detects URLs from your output so you're never scrolling through thousands of lines of logs + +00:36:42 just to figure out how to reopen your web app. + +00:36:44 And it shows you uptime, memory usage, and all sorts of cool things about your process. + +00:36:49 The whole thing is a native macOS app. + +00:36:51 No Electron, no Chromium, just 21 megs. + +00:36:54 And it comes with a full CLI. + +00:36:55 So anything you've configured in the UI, you can fire off from your terminal + +00:36:59 with just a single command. 
+
+00:37:00 Right now it's macOS only, but if there's enough interest,
+
+00:37:04 I'll build a Windows version too.
+
+00:37:05 So let me know.
+
+00:37:07 Please check it out at talkpython.fm slash command book app.
+
+00:37:11 Download it for free, level up your developer workflow.
+
+00:37:14 The link is in your podcast player show notes.
+
+00:37:16 That's talkpython.fm/command book.
+
+00:37:19 I really hope you enjoy this new app that I built.
+
+00:37:22 You want to work in the native language of the people who did that part of the folklore
+
+00:37:26 or whatever, right?
+
+00:37:27 Yeah, well, and people are still speaking those languages.
+
+00:37:30 So people who would use this to, you know, like somebody may have heard a story from
+
+00:37:34 their mom or dad and now would like to find other versions of that story.
+
+00:37:38 And they live in a part of Scotland where they speak Scottish Gaelic as their first language.
+
+00:37:42 They can still access the site.
+
+00:37:43 And then that Mapping Color in History one, that's another one of the public ones that you said is pretty major.
+
+00:37:49 Yeah, that's right.
+
+00:37:50 Yeah.
+
+00:37:50 So, yeah, that's a pigments database.
+
+00:37:53 You can search by either English color names like blue and find all of these Asian paintings that have blue or a particular kind of pigment of how they made the blue.
+
+00:38:04 Yeah, nice.
+
+00:38:05 So what's the open source story?
+
+00:38:08 You're creating all these apps, maybe some of these frameworks.
+
+00:38:11 There's got to be some tools.
+
+00:38:12 Is there a big desire or already an effort to have a lot of these things open source or is it too niche or is it just like this is an advantage Harvard has that other universities don't get?
+
+00:38:27 No, it's something we talk about quite a bit.
+
+00:38:30 Usually these things start, usually they start closed source during development.
+ +00:38:35 And then we work with the faculty and we talk about how we can take, you know, like the repo for the web app, how we can take that public. + +00:38:45 And so we've done that for a number of projects. + +00:38:48 Not all of them are. + +00:38:50 But the ideal is that they all make their way into the open, and especially when they become archived. + +00:38:56 Sure. + +00:38:56 Yeah, that's a good way to help them live on. + +00:38:58 And they might even go into GitHub's Arctic Vault, which is crazy. + +00:39:03 I don't know if people know about that out there, but GitHub has, quite a while ago, started taking copies of all of the repos and backing them up and storing them in the Arctic vault. + +00:39:14 It's kind of cool. + +00:39:15 I really, really, really hope we never need that, but it's kind of neat. + +00:39:18 Yeah, me too. + +00:39:20 Usually universities have their own archival system, so any important research data is usually part of that system as well. + +00:39:30 I see. + +00:39:30 Okay. + +00:39:31 Yeah. + +00:39:32 Obviously, right? + +00:39:32 Like I'm just, I can't remember where it was. + +00:39:34 It was somewhere, I think it was South Korea or Taiwan where like seven years of government + +00:39:40 data got lost or something like that. + +00:39:41 It was really, really bad recently. + +00:39:43 There was a fire and I think they had backups, but maybe just into the building, you know, + +00:39:47 like we'll put that out. + +00:39:48 We'll back it up to the hard drive over here. + +00:39:50 Not good. + +00:39:51 No, not good. + +00:39:52 You definitely want this stuff to survive. + +00:39:54 I mean, academia has this history of like tomes that have survived the past and really, + +00:40:00 really long lived information. + +00:40:02 Right. + +00:40:02 besides the Library of Alexandria or something like that, maybe. + +00:40:05 That's what we want. + +00:40:06 That's what we want. + +00:40:07 We want it to, yeah, we want it to last. 
+
+00:40:09 Absolutely.
+
+00:40:10 So maybe that's a good time to sort of talk about the trailing end.
+
+00:40:14 I think there's a lot of interesting things going on here.
+
+00:40:18 Just like you've run out of money, not because you actually ran out of money.
+
+00:40:23 The grant is done and you've either spent or given back or whatever
+
+00:40:26 with the remaining little bits of money.
+
+00:40:28 It's always a weird balance with research.
+
+00:40:30 It's like, oh, we got $3,000 left on this research grant.
+
+00:40:33 What are we going to do with it?
+
+00:40:34 It's not like, oh, we're going to give it back.
+
+00:40:35 We just didn't need it.
+
+00:40:36 It's like, we're going to find a way to like fund a student to do a little more work or
+
+00:40:41 whatever.
+
+00:40:41 But eventually the grant is over.
+
+00:40:43 That's right.
+
+00:40:44 You've got some expensive app access to a big database because it needs a big search or
+
+00:40:49 a lot of compute or something.
+
+00:40:50 That's right.
+
+00:40:52 Everything during, like, I mean, anything, anything that's a, that's a Django app.
+
+00:40:56 We deploy to AWS using containers, which isn't the cheapest way to host anything.
+
+00:41:05 But that's for the most part the Harvard way.
+
+00:41:10 And it is robust and reliable.
+
+00:41:12 And we don't have a DevOps person on call on the weekend to rescue one of these apps.
+
+00:41:22 So having them reliable is good.
+
+00:41:25 Okay, so it's on AWS and paying for the containers, paying for that Elasticsearch cluster,
+
+00:41:33 the RDS Postgres database.
+
+00:41:36 Okay, well, even if somebody wants to start paying for that out-of-pocket,
+
+00:41:40 all of those little services, they add up to enough that we need to do something
+
+00:41:44 when the project hits end of life.
+
+00:41:46 And so our gold standard that we've developed so far is asking, can this become a static website?
+
+00:41:55 Can we bake this out into all HTML files and acknowledge that there will be some trade-offs?
+
+00:42:01 We will trade off some searching.
+
+00:42:04 You know, it's not gonna have Elasticsearch.
+
+00:42:06 Doesn't mean that it won't have any search though.
+
+00:42:08 So we'll trade out Elasticsearch and it'll be very difficult to add new data,
+
+00:42:13 but that's okay because it's being archived.
+
+00:42:15 So can we get it into a static site?
+
+00:42:18 And that's challenging depending on how you've set it up.
+
+00:42:20 So we now have projects where we set them up from the beginning to be archivable like this.
+
+00:42:26 And one of them is called Water Stories.
+
+00:42:29 And it was a companion to an art installation at the Radcliffe Institute on the Harvard campus.
+
+00:42:36 And so this was this live site during the duration of the art installation where people could come in and add stories that they had about water onto an iPad.
+
+00:42:46 And then those went up to our database.
+
+00:42:49 We built that with something called Django Bakery, which, if you opt in and you use all of their
+
+00:42:54 class-based views the way that they're meant to be used, then you can bake this out into static files
+
+00:43:00 when you're done. Very low effort. That was perfect. That is such a cool idea. And mad props to them for
+
+00:43:05 ASCII art logos, come on now. I feel like that should be in the view source if it's not. But
+
+00:43:11 this is such a cool idea because you can, you can just take a working site. You guys are a Django
+
+00:43:17 shop. So you have a lot of your sites are written in Django and you just go make it static, right?
+
+00:43:22 Essentially. Yes. And, and what's, what's, what's really great about it is if they wanted to make
+
+00:43:27 a change and they have, they have asked since we, since we made it static, they've asked for a
+
+00:43:31 couple of changes.
So locally, I just Docker compose up this whole application, make the change
+
+00:43:37 in the Django admin and rebake the site. And so it's, it can still be updated. Something,
+
+00:43:42 if you've never tried this, like something like, Hey, can we just add one more menu item?
+
+00:43:47 And you're like, no, no, no, we're not adding the menu item because you want that.
+
+00:43:50 That means we're changing 7,300 pages because they all bake in the whole HTML.
+
+00:43:56 Right?
+
+00:43:56 Exactly.
+
+00:43:57 Yeah, exactly.
+
+00:43:58 But if that's in my, in my Django database and my SQLite file, then no problem at
+
+00:44:02 all because then I just rebake it.
+
+00:44:04 Yeah, yeah, exactly.
+
+00:44:05 Absolutely.
+
+00:44:06 So I think this is super neat.
+
+00:44:09 There's also Frozen, Frozen-Flask.
+
+00:44:13 If I could get rid of all the ads, I do not need a Yeti thing, whatever that is.
+
+00:44:17 The glass, not the mythical thing. But Frozen-Flask, which does a similar thing for Flask
+
+00:44:25 apps. If you're a Flask person, it probably would work with Quart. Don't know for sure, but probably.
+
+00:44:30 So that's a pretty interesting idea as well. Throw that in there. But also, what else?
+
+00:44:37 Also you talked about search, right? That can be, can be such a problem. And I'm a huge fan of your
+
+00:44:45 recommendation here with PageFind. Tell us about PageFind. So this has been, I think it's been a
+
+00:44:50 bit of a game changer in how functional one of these archived sites can remain. So we're actually
+
+00:44:56 in the process of, that amendments website that searches across 22,000 full texts of amendments,
+
+00:45:04 we are in the process of sunsetting that, and that will become a static site. And for that search,
+
+00:45:09 we already have an internal demo that proves that we can replace that Postgres full-text search
+
+00:45:16 with PageFind. You lose vector search. Yeah.
You've kind of got to get really + +00:45:22 true keyword matching. Yeah. Yeah, that's right. But you still get filtering. I mean, + +00:45:27 and really faceting and filtering is when it comes to discovery of things, I mean, I find + +00:45:34 that's really what's useful. So filtering these amendments by state or by the Congress that was + +00:45:40 active at the time or by the person who co-wrote it. All of those are totally great in PageFind. + +00:45:50 And the keyword search is just fine in PageFind. One of the things I really like about it is that + +00:45:55 it takes your index and it chops it up into lots of little files that can just fly across the + +00:46:00 network. So it's a very fast search. It's not a huge network load, even if your index is + +00:46:07 initially very large. And it essentially cuts it up somewhat alphabetically. So if your search + +00:46:14 starts with T, or I should say a better word for audio, if it starts with W, then it will load up + +00:46:20 the index for words that start with W and fly that over the network instead of the whole thing. + +00:46:26 So it's pretty slick and it has a great Python API. + +00:46:29 So to do the proof of concept for the amendments search, I just took a database dump and then manually indexed with a Python script into PageFind. + +00:46:40 Wait, there's a Python API for PageFind? + +00:46:43 Yeah. So the way PageFind works, I should have said that, is the way most people will use it + +00:46:48 is by normally PageFind consumes HTML. So you give it access to your dist folder. + +00:46:56 Oh, okay. + +00:46:57 And then it crawls through all of your HTML files. + +00:47:00 And you can do great things like adding little HTML tags that are just for PageFind, + +00:47:05 that give it the filtering ability, or that you want to sort by something. + +00:47:09 And so that's great. + +00:47:11 Or you can just call PageFind from Python or from TypeScript and just build that index manually. 
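The chunked-index behavior David describes, splitting the search index into small per-letter fragment files so a query only downloads the piece it needs, can be sketched in plain Python. To be clear, this is not PageFind's real on-disk format or its actual Python API, just a toy illustration of the idea; the page URLs and fragment filenames here are made up.

```python
import json
from collections import defaultdict
from pathlib import Path
from tempfile import TemporaryDirectory

def build_fragments(pages, out_dir):
    """Split an inverted index into per-letter fragment files,
    roughly the idea behind PageFind's chunked index."""
    index = defaultdict(dict)  # first letter -> {word: [urls]}
    for url, text in pages.items():
        for word in set(text.lower().split()):
            index[word[0]].setdefault(word, []).append(url)
    for letter, words in index.items():
        (out_dir / f"fragment_{letter}.json").write_text(json.dumps(words))

def search(query, out_dir):
    """Load only the fragment for the query's first letter."""
    fragment = out_dir / f"fragment_{query[0].lower()}.json"
    if not fragment.exists():
        return []
    words = json.loads(fragment.read_text())
    # Prefix match, like typing "doc" and matching "docker".
    hits = set()
    for word, urls in words.items():
        if word.startswith(query.lower()):
            hits.update(urls)
    return sorted(hits)

with TemporaryDirectory() as tmp:
    out = Path(tmp)
    build_fragments({
        "/ep1/": "Deploying Django with Docker containers",
        "/ep2/": "Static sites with Hugo",
    }, out)
    print(search("doc", out))  # → ['/ep1/'] — only fragment_d.json was read
```

The payoff is the same one described in the conversation: the browser never fetches the whole index, just the small fragment matching what the user has typed so far.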
+
+00:47:18 Well, thanks a lot, David.
+
+00:47:19 I have another thing I've got to go research.
+
+00:47:21 This is awesome.
+
+00:47:22 I'm a huge fan of PageFind, as I said.
+
+00:47:24 My personal website, mkennedy.codes, is just pure static.
+
+00:47:29 It starts in Markdown and ends up in HTML.
+
+00:47:31 But if you add PageFind in, you get a super rich, if you want to just know, you want to talk about,
+
+00:47:36 like what was about Docker, it shows you really nice results,
+
+00:47:40 pulling out the different parts of the page and sections that talk about it,
+
+00:47:43 like the headers and then what is said.
+
+00:47:45 And it even does like sub-word, you know, like you just type doc,
+
+00:47:50 it finds all the words that match that.
+
+00:47:52 And what I really like about it is a couple of things.
+
+00:47:54 It's instant. It basically is like nearly instant. If you type a few things, it gets way faster
+
+00:47:59 because it's pulling down. And if you go and look in the network console here and you type
+
+00:48:05 something, you can see that it's actually pulling in these little tiny fragments, which this one's
+
+00:48:10 coming off disk cache in three milliseconds, right? But it breaks your index into a bunch of very small
+
+00:48:16 PageFind fragments that, I think it's like, it starts with anything that starts with the word
+
+00:48:21 DO. These are all the prebuilt results and stuff like that. Right.
+
+00:48:25 That's right. That's right.
+
+00:48:26 Yeah. That's super cool.
+
+00:48:27 Yeah. One of our open source projects that, that we maintain is a Vue, a
+
+00:48:34 Vue.js component library for PageFind so that we can style it and reuse it
+
+00:48:39 across different projects.
+
+00:48:41 Oh, that's awesome. I love it.
+
+00:48:42 Yeah. I think this really unlocks it.
+
+00:48:44 And I mean, you go to so many, so many sites, like their documentation or just
+
+00:48:48 their web app, and the search is so bad.
+
+00:48:51 You type something and it's like thinking, spinning, spinning, spinning, spinning.
+
+00:48:57 And then like five seconds later, it gives you kind of janky results.
+
+00:49:00 And if you just like throw PageFind in there, it's, you can't type fast enough to
+
+00:49:05 outrun the results.
+
+00:49:05 You know what I mean?
+
+00:49:06 No, that's right.
+
+00:49:07 Yeah.
+
+00:49:07 Too many static site search solutions, they use like a, like a JSON blob that you, that
+
+00:49:12 you have to pull down and, and then iterate through.
+
+00:49:15 You know what's worse,
+
+00:49:16 and I see this a lot, would be if you go to google.com
+
+00:49:21 and then you would say effectively site colon whatever
+
+00:49:24 and then you search Docker, right?
+
+00:49:26 They basically pull that.
+
+00:49:29 You know, they just say search this and you just get Google results for your site.
+
+00:49:33 And obviously it's, I mean, Google's fine, but it's just.
+
+00:49:36 No, I find that unusable, really.
+
+00:49:38 I do too.
+
+00:49:38 It really, you're like, ah, geez.
+
+00:49:41 But now I'm super excited to realize I can do that from my dynamic content as well.
+
+00:49:46 So with the Python integration.
+
+00:49:48 OK, nice.
+
+00:49:51 What about something truly static?
+
+00:49:53 Have you looked at Hugo and some of the other type of things?
+
+00:49:56 Sure.
+
+00:49:57 So when I see you've even got the tab up for the Tsumeb project,
+
+00:50:02 which is-- that's essentially a database of many, many specimens
+
+00:50:09 taken from the Tsumeb mine.
+
+00:50:11 So in the--
+
+00:50:12 Oh, it is.
+
+00:50:13 Yeah, yeah, it is.
+
+00:50:13 So if you click on Minerals database, you open up that search interface and that's powered by PageFind.
+
+00:50:19 Oh, this is?
+
+00:50:21 Yes.
+
+00:50:22 I forget what I was...
+
+00:50:23 I see.
+
+00:50:24 You guys even hooked into...
+
+00:50:26 I was thinking just like pure static, like Hugo, like...
+
+00:50:30 Oh, yes. Yes. Yes.
+
+00:50:31 So this is an Astro site.
+
+00:50:33 So for this website, we have this as an Astro site so that we have a little...
+
+00:50:37 Because with Astro, they make it so easy to pull in like Vue components.
+
+00:50:42 So like our PageFind is a custom Vue.js component library. With Astro,
+
+00:50:47 you can use React components, you can use the Vue components, but what it does is it's just
+
+00:50:52 a static site generator. Fantastic. So a little bit more designable
+
+00:50:57 than like Hugo or something. Here's your Markdown file. Good luck with that.
+
+00:51:00 Yeah. I love Hugo though. Yeah. I use Hugo for different personal sites here and there,
+
+00:51:05 and it's just so fast and easy to get up and running. But yeah, it's great.
+
+00:51:08 - Great, great when it's a good friend.
+
+00:51:09 - That's what my website's written in, it's in Hugo.
+
+00:51:12 But if I'm integrating with anything else, I used to kind of like split it up,
+
+00:51:15 like this part's Hugo and this part's like a Python app.
+
+00:51:17 And it's pretty easy to get something that'll take a bunch of Markdown files
+
+00:51:21 and just turn them into HTML and just put a page template around that.
+
+00:51:25 So I've kind of stepped away from mixing and matching that
+
+00:51:29 as much as I used to.
+
+00:51:30 So now if I've got a static section of a dynamic site, but that doesn't address,
+
+00:51:34 has nothing to do with the archival side of things, right?
+
+00:51:38 Because the idea is that the thing that I'm describing is gone on purpose.
+
+00:51:42 That's right.
+
+00:51:42 So you've got some, we've got Django Bakery.
+
+00:51:46 I threw out Frozen-Flask, and I'm sure there's a ton more that neither of us are aware of at the moment.
+
+00:51:52 So Django Bakery was really good for that purpose.
+
+00:51:56 And we're keeping our eyes open for projects that it's a good fit for.
+
+00:52:01 But that was a pretty simple website.
+
+00:52:03 It needed a dynamic backend, but it was quite straightforward.
+
+00:52:06 And for Django Bakery, you have to opt into inheriting from their class-based views.
+
+00:52:11 I see.
+
+00:52:12 So if you're doing, for example--
+
+00:52:13 You've got to dig ahead of it, yeah.
+
+00:52:15 Yeah, yeah, yeah, absolutely.
+
+00:52:17 Yeah, hard to add retroactively.
+
+00:52:18 Probably impossible.
+
+00:52:20 Now, our other websites, like the fin example and the Mapping Color example, those are APIs.
+
+00:52:27 That's a Django API, Django REST framework for one, GraphQL for the other.
+
+00:52:32 One has a Vue front end, one has a React front end.
+
+00:52:34 OK, well, Django Bakery just isn't going to work very well for like serializing JSON.
+
+00:52:39 Yeah, it's like awesome.
+
+00:52:40 Here's your unrendered JavaScript front end code and it's just going to look empty or something.
+
+00:52:45 Yeah.
+
+00:52:46 So it is a good reason to consider using like vanilla Django templates when possible,
+
+00:52:52 like for that reason.
+
+00:52:53 But those were, those were inherited from the vendors, those two sites.
+
+00:52:59 And we've made a lot of progress on those.
+
+00:53:01 So, you know, what, what to do in that, like in that situation, Django Bakery isn't an option. And those projects are not end of life
+
+00:53:10 yet. So we have some time, but we're, we're, we're, so what we're doing is strategizing, okay,
+
+00:53:15 how will we rescue them? How will we keep them alive once, once somebody needs to stop paying
+
+00:53:20 for hosting? And we have, we have ideas. We have, I think there's, there's clever, interesting
+
+00:53:26 things out there. We'll have to keep looking into it. There are some pretty interesting ideas. And if
+
+00:53:34 that ran in a container, you could just have WebAssembly, but still have it go, right?
+
+00:53:41 Sort of a local loopback type of thing.
+
+00:53:43 Yeah, I'm really interested in this one because it enables essentially the full functionality
+
+00:53:51 of the live site to exist as what is just a static site.
+
+00:53:55 So because of Pyodide and projects like PyScript, we can run Python in the browser and we can
+
+00:54:03 run SQLite in the browser. And now we can even run Postgres in the browser with PGlite. So if
+
+00:54:09 we can run all those things in the browser, then couldn't we have Django hosted right in the browser?
+
+00:54:15 And you can. So there's a proof of concept that proves it's possible called Django WebAssembly.
+
+00:54:23 And if you load this up, it'll let you log in to the Django admin. And you're not logging into
+
+00:54:29 anybody's backend, you're logging into your own browser where this is running in a service worker.
+
+00:54:36 Awesome. Look at that. Oh, hold on. It told me what the password was. Very secure.
+
+00:54:40 Matt, password.
+
+00:54:42 Well, it can be entirely insecure because, yeah, you're just, it's running right in your own browser.
+
+00:54:47 Yeah, that's awesome. And here we are, Django admin. Incredible.
+
+00:54:50 Yeah, so I'm pretty interested in this. You've got to convert an RDS Postgres database
+
+00:54:55 into either SQLite or something like PGlite, but I think that's all doable.
+
+00:54:59 So I think it's an exciting possibility.
+
+00:55:02 Yeah, for sure.
+
+00:55:03 I do think, so maybe you have a rich query system that you're powering by your database
+
+00:55:08 that's really heavy.
+
+00:55:09 Exactly.
+
+00:55:10 And it's got a bunch of data that's like, here's all of our working data
+
+00:55:13 that you might ask questions about.
+
+00:55:15 Maybe you just convert that to PageFind to help you find the pieces
+
+00:55:18 and then just keep the operational data and maybe like even a SQLite with like the Django ORM,
+
+00:55:23 you can just switch the connection, keep talking to it.
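The "just switch the connection" step Michael mentions is, in Django terms, a settings change: point DATABASES at a different backend and the ORM code above it stays the same. A hedged sketch, using Django's standard DATABASES keys but with made-up names, hosts, and file paths, of what an archive build might do:

```python
import os

# Live deployment: the RDS Postgres instance (all identifiers here are placeholders).
POSTGRES = {
    "ENGINE": "django.db.backends.postgresql",
    "NAME": "projectdb",
    "HOST": "project.abc123.us-east-1.rds.amazonaws.com",
}

# Archived build: the same data exported into a SQLite file shipped with the site.
SQLITE = {
    "ENGINE": "django.db.backends.sqlite3",
    "NAME": "archive.sqlite3",
}

# Models and queries don't change; only the connection does.
DATABASES = {"default": SQLITE if os.environ.get("ARCHIVE_MODE") else POSTGRES}
```

The export from Postgres to SQLite still has to happen separately (for example with `dumpdata`/`loaddata`), but the application code itself is untouched.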
+
+00:55:25 I mean, there's possibilities to just get something not too terrible.
+
+00:55:28 Well, it's not the same, but not that far off.
+
+00:55:31 Yeah, exactly.
+
+00:55:32 And then it goes on GitHub Pages and it can live hopefully forever.
+
+00:55:35 I mean, it feels like GitHub will last forever, but it'll last longer than funding will anyways.
+
+00:55:41 It's definitely going to last longer than just something that we can't pay for anymore, right?
+
+00:55:48 I don't know how long GitHub's going to be around for, I think a while, but you never know, right?
+
+00:55:53 It seems like stuff's going to last forever, then it gets changed.
+
+00:55:57 We had Subversion.
+
+00:55:59 Now it's completely gone, right?
+
+00:56:00 Just 20 years, 15 years later, but still, I think 100% there.
+
+00:56:05 Yeah.
+
+00:56:05 But if somebody can, if something ever happened, somebody just needs to copy that,
+
+00:56:09 that folder of HTML, CSS and JavaScript files and dump it into an S3 bucket or somewhere else.
+
+00:56:15 And then it can continue living there.
+
+00:56:17 So it's a good option.
+
+00:56:19 It's a great option.
+
+00:56:20 It's a really, really good option.
+
+00:56:21 I mean, I guess one of the long-term concerns might be what if the WebAssembly standard changes so much that it's not supported anymore?
+
+00:56:31 But you could probably byte-wise convert it if you had to, you know, like somebody would probably be able to create one.
+
+00:56:37 Yeah, that would be unfortunate.
+
+00:56:39 So I suppose if that happens, I mean, if that happens, yeah, then booting up one of these projects is like booting up an emulator for some old DOS game.
+
+00:56:49 Right, right.
+
+00:56:49 Well, I mean, I guess let's think about this for a second.
+
+00:56:52 Somebody got, oh gosh, what was the chain?
+
+00:56:55 This is the whole, JavaScript, the PyCon talk where they got, like, Firefox
+
+00:57:04 compiled into, not WASM, into asm.js or something like that.
+
+00:57:10 So it was run like Chrome was running Firefox, which was running, I think,
+
+00:57:14 Doom, which was also asm.js.
+
+00:57:17 If we can do that, we could get something that would run, that would read old WebAssembly
+
+00:57:22 into new WebAssembly if it really mattered to the world.
+
+00:57:24 Absolutely.
+
+00:57:25 Yeah.
+
+00:57:26 Especially if it's in a public repo that people who care about the data can,
+
+00:57:30 can rescue it somehow.
+
+00:57:31 Yeah.
+
+00:57:32 What about like a virtual machine?
+
+00:57:34 You know, I agree.
+
+00:57:35 Yeah, absolutely.
+
+00:57:36 Could have saved me some, take a snapshot of Ubuntu LTS, some version,
+
+00:57:42 and just what are we going to do?
+
+00:57:44 Everything we do is Dockerized.
+
+00:57:46 Everything is in a container.
+
+00:57:47 So in the worst case scenario, we could give somebody the image, and they could run it if
+
+00:57:51 they have Docker.
+
+00:57:53 I think that's a nice peace of mind to know that no matter what, something will be able
+
+00:57:57 to run this container.
+
+00:57:59 And even in, I don't know if you've used GitHub, what is it called, Codespaces.
+
+00:58:05 I archived one project.
+
+00:58:07 It was kind of dramatic and sudden that it needed to be archived, so without much time
+
+00:58:12 to do anything.
+
+00:58:13 And it was a Ruby on Rails project.
+
+00:58:15 And I'm not a Rails developer, but I was able to get it archived in a way
+
+00:58:19 that anybody could, with one command, go to the repo on GitHub and boot it up in Codespaces
+
+00:58:27 and then have it live running from their Codespace.
+
+00:58:30 And so that works too.
+
+00:58:32 Very cool.
+
+00:58:32 I think as WebAssembly grows, there'll be more possibilities for these types of things.
+
+00:58:38 Yeah, amazing.
+
+00:58:39 I'm pretty excited about PageFind having a Python API.
+
+00:58:42 I didn't realize that. So I'm going to be doing something with that for sure. So what else?
+
+00:58:46 Let me ask you one more thing before I kind of let you wrap up with some final thoughts here.
+
+00:58:51 What about AI? Oh, that's a good question. So AI, I mean, there's like, in my story,
+
+00:58:58 there's like one interesting part of AI, which is that I got started and self-learned everything I
+
+00:59:04 needed to about software development to begin doing this right before ChatGPT really came on and
+
+00:59:10 was able to do real programming. Yeah, you're like four years of legit programming before, right? So I
+
+00:59:17 think, I mean, so I was thinking, I was thinking, when I was thinking about how I got into it, I thought,
+
+00:59:21 what if I was four years later starting my PhD and wanting to do these tools? I would have been
+
+00:59:28 able to accomplish what I needed to for my research without acquiring the technical skills. And that
+
+00:59:34 would have been, that's a good thing. I'm not sure if that's good about it. It could be both. I
+
+00:59:37 would have thought it was a good thing. I would have thought it's a good thing. But in my hands
+
+00:59:43 now, like a software engineer, AI is more powerful in my hands now than it would have been then.
+
+00:59:52 So I can make it work for me. Yeah, I can make it work for me in a way that I couldn't have been
+
+00:59:57 able to then. So I'm thankful for that, but it's something I think of. I don't want to say it's
+
+01:00:02 necessarily a bad thing, but it definitely marks a difference, a difference in time between other
+
+01:00:07 people who are maybe wanting to get into digital humanities, they're humanities researchers. They
+
+01:00:13 want to add some digital tools. You know, I think this will kind of, this will probably knock people
+
+01:00:18 off of the more technical path because it's not needed. I think it will too. And I think that that
+
+01:00:22 might be a negative.
When you were telling me your story originally, I was thinking kind of like,
+
+01:00:27 how neat is it that you didn't sign up for, and the people you're working with probably didn't
+
+01:00:32 intend to sign you up for, learning true software development.
+
+01:00:36 But look at this cool and interesting job that you now have that you never
+
+01:00:41 would have imagined.
+
+01:00:42 I'm sure when you signed up for your PhD, you're like, you know what I'm
+
+01:00:44 going to do when I get my PhD, I'm going to go X, Y, like, I'm going to
+
+01:00:47 join the DARTH program.
+
+01:00:48 Like, no, probably not.
+
+01:00:49 Right.
+
+01:00:50 But here you are.
+
+01:00:51 And I think that's actually a really interesting knock-on effect for a lot
+
+01:00:54 of researchers and people in grad schools, they're kind of put into this
+
+01:00:59 programming-adjacent type of thing.
+
+01:01:01 You know, and a lot of folks sort of are like, actually, that's pretty interesting.
+
+01:01:04 I'm going to kind of lean into that.
+
+01:01:06 And I think AI might knock, like you said, knock people off that path to some degree.
+
+01:01:11 Yeah, yeah, definitely.
+
+01:01:12 So that's just like one part of the AI story.
+
+01:01:15 The other one is that, like how we use it.
+
+01:01:18 It's great for data extraction, pulling data out of different, you know, to make these
+
+01:01:25 search interfaces more powerful, to extract different data from them.
+
+01:01:30 That's just one example where it's been handy.
+
+01:01:33 We're looking for ways that it can really empower faculty.
+
+01:01:39 We're still very much in the exploration phase of how we can use it and provide it to faculty as a digital humanities tool.
+
+01:01:48 Sure. I was thinking, pretty much, when I asked the question, it's just like two parts.
+
+01:01:52 One, how is it? Are you guys using it to help take projects that,
+
+01:01:56 well, would have been a month? No, actually, it's three days.
+
+01:01:58 You know what I mean?
+
+01:02:00 that. And then if people are asking, you know, a professor comes along and says, and we want our
+
+01:02:05 own custom AI thing, or we're using Harvard's internal one that we're allowed to use, but we
+
+01:02:13 won't be able to use it once the grant runs out. You know what I mean? Yeah. Yeah. I think one,
+
+01:02:17 one good example of this type of thing is that what we're starting to get is faculty who are
+
+01:02:23 vibe coding and now, and we are going to teach them. We're going to teach them how to do it.
+
+01:02:28 You know, instead of having them.
+
+01:02:31 Yeah, it's absolutely a skill.
+
+01:02:32 Yeah, no, it is.
+
+01:02:33 It is.
+
+01:02:34 Instead of copy and pasting from ChatGPT into VS Code, having them learn Copilot, maybe even having them download Cursor.
+
+01:02:43 Download some real dedicated tools to get this done to make them more productive.
+
+01:02:48 So, yeah, educating about how to do it is one thing.
+
+01:02:53 You asked if we're using it.
+
+01:02:54 We have access to Copilot.
+
+01:02:58 And that's great. I can't say that we've shipped anything in three days instead of a month yet,
+
+01:03:04 but one anecdote is that right now I'm doing some really interesting processing of music audio files,
+
+01:03:13 and somebody asked, they have a beatboxer, if I could chop that file up so that all of the individual
+
+01:03:19 sounds that the beatboxer makes are identified in a file. And so I'm using some music libraries,
+
+01:03:26 a Python library called Librosa. There's some complicated math in there. It's a little bit
+
+01:03:32 too much for me. It's no problem for Claude. Claude knows how to do that math. And then,
+
+01:03:36 and I use my expertise to string it together to get a good output.
+
+01:03:39 Yeah. Awesome. You got time for one more quick question before we wrap things up.
+
+01:03:44 For sure.
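For the beatbox-chopping task David describes, librosa's onset detection is the real tool; the core idea underneath it, a sudden jump in frame energy marks the start of a new sound, can be sketched with nothing but the standard library. The frame size, threshold ratio, and the synthetic "beatbox" signal below are all made-up illustration values, not anything from the actual project:

```python
import math

def frame_energies(samples, frame_size=256):
    """Mean squared amplitude per frame."""
    return [
        sum(s * s for s in samples[i:i + frame_size]) / frame_size
        for i in range(0, len(samples), frame_size)
    ]

def detect_onsets(samples, frame_size=256, ratio=4.0, floor=1e-4):
    """Flag frames whose energy jumps well above the previous frame:
    a crude stand-in for what a real onset detector does."""
    energies = frame_energies(samples, frame_size)
    onsets = []
    prev = floor
    for i, e in enumerate(energies):
        if e > floor and e > ratio * prev:
            onsets.append(i * frame_size)  # sample offset where a sound starts
        prev = max(e, floor)
    return onsets

# Synthetic "beatbox": silence, a burst, silence, another burst.
silence = [0.0] * 1024
burst = [0.8 * math.sin(2 * math.pi * 110 * t / 8000) for t in range(512)]
signal = silence + burst + silence + burst
print(detect_onsets(signal))  # → [1024, 2560]
```

A production version would use spectral flux rather than raw energy (which is roughly what librosa's onset detection computes), but the shape of the problem, turning one long file into a list of cut points, is the same.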
+
+01:03:45 Raymond out there, Raymond Yee asks, he says, it'd be good to hear how Harvard uses containers on AWS
+
+01:03:51 and its reliability. It's reliable, not the cheapest way to host things. Are you thinking about moving
+
+01:03:56 that or is it not that much? Okay, I'll tell you about a failed experiment.
+
+01:04:03 We were using ECS and we're still using ECS. So that's AWS's main, you know, it's not Kubernetes,
+
+01:04:11 but it's one step down with their horizontal scaling container clusters. And I wanted to move
+
+01:04:17 us onto a single EC2 instance because our projects are popular, but they're not so popular that we
+
+01:04:23 actually have to worry about horizontal scaling.
+
+01:04:25 Right.
+
+01:04:26 It's not like it's front page of the New York Times.
+
+01:04:30 I guess it probably could be.
+
+01:04:31 But even so, for the static sites, they probably still can take it.
+
+01:04:35 Yeah.
+
+01:04:35 So I priced it out and I got an example deployed, an example project deployed, and was able
+
+01:04:42 to confirm that it would indeed be much cheaper.
+
+01:04:45 And it was deployed in a similar way using AWS CDK.
+
+01:04:49 So it's all infrastructure as code all the way down.
+
+01:04:52 But it turns out there's all kinds of compliance.
+
+01:04:54 When you are in charge of the VM at like a big university,
+
+01:04:58 or I'm sure any corporate setting, if you are in charge of the VM and the OS on it,
+
+01:05:04 then you have to know that you have the latest patches in.
+
+01:05:07 You have to know that you have the latest Ubuntu.
+
+01:05:09 And then there's other things, different observability things
+
+01:05:13 that you have to have in place that are not usually required
+
+01:05:17 if you're running in a container cluster like ECS.
+
+01:05:21 So it ends up being a lot less work and much easier to achieve compliance if we run containers
+
+01:05:28 or some other serverless thing.
+
+01:05:31 I run all my personal projects, they all run in a single virtual machine, but we're
+
+01:05:37 running in containers.
+
+01:05:38 Yeah.
+
+01:05:38 Yeah.
+
+01:05:39 And you've got all the SOC 2 stuff and all those different things, right?
+
+01:05:42 Like there's layers.
+
+01:05:43 Yeah, that's right.
+
+01:05:44 Yeah.
+
+01:05:44 I mean, I'll mention that, but what I didn't say is that in 2019, when I started learning
+
+01:05:50 Python, I discovered Talk Python almost immediately. And one of the first episodes that I listened to
+
+01:05:55 was the one about digital humanities. Cornelis van Lit. He was an awesome guest.
+
+01:06:01 That's right. Yeah. And I thought that was great. And that was also a bit about manuscripts,
+
+01:06:06 a little bit more on the image side than the text side. And I didn't understand everything
+
+01:06:11 that everybody was saying, but I just, I kept tuning in. And I think because of that,
+
+01:06:16 Because Talk Python was like this, you know, I've been remote working for most of my time.
+
+01:06:22 And Talk Python has been kind of like that conversation with the open source community
+
+01:06:27 that's been always in my ear.
+
+01:06:28 And I think that made, you know, a difference, making me feel like I understood the software
+
+01:06:34 landscape and like the developer culture and what was going on.
+
+01:06:37 And then the different Python libraries and what was possible.
+
+01:06:41 So to people who are interested in taking things in a more technical direction, I think
+
+01:06:47 it's helpful just to find a few things like that, that give you an insight into that world.
+
+01:06:53 And the more you listen to it, the more you start to hear the same acronyms and the same
+
+01:06:59 things said enough that you start to feel like, okay, now you're part of the club.
+
+01:07:03 I really appreciate that.
+
+01:07:05 That's cool.
+
+01:07:06 I've certainly had people reach out to me and say things that at first didn't make any
+
+01:07:09 sense to me.
+
+01:07:10 Like I've been listening for six weeks now and it's starting to make sense what you're talking about.
+
+01:07:14 Like, why have you been listening for six months when it made no sense?
+
+01:07:16 That's insane.
+
+01:07:17 But a lot of people use listening to the podcast, be it mine or others, as language immersion, right?
+
+01:07:24 Like I could get Duolingo and I could learn Portuguese
+
+01:07:28 or I could move to Brazil for a month.
+
+01:07:30 You know what I mean?
+
+01:07:31 And then I would really learn.
+
+01:07:32 - Yeah, exactly.
+
+01:07:33 - Right.
+
+01:07:34 - Exactly.
+
+01:07:34 No, I think there's truth to that.
+
+01:07:36 And some of the things I did was, you know, search through, like search the word deployment, because I'm trying to get my head around how to
+
+01:07:43 deploy for the first time. And I just want to hear people talk about it. Like I could read about it.
+
+01:07:47 I could read the tutorial, but I just want to hear people talk about deployment to get a sense of what
+
+01:07:52 actual deployment sounds like. There's something really different when you're learning or trying,
+
+01:07:57 even you're maybe an experienced programmer, but not in this particular area, to hear a human
+
+01:08:01 side of it, not just the docs, not a sterile "these are the four steps." But like, I love it.
+
+01:08:08 I mean, it's probably why I created the show.
+
+01:08:10 It's because I didn't hear those stories.
+
+01:08:11 We got to tell those stories.
+
+01:08:13 Awesome.
+
+01:08:13 I appreciate that.
+
+01:08:14 So super cool.
+
+01:08:15 All right.
+
+01:08:16 So if other people are listening, maybe one of your pieces of advice is keep listening.
+
+01:08:21 You'll get there.
+
+01:08:22 Yeah.
+ +01:08:22 And if anybody is in the humanities and somehow found their way onto this episode with no technical experience, + +01:08:30 I just would give the caution of, like, you know, the anecdote that if AI coding had been + +01:08:37 around the way it is now when I was learning, I wouldn't be doing digital humanities at + +01:08:43 Harvard. + +01:08:43 I wouldn't have been able to get into this field. + +01:08:46 I wouldn't have known about it. + +01:08:47 So I guess just think about that when you're learning and applying new tools. + +01:08:52 I don't really know what the right fix for that is. + +01:08:55 That's a very challenging problem. + +01:08:56 I mean, you can say I'm just literally not going to fire it up. + +01:08:59 But I mean, we used to hunt through Stack Overflow and the web and over and over. + +01:09:03 And if you're really stuck or you really don't understand, like they're good at explaining + +01:09:06 stuff too. + +01:09:07 You just got to really stay in a learner's mindset, not just press the easy button and + +01:09:12 make this thing and move on. + +01:09:13 Easier said than done. + +01:09:14 Easier said than done. + +01:09:15 So yeah, I want to leave this with kind of a thought about how much things like Python + +01:09:22 and these tools and technology can really empower stuff that you wouldn't think is even + +01:09:27 related, like understanding old manuscripts and how painting is connected or changed over time and + +01:09:34 stuff, right? Those sound very much disjointed from tech and software, but they really are + +01:09:40 superpowers that you can bring to your work, whatever your industry is. I know our field of + +01:09:45 study, I know there's some sociologists out in the audience and I'm sure others as well. + +01:09:50 All right. Final thoughts, David, close it out. You said it great. I mean, you know, + +01:09:55 Just applying these technical tools to old questions, that is the core of digital humanities. 
+

01:10:02 When I first started hearing about this, I thought, I really don't know how this ties

01:10:05 together.

01:10:05 And after seeing it a few times, I definitely see the power of it.

01:10:08 And I thank you for your time coming on.

01:10:11 Thank you for sharing your work and the look inside of your team and inside of a small piece

01:10:16 of Harvard.

01:10:17 I really like these kinds of episodes because it's hard to see this from the outside, right?

01:10:23 like you just see the results, but you don't see like the inner workings of the team

01:10:27 and the motivation and stuff.

01:10:28 So thank you so much for being here.

01:10:31 And yeah, bye everyone.

01:10:33 This has been another episode of Talk Python To Me.

01:10:36 Thank you to our sponsors.

01:10:37 Be sure to check out what they're offering.

01:10:38 It really helps support the show.

01:10:40 Take some stress out of your life.

01:10:42 Get notified immediately about errors and performance issues in your web

01:10:46 or mobile applications with Sentry.

01:10:48 Just visit talkpython.fm/sentry and get started for free.

01:10:53 Be sure to use our code, talkpython26.

01:10:56 That's Talk Python, the numbers two, six, all one word.

01:11:00 This episode is brought to you by CommandBook, a native macOS app that I built

01:11:05 that gives long-running terminal commands a permanent home.

01:11:08 No more juggling six terminal tabs every morning.

01:11:10 Carefully craft a command once, run it forever with auto-restart,

01:11:14 URL detection, and a full CLI.

01:11:16 Download it for free at talkpython.fm/commandbook app.

01:11:19 If you or your team needs to learn Python, we have over 270 hours of beginner and advanced courses on topics ranging from complete beginners to async code, Flask, Django, HTML, and even LLMs.

01:11:32 Best of all, there's no subscription in sight.
+

01:11:35 Browse the catalog at talkpython.fm.

01:11:37 And if you're not already subscribed to the show on your favorite podcast player, what are you waiting for?

01:11:42 Just search for Python in your podcast player.

01:11:44 We should be right at the top.

01:11:46 If you enjoy that geeky rap song, you can download the full track.

01:11:49 The link is actually in your podcast player show notes.

01:11:51 This is your host, Michael Kennedy.

01:11:53 Thank you so much for listening.

01:11:54 I really appreciate it.

01:11:56 I'll see you next time.

01:12:08 I'm out.

diff --git a/transcripts/538-python-in-digital-humanities.vtt b/transcripts/538-python-in-digital-humanities.vtt
new file mode 100644
index 0000000..a3be1c2
--- /dev/null
+++ b/transcripts/538-python-in-digital-humanities.vtt
@@ -0,0 +1,3713 @@
+WEBVTT
+
+00:00:00.020 --> 00:00:05.200
+Digital humanities sounds niche until you realize that it can mean a searchable archive of U.S.
+
+00:00:05.420 --> 00:00:11.580
+amendment proposals, Irish folklore, or pigment science in ancient art. Today I'm talking with
+
+00:00:11.720 --> 00:00:17.780
+David Flood from Harvard's DARTH team about an unglamorous problem. What happens when the grant
+
+00:00:18.060 --> 00:00:24.180
+ends? But the website can't. His answer? Static sites, client-side search, and sneaky Python.
+
+00:00:24.600 --> 00:00:30.680
+Let's dive in. This is Talk Python To Me, episode 538, recorded January 22nd, 2026.
+
+00:00:48.540 --> 00:00:52.880
+Welcome to Talk Python To Me, the number one Python podcast for developers and data scientists.
+
+00:00:53.180 --> 00:00:54.780
+This is your host, Michael Kennedy.
+
+00:00:55.130 --> 00:00:58.720
+I'm a PSF fellow who's been coding for over 25 years.
+
+00:00:59.360 --> 00:01:00.420
+Let's connect on social media.
+
+00:01:00.800 --> 00:01:03.880
+You'll find me and Talk Python on Mastodon, BlueSky, and X.
+
+00:01:04.239 --> 00:01:06.060
+The social links are all in your show notes.
+
+00:01:06.820 --> 00:01:10.340
+You can find over 10 years of past episodes at talkpython.fm.
+
+00:01:10.520 --> 00:01:13.720
+And if you want to be part of the show, you can join our recording live streams.
+
+00:01:14.080 --> 00:01:14.540
+That's right.
+
+00:01:14.730 --> 00:01:18.000
+We live stream the raw uncut version of each episode on YouTube.
+
+00:01:18.580 --> 00:01:23.020
+Just visit talkpython.fm/youtube to see the schedule of upcoming events.
+
+00:01:23.180 --> 00:01:26.880
+Be sure to subscribe there and press the bell so you'll get notified anytime we're recording.
+
+00:01:27.820 --> 00:01:29.480
+This episode is brought to you by Sentry.
+
+00:01:29.800 --> 00:01:31.040
+Don't let those errors go unnoticed.
+
+00:01:31.230 --> 00:01:32.840
+Use Sentry like we do here at Talk Python.
+
+00:01:33.340 --> 00:01:36.200
+Sign up at talkpython.fm/sentry.
+
+00:01:37.040 --> 00:01:42.320
+And it's brought to you by CommandBook, a native macOS app that I built that gives long-running
+
+00:01:42.520 --> 00:01:43.980
+terminal commands a permanent home.
+
+00:01:44.360 --> 00:01:46.360
+No more juggling six terminal tabs every morning.
+
+00:01:46.820 --> 00:01:51.180
+Carefully craft a command once, run it forever with auto-restart, URL detection, and a full
+
+00:01:51.300 --> 00:01:51.620
+CLI.
+
+00:01:51.960 --> 00:01:55.100
+Download it for free at talkpython.fm/commandbook app.
+
+00:01:56.040 --> 00:01:59.200
+Hello, David. Welcome to Talk Python To Me. Amazing to have you here.
+
+00:01:59.760 --> 00:02:03.500
+I'm glad to be here. Talk Python has been part of my story up to this point.
+
+00:02:03.760 --> 00:02:09.220
+Has it? Okay. Well, you are about to write the next chapter in the story. So that's pretty excellent.
+
+00:02:10.020 --> 00:02:14.800
+I have a sense of what's coming. We planned out what we're going to talk about and that sort of thing.
+ +00:02:15.200 --> 00:02:20.420 +And I'm really excited about this topic. So it's going to be a good one. + +00:02:21.060 --> 00:02:34.860 +Honestly, I think one of the real powers of the Python community and the reason the language has such staying power is there's such a diversity of use cases, technology, like technology standpoints, right? + +00:02:34.870 --> 00:02:44.740 +Like I build software for this group or I build these types of apps and it's not just, you know, like Ruby on Rails, which, you know, it's been very popular, but it's, it's for websites, right? + +00:02:44.770 --> 00:02:45.280 +You know what I mean? + +00:02:45.920 --> 00:02:46.780 +Yeah, absolutely. + +00:02:47.280 --> 00:02:57.260 +I mean, web development has dominated my use of it, but my entry into it, which I suppose I'll mention in a moment, was through all those little tools. + +00:02:57.660 --> 00:02:58.320 +Let's hear it. + +00:02:58.930 --> 00:03:00.440 +Who are you, David Flood? + +00:03:00.490 --> 00:03:03.760 +Tell us, introduce yourself real quick and tell us about how you got into it. + +00:03:04.480 --> 00:03:09.140 +So my background is in music and the humanities. + +00:03:09.630 --> 00:03:14.900 +I mean, in 2019, I didn't know what Python was or the name of any programming language. + +00:03:16.180 --> 00:03:22.280 +and I've been doing textual criticism, which is, you know, there's lots of criticisms in the academy. + +00:03:22.860 --> 00:03:26.520 +This is the one where if you have lots and lots of versions of the same text, + +00:03:27.100 --> 00:03:33.280 +you are comparing them to work out what the initial text was and like how it changed over time. + +00:03:33.760 --> 00:03:35.160 +Okay, give us an example. + +00:03:35.700 --> 00:03:39.800 +Okay, so one of the famous examples, hope I can remember it off the top of my head, + +00:03:40.220 --> 00:03:41.600 +is from Shakespeare. + +00:03:42.460 --> 00:03:44.500 +We're all familiar with the line to be or not to be. 
+
+00:03:45.020 --> 00:03:52.520
+is the question. That is the question. Well, there's a variant of it. One of the early copies
+
+00:03:53.820 --> 00:03:58.900
+written by Shakespeare himself has... Somebody's going to be able to type into the chat exactly
+
+00:03:58.960 --> 00:04:03.780
+what it is. They'll know this anecdote. But it's something more like, "To be or not to be, I."
+
+00:04:04.080 --> 00:04:09.940
+That's the question. And so, which one is the original one? Why did he change it? That's kind
+
+00:04:09.900 --> 00:04:14.900
+of one example. I work mainly in the, in the New Testament, which is especially complicated because
+
+00:04:15.360 --> 00:04:23.300
+no other corpus from ancient history has as many copies of the same text as that corpus does. So it's
+
+00:04:23.340 --> 00:04:29.320
+quite, um, quite, quite complicated, and our techniques have, have grown, grown because of that and perhaps
+
+00:04:29.560 --> 00:04:37.639
+become more advanced than now. I mean, that many variations over that huge span of time, over
+
+00:04:37.660 --> 00:04:42.740
+different groups with different, maybe not intentions, but certainly colored by different
+
+00:04:43.060 --> 00:04:47.620
+worldviews and philosophies and so on. And yeah, I see the trouble.
+
+00:04:47.920 --> 00:04:53.940
+No, yeah. And they were people of the book. So copying it is something that happened a lot. And
+
+00:04:54.160 --> 00:05:01.260
+they copied the monks, like the medieval monks copied everything. They copied our Greek classics.
+
+00:05:01.900 --> 00:05:06.840
+So that's what I was interested in. And because of the wealth of data that we have,
+
+00:05:07.200 --> 00:05:10.700
+computer tools are more and more important in that field.
+
+00:05:11.020 --> 00:05:17.200
+So when I started my PhD in 2019, I knew that I wanted to use some of these cutting-edge tools.
+
+00:05:17.660 --> 00:05:19.260
+Some of them may be surprising.
+ +00:05:19.860 --> 00:05:24.100 +For example, we've been using phylogenetic software. + +00:05:24.480 --> 00:05:35.440 +This is software that evolutionary biologists are using or computational biologists are using to track, for example, how COVID strains mutate over time. + +00:05:35.680 --> 00:05:36.420 +Oh, interesting. + +00:05:36.440 --> 00:05:39.800 +What they're comparing are the DNA letters. + +00:05:40.320 --> 00:05:43.740 +And so you have the sequence of letters and you're comparing how those change over time. + +00:05:44.000 --> 00:05:48.160 +Well, you can swap in textual variants for DNA letters. + +00:05:48.600 --> 00:05:55.380 +And now we can track how texts change over time and group them into families, things like that. + +00:05:56.120 --> 00:05:59.580 +It's like a time series, but of words or letters or something. + +00:05:59.720 --> 00:06:06.400 +Yeah, I mean, yeah, there's lots of important algorithms for comparing + +00:06:06.400 --> 00:06:11.720 +sequences of things. And so if we can just swap in Greek words and Greek text instead, + +00:06:12.480 --> 00:06:16.420 +then we can maybe apply it to textual criticism. So I was pretty interested in those things. That + +00:06:16.600 --> 00:06:21.380 +wasn't actually the method that brought me into it, but something like that, kind of computer + +00:06:21.620 --> 00:06:27.760 +intensive tools. What I learned is that these tools weren't actually available to me. They + +00:06:27.940 --> 00:06:36.380 +weren't desktop applications. And for the most part, they weren't public web applications. They + +00:06:36.400 --> 00:06:38.000 +PyPI or something like that, right? + +00:06:38.420 --> 00:06:39.060 +Yeah, exactly. + +00:06:39.360 --> 00:06:39.460 +Exactly. + +00:06:39.540 --> 00:06:40.040 +Or Java. + +00:06:41.180 --> 00:06:43.380 +And I needed to glue them together. 
+
+00:06:43.780 --> 00:06:49.660
+So the long story short on that is during the first year of my PhD, I was picking up Python,
+
+00:06:50.000 --> 00:06:51.880
+watching YouTube videos while I was doing the dishes.
+
+00:06:52.800 --> 00:06:57.220
+And then the pandemic hit while I was living in Edinburgh in Scotland, probably not far
+
+00:06:57.460 --> 00:06:58.160
+from Will McGugan.
+
+00:06:59.220 --> 00:07:06.360
+And so the pandemic gave me the excuse to spend even a few more hours each day picking up these
+
+00:07:06.380 --> 00:07:12.900
+new, these new technical skills. And so I did it, I was able to use these advanced tools in my, in my
+
+00:07:13.100 --> 00:07:17.440
+work. But what was really important to me was sharing, like making that available to my colleagues,
+
+00:07:18.120 --> 00:07:23.860
+is I had to, I had to move from writing these like bad top to bottom Python scripts into things that
+
+00:07:23.860 --> 00:07:29.500
+could be reused by other people. And that led me into the web, because the web is where, that's how
+
+00:07:29.500 --> 00:07:35.740
+I can share with anybody. It's really wild how much the web is kind of the last bastion of
+
+00:07:36.640 --> 00:07:42.480
+app freedom. It's so bizarre because, you know, I've many times told the stories of the insane
+
+00:07:42.900 --> 00:07:48.500
+battles of just getting our apps that just playback video of content that's already on the web
+
+00:07:48.860 --> 00:07:54.740
+into the app store. I mean, weeks of fighting about the weirdest, most nonsensical things with
+
+00:07:54.860 --> 00:08:01.819
+both Google and Apple. But we also now have the Mac platform and the Windows platform very
+
+00:08:01.840 --> 00:08:07.780
+aggressively looking for digital code certificates and all sorts of signing and other kinds of proof,
+
+00:08:07.920 --> 00:08:12.840
+like it, you can't even just send somebody an executable anymore. It won't run. It's, it's crazy.
+
+00:08:13.120 --> 00:08:18.940
+It's, it's down to, like, okay, put it on the web, I guess. That's right. I, I, I played the game of
+
+00:08:19.080 --> 00:08:24.540
+distributing desktop apps. That's how I did it. That's why I initially distributed things, um,
+
+00:08:25.140 --> 00:08:30.860
+and at this point, I just require people to install Python and then install my desktop app from PyPI,
+
+00:08:30.880 --> 00:08:33.400
+because it's too hard otherwise for me.
+
+00:08:33.820 --> 00:08:36.479
+I mean, I could pay for the code signing from Apple
+
+00:08:36.890 --> 00:08:37.599
+and do all of that,
+
+00:08:37.740 --> 00:08:40.320
+but it's just, it's too much work for the time that I have.
+
+00:08:40.500 --> 00:08:42.140
+Yeah, I'm about to do another round of it.
+
+00:08:42.200 --> 00:08:42.979
+I'm working on an app
+
+00:08:44.060 --> 00:08:45.680
+and my developer account is still active.
+
+00:08:45.880 --> 00:08:47.680
+So we might have a fresh round of fun.
+
+00:08:47.820 --> 00:08:49.260
+Hopefully it goes through this time.
+
+00:08:50.320 --> 00:08:52.160
+Anyway, I do think it's such a challenge.
+
+00:08:52.380 --> 00:08:53.520
+And are you leveraging?
+
+00:08:53.940 --> 00:08:55.180
+I don't know if the timing was right.
+
+00:08:55.300 --> 00:08:56.300
+Like maybe this was too early,
+
+00:08:56.780 --> 00:08:59.740
+but these days, are you leveraging things like uvx
+
+00:09:00.060 --> 00:09:03.640
+to run, or are you just pip install this thing and then run it?
+
+00:09:04.100 --> 00:09:08.200
+Yeah, I haven't updated the readme in a while, so I think it just asks for pip.
+ +00:09:08.740 --> 00:09:14.220 +But certainly, if somebody asked me today, I would say, yeah, just install this with uv. + +00:09:14.920 --> 00:09:16.260 +Because then they don't even need Python. + +00:09:16.700 --> 00:09:17.100 +Exactly. + +00:09:17.420 --> 00:09:17.860 +And that's brilliant. + +00:09:18.440 --> 00:09:22.900 +And that's a really, it is another barrier reduced in distributing these applications, + +00:09:23.160 --> 00:09:23.220 +right? + +00:09:23.300 --> 00:09:28.600 +Like, if you can get uv installed on a machine, then you don't even have to say install, just + +00:09:28.560 --> 00:09:32.960 +The way you run it is uvx my thing and it's all transparent to you, right? + +00:09:33.020 --> 00:09:33.520 +Which is beautiful. + +00:09:33.900 --> 00:09:34.880 +So what was it like? + +00:09:35.100 --> 00:09:35.320 +Yeah. + +00:09:35.790 --> 00:09:42.340 +So what was it like coming from what sounds like a not super screen focus, super + +00:09:43.020 --> 00:09:47.300 +techie aspect and having to dive into this world and someday you're probably + +00:09:47.420 --> 00:09:49.900 +like, how is it that I'm publishing stuff to PyPI? + +00:09:49.990 --> 00:09:50.740 +What has happened to me? + +00:09:51.300 --> 00:09:51.720 +Yeah. + +00:09:51.970 --> 00:09:56.259 +well, yeah, I remember when I, when I first signed up for GitHub, because + +00:09:56.320 --> 00:10:02.160 +you know, whatever YouTube tutorial I was working through at the time, you know, said that I needed + +00:10:02.160 --> 00:10:08.720 +to do that. You know, I think it all started making a lot of sense. I didn't have any technical + +00:10:08.980 --> 00:10:16.880 +background, but the world kind of open source software, it just kind of made sense. It felt + +00:10:17.020 --> 00:10:23.820 +like it fit really well into my academic, you know, circle. I think a lot of the attitudes are + +00:10:23.840 --> 00:10:27.980 +similar. I agree. I think they are actually. 
And I think that's, I think that's a pretty neat thing. + +00:10:28.480 --> 00:10:34.240 +Yeah. Very cool. All right. Well, let's talk about what you're doing with digital humanities. + +00:10:34.760 --> 00:10:40.400 +You're actually at a really interesting project or organization, I guess, that does many projects, + +00:10:40.600 --> 00:10:45.020 +right? Yeah. Yeah. So fast, fast forwarding, I did, I finished my PhD in the humanities. + +00:10:45.200 --> 00:10:50.780 +Sorry. I had so much fun. No, that's fine. That's fine. I had so much fun writing like these tools + +00:10:50.860 --> 00:10:54.120 +and then just solving the distribution problem + +00:10:54.450 --> 00:10:55.760 +to share them with other scholars. + +00:10:56.740 --> 00:11:00.120 +That was so fun that I was open to this kind of opportunity + +00:11:00.770 --> 00:11:01.980 +where now I'm doing this full time. + +00:11:02.570 --> 00:11:04.200 +And so, yes, so I'm on the, + +00:11:04.500 --> 00:11:06.720 +we call it affectionately Darth, + +00:11:07.440 --> 00:11:10.240 +which is digital arts and humanities at Harvard. + +00:11:11.200 --> 00:11:14.160 +There has to be a lot of Star Wars memes and references, + +00:11:14.310 --> 00:11:14.680 +I'm sure. + +00:11:14.980 --> 00:11:16.400 +If you can pull up a 404, + +00:11:16.770 --> 00:11:19.060 +I think there will be a Darth Vader reference. + +00:11:19.470 --> 00:11:20.660 +Seriously, I'm here for it. + +00:11:22.360 --> 00:11:25.680 +Yes, page not found. I find your lack of nav disturbing. + +00:11:27.660 --> 00:11:33.020 +You know what? I think that is beautiful. And I really, I really think that people should embrace + +00:11:33.580 --> 00:11:40.860 +the 404, the fun 404 page, you know, more, right? There should really be something going on that + +00:11:40.900 --> 00:11:45.560 +like makes it, you know, something hasn't worked out, but you can just, you can make people laugh. + +00:11:46.280 --> 00:11:47.340 +Yeah. I appreciate that. 
+ +00:11:48.560 --> 00:11:50.360 +I've heard people push back against it. + +00:11:50.490 --> 00:11:57.900 +Like if you're on a, if you're on like your medical website and you're maybe about to get bad news and then you get like a picture of a kitten. + +00:12:00.160 --> 00:12:01.880 +Dr. Kitten doesn't know where your results went. + +00:12:02.020 --> 00:12:02.760 +Like I get that. + +00:12:02.800 --> 00:12:03.300 +That's not funny. + +00:12:04.060 --> 00:12:05.780 +But I mean, most things are not that serious. + +00:12:06.560 --> 00:12:06.780 +Yeah. + +00:12:07.540 --> 00:12:07.920 +Mostly. + +00:12:08.780 --> 00:12:08.920 +Okay. + +00:12:09.180 --> 00:12:11.740 +So what kind of things does Darth do? + +00:12:12.080 --> 00:12:16.680 +You've described this as kind of a web or tech agency within Harvard. + +00:12:17.280 --> 00:12:18.200 +Yeah, it is very much. + +00:12:18.420 --> 00:12:21.740 +So, you know, Harvard has a gigantic IT group. + +00:12:21.890 --> 00:12:28.080 +I don't know how many hundreds of people work, but more than 500 people in IT. + +00:12:28.840 --> 00:12:33.000 +We are a small team and we operate very much like a small agency. + +00:12:33.540 --> 00:12:41.500 +So usually what happens is a faculty member has a funded research project that's going to last for an amount of time. + +00:12:42.210 --> 00:12:44.640 +And then we consult with them to build it. + +00:12:44.880 --> 00:12:53.160 +And most of the time, I kind of think of these as I kind of have these different categories of these kinds of projects that I think of. + +00:12:54.070 --> 00:12:56.060 +I lost in my notes what I call them. + +00:12:56.170 --> 00:12:57.240 +But they are there. + +00:12:57.390 --> 00:13:01.160 +You have like a one is like a virtual research environment. + +00:13:01.450 --> 00:13:07.400 +So the focus is this is this is a platform that we're building for the research to be done on. 
+ +00:13:07.720 --> 00:13:17.000 +Like the reason the research should be done in like a web app would be because you have access to visualization, to Postgres, to Pandas. + +00:13:17.170 --> 00:13:23.420 +So we can kind of build up this platform to do the actual research on and some of the data entry. + +00:13:23.700 --> 00:13:26.060 +So like a full on research application. + +00:13:26.660 --> 00:13:27.040 +Yeah, exactly. + +00:13:27.580 --> 00:13:36.040 +I guess you can also kind of see your work through the different stages of research projects and academic research and so on. + +00:13:36.220 --> 00:13:41.900 +And we'll get to maybe end of life in a sense further down in the conversation. + +00:13:42.470 --> 00:13:48.680 +But so this would be we have a grant or we just work here and we're going to work on some form of research. + +00:13:49.210 --> 00:13:49.960 +What do you give them? + +00:13:50.480 --> 00:13:58.540 +Right. And I think that's a super interesting challenge because one of the real common answers would be Jupyter, Jupyter Lab, Marimo, whatever. + +00:13:59.130 --> 00:14:05.380 +But that's still pretty code heavy for people who are possibly philosophers or something, you know. + +00:14:05.800 --> 00:14:12.820 +Oh, exactly. That's why in digital humanities, I won't even, maybe I won't even attempt to define + +00:14:13.710 --> 00:14:19.820 +it in any narrow sense, because I'll get in trouble with somebody. But you have two groups + +00:14:20.370 --> 00:14:26.580 +that are interfacing with each other. And one is digital humanities as a field, like as a subfield, + +00:14:26.800 --> 00:14:31.320 +all of its own. And these are people who have humanities domain, like knowledge, + +00:14:31.860 --> 00:14:36.680 +and technical skills, and they're bringing them together. And in a lot of cases, the audience for + +00:14:36.840 --> 00:14:42.480 +that kind of work is other people working in the digital humanities. 
But far more common, + +00:14:42.780 --> 00:14:49.220 +and this is what we work with, is people who have humanities domain expertise, and they want to + +00:14:49.600 --> 00:14:55.560 +publish or do research or share with other people who have that same humanities domain expertise, + +00:14:55.640 --> 00:14:59.640 +and they are now interested in adding a technical component to it. + +00:14:59.960 --> 00:15:02.000 +How can we supercharge what they have? + +00:15:03.500 --> 00:15:06.120 +This portion of Talk Python is brought to you by Sentry. + +00:15:06.580 --> 00:15:09.680 +I've been using Sentry personally on almost every application + +00:15:10.060 --> 00:15:12.480 +and API that I've built for Talk Python and beyond + +00:15:13.260 --> 00:15:14.260 +over the last few years. + +00:15:14.580 --> 00:15:17.460 +They're a core building block for keeping my infrastructure solid. + +00:15:18.060 --> 00:15:19.360 +They should be for yours as well. + +00:15:19.640 --> 00:15:20.020 +Here's why. + +00:15:20.680 --> 00:15:22.100 +Sentry doesn't just catch errors. + +00:15:22.200 --> 00:15:24.900 +It catches all the stuff that makes your app feel broken, + +00:15:25.280 --> 00:15:27.560 +the random slowdown, the freeze you can't reproduce, + +00:15:28.260 --> 00:15:30.620 +that bug that only shows up once real users hit it. + +00:15:30.960 --> 00:15:31.820 +And when something goes wrong, + +00:15:32.180 --> 00:15:34.500 +Sentry gives you the whole chain of events in one place. + +00:15:34.720 --> 00:15:37.700 +Errors, traces, replays, logs, dots connected. + +00:15:38.080 --> 00:15:39.900 +You can see what's led to the issue + +00:15:40.040 --> 00:15:41.880 +without digging through five different dashboards. 
+ +00:15:42.700 --> 00:15:44.720 +SEER, Sentry's AI debugging agent, + +00:15:45.200 --> 00:15:47.180 +builds on this data, taking the full context, + +00:15:47.840 --> 00:15:49.820 +explaining why the issue happened, + +00:15:50.400 --> 00:15:52.780 +pointing to the code responsible, drafts a fix, + +00:15:52.880 --> 00:15:55.840 +and even flags if your PR is about to introduce a new problem. + +00:15:56.680 --> 00:15:57.720 +The workflow stays simple. + +00:15:58.160 --> 00:15:59.900 +Something breaks, Sentry alerts you, + +00:16:00.080 --> 00:16:01.880 +the dashboard shows you the full context, + +00:16:02.220 --> 00:16:05.360 +Seer helps you fix it and catch new issues before they ship. + +00:16:06.080 --> 00:16:08.920 +It's totally reasonable to go from an error occurred + +00:16:09.080 --> 00:16:11.140 +to fixed in production in just 10 minutes. + +00:16:12.200 --> 00:16:14.960 +I truly appreciate the support that Sentry has given me + +00:16:15.060 --> 00:16:17.520 +to help solve my bugs and issues in my apps, + +00:16:18.160 --> 00:16:20.740 +especially those tricky ones that only appear in production. + +00:16:21.100 --> 00:16:22.580 +I know you will too if you try them out. + +00:16:22.880 --> 00:16:24.520 +So get started today with Sentry. + +00:16:24.700 --> 00:16:29.720 +Just visit talkpython.fm/sentry and get $100 in Sentry credits. + +00:16:30.240 --> 00:16:30.960 +Please use that link. + +00:16:31.060 --> 00:16:32.320 +It's in your podcast player show notes. + +00:16:32.420 --> 00:16:37.760 +If you're signing up some other way, you can use our code talkpython26, all one word, + +00:16:38.340 --> 00:16:40.900 +talkpython26, to get $100 in credits. + +00:16:41.680 --> 00:16:43.360 +Thank you to Sentry for supporting the show. 
+
+00:16:44.500 --> 00:16:49.319
+Maybe just take a moment and speak to, maybe, I don't know if this venue will actually speak
+
+00:16:49.340 --> 00:16:54.200
+directly to anybody who I was imagining here, but people who work with folks, what would you tell
+
+00:16:54.340 --> 00:16:58.720
+somebody who works with a group who have some technical skill, who could create some of these
+
+00:16:58.880 --> 00:17:02.280
+things that we're going to talk about, but the people who they've created for don't necessarily
+
+00:17:02.540 --> 00:17:09.420
+think they need it or know that they need it. I've gone often on rants about how programming is a
+
+00:17:09.680 --> 00:17:15.260
+superpower, not a replacement for your job, right? Yeah. That's a problem for a lot of people,
+
+00:17:15.360 --> 00:17:20.500
+especially because you might use some new computer tools to supercharge your research.
+
+00:17:20.980 --> 00:17:25.740
+But the article that you publish or the research output of that, the audience, they may not
+
+00:17:25.860 --> 00:17:27.660
+be interested in hearing about that at all.
+
+00:17:28.040 --> 00:17:32.760
+And so for most people who are working in this space, the tools, you have to use them
+
+00:17:33.000 --> 00:17:37.720
+in such a way that you can talk about the research output without talking about the
+
+00:17:37.860 --> 00:17:38.000
+tool.
+
+00:17:38.260 --> 00:17:42.979
+And we have other venues to talk about the tools themselves, like the Journal of Open
+
+00:17:43.000 --> 00:17:48.660
+Source Software, and you can kind of get some of it out there. But that is a, that's the significant
+
+00:17:48.880 --> 00:17:53.020
+challenge is convincing people that it, that it could be useful and then convincing the audience
+
+00:17:53.230 --> 00:17:57.800
+that they should be interested in kind of the methods behind how some of the new research comes
+
+00:17:57.800 --> 00:18:02.940
+up. 
Also, I think I'm a big believer that presenting stuff in the right order is really, + +00:18:03.130 --> 00:18:07.620 +really important. If you present your research and it's beautiful and powerful and oh, look, + +00:18:07.760 --> 00:18:12.500 +we've also, by the way, covered a hundred times more data than any prior research. Surprise, + +00:18:12.760 --> 00:18:13.520 +I wonder how I did that. + +00:18:14.160 --> 00:18:15.400 +And then people are like, this is amazing. + +00:18:16.580 --> 00:18:19.580 +Then after you kind of hook them with the inspiration and what's possible, + +00:18:19.680 --> 00:18:21.480 +then you're like, let me tell you about the tool. + +00:18:21.600 --> 00:18:22.860 +And all of a sudden you're like, that's a cool tool, right? + +00:18:22.920 --> 00:18:26.100 +This is not just like geekery, like programmer, you know, + +00:18:26.440 --> 00:18:28.120 +Charlie Brown speak, wah, wah, wah, wah, wah. + +00:18:28.300 --> 00:18:29.660 +You know, it's like, no, I'm listening. + +00:18:29.880 --> 00:18:30.460 +Tell me now. + +00:18:30.820 --> 00:18:31.260 +Yeah, exactly. + +00:18:31.620 --> 00:18:34.960 +I mean, one of the things I think that really opens people's eyes + +00:18:35.300 --> 00:18:37.720 +is a really powerful search interface. + +00:18:38.260 --> 00:18:39.740 +You have all of this research data. + +00:18:40.120 --> 00:18:45.020 +just put it behind Elasticsearch with some really good filtering on it. And all of a sudden you have + +00:18:45.180 --> 00:18:50.740 +fast, rapid access to the data in a way you never had before. Like you were never scrolling through + +00:18:51.140 --> 00:18:55.160 +the Excel spreadsheets and finding exactly what you wanted, like you were with this new search + +00:18:55.400 --> 00:19:00.360 +interface. And that by itself is like so simple. We're so used to that in web development that + +00:19:00.480 --> 00:19:05.099 +like everything needs to have a fantastic search now. 
But so many people have their data locked + +00:19:05.120 --> 00:19:07.500 +behind, you know, a terrible search interface. + +00:19:07.960 --> 00:19:10.400 +Yeah, just a few things to sort of expose that. + +00:19:10.500 --> 00:19:14.880 +So this, give us a sense of what these data exploration web apps might look like. + +00:19:14.940 --> 00:19:20.060 +These are probably kind of mostly stuck to the inside, kind of internal to the research + +00:19:20.540 --> 00:19:22.820 +lab research team groups and so on. + +00:19:22.960 --> 00:19:24.720 +These are probably not that public facing, right? + +00:19:24.980 --> 00:19:28.580 +Almost everything we work on does end up having a public facing component. + +00:19:28.940 --> 00:19:33.960 +So maybe the research itself is done, locked behind a user login. + +00:19:34.300 --> 00:19:35.440 +That's just for the researchers. + +00:19:36.290 --> 00:19:38.880 +But then they expose that research to the public, + +00:19:39.520 --> 00:19:41.080 +usually with a good search interface + +00:19:41.640 --> 00:19:44.840 +and different pages for exploring their data + +00:19:45.020 --> 00:19:47.200 +and visualizations and things like that. + +00:19:47.380 --> 00:19:49.360 +So yeah, everything we do ends up becoming + +00:19:49.850 --> 00:19:52.560 +a production public web app in the end. + +00:19:52.760 --> 00:19:54.740 +And then another one of your categories, + +00:19:54.830 --> 00:19:57.000 +you put it was virtual research environments + +00:19:57.260 --> 00:19:59.740 +like data entry, publishing, authoring, collaboration. + +00:20:00.050 --> 00:20:00.540 +Tell us about that. + +00:20:01.280 --> 00:20:03.139 +Yeah, so a good example of this maybe + +00:20:03.160 --> 00:20:08.820 +is one of the projects that... Well, actually, the best example of it is the project I worked on + +00:20:08.930 --> 00:20:16.380 +during my PhD. It's called Apatosaurus. 
The short story behind the name is that it sounds like

+00:20:16.540 --> 00:20:24.280
+apparatus. In textual criticism, when you are displaying and visualizing variant readings to
+
+00:20:24.710 --> 00:20:31.959
+a base text, that form of visualizing it is a critical apparatus. A critical apparatus is
+
+00:20:32.040 --> 00:20:37.500
+a pretty boring website name, but Apatosaurus, dinosaurs might make textual criticism sound fun.
+
+00:20:37.720 --> 00:20:43.180
+Yeah, I do love dinosaurs. No, that's really cool. So this, this comes out as a web app. And I know
+
+00:20:43.180 --> 00:20:46.160
+you also have some, you talked about some desktop apps as well.
+
+00:20:46.640 --> 00:20:50.460
+Yep. Yep. That's right. So, yeah. So, so there's this people, people upload their,
+
+00:20:50.550 --> 00:20:54.960
+their collation to this and then they can visualize it. And like there, there's a public
+
+00:20:55.440 --> 00:21:00.320
+component of this as well, but really the backend is editing, editing a collation,
+
+00:21:00.500 --> 00:21:03.160
+and adding notes to all of the different readings and stuff.
+
+00:21:03.560 --> 00:21:07.060
+So I could show what the backend looks like,
+
+00:21:07.280 --> 00:21:08.200
+but we can also move on.
+
+00:21:08.440 --> 00:21:11.420
+- Let's move on just because most people
+
+00:21:11.620 --> 00:21:14.540
+will not totally hear, but just give us a sense of like,
+
+00:21:14.880 --> 00:21:18.740
+like what do people, what do you create for people
+
+00:21:18.900 --> 00:21:21.660
+so that they're like, yeah, I can use this app, right?
+
+00:21:21.760 --> 00:21:23.440
+Like give us a sense of some of the features,
+
+00:21:23.660 --> 00:21:24.960
+I guess is what I'm getting to.
+
+00:21:25.260 --> 00:21:29.319
+- Yeah, so another good example is we have a project
+
+00:21:29.360 --> 00:21:32.380
+at Harvard called Mapping Color in History.
+
+00:21:33.150 --> 00:21:36.980
+And this is a collaboration with a lab. 
+ +00:21:37.150 --> 00:21:38.900 +This lab brings in pieces of artwork + +00:21:39.480 --> 00:21:42.560 +and they do spectral analysis on the pigments + +00:21:42.590 --> 00:21:45.160 +so they can identify what was used + +00:21:45.160 --> 00:21:48.360 +to make a particular color of this red + +00:21:48.550 --> 00:21:50.860 +or what was made to make this color of blue. + +00:21:51.420 --> 00:21:53.640 +And then the idea is tracking + +00:21:54.000 --> 00:21:56.880 +how did people make those pigments over time, + +00:21:57.280 --> 00:22:01.720 +over time and specifically in Asian art. + +00:22:02.260 --> 00:22:04.380 +Is this the Dharmra, Puna, Puna? + +00:22:05.440 --> 00:22:07.960 +No, this is mapping color in history. + +00:22:08.030 --> 00:22:09.640 +I don't think it's up here. + +00:22:09.770 --> 00:22:10.280 +Sorry about that. + +00:22:10.420 --> 00:22:10.680 +Somewhere. + +00:22:10.940 --> 00:22:11.300 +That's all right. + +00:22:11.330 --> 00:22:11.840 +I'll find it. + +00:22:12.030 --> 00:22:12.440 +Keep talking. + +00:22:13.680 --> 00:22:13.840 +Okay. + +00:22:14.050 --> 00:22:16.300 +So the front end is great. + +00:22:16.510 --> 00:22:18.040 +You know, like the public end, + +00:22:18.180 --> 00:22:21.000 +this is people can explore by pigments + +00:22:21.280 --> 00:22:24.440 +and then see the images that contain those pigments. + +00:22:24.560 --> 00:22:30.680 +Now in the back end, what the researchers will be able to do is correlate exactly which + +00:22:30.960 --> 00:22:34.260 +point of a painting the analysis was done on. + +00:22:34.490 --> 00:22:38.640 +So they have this deep zoom image viewer where they'll zoom in and they'll select the point + +00:22:39.390 --> 00:22:40.280 +where that was taken from. + +00:22:41.090 --> 00:22:47.640 +So how else would you do that other than a digital interface to indicate on an image of + +00:22:47.950 --> 00:22:52.060 +a painting where that spectral analysis was performed? 
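One common way to store that kind of point annotation, and this is an assumption about the schema rather than how Mapping Color in History actually does it, is to record the sample location as fractions of the image dimensions, so the marker stays valid at every zoom level of a deep-zoom viewer:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SamplePoint:
    """Where on a painting a pigment analysis was taken, stored as
    fractions of image width/height so it survives any zoom level
    or re-scan of the underlying image. Schema is hypothetical."""
    x: float  # 0.0 (left edge) .. 1.0 (right edge)
    y: float  # 0.0 (top edge) .. 1.0 (bottom edge)

    def to_pixels(self, width, height):
        """Convert back to pixel coordinates for one rendered image size."""
        return (round(self.x * width), round(self.y * height))
```

The viewer then only has to know the current rendered size to draw the marker, and the database never has to care which resolution of the scan is being displayed.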
+ +00:22:52.380 --> 00:22:55.020 +Sounds almost like astronomy in a weird way. + +00:22:55.050 --> 00:22:55.300 +Oh, yeah. + +00:22:55.840 --> 00:23:04.580 +We zoomed into here and we took a different spectrum of the painting and we realized that it's actually identical to this, you know, something crazy like that, right? + +00:23:04.900 --> 00:23:06.140 +Yeah, yeah, yeah, that's right. + +00:23:06.200 --> 00:23:08.560 +Yeah, so it's essentially a pigments, like a pigments database. + +00:23:10.100 --> 00:23:17.940 +So the third category of these digital humanities projects that you put down was like data extraction, transformation. + +00:23:19.260 --> 00:23:29.540 +In data science, they often say, you know, 80% of the work is the data wrangling, which is like cleaning, organization, just getting it so you could possibly start asking questions about it. + +00:23:29.820 --> 00:23:30.960 +I'm sure you all do a lot of that. + +00:23:31.180 --> 00:23:31.440 +Absolutely. + +00:23:32.560 --> 00:23:40.880 +So often, the very beginning of a project might be an Excel sheet or several spreadsheets. + +00:23:41.800 --> 00:23:46.020 +And the first task is to ingest these into, you know, a proper database. + +00:23:46.640 --> 00:23:48.600 +Not so much MongoDB for us. + +00:23:48.760 --> 00:23:49.840 +It's going into Postgres. + +00:23:50.340 --> 00:23:51.420 +We're Django Shop. + +00:23:51.680 --> 00:23:52.480 +We're Django Shop. + +00:23:52.630 --> 00:23:53.760 +So it's going into Postgres. + +00:23:55.090 --> 00:24:06.560 +And yeah, no, that is probably the number one challenge of the early stage is figuring out what the right data model is, what the right relationships are to model the data. + +00:24:07.300 --> 00:24:17.020 +Doing that work is advantageous to everybody because, you know, it helps both the researchers who brought the data to think about it in a more organized way. + +00:24:17.410 --> 00:24:18.540 +I mean, they've been trying to do that. 
+ +00:24:18.720 --> 00:24:19.680 +And they have the spreadsheets. + +00:24:20.160 --> 00:24:27.800 +But now we're modeling out the data so that we can add it to database tables and then to use later. + +00:24:27.880 --> 00:24:29.340 +So that works out well for everybody. + +00:24:30.000 --> 00:24:30.720 +And yeah, absolutely. + +00:24:31.100 --> 00:24:45.760 +Cleaning the data, getting dates, working with fuzzy dates, being able to parse July of 2020 or summer of 2020 and handling kind of all of those cases so that we do get dates in the end. + +00:24:45.780 --> 00:24:55.980 +One of the crazy stories from data parsing history is one of the, I can't remember exactly what it was, you talked about biology tools or genetics tools earlier. + +00:24:56.100 --> 00:25:03.780 +One of the groups that names genes had to change the name of a gene because it kept getting parsed by Excel into a date. + +00:25:04.880 --> 00:25:05.520 +Yeah, I remember that. + +00:25:05.590 --> 00:25:06.200 +I remember that. + +00:25:06.320 --> 00:25:06.600 +That's right. + +00:25:07.260 --> 00:25:07.580 +Yes. + +00:25:08.100 --> 00:25:10.580 +So these are the weird edge cases I'm sure you run into. + +00:25:11.940 --> 00:25:13.120 +Like it's not even supposed to be a date. + +00:25:13.220 --> 00:25:13.940 +Why is this a date? + +00:25:13.990 --> 00:25:14.780 +I don't know. + +00:25:14.940 --> 00:25:16.240 +Why is it helping out here? + +00:25:16.920 --> 00:25:17.820 +The code keeps crashing. + +00:25:18.000 --> 00:25:20.480 +Like pandas parsed it as a date and it's not or whatever. + +00:25:21.220 --> 00:25:21.540 +Absolutely. + +00:25:21.980 --> 00:25:22.060 +Yeah. + +00:25:22.140 --> 00:25:22.300 +Yeah. + +00:25:22.340 --> 00:25:27.320 +So yeah, usually lots of test suites around that ingest process until we've got it. 
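A sketch of what one of those fuzzy-date parsers can look like. The season boundaries here are an assumption (a real project would pin them down per dataset), but the key idea is that each fuzzy string maps to an explicit start/end range rather than a single date:

```python
import calendar
import datetime as dt
import re

# Assumed season boundaries (northern hemisphere, month numbers).
SEASONS = {"spring": (3, 5), "summer": (6, 8),
           "autumn": (9, 11), "fall": (9, 11), "winter": (12, 2)}
MONTHS = {name.lower(): i for i, name in enumerate(calendar.month_name) if name}

def parse_fuzzy_date(text):
    """Return an inclusive (start, end) date range for strings like
    'July of 2020', 'summer of 2020', or just '2020'."""
    words = re.findall(r"[a-z]+|\d{4}", text.lower())
    year = next((int(w) for w in words if w.isdigit()), None)
    if year is None:
        raise ValueError(f"no year found in {text!r}")
    for w in words:
        if w in MONTHS:
            m = MONTHS[w]
            return (dt.date(year, m, 1),
                    dt.date(year, m, calendar.monthrange(year, m)[1]))
        if w in SEASONS:
            start_m, end_m = SEASONS[w]
            end_year = year + 1 if end_m < start_m else year  # winter wraps
            return (dt.date(year, start_m, 1),
                    dt.date(end_year, end_m, calendar.monthrange(end_year, end_m)[1]))
    # Only a year: the whole year is the range.
    return (dt.date(year, 1, 1), dt.date(year, 12, 31))
```

Storing the range (rather than forcing a fake precise date) is what lets "summer of 2020" still match a query for July 2020 later on.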
+ +00:25:27.640 --> 00:25:32.220 +Now, once we've got it in, usually the research is ongoing and then we're able to provide + +00:25:32.420 --> 00:25:38.140 +them now a new cleaned interface to do the additional data entry as the project is going. + +00:25:38.420 --> 00:25:39.780 +And that's usually a win-win for everybody. + +00:25:40.180 --> 00:25:40.340 +Sure. + +00:25:40.620 --> 00:25:45.780 +And so this sort of ETL ingestion side of everything is it's like, don't worry, + +00:25:46.420 --> 00:25:47.460 +Darth has got it for you. + +00:25:47.760 --> 00:25:51.180 +And then we'll provide you like a database connection to start working. + +00:25:51.480 --> 00:25:54.700 +Or do you give them the tools and then they kind of iterate on them? + +00:25:54.940 --> 00:26:00.220 +And how much is this you and how much is this you providing like CLI tools and stuff + +00:26:00.460 --> 00:26:01.540 +or notebooks over to people? + +00:26:03.560 --> 00:26:08.520 +I'd say most of the people that we're working with are aware of the technical tools, + +00:26:08.640 --> 00:26:10.380 +but they don't want a database connection. + +00:26:10.800 --> 00:26:16.520 +So we are giving them, we're doing the ingest and then building a platform where they can begin interacting with their data. + +00:26:17.240 --> 00:26:18.720 +Yeah, I'm sure they don't want one. + +00:26:20.140 --> 00:26:22.600 +Maybe you give them an app though, right? + +00:26:22.820 --> 00:26:24.940 +With like Elasticsearch and other things that they can. + +00:26:25.120 --> 00:26:25.400 +No, absolutely. + +00:26:25.680 --> 00:26:26.400 +Yeah, that's what we do. + +00:26:26.720 --> 00:26:27.040 +Yeah, okay. + +00:26:27.140 --> 00:26:32.520 +Yeah, we give them a web platform to begin exploring, to begin publishing. + +00:26:34.320 --> 00:26:38.760 +So I was thinking that you said you're a Django shop, which is cool. 
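The ingest step being described, spreadsheets in and a cleaned platform out, usually starts with normalizing headers and cells before anything touches the ORM. A stdlib-only sketch; the column names and the Django model mentioned at the end are hypothetical:

```python
def normalize_rows(header, rows):
    """Turn raw spreadsheet rows into dicts keyed by snake_case column
    names, trimming whitespace and dropping rows that are entirely blank."""
    keys = [h.strip().lower().replace(" ", "_") for h in header]
    cleaned = []
    for row in rows:
        values = [(cell or "").strip() for cell in row]
        if not any(values):
            continue  # skip the blank padding rows spreadsheets tend to carry
        cleaned.append(dict(zip(keys, values)))
    return cleaned

# In a Django project the cleaned dicts would then feed something like
# (model name is hypothetical):
#   Painting.objects.bulk_create(Painting(**row) for row in cleaned)
```

This is also the layer the test suites mentioned earlier wrap around, so a renamed column or a stray blank row fails loudly at ingest time instead of quietly corrupting the database.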
+ +00:26:38.840 --> 00:26:43.280 +It sounds, though, to me like describing what you're doing, just imagining how this is. + +00:26:43.640 --> 00:26:46.000 +You're probably creating these projects often. + +00:26:46.520 --> 00:26:49.440 +How often does one of these projects actually last? + +00:26:49.980 --> 00:26:51.900 +Or how many of them do you iterate? + +00:26:53.180 --> 00:26:53.700 +I'm trying to get a sense. + +00:26:53.920 --> 00:26:56.960 +Do you work on stuff for a year or is it like every two weeks we're on a new project? + +00:26:58.180 --> 00:26:59.980 +It's why I think of us as like an agency. + +00:27:00.900 --> 00:27:04.240 +Because we get to work on greenfield projects fairly often, like you're imagining. + +00:27:04.700 --> 00:27:08.880 +Which would not be the case normally at a big university IT department. + +00:27:09.960 --> 00:27:15.040 +So, you know, maybe two or three projects a year, two or three big ones a year. + +00:27:15.500 --> 00:27:18.640 +And then we have to put to bed a few a year as well. + +00:27:18.740 --> 00:27:21.260 +Because these things, they're funded with grant money. + +00:27:21.600 --> 00:27:24.060 +And then the grant money runs out and it's time. + +00:27:24.200 --> 00:27:26.320 +And then we have to figure out what do we do with it now? + +00:27:26.380 --> 00:27:31.060 +We don't want to lose the data and this way of presenting it. + +00:27:31.100 --> 00:27:33.140 +But we can't keep paying for Elasticsearch. + +00:27:33.520 --> 00:27:34.120 +Yeah, of course. + +00:27:34.380 --> 00:27:37.620 +I'm certainly, we're going to dive into that because that is, but let's save that for the + +00:27:37.740 --> 00:27:37.780 +end. + +00:27:37.800 --> 00:27:40.920 +It seems like that's the arc of the story of these things. + +00:27:40.960 --> 00:27:44.700 +But I certainly think it's something that you don't think about that much, right? + +00:27:44.980 --> 00:27:47.360 +Like you said, it was only a hundred dollars a month for this. 
+ +00:27:47.440 --> 00:27:48.280 +And we got a big grant. + +00:27:48.400 --> 00:27:49.200 +There's a bunch of, no big deal. + +00:27:49.280 --> 00:27:52.880 +But like when the grant's out, who's on the hook for a hundred dollars a month and making + +00:27:53.040 --> 00:27:55.920 +sure it survives upgrades and all that kind of business. + +00:27:56.400 --> 00:27:56.780 +No, that's right. + +00:27:57.080 --> 00:27:57.240 +Yeah. + +00:27:57.360 --> 00:28:02.720 +So my original question when I started on this path was thinking like, do you, how do you + +00:28:02.780 --> 00:28:03.400 +get started on these? + +00:28:03.460 --> 00:28:07.640 +Do you have like a big framework or a cookie cutter sort of thing or something like this + +00:28:07.760 --> 00:28:11.740 +is how we do it because it plugs into all this other automation and tools we built for + +00:28:11.800 --> 00:28:12.680 +the last 10 projects. + +00:28:13.220 --> 00:28:14.400 +You know, that's kind of a unique position. + +00:28:14.920 --> 00:28:19.020 +A lot of companies build one website for themselves and that's their app or they're + +00:28:19.020 --> 00:28:21.480 +an agency that goes across so many, so much variation. + +00:28:21.660 --> 00:28:22.600 +They can't do that kind of stuff. + +00:28:22.680 --> 00:28:22.820 +Right. + +00:28:23.220 --> 00:28:23.820 +That's right. + +00:28:24.080 --> 00:28:24.360 +That's right. + +00:28:25.300 --> 00:28:25.920 +That's a good question. + +00:28:26.320 --> 00:28:28.840 +We have things that we reuse. + +00:28:29.000 --> 00:28:35.400 +Some of them are open source, different search components and things that we maintain that + +00:28:36.110 --> 00:28:37.420 +we'll use across projects. + +00:28:37.930 --> 00:28:41.100 +And we have tried to do the cookie cutter Django project. 
+
+00:28:41.640 --> 00:28:47.240
+The truth is, each project is different enough that really we like to evaluate it from first
+
+00:28:47.520 --> 00:28:54.120
+principles as we're evaluating it and thinking, what is the best technology to use?
+
+00:28:55.370 --> 00:28:55.520
+Yeah.
+
+00:28:55.750 --> 00:28:55.920
+Yeah.
+
+00:28:56.020 --> 00:28:59.000
+So yeah, we don't have a cookie cutter.
+
+00:28:59.050 --> 00:29:04.200
+We don't have a kind of a meta framework for bootstrapping them because they're sufficiently
+
+00:29:04.450 --> 00:29:05.780
+different from each other that we...
+
+00:29:05.960 --> 00:29:06.680
+I find that too.
+
+00:29:07.030 --> 00:29:07.640
+I find that too.
+
+00:29:08.120 --> 00:29:12.520
+The idea of how we could just grab this cookie cutter or Copier.
+
+00:29:12.610 --> 00:29:13.600
+Are you familiar with Copier?
+
+00:29:14.080 --> 00:29:15.480
+People out there might be familiar with that.
+
+00:29:15.600 --> 00:29:20.740
+It's a little bit like cookie cutter with the bonus that you can update it later if you
+
+00:29:21.020 --> 00:29:24.499
+change your mind about something, like actually change this project to use Postgres rather
+
+00:29:24.520 --> 00:29:29.960
+than SQLite or something, which is pretty cool. But every time that I do, every time I try to work
+
+00:29:30.040 --> 00:29:33.680
+with one of those projects, even ones that I've created for myself,
+
+00:29:34.160 --> 00:29:39.400
+I'm like, oh, it's like 75% awesome and 25%, I just got to take this stuff out. You know,
+
+00:29:39.940 --> 00:29:43.700
+I'll just, I'll just do it from scratch. It's like, how hard is this? I'll just create a few folders
+
+00:29:43.860 --> 00:29:48.200
+and put a few things in there and I'll copy the one, like the pyproject.toml or like the one thing
+
+00:29:48.300 --> 00:29:52.260
+that's like, how do I do this again? I'll just copy that and we're good to go. Yeah. 
I mean, + +00:29:52.540 --> 00:29:53.160 +That's what I find. + +00:29:53.460 --> 00:29:53.920 +That's what I find. + +00:29:53.930 --> 00:29:56.660 +I find it, it seems like a really brilliant idea, + +00:29:56.920 --> 00:30:00.280 +but in practice, it hasn't saved us time yet. + +00:30:00.880 --> 00:30:02.320 +No, I mean, maybe it's a case study. + +00:30:02.460 --> 00:30:04.320 +Like, okay, let's see what they're doing for this one. + +00:30:04.320 --> 00:30:05.160 +Oh, that is interesting + +00:30:05.270 --> 00:30:07.460 +how they're integrating this other thing maybe, + +00:30:07.640 --> 00:30:10.820 +but as a true foundation, I find it in theory awesome. + +00:30:11.280 --> 00:30:13.840 +In practice, I just end up not doing it for various reasons. + +00:30:14.200 --> 00:30:14.560 +Don't know why. + +00:30:14.840 --> 00:30:15.840 +I'm gonna save this for later. + +00:30:16.700 --> 00:30:17.880 +Because the question I'm about to ask you + +00:30:17.900 --> 00:30:20.860 +is gonna send us just down a rat hole. + +00:30:21.260 --> 00:30:26.960 +So instead, before we go down the rat hole, maybe we could, not that one, maybe we could + +00:30:27.020 --> 00:30:32.300 +talk about, I mean, you talked about some, but let's maybe just feature some of the projects + +00:30:32.300 --> 00:30:34.580 +that are maybe more well-known that you guys have done. + +00:30:35.059 --> 00:30:35.460 +Sure. + +00:30:35.780 --> 00:30:36.120 +Yeah, good. + +00:30:36.580 --> 00:30:40.220 +So yeah, one of them is called the Amendments Project. + +00:30:40.900 --> 00:30:46.119 +And this is, I didn't know this until I started working on this project, that there are, there + +00:30:46.140 --> 00:30:52.740 +There have been thousands of, I think it's 22, at least 22,000 proposed amendments to + +00:30:52.740 --> 00:30:55.800 +the United States Constitution that never went anywhere. 
+
+00:30:56.230 --> 00:31:01.500
+And so kind of the goal of this project is to show that there have been lots of attempts
+
+00:31:02.140 --> 00:31:06.400
+to amend the Constitution, but actually the Constitution is frozen.
+
+00:31:06.690 --> 00:31:11.480
+I mean, it's not actually amendable anymore, at least not in the politics of any time recently.
+
+00:31:12.440 --> 00:31:13.580
+So this is a database.
+
+00:31:14.180 --> 00:31:19.040
+I cannot imagine a situation where the U.S. Constitution gets amended.
+
+00:31:19.310 --> 00:31:21.440
+It has to be unanimous across all the states, right?
+
+00:31:21.820 --> 00:31:22.140
+Is that right?
+
+00:31:22.250 --> 00:31:22.740
+I can't remember.
+
+00:31:23.440 --> 00:31:23.840
+I don't know.
+
+00:31:23.940 --> 00:31:25.720
+I can't remember off the top of my head if it has to be unanimous,
+
+00:31:25.730 --> 00:31:27.900
+but it certainly has to be across party lines.
+
+00:31:28.520 --> 00:31:30.940
+Yeah, it's got to be pretty darn close if it's not at all.
+
+00:31:32.020 --> 00:31:36.200
+It's like time travel or travel at the speed of light.
+
+00:31:36.590 --> 00:31:37.640
+Could be theoretically possible.
+
+00:31:38.180 --> 00:31:38.920
+Probably not going to happen.
+
+00:31:40.560 --> 00:31:41.340
+No, it's hard to see.
+
+00:31:41.600 --> 00:31:42.140
+It's hard to see.
+
+00:31:42.310 --> 00:31:42.400
+Yeah.
+
+00:31:42.620 --> 00:31:46.160
+So this is from a historian at Harvard.
+
+00:31:46.680 --> 00:31:53.120
+And so it's a database of, and the full text from, all of these amendments.
+
+00:31:53.340 --> 00:32:07.600
+And, you know, from the public's point of view, it's a Postgres full-text vector search interface for finding and filtering through all of the different amendments that have been proposed.
+
+00:32:08.180 --> 00:32:08.660
+I love it.
+
+00:32:08.980 --> 00:32:10.020
+Yeah, this is a nice looking site.
+
+00:32:10.520 --> 00:32:11.560
+We work with a designer. 
+
+00:32:12.080 --> 00:32:16.920
+She's very good. Yeah, of course, like an agency would, right? Yep, yep. Nice. So we'll
+
+00:32:17.020 --> 00:32:22.060
+get a really pretty rich search interface and then off you go. I have no idea even
+
+00:32:22.100 --> 00:32:24.900
+what I would search for, but... Yeah, well, you can always search for something
+
+00:32:25.180 --> 00:32:28.720
+religious, something abortion-related. There's going to be lots of things there. I
+
+00:32:29.080 --> 00:32:31.920
+thought of all those, also like guns, but like, I don't want to go down, I'm not sure I
+
+00:32:32.020 --> 00:32:37.200
+even want to go down there, right? Awesome, though. This looks super useful. Maybe
+
+00:32:37.380 --> 00:32:41.039
+someday we'll have a functional government again. We'll see. Let's
+
+00:32:41.060 --> 00:32:45.820
+change it. Or maybe we'll go down into folklore. Like, look at you. So, all right, so yeah, so another
+
+00:32:45.980 --> 00:32:51.680
+really great project, at least from a content point of view, that's interesting, the research
+
+00:32:51.840 --> 00:33:00.680
+that it's doing, is the Fionn Folklore Database. So in Celtic storytelling, you know,
+
+00:33:01.820 --> 00:33:07.919
+moms have been telling stories to daughters, and people have been
+
+00:33:08.000 --> 00:33:14.660
+telling stories for a very long time, hundreds or a thousand years, about Fionn mac Cumhaill, who is a
+
+00:33:14.910 --> 00:33:21.040
+hero, a hero from Irish mythology. Some of it's based in, you know, historical events, but it
+
+00:33:21.120 --> 00:33:28.940
+goes back, it goes back so far. So there are many hundreds or thousands of these
+
+00:33:29.300 --> 00:33:33.300
+stories that have been spread, and versions of these stories that have been told.
+
+00:33:33.160 --> 00:33:47.440
+And so some of them are audio recordings where somebody, like some researcher, has gone out to an island off the coast of Scotland and 
recorded somebody telling their version of the hero of Finn and his band of heroes. + +00:33:47.640 --> 00:33:53.080 +You know, they defend Scotland and Ireland from invaders and attackers. + +00:33:53.880 --> 00:33:57.960 +Very exciting stories and stuff and a team of characters. + +00:33:59.020 --> 00:34:04.880 +So there's audio recordings and then there's documents, like written documents that contain + +00:34:05.120 --> 00:34:05.180 +these. + +00:34:05.180 --> 00:34:11.139 +And so this is a database of kind of all of those all in one place with, on the public + +00:34:11.340 --> 00:34:17.820 +side, a nice search interface for discovering them, you know, either using the map view or + +00:34:18.040 --> 00:34:18.139 +searching. + +00:34:18.480 --> 00:34:19.040 +Yeah, that's cool. + +00:34:19.350 --> 00:34:22.159 +I got my map view for some random thing I searched about here. + +00:34:22.600 --> 00:34:22.919 +Amazing. + +00:34:23.260 --> 00:34:26.300 +But this is pretty interesting, all these different tellings and stuff. + +00:34:26.720 --> 00:34:33.200 +Oh, and yeah, one of the big challenges with this project is that it's fully internationalized. + +00:34:33.450 --> 00:34:35.120 +So it's available in English. + +00:34:35.379 --> 00:34:40.560 +Everything is available in English, Scottish Gaelic, and Irish Gaelic, but that extends + +00:34:40.730 --> 00:34:41.419 +into the database. + +00:34:41.790 --> 00:34:45.360 +So usually people have multiple names recorded for them. + +00:34:45.850 --> 00:34:50.919 +And so, yeah, you may have one person with any number of names in different languages, + +00:34:51.040 --> 00:34:53.879 +sometimes more than one Scottish name, that kind of thing. + +00:34:54.020 --> 00:34:59.540 +And so the data model on this one is quite messy, but sensible. + +00:35:00.300 --> 00:35:03.020 +But yeah, it's quite a lot of different kinds of data to wrangle. + +00:35:03.240 --> 00:35:05.180 +And then with all of the translations for each thing. 
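The "one person, many names, several languages" shape described here is typically modeled as a related names table with a language code per row, plus a fallback rule for display. A rough, framework-free sketch; the language codes and the fallback order are assumptions:

```python
from dataclasses import dataclass, field

@dataclass
class Person:
    """A catalogued person carrying any number of (language_code, name)
    pairs; assumed codes: 'en' English, 'gd' Scottish Gaelic, 'ga' Irish."""
    id: int
    names: list = field(default_factory=list)

    def display_name(self, lang):
        """Prefer a name in the requested language, then English, then anything."""
        first_by_lang = {}
        for code, name in self.names:
            first_by_lang.setdefault(code, name)  # keep the first name per language
        return (first_by_lang.get(lang)
                or first_by_lang.get("en")
                or next(iter(first_by_lang.values()), None))

# In Django this would typically be a Person model plus a related Name
# model with a language field, rather than an in-memory list.
```

The same fallback rule then applies to every translatable record, which is what makes a fully internationalized database (not just a translated UI) tractable.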
+ +00:35:05.320 --> 00:35:05.960 +Yeah, that's wild. + +00:35:06.100 --> 00:35:11.380 +It's not just, we need the user interface of this thing to translate about. + +00:35:12.640 --> 00:35:13.420 +That's way more, right? + +00:35:13.720 --> 00:35:14.820 +Yeah, yeah, it is that. + +00:35:14.960 --> 00:35:15.540 +It is that. + +00:35:15.740 --> 00:35:20.360 +And then it is also, yes, all the items in the database have a translation or can. + +00:35:22.420 --> 00:35:25.160 +This portion of Talk Python To Me is brought to you by us. + +00:35:25.500 --> 00:35:30.020 +I'm thrilled to announce a brand new app built for developers created by yours truly. + +00:35:30.480 --> 00:35:31.640 +It's called Command Book. + +00:35:32.440 --> 00:35:33.700 +You know that thing you do every morning? + +00:35:34.200 --> 00:35:38.520 +Open up six terminal tabs, CD into this directory, activate that virtual environment, + +00:35:39.000 --> 00:35:40.460 +run the server with --reload. + +00:35:40.760 --> 00:35:44.720 +Now, CD somewhere else, start the background worker, another tab for Docker, + +00:35:45.080 --> 00:35:46.440 +another one to tail production logs. + +00:35:46.900 --> 00:35:49.620 +Every tab just says Python, Python, Python, Docker tail. + +00:35:50.380 --> 00:35:51.660 +and you're clicking through them going, + +00:35:52.120 --> 00:35:53.320 +which Python was that again? + +00:35:53.840 --> 00:35:54.540 +Where my app is running? + +00:35:55.200 --> 00:35:58.060 +Then sometime later, your dev server silently dies + +00:35:58.260 --> 00:35:59.260 +because it tried to reload + +00:35:59.480 --> 00:36:00.840 +while you're in the middle of a code edit, + +00:36:01.500 --> 00:36:04.080 +unmatched brace, a half-written import or something. + +00:36:04.820 --> 00:36:05.840 +Now you're hunting through tabs + +00:36:05.880 --> 00:36:07.380 +to figure out which process crashed + +00:36:07.440 --> 00:36:08.340 +and how to restart it. 
+ +00:36:08.800 --> 00:36:09.680 +My app, CommandBook, + +00:36:10.000 --> 00:36:13.320 +gives all of these long-running commands a permanent home. + +00:36:13.880 --> 00:36:15.120 +You save a command once, + +00:36:15.500 --> 00:36:16.680 +the working directory, the environment, + +00:36:17.040 --> 00:36:18.140 +pre-commands like git pull, + +00:36:18.480 --> 00:36:20.120 +and from then on, you just click run. + +00:36:20.680 --> 00:36:22.040 +You can even group commands together + +00:36:22.340 --> 00:36:24.140 +to start and stop everything for a project + +00:36:24.460 --> 00:36:25.140 +with a single click. + +00:36:25.560 --> 00:36:27.540 +It also has what I call honey badger mode, + +00:36:27.750 --> 00:36:29.140 +auto restart on crash. + +00:36:29.720 --> 00:36:32.140 +So when your dev server goes down mid-reload, + +00:36:32.700 --> 00:36:34.580 +command book just brings it right back up + +00:36:34.840 --> 00:36:36.900 +and does so over and over until the code is fixed. + +00:36:37.520 --> 00:36:39.320 +It also detects URLs from your output + +00:36:39.470 --> 00:36:41.900 +so you're never scrolling through thousands of lines of logs + +00:36:42.080 --> 00:36:44.040 +just to figure out how to reopen your web app. + +00:36:44.580 --> 00:36:46.280 +And it shows you uptime, memory usage, + +00:36:46.480 --> 00:36:48.460 +and all sorts of cool things about your process. + +00:36:49.160 --> 00:36:51.140 +The whole thing is a native macOS app. + +00:36:51.360 --> 00:36:53.680 +No Electron, no Chromium, just 21 megs. + +00:36:54.160 --> 00:36:55.460 +And it comes with a full CLI. + +00:36:55.700 --> 00:36:57.540 +So anything you've configured in the UI, + +00:36:57.960 --> 00:36:59.220 +you can fire off from your terminal + +00:36:59.380 --> 00:37:00.400 +with just a single command. 
+ +00:37:00.860 --> 00:37:02.700 +Right now it's macOS only, + +00:37:03.220 --> 00:37:04.080 +but if there's enough interest, + +00:37:04.320 --> 00:37:05.380 +I'll build a Windows version too. + +00:37:05.620 --> 00:37:06.340 +So let me know. + +00:37:07.180 --> 00:37:09.300 +Please check it out at talkpython.fm + +00:37:09.560 --> 00:37:11.160 +slash command book app. + +00:37:11.640 --> 00:37:12.400 +Download it for free, + +00:37:12.980 --> 00:37:14.140 +level up your developer workflow. + +00:37:14.600 --> 00:37:16.220 +The link is in your podcast player show notes. + +00:37:16.840 --> 00:37:18.780 +That's talkpython.fm/command book. + +00:37:19.210 --> 00:37:21.000 +I really hope you enjoy this new app that I built. + +00:37:22.640 --> 00:37:26.660 +You want to work in the native language of the people who did that part of the folklore + +00:37:26.930 --> 00:37:27.340 +or whatever, right? + +00:37:27.680 --> 00:37:30.280 +Yeah, well, and people are still speaking those languages. + +00:37:30.520 --> 00:37:34.580 +So people who would use this to, you know, like somebody may have heard a story from + +00:37:34.630 --> 00:37:37.960 +their mom or dad and are now would like to find other versions of that story. + +00:37:38.350 --> 00:37:41.940 +And they live in a part of Scotland where they speak Scottish Gaelic as their first language. + +00:37:42.520 --> 00:37:43.680 +They can still access the site. + +00:37:43.880 --> 00:37:48.700 +And then that mapping color history one, that's another one of the public ones that you said is pretty major. + +00:37:49.520 --> 00:37:50.080 +Yeah, that's right. + +00:37:50.280 --> 00:37:50.420 +Yeah. + +00:37:50.720 --> 00:37:53.140 +So, yeah, that's a pigments database. + +00:37:53.360 --> 00:38:04.160 +You can search by either English color names like blue and find all of these Asian paintings that have blue or a particular kind of pigment of how they made the blue. + +00:38:04.440 --> 00:38:04.960 +Yeah, nice. 
+ +00:38:05.360 --> 00:38:07.700 +So what's the open source story? + +00:38:08.300 --> 00:38:11.240 +You're creating all these apps, maybe some of these frameworks. + +00:38:11.420 --> 00:38:12.280 +There's got to be some tools. + +00:38:12.680 --> 00:38:25.900 +Is there a big desire or already an effort to have a lot of these things open source or is it too niche or is it just like this is the advantage of Harvard has is other universities don't get this? + +00:38:27.300 --> 00:38:29.680 +No, it's something we talk about quite a bit. + +00:38:30.820 --> 00:38:35.140 +Usually these things start, usually they start closed source during development. + +00:38:35.380 --> 00:38:44.920 +And then we work with the faculty and we talk about how we can take, you know, like the repo for the web app, how we can take that public. + +00:38:45.450 --> 00:38:48.100 +And so we've done that for a number of projects. + +00:38:48.320 --> 00:38:49.100 +Not all of them are. + +00:38:50.000 --> 00:38:55.880 +But the ideal is that they all make their way into the open, and especially when they become archived. + +00:38:56.160 --> 00:38:56.440 +Sure. + +00:38:56.770 --> 00:38:58.840 +Yeah, that's a good way to help them live on. + +00:38:58.960 --> 00:39:03.900 +And they might even go into GitHub's Arctic Vault, which is crazy. + +00:39:03.980 --> 00:39:14.620 +I don't know if people know about that out there, but GitHub has, quite a while ago, started taking copies of all of the repos and backing them up and storing them in the Arctic vault. + +00:39:14.810 --> 00:39:15.240 +It's kind of cool. + +00:39:15.810 --> 00:39:18.660 +I really, really, really hope we never need that, but it's kind of neat. + +00:39:18.820 --> 00:39:19.340 +Yeah, me too. + +00:39:20.319 --> 00:39:30.300 +Usually universities have their own archival system, so any important research data is usually part of that system as well. + +00:39:30.560 --> 00:39:30.840 +I see. + +00:39:30.990 --> 00:39:31.080 +Okay. 
+ +00:39:31.440 --> 00:39:31.520 +Yeah. + +00:39:32.160 --> 00:39:32.780 +Obviously, right? + +00:39:32.840 --> 00:39:34.940 +Like I'm just, I can't remember where it was. + +00:39:34.940 --> 00:39:39.760 +It was somewhere, I think it was South Korea or Taiwan where like seven years of government + +00:39:40.060 --> 00:39:41.820 +data got lost or something like that. + +00:39:41.820 --> 00:39:43.200 +It was really, really bad recently. + +00:39:43.500 --> 00:39:46.980 +There was a fire and I think they had backups, but maybe just into the building, you know, + +00:39:47.040 --> 00:39:48.000 +like we'll put that out. + +00:39:48.340 --> 00:39:49.800 +We'll back it up to the hard drive over here. + +00:39:50.180 --> 00:39:50.480 +Not good. + +00:39:51.000 --> 00:39:52.920 +No, not good. + +00:39:52.920 --> 00:39:54.340 +You definitely want this stuff to survive. + +00:39:54.360 --> 00:40:00.080 +I mean, academia has this history of like tomes that have survived the past and really, + +00:40:00.260 --> 00:40:02.340 +really long lived information. + +00:40:02.640 --> 00:40:02.700 +Right. + +00:40:02.760 --> 00:40:05.400 +besides the Library of Alexandria or something like that, maybe. + +00:40:05.680 --> 00:40:06.380 +That's what we want. + +00:40:06.620 --> 00:40:07.080 +That's what we want. + +00:40:07.080 --> 00:40:08.860 +We want it to, yeah, we want it to last. + +00:40:09.560 --> 00:40:09.860 +Absolutely. + +00:40:10.180 --> 00:40:14.540 +So maybe that's a good time to sort of talk about the trailing end. + +00:40:14.540 --> 00:40:17.020 +I think there's a lot of interesting things going on here. + +00:40:18.360 --> 00:40:22.740 +Just like you've run out of money, not because you actually run out of money. + +00:40:23.260 --> 00:40:26.520 +The grant is done and you've either spent or given back or whatever + +00:40:26.820 --> 00:40:28.460 +with the remaining little bits of money. + +00:40:28.780 --> 00:40:30.100 +It's always a weird balance with research. 
+ +00:40:30.600 --> 00:40:33.740 +It's like, oh, we got $3,000 left on this research grant. + +00:40:33.790 --> 00:40:34.700 +What are we going to do with it? + +00:40:34.780 --> 00:40:35.820 +It's not like, oh, we're going to give it back. + +00:40:35.930 --> 00:40:36.580 +We just didn't need it. + +00:40:36.940 --> 00:40:41.300 +It's like, we're going to find a way to like fund a student to do a little more work or + +00:40:41.460 --> 00:40:41.520 +whatever. + +00:40:41.720 --> 00:40:43.400 +But eventually the grant is over. + +00:40:43.940 --> 00:40:44.240 +That's right. + +00:40:44.660 --> 00:40:48.560 +You've got some expensive app access to a big database because it needs a big search or + +00:40:49.260 --> 00:40:50.200 +a lot of compute or something. + +00:40:50.780 --> 00:40:51.040 +That's right. + +00:40:52.130 --> 00:40:56.000 +Everything during, like, I mean, anything, anything that's a, that's a Django app. + +00:40:56.640 --> 00:41:04.820 +We deploy to AWS using containers, which isn't the cheapest way to host anything. + +00:41:05.760 --> 00:41:09.100 +But that's for the most part the Harvard way. + +00:41:10.200 --> 00:41:12.240 +And it is robust and is reliable. + +00:41:12.800 --> 00:41:22.320 +And we don't have a DevOps person on call on the weekend to rescue one of these apps. + +00:41:22.480 --> 00:41:24.980 +So having them reliable is good. + +00:41:25.400 --> 00:41:29.940 +Okay, so it's on AWS and paying for the containers, + +00:41:30.180 --> 00:41:32.500 +paying for that Elasticsearch cluster, + +00:41:33.180 --> 00:41:36.280 +the RDS Postgres database. + +00:41:36.890 --> 00:41:38.940 +Okay, well, even if somebody wants to start paying + +00:41:38.940 --> 00:41:39.860 +for that out-of-pocket, + +00:41:40.100 --> 00:41:40.980 +all of those little services, + +00:41:41.090 --> 00:41:44.080 +they add up to enough that we need to do something + +00:41:44.530 --> 00:41:46.340 +when the project hits end of life. 
+ +00:41:46.650 --> 00:41:50.359 +And so our gold standard that we've developed so far + +00:41:50.380 --> 00:41:55.340 +is asking, can this become a static website? + +00:41:55.860 --> 00:41:58.600 +Can we bake this out into all HTML files + +00:41:59.200 --> 00:42:01.800 +and acknowledge that there will be some trade-offs? + +00:42:01.960 --> 00:42:04.440 +We will trade off some searching. + +00:42:04.880 --> 00:42:06.540 +You know, it's not gonna have Elasticsearch. + +00:42:06.840 --> 00:42:08.320 +Doesn't mean that it won't have any search though. + +00:42:08.620 --> 00:42:10.160 +So we'll trade out Elasticsearch + +00:42:10.560 --> 00:42:12.840 +and it'll be very difficult to add new data, + +00:42:13.340 --> 00:42:15.020 +but that's okay because it's being archived. + +00:42:15.340 --> 00:42:16.940 +So can we get it into a static site? + +00:42:18.040 --> 00:42:20.940 +And that's challenging depending on how you've set it up. + +00:42:20.980 --> 00:42:26.460 +So we now have projects where we set them up from the beginning to be archivable like this. + +00:42:26.460 --> 00:42:28.960 +And one of them is called Water Stories. + +00:42:29.520 --> 00:42:35.960 +And it was a companion to an art installation at the Radcliffe Institute on the Harvard campus. + +00:42:36.700 --> 00:42:45.380 +And so this was this live site during the duration of the art installation where people could come in and add stories that they had about water onto an iPad. + +00:42:46.380 --> 00:42:47.660 +And then those went up to our database. 
+
+00:42:49.000 --> 00:42:54.340
+We built that with something called Django Bakery, which, if you opt in and you use all of their
+
+00:42:54.440 --> 00:43:00.480
+class-based views the way that they're meant to be used, then you can bake this out into static files
+
+00:43:00.480 --> 00:43:05.660
+when you're done. Very low effort. That was perfect. That is such a cool idea. And mad props to them for
+
+00:43:05.760 --> 00:43:11.440
+ASCII art logos. Come on now. I feel like that should be in the view source if it's not. But
+
+00:43:11.800 --> 00:43:17.260
+this is such a cool idea because you can, you can just take a working site. You guys are a Django
+
+00:43:17.260 --> 00:43:22.360
+shop. So you have a lot of your sites are written in Django and you just go make it static, right?
+
+00:43:22.860 --> 00:43:27.220
+Essentially. Yes. And, and what's, what's, what's really great about it is if they wanted to make
+
+00:43:27.240 --> 00:43:31.400
+a change and they have, they have asked since we, since we made it static, they've asked for a
+
+00:43:31.480 --> 00:43:37.020
+couple of changes. So locally, I just Docker compose up this whole application, make the change
+
+00:43:37.120 --> 00:43:42.420
+in the Django admin and rebake the site. And so it's, it can still be updated. Something,
+
+00:43:42.600 --> 00:43:46.800
+if you've never tried this, like something like, Hey, can we just add one more menu item?
+
+00:43:47.140 --> 00:43:50.100
+And you're like, no, no, no, we're not adding the menu item because you want that.
+
+00:43:50.140 --> 00:43:55.980
+That means we're changing 7,300 pages because they all bake in the whole HTML.
+
+00:43:56.400 --> 00:43:56.460
+Right?
+
+00:43:56.700 --> 00:43:57.020
+Exactly.
+
+00:43:57.560 --> 00:43:57.940
+Yeah, exactly.
+
+00:43:58.190 --> 00:44:02.740
+But if that's in my, in my Django database and my SQLite file, then no problem at
+
+00:44:02.840 --> 00:44:04.340
+all because then I just rebake it.
+
+00:44:04.620 --> 00:44:05.400
+Yeah, yeah, exactly.
+
+00:44:05.600 --> 00:44:06.100
+Absolutely.
+
+00:44:06.859 --> 00:44:09.480
+So I think this is super neat.
+
+00:44:09.560 --> 00:44:12.920
+There's also Frozen, Frozen-Flask.
+
+00:44:13.520 --> 00:44:16.740
+If I could get rid of all the ads, I do not need a Yeti thing, whatever that is.
+
+00:44:17.200 --> 00:44:24.900
+The glass, not the mythical thing, but Frozen-Flask, which does a similar thing for Flask
+
+00:44:25.300 --> 00:44:30.180
+apps. If you're a Flask person, it probably would work with Quart. Don't know for sure, but probably.
+
+00:44:30.520 --> 00:44:36.680
+So that's a pretty interesting idea as well. Throw that in there. But also, what else?
+
+00:44:37.460 --> 00:44:45.380
+Also you talked about search, right? That can be, can be such a problem. And I'm a huge fan of your
+
+00:44:45.320 --> 00:44:50.960
+recommendation here with PageFind. Tell us about PageFind. So this has been, I think it's been a
+
+00:44:50.960 --> 00:44:56.920
+bit of a game changer in how functional one of these archived sites can remain. So we're actually
+
+00:44:56.920 --> 00:45:03.360
+in the process of that amendments website that searches across 22,000 full texts of amendments.
+
+00:45:04.080 --> 00:45:09.580
+We are in the process of sunsetting that, and that will become a static site. And for that search,
+
+00:45:09.740 --> 00:45:16.020
+we already have an internal demo that proves that we can replace that Postgres full-text search
+
+00:45:16.760 --> 00:45:22.280
+with PageFind. You lose vector search. Yeah. You've kind of got to get really
+
+00:45:22.960 --> 00:45:27.340
+true keyword matching. Yeah. Yeah, that's right. But you still get filtering. I mean,
+
+00:45:27.360 --> 00:45:34.019
+and really faceting and filtering is when it comes to discovery of things, I mean, I find
+
+00:45:34.040 --> 00:45:40.360
+that's really what's useful.
So filtering these amendments by state or by the Congress that was + +00:45:40.500 --> 00:45:50.160 +active at the time or by the person who co-wrote it. All of those are totally great in PageFind. + +00:45:50.380 --> 00:45:55.240 +And the keyword search is just fine in PageFind. One of the things I really like about it is that + +00:45:55.540 --> 00:46:00.939 +it takes your index and it chops it up into lots of little files that can just fly across the + +00:46:00.960 --> 00:46:06.640 +network. So it's a very fast search. It's not a huge network load, even if your index is + +00:46:07.260 --> 00:46:13.360 +initially very large. And it essentially cuts it up somewhat alphabetically. So if your search + +00:46:14.070 --> 00:46:20.800 +starts with T, or I should say a better word for audio, if it starts with W, then it will load up + +00:46:20.810 --> 00:46:26.000 +the index for words that start with W and fly that over the network instead of the whole thing. + +00:46:26.120 --> 00:46:29.220 +So it's pretty slick and it has a great Python API. + +00:46:29.760 --> 00:46:33.320 +So to do the proof of concept for the amendments search, + +00:46:33.950 --> 00:46:40.000 +I just took a database dump and then manually indexed with a Python script into PageFind. + +00:46:40.180 --> 00:46:42.980 +Wait, there's a Python API for PageFind? + +00:46:43.360 --> 00:46:47.800 +Yeah. So the way PageFind works, I should have said that, is the way most people will use it + +00:46:48.140 --> 00:46:55.160 +is by normally PageFind consumes HTML. So you give it access to your dist folder. + +00:46:56.040 --> 00:46:56.660 +Oh, okay. + +00:46:57.700 --> 00:47:00.360 +And then it crawls through all of your HTML files. + +00:47:00.580 --> 00:47:05.840 +And you can do great things like adding little HTML tags that are just for PageFind, + +00:47:05.940 --> 00:47:09.340 +that give it the filtering ability, or that you want to sort by something. 
+
+00:47:09.620 --> 00:47:10.780
+And so that's great.
+
+00:47:11.380 --> 00:47:18.280
+Or you can just call PageFind from Python or from TypeScript and just build that index manually.
+
+00:47:18.660 --> 00:47:19.660
+Well, thanks a lot, David.
+
+00:47:19.800 --> 00:47:21.160
+I have another thing I've got to go research.
+
+00:47:21.360 --> 00:47:21.840
+This is awesome.
+
+00:47:22.560 --> 00:47:24.520
+I'm a huge fan of PageFind, as I said.
+
+00:47:24.540 --> 00:47:27.040
+My personal website, mkennedy.codes,
+
+00:47:27.440 --> 00:47:29.040
+is just a pure static site.
+
+00:47:29.090 --> 00:47:31.520
+It starts in Markdown and ends up in HTML.
+
+00:47:31.940 --> 00:47:34.600
+But if you add PageFind in, you get a super rich,
+
+00:47:34.780 --> 00:47:36.240
+if you want to just know, you want to talk about,
+
+00:47:36.360 --> 00:47:37.120
+like what was said about Docker,
+
+00:47:37.590 --> 00:47:39.960
+it shows you really nice results,
+
+00:47:40.500 --> 00:47:42.060
+pulling out the different parts of the page
+
+00:47:42.300 --> 00:47:43.480
+and sections that talk about it,
+
+00:47:43.560 --> 00:47:45.540
+like the headers and then what is said.
+
+00:47:45.670 --> 00:47:48.520
+And it even does like sub, sub word,
+
+00:47:48.920 --> 00:47:50.280
+you know, like you just type doc,
+
+00:47:50.620 --> 00:47:51.980
+it finds all the words that match that.
+
+00:47:52.160 --> 00:47:54.500
+And what I really like about it is a couple of things.
+
+00:47:54.520 --> 00:47:59.660
+It's instant. It basically is like nearly instant. If you type a few things, it gets way faster
+
+00:47:59.720 --> 00:48:04.600
+because it's pulling down. And if you go and look in the network console here and you type
+
+00:48:05.220 --> 00:48:10.540
+something, you can see that it's actually pulling in these little tiny fragments, which this one's
+
+00:48:10.640 --> 00:48:16.480
+coming off disk cache in three milliseconds, right?
But it breaks your index into a bunch of very small
+
+00:48:16.980 --> 00:48:22.000
+PageFind fragments that I think it's like, it starts with anything that starts with the word
+
+00:48:21.980 --> 00:48:24.860
+DO. These are all the prebuilt results and stuff like that. Right.
+
+00:48:25.080 --> 00:48:26.220
+That's right. That's right.
+
+00:48:26.440 --> 00:48:27.440
+Yeah. That's super cool.
+
+00:48:27.940 --> 00:48:34.440
+Yeah. One of our open source projects that, that we maintain is a Vue, a
+
+00:48:34.440 --> 00:48:39.780
+Vue.js component library for PageFind so that we can style it and reuse it
+
+00:48:39.810 --> 00:48:40.680
+across different projects.
+
+00:48:41.040 --> 00:48:42.460
+Oh, that's awesome. I love it.
+
+00:48:42.780 --> 00:48:44.180
+Yeah. I think this really unlocks it.
+
+00:48:44.180 --> 00:48:48.800
+And I mean, you go to so many, so many sites, like their documentation or just
+
+00:48:48.850 --> 00:48:51.440
+their web app, and the search is so bad.
+
+00:48:51.640 --> 00:48:56.680
+You type something and it's like thinking, spinning, spinning, spinning, spinning.
+
+00:48:57.040 --> 00:49:00.280
+And then like five seconds later, it gives you kind of janky results.
+
+00:49:00.700 --> 00:49:04.680
+And if you just like throw PageFind in there, it's, you can't type fast enough to
+
+00:49:05.100 --> 00:49:05.760
+outrun the results.
+
+00:49:05.820 --> 00:49:06.220
+You know what I mean?
+
+00:49:06.520 --> 00:49:07.100
+No, that's right.
+
+00:49:07.180 --> 00:49:07.260
+Yeah.
+
+00:49:07.880 --> 00:49:12.980
+Too many static site search solutions, they use like a, like a JSON blob that you, that
+
+00:49:12.980 --> 00:49:15.280
+you have to pull down and, and then iterate through.
+
+00:49:15.940 --> 00:49:16.580
+You know what's worse,
+
+00:49:16.680 --> 00:49:21.240
+and I see this a lot, would be if you go to google.com
+
+00:49:21.820 --> 00:49:24.540
+and then you would say effectively site colon whatever
+
+00:49:24.790 --> 00:49:26.260
+and then you search Docker, right?
+
+00:49:26.350 --> 00:49:28.120
+They basically pull that.
+
+00:49:29.000 --> 00:49:30.520
+You know, they just say search this
+
+00:49:30.760 --> 00:49:33.560
+and you just get Google results for your site.
+
+00:49:33.650 --> 00:49:36.460
+And obviously it's, I mean, Google's fine, but it's just.
+
+00:49:36.600 --> 00:49:38.360
+No, I find that unusable, really.
+
+00:49:38.460 --> 00:49:38.820
+I do too.
+
+00:49:38.950 --> 00:49:40.260
+It really, you're like, ah, geez.
+
+00:49:41.140 --> 00:49:43.220
+But now I'm super excited to realize
+
+00:49:43.370 --> 00:49:46.220
+I can do that from my dynamic content as well.
+
+00:49:46.640 --> 00:49:48.460
+So with the Python integration.
+
+00:49:48.880 --> 00:49:49.760
+OK, nice.
+
+00:49:51.360 --> 00:49:53.480
+What about something truly static?
+
+00:49:53.600 --> 00:49:56.440
+Have you looked at Hugo and some of the other type of things?
+
+00:49:56.880 --> 00:49:57.160
+Sure.
+
+00:49:57.390 --> 00:50:01.960
+So I see you've even got the tab up for the Tsumeb project,
+
+00:50:02.300 --> 00:50:08.680
+which is-- that's essentially a database of many, many specimens
+
+00:50:09.200 --> 00:50:10.440
+taken from the Tsumeb mine.
+
+00:50:11.460 --> 00:50:12.320
+So in the--
+
+00:50:12.530 --> 00:50:13.040
+Oh, it is.
+
+00:50:13.040 --> 00:50:13.740
+Yeah, yeah, it is.
+
+00:50:13.900 --> 00:50:15.640
+So if you click on Minerals database,
+
+00:50:16.180 --> 00:50:19.620
+you open up that search interface and that's powered by PageFind.
+
+00:50:19.760 --> 00:50:20.660
+Oh, this is?
+
+00:50:21.200 --> 00:50:21.300
+Yes.
+
+00:50:22.520 --> 00:50:23.740
+I forget what I was...
+
+00:50:23.930 --> 00:50:24.260
+I see.
+
+00:50:24.530 --> 00:50:26.640
+You guys even hooked into...
+
+00:50:26.700 --> 00:50:29.880
+I was thinking just like pure static, like Hugo, like...
+
+00:50:30.080 --> 00:50:31.540
+Oh, yes. Yes. Yes.
+
+00:50:31.840 --> 00:50:32.980
+So this is an Astro site.
+
+00:50:33.360 --> 00:50:37.540
+So for this website, we have this as an Astro site so that we have a little...
+
+00:50:37.600 --> 00:50:41.520
+Because with Astro, they make it so easy to pull in like Vue components.
+
+00:50:42.100 --> 00:50:47.720
+So like our PageFind is a custom Vue.js component library with Astro.
+
+00:50:47.730 --> 00:50:52.620
+You can use React components, you can use the Vue components, but what it does is it's just
+
+00:50:52.620 --> 00:50:56.980
+a static site generator. Fantastic. So a little bit more designable
+
+00:50:57.460 --> 00:51:00.120
+than like Hugo or something. Here's your Markdown file. Good luck with that.
+
+00:51:00.220 --> 00:51:05.020
+Yeah. I love Hugo though. Yeah. I use Hugo for different personal sites here and there,
+
+00:51:05.070 --> 00:51:08.420
+and it's just so fast and easy to get up and running. But yeah, it's great.
+
+00:51:08.440 --> 00:51:09.400
+- Great, great when it's a good fit.
+
+00:51:09.400 --> 00:51:10.740
+- That's what my website's written in, it's in Hugo.
+
+00:51:12.239 --> 00:51:14.280
+But if I'm integrating with anything else,
+
+00:51:14.400 --> 00:51:15.740
+I used to kind of like split it up,
+
+00:51:15.790 --> 00:51:17.920
+like this part's Hugo and this part's like a Python app.
+
+00:51:17.920 --> 00:51:20.000
+And it's pretty easy to get something
+
+00:51:20.140 --> 00:51:21.620
+that'll take a bunch of Markdown files
+
+00:51:21.820 --> 00:51:23.200
+and just turn them into HTML
+
+00:51:23.700 --> 00:51:25.400
+and just put a page template around that.
+
+00:51:25.580 --> 00:51:29.000
+So I've kind of stepped away from mixing and matching that
+
+00:51:29.140 --> 00:51:29.960
+as much as I used to.
+
+00:51:30.230 --> 00:51:32.940
+So now if I got a static section of a dynamic site,
+
+00:51:33.400 --> 00:51:34.000
+but that doesn't address,
+
+00:51:34.140 --> 00:51:37.780
+has nothing to do with the archival side of things, right?
+
+00:51:38.440 --> 00:51:41.840
+Because the idea is that the thing that I'm describing is gone on purpose.
+
+00:51:42.180 --> 00:51:42.600
+That's right.
+
+00:51:42.840 --> 00:51:45.980
+So you've got some, we've got Django Bakery.
+
+00:51:46.440 --> 00:51:52.580
+I threw out Frozen-Flask, and I'm sure there's a ton more that neither of us are aware of at the moment.
+
+00:51:52.800 --> 00:51:56.380
+So Django Bakery was really good for that purpose.
+
+00:51:56.640 --> 00:52:00.600
+And we're keeping our eyes open for projects that it's a good fit for.
+
+00:52:01.560 --> 00:52:03.420
+But that was a pretty simple website.
+
+00:52:03.620 --> 00:52:06.260
+It needed a dynamic backend, but it was quite straightforward.
+
+00:52:06.960 --> 00:52:09.860
+And for Django Bakery, you have to opt into inheriting
+
+00:52:10.080 --> 00:52:11.520
+from their class-based views.
+
+00:52:11.580 --> 00:52:11.840
+I see.
+
+00:52:12.700 --> 00:52:13.800
+So if you're doing, for example--
+
+00:52:13.800 --> 00:52:14.880
+You've got to dig ahead of it, yeah.
+
+00:52:15.260 --> 00:52:16.680
+Yeah, yeah, yeah, absolutely.
+
+00:52:17.000 --> 00:52:18.640
+Yeah, hard to add retroactively.
+
+00:52:18.780 --> 00:52:19.380
+Probably impossible.
+
+00:52:20.340 --> 00:52:23.120
+Now, our other websites, like the fin example
+
+00:52:23.380 --> 00:52:27.060
+and the Mapping Color example, those are APIs.
+
+00:52:27.500 --> 00:52:29.800
+That's a Django API, Django REST framework for one,
+
+00:52:30.700 --> 00:52:31.920
+GraphQL for the other.
+
+00:52:32.540 --> 00:52:34.560
+One has a Vue front end, one has a React front end.
+ +00:52:34.900 --> 00:52:36.920 +OK, well, Django Bakery just isn't + +00:52:36.940 --> 00:52:39.580 +isn't going to work very well for like serializing JSON. + +00:52:39.760 --> 00:52:40.680 +Yeah, it's like awesome. + +00:52:40.940 --> 00:52:44.080 +Here's your unrendered JavaScript front end code + +00:52:44.180 --> 00:52:45.560 +and it's just going to look empty or something. + +00:52:45.980 --> 00:52:46.060 +Yeah. + +00:52:46.400 --> 00:52:48.800 +So it is a good reason to consider using + +00:52:49.680 --> 00:52:51.460 +like vanilla Django templates when possible, + +00:52:52.440 --> 00:52:53.220 +like for that reason. + +00:52:53.440 --> 00:52:57.880 +But those were, those were inherited from the vendors, + +00:52:58.880 --> 00:52:59.420 +those two sites. + +00:52:59.440 --> 00:53:00.960 +And we've made a lot of progress on those. + +00:53:01.520 --> 00:53:04.740 +So, you know, what, what to do in that, + +00:53:05.000 --> 00:53:10.360 +like in that situation, Django Bakery isn't an option. And those projects are not end of life + +00:53:10.600 --> 00:53:14.960 +yet. So we have some time, but we're, we're, we're, so what we're doing is strategizing, okay, + +00:53:15.280 --> 00:53:20.720 +how will we rescue them? How will we keep them alive once, once somebody needs to stop paying + +00:53:20.880 --> 00:53:25.620 +for hosting? And we have, we have ideas. We have, I think there's, there's clever, interesting + +00:53:26.060 --> 00:53:34.900 +things out there. We'll have to keep looking into it. There are some pretty interesting ideas. And + +00:53:34.920 --> 00:53:41.020 +that ran in a container, you could just have WebAssembly, but still have it go, right? + +00:53:41.140 --> 00:53:42.780 +Sort of a local loopback type of thing. + +00:53:43.000 --> 00:53:50.640 +Yeah, I'm really interested in this one because it enables essentially the full functionality + +00:53:51.140 --> 00:53:54.960 +of the live site to exist as what is just a static site. 
+
+00:53:55.640 --> 00:54:03.160
+So because of Pyodide and projects like PyScript, we can run Python in the browser and we can
+
+00:54:03.120 --> 00:54:09.220
+run SQLite in the browser. And now we can even run Postgres in the browser with PGlite. So if
+
+00:54:09.300 --> 00:54:15.320
+we can run all those things in the browser, then couldn't we have Django hosted right in the browser?
+
+00:54:15.880 --> 00:54:22.320
+And you can. So there's a proof of concept that proves it's possible called Django WebAssembly.
+
+00:54:23.360 --> 00:54:29.940
+And if you load this up, it'll let you log in to the Django admin. And you're not logging into
+
+00:54:29.960 --> 00:54:36.380
+anybody's backend, you're logging into your own browser where this is running in a service worker.
+
+00:54:36.680 --> 00:54:40.280
+Awesome. Look at that. Oh, hold on. It told me what the password was. Very secure.
+
+00:54:40.860 --> 00:54:41.940
+Matt, password.
+
+00:54:42.220 --> 00:54:47.000
+Well, it can be entirely insecure because, yeah, you're just, it's running right in your own browser.
+
+00:54:47.300 --> 00:54:50.080
+Yeah, that's awesome. And here we are, Django admin. Incredible.
+
+00:54:50.480 --> 00:54:55.020
+Yeah, so I'm pretty interested in this. You've got to convert an RDS Postgres database
+
+00:54:55.640 --> 00:54:59.640
+into either SQLite or something like PGlite, but I think that's all doable.
+
+00:54:59.980 --> 00:55:01.920
+So I think it's an exciting possibility.
+
+00:55:02.340 --> 00:55:02.940
+Yeah, for sure.
+
+00:55:03.010 --> 00:55:06.860
+I do think, so maybe you have a rich query system
+
+00:55:07.030 --> 00:55:08.140
+that you're powering by your database
+
+00:55:08.480 --> 00:55:09.040
+that's really heavy.
+
+00:55:09.480 --> 00:55:09.840
+Exactly.
+
+00:55:10.120 --> 00:55:11.680
+And it's got a bunch of data that's like,
+
+00:55:11.720 --> 00:55:13.500
+here's all of our working data
+
+00:55:13.620 --> 00:55:14.740
+that you might ask questions about.
+
+00:55:15.060 --> 00:55:16.920
+Maybe you just convert that to PageFind
+
+00:55:17.580 --> 00:55:18.540
+to help you find the pieces
+
+00:55:18.960 --> 00:55:20.500
+and then just keep the operational data
+
+00:55:20.720 --> 00:55:23.300
+and maybe like even a SQLite with like the Django ORM,
+
+00:55:23.300 --> 00:55:25.600
+you can just switch the connection, keep talking to it.
+
+00:55:25.750 --> 00:55:26.900
+I mean, there's possibilities
+
+00:55:27.050 --> 00:55:28.900
+to just get something not too terrible
+
+00:55:28.920 --> 00:55:30.740
+Well, it's not the same, but not that far off.
+
+00:55:31.080 --> 00:55:31.680
+Yeah, exactly.
+
+00:55:32.190 --> 00:55:35.420
+And then it goes on GitHub Pages and it can live hopefully forever.
+
+00:55:35.700 --> 00:55:40.300
+I mean, it feels like GitHub will last forever, but it'll last longer than funding will anyways.
+
+00:55:41.120 --> 00:55:48.380
+It's definitely going to last longer than just something that we can't pay for anymore, right?
+
+00:55:48.520 --> 00:55:53.900
+I don't know how long GitHub's going to be around for, I think a while, but you never know, right?
+
+00:55:53.960 --> 00:55:57.400
+It seems like stuff's going to last forever, then it gets changed.
+
+00:55:57.520 --> 00:55:58.180
+We had Subversion.
+
+00:55:59.000 --> 00:56:00.480
+Now it's completely gone, right?
+
+00:56:00.800 --> 00:56:04.780
+Just 20 years, 15 years later, but still, I think 100% there.
+
+00:56:05.020 --> 00:56:05.260
+Yeah.
+
+00:56:05.580 --> 00:56:09.520
+But if somebody can, if something ever happened, somebody just needs to copy that,
+
+00:56:09.750 --> 00:56:15.800
+that folder of HTML, CSS and JavaScript files and dump it into an S3 bucket or somewhere else.
+
+00:56:15.950 --> 00:56:17.360
+And then it can continue living there.
+
+00:56:17.860 --> 00:56:18.800
+So it's a good option.
+
+00:56:19.440 --> 00:56:20.020
+It's a great option.
+
+00:56:20.320 --> 00:56:21.400
+It's a really, really good option.
+
+00:56:21.660 --> 00:56:30.940
+I mean, I guess one of the long-term concerns might be what if the WebAssembly standard changes so much that it's not supported anymore?
+
+00:56:31.520 --> 00:56:36.860
+But you could probably byte-wise convert it if you had to, you know, like somebody would probably be able to create one.
+
+00:56:37.240 --> 00:56:38.560
+Yeah, that would be unfortunate.
+
+00:56:39.060 --> 00:56:48.860
+So I suppose if that happens, I mean, if that happens, yeah, we're booting up one of these projects is like booting up an emulator for some old DOS game.
+
+00:56:49.060 --> 00:56:49.540
+Right, right.
+
+00:56:49.720 --> 00:56:52.320
+Well, I mean, I guess let's think about this for a second.
+
+00:56:52.840 --> 00:56:55.460
+Somebody got, oh gosh, what was the chain?
+
+00:56:55.510 --> 00:57:03.180
+This is the whole, JavaScript, the PyCon talk where they got like Firefox
+
+00:57:04.280 --> 00:57:10.080
+compiled into, not WASM, into, asm.js or something like that.
+
+00:57:10.250 --> 00:57:14.300
+So it was run like Chrome was running Firefox, which was running, I think
+
+00:57:14.620 --> 00:57:17.060
+Doom, which was also asm.js.
+
+00:57:17.940 --> 00:57:21.800
+If we can do that, we could get something that would run, that would read old Web
+
+00:57:22.000 --> 00:57:24.540
+Assembly into new WebAssembly if it really mattered to the world.
+
+00:57:24.860 --> 00:57:25.180
+Absolutely.
+
+00:57:25.800 --> 00:57:25.980
+Yeah.
+
+00:57:26.240 --> 00:57:30.380
+Especially if it's in a public repo that people who care about the data can,
+
+00:57:30.680 --> 00:57:31.560
+can rescue it somehow.
+
+00:57:31.980 --> 00:57:32.080
+Yeah.
+ +00:57:32.420 --> 00:57:34.040 +What about like a virtual machine? + +00:57:34.500 --> 00:57:35.140 +You know, I agree. + +00:57:35.220 --> 00:57:35.640 +Yeah, absolutely. + +00:57:36.440 --> 00:57:42.220 +Could have saved me some, take a snapshot of Ubuntu LTS, some version, + +00:57:42.420 --> 00:57:43.600 +and just what are we going to do? + +00:57:44.200 --> 00:57:46.000 +Everything we do is Dockerized. + +00:57:46.400 --> 00:57:47.320 +Everything is in a container. + +00:57:47.780 --> 00:57:51.900 +So in the worst case scenario, we could give somebody the image, and they could run it if + +00:57:51.910 --> 00:57:52.420 +they have Docker. + +00:57:53.310 --> 00:57:57.780 +I think that's a nice peace of mind to know that no matter what, something will be able + +00:57:57.790 --> 00:57:59.040 +to run this container. + +00:57:59.440 --> 00:58:03.000 +And even in, I don't know if you've used GitHub, what is it called, Codespaces. + +00:58:05.319 --> 00:58:06.680 +I archived one project. + +00:58:07.570 --> 00:58:12.740 +It was kind of dramatic and sudden that it needed to be archived, so without much time + +00:58:12.850 --> 00:58:13.320 +to do anything. + +00:58:13.500 --> 00:58:15.460 +And it was a Ruby on Rails project. + +00:58:15.680 --> 00:58:18.220 +And I'm not a Rails developer, but I + +00:58:18.260 --> 00:58:19.600 +was able to get it archived in a way + +00:58:19.780 --> 00:58:22.620 +that anybody could, with one command, + +00:58:23.300 --> 00:58:27.040 +go to the repo on GitHub and boot it up in Codespaces + +00:58:27.440 --> 00:58:30.540 +and then have it live running from their Codespace. + +00:58:30.540 --> 00:58:31.800 +And so that works too. + +00:58:32.040 --> 00:58:32.380 +Very cool. + +00:58:32.600 --> 00:58:35.120 +I think as WebAssembly grows, there'll + +00:58:35.120 --> 00:58:38.200 +be more possibilities for these types of things. + +00:58:38.600 --> 00:58:39.300 +Yeah, amazing. 
+
+00:58:39.660 --> 00:58:42.640
+I'm pretty excited about PageFind having a Python API.
+
+00:58:42.900 --> 00:58:46.440
+I didn't realize that. So I'm going to be doing something with that for sure. So what else?
+
+00:58:46.960 --> 00:58:51.180
+Let me ask you one more thing before I kind of let you wrap up with some final thoughts here.
+
+00:58:51.620 --> 00:58:58.300
+What about AI? Oh, that's a good question. So AI, I mean, there's like, in my story,
+
+00:58:58.660 --> 00:59:04.580
+there's like one interesting part of AI, which is that I got started and self-learned everything I
+
+00:59:04.660 --> 00:59:10.840
+needed to about software development to begin doing this right before ChatGPT really came on,
+
+00:59:10.860 --> 00:59:17.240
+was able to do real programming. Yeah, you're like four years of legit programming before, right? So I
+
+00:59:17.380 --> 00:59:21.320
+think, I mean, so I was thinking, I was thinking, when I was thinking about how I got into it, I thought,
+
+00:59:21.660 --> 00:59:28.500
+what if I was four years later starting my PhD and wanting to do these tools? Um, I would have been
+
+00:59:28.570 --> 00:59:34.580
+able to accomplish what I needed to for my research without acquiring the technical skills. And that
+
+00:59:34.610 --> 00:59:38.140
+would have been, that's a good thing. I'm not sure if that's good about it. It could be both. I would
+
+00:59:37.980 --> 00:59:43.220
+would have thought it was a good thing. I would have thought it's a good thing. But in my hands
+
+00:59:43.740 --> 00:59:52.220
+now, like a software engineer, AI is more powerful in my hands now than it would have been then.
+
+00:59:52.610 --> 00:59:57.560
+So I can make it work for me. Yeah, I can make it work for me in a way that I couldn't have been
+
+00:59:57.610 --> 01:00:01.980
+able to then. So I'm thankful for that, but it's something I think of.
I don't want to say it's
+
+01:00:02.800 --> 01:00:07.940
+necessarily a bad thing, but it definitely marks a difference, a difference in time between other
+
+01:00:07.960 --> 01:00:13.120
+people who are maybe wanting to get into digital humanities, they're humanities researchers. They
+
+01:00:13.140 --> 01:00:17.740
+want to add some digital tools. You know, I think this will kind of, this will probably knock people
+
+01:00:18.040 --> 01:00:22.280
+off of the more technical path because it's not needed. I think it will too. And I think that that
+
+01:00:22.460 --> 01:00:27.640
+might be a negative. When you were telling me your story originally, I was thinking kind of like,
+
+01:00:27.760 --> 01:00:32.740
+how neat is it that you didn't sign up for, and the people you're working with probably didn't
+
+01:00:32.760 --> 01:00:36.300
+intend to sign you up for learning true software development.
+
+01:00:36.820 --> 01:00:41.000
+But look at this cool and interesting job that you now have that you never
+
+01:00:41.160 --> 01:00:41.880
+would have imagined.
+
+01:00:42.000 --> 01:00:44.400
+I'm sure when you signed up for your PhD, you're like, you know what I'm
+
+01:00:44.400 --> 01:00:47.320
+going to do when I get my PhD, I'm going to go X, Y, like, I'm going to
+
+01:00:47.400 --> 01:00:48.020
+join the DARTH program.
+
+01:00:48.120 --> 01:00:49.780
+Like, no, probably not.
+
+01:00:49.900 --> 01:00:50.000
+Right.
+
+01:00:50.120 --> 01:00:50.760
+But here you are.
+
+01:00:51.380 --> 01:00:54.880
+And I think that's actually a really interesting knock-on effect for a lot
+
+01:00:54.960 --> 01:00:59.040
+of researchers and people in grad schools, they're kind of put into this
+
+01:00:59.660 --> 01:01:01.020
+programming adjacent type of thing.
+
+01:01:01.400 --> 01:01:04.740
+You know, and a lot of folks sort of are like, actually, that's pretty interesting.
+
+01:01:04.940 --> 01:01:06.160
+I'm going to kind of lean into that. 
+ +01:01:06.490 --> 01:01:10.300 +And I think AI might knock, like you said, knock people off that path to some degree. + +01:01:11.100 --> 01:01:11.720 +Yeah, yeah, definitely. + +01:01:12.210 --> 01:01:14.700 +So that's just like one part of the AI story. + +01:01:15.050 --> 01:01:17.900 +The other one is that, like how we use it. + +01:01:18.840 --> 01:01:25.540 +It's great for data extraction, pulling data out of different, you know, to make these + +01:01:25.890 --> 01:01:30.100 +search interfaces more powerful, to extract different data from them. + +01:01:30.540 --> 01:01:33.000 +That's just one example where it's been handy. + +01:01:33.800 --> 01:01:38.180 +We're looking for ways that it can really empower faculty. + +01:01:39.160 --> 01:01:47.460 +We're still very much in the exploration phase of how we can use it and provide it to faculty as a digital humanities tool. + +01:01:48.220 --> 01:01:52.240 +Sure. I was thinking pretty much when I asked the question of it, it's just like two parts. + +01:01:52.400 --> 01:01:56.300 +One, how is it? Are you guys using it to help take projects? + +01:01:56.440 --> 01:01:58.320 +Well, that would have been a month. No, actually, it's three days. + +01:01:58.820 --> 01:01:59.260 +You know what I mean? + +01:02:00.300 --> 01:02:05.840 +that. And then if people are asking, you know, a professor comes along and says, and we want our + +01:02:05.930 --> 01:02:12.880 +own custom AI thing, or we're using Harvard's internal one that we're allowed to use, but we + +01:02:13.040 --> 01:02:17.600 +won't be able to use it once the grant runs out. You know what I mean? Yeah. Yeah. I think one, + +01:02:17.820 --> 01:02:23.280 +one good example of this type of thing is that what we're starting to get is faculty who are + +01:02:23.780 --> 01:02:28.180 +vibe coding and now, and we are going to teach them. We're going to teach them how to do it. + +01:02:28.540 --> 01:02:30.780 +You know, instead of having them. 
+
+01:02:31.200 --> 01:02:32.500
+Yeah, it's absolutely a skill.
+
+01:02:32.900 --> 01:02:33.500
+Yeah, no, it is.
+
+01:02:33.720 --> 01:02:34.040
+It is.
+
+01:02:34.800 --> 01:02:43.200
+Instead of copy and pasting from ChatGPT into VS Code, having them learn Copilot, maybe even having them download Cursor.
+
+01:02:43.600 --> 01:02:48.320
+Download some real dedicated tools to get this done to make them more productive.
+
+01:02:48.780 --> 01:02:52.860
+So, yeah, educating about how to do it is one thing.
+
+01:02:53.200 --> 01:02:54.240
+You asked if we're using it.
+
+01:02:54.900 --> 01:02:58.000
+We have access to Copilot.
+
+01:02:58.980 --> 01:03:04.140
+and that's great. I can't say that we've shipped anything in three days instead of a month yet,
+
+01:03:04.780 --> 01:03:13.440
+but one anecdote is that right now I'm doing some really interesting processing of music audio files,
+
+01:03:13.940 --> 01:03:19.500
+and somebody asked, they have a beatboxer, if I could chop that file up so that all of the individual
+
+01:03:19.820 --> 01:03:26.440
+sounds that the beatboxer makes are identified in a file. And so I'm using some music libraries,
+
+01:03:26.840 --> 01:03:32.000
+a Python library called librosa. There's some complicated math in there. It's a little bit
+
+01:03:32.040 --> 01:03:36.160
+too much for me. It's no problem for Claude. Claude knows how to do that math. And then,
+
+01:03:36.720 --> 01:03:39.580
+and I use my expertise to string it together to get a good output.
+
+01:03:39.940 --> 01:03:44.500
+Yeah. Awesome. You got time for one more quick question before we wrap things up?
+
+01:03:44.500 --> 01:03:44.660
+For sure.
+
+01:03:45.300 --> 01:03:51.160
+Raymond out there, Raymond Yee asks, he says, it'd be good to hear how Harvard uses containers on AWS
+
+01:03:51.840 --> 01:03:56.060
+and its reliability. It's reliable, not cheapest way to host things. 
Are you thinking about moving
+
+01:03:56.380 --> 01:04:02.480
+that or is it not that much? Okay, I'll tell you about a failed experiment.
+
+01:04:03.520 --> 01:04:11.180
+We were using ECS and we're still using ECS. So that's AWS's main, you know, it's not Kubernetes,
+
+01:04:11.560 --> 01:04:17.840
+but it's one step down with their horizontal scaling container clusters. And I wanted to move
+
+01:04:17.840 --> 01:04:23.580
+us onto a single EC2 instance because our projects are popular, but they're not so popular that we
+
+01:04:23.500 --> 01:04:25.580
+actually have to worry about horizontal scaling.
+
+01:04:25.860 --> 01:04:26.120
+Right.
+
+01:04:26.220 --> 01:04:29.760
+It's not like it's front page in the New York Times.
+
+01:04:30.280 --> 01:04:31.300
+I guess it probably could be.
+
+01:04:31.460 --> 01:04:34.300
+But even so, for the static sites, they probably still can take it.
+
+01:04:35.300 --> 01:04:35.380
+Yeah.
+
+01:04:35.640 --> 01:04:42.180
+So I priced it out and I got an example deployed, an example project deployed, and was able
+
+01:04:42.180 --> 01:04:44.860
+to confirm that it would indeed be much cheaper.
+
+01:04:45.940 --> 01:04:48.780
+And it was deployed in a similar way using AWS CDK.
+
+01:04:49.020 --> 01:04:51.540
+So it's all infrastructure as code all the way down.
+
+01:04:52.080 --> 01:04:54.680
+But it turns out there's all kinds of compliance.
+
+01:04:54.970 --> 01:04:58.300
+When you are in charge of the VM at like a big university,
+
+01:04:58.630 --> 01:05:00.580
+or I'm sure any corporate setting,
+
+01:05:00.980 --> 01:05:03.920
+if you are in charge of the VM and the OS on it,
+
+01:05:04.220 --> 01:05:07.260
+then you have to know that you have the latest patches in.
+
+01:05:07.460 --> 01:05:08.920
+You have to know that you have the latest Ubuntu. 
+
+01:05:09.490 --> 01:05:10.960
+And then there's other things,
+
+01:05:12.460 --> 01:05:13.860
+different observability things
+
+01:05:13.860 --> 01:05:14.740
+that you have to have in place
+
+01:05:15.900 --> 01:05:17.600
+that are not usually required
+
+01:05:17.880 --> 01:05:20.700
+if you're running in a container cluster like ECS.
+
+01:05:21.480 --> 01:05:27.700
+So it ends up being a lot less work and much easier to achieve compliance if we run containers
+
+01:05:28.120 --> 01:05:31.120
+or some other serverless thing.
+
+01:05:31.440 --> 01:05:37.160
+If I run all my personal projects, they all run in a single virtual machine, but we're
+
+01:05:37.280 --> 01:05:37.800
+running in containers.
+
+01:05:38.340 --> 01:05:38.560
+Yeah.
+
+01:05:38.560 --> 01:05:38.660
+Yeah.
+
+01:05:39.300 --> 01:05:42.260
+And you've got all the SOC 2 stuff and all those different things, right?
+
+01:05:42.320 --> 01:05:43.380
+Like there's layers.
+
+01:05:43.940 --> 01:05:44.440
+Yeah, that's right.
+
+01:05:44.740 --> 01:05:44.800
+Yeah.
+
+01:05:44.920 --> 01:05:50.300
+I mean, I'll mention that, but what I didn't say is that back in 2019, when I started learning
+
+01:05:50.520 --> 01:05:55.520
+Python, I discovered Talk Python almost immediately. And one of the first episodes that I listened to
+
+01:05:55.520 --> 01:06:01.060
+was the other digital humanities one, Cornelis van Lit. He was an awesome guest.
+
+01:06:01.260 --> 01:06:06.220
+That's right. Yeah. And I thought that was great. And that was also a bit about manuscripts,
+
+01:06:06.820 --> 01:06:11.760
+a little bit more on the image side than the text side. And I didn't understand everything
+
+01:06:11.790 --> 01:06:15.880
+that everybody was saying, but I just, I kept tuning in. And I think because of that,
+
+01:06:16.120 --> 01:06:21.660
+because Talk Python was like this, you know, I've been remote working for most of my time. 
+
+01:06:22.400 --> 01:06:27.000
+And Talk Python has been kind of like that conversation with the open source community
+
+01:06:27.700 --> 01:06:28.920
+that's been always in my ear.
+
+01:06:28.920 --> 01:06:33.340
+And I think that made, you know, a difference, making me feel like I understood the software
+
+01:06:34.060 --> 01:06:37.420
+landscape and like the developer culture and what was going on.
+
+01:06:37.640 --> 01:06:40.900
+And then the different Python libraries and what was possible.
+
+01:06:41.640 --> 01:06:47.280
+So to people who are interested in taking things in a more technical direction, I think
+
+01:06:47.280 --> 01:06:52.560
+it's helpful just to find a few things like that, that give you an insight into that world.
+
+01:06:53.020 --> 01:06:59.060
+And the more you listen to it, the more you start to hear the same acronyms and the same
+
+01:06:59.360 --> 01:07:02.640
+things said enough that you start to feel like, okay, now you're part of the club.
+
+01:07:03.000 --> 01:07:04.360
+I really appreciate that.
+
+01:07:05.180 --> 01:07:05.580
+That's cool.
+
+01:07:06.080 --> 01:07:09.780
+I've certainly had people reach out to me and say things that at first didn't make any
+
+01:07:09.940 --> 01:07:10.240
+sense to me.
+
+01:07:10.360 --> 01:07:12.200
+Like I've been listening for six weeks now
+
+01:07:12.400 --> 01:07:14.540
+and it's starting to make sense what you're talking about.
+
+01:07:14.540 --> 01:07:15.980
+Like, why have you been listening for six months
+
+01:07:16.030 --> 01:07:16.800
+when it made no sense?
+
+01:07:16.940 --> 01:07:17.420
+That's insane.
+
+01:07:17.680 --> 01:07:20.880
+But a lot of people use listening to the podcast,
+
+01:07:21.070 --> 01:07:24.500
+be it mine and others, as language immersion, right?
+
+01:07:24.640 --> 01:07:28.380
+Like I could get Duolingo and I could learn Portuguese
+
+01:07:28.720 --> 01:07:30.580
+or I could move to Brazil for a month. 
+ +01:07:30.830 --> 01:07:31.380 +You know what I mean? + +01:07:31.580 --> 01:07:32.200 +And then I would really learn. + +01:07:32.200 --> 01:07:32.480 +- Yeah, exactly. + +01:07:33.160 --> 01:07:33.460 +- Right. + +01:07:34.000 --> 01:07:34.140 +- Exactly. + +01:07:34.270 --> 01:07:36.040 +No, I think there's truth to that. + +01:07:36.260 --> 01:07:38.660 +And some of the things I did was, you know, + +01:07:38.820 --> 01:07:42.920 +search through, like search the word deployment, because I'm trying to get my head around how to + +01:07:43.020 --> 01:07:47.000 +deploy for the first time. And I just want to hear people talk about it. Like I could read about it. + +01:07:47.000 --> 01:07:52.120 +I could read the tutorial, but I just want to hear people talk about deployment to get a sense of what + +01:07:52.300 --> 01:07:56.480 +actual deployment sounds like. There's something really different when you're learning or trying, + +01:07:57.240 --> 01:08:01.380 +even you're maybe an experienced programmer, but not in this particular area to hear a human + +01:08:01.840 --> 01:08:08.500 +side of it, not just the docs, not a sterile. These are the four steps, but like, I love it. + +01:08:08.700 --> 01:08:10.080 +I mean, it's probably why I created the show. + +01:08:10.280 --> 01:08:11.680 +It's because I didn't hear those stories. + +01:08:11.780 --> 01:08:12.940 +We got to tell those stories. + +01:08:13.440 --> 01:08:13.540 +Awesome. + +01:08:13.860 --> 01:08:14.660 +I appreciate that. + +01:08:14.860 --> 01:08:15.620 +So super cool. + +01:08:15.840 --> 01:08:16.020 +All right. + +01:08:16.359 --> 01:08:21.080 +So if other people are listening, maybe one of your pieces of advice is keep listening. + +01:08:21.580 --> 01:08:22.299 +You'll get there. + +01:08:22.480 --> 01:08:22.859 +Yeah. 
+ +01:08:22.960 --> 01:08:30.060 +And if anybody is in the humanities and somehow found their way onto this episode with no technical experience, + +01:08:30.819 --> 01:08:37.060 +I just would give the caution of, like, you know, the anecdote that if AI coding had been + +01:08:37.259 --> 01:08:42.940 +around the way it is now when I was learning, I wouldn't be doing digital humanities at + +01:08:43.060 --> 01:08:43.200 +Harvard. + +01:08:43.540 --> 01:08:45.600 +I wouldn't have been able to get into this field. + +01:08:46.420 --> 01:08:47.420 +I wouldn't have known about it. + +01:08:47.799 --> 01:08:52.380 +So I guess just think about that when you're learning and applying new tools. + +01:08:52.720 --> 01:08:54.980 +I don't really know what the right fix for that is. + +01:08:55.060 --> 01:08:56.299 +That's a very challenging problem. + +01:08:56.500 --> 01:08:59.560 +I mean, you can say I'm just literally not going to fire it up. + +01:08:59.720 --> 01:09:03.279 +But I mean, we used to hunt through Stack Overflow and the web and over and over. + +01:09:03.460 --> 01:09:06.859 +And if you're really stuck or you really don't understand, like they're good at explaining + +01:09:06.960 --> 01:09:07.319 +stuff too. + +01:09:07.359 --> 01:09:12.200 +You just got to really stay in a learner's mindset, not just press the easy button and + +01:09:12.319 --> 01:09:13.259 +make this thing and move on. + +01:09:13.700 --> 01:09:14.380 +Easier said than done. + +01:09:14.680 --> 01:09:15.359 +Easier said than done. + +01:09:15.620 --> 01:09:22.000 +So yeah, I want to leave this with kind of a thought about how much things like Python + +01:09:22.220 --> 01:09:27.260 +and these tools and technology can really empower stuff that you wouldn't think is even + +01:09:27.279 --> 01:09:34.620 +related, like understanding old manuscripts and how painting is connected or changed over time and + +01:09:34.799 --> 01:09:39.720 +stuff, right? 
Those sound very much disjointed from tech and software, but they really are + +01:09:40.080 --> 01:09:45.319 +superpowers that you can bring to your work, whatever your industry is. I know our field of + +01:09:45.460 --> 01:09:49.600 +study, I know there's some sociologists out in the audience and I'm sure others as well. + +01:09:50.279 --> 01:09:54.700 +All right. Final thoughts, David, close it out. You said it great. I mean, you know, + +01:09:55.340 --> 01:10:01.840 +Just applying these technical tools to old questions, that is the core of digital humanities. + +01:10:02.220 --> 01:10:04.900 +When I first started hearing about this, I thought, I really don't know how this ties + +01:10:05.060 --> 01:10:05.160 +together. + +01:10:05.400 --> 01:10:08.780 +And after seeing it a few times, I definitely see the power of it. + +01:10:08.780 --> 01:10:11.000 +And I thank you for your time coming on. + +01:10:11.260 --> 01:10:16.760 +Thank you for sharing your look and the look inside of your team and inside of a small piece + +01:10:16.940 --> 01:10:17.260 +of Harvard. + +01:10:17.780 --> 01:10:22.960 +I really like these kinds of episodes because it's hard to see this from the outside, right? + +01:10:23.060 --> 01:10:24.880 +like you just see the results, + +01:10:24.950 --> 01:10:27.180 +but you don't see like the inner workings of the team + +01:10:27.320 --> 01:10:28.140 +and the motivation and stuff. + +01:10:28.360 --> 01:10:30.640 +So thank you so much for being here. + +01:10:31.150 --> 01:10:32.480 +And yeah, bye everyone. + +01:10:33.980 --> 01:10:36.100 +This has been another episode of Talk Python To Me. + +01:10:36.370 --> 01:10:37.200 +Thank you to our sponsors. + +01:10:37.390 --> 01:10:38.700 +Be sure to check out what they're offering. + +01:10:38.940 --> 01:10:40.260 +It really helps support the show. + +01:10:40.720 --> 01:10:42.100 +Take some stress out of your life. 
+ +01:10:42.480 --> 01:10:44.280 +Get notified immediately about errors + +01:10:44.640 --> 01:10:46.440 +and performance issues in your web + +01:10:46.450 --> 01:10:47.920 +or mobile applications with Sentry. + +01:10:48.440 --> 01:10:51.300 +Just visit talkpython.fm/sentry + +01:10:51.800 --> 01:10:52.860 +and get started for free. + +01:10:53.280 --> 01:10:55.800 +Be sure to use our code, talkpython26. + +01:10:56.760 --> 01:11:00.140 +That's Talk Python, the numbers two, six, all one word. + +01:11:00.820 --> 01:11:02.920 +This episode is brought to you by CommandBook, + +01:11:03.240 --> 01:11:05.320 +a native macOS app that I built + +01:11:05.480 --> 01:11:08.040 +that gives long-running terminal commands a permanent home. + +01:11:08.440 --> 01:11:10.440 +No more juggling six terminal tabs every morning. + +01:11:10.880 --> 01:11:12.280 +Carefully craft a command once, + +01:11:12.440 --> 01:11:14.020 +run it forever with auto-restart, + +01:11:14.160 --> 01:11:15.700 +URL detection, and a full CLI. + +01:11:16.060 --> 01:11:19.180 +Download it for free at talkpython.fm/commandbook app. + +01:11:19.920 --> 01:11:21.800 +If you or your team needs to learn Python, + +01:11:22.040 --> 01:11:32.080 +We have over 270 hours of beginner and advanced courses on topics ranging from complete beginners to async code, Flask, Django, HTML, and even LLMs. + +01:11:32.400 --> 01:11:34.580 +Best of all, there's no subscription in sight. + +01:11:35.240 --> 01:11:36.900 +Browse the catalog at talkpython.fm. + +01:11:37.600 --> 01:11:42.260 +And if you're not already subscribed to the show on your favorite podcast player, what are you waiting for? + +01:11:42.900 --> 01:11:44.700 +Just search for Python in your podcast player. + +01:11:44.790 --> 01:11:45.680 +We should be right at the top. + +01:11:46.100 --> 01:11:48.940 +If you enjoy that geeky rap song, you can download the full track. 
+
+01:11:49.070 --> 01:11:50.980
+The link is actually in your podcast player's show notes.
+
+01:11:51.760 --> 01:11:53.140
+This is your host, Michael Kennedy.
+
+01:11:53.560 --> 01:11:54.600
+Thank you so much for listening.
+
+01:11:54.830 --> 01:11:55.620
+I really appreciate it.
+
+01:11:56.040 --> 01:11:56.760
+I'll see you next time.
+
+01:12:08.400 --> 01:12:11.200
+I'm out.
+

From 5f4637dd5a8214576a50a5935d71ef64bc91723a Mon Sep 17 00:00:00 2001
From: Michael Kennedy
Date: Fri, 6 Mar 2026 08:44:36 -0800
Subject: [PATCH 05/16] transcripts

---
 ...hing-up-with-the-python-typing-council.txt | 2360 ++++
 ...hing-up-with-the-python-typing-council.vtt | 9868 +++++++++++++++++
 2 files changed, 12228 insertions(+)
 create mode 100644 transcripts/539-catching-up-with-the-python-typing-council.txt
 create mode 100644 transcripts/539-catching-up-with-the-python-typing-council.vtt

diff --git a/transcripts/539-catching-up-with-the-python-typing-council.txt b/transcripts/539-catching-up-with-the-python-typing-council.txt
new file mode 100644
index 0000000..1714687
--- /dev/null
+++ b/transcripts/539-catching-up-with-the-python-typing-council.txt
@@ -0,0 +1,2360 @@
+00:00:00 You're adding type hints to your Python code.
+
+00:00:02 Your editor is happy, autocomplete is working great, but then you switch tools
+
+00:00:06 and suddenly there are red squigglies everywhere.
+
+00:00:09 Who decides what a float annotation actually means or whether passing None where an int is expected
+
+00:00:15 should be an error?
+
+00:00:17 It turns out there's a five-person council dedicated to exactly these questions
+
+00:00:21 and two brand new Rust-based type checkers are raising the bar as well. 
+
+00:00:27 On this episode, I sit down with three of the members of the Python Typing Council,
+
+00:00:31 Jelle Zijlstra, Rebecca Chen, and Carl Meyer to learn about how the type system is governed,
+
+00:00:37 where the spec and type checkers agree and disagree, and I get the council's official advice
+
+00:00:42 on how much typing is just enough.
+
+00:00:45 This is Talk Python To Me, episode 539, recorded January 27th, 2026.
+
+00:00:53 Talk Python To Me, yeah, we ready to roll.
+
+00:00:56 Upgrading the code, no fear of getting old.
+
+00:00:59 Async in the air, new frameworks in sight, geeky rap on deck.
+
+00:01:03 Quart crew, it's time to unite.
+
+00:01:05 We started in Pyramid, cruising old school lanes, had that stable base, yeah, sir.
+
+00:01:09 Welcome to Talk Python To Me, the number one Python podcast for developers
+
+00:01:13 and data scientists.
+
+00:01:14 This is your host, Michael Kennedy.
+
+00:01:16 I'm a PSF fellow who's been coding for over 25 years.
+
+00:01:20 Let's connect on social media.
+
+00:01:22 You'll find me and Talk Python on Mastodon, Bluesky, and X.
+
+00:01:25 The social links are all in your show notes.
+
+00:01:27 You can find over 10 years of past episodes at talkpython.fm.
+
+00:01:31 And if you want to be part of the show, you can join our recording live streams.
+
+00:01:35 That's right, we live stream the raw, uncut version of each episode on YouTube.
+
+00:01:39 Just visit talkpython.fm/youtube to see the schedule of upcoming events.
+
+00:01:44 Be sure to subscribe there and press the bell so you'll get notified anytime we're recording.
+
+00:01:48 This episode is brought to you by Sentry.
+
+00:01:50 Don't let those errors go unnoticed.
+
+00:01:52 Use Sentry like we do here at Talk Python.
+
+00:01:53 Sign up at talkpython.fm/sentry.
+
+00:01:57 And it's brought to you by our Agentic AI programming for Python course.
+
+00:02:02 Learn to work with AI that actually understands your code base and build real features. 
+
+00:02:07 Visit talkpython.fm/agentic-ai.
+
+00:02:11 Jelle, Rebecca, and Carl, welcome to all of you type-loving Pythonistas.
+
+00:02:18 Awesome to have you here on the show.
+
+00:02:20 Thanks for being here.
+
+00:02:20 We're going to talk Python typing, especially from the perspective of the Python Typing Council,
+
+00:02:27 which honestly, I am a huge fan of Python typing.
+
+00:02:30 It's still something I learned about not too long ago.
+
+00:02:33 So I'm going to be learning along with everyone else, what it is you all do and so on.
+
+00:02:38 So I'm really excited to be diving into this.
+
+00:02:41 I think since types came to Python, I think it's made it a little bit more rigorous,
+
+00:02:46 you know, for all those people out there like, oh, it's not a real language without any form of static typing.
+
+00:02:51 We can't use it on real projects.
+
+00:02:53 I don't know how true that was, but certainly it's less true now.
+
+00:02:56 You know, you can pick per project.
+
+00:02:58 So it's super cool.
+
+00:02:59 Before we get into all that, though, let's just go around for some quick introductions.
+
+00:03:03 Jelle, welcome to the show.
+
+00:03:05 Awesome to have you here.
+
+00:03:06 Who are you?
+
+00:03:06 Hi, yeah.
+
+00:03:07 Jelle, I've been on the Python Typing Council since the beginning.
+
+00:03:10 I helped set it up a couple of years ago.
+
+00:03:12 Outside of the typing work, I currently work at OpenAI, where I work on developer productivity,
+
+00:03:17 which means things like running CI for people and helping, generally helping people be productive.
+
+00:03:23 I've been working with Python for more than a decade.
+
+00:03:25 Started out because my previous job was mostly in Python and then got more and more involved with the language.
+
+00:03:32 So let me get this right.
+
+00:03:33 At OpenAI, you're basically helping developers there have better developer tooling
+
+00:03:37 and common packages and workflows and stuff like that. 
+
+00:03:41 Is that kind of the story?
+
+00:03:42 That's right.
+
+00:03:43 Mostly around things that happen in CI, like running tests efficiently, figuring out the right tests to run,
+
+00:03:49 getting the right CI workers out.
+
+00:03:50 That sounds very exciting.
+
+00:03:51 Right in the epicenter of all the big tech stuff these days.
+
+00:03:56 Super cool.
+
+00:03:57 Rebecca, hello.
+
+00:03:58 Welcome.
+
+00:03:58 Hey, thanks for having me.
+
+00:04:00 I'm Rebecca.
+
+00:04:01 I've been on the Typing Council also for about three years, I think, since the, less than three,
+
+00:04:07 since the beginning.
+
+00:04:08 But my day job, I work at Meta on Python typing, on Pyrefly, which is a new type checker
+
+00:04:16 and language server written in Rust, still in beta.
+
+00:04:20 Prior to that, I was at Google for eight years, also on the Python team.
+
+00:04:24 I just, I really like Python.
+
+00:04:26 Yeah, super neat.
+
+00:04:27 I'm a big fan of both Pyrefly and ty, which will both have representatives here, I know.
+
+00:04:33 And I think it's just a super exciting time for Python types.
+
+00:04:37 And certainly that's one of the reasons.
+
+00:04:38 So very cool.
+
+00:04:39 Carl, welcome back.
+
+00:04:40 Thank you.
+
+00:04:41 Great to be here.
+
+00:04:42 Yeah, Carl Meyer.
+
+00:04:43 I currently work at Astral, where I work on ty, which is a Python type checker and language server
+
+00:04:50 written in Rust, also in beta.
+
+00:04:53 And yeah, I guess, how did I get into typing?
+
+00:04:55 Or I've been on the Typing Council, not since the beginning.
+
+00:04:59 I think it's been a year and a half.
+
+00:05:01 And yeah, I got into Python typing at the time in 2016, 2017.
+
+00:05:07 I was working at Instagram.
+
+00:05:08 And that was in the very early days of Python typing.
+
+00:05:12 PEP 484, PEP 483, the early Python typing PEPs had recently come out within the last couple of years. 
+
+00:05:19 And one of the co-authors of some of those PEPs, Łukasz Langa, was actually sitting at a desk
+
+00:05:24 right next to me at the time.
+
+00:05:25 And at some point, we started to think that we should try this Python typing stuff
+
+00:05:28 on the Instagram server monolith.
+
+00:05:31 And so I took that on as a side project.
+
+00:05:33 And then it eventually became the main project.
+
+00:05:35 And then it took like three years.
+
+00:05:37 So a lot of Python typing experience there.
+
+00:05:40 There absolutely is.
+
+00:05:40 You know, I think a couple of things I'd like to touch on there.
+
+00:05:43 First of all, Instagram, is it maybe the biggest Django deployment in the world?
+
+00:05:48 It's certainly one of the bigger ones, right?
+
+00:05:50 And I think a lot of people don't necessarily know that a core chunk of Instagram is actually Python, right?
+
+00:05:55 I mean, I don't know if we have any way to know how big the Django deployments in the wild might be.
+
+00:06:00 But it's certainly a big one.
+
+00:06:01 Yeah, it's definitely a big one.
+
+00:06:02 There were some talks about dismissing the garbage collector from the Instagram folks.
+
+00:06:08 That wasn't you giving the talk, but at PyCon.
+
+00:06:11 So that was pretty interesting.
+
+00:06:12 But I think actually that work that you're talking about, especially with Łukasz, really kind of opened
+
+00:06:19 a lot of people's eyes about Python typing, right?
+
+00:06:22 He gave a couple of PyCon talks, showed, you know, real metrics of how much of the code base is typed,
+
+00:06:28 how much it's changed, like error detection, that kind of stuff.
+
+00:06:33 So let me ask you, do you feel like it would be different?
+
+00:06:36 Would it have gone different now if tools like ty and Pyrefly existed back then?
+
+00:06:42 Is Python typing different now than it was then?
+
+00:06:44 Certainly, yes.
+
+00:06:45 I mean, there's been, the type system has gotten more complex over time. 
+
+00:06:49 So it is both more expressive and more complex.
+
+00:06:52 And yeah, we have more type checkers available now.
+
+00:06:56 I do agree that it's more complicated, and I don't know how to feel about that.
+
+00:06:59 It is more expressive, but I feel like it's starting to get, I mean, we're not at C++ ATL,
+
+00:07:06 like templates of templates of templates, but still, it's getting more serious.
+
+00:07:12 But I guess one of the really nice parts is that you can just take as much as you want
+
+00:07:16 of the complexity, and you can just leave the rest, right?
+
+00:07:19 That's part of the magic of Python typing, is that it's a gradual typing system.
+
+00:07:23 That's a choice people get to make.
+
+00:07:25 It can be none, it can be quite a bit, and anywhere in between.
+
+00:07:30 So I guess that's probably one of the decisions.
+
+00:07:33 Let's talk about the typing council.
+
+00:07:34 So when did the typing council come along, and did the typing council exist to create
+
+00:07:40 all of these PEPs and make this happen, or was it afterwards?
+
+00:07:43 Like, what's the history of the typing council and its purpose, folks?
+
+00:07:47 We'll run it.
+
+00:07:47 Yeah, it postdates most of the PEPs.
+
+00:07:49 So initially, the type system was created just through the regular PEP process.
+
+00:07:52 It means that something gets submitted, first still to Guido as the BDFL,
+
+00:07:57 later to the steering council.
+
+00:07:59 Meant that it's very hard to make changes to, like, this specification.
+
+00:08:03 Like, anytime you want to change something about how the type system would work,
+
+00:08:06 we had to go through this PEP procedure, talk to the steering council, who are very busy people,
+
+00:08:11 who deal with a lot of other aspects of the language other than typing. 
+
+00:08:14 So Shantanu and I came up with this idea of creating a separate council specifically in charge of typing,
+
+00:08:21 that would be in charge of a specification where we can make small changes ourselves
+
+00:08:25 without having to go through this whole PEP process.
+
+00:08:27 And this way, when all the type checkers agreed that something needs to go a certain way
+
+00:08:31 and it's not exactly what's in the PEPs, we can change it and have a place to record that
+
+00:08:36 and people can refer to it and new type checkers can also try to follow those decisions.
+
+00:08:41 Very interesting.
+
+00:08:41 I didn't realize that it was sort of, was there to allow for small changes
+
+00:08:46 to be made to make that much easier.
+
+00:08:48 But of course that makes sense because the PEP process is, it's pretty serious and drawn out.
+
+00:08:52 And we've seen even small language changes have quite passionate folks, I guess we should say.
+
+00:09:00 So yeah, yeah, very nice.
+
+00:09:02 Do you have any examples of the types of changes that y'all have, that have happened over the years
+
+00:09:06 that maybe were typing council only?
+
+00:09:09 One was the specification of how overloads work, which is perhaps not really a small change,
+
+00:09:14 but one of the most complicated features in the type system really is the overloads,
+
+00:09:17 where you can give multiple signatures for a function and type checkers sort of select
+
+00:09:22 which one to use based on the arguments when the function is called.
+
+00:09:26 And when it was initially created, from what I recall, there just wasn't really a specification.
+
+00:09:31 It's just like you use the signatures in a way that makes sense. 
+
+00:09:35 And Eric Traut, who's currently on the council, came up with a pretty specific procedure
+
+00:09:40 for exactly how overloads should work to make it so that type checkers have,
+
+00:09:45 well, sort of users can understand how it works and sort of type checkers can have something
+
+00:09:47 to work towards to make sure that they all handle overloads in the same way.
+
+00:09:51 Maybe a smaller example that is an example of something that would have been too small for a PEP
+
+00:09:56 and hard to accomplish before the typing council existed.
+
+00:10:00 And this is actually a change that I pushed through before the, before I was on the typing council,
+
+00:10:05 but the typing council approved it, was a clarification around the interpretation of data class fields.
+
+00:10:11 If a final annotation is applied to a data class field, does that mean, so if you apply a final annotation
+
+00:10:18 to a regular class attribute, since it can't be changed, that implies that it's a class variable.
+
+00:10:24 And there was a question of if that should be the interpretation with the data class or not.
+
+00:10:28 So we discussed that and made a clarification to the spec.
+
+00:10:31 I've never really thought about final being applied to a class field, but I've always used them
+
+00:10:36 sort of just for constants.
+
+00:10:38 But, you know, maybe people out there don't know, like typing.Final[type],
+
+00:10:43 right?
+
+00:10:44 That's kind of the way you can do constants in Python, right?
+
+00:10:47 Constants for the type checker.
+
+00:10:48 Nothing in the runtime will stop you from editing it.
+
+00:10:51 That's...
+
+00:10:51 Not there.
+
+00:10:52 Not there.
+
+00:10:52 I have some examples coming up and I'm interested to hear your thoughts on it,
+
+00:10:56 but for sure it's, there is this tension, right?
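A quick sketch of the typing.Final point just made: it is a constant for the type checker only, and nothing at runtime enforces it. The name here is made up for illustration:

```python
from typing import Final

MAX_RETRIES: Final[int] = 3  # a "constant" as far as the type checker is concerned

# A type checker reports this reassignment as an error, but at runtime
# nothing stops you from editing it -- the program runs without complaint.
MAX_RETRIES = 99

print(MAX_RETRIES)  # prints 99; the Final annotation had no runtime effect
```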
+
+00:11:00 I mean, I think that's probably worth touching on as well is this is a tension for Python
+
+00:11:04 in general is you can write all the types you want and then when you run your code,
+
+00:11:09 it just doesn't care.
+
+00:11:11 There's a few instances, Pydantic, FastAPI, a few others, but generally speaking,
+
+00:11:15 it's there for the editors and the type checkers and the linters and not for runtime, right?
+
+00:11:20 Yeah, that's right.
+
+00:11:21 There's many exceptions to that.
+
+00:11:23 There's a product like mypyc, which comes with mypy, that uses those types
+
+00:11:27 to compile your code into more efficient machine code.
+
+00:11:30 Maybe there's going to be more products like that in the future.
+
+00:11:32 I don't know.
+
+00:11:33 But yes, in general, it's separate from the runtime.
+
+00:11:36 Sort of a similar model to TypeScript where TypeScript gets compiled into JavaScript
+
+00:11:40 and types just go away.
+
+00:11:42 Here, we don't do a compilation step, but still the same idea of the types
+
+00:11:45 just not influencing the runtime.
+
+00:11:47 Although we do make them available for introspection via dunder annotations attributes,
+
+00:11:51 which is what has enabled projects like Pydantic and other sort of runtime checkers
+
+00:11:56 to make use of type annotations at runtime also.
+
+00:11:59 Yeah, I don't know if the typing council was around for this, but there was a proposal,
+
+00:12:03 I don't remember the exact details, but something to the effect of for type checking,
+
+00:12:07 not actually doing some of the full imports or something along those lines,
+
+00:12:13 right, where the runtime behavior would have made it hard for tools like Pydantic
+
+00:12:17 and others to get that.
+
+00:12:19 And there was some kind of compromise, right?
+
+00:12:21 I don't remember the details here.
+
+00:12:22 Anyone does?
+
+00:12:23 Yeah, what happened was that there was going to be a change.
+
+00:12:25 That's what the from __future__ import annotations import does, that changes all annotations
+
+00:12:29 into raw strings.
+
+00:12:30 So the default behavior until recently was that annotations are regular code.
+
+00:12:36 If you write def f() -> int and you import the module, it just looks up
+
+00:12:40 the name int and puts that in an annotations dictionary, which makes introspection easy,
+
+00:12:44 but it imposed costs on performance because memory usage sometimes was high
+
+00:12:50 and also made things harder to use sometimes because if you use a name that's not defined yet
+
+00:12:55 at runtime, you get an error.
+
+00:12:57 That often comes up if you have like a class that has a reference in an annotation
+
+00:13:01 to the class itself or circularly dependent classes.
+
+00:13:05 Right.
+
+00:13:06 The circular imports because you want to say this class is created by that thing
+
+00:13:11 and it returns one, but you know, somehow you've got to import the other one
+
+00:13:15 and that's such a hassle.
+
+00:13:17 Yeah, it's, yeah, even out in the audience we have, Tom says, circular imports.
+
+00:13:21 Oh, yeah, for sure.
+
+00:13:22 What about lazy imports?
+
+00:13:24 Like that just recently got accepted and will be in 3.15.
+
+00:13:27 Which I'm super excited about because I think it'll make app startup a lot faster
+
+00:13:32 for many use cases.
+
+00:13:34 But does that have knock-on effects for typing?
+
+00:13:36 Not that directly because I think for a type checker lazy imports mostly just look
+
+00:13:41 like regular imports.
+
+00:13:42 I guess I should maybe leave that for the people who are actually working on the type checkers
+
+00:13:46 that are being written right now.
+
+00:13:47 Yeah, Rebecca, do you see this making any difference for you?
+
+00:13:50 Lazy imports?
+
+00:13:51 To be honest, it's not something we've looked at too carefully yet.
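The postponed-annotations behavior described here can be seen in a small sketch (the class name is hypothetical): with the future import, annotations are stored as raw strings, so a method can reference its own class without a NameError.

```python
from __future__ import annotations  # PEP 563: annotations become raw strings

class Node:
    # Without postponed evaluation, older Pythons would evaluate "Node"
    # while the class is still being defined and raise NameError.
    def child(self) -> Node:
        return Node()

# The annotation is stored as the string "Node", not the class object:
print(Node.child.__annotations__["return"])
```

This is the same mechanism that made runtime introspection awkward for libraries like Pydantic, since the string has to be resolved back to a real type later.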
+
+00:13:55 3.15 seems a little more in the future, but I don't think it's likely to make a huge difference.
+
+00:14:03 Carl?
+
+00:14:04 I've thought about it briefly and I think that it, I think the type checkers
+
+00:14:07 really won't need to care.
+
+00:14:08 Maybe there will be some edge cases that will come up that I haven't thought of,
+
+00:14:11 but it shouldn't be a big deal.
+
+00:14:12 Yeah, that's what I thought as well.
+
+00:14:14 The one variation that I can certainly see is if you have a, if you have something
+
+00:14:20 specified in a type, like say for a field of a class or a Pydantic model
+
+00:14:25 or something that would otherwise not trigger the lazy import to become imported,
+
+00:14:30 would potentially having types specified cause more importing to happen sooner
+
+00:14:36 in the runtime?
+
+00:14:36 Yeah, there's actually an issue related to this that I think we may need
+
+00:14:39 to resolve before 3.15, but I don't know how yet.
+
+00:14:43 If you use a type in a data class annotation that's lazy imported, actually creating
+
+00:14:47 the data class will trigger the import.
+
+00:14:49 It will try to resolve the import and actually make it not lazy.
+
+00:14:55 This is because data classes doesn't really need to look at all of the annotations
+
+00:14:58 in your class, but it looks at them enough to trigger reification of the import.
+
+00:15:04 I shared this with some of the people on the lazy imports team, but we haven't
+
+00:15:07 yet come up with a good way around it.
+
+00:15:09 I think this might end up being a bit of a foot gun, so I feel like we should ideally
+
+00:15:12 find a workaround, but I don't know what it would be yet.
+
+00:15:15 I don't know that it's wrong that it converts it to an eager import, because it needs
+
+00:15:20 to know what it is potentially, right?
+
+00:15:22 It actually doesn't.
+
+00:15:24 Data classes just need to know whether it is ClassVar or not.
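Carl's point, that dataclasses only needs to recognize ClassVar (and InitVar) in an annotation, can be sketched like this; the Config class is a made-up example:

```python
from dataclasses import dataclass, fields
from typing import ClassVar

@dataclass
class Config:
    host: str
    port: int = 8080
    # dataclasses looks at the annotation just enough to spot ClassVar;
    # a ClassVar is excluded from __init__ and from fields().
    instances: ClassVar[int] = 0

print([f.name for f in fields(Config)])  # only the instance fields remain
print(Config("localhost").port)
```

That "just enough" inspection is exactly what reifies a lazy import today: even deciding whether an annotation is a ClassVar currently touches the annotated type.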
+
+00:15:27 I think that's pretty much all.
+
+00:15:28 I guess there's an InitVar also, but it doesn't really need to know anything else.
+
+00:15:32 So in theory, it should be possible to just say, hey, it is not a ClassVar, so don't bother
+
+00:15:37 importing it.
+
+00:15:38 Okay, so that's for data classes, but say if I specify a parameter type on a function.
+
+00:15:43 Yeah, then it should be fine.
+
+00:15:45 I guess, again, unless something introspects it, so if you have
+
+00:15:49 something like a decorator that looks at annotations in your function, it might reify
+
+00:15:53 those imports.
+
+00:15:54 There is one other potentially interesting thing for type checkers.
+
+00:15:57 It's already difficult for type checkers to figure out when like a submodule
+
+00:16:02 should be considered to be an attribute of the parent module because the way
+
+00:16:06 this happens in Python is that any import of a submodule anywhere will attach
+
+00:16:10 that submodule as an attribute on the parent module, but that at runtime,
+
+00:16:15 that could literally happen anywhere.
+
+00:16:17 It could happen in totally unrelated code outside of the module and a type checker
+
+00:16:20 probably won't be able to see that.
+
+00:16:22 So type checkers already have sort of complex sets of rules around where they look
+
+00:16:26 for these submodule imports and when they consider a submodule import to be reliably
+
+00:16:30 happening enough that it should, that the type checker should consider this submodule
+
+00:16:35 to exist as an attribute.
+
+00:16:38 And lazy imports will add one more wrinkle to those
+
+00:16:43 sets of heuristics in that we'll have to decide if you have a lazy import
+
+00:16:46 of a submodule in your dunder init.py, it's lazy.
+
+00:16:50 So should the type checker consider that submodule to be imported or not be imported?
+
+00:16:55 It'll be another case where there's no clear right answer and we'll just have
+
+00:16:58 to make a decision one way or the other.
+
+00:17:02 This portion of Talk Python is brought to you by Sentry.
+
+00:17:05 I've been using Sentry personally on almost every application and API that I've built
+
+00:17:10 for Talk Python and beyond over the last few years.
+
+00:17:13 They're a core building block for keeping my infrastructure solid.
+
+00:17:16 They should be for yours as well.
+
+00:17:18 Here's why.
+
+00:17:19 Sentry doesn't just catch errors.
+
+00:17:21 It catches all the stuff that makes your app feel broken.
+
+00:17:23 The random slowdown, the freeze you can't reproduce, that bug that only shows up
+
+00:17:28 once real users hit it.
+
+00:17:29 And when something goes wrong, Sentry gives you the whole chain of events
+
+00:17:32 in one place.
+
+00:17:33 Errors, traces, replays, logs, dots connected.
+
+00:17:36 You can see what's led to the issue without digging through five different dashboards.
+
+00:17:41 Seer, Sentry's AI debugging agent, builds on this data, taking the full context,
+
+00:17:46 explaining why the issue happened, pointing to the code responsible, drafting a fix,
+
+00:17:51 and even flagging if your PR is about to introduce a new problem.
+
+00:17:55 The workflow stays simple.
+
+00:17:56 Something breaks, Sentry alerts you, the dashboard shows you the full context,
+
+00:18:01 Seer helps you fix it and catch new issues before they ship.
+
+00:18:04 It's totally reasonable to go from an error occurred to fixed in production
+
+00:18:08 in just 10 minutes.
+
+00:18:10 I truly appreciate the support that Sentry has given me to help solve my
+
+00:18:14 bugs and issues in my apps, especially those tricky ones that only appear in
+
+00:18:19 production.
+
+00:18:19 I know you will too if you try them out.
+
+00:18:21 So get started today with Sentry.
+
+00:18:23 Just visit talkpython.fm slash Sentry and get $100 in Sentry credits.
+
+00:18:28 Please use that link.
+
+00:18:29 It's in your podcast player show notes.
+
+00:18:31 If you're signing up some other way, you can use our code talkpython26, all one word,
+
+00:18:36 talkpython26, to get $100 in credits.
+
+00:18:40 Thank you to Sentry for supporting the show.
+
+00:18:43 Yeah, there's some variations across type checkers, which we'll get to later.
+
+00:18:47 I think, though, before we move off this part about introducing the
+
+00:18:52 typing council.
+
+00:18:53 I think we should point out that there's two other folks who couldn't be here
+
+00:18:56 who are also on the typing council, Eric Traut and Jukka Lehtosalo?
+
+00:19:03 Sorry, Jukka.
+
+00:19:05 But I want to make sure that we point out there's actually five people, not just the
+
+00:19:08 three of you, right?
+
+00:19:09 How do you get on the council?
+
+00:19:11 Is there an election?
+
+00:19:13 Do you just apply?
+
+00:19:14 I think these are filled by the members themselves.
+
+00:19:16 So when somebody declares the intention to leave the council, we basically ask for
+
+00:19:20 people who are interested and then make a selection.
+
+00:19:23 Generally, we try to get people who have experience in the type system.
+
+00:19:27 We try to get a good cross representation of people working on different
+
+00:19:29 type checkers.
+
+00:19:30 We have Carl and Rebecca here who work on two type checkers, ty and Pyrefly.
+
+00:19:36 Eric works on Pyright and Jukka on mypy, which are two of the most widely used type checkers.
+
+00:19:41 So we try to get representation of people working on those parts of the ecosystem.
+
+00:19:46 That's really cool that it's got a bias towards finding people actually doing the work.
+
+00:19:50 So let's talk about the specification project at typing.python.org.
+
+00:19:55 What is this here?
+
+00:19:56 I'll talk a bit about it.
+
+00:19:58 I guess it's a specification for how the type system is supposed to work.
+
+00:20:03 The way it started was that, Jelle, you basically took all the typing PEPs
+
+00:20:07 and like stapled them together, right, to make like one long doc.
+
+00:20:12 And since then, we've been iterating on it, filling in parts that were missing
+
+00:20:16 like overload evaluation and making other changes as well.
+
+00:20:21 Yeah, it's tricky, right?
+
+00:20:22 Because traditionally, the typing system is kind of defined across a series of
+
+00:20:28 PEPs.
+
+00:20:28 And so what is the document that tells you how it works, right?
+
+00:20:31 Yeah, that made it hard because often there's PEPs built on top of each other.
+
+00:20:35 So then in the extreme, you might see like one thing in one PEP and then there's
+
+00:20:39 another PEP that adds an aspect of it, another one that adds another aspect.
+
+00:20:43 And overall it makes it very hard to follow.
+
+00:20:44 One of the things I did recently was rewrite the TypedDict chapter.
+
+00:20:55 Ended up rewriting the whole thing to basically put all those features together
+
+00:20:58 in a coherent whole rather than just having them all copy-pasted one after the
+
+00:21:04 other.
+
+00:21:04 Okay, so if somebody really wants a good understanding of the Python typing system,
+
+00:21:09 they go to typing.python.org.
+
+00:21:11 You know, one thing I think maybe is worth touching on, it's just kind of out of the
+
+00:21:15 blue a bit, but I think it's a really interesting aspect of the Python typing system
+
+00:21:20 is the, what is it called, the numerical tower or the number tower, where
+
+00:21:24 it's like, if I have a number, I could specify it as an int, or I could specify
+
+00:21:29 it as a float, and those kinds of things, but do you really need to say it's an
+
+00:21:34 int | float, or a union of int and float, if it could be either, right?
+
+00:21:39 And the, what is it called?
+
+00:21:40 It's the numerical tower, right?
+
+00:21:42 Yeah, there are different towers too.
+
+00:21:43 In Python, there's also this thing called the numbers module that you have there, that's
+
+00:21:48 just basically ignored by the type system.
+
+00:21:50 It's been useful for some people, but I feel like in general that module just
+
+00:21:52 hasn't worked out very well.
+
+00:21:55 I think the interesting aspect is that you know, that you can say it's a
+
+00:21:59 float, and that's basically equivalent to a union of int and float, and so
+
+00:22:04 on, right?
+
+00:22:05 I think the typing of numbers in Python is pretty interesting.
+
+00:22:08 I think every type checker has a different interpretation of what a float
+
+00:22:13 annotation actually means.
+
+00:22:16 It's an area of some lack of clarity in the spec.
+
+00:22:19 Yeah, a lot of contentiousness.
+
+00:22:20 If we could go back in time, knowing what I know now, I would probably advocate
+
+00:22:25 for things being done differently because, like, in the beginning, you know, there
+
+00:22:30 were multiple things, like, with a similar flavor.
+
+00:22:33 Like, there was also one where you could give a parameter a non-None annotation and
+
+00:22:39 default it to None for convenience, and we've largely, like, moved away from stuff like
+
+00:22:44 that in favor of explicitness.
+
+00:22:46 Yeah, what the current spec says is that basically if you have a function that takes
+
+00:22:49 a float, you're also allowed to pass an int.
+
+00:22:52 That's not really enough.
+
+00:22:54 It doesn't tell you how these things work in all cases, and we've had some
+
+00:22:59 attempts to try to come up with a way to specify that special case in a way that
+
+00:23:04 makes more sense, at least makes more sense to me.
+
+00:23:07 It's been very contentious.
+
+00:23:08 People have very strong opinions about this.
+
+00:23:10 I guess non-obvious is what I'd like to say, really, honestly.
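The special case being debated, that an int is accepted wherever a float is annotated even though int is not a runtime subclass of float, can be sketched like this (the function name is illustrative):

```python
def scale(value: float) -> float:
    # Per the typing spec's numeric special case, type checkers accept an
    # int argument here, even though the runtime has no such subtyping.
    return value * 2.0

print(scale(3))                # an int call site is fine with checkers
print(scale(3.5))
print(issubclass(int, float))  # False: no runtime relationship backs this up
```

That runtime/checker mismatch is why the exact meaning of a float annotation is one of the contentious corners of the spec.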
+
+00:23:13 So I'd like to get the official council's thoughts on this.
+
+00:23:17 When is typing too much typing?
+
+00:23:20 I made the joke about C++ ATL, if you've ever worked with that, it's like
+
+00:23:24 a template class where templated classes are part of the concrete type of the
+
+00:23:30 template.
+
+00:23:30 It's just off the hook.
+
+00:23:32 There's certainly places where typing can be too much, and a lot of the purity of Python
+
+00:23:38 or the readability of Python is the fact that it's got so few symbols.
+
+00:23:44 And so adding types adds context, but it also makes it a little harder to read.
+
+00:23:49 When is it too much typing?
+
+00:23:50 When do you recommend typing?
+
+00:23:53 Rebecca, I'll let you go first, but what are your thoughts on how much typing should
+
+00:23:57 I use in Python?
+
+00:23:59 I'll give you what is my official stance, which is that if you want your
+
+00:24:04 type checker to work well, you should type annotate your API boundaries.
+
+00:24:09 So parameters and returns in public functions, public class attributes, things
+
+00:24:14 like that, and even things that seem truly trivial, like, oh, this function returns
+
+00:24:18 None, better to annotate it because, you know, someone else might be depending on
+
+00:24:24 your library and consuming that type information.
+
+00:24:27 I will say personally, what I tend to do is I annotate things that I think are
+
+00:24:32 non-trivial because I want to see that as documentation.
+
+00:24:37 And if something, you know, a function that does return None, to be honest, I will
+
+00:24:42 probably forget to annotate it half the time because I'll be like, I honestly don't
+
+00:24:46 need to see it.
+
+00:24:47 One of the interesting features of the Pyrefly VS Code extension, that's the only
+
+00:24:53 one I can speak of at the moment, and Carl, you've got to tell me if the
+
+00:24:56 ty one does this as well, is it will sort of overlay its belief of what types
+
+00:25:03 are.
+
+00:25:03 Like, if there's, you say x equals a function return value and it knows what the
+
+00:25:07 function returns, it'll have a gray, like, colon int, if it returned an int or something.
+
+00:25:11 So you can kind of read the code and see what the types are without actually putting
+
+00:25:17 it into the text of the code.
+
+00:25:19 It's only within the editor.
+
+00:25:20 Does ty do something like that, Carl?
+
+00:25:22 Yes, we also have inlay type hints.
+
+00:25:23 Yeah, inlay type hints, that's what it's called.
+
+00:25:25 So, yeah, I don't know, that also brings an interesting challenge, not a challenge,
+
+00:25:28 like a wrinkle to the recommendation of should I put types on, like, the return value
+
+00:25:35 because I want to know that's a list of user, not a list of user IDs or whatever, for
+
+00:25:39 example, like a list of UUID.
+
+00:25:41 But if it's going to show up anyway in the editor, maybe I don't have to write
+
+00:25:45 that, right?
+
+00:25:45 And so that becomes sort of somewhere where you could debate again, I think.
+
+00:25:50 However, I do 100% agree with you, Rebecca, that you should put it on your API boundaries.
+
+00:25:54 If, like, this is the place that people get into some part of your code and they
+
+00:25:58 don't know or don't want to know about the inside of it, having types there is really helpful
+
+00:26:03 both for editors, for type checkers, and just for reading code, and even for AI, which
+
+00:26:08 is a crazy world.
+
+00:26:09 Yeah.
+
+00:26:09 Carl?
+
+00:26:10 What are your thoughts here?
+
+00:26:11 How much typing is too much typing?
+
+00:26:13 What are the guidelines here?
+
+00:26:14 I think I agree with Rebecca's answer.
+
+00:26:16 I mean, that one place you definitely want to have explicit type annotations is at
+
+00:26:21 API boundaries, the public API of a library, etc.
+
+00:26:24 In terms of what's too much typing, I mean, there are certainly patterns
+
+00:26:28 that have historically been used in Python that we still can't express well
+
+00:26:34 in the type system, or that require extremely complex type annotations to
+
+00:26:39 express well, and I think there it becomes a judgment call.
+
+00:26:43 If it's like a core, widely used API, you may get a lot of benefit from some
+
+00:26:48 very complex and verbose annotations, and so then it's worth sort of going
+
+00:26:53 through that pain and the pain of adding them and of reading them in order to get that
+
+00:26:57 additional typing coverage everywhere you use that API.
+
+00:27:01 If it's much less frequently used code that's highly dynamic, maybe it's not worth
+
+00:27:05 it in that case.
+
+00:27:06 I think there's a lot of judgment calls here.
+
+00:27:08 What about like one-off scripts?
+
+00:27:10 You know, I'm going to write this thing to just move this data from here to there,
+
+00:27:13 and once it's moved, I don't need it again.
+
+00:27:15 It's done with that old system, we're going to the new one.
+
+00:27:17 Maybe less typing.
+
+00:27:18 Yeah, I think that's what's useful for you.
+
+00:27:20 Often I feel like one-off scripts are not really one-off like maybe you want to move some
+
+00:27:24 similar data later, and then it's useful if you can understand your code again,
+
+00:27:27 if you want to read what you did.
+
+00:27:29 You thought you didn't need it again, and all of a sudden it's six months old,
+
+00:27:32 you don't understand it, and then the types help a lot, right?
+
+00:27:35 Yeah.
+
+00:27:36 Jelle, what's your advice?
+
+00:27:38 What Carl and Rebecca said makes sense to me too.
+
+00:27:40 I think types have advantages in terms of documenting for human readers what
+
+00:27:45 is going on, and in terms of catching mistakes that otherwise would not be
+
+00:27:49 caught until runtime perhaps.
+
+00:27:51 They have costs in maybe making your code harder to read if there's too much
+
+00:27:54 going on.
+
+00:27:54 So add types as long as those benefits outweigh the costs.
+
+00:27:57 Yeah.
+
+00:27:58 I mean, do you recommend to anyone that they just 100% go full like C++,
+
+00:28:04 C# on it, and just type every single thing?
+
+00:28:08 Is there an advantage like for static type checkers, you know, like mypy type stuff
+
+00:28:12 you can run across and get that?
+
+00:28:14 I mean, you could do that with Pyrefly or ty and the CLI as well, but you know,
+
+00:28:18 thinking more mypy is like kind of being real strict on some of that stuff.
+
+00:28:21 Personally, I do tend to annotate almost all, like, function parameters and class
+
+00:28:26 attributes, if I make a class, sometimes it's not as necessary, like you don't
+
+00:28:30 really need to annotate your tests perhaps, or you don't need to annotate
+
+00:28:33 internal functions as much, but for my own coding, I usually find it helpful
+
+00:28:36 to do that.
+
+00:28:37 But sometimes I see people annotating even local variables where it's very
+
+00:28:41 obvious to the type checker what the type is and it can just infer it reliably, and then
+
+00:28:46 it really just adds noise and you shouldn't do it.
+
+00:28:48 Yeah, exactly.
+
+00:28:48 If you've got a function that's annotated with a return value and you say x equals
+
+00:28:53 the function call, then the type checkers can infer that and you're just causing
+
+00:28:57 extra noise, I guess.
+
+00:29:00 So suppose you all want to change something.
+
+00:29:03 What's the process of actually going through and making some changes?
+
+00:29:08 There are mostly sort of two levels of this.
+
+00:29:10 Well, maybe there's even three levels.
+
+00:29:11 The first one is if it's something that's so small that's just like a wording
+
+00:29:15 clarification or something, we just make a PR to the repo and a few of us
+
+00:29:19 look at it and we change it.
+
+00:29:21 The second level is when it's sort of a smaller change that doesn't really
+
+00:29:25 introduce a new feature and then we make a PR to the typing spec repo and
+
+00:29:30 we formally have all of us sign off on it.
+
+00:29:32 That's what happened with what Carl mentioned earlier, the Final change in data
+
+00:29:37 classes.
+
+00:29:37 It's had to merge this one, yeah, add Carl.
+
+00:29:41 I love it.
+
+00:29:42 This repo itself doesn't have anything.
+
+00:29:44 It's the Python typing repo where the decisions are made.
+
+00:29:48 The typing council just has like some documentation.
+
+00:29:52 Yeah.
+
+00:29:52 And then the third level is PEPs, like really big new changes.
+
+00:29:55 You can still write a PEP and then we make a recommendation and the steering
+
+00:29:59 council makes a decision eventually.
+
+00:30:00 So if I wanted to suggest something, I could come up here and I could open up
+
+00:30:05 an issue, maybe start a conversation on the python/typing repo.
+
+00:30:10 And you can make a pull request to change the spec.
+
+00:30:12 Okay.
+
+00:30:12 And so the pull request would not be to change the code, like how Python
+
+00:30:17 maybe interprets code that has this new thing, but to suggest that the spec
+
+00:30:22 has it, which then would start a process that ultimately might make CPython
+
+00:30:26 understand it, right?
+
+00:30:27 Well, CPython itself probably doesn't do anything with it.
+
+00:30:30 I guess most of the things that go directly here are changes to how to interpret
+
+00:30:33 things that are already in CPython.
+
+00:30:37 This portion of Talk Python To Me is brought to you by us.
+
+00:30:40 I want to tell you about a course I put together that I'm really proud of, Agentic
+
+00:30:45 AI Programming for Python Developers.
+
+00:30:48 I know a lot of you have tried AI coding tools and come away thinking, well, this is
+
+00:30:52 more hassle than it's worth.
+
+00:30:54 And honestly, all the vibe coding hype isn't helping.
+
+00:30:58 It's a smokescreen that hides what these tools can actually do.
+
+00:31:01 This course is about agentic engineering, applying real software engineering
+
+00:31:06 practices with AI that understands your entire codebase, runs your tests, and
+
+00:31:11 builds complete features under your direction.
+
+00:31:14 I've used these techniques to ship real production code across Talk Python,
+
+00:31:18 Python Bytes, and completely new projects.
+
+00:31:21 I migrated an entire CSS framework on a production site with thousands of lines of HTML
+
+00:31:25 in a few hours, twice.
+
+00:31:28 I shipped a new search feature with caching and async in under an hour.
+
+00:31:32 I built a complete CLI tool for Talk Python from scratch, tested, documented, and
+
+00:31:38 published to PyPI in an afternoon.
+
+00:31:41 Real projects, real production code, both Greenfield and legacy.
+
+00:31:46 No toy demos, no fluff.
+
+00:31:48 I'll show you the guardrails, the planning techniques, and the workflows that
+
+00:31:51 turn AI into a genuine engineering partner.
+
+00:31:54 Check it out at talkpython.fm slash agentic dash engineering.
+
+00:31:58 That's talkpython.fm slash agentic dash engineering.
+
+00:32:01 The link is in your podcast player's show notes.
+
+00:32:05 If it's adding something new, it will usually need to go through a PEP,
+
+00:32:09 except if it's something very small.
+
+00:32:10 Let's talk about that for a minute.
+
+00:32:11 We got two representatives here of the newer breed of tools.
+
+00:32:17 What's the story for inconsistencies across interpretations of the spec?
+
+00:32:23 I know that there's slight variations.
+
+00:32:25 I've also, you know, not putting either of you on the spot, but like using, say,
+
+00:32:31 PyCharm and like writing code.
+
+00:32:33 So its type checker's happy, and then using something like Pyright.
+
+00:32:37 And so it has a real different interpretation of what you should let slide and what you
+
+00:32:42 shouldn't.
+
+00:32:43 I feel like Pyright is more, much more focused on, like, enforcing the nullability, or
+
+00:32:49 the lack thereof.
+
+00:32:50 And it warns of inconsistencies there where PyCharm doesn't seem to care as much.
+
+00:32:54 I don't know which one I like better, but I know they're different.
+
+00:32:57 And if I write code in one and then I open the other, I'm like, huh, why is it
+
+00:33:00 upset?
+
+00:33:00 It seemed like it was fine.
+
+00:33:02 How do you all navigate this?
+
+00:33:03 Yeah.
+
+00:33:03 One thing useful to say about the spec there is that the spec covers a lot of
+
+00:33:07 things.
+
+00:33:08 In particular, it tends to cover sort of the details of more advanced type
+
+00:33:11 system features.
+
+00:33:12 But there's a lot of very fundamental stuff about how a type checker works
+
+00:33:17 in terms of how it does inference and how it does type narrowing.
+
+00:33:21 And even in some cases, like you mentioned, you know, what it chooses to
+
+00:33:24 emit errors on, that isn't really covered by the spec, partly maybe because we
+
+00:33:29 haven't gotten to it and also partly intentionally in that there may be room in
+
+00:33:35 some of those cases for different type checkers to work differently if they're
+
+00:33:38 serving different needs.
+
+00:33:39 Like if PyCharm is primarily concerned about being a useful kind of IDE and providing go-to
+
+00:33:45 definition and that sort of thing, maybe emitting lots of warnings or errors
+
+00:33:50 on all kinds of things where your code might be doing something wrong isn't
+
+00:33:54 as high a priority.
+
+00:33:54 And another type checker might have a different priority.
+
+00:33:57 One thing I do want to mention is that it may not seem like it, but things are already
+
+00:34:04 much better than they used to be.
+
+00:34:07 Like previously, I worked on a different type checker called pytype.
+
+00:34:11 And at that time, it was, you know, sort of the Wild West.
+
+00:34:14 Like we wanted to know how other type checkers, like, do something.
+
+00:34:18 Well, you know, like open up the mypy playground, open up the Pyright playground,
+
+00:34:22 see what it tells you.
+
+00:34:25 Now we at least have a spec and conformance tests.
+
+00:34:28 Yeah, that's really cool.
+
+00:34:29 How much would you say that your two type checkers maybe bring in mypy as well?
+
+00:34:36 Like how much do they agree versus disagree?
+
+00:34:38 You know, like you only see the differences.
+
+00:34:40 You don't see in which ways that they are the same as a consumer of them so much,
+
+00:34:44 right?
+
+00:34:44 You're like, why is this one squiggly when it wasn't squiggly before?
+
+00:34:47 But how similar or different are they?
+
+00:34:49 I don't know how we would quantify that.
+
+00:34:51 I think there's a lot that is the same just because it's based on how Python actually
+
+00:34:55 works.
+
+00:34:56 We're both trying to model the same language and then there's certainly also
+
+00:34:59 plenty of differences or things that we handle differently.
+
+00:35:02 So Rebecca, do you have a better way to quantify that?
+
+00:35:04 Yeah, I agree.
+
+00:35:05 It's hard to quantify. I suppose I can talk a bit abstractly about various type
+
+00:35:11 checkers' philosophies.
+
+00:35:13 With Pyrefly, we really try to do a lot of type inference.
+
+00:35:17 So that's a way in which we intentionally diverge a bit from mypy.
+
+00:35:21 But other than that deliberate decision, if we see ways in which we are accidentally
+
+00:35:25 different, we do try to fix that because otherwise people would have a hard time running
+
+00:35:31 multiple type checkers or migrating.
+
+00:35:33 Yeah, differences obviously cause pain for users who are using multiple type
+
+00:35:37 checkers or writing libraries that need to support multiple type checkers.
+
+00:35:40 So like Rebecca said, it's like if we are different from other type checkers, we want
+
+00:35:45 to be sure that there's a good reason for that difference.
+
+00:35:47 The difference should be because of philosophical choice, not just you happen to have
+
+00:35:52 chosen slightly differently, right?
+
+00:35:54 Yeah, and it's not just people who run different type checkers.
+
+00:35:58 Like you pointed out, Carl, a lot of times it is if I have a library and
+
+00:36:02 then different people want to consume that library, then their type checker
+
+00:36:06 may or may not warn them about how my library declares its types and so on.
+
+00:36:11 I'll give you a real quick example.
+
+00:36:14 I have a, I can't remember which one it was, I have three or four different open
+
+00:36:19 source libraries that I've created that somehow work with creating, basically passing
+
+00:36:24 data to templates in web apps, right?
+
+00:36:27 So one is like I want to use the Chameleon web template framework, but with
+
+00:36:31 FastAPI or with Flask or there's some other variations like partials and so on.
+
+00:36:35 I can't remember which one, but it doesn't really matter.
+
+00:36:37 One of them decorated a Flask,
+
+00:36:40 I think it was Flask, but it's essentially irrelevant,
+
+00:36:43 a Flask endpoint, and Pyright was really upset.
+
+00:36:47 Like the error message filled the entire page of how it was inconsistent with
+
+00:36:53 what it expected for the definition of the Flask view method.
+
+00:36:56 I'm like, no one is going to call this.
+
+00:36:58 Like what does it even matter what this type is?
+
+00:37:01 It still runs fine.
+
+00:37:02 The runtime is fine.
+
+00:37:03 You know, it's no problem with this decorator.
+
+00:37:05 It worked fine, but something about the way that the Flask @get returned
+
+00:37:11 the type versus what my thing returned varied in like a really slight way.
+
+00:37:16 I didn't care, but somebody was using some editor that used Pyright and they're like, you
+
+00:37:21 have to help fix this.
+
+00:37:22 I can't take all these warnings.
+
+00:37:23 They're huge and they're everywhere.
+
+00:37:25 Like, okay, I'll go fix it.
+
+00:37:27 Right.
+
+00:37:27 And I went and I put way more effort than was justified into a function type
+
+00:37:32 that no one ever calls just to make the errors on some type checker I didn't use go
+
+00:37:38 away.
+
+00:37:38 Right.
+
+00:37:39 And that's the kind of thing where it becomes just a headache.
+
+00:37:41 I don't know.
+
+00:37:42 I wish I remembered.
+
+00:37:42 I probably got that written down in an issue somebody filed, but it was, it was
+
+00:37:46 a gnarly error.
+
+00:37:47 And, or if you're working on an open source project, you know, you can't make
+
+00:37:50 everybody use the same editor that wants to contribute on a big project.
+
+00:37:55 And so you might run into this variation as well.
+
+00:37:57 So there's a lot of cases.
+
+00:37:58 Yeah.
+
+00:37:58 It can be really difficult to make these decisions about what kind of, what
+
+00:38:03 sorts of errors people want their type checker to catch or what's too pedantic.
+
+00:38:08 You want your type checker to catch non-obvious errors, not just the obvious
+
+00:38:12 ones that you probably would have seen by looking at the code yourself.
+
+00:38:15 But then there'll be cases where somebody says, well, I don't care.
+
+00:38:18 That's too pedantic.
+
+00:38:19 And it is difficult to make everyone happy.
+
+00:38:21 Who decides what the right signature of a Flask view endpoint should be? Like, if
+
+00:38:27 the framework can call it,
+
+00:38:28 it should be okay.
+
+00:38:29 There's not.
+
+00:38:30 Just because it had a decorator before, that doesn't mean that's the official structure.
+
+00:38:33 But anyway, I do think one of the bigger philosophical differences has to do around this
+
+00:38:39 concept of nullability.
+
+00:38:41 Do you guys call it nullability or none ability?
+
+00:38:44 Like nullability comes from the other languages.
+
+00:38:46 And by that, I mean, I can specify that I have an integer.
+
+00:38:49 And in the Python type system, it cannot be set to none, even though in the runtime it can.
+
+00:38:55 It has to be a concrete int type unless you make it an optional int or an
+
+00:39:00 int pipe none or one of those type things, right?
+
+00:39:03 And how strong that gets enforced seems to be one of the biggest differences
+
+00:39:07 of opinion that I've seen around.
+
+00:39:10 Like, how do you all think about that?
+
+00:39:11 That's interesting to me that that's your experience because my experience has been that
+
+00:39:16 that's actually an area where everyone seems to agree as far as I can tell that these errors are
+
+00:39:21 an important source of bugs and it's better to catch them.
+
+00:39:23 So I think all of the type checkers, maybe you said PyCharm doesn't.
+
+00:39:26 I don't think PyCharm does that.
+
+00:39:28 I'm pretty sure it doesn't because I agree that it's an important thing to check, but it's
+
+00:39:34 also a point of a lot of friction.
+
+00:39:37 And by that, I mean, let's suppose I'm going to have a class that I need to create an
+
+00:39:42 instance of and then put values into.
+ +00:39:45 And I know once I put the values into it, let's say it has a user ID, I know for certain that + +00:39:50 that's going to be an integer, right? + +00:39:53 So I'd like to say user ID colon int because everywhere I use that object later, if it's a + +00:39:58 function that takes an int and I specify it as optional int, I will get a type check warning + +00:40:03 every single call site when I try to pass that. + +00:40:07 But I know from the semantics of the behavior that it's going to always be an int + +00:40:12 unless it's not initialized, right? + +00:40:14 And like in this short period where I want to create it. + +00:40:17 So I can't set the type to int. + +00:40:19 I have to set the optional int until I've loaded it. + +00:40:21 And, but there's like this, I don't know, that's, that's the part where I see a lot of it + +00:40:25 show up is inconsistencies and then warnings all over the place. + +00:40:29 So I'm like, well, but that function is actually checking if it's none and it'll return null, + +00:40:35 you know, none or something like that. + +00:40:36 So I totally agree with you. + +00:40:38 It's just somewhere I've seen the most inconsistencies across maybe PyCharm versus + +00:40:43 others. + +00:40:44 mypy also has a legacy mode for not checking none things called non-strict optional. + +00:40:50 We're trying to get rid of that from mypy because yeah, strict optional, like being + +00:40:54 strict about it is the more sensible thing to do. + +00:40:56 But it's possible that you've seen that too. + +00:40:59 Yeah, I agree. + +00:40:59 So what you mentioned is maybe sort of a special case of the case where you pass + +00:41:03 something to a class and there's initialization that changes the types. + +00:41:06 Doesn't necessarily have to deal with none. + +00:41:08 It could also just be like the attribute doesn't exist at all beforehand + +00:41:10 or something. + +00:41:11 Yeah, we don't have a good solution for that. 
+
+00:41:13 Maybe there's room for something to support that use case better.
+
+00:41:17 I don't know what it would look like.
+
+00:41:18 In some cases, there's ways you can, these things can sometimes nudge you towards a
+
+00:41:22 different design that is actually safer and will avoid errors.
+
+00:41:26 Like in the kind of case you're talking about, you know, is it actually necessary
+
+00:41:30 that an uninitialized object and an initialized one are represented by the
+
+00:41:34 same type?
+
+00:41:34 Or is there a way to adjust the API so that those are actually different types, then you
+
+00:41:38 solve the problem and your code is safer or so?
+
+00:41:41 I'm thinking like you submit a web form and before you parse it, you've got to create the instance
+
+00:41:46 to set the values.
+
+00:41:47 And I don't know.
+
+00:41:48 It's not worth diving into, but I do find this differentiation between like
+
+00:41:52 the strict enforcement of none versus not none.
+
+00:41:55 I think it's powerful and I do think you all are right that it does catch a lot of
+
+00:41:58 errors.
+
+00:41:58 It's just, it's just a difference and it's just an interesting, interesting
+
+00:42:02 choice.
+
+00:42:02 But I didn't get a concrete answer from the official council.
+
+00:42:06 Nullable or noneable?
+
+00:42:08 What is it?
+
+00:42:09 I feel like you just don't really even talk about it as a term mostly.
+
+00:42:12 It's, yeah, none is special in the type system in like how you represent
+
+00:42:16 it, but it's not really special in other ways.
+
+00:42:19 You don't have a term for int pipe none?
+
+00:42:22 Int or none.
+
+00:42:22 Historically, the term was optional, although I think that term has problems and
+
+00:42:26 we're sort of moving away from it because specifically one problem is that optional can mean
+
+00:42:33 you don't have to pass it in, let's say, as a function parameter.
+
+00:42:37 Let's talk a little bit about typeshed.
+
+00:42:40 I think typeshed is pretty interesting.
+
+00:42:41 Maybe people don't know too much about it.
+
+00:42:44 So I'm sure you all are familiar with this project that you can basically add
+
+00:42:48 type information that the libraries didn't bother to include for you, right?
+
+00:42:53 What are your thoughts on typeshed?
+
+00:42:54 How much do you all lean on this to sort of round out missing types?
+
+00:42:58 There are two parts to typeshed, right?
+
+00:43:00 There's the standard library type stubs, which I think are invaluable.
+
+00:43:06 Like all the type checkers use those.
+
+00:43:08 And I mean, will the standard library itself ever have inline types?
+
+00:43:12 Who knows?
+
+00:43:12 This might be around forever.
+
+00:43:14 And then there are also the third party stubs.
+
+00:43:18 And I think that's what you're describing.
+
+00:43:20 They're libraries that for whatever reason don't ship with stubs themselves.
+
+00:43:24 Those are in typeshed.
+
+00:43:26 And I think it's been like for a while, there's sort of been a question of like what
+
+00:43:31 we want to do with like typeshed's third party stubs, right?
+
+00:43:35 Because like ideally like libraries would ship with their own types, but there
+
+00:43:39 are various obstacles to that.
+
+00:43:41 The obstacles that I know of used to be like, we want this to run on Python 2 and Python 3.
+
+00:43:47 Or we want it to run on Python 3.3 still.
+
+00:43:50 But it's been a long time since any non-type supporting version of Python was a real,
+
+00:43:57 you know, a supported type of thing, right?
+
+00:43:59 I mean, even 3.9 became deprecated.
+
+00:44:02 So on one hand, I feel like they could be merged in, but there's also a lot of
+
+00:44:07 other areas that are maybe we don't, they're not common, right?
+
+00:44:12 Like other libraries, like pick some, let's say Pyramid.
+
+00:44:16 I don't think the Pyramid web framework really ever got types added to it.
+
+00:44:20 Somebody could go and create a typeshed stub or a types underscore pyramid you could
+
+00:44:26 pip install and then we'll add the types, right?
+
+00:44:28 I certainly see it being really valuable for third party things that are just
+
+00:44:31 not going to get the type attention they need.
+
+00:44:33 Yeah, I think typeshed is great.
+
+00:44:35 I've spent a lot of time on improving it.
+
+00:44:37 As Rebecca said, especially with the standard library, it's irreplaceable.
+
+00:44:41 For third party libraries, I think it's become less needed over time.
+
+00:44:45 It used to be that very few third party libraries had any types.
+
+00:44:49 Now that's obviously changed.
+
+00:44:51 A lot of libraries ship their own types, but still there are quite a few
+
+00:44:56 libraries left where there aren't inline types and typeshed can provide
+
+00:45:00 useful types.
+
+00:45:01 I think typeshed also provides a service because it has a really great framework for testing
+
+00:45:05 these types.
+
+00:45:06 We have tools like stubtest and various type checkers that help to make sure these types are
+
+00:45:12 good and meet a high standard.
+
+00:45:15 So yeah, I think they're still useful for many libraries.
+
+00:45:17 Yeah, I was just looking at the types dash flask and I guess it must be, must
+
+00:45:23 be gone because now Flask must have it internally.
+
+00:45:26 So it's kind of an interim sort of thing.
+
+00:45:28 That's pretty cool.
+
+00:45:29 In general, typeshed has the policy that we remove the stubs from typeshed if
+
+00:45:32 they are in the library itself.
+
+00:45:34 I find these super valuable because if there's a library I want to work with and it just doesn't
+
+00:45:39 have types for whatever reason, you can install stuff from here and all of a sudden your editor's
+
+00:45:43 way happier.
+
+00:45:44 I mean, I know we, you all agreed on like the API boundaries and I did as well.
+
+00:45:50 It's like that's one of the really cool things.
+
+00:45:51 The other thing that really makes me excited about types is if I hit dot in my editor, I get
+
+00:45:57 a meaningful list of real information about what I'm working on.
+
+00:46:00 And so adding, adding these types of things are pretty interesting.
+
+00:46:04 I want to ask you all about sort of these rogue, rogue tools that do stuff
+
+00:46:10 with Python typing that maybe y'all didn't intend.
+
+00:46:12 Like we all mentioned Pydantic, we've got Typer and FastAPI, but even a little farther out
+
+00:46:19 there is beartype.
+
+00:46:21 Are you familiar with beartype?
+
+00:46:22 Yeah.
+
+00:46:22 Beartype's interesting.
+
+00:46:24 You can import, they have fun.
+
+00:46:27 They have fun with their, their import names and stuff.
+
+00:46:31 But basically you can put a, either a decorator onto some sort of call site or
+
+00:46:36 something, or you can just do it to an entire package or entire modules rather.
+
+00:46:41 So just run beartype.claw import beartype this.
+
+00:46:45 And then it actually turns into runtime type checks.
+
+00:46:49 Good idea, bad idea.
+
+00:46:51 Interesting.
+
+00:46:52 What do you all think?
+
+00:46:53 So un-Pythonic, you won't even open the webpage.
+
+00:46:56 People should feel free to write whatever code helps them make like better
+
+00:47:00 software.
+
+00:47:01 I haven't really used beartype much myself, but it's clearly useful for
+
+00:47:04 some people.
+
+00:47:05 And I think generally in designing a type system, we should try to accommodate all users who
+
+00:47:09 do useful things with the type system.
+
+00:47:10 And that includes things like Pydantic or beartype.
+
+00:47:13 It's pretty fast.
+
+00:47:14 It's not as big of a hit as you would, you would imagine.
+
+00:47:18 They, let me see, what are they, somewhere they had a really fun, fun saying in here, but here
+
+00:47:23 we go.
+
+00:47:24 Beartype brings Rust- and C++-inspired zero-cost abstractions into the lawless world of dynamically
+
+00:47:29 typed Python by enforcing type safety at the granular level of functions and
+
+00:47:33 methods against type hints standardized by the Python community in
+
+00:47:37 O(1) non-amortized worst-case time with negligible constant factors.
+
+00:47:41 Like, how about that?
+
+00:47:43 No, it's a pretty neat library and it's pretty fast.
+
+00:47:45 I honestly, I've never used it in production.
+
+00:47:47 Having type hints and squigglies in the editors or in the linters has always
+
+00:47:53 been enough for me, but I can see using this if it's really critical and you're
+
+00:47:57 having issues, maybe you want to catch some runtime errors.
+
+00:48:00 I don't know.
+
+00:48:00 It's not quite an endorsement, but it sure is like a, huh, that's different.
+
+00:48:04 I definitely think that the extent to which type checkers may have a different understanding of
+
+00:48:11 your code from what happens at runtime and there isn't anything built in to catch
+
+00:48:15 that is sometimes a pain point.
+
+00:48:18 And so the desire to have your type annotations, to find out at runtime if your
+
+00:48:22 type annotations are telling you a lie, it makes a lot of sense why people would like
+
+00:48:26 that.
+
+00:48:27 I mean, it's something you're used to from other languages where the type checker is built
+
+00:48:30 into the compiler.
+
+00:48:30 Right.
+
+00:48:31 You get like a runtime type cast, like cannot.
+
+00:48:33 We kind of get that if you try to parse a thing, you know, like put the int()
+
+00:48:38 around a string and it's not really parsable as an int.
+
+00:48:42 But for like real type information, I think personally I would use this as like I might apply types,
+
+00:48:48 type checking to a module for debugging and development for a minute and just see
+
+00:48:53 what happens and then turn it back off.
+
+00:48:55 You know, I don't know that I'd just ship production code that way.
+
+00:48:58 But anyway, I got a couple more questions.
+
+00:49:00 We're getting shorter on time here.
+
+00:49:02 What was one of the harder questions that you all, harder decisions you all had to address on the
+
+00:49:09 council?
+
+00:49:09 I think the most contentious one was PEP 724, if I remember the number correctly.
+
+00:49:15 It was around a feature called type guards, which is around user-defined type
+
+00:49:20 narrowing functions.
+
+00:49:22 Initially, they were defined in a way that later was found to be somewhat problematic
+
+00:49:25 and we basically came up with a better set of proposed semantics that maybe we should have done
+
+00:49:30 the first time around.
+
+00:49:33 And what this PEP proposed, and as you can see, I sponsored it, is that we
+
+00:49:37 basically changed the meaning of the existing type guards under certain conditions.
+
+00:49:41 What is a type guard?
+
+00:49:42 A type guard is a function, like there's a good example there, the isiterable.
+
+00:49:46 It's a function that tells you how to narrow something.
+
+00:49:50 So in this example, there's an isiterable type guard, which narrows an object to
+
+00:49:55 an iterable of anything.
+
+00:49:56 And then inside the func there, you can see if isiterable file, it knows
+
+00:50:01 that it's an iterable.
+
+00:50:04 And in this case, yeah, I guess it just narrows exactly to iterable any.
+
+00:50:08 That's one of the ways that type guards work.
+
+00:50:10 I see.
+
+00:50:11 And the type that it returns kind of communicates to the type system, like that this
+
+00:50:15 function ensures that this, the thing that came in as an arbitrary object, in fact,
+
+00:50:21 is one of these.
+
+00:50:22 Okay.
+
+00:50:22 Interesting.
+
+00:50:23 Yeah.
+
+00:50:23 So that was a tricky one, huh?
+
+00:50:25 Any other standout, Rebecca or Carl?
+
+00:50:27 Well, the current discussion around what is the meaning of a float annotation, still
+
+00:50:32 unresolved, contentious topic.
+
+00:50:34 Okay.
+
+00:50:34 Gotcha.
+
+00:50:35 I mean, this one, PEP 724, is also what came to my mind immediately as well, because
+
+00:50:42 this was a challenging discussion because, you know, like there were very conflicting considerations at
+
+00:50:48 play.
+
+00:50:49 It's like, what semantics did we want in the long term?
+
+00:50:52 And what did we want the type system to look like, you know, say like 10 years
+
+00:50:56 from now versus backwards compatibility and what the migration story would look
+
+00:51:00 like?
+
+00:51:01 It was quite tricky.
+
+00:51:02 I guess that's something you will always have to be cognizant of is like every
+
+00:51:07 change, even if it's an improvement, has to justify the fact that now you have
+
+00:51:12 challenges with the version history over time.
+
+00:51:16 I'm thinking like dict of string comma int with a capital or lowercase d.
+
+00:51:22 I've got people, I did a YouTube video showing something with the lowercase
+
+00:51:26 version because I was using something super modern like Python 3.11.
+
+00:51:30 And I got a message like, hey, Michael, you don't know how to write Python.
+
+00:51:34 Your code is broken.
+
+00:51:35 This code that you wrote just doesn't even run.
+
+00:51:38 I don't know how this is.
+
+00:51:39 I'm like, what version of Python are you on?
+
+00:51:40 3.8.
+
+00:51:41 Nope.
+
+00:51:41 You can't use 3.8 for that.
+
+00:51:43 You're going to need to get a newer one.
+
+00:51:44 You know what I mean?
+
+00:51:44 But like those are complexities that get added to Python because of that.
+
+00:51:49 Now you've got two ways to specify what a dict is.
+
+00:51:52 There's a preferred new way, but there's still the old way and it just, it sort of piles
+
+00:51:57 up.
+
+00:51:58 And it's very hard to ever actually get rid of the old way, even if there's no good
+
+00:52:00 reason to use it anymore.
+
+00:52:01 Exactly.
+
+00:52:02 Once it's there, it's written in ink pretty much, right?
+
+00:52:05 Like we have five or six different ways to format strings.
+
+00:52:08 Maybe with t-strings it's six now.
+
+00:52:10 They're all going to still be there, right?
+
+00:52:12 So every change, every decision you make is not just a matter of, is it the
+
+00:52:16 right decision, right?
+
+00:52:17 It's the, is it worth it?
+
+00:52:19 I'm sure.
+
+00:52:20 Yeah.
+
+00:52:21 I don't know.
+
+00:52:21 How do you all balance that?
+
+00:52:22 Like that's tricky.
+
+00:52:23 With things like the dict changes, at least we sort of know we're moving towards
+
+00:52:27 better states and there's two things, but they mean exactly the same thing.
+
+00:52:31 So the confusion is not as bad.
+
+00:52:34 The problem with type guards is that we're going to change how some existing thing
+
+00:52:38 works, like what it meant.
+
+00:52:39 And I think there are good reasons that maybe that's the right thing to do, but the,
+
+00:52:44 it would also have been pretty confusing for people if their existing types suddenly started
+
+00:52:47 meaning something completely different.
+
+00:52:49 Absolutely.
+
+00:52:50 Hence float.
+
+00:52:51 Okay.
+
+00:52:51 What's coming next?
+
+00:52:53 Like 3.15, 3.16, do you all have things that are in the works that you think are going to come
+
+00:52:58 or debates that are brewing?
+
+00:53:01 For 3.15, the, there's a TypedDict feature coming, extra items.
+
+00:53:05 You can already use it in typing_extensions if you want to use it, but it will be in
+
+00:53:10 CPython as of 3.15.
+
+00:53:12 It's likely we'll have a small thing I added called disjoint bases, which is very technical,
+
+00:53:17 but helps type narrowing in some cases.
+
+00:53:19 Yeah.
+
+00:53:19 I think those are the things that are likely to make it.
+
+00:53:22 There's, we can only speculate about what else people can propose.
+
+00:53:25 We're sort of bound by what people actually write up as PEPs.
+
+00:53:28 We have to wait for people to write the PEPs before we can approve them.
+
+00:53:30 I think there's PEP 747 for TypeForm, which I think is not, I think we recommended
+
+00:53:35 its acceptance, but I don't think the steering council accepted it yet or it hasn't
+
+00:53:39 been accepted formally.
+
+00:53:40 I think that's on their plate.
+
+00:53:41 Yeah.
+
+00:53:42 Yeah.
+
+00:53:42 So that's also pretty likely to make it into 3.15.
+
+00:53:45 This is one example of a case that will be pretty useful to people working
+
+00:53:49 with type annotations at runtime because it'll allow you to, it's sort of a meta
+
+00:53:53 thing where you can annotate, have a type annotation that describes another type
+
+00:53:58 annotation.
+
+00:53:59 So that's useful if you're, if you're writing code that works with type annotations.
+
+00:54:03 Make the Pydantics of the world very happy.
+
+00:54:05 I am actually pretty excited about TypeForm because, you know, I feel like there's
+
+00:54:10 a gap in what we can express in the type system.
+
+00:54:14 And there are cases in the existing type system, like for instance, the cast
+
+00:54:18 function and some other cases where something takes any type expression as an
+
+00:54:24 argument.
+
+00:54:24 We actually don't have a good way to annotate that today and this will provide a nice
+
+00:54:28 way to express that.
+
+00:54:28 Let me pull up one thing really quick.
+
+00:54:31 Quick shout out to Will McGugan here.
+
+00:54:32 He just released his Toad project, which is the new, takes Textual and Rich
+
+00:54:38 and all that kind of stuff and applies it to like, what if we had a better Claude Code
+
+00:54:42 type of experience, which is pretty interesting.
+
+00:54:44 So the reason I'm bringing this up is, you know, final question.
+
+00:54:47 What about, do you all even worry about the role of like how types interact with AI
+
+00:54:54 and agentic coding tools?
+
+00:54:56 I know that if you have some code that has types on it and you give it to an AI, it's
+
+00:55:02 got a better chance of understanding what's happening than if you give it purely
+
+00:55:05 untyped code and say, tell me about this, right?
+
+00:55:08 It doesn't even know necessarily what's being passed to it.
+
+00:55:10 But is that anything you all think about or what are your thoughts on this?
+
+00:55:14 Certainly think about it some.
+
+00:55:15 I mean, I think overall my feeling is that these coding agents seem to do better the
+
+00:55:21 more kind of tighter feedback loops you can give them to work with.
+
+00:55:26 And so typing is another useful source of feedback where you can say, add type
+
+00:55:29 annotations and make sure the type checker passes. And so it still seems pretty useful
+
+00:55:34 in that world.
+
+00:55:35 Yeah, you can easily write rules that say when you are done on anything I've asked
+
+00:55:39 you to do, always run ty or always run Pyrefly and make sure that there's no more, no
+
+00:55:45 new errors or at least or ideally zero errors, right?
+
+00:55:48 But nothing has been introduced.
+
+00:55:50 Yeah, pretty interesting.
+
+00:55:51 You other folks, Rebecca, Jelle?
+
+00:55:53 Yeah, I guess in general, I think typing will remain useful for AI.
+
+00:55:57 We are probably rapidly moving to a world where a large proportion of all code
+
+00:56:00 is written by AI.
+
+00:56:02 Not everybody likes that opinion, Jelle.
+
+00:56:03 Not everybody likes that.
+
+00:56:04 I guess I, maybe my current line of work makes me think that's more likely to happen.
+
+00:56:08 You don't have to like the fact it's going to be night soon, but it's going to be
+
+00:56:11 night.
+
+00:56:12 You know what I mean?
+
+00:56:12 Like there's the, I just think there's so much momentum on this, at least in
+
+00:56:16 the next five years or something, that it's going to be really, it's, it's a
+
+00:56:19 truth of how many people are writing code regardless of whether individuals want to
+
+00:56:23 write code that way.
+
+00:56:24 You know what I mean?
+
+00:56:25 So I think it's a consideration.
+
+00:56:26 Yeah.
+
+00:56:26 Yeah.
+
+00:56:26 I forgot that you worked at OpenAI.
+
+00:56:28 So of course, I should pull up a Codex example or something, shouldn't I?
+
+00:56:32 Yeah.
+
+00:56:33 Codex is great.
+
+00:56:33 Use it.
+
+00:56:34 No, but I mean, do you have any further insight into like the role of types
+
+00:56:38 and coding agents?
+
+00:56:40 I know that's not exactly what you work on, right?
+
+00:56:41 You're more at the lower.
+
+00:56:42 As Carl said, types can also be helpful for AI to understand code better and to
+
+00:56:47 get a better feedback loop.
+
+00:56:48 I feel like the bar for AI is like humans.
+
+00:56:51 And if AI makes, sorry, if typing makes humans better at writing and understanding
+
+00:56:55 this code, it probably also makes AI better at it.
+
+00:56:58 It's the locality of information.
+
+00:57:00 You can read the function and know everything you need to know about what's going
+
+00:57:03 into it without bouncing around and trying to understand blocks of code and like what might've been
+
+00:57:08 created that's getting impacted.
+
+00:57:09 It's good for humans and also good for AI.
+
+00:57:12 Right.
+
+00:57:12 Rebecca.
+
+00:57:12 Because I don't have much need to, and I'll say I am maybe a little more skeptical
+
+00:57:18 than most of my coworkers about the quality of AI generated code.
+
+00:57:23 But that means I think I am particularly gung-ho about, you know, like get AI to use
+
+00:57:29 types, type checkers, keep the guardrails there.
+
+00:57:33 I think that'll be very important.
+
+00:57:34 Yeah, if it's going to make a mistake, don't let it at least like make the type
+
+00:57:38 system become disconnected and not working.
+
+00:57:40 Like it has to keep the types hanging together as a minimum bar, right?
+
+00:57:44 And you can easily set that up as an automation.
+
+00:57:46 Yeah.
+
+00:57:47 Interesting to think of it as guardrails rather than an accelerant.
+
+00:57:50 But yeah, 100% it is.
+
+00:57:52 All right, folks.
+
+00:57:52 I think that's it for all the time that we have.
+
+00:57:55 Thank you.
+
+00:57:56 Thank you for being here.
+
+00:57:57 Final thoughts before we go.
+
+00:57:59 Carl, I'll let you go first.
+
+00:58:00 Final thoughts for people out there interested in Python typing.
+
+00:58:02 Yeah.
+
+00:58:02 Well, first of all, thanks for having us on the podcast.
+
+00:58:05 Really appreciate it.
+
+00:58:06 And thoughts for people out there.
+
+00:58:08 I guess if you have ideas of how Python typing could be improved, discuss.python.org
+
+00:58:14 is a good place to bring up ideas and discuss them with the typing community and see
+
+00:58:19 what positive changes we can make.
+
+00:58:22 Rebecca.
+
+00:58:22 First, thank you, Michael.
+
+00:58:24 This is a lot of fun.
+
+00:58:26 Last thoughts?
+
+00:58:28 Hey, so, you know, like people will look at the typing council and sometimes think,
+
+00:58:32 oh, you know, like the PEP has like governance in its name, but I wouldn't say we're really
+
+00:58:37 a governing body or anything.
+
+00:58:39 It's like people who are using the type system, like users, they're the ones who come up with,
+
+00:58:45 you know, like all the best ideas, propose them, discuss them.
+
+00:58:48 And we're just here to sort of be like, hey, you know, like we have some background in like
+
+00:58:55 how type checkers work and maybe some of the history and we can provide input.
+
+00:58:58 But I just encourage people, if there's a change you want to see in the type system, you know,
+
+00:59:03 like propose it yourself.
+
+00:59:05 It's a very friendly and open community.
+
+00:59:07 Yeah.
+
+00:59:08 Now people who have listened know a little bit more about how to do so.
+
+00:59:11 Awesome.
+
+00:59:12 Thanks.
+
+00:59:12 Jelle, final word.
+
+00:59:13 Yeah.
+
+00:59:13 Also, again, thank you for having me here.
+
+00:59:16 It's been great talking to all of you.
+
+00:59:17 I guess what I want to say is similar to what Carl and Rebecca just said.
+
+00:59:20 If you want to have something changed in the type system, I'd really encourage you to sign up
+
+00:59:24 for discuss.python.org, make a proposal, go through the process.
+
+00:59:27 It can be somewhat daunting, perhaps, especially if you have to create a PEP, but it is
+
+00:59:32 doable.
+
+00:59:32 Several recent typing PEPs have just been community members who saw something they
+
+00:59:37 wanted to improve, proposed a PEP, and saw it through to completion.
+
+00:59:41 If there's something you want to see in the type system, then you can do
+
+00:59:44 it too.
+
+00:59:44 Thank you all for keeping Python typing going strong.
+
+00:59:48 Really appreciate your time on the show.
+
+00:59:49 See you all later.
+
+00:59:50 Bye.
+
+00:59:51 Bye.
+
+00:59:51 This has been another episode of Talk Python To Me.
+
+00:59:55 Thank you to our sponsors.
+
+00:59:56 Be sure to check out what they're offering.
+
+00:59:57 It really helps support the show.
+
+00:59:59 Take some stress out of your life.
+
+01:00:01 Get notified immediately about errors and performance issues in your web or mobile
+
+01:00:06 applications with Sentry.
+
+01:00:07 Just visit talkpython.fm slash Sentry and get started for free.
+
+01:00:12 Be sure to use our code talkpython26.
+
+01:00:15 That's talkpython, the numbers two, six, all one word.
+
+01:00:19 And it's brought to you by our Agentic AI programming for Python course.

+01:00:24 Learn to work with AI that actually understands your code base and build real

+01:00:28 features.

+01:00:29 Visit talkpython.fm slash agentic dash AI.

+01:00:33 If you or your team needs to learn Python, we have over 270 hours of beginner

+01:00:38 and advanced courses on topics ranging from complete beginners to async code, Flask, Django,

+01:00:44 HTMX, and even LLMs.

+01:00:46 Best of all, there's no subscription in sight.

+01:00:48 Browse the catalog at talkpython.fm.

+01:00:51 And if you're not already subscribed to the show on your favorite podcast

+01:00:54 player, what are you waiting for?

+01:00:56 Just search for Python in your podcast player.

+01:00:58 We should be right at the top.

+01:00:59 If you enjoy that geeky rap song, you can download the full track.

+01:01:03 The link is actually in your podcast player's show notes.

+01:01:05 This is your host, Michael Kennedy.

+01:01:07 Thank you so much for listening.

+01:01:08 I really appreciate it.

+01:01:09 I'll see you next time.

+01:01:18 Voyager.

+01:01:19 Voyager.

+01:01:21 And we ready to roll Upgrading the code No fear of getting old We tapped into that modern vibe

+01:01:33 Overcame each storm Talk Python To Me Async is the norm

diff --git a/transcripts/539-catching-up-with-the-python-typing-council.vtt b/transcripts/539-catching-up-with-the-python-typing-council.vtt
new file mode 100644
index 0000000..ccae9b1
--- /dev/null
+++ b/transcripts/539-catching-up-with-the-python-typing-council.vtt
@@ -0,0 +1,9868 @@
+WEBVTT
+
+00:00:00.000 --> 00:00:02.000
+You're adding type hints to your Python code.
+
+00:00:02.240 --> 00:00:04.560
+Your editor is happy, autocomplete is working great,
+
+00:00:05.000 --> 00:00:06.260
+but then you switch tools
+
+00:00:06.260 --> 00:00:08.540
+and suddenly there are red squigglies everywhere.
+
+00:00:09.220 --> 00:00:12.160
+Who decides what a float annotation actually means
+
+00:00:12.160 --> 00:00:15.680
+or whether passing None where an int is expected
+
+00:00:15.680 --> 00:00:16.700
+should be an error?
+
+00:00:17.180 --> 00:00:19.260
+It turns out there's a five-person council
+
+00:00:19.260 --> 00:00:21.260
+dedicated to exactly these questions
+
+00:00:21.260 --> 00:00:25.100
+and two brand new Rust-based type checkers
+
+00:00:25.100 --> 00:00:26.680
+are raising the bar as well.
+
+00:00:27.200 --> 00:00:29.620
+On this episode, I sit down with three of the members
+
+00:00:29.620 --> 00:00:31.300
+of the Python Typing Council,
+
+00:00:31.620 --> 00:00:34.380
+Jelle Zijlstra, Rebecca Chen, and Carl Meyer
+
+00:00:34.380 --> 00:00:37.040
+to learn about how the type system is governed,
+
+00:00:37.520 --> 00:00:40.240
+where the spec and type checkers agree and disagree,
+
+00:00:40.720 --> 00:00:42.880
+and I get the council's official advice
+
+00:00:42.880 --> 00:00:45.240
+on how much typing is just enough.
+
+00:00:45.800 --> 00:00:48.640
+This is Talk Python To Me, episode 539,
+
+00:00:49.260 --> 00:00:51.860
+recorded January 27th, 2026.
+
+00:00:53.460 --> 00:00:56.300
+Talk Python To Me, yeah, we ready to roll.
+
+00:00:56.540 --> 00:00:59.160
+Upgrading the code, no fear of getting old.
+
+00:00:59.160 --> 00:01:01.780
+Async in the air, new frameworks in sight,
+
+00:01:01.920 --> 00:01:02.960
+geeky rap on deck.
+
+00:01:03.260 --> 00:01:04.960
+Quart crew, it's time to unite.
+
+00:01:05.060 --> 00:01:07.980
+We started in Pyramid, cruising old school lanes,
+
+00:01:08.260 --> 00:01:09.760
+had that stable base, yeah, sir.
+
+00:01:09.760 --> 00:01:10.840
+Welcome to Talk Python To Me,
+
+00:01:10.940 --> 00:01:13.180
+the number one Python podcast for developers
+
+00:01:13.180 --> 00:01:14.220
+and data scientists.
+
+00:01:14.640 --> 00:01:16.080
+This is your host, Michael Kennedy.
+
+00:01:16.420 --> 00:01:20.060
+I'm a PSF fellow who's been coding for over 25 years.
+
+00:01:20.600 --> 00:01:21.740
+Let's connect on social media.
+
+00:01:22.060 --> 00:01:23.940
+You'll find me and Talk Python on Mastodon,
+
+00:01:24.040 --> 00:01:25.220
+Bluesky, and X.
+
+00:01:25.400 --> 00:01:27.380
+The social links are all in your show notes.
+
+00:01:27.380 --> 00:01:30.420
+You can find over 10 years of past episodes
+
+00:01:30.420 --> 00:01:31.620
+at talkpython.fm.
+
+00:01:31.720 --> 00:01:33.380
+And if you want to be part of the show,
+
+00:01:33.500 --> 00:01:35.140
+you can join our recording live streams.
+
+00:01:35.300 --> 00:01:37.260
+That's right, we live stream the raw,
+
+00:01:37.380 --> 00:01:39.360
+uncut version of each episode on YouTube.
+
+00:01:39.680 --> 00:01:42.400
+Just visit talkpython.fm/youtube
+
+00:01:42.400 --> 00:01:44.360
+to see the schedule of upcoming events.
+
+00:01:44.520 --> 00:01:46.160
+Be sure to subscribe there and press the bell
+
+00:01:46.160 --> 00:01:48.180
+so you'll get notified anytime we're recording.
+
+00:01:48.720 --> 00:01:50.600
+This episode is brought to you by Sentry.
+
+00:01:50.720 --> 00:01:52.240
+Don't let those errors go unnoticed.
+
+00:01:52.400 --> 00:01:53.980
+Use Sentry like we do here at Talk Python.
+
+00:01:53.980 --> 00:01:57.360
+Sign up at talkpython.fm/sentry.
+
+00:01:57.720 --> 00:01:58.880
+And it's brought to you by
+
+00:01:58.880 --> 00:02:01.920
+our Agentic AI programming for Python course.
+
+00:02:02.340 --> 00:02:05.360
+Learn to work with AI that actually understands your code base
+
+00:02:05.360 --> 00:02:06.900
+and build real features.
+
+00:02:07.400 --> 00:02:10.800
+Visit talkpython.fm/agentic-ai.
+
+00:02:11.940 --> 00:02:14.320
+Jelle, Rebecca, and Carl,
+
+00:02:14.780 --> 00:02:17.940
+welcome to all of you type-loving Pythonistas.
+
+00:02:18.480 --> 00:02:19.780
+Awesome to have you here on the show.
+
+00:02:20.100 --> 00:02:20.860
+Thanks for being here.
+
+00:02:20.860 --> 00:02:22.700
+We're going to talk Python typing,
+
+00:02:23.100 --> 00:02:26.920
+especially from the perspective of the Python Typing Council,
+
+00:02:27.380 --> 00:02:30.700
+which honestly, I am a huge fan of Python typing.
+
+00:02:30.840 --> 00:02:33.160
+It's still something I learned about not too long ago.
+
+00:02:33.300 --> 00:02:35.900
+So I'm going to be learning along with everyone else,
+
+00:02:35.980 --> 00:02:38.020
+what it is you all do and so on.
+
+00:02:38.100 --> 00:02:41.540
+So I'm really excited to be diving into this.
+
+00:02:41.700 --> 00:02:44.420
+I think since types came to Python,
+
+00:02:44.540 --> 00:02:46.620
+I think it's made it a little bit more rigorous,
+
+00:02:46.920 --> 00:02:48.520
+you know, for all those people out there like,
+
+00:02:48.520 --> 00:02:51.820
+oh, it's not a real language without any form of static typing.
+
+00:02:51.900 --> 00:02:52.960
+We can't use it on real projects.
+
+00:02:53.060 --> 00:02:53.920
+I don't know how true that was,
+
+00:02:54.020 --> 00:02:56.680
+but certainly it's less true now.
+
+00:02:56.880 --> 00:02:58.380
+You know, you can pick per project.
+
+00:02:58.480 --> 00:02:59.220
+So it's super cool.
+
+00:02:59.480 --> 00:03:00.700
+Before we get into all that, though,
+
+00:03:00.920 --> 00:03:03.000
+let's just go around for a quick introductions.
+
+00:03:03.880 --> 00:03:04.860
+Jelle, welcome to the show.
+
+00:03:05.160 --> 00:03:05.920
+Awesome to have you here.
+
+00:03:06.180 --> 00:03:06.580
+Who are you?
+
+00:03:06.700 --> 00:03:07.200
+Hi, yeah.
+
+00:03:07.540 --> 00:03:09.260
+Jelle, I've been on the Python Typing Council
+
+00:03:09.260 --> 00:03:09.980
+since the beginning.
+
+00:03:10.240 --> 00:03:12.040
+I helped set it up a couple of years ago.
+ +00:03:12.380 --> 00:03:13.860 +Outside of the typing work, + +00:03:13.860 --> 00:03:15.060 +I currently work at OpenAI, + +00:03:15.320 --> 00:03:17.120 +where I work on developer productivity, + +00:03:17.480 --> 00:03:19.480 +which means things like running CI for people + +00:03:19.480 --> 00:03:22.760 +and helping, generally helping people be productive. + +00:03:23.240 --> 00:03:25.440 +I've been working with Python for more than a decade. + +00:03:25.940 --> 00:03:28.880 +Started out because my previous job was mostly in Python + +00:03:28.880 --> 00:03:31.440 +and then got more and more involved with the language. + +00:03:32.240 --> 00:03:33.500 +So let me get this right. + +00:03:33.620 --> 00:03:36.440 +At OpenAI, you're basically helping developers there + +00:03:36.440 --> 00:03:37.700 +have better developer tooling + +00:03:37.700 --> 00:03:41.360 +and common packages and workflows and stuff like that. + +00:03:41.520 --> 00:03:42.380 +Is that kind of the story? + +00:03:42.380 --> 00:03:42.940 +That's right. + +00:03:43.380 --> 00:03:45.060 +Mostly around things that happen in CI, + +00:03:45.600 --> 00:03:46.940 +like running tests efficiently, + +00:03:47.520 --> 00:03:48.800 +figuring out the right tests to run, + +00:03:49.140 --> 00:03:50.540 +getting the right CI workers out. + +00:03:50.620 --> 00:03:51.460 +That sounds very exciting. + +00:03:51.720 --> 00:03:56.180 +Right in the epicenter of all the big tech stuff these days. + +00:03:56.260 --> 00:03:56.700 +Super cool. + +00:03:57.140 --> 00:03:58.080 +Rebecca, hello. + +00:03:58.200 --> 00:03:58.400 +Welcome. + +00:03:58.640 --> 00:04:00.000 +Hey, thanks for having me. + +00:04:00.100 --> 00:04:00.540 +I'm Rebecca. + +00:04:01.040 --> 00:04:05.340 +I've been on the Typing Council also for about three years, + +00:04:05.700 --> 00:04:07.140 +I think, since the, less than three, + +00:04:07.440 --> 00:04:08.120 +since the beginning. 
+
+00:04:08.120 --> 00:04:13.400
+But my day job, I work at Meta on Python typing,
+
+00:04:13.760 --> 00:04:16.820
+on Pyrefly, which is a new type checker
+
+00:04:16.820 --> 00:04:19.580
+and language server written in Rust, still in beta.
+
+00:04:20.080 --> 00:04:23.360
+Prior to that, I was at Google for eight years,
+
+00:04:23.460 --> 00:04:24.800
+also on the Python team.
+
+00:04:24.920 --> 00:04:26.280
+I just, I really like Python.
+
+00:04:26.600 --> 00:04:27.460
+Yeah, super neat.
+
+00:04:27.460 --> 00:04:29.680
+I'm a big fan of both Pyrefly and ty,
+
+00:04:30.180 --> 00:04:33.040
+which will both have representatives here, I know.
+
+00:04:33.720 --> 00:04:36.840
+And I think it's just a super exciting time for Python types.
+
+00:04:37.000 --> 00:04:38.620
+And certainly that's one of the reasons.
+
+00:04:38.940 --> 00:04:39.620
+So very cool.
+
+00:04:39.940 --> 00:04:40.700
+Carl, welcome back.
+
+00:04:40.980 --> 00:04:41.300
+Thank you.
+
+00:04:41.380 --> 00:04:41.900
+Great to be here.
+
+00:04:42.260 --> 00:04:43.800
+Yeah, Carl Meyer.
+
+00:04:43.800 --> 00:04:47.800
+I currently work at Astral, where I work on ty,
+
+00:04:48.080 --> 00:04:50.760
+which is a Python type checker and language server
+
+00:04:50.760 --> 00:04:52.780
+written in Rust, also in beta.
+
+00:04:53.320 --> 00:04:55.600
+And yeah, I guess, how did I get into typing?
+
+00:04:55.900 --> 00:04:58.680
+Or I've been on the Typing Council, not since the beginning.
+
+00:04:59.120 --> 00:05:01.100
+I think it's been a year and a half.
+
+00:05:01.720 --> 00:05:07.280
+And yeah, I got into Python typing at the time in 2016, 2017.
+
+00:05:07.280 --> 00:05:08.880
+I was working at Instagram.
+
+00:05:08.880 --> 00:05:12.520
+And that was in the very early days of Python typing.
+
+00:05:12.880 --> 00:05:17.020
+The PEP 484, PEP 483, the early Python typing PEPs
+
+00:05:17.020 --> 00:05:19.160
+had recently come out within the last couple of years.
+
+00:05:19.500 --> 00:05:21.440
+And one of the co-authors of some of those PEPs,
+
+00:05:21.580 --> 00:05:24.180
+Łukasz Langa, was actually sitting at a desk
+
+00:05:24.180 --> 00:05:25.360
+right next to me at the time.
+
+00:05:25.860 --> 00:05:27.060
+And at some point, we started to think
+
+00:05:27.060 --> 00:05:28.840
+that we should try this Python typing stuff
+
+00:05:28.840 --> 00:05:30.960
+on the Instagram server monolith.
+
+00:05:31.400 --> 00:05:33.880
+And so I took that on as a side project.
+
+00:05:33.880 --> 00:05:35.680
+And then it eventually became the main project.
+
+00:05:35.680 --> 00:05:37.400
+And then it took like three years.
+
+00:05:37.400 --> 00:05:39.880
+So a lot of Python typing experience there.
+
+00:05:40.060 --> 00:05:40.780
+There absolutely is.
+
+00:05:40.880 --> 00:05:42.600
+You know, I think a couple of things
+
+00:05:42.600 --> 00:05:43.600
+I'd like to touch on there.
+
+00:05:43.840 --> 00:05:47.520
+First of all, Instagram, is it maybe the biggest Django
+
+00:05:47.520 --> 00:05:48.420
+deployment in the world?
+
+00:05:48.520 --> 00:05:50.020
+It's certainly one of the bigger ones, right?
+
+00:05:50.080 --> 00:05:51.640
+And I think a lot of people don't necessarily
+
+00:05:51.640 --> 00:05:55.320
+know that a core chunk of Instagram is actually Python, right?
+
+00:05:55.440 --> 00:05:57.380
+I mean, I don't know if we have any way to know
+
+00:05:57.380 --> 00:06:00.400
+how big the Django deployments in the wild might be.
+
+00:06:00.460 --> 00:06:01.480
+But it's certainly a big one.
+
+00:06:01.560 --> 00:06:02.500
+Yeah, it's definitely a big one.
+
+00:06:02.500 --> 00:06:06.680
+There were some talks about dismissing the garbage collector
+
+00:06:06.680 --> 00:06:08.500
+from the Instagram folks.
+
+00:06:08.660 --> 00:06:11.040
+That wasn't you giving the talk, but at PyCon.
+
+00:06:11.160 --> 00:06:12.120
+So that was pretty interesting.
+
+00:06:12.320 --> 00:06:15.980
+But I think actually that work that you're talking about,
+
+00:06:16.040 --> 00:06:19.480
+especially with Łukasz, really kind of opened
+
+00:06:19.480 --> 00:06:22.520
+a lot of people's eyes about Python typing, right?
+
+00:06:22.560 --> 00:06:25.600
+He gave a couple of PyCon talks, showed, you know,
+
+00:06:25.640 --> 00:06:28.440
+real metrics of how much of the code base is typed,
+
+00:06:28.640 --> 00:06:31.780
+how much it's changed, like error detection,
+
+00:06:32.400 --> 00:06:33.180
+that kind of stuff.
+
+00:06:33.180 --> 00:06:36.760
+So let me ask you, do you feel like it would be different?
+
+00:06:36.860 --> 00:06:39.500
+Would it have gone different now if tools like ty
+
+00:06:39.500 --> 00:06:41.840
+and Pyrefly existed back then?
+
+00:06:42.180 --> 00:06:44.440
+Is Python typing different now than it was then?
+
+00:06:44.560 --> 00:06:45.320
+Certainly, yes.
+
+00:06:45.420 --> 00:06:47.260
+I mean, there's been, the type system has gotten
+
+00:06:47.260 --> 00:06:49.320
+more complex over time.
+
+00:06:49.320 --> 00:06:52.240
+So it is both more expressive and more complex.
+
+00:06:52.740 --> 00:06:55.880
+And yeah, we have more type checkers available now.
+
+00:06:56.460 --> 00:06:58.080
+I do agree that it's more complicated,
+
+00:06:58.080 --> 00:06:59.540
+and I don't know how to feel about that.
+
+00:06:59.540 --> 00:07:03.320
+It is more expressive, but I feel like it's starting to get,
+
+00:07:03.860 --> 00:07:06.880
+I mean, we're not at C++ STL,
+
+00:07:06.960 --> 00:07:08.860
+like templates of templates of templates,
+
+00:07:09.060 --> 00:07:12.200
+but still, it's getting more serious.
+
+00:07:12.340 --> 00:07:14.080
+But I guess one of the really nice parts
+
+00:07:14.080 --> 00:07:16.820
+is that you can just take as much as you want
+
+00:07:16.820 --> 00:07:18.940
+of the complexity, and you can just leave the rest, right?
+
+00:07:19.220 --> 00:07:21.060
+That's part of the magic of Python typing,
+
+00:07:21.300 --> 00:07:23.460
+is that it's a gradual typing system.
+
+00:07:23.820 --> 00:07:25.780
+That's a choice people get to make.
+
+00:07:25.780 --> 00:07:28.400
+It can be none, it can be quite a bit,
+
+00:07:28.800 --> 00:07:30.220
+and anywhere in between.
+
+00:07:30.560 --> 00:07:32.780
+So I guess that's probably one of the decisions.
+
+00:07:33.220 --> 00:07:34.480
+Let's talk about the typing council.
+
+00:07:34.700 --> 00:07:37.380
+So when did the typing council come along,
+
+00:07:37.800 --> 00:07:40.640
+and did the typing council exist to create
+
+00:07:40.640 --> 00:07:42.320
+all of these PEPs and make this happen,
+
+00:07:42.400 --> 00:07:43.460
+or was it afterwards?
+
+00:07:43.720 --> 00:07:45.380
+Like, what's the history of the typing council
+
+00:07:45.380 --> 00:07:46.780
+and its purpose, folks?
+
+00:07:47.160 --> 00:07:47.460
+We'll run it.
+
+00:07:47.520 --> 00:07:49.340
+Yeah, it postdates most of the PEPs.
+
+00:07:49.480 --> 00:07:51.260
+So initially, the type system was created
+
+00:07:51.260 --> 00:07:52.740
+just through the regular PEP process.
+
+00:07:52.740 --> 00:07:54.220
+It means that something gets submitted,
+
+00:07:54.800 --> 00:07:57.440
+first still to Guido as the BDFL,
+
+00:07:57.860 --> 00:07:58.900
+later to the steering council.
+
+00:07:59.360 --> 00:08:01.520
+Meant that it's very hard to make changes
+
+00:08:01.520 --> 00:08:02.740
+to, like, this specification.
+
+00:08:03.020 --> 00:08:04.860
+Like, anytime you want to change something
+
+00:08:04.860 --> 00:08:06.580
+about how the type system would work,
+
+00:08:06.900 --> 00:08:08.520
+we had to go through this PEP procedure,
+
+00:08:09.040 --> 00:08:09.940
+talk to the steering council,
+
+00:08:09.940 --> 00:08:11.020
+who are very busy people,
+
+00:08:11.160 --> 00:08:12.980
+who deal with a lot of other aspects
+
+00:08:12.980 --> 00:08:14.100
+of the language other than typing.
+
+00:08:14.640 --> 00:08:17.360
+So Shantanu and I came up with this idea
+
+00:08:17.360 --> 00:08:19.000
+of creating a separate council
+
+00:08:19.000 --> 00:08:21.040
+that's specifically in charge of typing,
+
+00:08:21.040 --> 00:08:22.840
+that would maintain a specification
+
+00:08:22.840 --> 00:08:25.080
+where we can make small changes ourselves
+
+00:08:25.080 --> 00:08:26.000
+without having to go through
+
+00:08:26.000 --> 00:08:27.100
+this whole PEP process.
+
+00:08:27.520 --> 00:08:29.340
+And this way, when all the type checkers
+
+00:08:29.340 --> 00:08:31.360
+agreed that something needs to go a certain way
+
+00:08:31.360 --> 00:08:32.940
+and it's not exactly what's in the PEPs,
+
+00:08:33.420 --> 00:08:36.520
+we can change it and have a place to record that
+
+00:08:36.520 --> 00:08:37.680
+and people can refer to it
+
+00:08:37.680 --> 00:08:39.720
+and new type checkers can also try
+
+00:08:39.720 --> 00:08:40.740
+to follow those decisions.
+
+00:08:41.060 --> 00:08:41.480
+Very interesting.
+
+00:08:41.680 --> 00:08:43.540
+I didn't realize that it was sort of,
+
+00:08:43.540 --> 00:08:46.360
+was there to allow for small changes
+
+00:08:46.360 --> 00:08:47.920
+to be made to make that much easier.
+
+00:08:48.040 --> 00:08:49.140
+But of course that makes sense
+
+00:08:49.140 --> 00:08:50.360
+because the PEP process is,
+
+00:08:50.540 --> 00:08:52.360
+it's pretty serious and drawn out.
+
+00:08:52.420 --> 00:08:55.560
+And we've seen even small language changes
+
+00:08:55.560 --> 00:08:58.080
+have quite passionate folks,
+
+00:08:58.180 --> 00:08:59.800
+I guess we should say.
+
+00:09:00.140 --> 00:09:01.620
+So yeah, yeah, very nice.
+
+00:09:02.020 --> 00:09:04.180
+Do you have any examples of the types of changes
+
+00:09:04.180 --> 00:09:05.080
+that y'all have,
+
+00:09:05.420 --> 00:09:06.600
+that have happened over the years
+
+00:09:06.600 --> 00:09:09.020
+that maybe were typing council only?
+
+00:09:09.140 --> 00:09:09.980
+One was the specification
+
+00:09:09.980 --> 00:09:11.660
+of how overloads work,
+
+00:09:11.780 --> 00:09:13.940
+which is perhaps not really a small change,
+
+00:09:14.040 --> 00:09:16.040
+but one of the most complicated features
+
+00:09:16.040 --> 00:09:17.940
+in the type system really is the overloads,
+
+00:09:17.940 --> 00:09:19.680
+where you can give multiple signatures
+
+00:09:19.680 --> 00:09:20.540
+for a function
+
+00:09:20.540 --> 00:09:22.820
+and type checkers sort of select
+
+00:09:22.820 --> 00:09:25.080
+which one to use based on the arguments
+
+00:09:25.080 --> 00:09:26.280
+when the function is called.
+
+00:09:26.820 --> 00:09:28.080
+And when it was initially created,
+
+00:09:28.460 --> 00:09:29.400
+from what I recall,
+
+00:09:29.640 --> 00:09:31.120
+there just wasn't really a specification.
+
+00:09:31.540 --> 00:09:33.480
+It's just like you use the signatures
+
+00:09:33.480 --> 00:09:34.960
+in a way that makes sense.
+
+00:09:35.420 --> 00:09:36.200
+And Eric Traut,
+
+00:09:36.280 --> 00:09:37.400
+who's currently on the council,
+
+00:09:37.620 --> 00:09:40.080
+came up with a pretty specific procedure
+
+00:09:40.080 --> 00:09:42.000
+for exactly how overloads should work
+
+00:09:42.000 --> 00:09:44.720
+to make it so that type checkers have,
+
+00:09:45.160 --> 00:09:46.620
+well, sort of users can understand how it works
+
+00:09:46.620 --> 00:09:47.820
+and sort of type checkers can have something
+
+00:09:47.820 --> 00:09:49.000
+to work towards to make sure
+
+00:09:49.000 --> 00:09:51.220
+that they all interpret overloads in the same way.
+
+00:09:51.220 --> 00:09:52.480
+Maybe a smaller example
+
+00:09:52.480 --> 00:09:54.100
+that is an example of something
+
+00:09:54.100 --> 00:09:56.140
+that would have been too small for a PEP
+
+00:09:56.140 --> 00:09:57.660
+and hard to accomplish
+
+00:09:57.660 --> 00:09:59.780
+before the typing council existed.
+
+00:10:00.420 --> 00:10:01.600
+And this is actually a change
+
+00:10:01.600 --> 00:10:03.680
+that I pushed through before the,
+
+00:10:04.220 --> 00:10:05.440
+before I was on the typing council,
+
+00:10:05.560 --> 00:10:06.980
+but the typing council approved it,
+
+00:10:07.240 --> 00:10:08.780
+was a clarification around
+
+00:10:08.780 --> 00:10:11.620
+the interpretation of data class fields.
+
+00:10:11.920 --> 00:10:13.860
+If a final annotation is applied
+
+00:10:13.860 --> 00:10:15.080
+to a data class field,
+
+00:10:15.680 --> 00:10:16.500
+does that mean,
+
+00:10:16.980 --> 00:10:18.800
+so if you apply a final annotation
+
+00:10:18.800 --> 00:10:20.340
+to a regular class attribute,
+
+00:10:20.880 --> 00:10:22.120
+since it can't be changed,
+
+00:10:22.260 --> 00:10:24.260
+that implies that it's a class variable.
+ +00:10:24.620 --> 00:10:25.720 +And there was a question of + +00:10:25.720 --> 00:10:26.900 +if that should be the interpretation + +00:10:26.900 --> 00:10:28.060 +with the data class or not. + +00:10:28.360 --> 00:10:29.360 +So we discussed that + +00:10:29.360 --> 00:10:31.320 +and made a clarification to the spec. + +00:10:31.420 --> 00:10:32.740 +I've never really thought about final + +00:10:32.740 --> 00:10:35.700 +being applied to a class field, + +00:10:35.840 --> 00:10:36.960 +but I've always used them + +00:10:36.960 --> 00:10:38.220 +sort of just for constants. + +00:10:38.520 --> 00:10:38.840 +But, you know, + +00:10:38.860 --> 00:10:40.320 +maybe people out there don't know, + +00:10:40.320 --> 00:10:43.320 +like typing dot final bracket type, + +00:10:43.540 --> 00:10:43.800 +right? + +00:10:44.260 --> 00:10:45.360 +That's kind of the way + +00:10:45.360 --> 00:10:47.160 +you can do constants in Python, right? + +00:10:47.360 --> 00:10:48.440 +Constants for the type checker. + +00:10:48.860 --> 00:10:49.560 +Nothing in the runtime + +00:10:49.560 --> 00:10:50.660 +will stop you from editing it. + +00:10:51.080 --> 00:10:51.400 +That's... + +00:10:51.400 --> 00:10:51.980 +Not there. + +00:10:52.200 --> 00:10:52.820 +Not there. + +00:10:52.920 --> 00:10:54.460 +I have some examples coming up + +00:10:54.460 --> 00:10:55.660 +and I'm interested + +00:10:55.660 --> 00:10:56.620 +to hear your thoughts on it, + +00:10:56.640 --> 00:10:58.160 +but for sure it's, + +00:10:58.520 --> 00:10:59.860 +there is this tension, right? + +00:11:00.000 --> 00:11:01.340 +I mean, I think that's probably + +00:11:01.340 --> 00:11:02.500 +worth touching on as well + +00:11:02.500 --> 00:11:04.140 +is this is a tension for Python + +00:11:04.140 --> 00:11:05.280 +in general is + +00:11:05.280 --> 00:11:07.380 +you can write all the types you want + +00:11:07.380 --> 00:11:09.160 +and then when you run your code, + +00:11:09.280 --> 00:11:10.900 +it just doesn't care. 
+
+00:11:11.000 --> 00:11:11.940
+There's a few instances,
+
+00:11:12.360 --> 00:11:14.020
+Pydantic, FastAPI,
+
+00:11:14.160 --> 00:11:14.600
+a few others,
+
+00:11:14.720 --> 00:11:15.740
+but generally speaking,
+
+00:11:15.960 --> 00:11:17.200
+it's there for the editors
+
+00:11:17.200 --> 00:11:18.180
+and the type checkers
+
+00:11:18.180 --> 00:11:18.760
+and the linters
+
+00:11:18.760 --> 00:11:20.460
+and not for runtime, right?
+
+00:11:20.660 --> 00:11:21.280
+Yeah, that's right.
+
+00:11:21.580 --> 00:11:23.440
+There's many exceptions to that.
+
+00:11:23.660 --> 00:11:25.160
+There's a product like mypyc,
+
+00:11:25.380 --> 00:11:26.460
+which comes with mypy
+
+00:11:26.460 --> 00:11:27.580
+that uses those types
+
+00:11:27.580 --> 00:11:28.660
+to compile your code
+
+00:11:28.660 --> 00:11:30.600
+into more efficient machine code.
+
+00:11:30.920 --> 00:11:31.520
+Maybe there's going to be
+
+00:11:31.520 --> 00:11:32.240
+more products like that
+
+00:11:32.240 --> 00:11:32.660
+in the future.
+
+00:11:32.760 --> 00:11:33.220
+I don't know.
+
+00:11:33.540 --> 00:11:34.220
+But yes, in general,
+
+00:11:34.400 --> 00:11:35.980
+it's separate from the runtime.
+
+00:11:36.520 --> 00:11:37.060
+Sort of a similar
+
+00:11:37.060 --> 00:11:38.380
+model to TypeScript
+
+00:11:38.380 --> 00:11:39.820
+where TypeScript
+
+00:11:39.820 --> 00:11:40.940
+gets compiled into JavaScript
+
+00:11:40.940 --> 00:11:42.020
+and types just go away.
+
+00:11:42.620 --> 00:11:43.260
+Here, we don't do
+
+00:11:43.260 --> 00:11:44.060
+a compilation step,
+
+00:11:44.180 --> 00:11:45.120
+but still the same idea
+
+00:11:45.120 --> 00:11:45.720
+of the types
+
+00:11:45.720 --> 00:11:47.180
+just not influencing the runtime.
+
+00:11:47.420 --> 00:11:48.300
+Although we do make them
+
+00:11:48.300 --> 00:11:49.700
+available for introspection
+
+00:11:49.700 --> 00:11:51.500
+via dunder annotations attributes,
+
+00:11:51.500 --> 00:11:52.740
+which is what has enabled
+
+00:11:52.740 --> 00:11:54.420
+the projects like Pydantic
+
+00:11:54.420 --> 00:11:55.940
+and other sort of
+
+00:11:55.940 --> 00:11:56.840
+runtime checkers
+
+00:11:56.840 --> 00:11:58.660
+to make use of type annotations
+
+00:11:58.660 --> 00:11:59.400
+at runtime also.
+
+00:11:59.540 --> 00:12:00.000
+Yeah, I don't know
+
+00:12:00.000 --> 00:12:01.060
+if the typing council
+
+00:12:01.060 --> 00:12:02.020
+was around for this,
+
+00:12:02.120 --> 00:12:03.340
+but there was proposed,
+
+00:12:03.480 --> 00:12:03.780
+I don't remember
+
+00:12:03.780 --> 00:12:04.520
+the exact details,
+
+00:12:04.620 --> 00:12:05.660
+but something to the effect
+
+00:12:05.660 --> 00:12:07.420
+of for type checking,
+
+00:12:07.640 --> 00:12:08.760
+not actually doing
+
+00:12:08.760 --> 00:12:11.040
+some of the full imports
+
+00:12:11.040 --> 00:12:12.840
+or something along those lines,
+
+00:12:13.200 --> 00:12:13.420
+right,
+
+00:12:13.460 --> 00:12:15.080
+where the runtime behavior
+
+00:12:15.080 --> 00:12:15.960
+would have made it hard
+
+00:12:15.960 --> 00:12:17.420
+for tools like Pydantic
+
+00:12:17.420 --> 00:12:18.940
+and others to get that.
+
+00:12:19.300 --> 00:12:20.000
+And there was
+
+00:12:20.000 --> 00:12:21.040
+some kind of compromise,
+
+00:12:21.180 --> 00:12:21.280
+right?
+
+00:12:21.280 --> 00:12:22.600
+I don't remember the details here.
+
+00:12:22.720 --> 00:12:23.140
+Anyone does?
+
+00:12:23.260 --> 00:12:23.960
+Yeah, what happened was
+
+00:12:23.960 --> 00:12:24.540
+that there was going
+
+00:12:24.540 --> 00:12:25.180
+to be a change.
+
+00:12:25.600 --> 00:12:26.240
+That's what the
+
+00:12:26.240 --> 00:12:26.980
+from future import
+
+00:12:26.980 --> 00:12:28.040
+annotations import does,
+
+00:12:28.100 --> 00:12:29.580
+that changes all annotations
+
+00:12:29.580 --> 00:12:30.600
+into raw strings.
+
+00:12:30.600 --> 00:12:32.540
+So the default behavior
+
+00:12:32.540 --> 00:12:34.260
+before recently
+
+00:12:34.260 --> 00:12:35.400
+was that annotations
+
+00:12:35.400 --> 00:12:36.500
+are regular code.
+
+00:12:36.660 --> 00:12:37.760
+If you write def f
+
+00:12:37.760 --> 00:12:38.700
+returns int
+
+00:12:38.700 --> 00:12:39.800
+and you import the module,
+
+00:12:39.900 --> 00:12:40.720
+it just looks up
+
+00:12:40.720 --> 00:12:41.400
+the name int
+
+00:12:41.400 --> 00:12:41.920
+and puts that
+
+00:12:41.920 --> 00:12:43.000
+in an annotations dictionary,
+
+00:12:43.420 --> 00:12:44.780
+which makes introspection easy,
+
+00:12:44.940 --> 00:12:45.680
+but it came
+
+00:12:45.680 --> 00:12:47.100
+at a cost
+
+00:12:47.100 --> 00:12:48.400
+in performance
+
+00:12:48.400 --> 00:12:49.520
+because memory usage
+
+00:12:49.520 --> 00:12:50.260
+sometimes was high
+
+00:12:50.260 --> 00:12:51.580
+and also made things
+
+00:12:51.580 --> 00:12:52.720
+harder to use sometimes
+
+00:12:52.720 --> 00:12:54.340
+because if you use a name
+
+00:12:54.340 --> 00:12:55.420
+that's not defined yet
+
+00:12:55.420 --> 00:12:56.020
+at runtime,
+
+00:12:56.540 --> 00:12:57.120
+you get an error.
+
+00:12:57.120 --> 00:12:58.160
+That often comes up
+
+00:12:58.160 --> 00:12:59.700
+if you have like a class
+
+00:12:59.700 --> 00:13:00.940
+that has a reference
+
+00:13:00.940 --> 00:13:01.760
+in an annotation
+
+00:13:01.760 --> 00:13:02.820
+to the class itself
+
+00:13:02.820 --> 00:13:03.840
+or circular
+
+00:13:03.840 --> 00:13:05.420
+dependency classes.
+
+00:13:05.880 --> 00:13:05.980
+Right.
+ +00:13:06.080 --> 00:13:07.120 +The circular imports + +00:13:07.120 --> 00:13:08.640 +because you want to say + +00:13:08.640 --> 00:13:09.720 +this class + +00:13:09.720 --> 00:13:11.520 +is created by that thing + +00:13:11.520 --> 00:13:12.760 +and it returns one, + +00:13:12.840 --> 00:13:13.360 +but you know, + +00:13:13.420 --> 00:13:15.020 +somehow you've got + +00:13:15.020 --> 00:13:15.740 +to import the other one + +00:13:15.740 --> 00:13:17.040 +and that's such a hassle. + +00:13:17.420 --> 00:13:17.860 +Yeah, it's, + +00:13:18.220 --> 00:13:18.780 +yeah, even out + +00:13:18.780 --> 00:13:19.900 +in the audience we have, + +00:13:20.280 --> 00:13:20.740 +Tom says, + +00:13:20.820 --> 00:13:21.500 +circular imports. + +00:13:21.620 --> 00:13:22.480 +Oh, yeah, for sure. + +00:13:22.800 --> 00:13:24.260 +What about lazy imports? + +00:13:24.260 --> 00:13:25.420 +Like that just recently + +00:13:25.420 --> 00:13:26.040 +got accepted + +00:13:26.040 --> 00:13:27.000 +and will be in 3.15. + +00:13:27.620 --> 00:13:29.320 +Which I'm super excited about + +00:13:29.320 --> 00:13:29.980 +because I think + +00:13:29.980 --> 00:13:31.140 +it'll make app startup + +00:13:31.140 --> 00:13:32.100 +a lot faster + +00:13:32.100 --> 00:13:33.980 +for many use cases. + +00:13:34.400 --> 00:13:35.280 +But does that have + +00:13:35.280 --> 00:13:35.980 +knock-on effects + +00:13:35.980 --> 00:13:36.480 +for typing? + +00:13:36.800 --> 00:13:37.860 +Not that directly + +00:13:37.860 --> 00:13:39.040 +because I think + +00:13:39.040 --> 00:13:39.740 +for a type checker + +00:13:39.740 --> 00:13:40.380 +lazy imports + +00:13:40.380 --> 00:13:41.160 +mostly just look + +00:13:41.160 --> 00:13:42.320 +like regular imports. 
+
+00:13:42.680 --> 00:13:43.180
+I guess I should
+
+00:13:43.180 --> 00:13:43.920
+maybe leave that
+
+00:13:43.920 --> 00:13:44.320
+for the people
+
+00:13:44.320 --> 00:13:44.740
+who are actually
+
+00:13:44.740 --> 00:13:46.040
+working on type checkers
+
+00:13:46.040 --> 00:13:47.060
+that are being written right now.
+
+00:13:47.160 --> 00:13:47.660
+Yeah, Rebecca,
+
+00:13:47.880 --> 00:13:48.560
+do you see this
+
+00:13:48.560 --> 00:13:49.600
+making any difference
+
+00:13:49.600 --> 00:13:50.040
+for you?
+
+00:13:50.540 --> 00:13:51.060
+Lazy imports?
+
+00:13:51.060 --> 00:13:51.920
+To be honest,
+
+00:13:52.080 --> 00:13:53.480
+it's not something
+
+00:13:53.480 --> 00:13:54.860
+we've looked at
+
+00:13:54.860 --> 00:13:55.760
+too carefully yet.
+
+00:13:55.760 --> 00:13:56.760
+3.15
+
+00:13:56.760 --> 00:13:57.740
+seems a little
+
+00:13:57.740 --> 00:13:59.080
+more in the future,
+
+00:13:59.380 --> 00:14:01.460
+but I don't think
+
+00:14:01.460 --> 00:14:02.220
+it's likely
+
+00:14:02.220 --> 00:14:03.680
+to make a huge difference.
+
+00:14:03.960 --> 00:14:04.180
+Carl?
+
+00:14:04.340 --> 00:14:05.300
+I've thought about it briefly
+
+00:14:05.300 --> 00:14:06.260
+and I think that it,
+
+00:14:06.560 --> 00:14:07.400
+I think the type checkers
+
+00:14:07.400 --> 00:14:08.540
+really won't need to care.
+
+00:14:08.920 --> 00:14:09.520
+Maybe there will be
+
+00:14:09.520 --> 00:14:10.140
+some edge cases
+
+00:14:10.140 --> 00:14:10.660
+that will come up
+
+00:14:10.660 --> 00:14:11.420
+that I haven't thought of,
+
+00:14:11.500 --> 00:14:12.500
+but it shouldn't be a big deal.
+
+00:14:12.620 --> 00:14:13.380
+Yeah, that's what I thought
+
+00:14:13.380 --> 00:14:13.920
+as well.
+ +00:14:14.100 --> 00:14:15.660 +The one variation + +00:14:15.660 --> 00:14:17.080 +that I can certainly see + +00:14:17.080 --> 00:14:18.620 +is if you have a, + +00:14:19.040 --> 00:14:20.260 +if you have something + +00:14:20.260 --> 00:14:21.600 +specified in a type, + +00:14:21.920 --> 00:14:23.580 +like say for a field + +00:14:23.580 --> 00:14:24.260 +of a class + +00:14:24.260 --> 00:14:25.260 +or a Pydantic model + +00:14:25.260 --> 00:14:25.640 +or something + +00:14:25.640 --> 00:14:26.800 +that would otherwise + +00:14:26.800 --> 00:14:27.840 +not trigger + +00:14:27.840 --> 00:14:28.780 +the lazy import + +00:14:28.780 --> 00:14:30.100 +to become imported, + +00:14:30.460 --> 00:14:31.180 +would potentially + +00:14:31.180 --> 00:14:33.220 +having types specify + +00:14:33.220 --> 00:14:34.720 +cause more importing + +00:14:34.720 --> 00:14:36.000 +to happen sooner + +00:14:36.000 --> 00:14:36.640 +in the runtime? + +00:14:36.920 --> 00:14:37.520 +Yeah, there's actually + +00:14:37.520 --> 00:14:38.460 +an issue related to this + +00:14:38.460 --> 00:14:39.980 +that I think we may need + +00:14:39.980 --> 00:14:41.540 +to resolve before 3.15, + +00:14:41.940 --> 00:14:42.840 +but I don't know how yet. + +00:14:43.200 --> 00:14:44.120 +If you use a type + +00:14:44.120 --> 00:14:45.120 +in a data class annotation + +00:14:45.120 --> 00:14:46.260 +that's lazy imported, + +00:14:46.580 --> 00:14:47.220 +actually creating + +00:14:47.220 --> 00:14:47.920 +a data class + +00:14:47.920 --> 00:14:48.820 +will delay + +00:14:48.820 --> 00:14:49.520 +by the import. + +00:14:49.520 --> 00:14:50.660 +It will try to + +00:14:50.660 --> 00:14:52.760 +resolve the import + +00:14:52.760 --> 00:14:54.140 +and actually make it + +00:14:54.140 --> 00:14:54.520 +not lazy. 
+
+00:14:55.000 --> 00:14:55.700
+This is because
+
+00:14:55.700 --> 00:14:56.640
+data classes
+
+00:14:56.640 --> 00:14:57.760
+doesn't really need
+
+00:14:57.760 --> 00:14:58.280
+to look at all
+
+00:14:58.280 --> 00:14:58.960
+of the annotations
+
+00:14:58.960 --> 00:14:59.620
+in your class,
+
+00:14:59.700 --> 00:15:00.300
+but it looks at them
+
+00:15:00.300 --> 00:15:01.840
+enough to trigger
+
+00:15:01.840 --> 00:15:03.860
+reification of the import.
+
+00:15:04.120 --> 00:15:04.740
+I shared this
+
+00:15:04.740 --> 00:15:05.220
+with some of the people
+
+00:15:05.220 --> 00:15:06.160
+on the lazy imports team,
+
+00:15:06.260 --> 00:15:07.460
+but we haven't
+
+00:15:07.460 --> 00:15:08.200
+yet come up
+
+00:15:08.200 --> 00:15:09.120
+with a good way around it.
+
+00:15:09.240 --> 00:15:09.980
+I think this might
+
+00:15:09.980 --> 00:15:10.980
+end up being
+
+00:15:10.980 --> 00:15:11.740
+a bit of a footgun,
+
+00:15:11.820 --> 00:15:12.260
+so I feel like
+
+00:15:12.260 --> 00:15:12.800
+we should ideally
+
+00:15:12.800 --> 00:15:13.880
+find a workaround,
+
+00:15:14.060 --> 00:15:14.720
+but I don't know
+
+00:15:14.720 --> 00:15:15.420
+what it would be yet.
+
+00:15:15.580 --> 00:15:15.920
+I don't know
+
+00:15:15.920 --> 00:15:16.680
+that it's wrong
+
+00:15:16.680 --> 00:15:17.900
+that it converts it
+
+00:15:17.900 --> 00:15:18.820
+to an eager import,
+
+00:15:18.820 --> 00:15:20.000
+which it needs
+
+00:15:20.000 --> 00:15:20.740
+to know what it is
+
+00:15:20.740 --> 00:15:21.200
+potentially,
+
+00:15:21.500 --> 00:15:21.820
+right?
+
+00:15:22.560 --> 00:15:23.660
+It actually doesn't.
+
+00:15:24.000 --> 00:15:24.420
+Data classes
+
+00:15:24.420 --> 00:15:25.240
+just need to know
+
+00:15:25.240 --> 00:15:26.080
+whether it is ClassVar
+
+00:15:26.080 --> 00:15:26.600
+or not.
+
+00:15:27.400 --> 00:15:27.960
+I think that's
+
+00:15:27.960 --> 00:15:28.520
+pretty much all.
+
+00:15:28.660 --> 00:15:28.960
+I guess there's
+
+00:15:28.960 --> 00:15:29.640
+an InitVar also,
+
+00:15:29.780 --> 00:15:30.760
+but it doesn't really
+
+00:15:30.760 --> 00:15:31.080
+need to know
+
+00:15:31.080 --> 00:15:31.800
+anything else.
+
+00:15:32.320 --> 00:15:33.220
+So in theory,
+
+00:15:33.280 --> 00:15:33.900
+it should be possible
+
+00:15:33.900 --> 00:15:34.640
+to just say,
+
+00:15:34.800 --> 00:15:35.700
+hey, it is not ClassVar,
+
+00:15:35.700 --> 00:15:37.240
+so don't bother
+
+00:15:37.240 --> 00:15:37.860
+importing it.
+
+00:15:38.100 --> 00:15:39.000
+Okay, so that's
+
+00:15:39.000 --> 00:15:39.660
+for data classes,
+
+00:15:39.820 --> 00:15:41.320
+but say if I specify
+
+00:15:41.320 --> 00:15:42.400
+a parameter type
+
+00:15:42.400 --> 00:15:43.180
+on a function.
+
+00:15:43.420 --> 00:15:43.880
+Yeah, then it
+
+00:15:43.880 --> 00:15:44.420
+should be fine.
+
+00:15:45.100 --> 00:15:46.280
+I guess, again,
+
+00:15:46.380 --> 00:15:47.540
+unless something is,
+
+00:15:47.960 --> 00:15:48.800
+if it does annotate,
+
+00:15:48.820 --> 00:15:49.580
+so if you have
+
+00:15:49.580 --> 00:15:49.920
+something like
+
+00:15:49.920 --> 00:15:50.480
+a decorator
+
+00:15:50.480 --> 00:15:50.960
+that looks at
+
+00:15:50.960 --> 00:15:51.880
+annotations in your
+
+00:15:51.880 --> 00:15:52.320
+function,
+
+00:15:52.660 --> 00:15:53.660
+it might reify
+
+00:15:53.660 --> 00:15:54.220
+those imports.
+
+00:15:54.540 --> 00:15:55.120
+There is one other
+
+00:15:55.120 --> 00:15:56.160
+potentially interesting
+
+00:15:56.160 --> 00:15:57.200
+thing for type checkers.
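The ClassVar case being discussed can be seen with today's dataclasses; a small sketch, with made-up field names:

```python
from dataclasses import dataclass, fields
from typing import ClassVar


@dataclass
class Config:
    retries: int = 3
    # dataclasses inspects annotations just enough to spot ClassVar:
    # such attributes are excluded from the generated __init__ and fields()
    registry: ClassVar[dict] = {}


# only the instance field survives; the ClassVar is filtered out
assert [f.name for f in fields(Config)] == ["retries"]
assert Config(retries=5).retries == 5
```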
+ +00:15:57.440 --> 00:15:58.040 +It's already + +00:15:58.040 --> 00:15:59.880 +difficult for type checkers + +00:15:59.880 --> 00:16:00.620 +to figure out + +00:16:00.620 --> 00:16:02.480 +when like a submodule + +00:16:02.480 --> 00:16:03.380 +should be considered + +00:16:03.380 --> 00:16:04.380 +to be an attribute + +00:16:04.380 --> 00:16:05.340 +of the parent module + +00:16:05.340 --> 00:16:06.380 +because the way + +00:16:06.380 --> 00:16:07.260 +this happens in Python + +00:16:07.260 --> 00:16:08.440 +is that any import + +00:16:08.440 --> 00:16:09.440 +of a submodule + +00:16:09.440 --> 00:16:10.420 +anywhere will attach + +00:16:10.420 --> 00:16:11.360 +that submodule + +00:16:11.360 --> 00:16:12.280 +as an attribute + +00:16:12.280 --> 00:16:13.220 +on the parent module, + +00:16:13.540 --> 00:16:15.100 +but that at runtime, + +00:16:15.560 --> 00:16:16.560 +that could literally + +00:16:16.560 --> 00:16:17.220 +happen anywhere. + +00:16:17.220 --> 00:16:17.780 +It could happen + +00:16:17.780 --> 00:16:18.540 +in totally unrelated + +00:16:18.540 --> 00:16:19.860 +code outside of the module + +00:16:19.860 --> 00:16:20.680 +and a type checker + +00:16:20.680 --> 00:16:21.380 +probably won't be able + +00:16:21.380 --> 00:16:21.920 +to see that. 
+
+00:16:22.220 --> 00:16:22.820
+So type checkers
+
+00:16:22.820 --> 00:16:23.740
+already have sort of
+
+00:16:23.740 --> 00:16:24.800
+complex sets of rules
+
+00:16:24.800 --> 00:16:26.080
+around where they look
+
+00:16:26.080 --> 00:16:27.020
+for these submodule
+
+00:16:27.020 --> 00:16:27.920
+imports and when they
+
+00:16:27.920 --> 00:16:28.980
+consider a submodule
+
+00:16:28.980 --> 00:16:30.600
+import to be reliably
+
+00:16:30.600 --> 00:16:31.740
+happening enough
+
+00:16:31.740 --> 00:16:32.360
+that it should,
+
+00:16:32.480 --> 00:16:33.160
+that the type checker
+
+00:16:33.160 --> 00:16:34.800
+should consider
+
+00:16:34.800 --> 00:16:35.620
+this submodule
+
+00:16:35.620 --> 00:16:36.900
+to exist as an attribute.
+
+00:16:38.080 --> 00:16:38.740
+And lazy imports
+
+00:16:38.740 --> 00:16:40.200
+may make that even,
+
+00:16:40.660 --> 00:16:41.500
+we'll add one more
+
+00:16:41.500 --> 00:16:41.820
+wrinkle
+
+00:16:41.820 --> 00:16:43.140
+to those
+
+00:16:43.140 --> 00:16:44.080
+sets of heuristics
+
+00:16:44.080 --> 00:16:44.820
+in that we'll have
+
+00:16:44.820 --> 00:16:45.300
+to decide
+
+00:16:45.300 --> 00:16:46.060
+if you have
+
+00:16:46.060 --> 00:16:46.760
+a lazy import
+
+00:16:46.760 --> 00:16:47.540
+of a submodule
+
+00:16:47.540 --> 00:16:47.940
+in your
+
+00:16:47.940 --> 00:16:48.720
+__init__.py,
+
+00:16:48.860 --> 00:16:49.900
+it's lazy.
+
+00:16:50.420 --> 00:16:51.480
+So should the type checker
+
+00:16:51.480 --> 00:16:53.180
+consider that submodule
+
+00:16:53.180 --> 00:16:54.520
+to be imported
+
+00:16:54.520 --> 00:16:55.400
+or not be imported?
+
+00:16:55.860 --> 00:16:57.000
+It'll be another case
+
+00:16:57.000 --> 00:16:57.500
+where there's no
+
+00:16:57.500 --> 00:16:58.320
+clear right answer
+
+00:16:58.320 --> 00:16:58.920
+and we'll just have
+
+00:16:58.920 --> 00:17:00.020
+to make a decision
+
+00:17:00.020 --> 00:17:00.700
+one way or the other.
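The submodule behavior described here can be demonstrated with a throwaway package built on the fly; the `demo_pkg` and `sub` names are invented for the demo:

```python
import pathlib
import sys
import tempfile

# Build a tiny package on disk: demo_pkg with one submodule, demo_pkg.sub.
tmp = pathlib.Path(tempfile.mkdtemp())
pkg = tmp / "demo_pkg"
pkg.mkdir()
(pkg / "__init__.py").write_text("")
(pkg / "sub.py").write_text("VALUE = 42\n")
sys.path.insert(0, str(tmp))

import demo_pkg

# Before any import of the submodule, it is not an attribute of the parent.
assert not hasattr(demo_pkg, "sub")

import demo_pkg.sub  # an import anywhere attaches sub onto demo_pkg

assert demo_pkg.sub.VALUE == 42  # now visible as an attribute of the parent
```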
+ +00:17:02.700 --> 00:17:03.200 +This portion of + +00:17:03.200 --> 00:17:04.060 +Talk Python is brought + +00:17:04.060 --> 00:17:04.840 +to you by Sentry. + +00:17:05.060 --> 00:17:05.980 +I've been using + +00:17:05.980 --> 00:17:06.920 +Sentry personally + +00:17:06.920 --> 00:17:07.920 +on almost every + +00:17:07.920 --> 00:17:09.180 +application and API + +00:17:09.180 --> 00:17:10.000 +that I've built + +00:17:10.000 --> 00:17:10.860 +for Talk Python + +00:17:10.860 --> 00:17:11.420 +and beyond + +00:17:11.420 --> 00:17:13.040 +over the last few years. + +00:17:13.040 --> 00:17:13.900 +They're a core + +00:17:13.900 --> 00:17:14.540 +building block + +00:17:14.540 --> 00:17:14.980 +for keeping + +00:17:14.980 --> 00:17:15.720 +my infrastructure + +00:17:15.720 --> 00:17:16.260 +solid. + +00:17:16.780 --> 00:17:17.300 +They should be + +00:17:17.300 --> 00:17:18.140 +for yours as well. + +00:17:18.300 --> 00:17:18.780 +Here's why. + +00:17:19.420 --> 00:17:20.020 +Sentry doesn't + +00:17:20.020 --> 00:17:20.880 +just catch errors. + +00:17:21.020 --> 00:17:21.900 +It catches all + +00:17:21.900 --> 00:17:22.300 +the stuff + +00:17:22.300 --> 00:17:22.820 +that makes your + +00:17:22.820 --> 00:17:23.700 +app feel broken. + +00:17:23.940 --> 00:17:24.860 +The random slowdown, + +00:17:25.080 --> 00:17:25.680 +the freeze you + +00:17:25.680 --> 00:17:26.320 +can't reproduce, + +00:17:26.840 --> 00:17:27.500 +that bug that + +00:17:27.500 --> 00:17:28.100 +only shows up + +00:17:28.100 --> 00:17:29.080 +once real users + +00:17:29.080 --> 00:17:29.440 +hit it. + +00:17:29.700 --> 00:17:30.220 +And when something + +00:17:30.220 --> 00:17:30.700 +goes wrong, + +00:17:30.900 --> 00:17:31.480 +Sentry gives you + +00:17:31.480 --> 00:17:32.140 +the whole chain + +00:17:32.140 --> 00:17:32.600 +of events + +00:17:32.600 --> 00:17:33.260 +in one place. 
+
+00:17:33.420 --> 00:17:33.740
+Errors,
+
+00:17:33.920 --> 00:17:34.200
+traces,
+
+00:17:34.460 --> 00:17:34.940
+replays,
+
+00:17:35.000 --> 00:17:35.320
+logs,
+
+00:17:35.740 --> 00:17:36.460
+dots connected.
+
+00:17:36.800 --> 00:17:37.360
+You can see
+
+00:17:37.360 --> 00:17:38.320
+what's led to
+
+00:17:38.320 --> 00:17:38.720
+the issue
+
+00:17:38.720 --> 00:17:39.320
+without digging
+
+00:17:39.320 --> 00:17:39.880
+through five
+
+00:17:39.880 --> 00:17:40.660
+different dashboards.
+
+00:17:41.400 --> 00:17:41.660
+Seer,
+
+00:17:42.000 --> 00:17:42.740
+Sentry's AI
+
+00:17:42.740 --> 00:17:43.520
+debugging agent,
+
+00:17:43.520 --> 00:17:44.480
+builds on this
+
+00:17:44.480 --> 00:17:44.700
+data,
+
+00:17:44.820 --> 00:17:45.420
+taking the full
+
+00:17:45.420 --> 00:17:46.020
+context,
+
+00:17:46.640 --> 00:17:47.580
+explaining why
+
+00:17:47.580 --> 00:17:48.120
+the issue
+
+00:17:48.120 --> 00:17:48.600
+happened,
+
+00:17:49.140 --> 00:17:49.740
+pointing to
+
+00:17:49.740 --> 00:17:50.180
+the code
+
+00:17:50.180 --> 00:17:50.720
+responsible,
+
+00:17:50.940 --> 00:17:51.560
+drafts a fix,
+
+00:17:51.640 --> 00:17:52.320
+and even flags
+
+00:17:52.320 --> 00:17:52.960
+if your PR
+
+00:17:52.960 --> 00:17:53.620
+is about to
+
+00:17:53.620 --> 00:17:54.340
+introduce a new
+
+00:17:54.340 --> 00:17:54.640
+problem.
+
+00:17:55.020 --> 00:17:55.880
+The workflow
+
+00:17:55.880 --> 00:17:56.520
+stays simple.
+
+00:17:56.820 --> 00:17:57.460
+Something breaks,
+
+00:17:57.920 --> 00:17:58.680
+Sentry alerts you,
+
+00:17:58.840 --> 00:17:59.360
+the dashboard
+
+00:17:59.360 --> 00:17:59.860
+shows you
+
+00:17:59.860 --> 00:18:00.680
+the full context,
+
+00:18:01.020 --> 00:18:01.680
+Seer helps you
+
+00:18:01.680 --> 00:18:02.400
+fix it and
+
+00:18:02.400 --> 00:18:03.380
+catch new issues
+
+00:18:03.380 --> 00:18:04.100
+before they ship.
+ +00:18:04.760 --> 00:18:05.400 +It's totally + +00:18:05.400 --> 00:18:06.140 +reasonable to go + +00:18:06.140 --> 00:18:07.160 +from an error + +00:18:07.160 --> 00:18:07.980 +occurred to + +00:18:07.980 --> 00:18:08.880 +fixed in production + +00:18:08.880 --> 00:18:10.000 +in just 10 minutes. + +00:18:10.000 --> 00:18:11.360 +I truly + +00:18:11.360 --> 00:18:11.980 +appreciate the + +00:18:11.980 --> 00:18:12.460 +support that + +00:18:12.460 --> 00:18:12.920 +Sentry has + +00:18:12.920 --> 00:18:13.980 +given me to + +00:18:13.980 --> 00:18:14.720 +help solve my + +00:18:14.720 --> 00:18:15.560 +bugs and issues + +00:18:15.560 --> 00:18:16.340 +in my apps, + +00:18:16.880 --> 00:18:17.620 +especially those + +00:18:17.620 --> 00:18:18.460 +tricky ones that + +00:18:18.460 --> 00:18:19.180 +only appear in + +00:18:19.180 --> 00:18:19.500 +production. + +00:18:19.840 --> 00:18:20.420 +I know you will + +00:18:20.420 --> 00:18:21.020 +too if you try + +00:18:21.020 --> 00:18:21.360 +them out. + +00:18:21.640 --> 00:18:22.520 +So get started + +00:18:22.520 --> 00:18:23.020 +today with + +00:18:23.020 --> 00:18:23.280 +Sentry. + +00:18:23.460 --> 00:18:23.940 +Just visit + +00:18:23.940 --> 00:18:24.940 +talkpython.fm + +00:18:24.940 --> 00:18:25.860 +slash Sentry + +00:18:25.860 --> 00:18:27.080 +and get $100 + +00:18:27.080 --> 00:18:28.560 +in Sentry credits. + +00:18:28.960 --> 00:18:29.520 +Please use that + +00:18:29.520 --> 00:18:29.720 +link. + +00:18:29.780 --> 00:18:30.100 +It's in your + +00:18:30.100 --> 00:18:30.640 +podcast player + +00:18:30.640 --> 00:18:31.080 +show notes. 
+
+00:18:31.280 --> 00:18:31.500
+If you're
+
+00:18:31.500 --> 00:18:31.920
+signing up
+
+00:18:31.920 --> 00:18:32.360
+some other
+
+00:18:32.360 --> 00:18:32.620
+way,
+
+00:18:32.740 --> 00:18:33.420
+you can use
+
+00:18:33.420 --> 00:18:33.880
+our code
+
+00:18:33.880 --> 00:18:35.220
+talkpython26,
+
+00:18:35.780 --> 00:18:36.520
+all one word,
+
+00:18:36.660 --> 00:18:37.620
+talkpython26,
+
+00:18:37.620 --> 00:18:38.880
+to get $100
+
+00:18:38.880 --> 00:18:39.720
+in credits.
+
+00:18:40.380 --> 00:18:40.900
+Thank you to
+
+00:18:40.900 --> 00:18:41.460
+Sentry for
+
+00:18:41.460 --> 00:18:41.940
+supporting the
+
+00:18:41.940 --> 00:18:42.120
+show.
+
+00:18:43.160 --> 00:18:43.920
+Yeah, there's
+
+00:18:43.920 --> 00:18:44.540
+some variations
+
+00:18:44.540 --> 00:18:45.280
+across type
+
+00:18:45.280 --> 00:18:45.840
+checkers, which
+
+00:18:45.840 --> 00:18:46.540
+we'll get to
+
+00:18:46.540 --> 00:18:47.200
+later.
+
+00:18:47.600 --> 00:18:48.020
+I think,
+
+00:18:48.140 --> 00:18:48.700
+though, before
+
+00:18:48.700 --> 00:18:49.440
+we move off
+
+00:18:49.440 --> 00:18:50.420
+this, going
+
+00:18:50.420 --> 00:18:51.960
+back to
+
+00:18:51.960 --> 00:18:52.740
+introducing the
+
+00:18:52.740 --> 00:18:53.400
+typing council,
+
+00:18:53.560 --> 00:18:54.220
+I think we
+
+00:18:54.220 --> 00:18:54.680
+should point out
+
+00:18:54.680 --> 00:18:55.720
+that there's two
+
+00:18:55.720 --> 00:18:56.380
+other folks who
+
+00:18:56.380 --> 00:18:56.980
+couldn't be here
+
+00:18:56.980 --> 00:18:58.240
+who are also on
+
+00:18:58.240 --> 00:18:58.760
+the typing
+
+00:18:58.760 --> 00:18:59.560
+council, Eric
+
+00:18:59.560 --> 00:19:00.280
+Traut and
+
+00:19:00.280 --> 00:19:02.000
+Jukka Lehtosalo.
+
+00:19:02.380 --> 00:19:03.180
+Lehtosalo?
+
+00:19:03.480 --> 00:19:03.940
+Sorry,
+
+00:19:04.100 --> 00:19:04.340
+Jukka.
+ +00:19:05.480 --> 00:19:06.280 +But I want to + +00:19:06.280 --> 00:19:06.660 +make sure that + +00:19:06.660 --> 00:19:07.440 +we point out + +00:19:07.440 --> 00:19:07.780 +there's actually + +00:19:07.780 --> 00:19:08.320 +five people, + +00:19:08.420 --> 00:19:08.880 +not just the + +00:19:08.880 --> 00:19:09.300 +three of you, + +00:19:09.320 --> 00:19:09.460 +right? + +00:19:09.840 --> 00:19:10.180 +How do you + +00:19:10.180 --> 00:19:10.860 +get on the + +00:19:10.860 --> 00:19:11.440 +council? + +00:19:11.820 --> 00:19:12.220 +Is there an + +00:19:12.220 --> 00:19:12.600 +election? + +00:19:13.020 --> 00:19:13.620 +Do you just + +00:19:13.620 --> 00:19:14.240 +apply? + +00:19:14.980 --> 00:19:15.300 +I think these + +00:19:15.300 --> 00:19:15.760 +are filled by + +00:19:15.760 --> 00:19:16.100 +the members + +00:19:16.100 --> 00:19:16.460 +themselves. + +00:19:16.740 --> 00:19:17.080 +So when + +00:19:17.080 --> 00:19:17.960 +somebody declares + +00:19:17.960 --> 00:19:18.400 +the intention + +00:19:18.400 --> 00:19:19.120 +to leave the + +00:19:19.120 --> 00:19:20.000 +council, we + +00:19:20.000 --> 00:19:20.880 +basically ask for + +00:19:20.880 --> 00:19:21.260 +people who are + +00:19:21.260 --> 00:19:22.080 +interested and + +00:19:22.080 --> 00:19:22.640 +then make a + +00:19:22.640 --> 00:19:23.000 +selection. + +00:19:23.440 --> 00:19:23.860 +Generally, we + +00:19:23.860 --> 00:19:24.640 +try to get + +00:19:24.640 --> 00:19:25.420 +people who have + +00:19:25.420 --> 00:19:26.120 +experience in the + +00:19:26.120 --> 00:19:26.680 +type system. + +00:19:27.140 --> 00:19:27.700 +We try to get a + +00:19:27.700 --> 00:19:28.120 +good cross + +00:19:28.120 --> 00:19:28.780 +representation of + +00:19:28.780 --> 00:19:29.200 +people working + +00:19:29.200 --> 00:19:29.720 +on different + +00:19:29.720 --> 00:19:30.420 +type checkers. 
+
+00:19:30.700 --> 00:19:32.020
+We have Carl
+
+00:19:32.020 --> 00:19:32.620
+and Rebecca
+
+00:19:32.620 --> 00:19:33.320
+here who work
+
+00:19:33.320 --> 00:19:34.280
+on two type
+
+00:19:34.280 --> 00:19:34.620
+checkers,
+
+00:19:34.760 --> 00:19:35.160
+ty and
+
+00:19:35.160 --> 00:19:35.720
+Pyrefly.
+
+00:19:36.160 --> 00:19:36.820
+Jukka works
+
+00:19:36.820 --> 00:19:38.800
+on mypy and Eric on Pyright,
+
+00:19:39.040 --> 00:19:39.560
+which are two
+
+00:19:39.560 --> 00:19:39.980
+of the most
+
+00:19:39.980 --> 00:19:40.420
+widely used
+
+00:19:40.420 --> 00:19:41.060
+type checkers.
+
+00:19:41.480 --> 00:19:42.400
+So we try to
+
+00:19:42.400 --> 00:19:43.240
+get representation
+
+00:19:43.240 --> 00:19:44.340
+of people working
+
+00:19:44.340 --> 00:19:45.260
+on those parts
+
+00:19:45.260 --> 00:19:45.840
+of the ecosystem.
+
+00:19:46.100 --> 00:19:46.360
+That's really
+
+00:19:46.360 --> 00:19:47.620
+cool that it's
+
+00:19:47.620 --> 00:19:48.120
+got a bias
+
+00:19:48.120 --> 00:19:48.680
+towards finding
+
+00:19:48.680 --> 00:19:49.340
+people actually
+
+00:19:49.340 --> 00:19:50.300
+doing the work.
+
+00:19:50.700 --> 00:19:51.320
+So let's talk
+
+00:19:51.320 --> 00:19:52.020
+about the
+
+00:19:52.020 --> 00:19:53.380
+specification project
+
+00:19:53.380 --> 00:19:55.320
+at typing.python.org.
+
+00:19:55.880 --> 00:19:56.440
+What is this
+
+00:19:56.440 --> 00:19:56.720
+here?
+
+00:19:56.720 --> 00:19:57.840
+I'll talk a bit
+
+00:19:57.840 --> 00:19:58.320
+about it.
+
+00:19:58.320 --> 00:19:59.560
+I guess it's
+
+00:19:59.560 --> 00:20:00.880
+a specification
+
+00:20:00.880 --> 00:20:01.900
+for how the
+
+00:20:01.900 --> 00:20:02.680
+type system
+
+00:20:02.680 --> 00:20:03.620
+is supposed to work.
+
+00:20:03.760 --> 00:20:04.460
+The way it
+
+00:20:04.460 --> 00:20:05.120
+started was
+
+00:20:05.120 --> 00:20:05.660
+that, Jelle,
+
+00:20:05.740 --> 00:20:06.320
+you basically
+
+00:20:06.320 --> 00:20:07.220
+took all the
+
+00:20:07.220 --> 00:20:07.920
+typing PEPs
+
+00:20:07.920 --> 00:20:08.320
+and like
+
+00:20:08.320 --> 00:20:09.160
+stapled them
+
+00:20:09.160 --> 00:20:09.860
+together,
+
+00:20:09.860 --> 00:20:10.620
+right, to
+
+00:20:10.620 --> 00:20:11.180
+make like
+
+00:20:11.180 --> 00:20:11.920
+one long
+
+00:20:11.920 --> 00:20:12.320
+doc.
+
+00:20:12.460 --> 00:20:13.320
+And since
+
+00:20:13.320 --> 00:20:13.780
+then, we've
+
+00:20:13.780 --> 00:20:14.480
+been iterating
+
+00:20:14.480 --> 00:20:15.080
+on it,
+
+00:20:15.360 --> 00:20:15.800
+filling in
+
+00:20:15.800 --> 00:20:16.280
+parts that
+
+00:20:16.280 --> 00:20:16.800
+were missing
+
+00:20:16.800 --> 00:20:18.160
+like overload
+
+00:20:18.160 --> 00:20:19.660
+evaluation and
+
+00:20:19.660 --> 00:20:20.720
+making other
+
+00:20:20.720 --> 00:20:21.400
+changes as
+
+00:20:21.400 --> 00:20:21.560
+well.
+
+00:20:21.660 --> 00:20:22.200
+Yeah, it's
+
+00:20:22.200 --> 00:20:22.780
+tricky, right?
+
+00:20:22.860 --> 00:20:23.340
+Because
+
+00:20:23.340 --> 00:20:24.800
+traditionally, the
+
+00:20:24.800 --> 00:20:25.960
+typing system
+
+00:20:25.960 --> 00:20:26.420
+is kind of
+
+00:20:26.420 --> 00:20:27.320
+defined across
+
+00:20:27.320 --> 00:20:28.020
+a series of
+
+00:20:28.020 --> 00:20:28.360
+PEPs.
+
+00:20:28.780 --> 00:20:29.440
+And so what
+
+00:20:29.440 --> 00:20:30.000
+is the document
+
+00:20:30.000 --> 00:20:30.480
+that tells you
+
+00:20:30.480 --> 00:20:31.000
+how it works,
+
+00:20:31.060 --> 00:20:31.220
+right?
+
+00:20:31.340 --> 00:20:31.680
+Yeah, that
+
+00:20:31.680 --> 00:20:32.100
+made it hard
+
+00:20:32.100 --> 00:20:32.800
+because often
+
+00:20:32.800 --> 00:20:33.900
+there's PEPs
+
+00:20:33.900 --> 00:20:34.500
+built on top
+
+00:20:34.500 --> 00:20:35.000
+of each other.
+
+00:20:35.320 --> 00:20:36.440
+So then in
+
+00:20:36.440 --> 00:20:36.820
+the extreme,
+
+00:20:37.020 --> 00:20:37.580
+you might see
+
+00:20:37.580 --> 00:20:38.220
+like one thing
+
+00:20:38.220 --> 00:20:39.140
+in one PEP
+
+00:20:39.140 --> 00:20:39.640
+and then there's
+
+00:20:39.640 --> 00:20:40.080
+another PEP
+
+00:20:40.080 --> 00:20:40.480
+that adds
+
+00:20:40.480 --> 00:20:41.120
+an aspect of
+
+00:20:41.120 --> 00:20:41.480
+it, another
+
+00:20:41.480 --> 00:20:41.960
+one that adds
+
+00:20:41.960 --> 00:20:42.700
+another aspect.
+
+00:20:43.060 --> 00:20:43.400
+And overall
+
+00:20:43.400 --> 00:20:43.760
+it makes it
+
+00:20:43.760 --> 00:20:44.280
+very hard to
+
+00:20:44.280 --> 00:20:44.540
+follow.
+
+00:20:44.920 --> 00:20:45.300
+One of the
+
+00:20:45.300 --> 00:20:45.720
+things I did
+
+00:20:45.720 --> 00:20:46.260
+recently was
+
+00:20:46.260 --> 00:20:47.060
+rewrite the
+
+00:20:47.060 --> 00:20:47.740
+TypedDict
+
+00:20:54.800 --> 00:20:55.000
+another.
+
+00:20:55.280 --> 00:20:55.680
+Ended up
+
+00:20:55.680 --> 00:20:56.240
+rewriting the
+
+00:20:56.240 --> 00:20:56.720
+whole thing
+
+00:20:56.720 --> 00:20:57.680
+to basically
+
+00:20:57.680 --> 00:20:58.100
+put all those
+
+00:20:58.100 --> 00:20:58.700
+features together
+
+00:20:58.700 --> 00:20:59.960
+in a coherent
+
+00:20:59.960 --> 00:21:00.720
+whole rather than
+
+00:21:00.720 --> 00:21:02.240
+just having them
+
+00:21:02.240 --> 00:21:03.260
+all copy-pasted
+
+00:21:03.260 --> 00:21:04.080
+one after the
+
+00:21:04.080 --> 00:21:04.260
+other.
+ +00:21:04.400 --> 00:21:05.080 +Okay, so if + +00:21:05.080 --> 00:21:05.580 +somebody really + +00:21:05.580 --> 00:21:06.520 +wants a good + +00:21:06.520 --> 00:21:07.960 +understanding of + +00:21:07.960 --> 00:21:08.480 +the Python + +00:21:08.480 --> 00:21:09.240 +typing system, + +00:21:09.560 --> 00:21:09.960 +they go to + +00:21:09.960 --> 00:21:11.340 +typing.python.org. + +00:21:11.540 --> 00:21:12.020 +You know, one + +00:21:12.020 --> 00:21:12.700 +thing I think + +00:21:12.700 --> 00:21:13.320 +maybe is worth + +00:21:13.320 --> 00:21:13.840 +touching on, + +00:21:13.920 --> 00:21:14.500 +it's just kind + +00:21:14.500 --> 00:21:15.320 +of out of the + +00:21:15.320 --> 00:21:15.960 +blue a bit, + +00:21:16.060 --> 00:21:16.540 +but I think + +00:21:16.540 --> 00:21:17.000 +it's a really + +00:21:17.000 --> 00:21:17.960 +interesting aspect + +00:21:17.960 --> 00:21:19.400 +of the Python + +00:21:19.400 --> 00:21:20.240 +typing system + +00:21:20.240 --> 00:21:21.000 +is the, + +00:21:21.320 --> 00:21:21.560 +what is it + +00:21:21.560 --> 00:21:21.840 +called, the + +00:21:21.840 --> 00:21:22.720 +numerical tower + +00:21:22.720 --> 00:21:23.280 +or the number + +00:21:23.280 --> 00:21:24.340 +tower, where + +00:21:24.340 --> 00:21:25.280 +it's like, if + +00:21:25.280 --> 00:21:25.620 +I have a + +00:21:25.620 --> 00:21:26.400 +number, I + +00:21:26.400 --> 00:21:27.080 +could specify + +00:21:27.080 --> 00:21:27.880 +it as an + +00:21:27.880 --> 00:21:28.800 +int, or I + +00:21:28.800 --> 00:21:29.340 +could specify + +00:21:29.340 --> 00:21:29.920 +it as a + +00:21:29.920 --> 00:21:30.940 +float, and + +00:21:30.940 --> 00:21:31.400 +those kinds + +00:21:31.400 --> 00:21:32.520 +of things, but + +00:21:32.520 --> 00:21:33.420 +do you really + +00:21:33.420 --> 00:21:34.020 +need to say + +00:21:34.020 --> 00:21:34.580 +it's an + +00:21:34.580 --> 00:21:35.860 +int pipe + +00:21:35.860 --> 00:21:36.620 +float, or a + +00:21:36.620 --> 00:21:37.140 
+union of + +00:21:37.140 --> 00:21:37.520 +int and + +00:21:37.520 --> 00:21:38.160 +float, if it + +00:21:38.160 --> 00:21:38.500 +could be + +00:21:38.500 --> 00:21:39.080 +either, right? + +00:21:39.200 --> 00:21:40.080 +And the, what + +00:21:40.080 --> 00:21:40.480 +is it called? + +00:21:40.600 --> 00:21:40.980 +It's the + +00:21:40.980 --> 00:21:41.680 +numerical tower, + +00:21:41.780 --> 00:21:42.020 +right? + +00:21:42.240 --> 00:21:42.560 +Yeah, there + +00:21:42.560 --> 00:21:42.920 +are different + +00:21:42.920 --> 00:21:43.560 +towers too. + +00:21:43.760 --> 00:21:44.540 +In Python, there's + +00:21:44.540 --> 00:21:45.100 +also this thing + +00:21:45.100 --> 00:21:45.680 +called a numbers + +00:21:45.680 --> 00:21:46.920 +module that you + +00:21:46.920 --> 00:21:48.080 +have there, that's + +00:21:48.080 --> 00:21:48.920 +just basically + +00:21:48.920 --> 00:21:49.420 +ignored by the + +00:21:49.420 --> 00:21:49.920 +type system. + +00:21:50.140 --> 00:21:50.520 +It's been + +00:21:50.520 --> 00:21:51.100 +useful for some + +00:21:51.100 --> 00:21:51.660 +people, I feel + +00:21:51.660 --> 00:21:52.060 +like in general + +00:21:52.060 --> 00:21:52.980 +that module just + +00:21:52.980 --> 00:21:53.780 +hasn't worked out + +00:21:53.780 --> 00:21:54.320 +very well as + +00:21:54.320 --> 00:21:54.740 +being very + +00:21:54.740 --> 00:21:55.060 +useful. + +00:21:55.260 --> 00:21:55.840 +I think the + +00:21:55.840 --> 00:21:56.700 +interesting aspect + +00:21:56.700 --> 00:21:58.080 +is that you + +00:21:58.080 --> 00:21:58.560 +know, that you + +00:21:58.560 --> 00:21:59.840 +can say it's a + +00:21:59.840 --> 00:22:00.580 +float, and that's + +00:22:00.580 --> 00:22:01.480 +basically equivalent + +00:22:01.480 --> 00:22:02.860 +to union of + +00:22:02.860 --> 00:22:03.540 +integer and + +00:22:03.540 --> 00:22:04.800 +float, and so + +00:22:04.800 --> 00:22:05.280 +on, right? 
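The special case being discussed here, sketched with a made-up function name: a parameter annotated `float` also accepts an `int` argument, as if it were written `int | float`.

```python
def scale(x: float) -> float:
    # Per the typing spec's numeric special case, type checkers accept
    # an int argument here even though the annotation says float.
    return x * 2.0


assert scale(1.5) == 3.0
assert scale(3) == 6.0  # an int where a float is annotated: allowed
```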
+ +00:22:05.340 --> 00:22:05.880 +I think the + +00:22:05.880 --> 00:22:07.520 +typing numbers in + +00:22:07.520 --> 00:22:08.180 +Python is pretty + +00:22:08.180 --> 00:22:08.540 +interesting. + +00:22:08.760 --> 00:22:09.300 +I think every + +00:22:09.300 --> 00:22:10.740 +type checker has + +00:22:10.740 --> 00:22:11.160 +a different + +00:22:11.160 --> 00:22:12.640 +interpretation of + +00:22:12.640 --> 00:22:13.680 +what a float + +00:22:13.680 --> 00:22:14.600 +annotation actually + +00:22:14.600 --> 00:22:14.960 +means. + +00:22:16.020 --> 00:22:17.100 +It's an area of + +00:22:17.100 --> 00:22:18.160 +some lack of + +00:22:18.160 --> 00:22:18.780 +clarity in the + +00:22:18.780 --> 00:22:19.020 +spec. + +00:22:19.140 --> 00:22:19.660 +Yeah, a lot of + +00:22:19.660 --> 00:22:20.340 +contentiousness. + +00:22:20.480 --> 00:22:21.740 +If we could go + +00:22:21.740 --> 00:22:22.960 +back in time, I + +00:22:22.960 --> 00:22:23.760 +would, like, + +00:22:23.780 --> 00:22:24.380 +knowing what I + +00:22:24.380 --> 00:22:24.940 +know now, I + +00:22:24.940 --> 00:22:25.660 +probably advocate + +00:22:25.660 --> 00:22:26.420 +for things being + +00:22:26.420 --> 00:22:27.520 +done differently + +00:22:27.520 --> 00:22:28.540 +because, like, + +00:22:28.700 --> 00:22:29.820 +beginning, you + +00:22:29.820 --> 00:22:30.440 +know, like, there + +00:22:30.440 --> 00:22:31.440 +were multiple + +00:22:31.440 --> 00:22:32.700 +things, like, with + +00:22:32.700 --> 00:22:33.460 +similar flavor. 
+ +00:22:33.660 --> 00:22:34.180 +Like, there was + +00:22:34.180 --> 00:22:35.420 +also one where + +00:22:35.420 --> 00:22:36.480 +you could give + +00:22:36.480 --> 00:22:37.940 +a parameter a + +00:22:37.940 --> 00:22:38.480 +non-none + +00:22:38.480 --> 00:22:39.320 +annotation and + +00:22:39.320 --> 00:22:40.340 +default it to + +00:22:40.340 --> 00:22:40.800 +none for + +00:22:40.800 --> 00:22:41.700 +convenience, and + +00:22:41.700 --> 00:22:42.220 +we've largely, + +00:22:42.360 --> 00:22:43.420 +like, moved away + +00:22:43.420 --> 00:22:44.120 +from stuff like + +00:22:44.120 --> 00:22:44.740 +that in favor of + +00:22:44.740 --> 00:22:45.680 +explicitness. + +00:22:46.060 --> 00:22:46.540 +Yeah, what the + +00:22:46.540 --> 00:22:47.360 +current spec says + +00:22:47.360 --> 00:22:48.300 +is that basically + +00:22:48.300 --> 00:22:48.960 +if you have a + +00:22:48.960 --> 00:22:49.540 +function that takes + +00:22:49.540 --> 00:22:50.640 +a float, you're + +00:22:50.640 --> 00:22:51.320 +also allowed to + +00:22:51.320 --> 00:22:51.940 +pass an int. + +00:22:52.440 --> 00:22:53.480 +That's not + +00:22:53.480 --> 00:22:54.160 +really enough. + +00:22:54.260 --> 00:22:54.820 +It doesn't tell + +00:22:54.820 --> 00:22:55.620 +you how these + +00:22:55.620 --> 00:22:56.320 +things work in + +00:22:56.320 --> 00:22:57.800 +all cases, and + +00:22:57.800 --> 00:22:59.040 +we've had some + +00:22:59.040 --> 00:23:00.140 +attempts to try to + +00:23:00.140 --> 00:23:01.280 +come up with a + +00:23:01.280 --> 00:23:02.220 +way to specify + +00:23:02.220 --> 00:23:03.440 +that special case + +00:23:03.440 --> 00:23:04.160 +in a way that + +00:23:04.160 --> 00:23:05.360 +makes more sense, + +00:23:05.560 --> 00:23:06.020 +at least makes + +00:23:06.020 --> 00:23:06.700 +more sense to me. + +00:23:07.080 --> 00:23:07.740 +It's been very + +00:23:07.740 --> 00:23:08.260 +contentious. 
+
+00:23:08.460 --> 00:23:09.040
+People have very
+
+00:23:09.040 --> 00:23:09.600
+strong opinions
+
+00:23:09.600 --> 00:23:10.220
+about this.
+
+00:23:10.400 --> 00:23:11.340
+I guess non-obvious
+
+00:23:11.340 --> 00:23:12.180
+is what I'd like to
+
+00:23:12.180 --> 00:23:12.760
+say, really,
+
+00:23:12.980 --> 00:23:13.320
+honestly.
+
+00:23:13.760 --> 00:23:15.000
+So I'd like to get
+
+00:23:15.000 --> 00:23:16.060
+the official
+
+00:23:16.060 --> 00:23:17.120
+council's thoughts
+
+00:23:17.120 --> 00:23:17.560
+on this.
+
+00:23:17.760 --> 00:23:18.620
+When is typing
+
+00:23:18.620 --> 00:23:19.960
+too much typing?
+
+00:23:20.440 --> 00:23:21.380
+I made the joke
+
+00:23:21.380 --> 00:23:22.300
+about C++,
+
+00:23:22.600 --> 00:23:23.700
+ATL, if you've
+
+00:23:23.700 --> 00:23:24.200
+ever worked with
+
+00:23:24.200 --> 00:23:24.820
+that, it's like
+
+00:23:24.820 --> 00:23:27.100
+a template class
+
+00:23:27.100 --> 00:23:28.020
+where templated
+
+00:23:28.020 --> 00:23:28.940
+classes are part
+
+00:23:28.940 --> 00:23:29.540
+of the concrete
+
+00:23:29.540 --> 00:23:30.220
+type of the
+
+00:23:30.220 --> 00:23:30.440
+template.
+
+00:23:30.560 --> 00:23:31.600
+It's just off
+
+00:23:31.600 --> 00:23:32.020
+the hook.
+
+00:23:32.400 --> 00:23:32.900
+There's certainly
+
+00:23:32.900 --> 00:23:33.740
+places where
+
+00:23:33.740 --> 00:23:34.880
+typing can be
+
+00:23:34.880 --> 00:23:36.180
+too much, and
+
+00:23:36.180 --> 00:23:37.220
+a lot of the
+
+00:23:37.220 --> 00:23:38.640
+purity of Python
+
+00:23:38.640 --> 00:23:40.240
+or the readability
+
+00:23:40.240 --> 00:23:41.320
+of Python is the
+
+00:23:41.320 --> 00:23:42.460
+fact that it's
+
+00:23:42.460 --> 00:23:43.320
+got so few
+
+00:23:43.320 --> 00:23:43.740
+symbols.
+ +00:23:44.260 --> 00:23:44.700 +And so adding + +00:23:44.700 --> 00:23:45.700 +types adds + +00:23:45.700 --> 00:23:46.460 +context, but it + +00:23:46.460 --> 00:23:47.920 +also makes it + +00:23:47.920 --> 00:23:48.260 +a little harder + +00:23:48.260 --> 00:23:48.700 +to read. + +00:23:49.060 --> 00:23:49.960 +When is too + +00:23:49.960 --> 00:23:50.520 +much typing? + +00:23:50.960 --> 00:23:51.460 +When do you + +00:23:51.460 --> 00:23:52.320 +recommend typing? + +00:23:53.320 --> 00:23:54.120 +Rebecca, I'll + +00:23:54.120 --> 00:23:54.820 +let you go first, + +00:23:54.940 --> 00:23:55.840 +but what are your + +00:23:55.840 --> 00:23:57.060 +thoughts on how + +00:23:57.060 --> 00:23:57.980 +much typing should + +00:23:57.980 --> 00:23:58.760 +I use in Python? + +00:23:59.040 --> 00:24:00.380 +I'll give you + +00:24:00.380 --> 00:24:01.420 +what is my + +00:24:01.420 --> 00:24:03.160 +official stance, + +00:24:03.260 --> 00:24:03.940 +which is that if + +00:24:03.940 --> 00:24:04.460 +you want your + +00:24:04.460 --> 00:24:05.240 +type checker to + +00:24:05.240 --> 00:24:06.500 +work well, you + +00:24:06.500 --> 00:24:07.460 +should type + +00:24:07.460 --> 00:24:08.160 +annotate your + +00:24:08.160 --> 00:24:09.320 +API boundaries. 
+
+00:24:09.880 --> 00:24:10.500
+So parameters
+
+00:24:10.500 --> 00:24:11.260
+and returns in
+
+00:24:11.260 --> 00:24:12.060
+public functions,
+
+00:24:12.340 --> 00:24:12.840
+public class
+
+00:24:12.840 --> 00:24:14.000
+attributes, things
+
+00:24:14.000 --> 00:24:14.900
+like that, and
+
+00:24:14.900 --> 00:24:15.980
+even things that
+
+00:24:15.980 --> 00:24:16.440
+seem truly
+
+00:24:16.460 --> 00:24:17.200
+trivial, like,
+
+00:24:17.300 --> 00:24:17.560
+oh, this
+
+00:24:17.560 --> 00:24:18.900
+function returns
+
+00:24:18.900 --> 00:24:20.160
+None, better
+
+00:24:20.160 --> 00:24:21.400
+to annotate
+
+00:24:21.400 --> 00:24:22.040
+it because, you
+
+00:24:22.040 --> 00:24:22.540
+know, someone
+
+00:24:22.540 --> 00:24:23.080
+else might be
+
+00:24:23.080 --> 00:24:24.120
+depending on
+
+00:24:24.120 --> 00:24:25.080
+your library and
+
+00:24:25.080 --> 00:24:25.700
+consuming that
+
+00:24:25.700 --> 00:24:26.080
+type of
+
+00:24:26.080 --> 00:24:26.600
+information.
+
+00:24:27.120 --> 00:24:27.780
+I will say
+
+00:24:27.780 --> 00:24:29.040
+personally, what
+
+00:24:29.040 --> 00:24:29.900
+I tend to do
+
+00:24:29.900 --> 00:24:31.760
+is I annotate
+
+00:24:31.760 --> 00:24:32.420
+things that I
+
+00:24:32.420 --> 00:24:32.900
+think are
+
+00:24:32.900 --> 00:24:33.680
+non-trivial
+
+00:24:33.680 --> 00:24:34.480
+because I want
+
+00:24:34.480 --> 00:24:35.160
+to see that
+
+00:24:35.160 --> 00:24:36.440
+as documentation.
+
+00:24:37.400 --> 00:24:38.020
+And if
+
+00:24:38.020 --> 00:24:39.080
+something, you
+
+00:24:39.080 --> 00:24:39.540
+know, a
+
+00:24:39.540 --> 00:24:39.900
+function that
+
+00:24:39.900 --> 00:24:40.560
+does return
+
+00:24:40.560 --> 00:24:41.540
+None, to be
+
+00:24:41.540 --> 00:24:42.040
+honest, I will
+
+00:24:42.040 --> 00:24:42.600
+probably forget
+
+00:24:42.600 --> 00:24:43.460
+to annotate it
+
+00:24:43.460 --> 00:24:44.340
+half the time
+
+00:24:44.340 --> 00:24:45.260
+because I'll
+
+00:24:45.260 --> 00:24:46.060
+be like, I
+
+00:24:46.060 --> 00:24:46.700
+honestly don't
+
+00:24:46.700 --> 00:24:47.220
+need to see
+
+00:24:47.220 --> 00:24:47.460
+it.
+
+00:24:47.660 --> 00:24:48.500
+One of the
+
+00:24:48.500 --> 00:24:49.940
+interesting features
+
+00:24:49.940 --> 00:24:50.760
+of the
+
+00:24:50.760 --> 00:24:52.440
+Pyrefly VS
+
+00:24:52.440 --> 00:24:53.240
+Code extension,
+
+00:24:53.380 --> 00:24:53.840
+that's the only
+
+00:24:53.840 --> 00:24:54.540
+one I can speak
+
+00:24:54.540 --> 00:24:54.920
+of at the
+
+00:24:54.920 --> 00:24:55.300
+moment, and
+
+00:24:55.300 --> 00:24:55.740
+Carl, you've
+
+00:24:55.740 --> 00:24:56.080
+got to tell
+
+00:24:56.080 --> 00:24:56.860
+me if the
+
+00:24:56.860 --> 00:24:58.000
+ty one does
+
+00:24:58.000 --> 00:24:58.600
+this as well,
+
+00:24:59.040 --> 00:24:59.580
+is it will
+
+00:24:59.580 --> 00:25:00.440
+sort of overlay
+
+00:25:00.440 --> 00:25:02.440
+its belief of
+
+00:25:02.440 --> 00:25:03.240
+what types
+
+00:25:03.240 --> 00:25:03.580
+are.
+
+00:25:03.960 --> 00:25:04.280
+Like, if
+
+00:25:04.280 --> 00:25:05.060
+there's, you
+
+00:25:05.060 --> 00:25:06.080
+say x equals a
+
+00:25:06.080 --> 00:25:06.560
+function return
+
+00:25:06.560 --> 00:25:07.180
+value and it
+
+00:25:07.180 --> 00:25:07.700
+knows what the
+
+00:25:07.700 --> 00:25:08.340
+function returns,
+
+00:25:08.440 --> 00:25:08.900
+it'll have a
+
+00:25:08.900 --> 00:25:09.520
+gray, like,
+
+00:25:09.680 --> 00:25:10.580
+colon int, if
+
+00:25:10.580 --> 00:25:11.200
+it returned an
+
+00:25:11.200 --> 00:25:11.820
+int or something.
+
+00:25:11.820 --> 00:25:12.620
+So you can
+
+00:25:12.620 --> 00:25:14.120
+kind of read
+
+00:25:14.120 --> 00:25:15.140
+the code and
+
+00:25:15.140 --> 00:25:15.600
+see what the
+
+00:25:15.600 --> 00:25:16.200
+types are without
+
+00:25:16.200 --> 00:25:17.060
+actually putting
+
+00:25:17.060 --> 00:25:18.040
+it into the
+
+00:25:18.040 --> 00:25:18.720
+text of the
+
+00:25:18.720 --> 00:25:18.980
+code.
+
+00:25:19.080 --> 00:25:19.580
+It's only within
+
+00:25:19.580 --> 00:25:20.120
+the editor.
+
+00:25:20.520 --> 00:25:21.080
+Does ty do
+
+00:25:21.080 --> 00:25:21.460
+something like
+
+00:25:21.460 --> 00:25:21.840
+that, Carl?
+
+00:25:22.040 --> 00:25:22.860
+Yes, we also
+
+00:25:22.860 --> 00:25:23.560
+have inlay type
+
+00:25:23.560 --> 00:25:23.800
+hints.
+
+00:25:23.900 --> 00:25:24.400
+Yeah, inlay type
+
+00:25:24.400 --> 00:25:24.800
+hints, that's
+
+00:25:24.800 --> 00:25:25.200
+what it's called.
+ +00:25:25.500 --> 00:25:26.060 +So, yeah, I + +00:25:26.060 --> 00:25:26.400 +don't know, that + +00:25:26.400 --> 00:25:27.420 +also brings an + +00:25:27.420 --> 00:25:28.300 +interesting challenge, + +00:25:28.400 --> 00:25:28.880 +not a challenge, + +00:25:28.960 --> 00:25:30.160 +like a wrinkle to + +00:25:30.160 --> 00:25:31.460 +the recommendation + +00:25:31.460 --> 00:25:33.640 +of should I put + +00:25:33.640 --> 00:25:34.360 +types on, like, + +00:25:34.400 --> 00:25:35.180 +the return value + +00:25:35.180 --> 00:25:35.780 +because I want to + +00:25:35.780 --> 00:25:36.280 +know that's a + +00:25:36.280 --> 00:25:36.920 +list of user, + +00:25:37.040 --> 00:25:37.880 +not a list of + +00:25:37.880 --> 00:25:38.880 +user IDs or + +00:25:38.880 --> 00:25:39.320 +whatever, for + +00:25:39.320 --> 00:25:39.940 +example, like a + +00:25:39.940 --> 00:25:40.780 +list of UUID. + +00:25:41.220 --> 00:25:41.580 +But if, + +00:25:41.820 --> 00:25:42.340 +it's going to + +00:25:42.340 --> 00:25:43.000 +show up anyway + +00:25:43.000 --> 00:25:43.660 +in the editor, + +00:25:44.160 --> 00:25:44.740 +maybe I don't + +00:25:44.740 --> 00:25:45.280 +have to write + +00:25:45.280 --> 00:25:45.780 +that, right? + +00:25:45.840 --> 00:25:46.220 +And so that + +00:25:46.220 --> 00:25:47.260 +becomes sort of + +00:25:47.260 --> 00:25:48.260 +somewhere where + +00:25:48.260 --> 00:25:49.000 +you could debate + +00:25:49.000 --> 00:25:49.680 +again, I think. + +00:25:50.120 --> 00:25:51.160 +However, I do + +00:25:51.160 --> 00:25:52.280 +100% agree with + +00:25:52.280 --> 00:25:53.040 +you, Rebecca, that + +00:25:53.040 --> 00:25:53.760 +put it on your + +00:25:53.760 --> 00:25:54.560 +API boundaries. 
+ +00:25:54.740 --> 00:25:55.560 +If, like, this + +00:25:55.560 --> 00:25:56.600 +is the place that + +00:25:56.600 --> 00:25:57.500 +people get into + +00:25:57.500 --> 00:25:58.340 +some part of your + +00:25:58.340 --> 00:25:58.820 +code and they + +00:25:58.820 --> 00:25:59.700 +don't know or + +00:25:59.700 --> 00:26:00.380 +want to know about + +00:26:00.380 --> 00:26:01.120 +the inside of it, + +00:26:01.500 --> 00:26:02.460 +having types there + +00:26:02.460 --> 00:26:03.360 +is really helpful + +00:26:03.360 --> 00:26:04.440 +both for editors, + +00:26:04.960 --> 00:26:05.800 +for type checkers, + +00:26:05.840 --> 00:26:06.480 +and just for + +00:26:06.480 --> 00:26:07.300 +reading code, and + +00:26:07.300 --> 00:26:08.360 +even for AI, which + +00:26:08.360 --> 00:26:09.240 +is a crazy world. + +00:26:09.580 --> 00:26:09.680 +Yeah. + +00:26:09.900 --> 00:26:10.220 +Carl? + +00:26:10.220 --> 00:26:11.000 +What are your + +00:26:11.000 --> 00:26:11.400 +thoughts here? + +00:26:11.460 --> 00:26:12.180 +How much typing is + +00:26:12.180 --> 00:26:12.760 +too much typing? + +00:26:13.200 --> 00:26:13.780 +What's the + +00:26:13.780 --> 00:26:14.580 +guidelines here? + +00:26:14.680 --> 00:26:15.240 +I think I agree + +00:26:15.240 --> 00:26:15.800 +with Rebecca's + +00:26:15.800 --> 00:26:16.100 +answer. + +00:26:16.280 --> 00:26:17.120 +I mean, that one + +00:26:17.120 --> 00:26:17.740 +place you definitely + +00:26:17.740 --> 00:26:18.740 +want to have + +00:26:18.740 --> 00:26:19.840 +explicit type + +00:26:19.840 --> 00:26:21.100 +annotations is that + +00:26:21.100 --> 00:26:22.140 +API boundaries, + +00:26:22.420 --> 00:26:23.300 +the public API of + +00:26:23.300 --> 00:26:24.120 +a library, etc. 
+ +00:26:24.120 --> 00:26:25.680 +In terms of what's + +00:26:25.680 --> 00:26:26.660 +too much typing, I + +00:26:26.660 --> 00:26:27.220 +mean, there are + +00:26:27.220 --> 00:26:28.320 +certainly patterns + +00:26:28.320 --> 00:26:28.900 +that have + +00:26:28.900 --> 00:26:29.740 +historically been + +00:26:29.740 --> 00:26:31.280 +used in Python + +00:26:31.280 --> 00:26:32.380 +that we still + +00:26:32.380 --> 00:26:34.060 +can't express well + +00:26:34.060 --> 00:26:34.540 +in the type + +00:26:34.540 --> 00:26:35.580 +system, or that + +00:26:35.580 --> 00:26:36.480 +require extremely + +00:26:36.480 --> 00:26:37.680 +complex type + +00:26:37.680 --> 00:26:39.240 +annotations to + +00:26:39.240 --> 00:26:40.020 +express well, and + +00:26:40.020 --> 00:26:40.640 +I think there it + +00:26:40.640 --> 00:26:41.720 +becomes a judgment + +00:26:41.720 --> 00:26:42.140 +call. + +00:26:43.020 --> 00:26:44.420 +If it's like a + +00:26:44.420 --> 00:26:45.220 +core, widely + +00:26:45.220 --> 00:26:46.540 +used API, you + +00:26:46.540 --> 00:26:47.620 +may get a lot of + +00:26:47.620 --> 00:26:48.940 +benefit from some + +00:26:48.940 --> 00:26:50.020 +very complex and + +00:26:50.020 --> 00:26:51.020 +verbose annotations, + +00:26:51.440 --> 00:26:52.100 +and so then it's + +00:26:52.100 --> 00:26:53.140 +worth sort of going + +00:26:53.140 --> 00:26:53.940 +through that pain + +00:26:53.940 --> 00:26:55.020 +and the pain of + +00:26:55.020 --> 00:26:55.660 +adding them and + +00:26:55.660 --> 00:26:56.960 +of reading them in + +00:26:56.960 --> 00:26:57.700 +order to get that + +00:26:57.700 --> 00:26:59.000 +additional typing + +00:26:59.000 --> 00:27:00.020 +coverage everywhere you + +00:27:00.020 --> 00:27:00.820 +use that API. 
+ +00:27:01.120 --> 00:27:02.340 +If it's much less + +00:27:02.340 --> 00:27:03.400 +frequently used code + +00:27:03.400 --> 00:27:04.280 +that's highly dynamic, + +00:27:04.720 --> 00:27:05.520 +maybe it's not worth + +00:27:05.520 --> 00:27:06.280 +it in that case. + +00:27:06.620 --> 00:27:07.140 +I think there's a lot + +00:27:07.140 --> 00:27:07.780 +of judgment calls + +00:27:07.780 --> 00:27:07.960 +here. + +00:27:08.140 --> 00:27:08.960 +What about like + +00:27:08.960 --> 00:27:10.200 +one-off scripts? + +00:27:10.680 --> 00:27:11.140 +You know, I'm going + +00:27:11.140 --> 00:27:11.940 +to write this thing to + +00:27:11.940 --> 00:27:12.940 +just move this data + +00:27:12.940 --> 00:27:13.580 +from here to there, + +00:27:13.620 --> 00:27:14.220 +and once it's moved, + +00:27:14.280 --> 00:27:15.020 +I don't need it again. + +00:27:15.120 --> 00:27:16.160 +It's done with that + +00:27:16.160 --> 00:27:16.700 +old system, we're + +00:27:16.700 --> 00:27:17.260 +going to the new + +00:27:17.260 --> 00:27:17.460 +one. + +00:27:17.800 --> 00:27:18.660 +Maybe less typing. + +00:27:18.820 --> 00:27:19.080 +Yeah, I think + +00:27:19.080 --> 00:27:20.040 +that's what's useful + +00:27:20.040 --> 00:27:20.600 +for you. + +00:27:20.980 --> 00:27:21.780 +Often I feel like + +00:27:21.780 --> 00:27:22.420 +one-off scripts are + +00:27:22.420 --> 00:27:23.120 +not really one-off + +00:27:23.140 --> 00:27:23.880 +like maybe you + +00:27:23.880 --> 00:27:24.500 +want to move some + +00:27:24.500 --> 00:27:25.460 +similar data later, + +00:27:25.620 --> 00:27:26.260 +and then it's useful + +00:27:26.260 --> 00:27:27.160 +if you can understand + +00:27:27.160 --> 00:27:27.800 +your code again, + +00:27:27.860 --> 00:27:29.040 +if you want to + +00:27:29.040 --> 00:27:29.600 +read what you did. 
+
+00:27:29.720 --> 00:27:30.140
+You thought you
+
+00:27:30.140 --> 00:27:30.880
+didn't need it again,
+
+00:27:30.940 --> 00:27:31.760
+and all of a sudden
+
+00:27:31.760 --> 00:27:32.500
+it's six months old,
+
+00:27:32.540 --> 00:27:33.220
+you don't understand
+
+00:27:33.220 --> 00:27:34.540
+it, and the types
+
+00:27:34.540 --> 00:27:35.060
+there help a lot,
+
+00:27:35.100 --> 00:27:35.260
+right?
+
+00:27:35.560 --> 00:27:35.720
+Yeah.
+
+00:27:36.160 --> 00:27:36.780
+Jelle, what's
+
+00:27:36.780 --> 00:27:38.280
+your advice?
+
+00:27:38.540 --> 00:27:39.060
+What Carl and
+
+00:27:39.060 --> 00:27:39.480
+Rebecca said
+
+00:27:39.480 --> 00:27:40.240
+makes sense to me
+
+00:27:40.240 --> 00:27:40.440
+too.
+
+00:27:40.840 --> 00:27:41.480
+I think types
+
+00:27:41.480 --> 00:27:42.460
+have advantages
+
+00:27:42.460 --> 00:27:43.340
+in terms of
+
+00:27:43.340 --> 00:27:44.240
+documenting for human
+
+00:27:44.240 --> 00:27:45.120
+readers what
+
+00:27:45.120 --> 00:27:46.180
+is going on,
+
+00:27:46.400 --> 00:27:47.040
+and in terms of
+
+00:27:47.040 --> 00:27:47.940
+catching mistakes
+
+00:27:47.940 --> 00:27:48.500
+that otherwise
+
+00:27:48.500 --> 00:27:49.160
+would not be
+
+00:27:49.160 --> 00:27:49.720
+caught until
+
+00:27:49.720 --> 00:27:50.740
+runtime perhaps.
+
+00:27:51.160 --> 00:27:51.680
+They have costs
+
+00:27:51.680 --> 00:27:52.340
+in maybe making
+
+00:27:52.340 --> 00:27:52.880
+your code harder
+
+00:27:52.880 --> 00:27:53.500
+to read if
+
+00:27:53.500 --> 00:27:54.020
+there's too much
+
+00:27:54.020 --> 00:27:54.580
+going on.
+
+00:27:54.940 --> 00:27:55.740
+So add types
+
+00:27:55.740 --> 00:27:56.140
+as long as
+
+00:27:56.140 --> 00:27:56.720
+those benefits
+
+00:27:56.720 --> 00:27:57.240
+outweigh the
+
+00:27:57.240 --> 00:27:57.560
+costs.
+
+00:27:57.760 --> 00:27:57.940
+Yeah.
+
+00:27:58.420 --> 00:27:59.100
+I mean, do you
+
+00:27:59.100 --> 00:27:59.780
+recommend to
+
+00:27:59.780 --> 00:28:01.060
+anyone that they
+
+00:28:01.060 --> 00:28:03.620
+just 100% go full
+
+00:28:03.620 --> 00:28:04.600
+like C++,
+
+00:28:04.840 --> 00:28:05.600
+C# on it,
+
+00:28:05.660 --> 00:28:06.620
+and just type
+
+00:28:06.620 --> 00:28:07.740
+every single thing?
+
+00:28:08.080 --> 00:28:08.740
+Is there an
+
+00:28:08.740 --> 00:28:10.060
+advantage like for
+
+00:28:10.060 --> 00:28:11.100
+static type checkers,
+
+00:28:11.240 --> 00:28:11.540
+you know, like
+
+00:28:11.540 --> 00:28:12.400
+mypy type stuff
+
+00:28:12.400 --> 00:28:13.320
+you can run across
+
+00:28:13.320 --> 00:28:14.680
+and get that?
+
+00:28:14.780 --> 00:28:15.040
+I mean, you could
+
+00:28:15.040 --> 00:28:15.420
+do that with
+
+00:28:15.420 --> 00:28:16.160
+Pyrefly or ty
+
+00:28:16.160 --> 00:28:17.260
+in the CLI as
+
+00:28:17.260 --> 00:28:18.140
+well, but you know,
+
+00:28:18.400 --> 00:28:19.340
+thinking more mypy
+
+00:28:19.340 --> 00:28:19.980
+is like kind of
+
+00:28:19.980 --> 00:28:20.640
+being real strict
+
+00:28:20.640 --> 00:28:21.200
+on some of that
+
+00:28:21.200 --> 00:28:21.500
+stuff.
+
+00:28:21.580 --> 00:28:22.140
+Personally, I do
+
+00:28:22.140 --> 00:28:23.140
+tend to annotate
+
+00:28:23.140 --> 00:28:23.880
+almost all like
+
+00:28:23.880 --> 00:28:24.700
+function parameters
+
+00:28:24.700 --> 00:28:26.240
+and class
+
+00:28:26.240 --> 00:28:27.060
+attributes, if I
+
+00:28:27.060 --> 00:28:27.760
+make a class,
+
+00:28:28.100 --> 00:28:28.680
+sometimes it's not
+
+00:28:28.680 --> 00:28:29.560
+as necessary,
+
+00:28:29.740 --> 00:28:30.640
+like you don't
+
+00:28:30.640 --> 00:28:30.980
+really need to
+
+00:28:30.980 --> 00:28:31.400
+annotate your
+
+00:28:31.400 --> 00:28:32.060
+tests perhaps,
+
+00:28:32.220 --> 00:28:32.560
+or you don't
+
+00:28:32.560 --> 00:28:33.080
+need to annotate
+
+00:28:33.080 --> 00:28:33.800
+internal functions
+
+00:28:33.800 --> 00:28:34.840
+as much, but
+
+00:28:34.840 --> 00:28:35.680
+for my own
+
+00:28:35.680 --> 00:28:36.460
+coding, I usually
+
+00:28:36.460 --> 00:28:36.960
+find it helpful
+
+00:28:36.960 --> 00:28:37.540
+to do that.
+
+00:28:37.900 --> 00:28:39.200
+But sometimes I
+
+00:28:39.200 --> 00:28:39.540
+see people
+
+00:28:39.540 --> 00:28:40.220
+annotating even
+
+00:28:40.220 --> 00:28:41.000
+local variables
+
+00:28:41.000 --> 00:28:41.760
+where it's very
+
+00:28:41.760 --> 00:28:42.460
+obvious to the type
+
+00:28:42.460 --> 00:28:43.100
+checker what the type
+
+00:28:43.100 --> 00:28:44.040
+is and it can
+
+00:28:44.040 --> 00:28:44.600
+just infer it
+
+00:28:44.600 --> 00:28:46.000
+reliably, and then
+
+00:28:46.000 --> 00:28:46.860
+it really just
+
+00:28:46.860 --> 00:28:47.580
+adds noise and
+
+00:28:47.580 --> 00:28:48.140
+you shouldn't do
+
+00:28:48.140 --> 00:28:48.340
+it.
+
+00:28:48.500 --> 00:28:48.960
+Yeah, exactly.
+ +00:28:48.960 --> 00:28:49.620 +If you've got a + +00:28:49.620 --> 00:28:50.160 +function that's + +00:28:50.160 --> 00:28:50.860 +annotated with a + +00:28:50.860 --> 00:28:51.660 +return value and + +00:28:51.660 --> 00:28:53.020 +you say x equals + +00:28:53.020 --> 00:28:53.900 +the function call, + +00:28:54.320 --> 00:28:55.020 +then the type + +00:28:55.020 --> 00:28:55.740 +checkers can infer + +00:28:55.740 --> 00:28:56.560 +that and you're + +00:28:56.560 --> 00:28:57.560 +just causing + +00:28:57.560 --> 00:28:59.560 +extra noise, I + +00:28:59.560 --> 00:28:59.820 +guess. + +00:29:00.240 --> 00:29:01.400 +So suppose you + +00:29:01.400 --> 00:29:02.200 +all want to + +00:29:02.200 --> 00:29:03.640 +change something. + +00:29:03.960 --> 00:29:04.880 +What's the process + +00:29:04.880 --> 00:29:05.640 +of actually going + +00:29:05.640 --> 00:29:06.700 +through and making + +00:29:06.700 --> 00:29:07.420 +some changes? + +00:29:08.280 --> 00:29:09.000 +Mostly sort of + +00:29:09.000 --> 00:29:09.800 +two levels of + +00:29:09.800 --> 00:29:10.100 +this. + +00:29:10.400 --> 00:29:10.860 +Well, maybe + +00:29:10.860 --> 00:29:11.120 +there's even + +00:29:11.120 --> 00:29:11.660 +three levels. + +00:29:11.780 --> 00:29:12.320 +The first one is + +00:29:12.320 --> 00:29:12.780 +if it's something + +00:29:12.780 --> 00:29:13.580 +that's so small + +00:29:13.580 --> 00:29:14.500 +that's just like + +00:29:14.500 --> 00:29:15.100 +a wording + +00:29:15.100 --> 00:29:15.860 +clarification or + +00:29:15.860 --> 00:29:16.540 +something, we + +00:29:16.540 --> 00:29:17.120 +just make a PR + +00:29:17.120 --> 00:29:18.420 +to the repo and + +00:29:18.420 --> 00:29:19.220 +a few of us + +00:29:19.220 --> 00:29:20.040 +look at it and + +00:29:20.040 --> 00:29:20.740 +we change it. 
+
+00:29:21.000 --> 00:29:21.640
+The second level
+
+00:29:21.640 --> 00:29:22.420
+is when it's sort
+
+00:29:22.420 --> 00:29:22.940
+of a smaller
+
+00:29:22.940 --> 00:29:24.180
+change that
+
+00:29:24.180 --> 00:29:25.200
+doesn't really
+
+00:29:25.200 --> 00:29:25.860
+introduce a new
+
+00:29:25.860 --> 00:29:26.680
+feature and then
+
+00:29:26.680 --> 00:29:28.220
+we make a PR
+
+00:29:28.220 --> 00:29:28.800
+to the typing
+
+00:29:28.800 --> 00:29:30.200
+spec repo and
+
+00:29:30.200 --> 00:29:31.140
+we formally have
+
+00:29:31.140 --> 00:29:31.900
+all of us sign
+
+00:29:31.900 --> 00:29:32.380
+off on it.
+
+00:29:32.740 --> 00:29:33.360
+That's what
+
+00:29:33.360 --> 00:29:34.040
+happened, like
+
+00:29:34.040 --> 00:29:34.560
+what Carl
+
+00:29:34.560 --> 00:29:35.120
+mentioned earlier
+
+00:29:35.120 --> 00:29:35.800
+with the final
+
+00:29:35.800 --> 00:29:37.140
+change in
+
+00:29:37.140 --> 00:29:37.680
+dataclasses.
+
+00:29:37.860 --> 00:29:38.440
+It's had to
+
+00:29:38.440 --> 00:29:39.480
+merge this one,
+
+00:29:39.620 --> 00:29:40.300
+yeah, add
+
+00:29:40.300 --> 00:29:40.580
+Carl.
+
+00:29:41.900 --> 00:29:42.740
+I love it.
+
+00:29:42.840 --> 00:29:43.580
+This repo itself
+
+00:29:43.580 --> 00:29:44.040
+doesn't have
+
+00:29:44.040 --> 00:29:44.520
+anything.
+
+00:29:44.980 --> 00:29:45.460
+It's the
+
+00:29:45.460 --> 00:29:46.160
+Python typing
+
+00:29:46.160 --> 00:29:47.400
+repo where the
+
+00:29:47.400 --> 00:29:48.160
+decisions are
+
+00:29:48.160 --> 00:29:48.420
+made.
+
+00:29:48.840 --> 00:29:49.160
+The typing
+
+00:29:49.160 --> 00:29:49.800
+council just
+
+00:29:49.800 --> 00:29:50.800
+has like some
+
+00:29:50.800 --> 00:29:51.560
+documentation.
+
+00:29:52.420 --> 00:29:52.560
+Yeah.
+
+00:29:52.840 --> 00:29:53.240
+And then the
+
+00:29:53.240 --> 00:29:53.780
+third level is
+
+00:29:53.780 --> 00:29:54.580
+PEPs, like really
+
+00:29:54.580 --> 00:29:55.480
+big new changes.
+ +00:29:55.620 --> 00:29:56.000 +You can still + +00:29:56.000 --> 00:29:56.920 +write a PEP and + +00:29:56.920 --> 00:29:57.880 +then we make a + +00:29:57.880 --> 00:29:58.680 +recommendation and + +00:29:58.680 --> 00:29:59.000 +the steering + +00:29:59.000 --> 00:29:59.720 +council makes a + +00:29:59.720 --> 00:30:00.480 +decision eventually. + +00:30:00.740 --> 00:30:01.380 +So if I wanted + +00:30:01.380 --> 00:30:01.920 +to suggest + +00:30:01.920 --> 00:30:02.940 +something, I + +00:30:02.940 --> 00:30:03.500 +could come up + +00:30:03.500 --> 00:30:04.100 +here and I + +00:30:04.100 --> 00:30:05.580 +could open up + +00:30:05.580 --> 00:30:06.500 +an issue, maybe + +00:30:06.500 --> 00:30:07.380 +start a conversation + +00:30:07.380 --> 00:30:08.540 +on typing, + +00:30:09.000 --> 00:30:09.840 +Python slash + +00:30:09.840 --> 00:30:10.120 +typing. + +00:30:10.280 --> 00:30:10.760 +And you can make + +00:30:10.760 --> 00:30:11.360 +a pull request + +00:30:11.360 --> 00:30:11.960 +to change the + +00:30:11.960 --> 00:30:12.240 +spec. + +00:30:12.240 --> 00:30:12.680 +Okay. + +00:30:12.900 --> 00:30:13.420 +And so the + +00:30:13.420 --> 00:30:13.940 +pull request + +00:30:13.940 --> 00:30:14.480 +would not be + +00:30:14.480 --> 00:30:15.640 +to change the + +00:30:15.640 --> 00:30:16.720 +code, like + +00:30:16.720 --> 00:30:17.340 +how Python + +00:30:17.340 --> 00:30:18.920 +maybe interprets + +00:30:18.920 --> 00:30:19.920 +code that has + +00:30:19.920 --> 00:30:20.600 +this new thing, + +00:30:20.700 --> 00:30:21.620 +but to suggest + +00:30:21.620 --> 00:30:22.240 +that the spec + +00:30:22.240 --> 00:30:22.760 +has it, which + +00:30:22.760 --> 00:30:23.680 +then would start + +00:30:23.680 --> 00:30:24.660 +a process that + +00:30:24.660 --> 00:30:25.420 +ultimately might + +00:30:25.420 --> 00:30:26.240 +make CPython + +00:30:26.240 --> 00:30:26.880 +understand it, + +00:30:26.920 --> 00:30:27.060 +right? 
+ +00:30:27.120 --> 00:30:27.600 +Well, CPython + +00:30:27.600 --> 00:30:28.380 +itself probably + +00:30:28.380 --> 00:30:29.080 +doesn't do + +00:30:29.080 --> 00:30:29.720 +anything with it. + +00:30:30.220 --> 00:30:30.800 +I guess most of + +00:30:30.800 --> 00:30:31.120 +the things that + +00:30:31.120 --> 00:30:31.880 +go directly here + +00:30:31.880 --> 00:30:32.800 +are changes to + +00:30:32.800 --> 00:30:33.420 +how to interpret + +00:30:33.420 --> 00:30:33.960 +things that are + +00:30:33.960 --> 00:30:35.380 +already in CPython. + +00:30:37.140 --> 00:30:38.060 +This portion of + +00:30:38.060 --> 00:30:38.700 +Talk Python To Me + +00:30:38.700 --> 00:30:39.200 +is brought to you + +00:30:39.200 --> 00:30:39.860 +by us. + +00:30:40.380 --> 00:30:41.000 +I want to tell + +00:30:41.000 --> 00:30:42.020 +you about a + +00:30:42.020 --> 00:30:42.600 +course I put + +00:30:42.600 --> 00:30:43.160 +together that + +00:30:43.160 --> 00:30:44.160 +I'm really proud + +00:30:44.160 --> 00:30:45.360 +of, Agentic + +00:30:45.360 --> 00:30:46.340 +AI Programming + +00:30:46.340 --> 00:30:46.980 +for Python + +00:30:46.980 --> 00:30:47.620 +Developers. + +00:30:48.280 --> 00:30:49.160 +I know a lot + +00:30:49.160 --> 00:30:49.540 +of you have + +00:30:49.540 --> 00:30:50.560 +tried AI coding + +00:30:50.560 --> 00:30:51.280 +tools and come + +00:30:51.280 --> 00:30:51.900 +away thinking, + +00:30:52.260 --> 00:30:52.980 +well, this is + +00:30:52.980 --> 00:30:53.560 +more hassle + +00:30:53.560 --> 00:30:54.160 +than it's worth. + +00:30:54.520 --> 00:30:55.120 +And honestly, + +00:30:55.360 --> 00:30:56.200 +all the vibe + +00:30:56.200 --> 00:30:57.020 +coding hype + +00:30:57.020 --> 00:30:57.820 +isn't helping. + +00:30:58.080 --> 00:30:59.080 +It's a smokescreen + +00:30:59.080 --> 00:30:59.800 +that hides what + +00:30:59.800 --> 00:31:00.480 +these tools can + +00:31:00.480 --> 00:31:01.280 +actually do. 
+
+00:31:01.640 --> 00:31:02.560
+This course is
+
+00:31:02.560 --> 00:31:03.320
+about agentic
+
+00:31:03.320 --> 00:31:04.080
+engineering,
+
+00:31:04.660 --> 00:31:05.540
+applying real
+
+00:31:05.540 --> 00:31:06.320
+software engineering
+
+00:31:06.320 --> 00:31:07.520
+practices with AI
+
+00:31:07.520 --> 00:31:08.260
+that understands
+
+00:31:08.260 --> 00:31:09.160
+your entire code
+
+00:31:09.160 --> 00:31:10.160
+base, runs
+
+00:31:10.160 --> 00:31:11.200
+your tests, and
+
+00:31:11.200 --> 00:31:12.080
+builds complete
+
+00:31:12.080 --> 00:31:13.100
+features under
+
+00:31:13.100 --> 00:31:13.720
+your direction.
+
+00:31:14.380 --> 00:31:14.960
+I've used these
+
+00:31:14.960 --> 00:31:15.760
+techniques to ship
+
+00:31:15.760 --> 00:31:17.120
+real production code
+
+00:31:17.120 --> 00:31:18.140
+across Talk Python,
+
+00:31:18.600 --> 00:31:19.520
+Python Bytes, and
+
+00:31:19.520 --> 00:31:20.180
+completely new
+
+00:31:20.180 --> 00:31:20.580
+projects.
+
+00:31:21.000 --> 00:31:21.840
+I migrated an
+
+00:31:21.840 --> 00:31:22.960
+entire CSS framework
+
+00:31:22.960 --> 00:31:23.660
+on a production
+
+00:31:23.660 --> 00:31:24.720
+site with thousands
+
+00:31:24.720 --> 00:31:25.900
+of lines of HTML
+
+00:31:25.900 --> 00:31:27.160
+in a few hours,
+
+00:31:27.580 --> 00:31:27.900
+twice.
+
+00:31:28.240 --> 00:31:29.360
+I shipped a new
+
+00:31:29.360 --> 00:31:30.280
+search feature with
+
+00:31:30.280 --> 00:31:31.580
+caching and async
+
+00:31:31.580 --> 00:31:32.500
+in under an hour.
+
+00:31:32.500 --> 00:31:33.820
+I built a complete
+
+00:31:33.820 --> 00:31:35.200
+CLI tool for
+
+00:31:35.200 --> 00:31:36.640
+Talk Python from
+
+00:31:36.640 --> 00:31:37.700
+scratch, tested,
+
+00:31:38.020 --> 00:31:38.780
+documented, and
+
+00:31:38.780 --> 00:31:40.400
+published to PyPI in
+
+00:31:40.400 --> 00:31:41.060
+an afternoon.
+
+00:31:41.580 --> 00:31:42.780
+Real projects, real
+
+00:31:42.780 --> 00:31:43.960
+production code, both
+
+00:31:43.960 --> 00:31:45.060
+greenfield and
+
+00:31:45.060 --> 00:31:45.540
+legacy.
+
+00:31:46.020 --> 00:31:47.360
+No toy demos, no
+
+00:31:47.360 --> 00:31:47.660
+fluff.
+
+00:31:48.220 --> 00:31:48.980
+I'll show you the
+
+00:31:48.980 --> 00:31:49.940
+guardrails, the
+
+00:31:49.940 --> 00:31:50.880
+planning techniques, and
+
+00:31:50.880 --> 00:31:51.880
+the workflows that
+
+00:31:51.880 --> 00:31:53.000
+turn AI into a
+
+00:31:53.000 --> 00:31:53.980
+genuine engineering
+
+00:31:53.980 --> 00:31:54.360
+partner.
+
+00:31:54.960 --> 00:31:55.680
+Check it out at
+
+00:31:55.680 --> 00:31:56.620
+talkpython.fm
+
+00:31:56.620 --> 00:31:57.520
+slash agentic
+
+00:31:57.520 --> 00:31:58.420
+dash engineering.
+
+00:31:58.640 --> 00:32:00.280
+That's talkpython.fm
+
+00:32:00.280 --> 00:32:01.340
+slash agentic dash
+
+00:32:01.340 --> 00:32:01.780
+engineering.
+
+00:32:01.780 --> 00:32:02.880
+The link is in your
+
+00:32:02.880 --> 00:32:03.680
+podcast player's
+
+00:32:03.680 --> 00:32:04.120
+show notes.
+
+00:32:05.320 --> 00:32:06.180
+If it's adding
+
+00:32:06.180 --> 00:32:07.240
+something new, it
+
+00:32:07.240 --> 00:32:08.200
+will usually need to
+
+00:32:08.200 --> 00:32:08.940
+go through a PEP,
+
+00:32:09.040 --> 00:32:09.600
+except if it's
+
+00:32:09.600 --> 00:32:10.600
+something very small.
+
+00:32:10.680 --> 00:32:11.240
+Let's talk about
+
+00:32:11.240 --> 00:32:11.900
+that for a minute.
+
+00:32:11.980 --> 00:32:13.280
+We got two
+
+00:32:13.280 --> 00:32:14.720
+representatives here
+
+00:32:14.720 --> 00:32:15.740
+of the newer
+
+00:32:15.740 --> 00:32:17.000
+breed of tools.
+
+00:32:17.700 --> 00:32:18.780
+What's the story
+
+00:32:18.780 --> 00:32:21.100
+for inconsistencies
+
+00:32:21.100 --> 00:32:22.760
+across interpretations
+
+00:32:22.760 --> 00:32:23.540
+of the spec?
+
+00:32:23.620 --> 00:32:24.340
+I know that there's
+
+00:32:24.340 --> 00:32:25.080
+slight variations.
+
+00:32:25.680 --> 00:32:26.940
+I've also, you
+
+00:32:26.940 --> 00:32:27.940
+know, not putting
+
+00:32:27.940 --> 00:32:28.640
+either of you on
+
+00:32:28.640 --> 00:32:29.440
+the spotlight, but
+
+00:32:29.440 --> 00:32:31.040
+like using, say,
+
+00:32:31.040 --> 00:32:32.320
+PyCharm and
+
+00:32:32.320 --> 00:32:33.380
+like writing code.
+
+00:32:33.500 --> 00:32:34.140
+So its type
+
+00:32:34.140 --> 00:32:35.760
+checker's happy and
+
+00:32:35.760 --> 00:32:36.520
+then using something
+
+00:32:36.520 --> 00:32:37.780
+like Pyright.
+
+00:32:37.980 --> 00:32:39.320
+And so it has a
+
+00:32:39.320 --> 00:32:39.940
+real different
+
+00:32:39.940 --> 00:32:41.100
+interpretation of
+
+00:32:41.100 --> 00:32:42.160
+what you should let
+
+00:32:42.160 --> 00:32:42.860
+slide and what you
+
+00:32:42.860 --> 00:32:43.140
+shouldn't.
+
+00:32:43.200 --> 00:32:43.820
+I feel like Pyright
+
+00:32:43.820 --> 00:32:45.400
+is more, much more
+
+00:32:45.400 --> 00:32:47.300
+focused on like
+
+00:32:47.300 --> 00:32:48.200
+enforcing the
+
+00:32:48.200 --> 00:32:49.880
+nullability and or
+
+00:32:49.880 --> 00:32:50.660
+the lack thereof.
+
+00:32:50.700 --> 00:32:51.860
+And it warns of
+
+00:32:51.860 --> 00:32:52.960
+inconsistencies there
+
+00:32:52.960 --> 00:32:53.800
+where PyCharm doesn't
+
+00:32:53.800 --> 00:32:54.680
+seem to care as much.
+
+00:32:54.980 --> 00:32:55.500
+I don't know which
+
+00:32:55.500 --> 00:32:56.180
+one I like better,
+
+00:32:56.260 --> 00:32:56.780
+but I know they're
+
+00:32:56.780 --> 00:32:57.120
+different.
+
+00:32:57.120 --> 00:32:57.980
+And if I write code
+
+00:32:57.980 --> 00:32:58.480
+in one and then I
+
+00:32:58.480 --> 00:32:59.220
+open the other, I'm
+
+00:32:59.220 --> 00:33:00.300
+like, huh, why is it
+
+00:33:00.300 --> 00:33:00.580
+upset?
+ +00:33:00.720 --> 00:33:01.480 +It seemed like it + +00:33:01.480 --> 00:33:01.900 +was fine. + +00:33:02.180 --> 00:33:02.540 +How do you all + +00:33:02.540 --> 00:33:03.100 +navigate this? + +00:33:03.260 --> 00:33:03.380 +Yeah. + +00:33:03.560 --> 00:33:04.580 +One thing useful to + +00:33:04.580 --> 00:33:05.480 +say about the spec + +00:33:05.480 --> 00:33:06.360 +there is that the + +00:33:06.360 --> 00:33:07.780 +spec covers a lot of + +00:33:07.780 --> 00:33:08.060 +things. + +00:33:08.160 --> 00:33:08.900 +In particular, it + +00:33:08.900 --> 00:33:09.740 +tends to cover sort + +00:33:09.740 --> 00:33:10.860 +of the details of + +00:33:10.860 --> 00:33:11.740 +more advanced type + +00:33:11.740 --> 00:33:12.560 +system features. + +00:33:12.940 --> 00:33:13.580 +But there's a lot of + +00:33:13.580 --> 00:33:14.980 +very fundamental + +00:33:14.980 --> 00:33:16.560 +stuff about how a + +00:33:16.560 --> 00:33:17.460 +type checker works + +00:33:17.460 --> 00:33:18.900 +in terms of how it + +00:33:18.900 --> 00:33:19.960 +does inference and + +00:33:19.960 --> 00:33:20.700 +how it does type + +00:33:20.700 --> 00:33:21.360 +narrowing. 
+ +00:33:21.440 --> 00:33:22.460 +And even in some + +00:33:22.460 --> 00:33:23.100 +cases, like you + +00:33:23.100 --> 00:33:23.660 +mentioned, you know, + +00:33:23.660 --> 00:33:24.560 +what it chooses to + +00:33:24.560 --> 00:33:25.740 +emit errors on that + +00:33:25.740 --> 00:33:27.080 +isn't really covered + +00:33:27.080 --> 00:33:28.560 +by the spec, partly + +00:33:28.560 --> 00:33:29.760 +maybe because we + +00:33:29.760 --> 00:33:30.400 +haven't gotten to + +00:33:30.400 --> 00:33:31.260 +it and also partly + +00:33:31.260 --> 00:33:33.500 +intentionally in that + +00:33:33.500 --> 00:33:35.160 +there may be room in + +00:33:35.160 --> 00:33:35.880 +some of those cases + +00:33:35.880 --> 00:33:36.600 +for different type + +00:33:36.600 --> 00:33:37.400 +checkers to work + +00:33:37.400 --> 00:33:38.100 +differently if they're + +00:33:38.100 --> 00:33:38.680 +serving different + +00:33:38.680 --> 00:33:39.040 +needs. + +00:33:39.040 --> 00:33:40.440 +Like if PyCharm is + +00:33:40.440 --> 00:33:41.420 +primarily concerned + +00:33:41.420 --> 00:33:42.200 +about being a + +00:33:42.200 --> 00:33:44.020 +useful kind of IDE + +00:33:44.020 --> 00:33:45.240 +and providing go-to + +00:33:45.240 --> 00:33:46.100 +definition and that + +00:33:46.100 --> 00:33:47.340 +sort of thing, maybe + +00:33:47.340 --> 00:33:48.900 +emitting lots of + +00:33:48.900 --> 00:33:50.000 +warnings or errors + +00:33:50.000 --> 00:33:51.160 +and all kinds of + +00:33:51.160 --> 00:33:52.060 +things where your + +00:33:52.060 --> 00:33:52.760 +code might be doing + +00:33:52.760 --> 00:33:54.040 +something wrong isn't + +00:33:54.040 --> 00:33:54.920 +as high a priority. + +00:33:54.920 --> 00:33:56.200 +And another type + +00:33:56.200 --> 00:33:56.780 +checker might have + +00:33:56.780 --> 00:33:57.640 +a different priority. 
+
+00:33:57.860 --> 00:33:58.760
+One thing I do
+
+00:33:58.760 --> 00:33:59.820
+want to mention is
+
+00:33:59.820 --> 00:34:01.320
+that it may not
+
+00:34:01.320 --> 00:34:02.540
+seem like it, but
+
+00:34:02.540 --> 00:34:04.080
+things are already
+
+00:34:04.080 --> 00:34:05.900
+much better than
+
+00:34:05.900 --> 00:34:06.700
+they used to be.
+
+00:34:07.120 --> 00:34:08.160
+Like previously, I
+
+00:34:08.160 --> 00:34:09.200
+worked on a
+
+00:34:09.200 --> 00:34:09.700
+different type
+
+00:34:09.700 --> 00:34:10.520
+checker called
+
+00:34:10.520 --> 00:34:11.420
+pytype.
+
+00:34:11.580 --> 00:34:12.380
+And at that time,
+
+00:34:12.420 --> 00:34:13.140
+it was, you know,
+
+00:34:13.480 --> 00:34:14.280
+sort of the Wild
+
+00:34:14.280 --> 00:34:14.640
+West.
+
+00:34:14.760 --> 00:34:15.440
+Like we want to
+
+00:34:15.440 --> 00:34:16.680
+know how other
+
+00:34:16.680 --> 00:34:17.700
+type checkers like
+
+00:34:17.700 --> 00:34:18.400
+do something.
+
+00:34:18.520 --> 00:34:18.860
+Well, you know,
+
+00:34:18.880 --> 00:34:19.500
+like open up the
+
+00:34:19.500 --> 00:34:20.260
+mypy playground,
+
+00:34:20.520 --> 00:34:21.040
+open up the
+
+00:34:21.040 --> 00:34:22.380
+Pyright playground,
+
+00:34:22.800 --> 00:34:24.660
+see what it tells
+
+00:34:24.660 --> 00:34:24.880
+you.
+
+00:34:25.020 --> 00:34:25.920
+Now we at least
+
+00:34:25.920 --> 00:34:26.740
+have spec and
+
+00:34:26.740 --> 00:34:27.760
+conformance tests.
+
+00:34:28.000 --> 00:34:28.460
+Yeah, that's
+
+00:34:28.460 --> 00:34:28.880
+really cool.
+
+00:34:29.080 --> 00:34:29.640
+How much would
+
+00:34:29.640 --> 00:34:31.260
+you say that
+
+00:34:31.260 --> 00:34:32.940
+your two type
+
+00:34:32.940 --> 00:34:34.140
+checkers maybe
+
+00:34:34.140 --> 00:34:35.420
+bring in mypy
+
+00:34:35.420 --> 00:34:35.900
+as well?
+ +00:34:36.020 --> 00:34:36.460 +Like how much + +00:34:36.460 --> 00:34:37.200 +do they agree + +00:34:37.200 --> 00:34:38.060 +versus disagree? + +00:34:38.320 --> 00:34:38.880 +You know, like + +00:34:38.880 --> 00:34:40.060 +you only see the + +00:34:40.060 --> 00:34:40.380 +differences. + +00:34:40.540 --> 00:34:41.140 +You don't see in + +00:34:41.140 --> 00:34:41.900 +which ways that + +00:34:41.900 --> 00:34:42.640 +they are the same + +00:34:42.640 --> 00:34:43.840 +as a consumer of + +00:34:43.840 --> 00:34:44.380 +them so much, + +00:34:44.440 --> 00:34:44.580 +right? + +00:34:44.580 --> 00:34:45.040 +You're like, why + +00:34:45.040 --> 00:34:45.420 +is this one + +00:34:45.420 --> 00:34:46.060 +squiggly when it + +00:34:46.060 --> 00:34:46.680 +wasn't squiggly + +00:34:46.680 --> 00:34:47.120 +before? + +00:34:47.600 --> 00:34:48.540 +But how similar + +00:34:48.540 --> 00:34:49.020 +or different are + +00:34:49.020 --> 00:34:49.140 +they? + +00:34:49.240 --> 00:34:49.640 +I don't know how + +00:34:49.640 --> 00:34:50.260 +we would quantify + +00:34:50.260 --> 00:34:50.680 +that. + +00:34:51.000 --> 00:34:51.900 +I think there's a + +00:34:51.900 --> 00:34:52.640 +lot that is the + +00:34:52.640 --> 00:34:53.580 +same just because + +00:34:53.580 --> 00:34:54.760 +it's based on how + +00:34:54.760 --> 00:34:55.480 +Python actually + +00:34:55.480 --> 00:34:55.860 +works. + +00:34:56.200 --> 00:34:56.960 +We're both trying + +00:34:56.960 --> 00:34:57.420 +to model the + +00:34:57.420 --> 00:34:58.280 +same language and + +00:34:58.280 --> 00:34:58.540 +then there's + +00:34:58.540 --> 00:34:59.100 +certainly also + +00:34:59.100 --> 00:34:59.680 +plenty of + +00:34:59.680 --> 00:35:00.660 +differences or + +00:35:00.660 --> 00:35:01.120 +things that we + +00:35:01.120 --> 00:35:01.700 +handle differently. 
+
+00:35:02.060 --> 00:35:02.960
+So Rebecca, do you
+
+00:35:02.960 --> 00:35:03.360
+have a better way
+
+00:35:03.360 --> 00:35:04.020
+to quantify that?
+
+00:35:04.400 --> 00:35:05.360
+Yeah, I agree.
+
+00:35:05.360 --> 00:35:06.840
+It's hard to
+
+00:35:06.840 --> 00:35:08.180
+quantify, I suppose,
+
+00:35:08.300 --> 00:35:08.940
+talk a bit
+
+00:35:08.940 --> 00:35:10.220
+abstractly about
+
+00:35:10.220 --> 00:35:11.040
+various type
+
+00:35:11.040 --> 00:35:12.640
+checkers' philosophies.
+
+00:35:13.180 --> 00:35:14.520
+With Pyrefly, we
+
+00:35:14.520 --> 00:35:15.380
+really try to do
+
+00:35:15.380 --> 00:35:16.360
+a lot of type
+
+00:35:16.360 --> 00:35:16.960
+inference.
+
+00:35:17.220 --> 00:35:17.940
+So that's a way
+
+00:35:17.940 --> 00:35:18.840
+in which we
+
+00:35:18.840 --> 00:35:19.760
+intentionally diverge
+
+00:35:19.760 --> 00:35:21.140
+a bit from mypy.
+
+00:35:21.320 --> 00:35:22.560
+But other than
+
+00:35:22.560 --> 00:35:23.300
+that deliberate
+
+00:35:23.300 --> 00:35:24.260
+decision, if we
+
+00:35:24.260 --> 00:35:24.880
+see ways in which
+
+00:35:24.880 --> 00:35:25.940
+we are accidentally
+
+00:35:25.940 --> 00:35:27.120
+different, we do
+
+00:35:27.120 --> 00:35:28.800
+try to fix that
+
+00:35:28.800 --> 00:35:29.460
+because otherwise
+
+00:35:29.460 --> 00:35:30.540
+people would have
+
+00:35:30.540 --> 00:35:31.580
+a hard time running
+
+00:35:31.580 --> 00:35:32.160
+multiple type
+
+00:35:32.160 --> 00:35:32.760
+checkers or
+
+00:35:32.760 --> 00:35:33.220
+migrating.
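The inference divergence Rebecca describes can be seen with a tiny, self-contained sketch. Checker behavior varies by tool and version, so the comments describe the general situation rather than any one checker's exact output:

```python
# Illustrative sketch of where type checkers' inference can diverge.
# `value` has no annotation, so each checker infers a type on its own:
# some infer a precise union for the conditional expression, others
# have historically used a wider type.
try:
    from typing import reveal_type  # runtime helper, Python 3.11+
except ImportError:
    def reveal_type(obj):  # no-op stand-in for older Pythons
        return obj

def pick(flag: bool):
    value = 1 if flag else "one"
    # At runtime, reveal_type() just prints the runtime type and returns
    # its argument; statically, each checker reports what it inferred.
    return reveal_type(value)

print(pick(True))
print(pick(False))
```

Pasting a snippet like this into the mypy and Pyright playgrounds is exactly the kind of side-by-side comparison Rebecca mentions; the spec and conformance tests now pin down many (though not all) of these cases.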
+ +00:35:33.460 --> 00:35:33.960 +Yeah, differences + +00:35:33.960 --> 00:35:34.980 +obviously cause + +00:35:35.620 --> 00:35:36.560 +pain for users + +00:35:36.560 --> 00:35:37.220 +who are using + +00:35:37.220 --> 00:35:37.760 +multiple type + +00:35:37.760 --> 00:35:38.340 +checkers or + +00:35:38.340 --> 00:35:39.120 +writing libraries + +00:35:39.120 --> 00:35:39.580 +that need to + +00:35:39.580 --> 00:35:40.160 +support multiple + +00:35:40.160 --> 00:35:40.820 +type checkers. + +00:35:40.960 --> 00:35:41.960 +So like Rebecca + +00:35:41.960 --> 00:35:43.000 +said, it's like if + +00:35:43.000 --> 00:35:43.720 +we are different + +00:35:43.720 --> 00:35:44.440 +from other type + +00:35:44.440 --> 00:35:45.040 +checkers, we want + +00:35:45.040 --> 00:35:45.540 +to be sure that + +00:35:45.540 --> 00:35:45.940 +there's a good + +00:35:45.940 --> 00:35:46.920 +reason for that + +00:35:46.920 --> 00:35:47.340 +difference. + +00:35:47.600 --> 00:35:47.960 +The difference + +00:35:47.960 --> 00:35:49.040 +should be because + +00:35:49.040 --> 00:35:49.780 +of philosophical + +00:35:49.780 --> 00:35:51.580 +choice, not just + +00:35:51.580 --> 00:35:52.760 +you happen to have + +00:35:52.760 --> 00:35:53.480 +chosen slightly + +00:35:53.480 --> 00:35:54.140 +differently, right? + +00:35:54.420 --> 00:35:54.840 +Yeah, and it's + +00:35:54.840 --> 00:35:56.420 +not just people who + +00:35:56.420 --> 00:35:57.740 +run different type + +00:35:57.740 --> 00:35:58.240 +checkers. 
+
+00:35:58.800 --> 00:35:59.600
+Like you pointed
+
+00:35:59.600 --> 00:36:00.260
+out, Carl, a lot
+
+00:36:00.260 --> 00:36:01.140
+of times it is if I
+
+00:36:01.140 --> 00:36:02.500
+have a library and
+
+00:36:02.500 --> 00:36:03.740
+then different people
+
+00:36:03.740 --> 00:36:04.420
+want to consume
+
+00:36:04.420 --> 00:36:05.360
+that library, then
+
+00:36:05.360 --> 00:36:06.600
+their type checker
+
+00:36:06.600 --> 00:36:07.300
+may or may not
+
+00:36:07.300 --> 00:36:08.200
+warn them about how
+
+00:36:08.200 --> 00:36:09.220
+my library declares
+
+00:36:09.220 --> 00:36:10.660
+its types and so
+
+00:36:10.660 --> 00:36:10.960
+on.
+
+00:36:11.700 --> 00:36:12.660
+I'll give you a
+
+00:36:12.660 --> 00:36:13.760
+real quick example.
+
+00:36:14.380 --> 00:36:16.540
+I have a, I can't
+
+00:36:16.540 --> 00:36:17.100
+remember which one
+
+00:36:17.100 --> 00:36:17.420
+it was, I have
+
+00:36:17.420 --> 00:36:17.880
+three or four
+
+00:36:17.880 --> 00:36:19.000
+different open
+
+00:36:19.000 --> 00:36:19.620
+source libraries
+
+00:36:19.620 --> 00:36:20.140
+that I've created
+
+00:36:20.140 --> 00:36:21.700
+that somehow work
+
+00:36:21.700 --> 00:36:22.960
+with creating,
+
+00:36:23.540 --> 00:36:24.440
+basically passing
+
+00:36:24.440 --> 00:36:25.600
+data to templates
+
+00:36:25.600 --> 00:36:26.980
+in web apps,
+
+00:36:27.340 --> 00:36:27.460
+right?
+
+00:36:27.500 --> 00:36:28.700
+So one is like I
+
+00:36:28.700 --> 00:36:29.260
+want to use the
+
+00:36:29.260 --> 00:36:30.020
+Chameleon web
+
+00:36:30.020 --> 00:36:30.740
+template framework,
+
+00:36:30.740 --> 00:36:31.480
+but with
+
+00:36:31.480 --> 00:36:32.680
+FastAPI or with
+
+00:36:32.680 --> 00:36:33.480
+Flask or there's
+
+00:36:33.480 --> 00:36:33.920
+some other
+
+00:36:33.920 --> 00:36:34.580
+variations like
+
+00:36:34.580 --> 00:36:35.360
+partials and so
+
+00:36:35.360 --> 00:36:35.500
+on.
+
+00:36:35.700 --> 00:36:36.360
+I can't remember
+
+00:36:36.360 --> 00:36:37.180
+which one, but it
+
+00:36:37.180 --> 00:36:37.500
+doesn't really
+
+00:36:37.500 --> 00:36:37.720
+matter.
+
+00:36:37.840 --> 00:36:38.280
+One of them
+
+00:36:38.280 --> 00:36:39.760
+decorated a
+
+00:36:39.760 --> 00:36:40.680
+Flask.
+
+00:36:40.820 --> 00:36:41.600
+I think it was a
+
+00:36:41.600 --> 00:36:42.420
+Flask, especially
+
+00:36:42.420 --> 00:36:43.480
+makes it irrelevant.
+
+00:36:43.900 --> 00:36:44.980
+A Flask endpoint
+
+00:36:44.980 --> 00:36:46.760
+and Pyright was
+
+00:36:46.760 --> 00:36:47.660
+really upset.
+
+00:36:47.860 --> 00:36:48.480
+Like the error
+
+00:36:48.480 --> 00:36:49.460
+message filled the
+
+00:36:49.460 --> 00:36:50.620
+entire page of
+
+00:36:50.620 --> 00:36:51.240
+how it was
+
+00:36:51.240 --> 00:36:53.000
+inconsistent with
+
+00:36:53.000 --> 00:36:54.020
+what it expected
+
+00:36:54.020 --> 00:36:54.920
+for the definition
+
+00:36:54.920 --> 00:36:55.980
+of the Flask
+
+00:36:55.980 --> 00:36:56.500
+view method.
+
+00:36:56.600 --> 00:36:57.720
+I'm like, no one
+
+00:36:57.720 --> 00:36:58.320
+is going to call
+
+00:36:58.320 --> 00:36:58.540
+this.
+
+00:36:58.540 --> 00:36:58.980
+Like what does
+
+00:36:58.980 --> 00:36:59.640
+it even matter
+
+00:36:59.640 --> 00:37:00.840
+what this type
+
+00:37:00.840 --> 00:37:01.080
+is?
+
+00:37:01.180 --> 00:37:01.960
+It still runs
+
+00:37:01.960 --> 00:37:02.260
+fine.
+
+00:37:02.320 --> 00:37:02.920
+The runtime is
+
+00:37:02.920 --> 00:37:03.260
+fine.
+
+00:37:03.540 --> 00:37:04.780
+You know, it's no
+
+00:37:04.780 --> 00:37:05.480
+problem with this
+
+00:37:05.480 --> 00:37:05.860
+decorator.
+
+00:37:05.980 --> 00:37:07.260
+It worked fine, but
+
+00:37:07.260 --> 00:37:08.800
+something about the
+
+00:37:08.800 --> 00:37:10.040
+way that the Flask
+
+00:37:10.040 --> 00:37:11.980
+@get returned
+
+00:37:11.980 --> 00:37:12.820
+the type versus
+
+00:37:12.820 --> 00:37:13.560
+what my thing
+
+00:37:13.560 --> 00:37:14.500
+returned varied in
+
+00:37:14.500 --> 00:37:15.060
+like a really
+
+00:37:15.060 --> 00:37:16.320
+slight way.
+
+00:37:16.460 --> 00:37:17.600
+I didn't care, but
+
+00:37:17.600 --> 00:37:18.600
+somebody was using
+
+00:37:18.600 --> 00:37:19.780
+some editor that
+
+00:37:19.780 --> 00:37:20.700
+used Pyright and
+
+00:37:20.700 --> 00:37:21.140
+they're like, you
+
+00:37:21.140 --> 00:37:22.380
+have to help fix
+
+00:37:22.380 --> 00:37:22.560
+this.
+
+00:37:22.640 --> 00:37:23.300
+I can't take all
+
+00:37:23.300 --> 00:37:23.760
+these warnings.
+
+00:37:23.880 --> 00:37:24.500
+They're huge and
+
+00:37:24.500 --> 00:37:25.080
+they're everywhere.
+
+00:37:25.420 --> 00:37:26.780
+Like, okay, I'll
+
+00:37:26.780 --> 00:37:27.360
+go fix it.
+
+00:37:27.360 --> 00:37:27.720
+Right.
+
+00:37:27.720 --> 00:37:29.060
+And I went and I
+
+00:37:29.060 --> 00:37:30.740
+put way more effort
+
+00:37:30.740 --> 00:37:31.440
+than was justified
+
+00:37:31.440 --> 00:37:32.840
+into a function type
+
+00:37:32.840 --> 00:37:33.340
+that no one ever
+
+00:37:33.340 --> 00:37:34.580
+calls just to make
+
+00:37:34.580 --> 00:37:36.080
+the errors on some
+
+00:37:36.080 --> 00:37:37.140
+type checker I
+
+00:37:37.140 --> 00:37:38.420
+didn't use go
+
+00:37:38.420 --> 00:37:38.800
+away.
+
+00:37:38.920 --> 00:37:39.220
+Right.
+
+00:37:39.260 --> 00:37:39.800
+And that's the
+
+00:37:39.800 --> 00:37:40.540
+kind of thing where
+
+00:37:40.540 --> 00:37:41.300
+it becomes just a
+
+00:37:41.300 --> 00:37:41.580
+headache.
+
+00:37:41.760 --> 00:37:42.260
+I don't know.
+
+00:37:42.280 --> 00:37:42.820
+I wish I remember.
+ +00:37:42.880 --> 00:37:43.540 +I probably got that + +00:37:43.540 --> 00:37:44.800 +written down in an + +00:37:44.800 --> 00:37:45.900 +issue somebody filed, + +00:37:46.000 --> 00:37:46.840 +but it was, it was + +00:37:46.840 --> 00:37:47.520 +a gnarly error. + +00:37:47.660 --> 00:37:48.520 +And, or if you're + +00:37:48.520 --> 00:37:49.120 +working on an open + +00:37:49.120 --> 00:37:50.240 +source project, you + +00:37:50.240 --> 00:37:50.900 +know, you can't make + +00:37:50.900 --> 00:37:51.700 +everybody use the + +00:37:51.700 --> 00:37:53.100 +same editor that + +00:37:53.100 --> 00:37:53.940 +wants to contribute + +00:37:53.940 --> 00:37:54.800 +on a big project. + +00:37:55.160 --> 00:37:55.980 +And so you might run + +00:37:55.980 --> 00:37:56.880 +into this variation as + +00:37:56.880 --> 00:37:57.020 +well. + +00:37:57.020 --> 00:37:57.580 +So there's a lot + +00:37:57.580 --> 00:37:58.000 +of cases. + +00:37:58.240 --> 00:37:58.320 +Yeah. + +00:37:58.340 --> 00:37:58.940 +It can be really + +00:37:58.940 --> 00:38:00.220 +difficult to make + +00:38:00.220 --> 00:38:01.240 +these decisions about + +00:38:01.240 --> 00:38:03.020 +what kind of, what + +00:38:03.020 --> 00:38:04.460 +sorts of errors people + +00:38:04.460 --> 00:38:06.220 +want their type checker + +00:38:06.220 --> 00:38:07.560 +to catch or what's + +00:38:07.560 --> 00:38:08.120 +too pedantic. + +00:38:08.460 --> 00:38:09.440 +You want your type + +00:38:09.440 --> 00:38:10.200 +checker to catch + +00:38:10.200 --> 00:38:11.700 +non-obvious errors, + +00:38:11.700 --> 00:38:12.740 +not just the obvious + +00:38:12.740 --> 00:38:13.440 +ones that you probably + +00:38:13.440 --> 00:38:14.020 +would have seen by + +00:38:14.020 --> 00:38:14.660 +looking at the code + +00:38:14.660 --> 00:38:15.040 +yourself. 
+
+00:38:15.300 --> 00:38:16.020
+But then there'll be
+
+00:38:16.020 --> 00:38:17.440
+cases where somebody
+
+00:38:17.440 --> 00:38:18.040
+says, well, I don't
+
+00:38:18.040 --> 00:38:18.240
+care.
+
+00:38:18.300 --> 00:38:18.940
+That's too pedantic.
+
+00:38:19.160 --> 00:38:20.220
+And it is difficult to
+
+00:38:20.220 --> 00:38:20.960
+make everyone happy.
+
+00:38:21.640 --> 00:38:22.560
+Who decides what the
+
+00:38:22.560 --> 00:38:24.360
+right signature of a
+
+00:38:24.360 --> 00:38:25.860
+Flask view endpoint
+
+00:38:25.860 --> 00:38:26.940
+should be? Like if
+
+00:38:27.020 --> 00:38:27.900
+the framework can call
+
+00:38:27.900 --> 00:38:28.080
+it.
+
+00:38:28.080 --> 00:38:29.080
+It should be okay.
+
+00:38:29.080 --> 00:38:30.080
+There's not.
+
+00:38:30.200 --> 00:38:30.800
+Just because it had a
+
+00:38:30.800 --> 00:38:31.600
+decorator before, that
+
+00:38:31.600 --> 00:38:32.300
+doesn't mean that's the
+
+00:38:32.300 --> 00:38:33.360
+official structure.
+
+00:38:33.480 --> 00:38:34.700
+But anyway, I do think
+
+00:38:34.700 --> 00:38:35.440
+one of the bigger
+
+00:38:35.440 --> 00:38:37.000
+philosophical differences
+
+00:38:37.000 --> 00:38:39.120
+has to do around this
+
+00:38:39.120 --> 00:38:41.100
+concept of nullability.
+
+00:38:41.340 --> 00:38:42.080
+Do you guys call it
+
+00:38:42.080 --> 00:38:43.360
+nullability or none
+
+00:38:43.360 --> 00:38:43.920
+ability?
+
+00:38:44.280 --> 00:38:45.600
+Like nullability comes
+
+00:38:45.600 --> 00:38:46.040
+from the other
+
+00:38:46.040 --> 00:38:46.420
+languages.
+
+00:38:46.620 --> 00:38:48.080
+And by that, I mean, I
+
+00:38:48.080 --> 00:38:49.220
+can specify that I have
+
+00:38:49.220 --> 00:38:49.760
+an integer.
+
+00:38:49.760 --> 00:38:51.100
+And in the Python type
+
+00:38:51.100 --> 00:38:53.180
+system, it cannot be set
+
+00:38:53.180 --> 00:38:54.220
+to none, even though in
+
+00:38:54.220 --> 00:38:54.940
+the runtime it can.
+
+00:38:55.440 --> 00:38:56.740
+It has to be a concrete
+
+00:38:56.740 --> 00:38:58.820
+int type unless you make
+
+00:38:58.820 --> 00:39:00.940
+it an optional int or an
+
+00:39:00.940 --> 00:39:02.460
+int pipe none or one of
+
+00:39:02.460 --> 00:39:03.480
+those type things, right?
+
+00:39:03.580 --> 00:39:05.340
+And how strong that gets
+
+00:39:05.340 --> 00:39:06.480
+enforced seems to be one
+
+00:39:06.480 --> 00:39:07.980
+of the biggest differences
+
+00:39:07.980 --> 00:39:09.460
+of opinion that I've
+
+00:39:09.460 --> 00:39:10.040
+seen around.
+
+00:39:10.240 --> 00:39:10.860
+Like, how do you all
+
+00:39:10.860 --> 00:39:11.460
+think about that?
+
+00:39:11.600 --> 00:39:12.580
+That's interesting to me
+
+00:39:12.580 --> 00:39:13.540
+that that's your
+
+00:39:13.540 --> 00:39:15.100
+experience because my
+
+00:39:15.100 --> 00:39:16.000
+experience has been that
+
+00:39:16.000 --> 00:39:16.860
+that's actually an area
+
+00:39:16.860 --> 00:39:18.220
+where everyone seems to
+
+00:39:18.220 --> 00:39:19.120
+agree as far as I can
+
+00:39:19.120 --> 00:39:21.300
+tell that these are
+
+00:39:21.300 --> 00:39:22.100
+an important source of
+
+00:39:22.100 --> 00:39:22.980
+bugs and it's better to
+
+00:39:22.980 --> 00:39:23.400
+catch them.
+
+00:39:23.480 --> 00:39:24.120
+So I think all of the
+
+00:39:24.120 --> 00:39:25.700
+type checkers, maybe you
+
+00:39:25.700 --> 00:39:26.720
+said PyCharm doesn't.
+
+00:39:26.900 --> 00:39:27.720
+I don't think PyCharm
+
+00:39:27.720 --> 00:39:28.220
+does that.
+
+00:39:28.340 --> 00:39:29.680
+I'm pretty sure it
+
+00:39:29.680 --> 00:39:31.440
+doesn't because I agree
+
+00:39:31.440 --> 00:39:32.200
+that it's an important
+
+00:39:32.200 --> 00:39:34.020
+thing to check, but it's
+
+00:39:34.020 --> 00:39:35.920
+also a point of a lot of
+
+00:39:35.920 --> 00:39:36.480
+friction.
+ +00:39:37.100 --> 00:39:37.920 +And by that, I mean, + +00:39:38.180 --> 00:39:39.480 +let's suppose I'm going + +00:39:39.480 --> 00:39:41.700 +to have a class that I + +00:39:41.700 --> 00:39:42.340 +need to create an + +00:39:42.340 --> 00:39:43.920 +instance of and then put + +00:39:43.920 --> 00:39:44.900 +values into. + +00:39:45.320 --> 00:39:46.580 +And I know once I put + +00:39:46.580 --> 00:39:47.680 +the values into it, let's + +00:39:47.680 --> 00:39:49.480 +say it has a user ID, I + +00:39:49.480 --> 00:39:50.960 +know for certain that + +00:39:50.960 --> 00:39:51.960 +that's going to be an + +00:39:51.960 --> 00:39:53.020 +integer, right? + +00:39:53.060 --> 00:39:54.260 +So I'd like to say user + +00:39:54.260 --> 00:39:55.520 +ID colon int because + +00:39:55.520 --> 00:39:57.000 +everywhere I use that + +00:39:57.000 --> 00:39:58.840 +object later, if it's a + +00:39:58.840 --> 00:39:59.820 +function that takes an + +00:39:59.820 --> 00:40:01.540 +int and I specify it as + +00:40:01.540 --> 00:40:03.000 +optional int, I will get + +00:40:03.000 --> 00:40:03.840 +a type check warning + +00:40:03.840 --> 00:40:05.400 +every single call site + +00:40:05.400 --> 00:40:06.960 +when I try to pass that. + +00:40:07.240 --> 00:40:08.100 +But I know from the + +00:40:08.100 --> 00:40:09.000 +semantics of the + +00:40:09.000 --> 00:40:10.780 +behavior that it's going + +00:40:10.780 --> 00:40:12.480 +to always be an int + +00:40:12.480 --> 00:40:13.440 +unless it's not + +00:40:13.440 --> 00:40:14.820 +initialized, right? + +00:40:14.820 --> 00:40:16.080 +And like in this short + +00:40:16.080 --> 00:40:17.060 +period where I want to + +00:40:17.060 --> 00:40:17.480 +create it. + +00:40:17.560 --> 00:40:18.700 +So I can't set the type + +00:40:18.700 --> 00:40:19.060 +to int. + +00:40:19.120 --> 00:40:19.660 +I have to set the + +00:40:19.660 --> 00:40:20.760 +optional int until I've + +00:40:20.760 --> 00:40:21.300 +loaded it. 
+ +00:40:21.920 --> 00:40:22.680 +And, but there's like + +00:40:22.680 --> 00:40:23.660 +this, I don't know, + +00:40:23.720 --> 00:40:24.380 +that's, that's the part + +00:40:24.380 --> 00:40:25.480 +where I see a lot of it + +00:40:25.480 --> 00:40:27.720 +show up is inconsistencies + +00:40:27.720 --> 00:40:29.400 +and then warnings all + +00:40:29.400 --> 00:40:29.980 +over the place. + +00:40:29.980 --> 00:40:31.140 +So I'm like, well, but + +00:40:31.140 --> 00:40:32.300 +that function is actually + +00:40:32.300 --> 00:40:33.240 +checking if it's none + +00:40:33.240 --> 00:40:34.880 +and it'll return null, + +00:40:35.080 --> 00:40:35.900 +you know, none or + +00:40:35.900 --> 00:40:36.500 +something like that. + +00:40:36.500 --> 00:40:38.200 +So I totally agree with + +00:40:38.200 --> 00:40:38.380 +you. + +00:40:38.520 --> 00:40:39.420 +It's just somewhere I've + +00:40:39.420 --> 00:40:40.320 +seen the most + +00:40:40.320 --> 00:40:41.860 +inconsistencies across + +00:40:41.860 --> 00:40:43.660 +maybe PyCharm versus + +00:40:43.660 --> 00:40:44.280 +others. + +00:40:44.480 --> 00:40:45.580 +mypy also has a legacy + +00:40:45.580 --> 00:40:46.900 +mode for not checking + +00:40:46.900 --> 00:40:48.340 +none things called + +00:40:48.340 --> 00:40:49.760 +non-strict optional. + +00:40:50.200 --> 00:40:51.240 +We're trying to get + +00:40:51.240 --> 00:40:52.300 +rid of that from mypy + +00:40:52.300 --> 00:40:53.480 +because yeah, strict + +00:40:53.480 --> 00:40:54.360 +optional, like being + +00:40:54.360 --> 00:40:55.200 +strict about it is the + +00:40:55.200 --> 00:40:55.940 +more sensible thing to + +00:40:55.940 --> 00:40:56.160 +do. + +00:40:56.480 --> 00:40:57.580 +But it's possible that + +00:40:57.580 --> 00:40:58.540 +you've seen that too. + +00:40:59.020 --> 00:40:59.460 +Yeah, I agree. 
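The user-ID scenario Michael describes looks roughly like this under strict optional checking (the class and function names here are hypothetical, just to make the shape concrete):

```python
# Hypothetical sketch of the scenario described: an attribute typed
# `int | None` triggers a warning at every call site that expects a
# plain `int`, even when the author "knows" it is loaded by then.
class User:
    def __init__(self) -> None:
        self.user_id: int | None = None  # not known until loaded

def fetch_orders(user_id: int) -> list[str]:
    return [f"order-{user_id}"]

user = User()
user.user_id = 42  # loaded later, e.g. after parsing a request

# A strict-optional checker flags passing `user.user_id` directly,
# because `int | None` is not `int`. Narrowing satisfies it:
if user.user_id is not None:
    print(fetch_orders(user.user_id))
```

With mypy's legacy non-strict optional mode, the un-narrowed call would pass silently, which is exactly the behavior the guests say they are trying to retire.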
+ +00:40:59.720 --> 00:41:00.320 +So what you mentioned + +00:41:00.320 --> 00:41:01.720 +is maybe sort of a + +00:41:01.720 --> 00:41:02.540 +special case of the + +00:41:02.540 --> 00:41:03.540 +case where you pass + +00:41:03.540 --> 00:41:04.360 +something to a class + +00:41:04.360 --> 00:41:05.380 +and there's initialization + +00:41:05.380 --> 00:41:06.440 +that changes the types. + +00:41:06.820 --> 00:41:07.660 +Doesn't necessarily have + +00:41:07.660 --> 00:41:08.220 +to deal with none. + +00:41:08.300 --> 00:41:08.960 +It could also just be + +00:41:08.960 --> 00:41:09.780 +like the attribute doesn't + +00:41:09.780 --> 00:41:10.740 +exist at all beforehand + +00:41:10.740 --> 00:41:11.240 +or something. + +00:41:11.240 --> 00:41:12.940 +Yeah, we don't have a + +00:41:12.940 --> 00:41:13.780 +good solution for that. + +00:41:13.900 --> 00:41:14.840 +Maybe there's room for + +00:41:14.840 --> 00:41:16.120 +something to support that + +00:41:16.120 --> 00:41:16.720 +use case better. + +00:41:17.020 --> 00:41:17.660 +I don't know what it + +00:41:17.660 --> 00:41:18.120 +would look like. + +00:41:18.260 --> 00:41:19.180 +In some cases, there's + +00:41:19.180 --> 00:41:20.720 +ways you can, these + +00:41:20.720 --> 00:41:21.900 +things can sometimes + +00:41:21.900 --> 00:41:22.760 +nudge you towards a + +00:41:22.760 --> 00:41:23.760 +different design that is + +00:41:23.760 --> 00:41:24.920 +actually safer and will + +00:41:24.920 --> 00:41:26.020 +avoid errors. + +00:41:26.020 --> 00:41:27.040 +Like in the kind of + +00:41:27.040 --> 00:41:27.640 +case you're talking + +00:41:27.640 --> 00:41:29.000 +about, you know, is + +00:41:29.000 --> 00:41:30.580 +it actually necessary + +00:41:30.580 --> 00:41:31.820 +that an uninitialized + +00:41:31.820 --> 00:41:32.600 +object and an + +00:41:32.600 --> 00:41:33.360 +initialized one are + +00:41:33.360 --> 00:41:34.120 +represented by the + +00:41:34.120 --> 00:41:34.580 +same type? 
+
+00:41:34.780 --> 00:41:35.840
+Or is there a way to
+
+00:41:35.840 --> 00:41:37.260
+adjust the API so that
+
+00:41:37.260 --> 00:41:37.840
+those are actually
+
+00:41:37.840 --> 00:41:38.960
+different types, then you
+
+00:41:38.960 --> 00:41:40.080
+solve the problem and
+
+00:41:40.080 --> 00:41:41.220
+your code is safer
+
+00:41:41.220 --> 00:41:41.580
+or so?
+
+00:41:41.580 --> 00:41:42.500
+I'm thinking like you
+
+00:41:42.500 --> 00:41:44.500
+submit a web form and
+
+00:41:44.500 --> 00:41:45.680
+before you parse it, you've
+
+00:41:45.680 --> 00:41:46.440
+got to create the instance
+
+00:41:46.440 --> 00:41:47.360
+to set the values.
+
+00:41:47.940 --> 00:41:48.300
+And I don't know.
+
+00:41:48.420 --> 00:41:49.320
+It's not worth diving into,
+
+00:41:49.380 --> 00:41:50.340
+but I do find this
+
+00:41:50.340 --> 00:41:52.260
+differentiation between like
+
+00:41:52.260 --> 00:41:53.880
+the strict enforcement of
+
+00:41:53.880 --> 00:41:55.200
+none versus not none.
+
+00:41:55.340 --> 00:41:56.500
+I think it's powerful and I
+
+00:41:56.500 --> 00:41:57.440
+do think you all are right
+
+00:41:57.440 --> 00:41:58.600
+that it does catch a lot of
+
+00:41:58.600 --> 00:41:58.840
+errors.
+
+00:41:58.940 --> 00:42:00.000
+It's just, it's just a
+
+00:42:00.000 --> 00:42:00.880
+difference and it's just an
+
+00:42:00.880 --> 00:42:02.280
+interesting, interesting
+
+00:42:02.280 --> 00:42:02.560
+choice.
+
+00:42:02.680 --> 00:42:03.620
+But I didn't get a
+
+00:42:03.620 --> 00:42:05.280
+concrete answer from the
+
+00:42:05.280 --> 00:42:06.140
+official council.
+
+00:42:06.720 --> 00:42:07.640
+Nullable or
+
+00:42:07.640 --> 00:42:08.200
+noneable?
+
+00:42:08.640 --> 00:42:09.140
+What is it?
+
+00:42:09.140 --> 00:42:09.700
+I feel like you just
+
+00:42:09.700 --> 00:42:10.600
+don't really even talk
+
+00:42:10.600 --> 00:42:11.700
+about it as a term mostly.
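One concrete version of Carl's "make uninitialized and initialized different types" suggestion is to keep the raw form data in its own class and only construct the fully typed object after parsing. Everything below is a hypothetical sketch, not an API from the show:

```python
# Hypothetical sketch: raw form data and the parsed object are separate
# types, so `user_id` can be a plain `int` with no Optional anywhere.
from dataclasses import dataclass

@dataclass
class UserForm:
    raw_user_id: str  # straight from the submitted web form

@dataclass
class User:
    user_id: int  # always present, never None

def parse(form: UserForm) -> User:
    # Parsing failures would raise here instead of producing a
    # half-initialized User with None fields.
    return User(user_id=int(form.raw_user_id))

user = parse(UserForm(raw_user_id="42"))
print(user.user_id)  # plain int, no narrowing needed at call sites
```

The trade-off is an extra class, but every downstream function that takes a `User` gets a non-optional `user_id`, so the call-site warnings Michael describes never appear.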
+
+00:42:12.340 --> 00:42:13.900
+It's, yeah, none is
+
+00:42:13.900 --> 00:42:15.420
+special in the type system
+
+00:42:15.420 --> 00:42:16.720
+in like how you represent
+
+00:42:16.720 --> 00:42:17.820
+it, but it's not really
+
+00:42:17.820 --> 00:42:19.660
+special in other ways.
+
+00:42:19.920 --> 00:42:20.940
+You don't have a term for
+
+00:42:20.940 --> 00:42:21.860
+int pipe none?
+
+00:42:22.140 --> 00:42:22.620
+Int or none.
+
+00:42:22.860 --> 00:42:23.780
+Historically, the term was
+
+00:42:23.780 --> 00:42:24.820
+optional, although I think
+
+00:42:24.820 --> 00:42:26.560
+that term has problems and
+
+00:42:26.560 --> 00:42:28.220
+we're sort of moving away
+
+00:42:28.220 --> 00:42:29.080
+from it because
+
+00:42:29.080 --> 00:42:31.680
+specifically one problem is
+
+00:42:31.680 --> 00:42:33.360
+that optional can mean
+
+00:42:33.360 --> 00:42:34.920
+you don't have to pass it
+
+00:42:34.920 --> 00:42:36.200
+in, like I say, as a
+
+00:42:36.200 --> 00:42:36.900
+function parameter.
+
+00:42:37.080 --> 00:42:38.200
+Let's talk a little bit
+
+00:42:38.200 --> 00:42:39.600
+about typeshed.
+
+00:42:40.060 --> 00:42:41.440
+I think typeshed is pretty
+
+00:42:41.440 --> 00:42:41.800
+interesting.
+
+00:42:41.940 --> 00:42:42.860
+Maybe people don't know
+
+00:42:42.860 --> 00:42:44.160
+too much about it.
+
+00:42:44.640 --> 00:42:45.840
+So I'm sure you all are
+
+00:42:45.840 --> 00:42:47.000
+familiar with this project
+
+00:42:47.000 --> 00:42:48.500
+that you can basically add
+
+00:42:48.500 --> 00:42:50.220
+type information that the
+
+00:42:50.220 --> 00:42:51.880
+libraries didn't bother to
+
+00:42:51.880 --> 00:42:52.740
+include for you, right?
+
+00:42:53.160 --> 00:42:54.460
+What are thoughts on typeshed?
+
+00:42:54.620 --> 00:42:55.640
+How much do you all lean on
+
+00:42:55.640 --> 00:42:57.060
+this to sort of round out
+
+00:42:57.060 --> 00:42:58.020
+missing types?
+
+00:42:58.240 --> 00:42:59.160
+There are two parts to
+
+00:42:59.160 --> 00:43:00.040
+typeshed, right?
+
+00:43:00.040 --> 00:43:02.440
+There's the standard library
+
+00:43:02.440 --> 00:43:04.480
+type stubs, which I think
+
+00:43:04.480 --> 00:43:05.600
+are invaluable.
+
+00:43:06.000 --> 00:43:07.260
+Like all the type checkers
+
+00:43:07.260 --> 00:43:07.980
+use those.
+
+00:43:08.480 --> 00:43:09.620
+And I mean, will the standard
+
+00:43:09.620 --> 00:43:10.660
+library itself ever have
+
+00:43:10.660 --> 00:43:11.880
+inline types?
+
+00:43:12.060 --> 00:43:12.640
+Who knows?
+
+00:43:12.700 --> 00:43:13.900
+This might be around forever.
+
+00:43:14.520 --> 00:43:15.900
+And then there are also
+
+00:43:15.900 --> 00:43:18.640
+the third party stubs.
+
+00:43:18.960 --> 00:43:20.320
+And I think that's what you're
+
+00:43:20.320 --> 00:43:20.780
+describing.
+
+00:43:20.940 --> 00:43:21.980
+They're libraries that for
+
+00:43:21.980 --> 00:43:23.160
+whatever reason don't ship
+
+00:43:23.160 --> 00:43:24.540
+with stubs themselves.
+
+00:43:24.920 --> 00:43:26.640
+Those are in typeshed.
+
+00:43:26.640 --> 00:43:29.080
+And I think it's been like
+
+00:43:29.080 --> 00:43:31.040
+for a while, there's sort of
+
+00:43:31.040 --> 00:43:31.940
+been a question of like what
+
+00:43:31.940 --> 00:43:33.560
+we want to do with like
+
+00:43:33.560 --> 00:43:35.440
+typeshed's third party stubs,
+
+00:43:35.540 --> 00:43:35.740
+right?
+
+00:43:35.860 --> 00:43:37.420
+Because like ideally like
+
+00:43:37.420 --> 00:43:38.480
+libraries would ship with
+
+00:43:38.480 --> 00:43:39.780
+their own types, but there
+
+00:43:39.780 --> 00:43:41.220
+are various obstacles to that.
+
+00:43:41.360 --> 00:43:43.120
+The obstacles that I know of
+
+00:43:43.120 --> 00:43:45.000
+used to be like, we want this
+
+00:43:45.000 --> 00:43:47.080
+to run on Python 2 and Python 3.
+ +00:43:47.520 --> 00:43:48.920 +Or we want it to run on + +00:43:48.920 --> 00:43:50.480 +Python 3.3 still. + +00:43:50.740 --> 00:43:52.480 +But it's been a long time + +00:43:52.480 --> 00:43:55.160 +since any non-type supporting + +00:43:55.160 --> 00:43:57.700 +version of Python was a real, + +00:43:57.840 --> 00:43:58.880 +you know, a supported type + +00:43:58.880 --> 00:43:59.500 +of thing, right? + +00:43:59.740 --> 00:44:01.300 +I mean, even 3.9 became + +00:44:01.300 --> 00:44:02.080 +deprecated. + +00:44:02.560 --> 00:44:04.440 +So on one hand, I feel like + +00:44:04.440 --> 00:44:05.880 +they could be merged in, + +00:44:05.920 --> 00:44:07.100 +but there's also a lot of + +00:44:07.100 --> 00:44:09.700 +other areas that are maybe + +00:44:09.700 --> 00:44:11.720 +we don't, they're not common, + +00:44:12.120 --> 00:44:12.300 +right? + +00:44:12.360 --> 00:44:13.900 +Like other libraries, like + +00:44:13.900 --> 00:44:15.960 +pick some, let's say Pyramid. + +00:44:16.120 --> 00:44:17.400 +I don't think the Pyramid web + +00:44:17.400 --> 00:44:19.220 +framework really ever got + +00:44:19.220 --> 00:44:20.280 +types added to it. + +00:44:20.280 --> 00:44:21.620 +Somebody could go and create + +00:44:21.620 --> 00:44:24.800 +a typeshed stub or a types + +00:44:24.800 --> 00:44:26.160 +underscore pyramid you could + +00:44:26.160 --> 00:44:27.180 +pip install and then we'll + +00:44:27.180 --> 00:44:28.040 +add the types, right? + +00:44:28.420 --> 00:44:29.680 +I certainly see it being + +00:44:29.680 --> 00:44:30.780 +really valuable for third + +00:44:30.780 --> 00:44:31.680 +party things that are just + +00:44:31.680 --> 00:44:33.000 +not going to get the type + +00:44:33.000 --> 00:44:33.840 +attention they need. + +00:44:33.960 --> 00:44:34.780 +Yeah, I think typeshed is + +00:44:34.780 --> 00:44:35.040 +great. + +00:44:35.180 --> 00:44:36.880 +I've spent a lot of time on + +00:44:36.880 --> 00:44:37.600 +improving it. 
+
+00:44:37.800 --> 00:44:39.020
+As Rebecca said, especially
+
+00:44:39.020 --> 00:44:39.880
+with the standard library,
+
+00:44:39.980 --> 00:44:41.080
+it's irreplaceable.
+
+00:44:41.440 --> 00:44:42.520
+For third party libraries,
+
+00:44:42.780 --> 00:44:44.360
+I think it's become less
+
+00:44:44.360 --> 00:44:45.200
+needed over time.
+
+00:44:45.980 --> 00:44:47.480
+It used to be that very few
+
+00:44:47.480 --> 00:44:49.060
+third party libraries had
+
+00:44:49.060 --> 00:44:49.760
+any types.
+
+00:44:49.760 --> 00:44:51.480
+Now that's obviously changed.
+
+00:44:51.600 --> 00:44:53.080
+A lot of libraries ship
+
+00:44:53.080 --> 00:44:54.700
+their own types, but still
+
+00:44:54.700 --> 00:44:56.180
+there are quite a few
+
+00:44:56.180 --> 00:44:57.680
+libraries left where
+
+00:44:57.680 --> 00:44:59.460
+there aren't inline types
+
+00:44:59.460 --> 00:45:00.580
+and typeshed can provide
+
+00:45:00.580 --> 00:45:01.300
+useful types.
+
+00:45:01.800 --> 00:45:02.660
+I think typeshed also
+
+00:45:02.660 --> 00:45:03.620
+provides a service because
+
+00:45:03.620 --> 00:45:04.880
+it has a really great
+
+00:45:04.880 --> 00:45:05.980
+framework for testing
+
+00:45:05.980 --> 00:45:06.660
+these types.
+
+00:45:06.940 --> 00:45:07.820
+We have tools like stubtest
+
+00:45:07.820 --> 00:45:09.260
+and various type
+
+00:45:09.260 --> 00:45:11.260
+checkers that help to
+
+00:45:11.260 --> 00:45:12.480
+make sure these types are
+
+00:45:12.480 --> 00:45:14.280
+good and meet a high
+
+00:45:14.280 --> 00:45:14.620
+standard.
+
+00:45:15.200 --> 00:45:16.100
+So yeah, I think they're
+
+00:45:16.100 --> 00:45:17.000
+still useful for many
+
+00:45:17.000 --> 00:45:17.460
+libraries.
+
+00:45:17.800 --> 00:45:18.640
+Yeah, I was just looking at
+
+00:45:18.640 --> 00:45:21.640
+the types dash flask and
+
+00:45:21.640 --> 00:45:23.320
+I guess it must be, must
+
+00:45:23.320 --> 00:45:24.440
+be gone because now
+
+00:45:24.440 --> 00:45:25.520
+Flask must have it
+
+00:45:25.520 --> 00:45:25.940
+internally.
+
+00:45:26.260 --> 00:45:27.720
+So it's kind of an interim
+
+00:45:27.720 --> 00:45:28.520
+sort of thing.
+
+00:45:28.580 --> 00:45:29.080
+That's pretty cool.
+
+00:45:29.240 --> 00:45:29.940
+In general, typeshed has
+
+00:45:29.940 --> 00:45:31.320
+the policy that we remove
+
+00:45:31.320 --> 00:45:32.460
+the stubs from typeshed if
+
+00:45:32.460 --> 00:45:34.120
+they are in the library
+
+00:45:34.120 --> 00:45:34.520
+itself.
+
+00:45:34.600 --> 00:45:35.340
+I find these super
+
+00:45:35.340 --> 00:45:37.160
+valuable because if there's
+
+00:45:37.160 --> 00:45:38.120
+a library I want to work
+
+00:45:38.120 --> 00:45:39.440
+with and it just doesn't
+
+00:45:39.440 --> 00:45:40.240
+have types for whatever
+
+00:45:40.240 --> 00:45:41.760
+reason, you can install
+
+00:45:41.760 --> 00:45:43.000
+stuff from here and all
+
+00:45:43.000 --> 00:45:43.920
+of a sudden your editor's
+
+00:45:43.920 --> 00:45:44.740
+way happier.
+
+00:45:44.740 --> 00:45:47.120
+I mean, I know we, you
+
+00:45:47.120 --> 00:45:48.340
+all agreed on like the
+
+00:45:48.340 --> 00:45:49.960
+API boundaries and I did
+
+00:45:49.960 --> 00:45:50.400
+as well.
+
+00:45:50.480 --> 00:45:51.040
+It's like that's one of
+
+00:45:51.040 --> 00:45:51.880
+the really cool things.
+
+00:45:51.980 --> 00:45:52.780
+The other thing that
+
+00:45:52.780 --> 00:45:53.620
+really makes me excited
+
+00:45:53.620 --> 00:45:54.960
+about types is if I hit
+
+00:45:54.960 --> 00:45:57.080
+dot in my editor, I get
+
+00:45:57.080 --> 00:45:58.960
+a meaningful list of real
+
+00:45:58.960 --> 00:46:00.420
+information about what I'm
+
+00:46:00.420 --> 00:46:00.900
+working on.
+
+00:46:00.960 --> 00:46:02.040
+And so adding, adding
+
+00:46:02.040 --> 00:46:03.320
+these types of things are
+
+00:46:03.320 --> 00:46:04.000
+pretty interesting.
+
+00:46:04.260 --> 00:46:05.440
+I want to ask you all
+
+00:46:05.440 --> 00:46:07.560
+about sort of these rogue,
+
+00:46:07.760 --> 00:46:10.300
+rogue tools that do stuff
+
+00:46:10.300 --> 00:46:11.460
+with Python typing that
+
+00:46:11.460 --> 00:46:12.900
+maybe y'all didn't intend.
+
+00:46:12.900 --> 00:46:14.060
+Like we all mentioned
+
+00:46:14.060 --> 00:46:15.840
+Pydantic, we've got
+
+00:46:15.840 --> 00:46:18.240
+Typer and FastAPI, but
+
+00:46:18.240 --> 00:46:19.640
+even a little farther out
+
+00:46:19.640 --> 00:46:20.960
+there is beartype.
+
+00:46:21.020 --> 00:46:21.620
+Are you familiar with
+
+00:46:21.620 --> 00:46:22.160
+beartype?
+
+00:46:22.600 --> 00:46:22.740
+Yeah.
+
+00:46:22.800 --> 00:46:23.640
+Beartype's interesting.
+
+00:46:24.280 --> 00:46:26.840
+You can import, they have
+
+00:46:26.840 --> 00:46:27.180
+fun.
+
+00:46:27.840 --> 00:46:29.080
+They have fun with their,
+
+00:46:29.180 --> 00:46:30.920
+their import names and
+
+00:46:30.920 --> 00:46:31.240
+stuff.
+
+00:46:31.480 --> 00:46:33.120
+But basically you can put
+
+00:46:33.120 --> 00:46:35.080
+a, either a decorator onto
+
+00:46:35.080 --> 00:46:36.940
+some sort of call site or
+
+00:46:36.940 --> 00:46:37.860
+something, or you can just
+
+00:46:37.860 --> 00:46:40.480
+do it to an entire package
+
+00:46:40.480 --> 00:46:41.700
+or entire modules rather.
+
+00:46:41.700 --> 00:46:43.740
+So just run beartype
+
+00:46:43.740 --> 00:46:45.200
+dot claw import beartype
+
+00:46:45.200 --> 00:46:45.440
+this.
+
+00:46:45.540 --> 00:46:47.380
+And then it actually turns
+
+00:46:47.380 --> 00:46:49.020
+into runtime type checks.
+
+00:46:49.640 --> 00:46:51.240
+Good idea, bad idea.
+
+00:46:51.440 --> 00:46:51.800
+Interesting.
+
+00:46:52.380 --> 00:46:53.060
+What do you all think?
+
+00:46:53.340 --> 00:46:54.860
+So un-Pythonic, you won't
+
+00:46:54.860 --> 00:46:55.920
+even open the webpage.
+
+00:46:56.060 --> 00:46:57.320
+People should feel free to
+
+00:46:57.320 --> 00:46:58.800
+write whatever code helps
+
+00:46:58.800 --> 00:47:00.220
+them make like better
+
+00:47:00.220 --> 00:47:00.600
+software.
+
+00:47:01.080 --> 00:47:01.940
+I haven't really used
+
+00:47:01.940 --> 00:47:02.840
+beartype much myself,
+
+00:47:03.040 --> 00:47:04.240
+but it's clearly useful for
+
+00:47:04.240 --> 00:47:04.700
+some people.
+
+00:47:05.020 --> 00:47:06.200
+And I think generally in
+
+00:47:06.200 --> 00:47:07.300
+designing a type system,
+
+00:47:07.400 --> 00:47:08.180
+we should try to
+
+00:47:08.180 --> 00:47:09.580
+accommodate all users who
+
+00:47:09.580 --> 00:47:10.460
+do useful things with the
+
+00:47:10.460 --> 00:47:10.980
+type system.
+
+00:47:10.980 --> 00:47:12.000
+And that includes things
+
+00:47:12.000 --> 00:47:13.020
+like Pydantic or
+
+00:47:13.020 --> 00:47:13.280
+beartype.
+
+00:47:13.400 --> 00:47:14.240
+It's pretty fast.
+
+00:47:14.360 --> 00:47:15.900
+It's not as big of a hit
+
+00:47:15.900 --> 00:47:17.540
+as you would, you would
+
+00:47:17.540 --> 00:47:17.920
+imagine.
+
+00:47:18.580 --> 00:47:19.360
+They, let me see, what
+
+00:47:19.360 --> 00:47:20.560
+are they, somewhere they
+
+00:47:20.560 --> 00:47:21.960
+had a really fun, fun
+
+00:47:21.960 --> 00:47:23.240
+saying in here, but here
+
+00:47:23.240 --> 00:47:23.560
+we go.
+
+00:47:24.100 --> 00:47:25.640
+Beartype brings Rust and C++
+
+00:47:25.640 --> 00:47:26.780
+inspired zero cost
+
+00:47:26.780 --> 00:47:27.860
+abstractions into the
+
+00:47:27.860 --> 00:47:29.400
+lawless world of dynamically
+
+00:47:29.400 --> 00:47:31.020
+typed Python by enforcing
+
+00:47:31.020 --> 00:47:32.360
+type safety at the granular
+
+00:47:32.360 --> 00:47:33.660
+level of functions and
+
+00:47:33.660 --> 00:47:35.540
+methods against type hints
+
+00:47:35.540 --> 00:47:36.800
+standardized by the Python
+
+00:47:36.800 --> 00:47:37.260
+community in
+
+00:47:37.260 --> 00:47:39.260
+O(1) non-amortized
+
+00:47:39.260 --> 00:47:40.200
+worst case time with
+
+00:47:40.200 --> 00:47:41.420
+negligible constant factors.
+
+00:47:41.520 --> 00:47:42.240
+Like, how about that?
+
+00:47:43.180 --> 00:47:43.940
+No, it's a pretty neat
+
+00:47:43.940 --> 00:47:44.820
+library and it's pretty
+
+00:47:44.820 --> 00:47:45.120
+fast.
+
+00:47:45.240 --> 00:47:46.660
+I honestly, I've never used
+
+00:47:46.660 --> 00:47:47.200
+it in production.
+
+00:47:47.720 --> 00:47:49.680
+Having type hints and
+
+00:47:49.680 --> 00:47:50.840
+squigglies in the editors
+
+00:47:50.840 --> 00:47:53.540
+or in the linters has always
+
+00:47:53.540 --> 00:47:54.880
+been enough for me, but I
+
+00:47:54.880 --> 00:47:56.120
+can see using this if it's
+
+00:47:56.120 --> 00:47:57.520
+really critical and you're
+
+00:47:57.520 --> 00:47:58.460
+having issues, maybe you
+
+00:47:58.460 --> 00:47:59.600
+want to catch some runtime
+
+00:47:59.600 --> 00:48:00.000
+errors.
+
+00:48:00.340 --> 00:48:00.720
+I don't know.
+
+00:48:00.960 --> 00:48:02.000
+It's not quite an endorsement,
+
+00:48:02.120 --> 00:48:03.300
+but it sure is like a, huh,
+
+00:48:03.400 --> 00:48:04.340
+that's different.
+
+00:48:04.340 --> 00:48:06.120
+I definitely think that the
+
+00:48:06.120 --> 00:48:08.240
+extent to which type
+
+00:48:08.240 --> 00:48:10.820
+checkers may have a
+
+00:48:10.820 --> 00:48:11.760
+different understanding of
+
+00:48:11.760 --> 00:48:12.780
+your code from what happens
+
+00:48:12.780 --> 00:48:14.040
+at runtime and there isn't
+
+00:48:14.040 --> 00:48:15.660
+anything built in to catch
+
+00:48:15.660 --> 00:48:17.420
+that is sometimes a pain
+
+00:48:17.420 --> 00:48:17.740
+point.
+
+00:48:18.160 --> 00:48:19.300
+And so the desire to have
+
+00:48:19.300 --> 00:48:20.860
+your type annotations, to
+
+00:48:20.860 --> 00:48:22.360
+find out at runtime if your
+
+00:48:22.360 --> 00:48:23.420
+type annotations are telling
+
+00:48:23.420 --> 00:48:24.880
+you a lie, it makes a lot of
+
+00:48:24.880 --> 00:48:26.560
+sense why people would like
+
+00:48:26.560 --> 00:48:26.860
+that.
+
+00:48:27.140 --> 00:48:27.500
+I mean, it's something people are
+
+00:48:27.500 --> 00:48:28.980
+used to from other languages
+
+00:48:28.980 --> 00:48:30.080
+where the type checker is built
+
+00:48:30.080 --> 00:48:30.760
+into the compiler.
+
+00:48:30.920 --> 00:48:31.100
+Right.
+
+00:48:31.160 --> 00:48:32.480
+You get like a runtime type
+
+00:48:32.480 --> 00:48:33.400
+cast, like cannot.
+
+00:48:33.400 --> 00:48:35.280
+We kind of get that if you
+
+00:48:35.280 --> 00:48:37.480
+try to parse a thing, you
+
+00:48:37.480 --> 00:48:38.860
+know, like put the int
+
+00:48:38.860 --> 00:48:40.280
+parens around a string and
+
+00:48:40.280 --> 00:48:41.840
+it's not really parsable
+
+00:48:41.840 --> 00:48:42.280
+as an int.
+ +00:48:42.460 --> 00:48:43.740 +But for like real type + +00:48:43.740 --> 00:48:44.720 +information, I think + +00:48:44.720 --> 00:48:45.860 +personally I would use this + +00:48:45.860 --> 00:48:48.280 +as like I might apply types, + +00:48:48.700 --> 00:48:50.480 +type checking to a module + +00:48:50.480 --> 00:48:52.780 +for debugging and development + +00:48:52.780 --> 00:48:53.700 +for a minute and just see + +00:48:53.700 --> 00:48:54.500 +what happens and then turn + +00:48:54.500 --> 00:48:55.000 +it back off. + +00:48:55.280 --> 00:48:55.880 +You know, I don't know that + +00:48:55.880 --> 00:48:57.420 +I'd just ship production code + +00:48:57.420 --> 00:48:57.820 +that way. + +00:48:58.140 --> 00:48:59.760 +But anyway, I got a couple + +00:48:59.760 --> 00:49:00.760 +more questions. + +00:49:00.880 --> 00:49:01.680 +We're getting shorter on + +00:49:01.680 --> 00:49:02.280 +time here. + +00:49:02.620 --> 00:49:03.520 +What was one of the + +00:49:03.520 --> 00:49:05.400 +harder questions that you + +00:49:05.400 --> 00:49:07.240 +all, harder decisions you + +00:49:07.240 --> 00:49:09.420 +all had to address on the + +00:49:09.420 --> 00:49:09.700 +council? + +00:49:09.700 --> 00:49:10.440 +I think the most + +00:49:10.440 --> 00:49:11.620 +contentious one was + +00:49:11.620 --> 00:49:14.680 +PEP 724, if I remember + +00:49:14.680 --> 00:49:15.520 +the number correctly. + +00:49:15.940 --> 00:49:17.460 +It was around a feature + +00:49:17.460 --> 00:49:19.020 +called type guards, which + +00:49:19.020 --> 00:49:20.660 +is around user-defined type + +00:49:20.660 --> 00:49:21.540 +narrowing functions. 
+
+00:49:22.040 --> 00:49:23.100
+Initially, it was defined in
+
+00:49:23.100 --> 00:49:24.740
+a way that later was found
+
+00:49:24.740 --> 00:49:25.820
+to be somewhat problematic
+
+00:49:25.820 --> 00:49:27.600
+and we basically came up
+
+00:49:27.600 --> 00:49:28.420
+with a better set of
+
+00:49:28.420 --> 00:49:30.000
+proposed semantics that
+
+00:49:30.000 --> 00:49:30.900
+maybe we should have done
+
+00:49:30.900 --> 00:49:32.980
+the first time around.
+
+00:49:33.520 --> 00:49:35.060
+And what this PEP proposed,
+
+00:49:35.300 --> 00:49:36.080
+and as you can see, I
+
+00:49:36.080 --> 00:49:37.840
+sponsored it, is that we
+
+00:49:37.840 --> 00:49:38.660
+basically changed the
+
+00:49:38.660 --> 00:49:39.420
+meaning of the existing
+
+00:49:39.420 --> 00:49:40.940
+type guards under certain
+
+00:49:40.940 --> 00:49:41.500
+conditions.
+
+00:49:41.800 --> 00:49:42.620
+What is a type guard?
+
+00:49:42.820 --> 00:49:43.780
+A type guard is a function,
+
+00:49:43.940 --> 00:49:44.840
+like there's a good example
+
+00:49:44.840 --> 00:49:45.780
+there, the isiterable.
+
+00:49:46.220 --> 00:49:47.980
+It's a function that tells
+
+00:49:47.980 --> 00:49:49.680
+you how to narrow something.
+
+00:49:50.240 --> 00:49:52.040
+So in this example, there's
+
+00:49:52.040 --> 00:49:53.320
+an isiterable type guard,
+
+00:49:53.480 --> 00:49:55.240
+which narrows an object to
+
+00:49:55.240 --> 00:49:56.280
+an iterable of anything.
+
+00:49:56.680 --> 00:49:58.400
+And then inside the func
+
+00:49:58.400 --> 00:49:59.840
+there, you can see if
+
+00:49:59.840 --> 00:50:01.860
+isiterable file, it knows
+
+00:50:01.860 --> 00:50:03.740
+that it's an iterable.
+
+00:50:04.320 --> 00:50:06.200
+And in this case, yeah, I
+
+00:50:06.200 --> 00:50:07.000
+guess it just narrows
+
+00:50:07.000 --> 00:50:08.200
+exactly to iterable any.
+
+00:50:08.600 --> 00:50:09.520
+That's one of the ways that
+
+00:50:09.520 --> 00:50:10.500
+type guards work.
+
+00:50:10.760 --> 00:50:10.940
+I see.
+
+00:50:11.040 --> 00:50:12.320
+And the type that it returns
+
+00:50:12.320 --> 00:50:14.020
+kind of communicates to the
+
+00:50:14.020 --> 00:50:15.680
+type system, like that this
+
+00:50:15.680 --> 00:50:18.280
+function ensures that this,
+
+00:50:18.560 --> 00:50:19.760
+the thing that came in as an
+
+00:50:19.760 --> 00:50:21.160
+arbitrary object, in fact,
+
+00:50:21.220 --> 00:50:22.040
+is one of these.
+
+00:50:22.380 --> 00:50:22.660
+Okay.
+
+00:50:22.800 --> 00:50:23.180
+Interesting.
+
+00:50:23.600 --> 00:50:23.740
+Yeah.
+
+00:50:23.840 --> 00:50:25.200
+So that was a tricky one,
+
+00:50:25.260 --> 00:50:25.400
+huh?
+
+00:50:25.740 --> 00:50:27.400
+Any other standout, Rebecca or
+
+00:50:27.400 --> 00:50:27.620
+Carl?
+
+00:50:27.620 --> 00:50:28.740
+Well, the current discussion
+
+00:50:28.740 --> 00:50:29.800
+around what is the meaning
+
+00:50:29.800 --> 00:50:32.020
+of a float annotation, still
+
+00:50:32.020 --> 00:50:33.960
+unresolved, contentious topic.
+
+00:50:34.360 --> 00:50:34.760
+Okay.
+
+00:50:34.920 --> 00:50:35.180
+Gotcha.
+
+00:50:35.420 --> 00:50:38.720
+I mean, this one, PEP 724, is
+
+00:50:38.720 --> 00:50:41.180
+also what came to my mind
+
+00:50:41.180 --> 00:50:42.900
+immediately as well, because
+
+00:50:42.900 --> 00:50:44.280
+this was a challenging
+
+00:50:44.280 --> 00:50:45.560
+discussion because, you
+
+00:50:45.560 --> 00:50:47.140
+know, like there were very
+
+00:50:47.140 --> 00:50:48.840
+conflicting considerations at
+
+00:50:48.840 --> 00:50:49.140
+play.
+
+00:50:49.300 --> 00:50:50.660
+It's like, what semantics
+
+00:50:50.660 --> 00:50:52.000
+did we want in the long
+
+00:50:52.000 --> 00:50:52.420
+term?
+ +00:50:52.620 --> 00:50:53.580 +And what did we want the + +00:50:53.580 --> 00:50:55.220 +type system to look like, + +00:50:55.280 --> 00:50:56.280 +you know, say like 10 years + +00:50:56.280 --> 00:50:57.620 +from now versus backwards + +00:50:57.620 --> 00:50:59.680 +compatibility and what the + +00:50:59.680 --> 00:51:00.780 +migration story would look + +00:51:00.780 --> 00:51:01.060 +like? + +00:51:01.060 --> 00:51:02.120 +It was quite tricky. + +00:51:02.300 --> 00:51:03.340 +I guess that's something you + +00:51:03.340 --> 00:51:04.880 +will always have to be + +00:51:04.880 --> 00:51:07.160 +cognizant of is like every + +00:51:07.160 --> 00:51:08.400 +change, even if it's an + +00:51:08.400 --> 00:51:10.400 +improvement, has to justify + +00:51:10.400 --> 00:51:12.600 +the fact that now you have + +00:51:12.600 --> 00:51:14.480 +challenges with the version + +00:51:14.480 --> 00:51:16.440 +history over time. + +00:51:16.440 --> 00:51:19.880 +I'm thinking like dict of string + +00:51:19.880 --> 00:51:21.800 +comma int with a capital or + +00:51:21.800 --> 00:51:22.540 +lowercase d. + +00:51:22.920 --> 00:51:24.640 +I've got people, I did a + +00:51:24.640 --> 00:51:25.500 +YouTube video showing + +00:51:25.500 --> 00:51:26.640 +something with the lowercase + +00:51:26.640 --> 00:51:27.840 +version because I was using + +00:51:27.840 --> 00:51:29.280 +something super modern like + +00:51:29.280 --> 00:51:30.260 +Python 3.11. + +00:51:30.680 --> 00:51:32.260 +And I got a message like, + +00:51:32.340 --> 00:51:33.600 +hey, Michael, you don't know + +00:51:33.600 --> 00:51:34.400 +how to write Python. + +00:51:34.520 --> 00:51:35.400 +Your code is broken. + +00:51:35.860 --> 00:51:37.220 +This code that you wrote just + +00:51:37.220 --> 00:51:38.160 +doesn't even run. + +00:51:38.240 --> 00:51:39.140 +I don't know how this is. + +00:51:39.280 --> 00:51:40.180 +I'm like, what version of + +00:51:40.180 --> 00:51:40.740 +Python is in? 
+ +00:51:40.860 --> 00:51:41.340 +3.8. + +00:51:41.580 --> 00:51:41.820 +Nope. + +00:51:41.880 --> 00:51:42.940 +You can't use 3.8 for that. + +00:51:43.000 --> 00:51:43.620 +You're going to need to get a + +00:51:43.620 --> 00:51:44.040 +newer one. + +00:51:44.140 --> 00:51:44.580 +You know what I mean? + +00:51:44.880 --> 00:51:47.240 +But like those are complexities + +00:51:47.240 --> 00:51:48.540 +that get added to Python + +00:51:48.540 --> 00:51:49.820 +because of that. + +00:51:49.900 --> 00:51:51.000 +Now you've got two ways to + +00:51:51.000 --> 00:51:52.680 +specify what a dict is. + +00:51:52.840 --> 00:51:54.000 +There's a preferred new way, + +00:51:54.080 --> 00:51:55.080 +but there's still the old way + +00:51:55.080 --> 00:51:57.440 +and it just, it sort of piles + +00:51:57.440 --> 00:51:57.780 +up. + +00:51:58.000 --> 00:51:58.880 +And it's very hard to ever + +00:51:58.880 --> 00:51:59.840 +actually get rid of the old + +00:51:59.840 --> 00:52:00.940 +way, even if there's no good + +00:52:00.940 --> 00:52:01.820 +reason to use it anymore. + +00:52:01.960 --> 00:52:02.280 +Exactly. + +00:52:02.440 --> 00:52:04.020 +Once it's there, it's written + +00:52:04.020 --> 00:52:05.540 +in ink pretty much, right? + +00:52:05.580 --> 00:52:06.880 +Like we have five or six + +00:52:06.880 --> 00:52:07.680 +different ways to format + +00:52:07.680 --> 00:52:08.120 +strings. + +00:52:08.220 --> 00:52:09.920 +Maybe with t-strings at six + +00:52:09.920 --> 00:52:10.200 +now. + +00:52:10.560 --> 00:52:11.380 +They're all going to still be + +00:52:11.380 --> 00:52:12.160 +there, right? + +00:52:12.160 --> 00:52:13.520 +So every change, every + +00:52:13.520 --> 00:52:14.560 +decision you make is not + +00:52:14.560 --> 00:52:16.420 +just a matter of, is it the + +00:52:16.420 --> 00:52:17.420 +right decision, right? + +00:52:17.660 --> 00:52:19.580 +It's the, is it worth it? + +00:52:19.720 --> 00:52:20.140 +I'm sure. + +00:52:20.620 --> 00:52:20.840 +Yeah. 
+
+00:52:21.200 --> 00:52:21.460
+I don't know.
+
+00:52:21.460 --> 00:52:22.300
+How do you all balance that?
+
+00:52:22.400 --> 00:52:23.660
+Like that's tricky.
+
+00:52:23.820 --> 00:52:24.540
+With things like the dict
+
+00:52:24.540 --> 00:52:25.860
+changes, at least we sort of
+
+00:52:25.860 --> 00:52:27.060
+know we're moving towards
+
+00:52:27.060 --> 00:52:29.720
+better states and there's two
+
+00:52:29.720 --> 00:52:30.860
+things, but they mean exactly
+
+00:52:30.860 --> 00:52:31.580
+the same thing.
+
+00:52:31.720 --> 00:52:33.700
+So the confusion is not as
+
+00:52:33.700 --> 00:52:34.040
+bad.
+
+00:52:34.460 --> 00:52:35.600
+The problem with type guards
+
+00:52:35.600 --> 00:52:36.700
+is that we're going to change
+
+00:52:36.700 --> 00:52:38.200
+how some existing thing
+
+00:52:38.200 --> 00:52:39.720
+works, like what it meant.
+
+00:52:39.720 --> 00:52:41.360
+And I think there are good
+
+00:52:41.360 --> 00:52:42.260
+reasons that maybe that's the
+
+00:52:42.260 --> 00:52:43.640
+right thing to do, but the,
+
+00:52:44.120 --> 00:52:45.340
+it would also have been pretty
+
+00:52:45.340 --> 00:52:46.400
+confusing for people if their
+
+00:52:46.400 --> 00:52:47.820
+existing types suddenly started
+
+00:52:47.820 --> 00:52:48.640
+meaning something completely
+
+00:52:48.640 --> 00:52:48.980
+different.
+
+00:52:49.320 --> 00:52:49.680
+Absolutely.
+
+00:52:50.180 --> 00:52:50.760
+Hence float.
+
+00:52:51.120 --> 00:52:51.460
+Okay.
+
+00:52:51.940 --> 00:52:53.000
+What's coming next?
+
+00:52:53.120 --> 00:52:55.920
+Like 3.15, 3.16, do you all
+
+00:52:55.920 --> 00:52:57.800
+have things that are in the works
+
+00:52:57.800 --> 00:52:58.880
+that you think are going to come
+
+00:52:58.880 --> 00:53:00.980
+or debates that are brewing?
+
+00:53:01.180 --> 00:53:03.320
+For 3.15, there's a TypedDict
+
+00:53:03.320 --> 00:53:04.940
+feature coming, extra
+
+00:53:04.940 --> 00:53:05.400
+items.
+
+00:53:05.400 --> 00:53:07.040
+you can already use it in
+
+00:53:07.040 --> 00:53:08.740
+typing extensions if you want
+
+00:53:08.740 --> 00:53:10.560
+to use it, but it will be in
+
+00:53:10.560 --> 00:53:11.940
+CPython as of 3.15.
+
+00:53:12.100 --> 00:53:13.720
+It's likely we'll have a small
+
+00:53:13.720 --> 00:53:14.980
+thing I added called disjoint
+
+00:53:14.980 --> 00:53:16.900
+bases, which is very technical,
+
+00:53:17.140 --> 00:53:18.620
+but helps type narrowing in some
+
+00:53:18.620 --> 00:53:19.040
+cases.
+
+00:53:19.440 --> 00:53:19.540
+Yeah.
+
+00:53:19.540 --> 00:53:20.240
+I think those are the things
+
+00:53:20.240 --> 00:53:21.960
+that are likely to make it.
+
+00:53:22.440 --> 00:53:23.800
+There's, we can only speculate
+
+00:53:23.800 --> 00:53:24.900
+about what else people can
+
+00:53:24.900 --> 00:53:25.280
+propose.
+
+00:53:25.380 --> 00:53:26.380
+We're sort of bound by what
+
+00:53:26.380 --> 00:53:27.440
+people actually write up as
+
+00:53:27.440 --> 00:53:27.760
+PEPs.
+
+00:53:28.060 --> 00:53:28.860
+We have to wait for people to
+
+00:53:28.860 --> 00:53:29.760
+write the PEPs before we can
+
+00:53:29.760 --> 00:53:30.200
+approve them.
+
+00:53:30.200 --> 00:53:32.240
+I think there's PEP 747 for
+
+00:53:32.240 --> 00:53:34.100
+TypeForm, which I think is
+
+00:53:34.100 --> 00:53:35.820
+not, I think we recommended
+
+00:53:35.820 --> 00:53:37.260
+its acceptance, but I don't
+
+00:53:37.260 --> 00:53:38.140
+think the steering council
+
+00:53:38.140 --> 00:53:39.340
+accepted it yet or it hasn't
+
+00:53:39.340 --> 00:53:40.240
+been accepted formally.
+
+00:53:40.400 --> 00:53:41.040
+I think that's on their
+
+00:53:41.040 --> 00:53:41.500
+plate.
+
+00:53:41.660 --> 00:53:41.820
+Yeah.
+
+00:53:42.180 --> 00:53:42.380
+Yeah.
+
+00:53:42.400 --> 00:53:43.740
+So that's also pretty likely
+
+00:53:43.740 --> 00:53:45.180
+to make it into 3.15.
+
+00:53:45.320 --> 00:53:46.720
+This is one example of a
+
+00:53:46.720 --> 00:53:47.780
+case that will be pretty
+
+00:53:47.780 --> 00:53:49.180
+useful to people working
+
+00:53:49.180 --> 00:53:50.880
+with type annotations at
+
+00:53:50.880 --> 00:53:52.300
+runtime because it'll allow
+
+00:53:52.300 --> 00:53:53.940
+you to, it's sort of a meta
+
+00:53:53.940 --> 00:53:55.960
+thing where you can annotate,
+
+00:53:56.760 --> 00:53:57.900
+have a type annotation that
+
+00:53:57.900 --> 00:53:58.960
+describes another type
+
+00:53:58.960 --> 00:53:59.400
+annotation.
+
+00:53:59.940 --> 00:54:00.940
+So that's useful if you're,
+
+00:54:01.000 --> 00:54:02.040
+if you're writing code that
+
+00:54:02.040 --> 00:54:03.240
+works with type annotations.
+
+00:54:03.240 --> 00:54:04.620
+Make the Pydantics of the
+
+00:54:04.620 --> 00:54:05.400
+world very happy.
+
+00:54:05.540 --> 00:54:07.620
+I am actually pretty excited
+
+00:54:07.620 --> 00:54:09.780
+about TypeForm because,
+
+00:54:10.060 --> 00:54:10.760
+you know, I feel like there's
+
+00:54:10.760 --> 00:54:12.100
+a gap in what we can express in
+
+00:54:12.100 --> 00:54:13.460
+the type system and we're
+
+00:54:13.460 --> 00:54:14.380
+good.
+
+00:54:14.680 --> 00:54:15.760
+And there are cases in the
+
+00:54:15.760 --> 00:54:17.780
+existing type system, like
+
+00:54:17.780 --> 00:54:18.740
+for instance, the cast
+
+00:54:18.740 --> 00:54:20.360
+function and some other
+
+00:54:20.360 --> 00:54:22.320
+cases where something takes
+
+00:54:22.320 --> 00:54:24.120
+any type expression as an
+
+00:54:24.120 --> 00:54:24.400
+argument.
+
+00:54:24.520 --> 00:54:25.220
+We actually don't have a good
+
+00:54:25.220 --> 00:54:27.000
+way to annotate that today
+
+00:54:27.000 --> 00:54:28.040
+and this will provide a nice
+
+00:54:28.040 --> 00:54:28.880
+way to express that.
+
+00:54:28.960 --> 00:54:30.000
+Let me pull up one thing
+
+00:54:30.000 --> 00:54:30.680
+really quick.
+
+00:54:31.020 --> 00:54:31.840
+Quick shout out to Will
+
+00:54:31.840 --> 00:54:32.500
+McGugan here.
+
+00:54:32.500 --> 00:54:33.800
+He just released his
+
+00:54:33.800 --> 00:54:35.760
+Toad project, which is the
+
+00:54:35.760 --> 00:54:38.400
+new, takes Textual and Rich
+
+00:54:38.400 --> 00:54:39.580
+and all that kind of stuff
+
+00:54:39.580 --> 00:54:40.980
+and applies it to like, what
+
+00:54:40.980 --> 00:54:42.300
+if we had a better Claude Code
+
+00:54:42.300 --> 00:54:43.460
+type of experience, which is
+
+00:54:43.460 --> 00:54:43.980
+pretty interesting.
+
+00:54:44.560 --> 00:54:45.440
+So the reason I'm bringing
+
+00:54:45.440 --> 00:54:47.200
+this up is, you know, final
+
+00:54:47.200 --> 00:54:47.620
+question.
+
+00:54:47.840 --> 00:54:49.560
+What about, do you all even
+
+00:54:49.560 --> 00:54:52.680
+worry about the role of like
+
+00:54:52.680 --> 00:54:54.480
+how types interact with AI
+
+00:54:54.480 --> 00:54:56.120
+and agentic coding tools?
+
+00:54:56.120 --> 00:54:58.980
+I know that if you have some
+
+00:54:58.980 --> 00:55:00.660
+code that has types on it
+
+00:55:00.660 --> 00:55:02.300
+and you give it to an AI, it's
+
+00:55:02.300 --> 00:55:03.340
+got a better chance of
+
+00:55:03.340 --> 00:55:04.280
+understanding what's happening
+
+00:55:04.280 --> 00:55:05.880
+than if you give it purely
+
+00:55:05.880 --> 00:55:07.560
+untyped code and say, tell me
+
+00:55:07.560 --> 00:55:08.300
+about this, right?
+
+00:55:08.340 --> 00:55:09.540
+It doesn't even know necessarily
+
+00:55:09.540 --> 00:55:10.580
+what's being passed to it.
+
+00:55:10.720 --> 00:55:12.160
+But is that anything you all
+
+00:55:12.160 --> 00:55:13.320
+think about or what are your
+
+00:55:13.320 --> 00:55:14.500
+thoughts on this?
+
+00:55:14.760 --> 00:55:15.660
+Certainly think about it some.
+
+00:55:15.860 --> 00:55:17.460
+I mean, I think overall my
+
+00:55:17.460 --> 00:55:19.500
+feeling is that these coding
+
+00:55:19.500 --> 00:55:21.200
+agents seem to do better the
+
+00:55:21.200 --> 00:55:23.700
+more kind of the tighter
+
+00:55:23.700 --> 00:55:25.120
+feedback loops you can give them
+
+00:55:25.120 --> 00:55:25.760
+to work with.
+
+00:55:26.020 --> 00:55:27.080
+And so typing is another
+
+00:55:27.080 --> 00:55:28.400
+useful source of feedback
+
+00:55:28.400 --> 00:55:29.840
+where you can say, add type
+
+00:55:29.840 --> 00:55:30.880
+annotations and make sure the
+
+00:55:30.880 --> 00:55:32.920
+type checker passes, and
+
+00:55:32.920 --> 00:55:34.600
+so it still seems pretty useful
+
+00:55:34.600 --> 00:55:35.140
+in that world.
+
+00:55:35.240 --> 00:55:36.340
+Yeah, you can easily write
+
+00:55:36.340 --> 00:55:38.120
+rules that say when you are
+
+00:55:38.120 --> 00:55:39.540
+done on anything I've asked
+
+00:55:39.540 --> 00:55:42.340
+you to do, always run ty or
+
+00:55:42.340 --> 00:55:43.760
+always run Pyrefly and make
+
+00:55:43.760 --> 00:55:45.180
+sure that there's no more, no
+
+00:55:45.180 --> 00:55:46.720
+new errors or at least or
+
+00:55:46.720 --> 00:55:48.240
+ideally zero errors, right?
+
+00:55:48.320 --> 00:55:49.340
+But nothing has been
+
+00:55:49.340 --> 00:55:49.840
+introduced.
+
+00:55:50.220 --> 00:55:50.800
+Yeah, pretty interesting.
+
+00:55:51.360 --> 00:55:52.820
+You other folks, Rebecca,
+
+00:55:53.260 --> 00:55:53.440
+Jelle?
+
+00:55:53.700 --> 00:55:54.560
+Yeah, I guess in general, I
+
+00:55:54.560 --> 00:55:55.460
+think typing will remain
+
+00:55:55.460 --> 00:55:57.040
+useful for AI.
+
+00:55:57.400 --> 00:55:58.180
+We are probably rapidly
+
+00:55:58.180 --> 00:55:59.780
+moving to a world where a
+
+00:55:59.780 --> 00:56:00.960
+large proportion of all code
+
+00:56:00.960 --> 00:56:01.940
+is written by AI.
+
+00:56:02.120 --> 00:56:03.020
+Not everybody likes that
+
+00:56:03.020 --> 00:56:03.620
+opinion, Jelle.
+
+00:56:03.640 --> 00:56:04.440
+Not everybody likes that.
+
+00:56:04.660 --> 00:56:06.760
+I guess I, maybe my current
+
+00:56:06.760 --> 00:56:07.780
+line of work makes me think
+
+00:56:07.780 --> 00:56:08.820
+that's more likely to happen.
+
+00:56:08.940 --> 00:56:09.900
+You don't have to like the
+
+00:56:09.900 --> 00:56:10.800
+fact it's going to be night
+
+00:56:10.800 --> 00:56:11.840
+soon, but it's going to be
+
+00:56:11.840 --> 00:56:11.980
+night.
+
+00:56:12.040 --> 00:56:12.380
+You know what I mean?
+
+00:56:12.420 --> 00:56:14.080
+Like there's the, I just
+
+00:56:14.080 --> 00:56:14.900
+think there's so much
+
+00:56:14.900 --> 00:56:16.220
+momentum on this, at least in
+
+00:56:16.220 --> 00:56:17.460
+the next five years or
+
+00:56:17.460 --> 00:56:18.260
+something, that it's going to
+
+00:56:18.260 --> 00:56:19.400
+be really, it's, it's a
+
+00:56:19.400 --> 00:56:21.320
+truth of how many people are
+
+00:56:21.320 --> 00:56:22.360
+writing code regardless of
+
+00:56:22.360 --> 00:56:23.920
+whether individuals want to
+
+00:56:23.920 --> 00:56:24.640
+write code that way.
+
+00:56:24.840 --> 00:56:25.180
+You know what I mean?
+
+00:56:25.200 --> 00:56:25.780
+So I think it's a
+
+00:56:25.780 --> 00:56:26.200
+consideration.
+
+00:56:26.380 --> 00:56:26.460
+Yeah.
+
+00:56:26.460 --> 00:56:26.640
+Yeah.
+
+00:56:26.700 --> 00:56:27.740
+I forgot that you worked
+
+00:56:27.740 --> 00:56:28.160
+at OpenAI.
+
+00:56:28.300 --> 00:56:30.920
+So of course, I should pull
+
+00:56:30.920 --> 00:56:32.060
+up a Codex example or
+
+00:56:32.060 --> 00:56:32.840
+something, shouldn't I?
+
+00:56:32.960 --> 00:56:33.100
+Yeah.
+
+00:56:33.180 --> 00:56:33.820
+Codex is great.
+
+00:56:33.900 --> 00:56:34.220
+Use it.
+
+00:56:34.360 --> 00:56:35.160
+No, but I mean, do you
+
+00:56:35.160 --> 00:56:36.780
+have any further insight into
+
+00:56:36.780 --> 00:56:38.840
+like the role of types
+
+00:56:38.840 --> 00:56:39.660
+and coding agents?
+
+00:56:40.040 --> 00:56:40.900
+I know that's not exactly
+
+00:56:40.900 --> 00:56:41.700
+what you work on, right?
+
+00:56:41.740 --> 00:56:42.600
+You're more at the lower level.
+
+00:56:42.740 --> 00:56:43.900
+As Carl said, types can
+
+00:56:43.900 --> 00:56:45.720
+also be helpful for AI to
+
+00:56:45.720 --> 00:56:47.060
+understand code better and to
+
+00:56:47.060 --> 00:56:48.020
+get a better feedback loop.
+
+00:56:48.320 --> 00:56:49.700
+I feel like the very big AI,
+
+00:56:49.700 --> 00:56:50.740
+the bar is like humans.
+
+00:56:51.060 --> 00:56:53.140
+And if AI makes, sorry, if
+
+00:56:53.140 --> 00:56:54.800
+typing makes humans better
+
+00:56:54.800 --> 00:56:55.940
+at writing and understanding
+
+00:56:55.940 --> 00:56:56.780
+this code, it probably
+
+00:56:56.780 --> 00:56:58.560
+also makes AI better at it.
+
+00:56:58.620 --> 00:56:59.540
+It's the locality of
+
+00:56:59.540 --> 00:56:59.940
+information.
+
+00:57:00.080 --> 00:57:01.120
+You can read the function
+
+00:57:01.120 --> 00:57:03.100
+and know everything you need
+
+00:57:03.100 --> 00:57:03.900
+to know about what's going
+
+00:57:03.900 --> 00:57:05.660
+into it without bouncing
+
+00:57:05.660 --> 00:57:06.720
+around and trying to
+
+00:57:06.720 --> 00:57:07.780
+understand blocks of code
+
+00:57:07.780 --> 00:57:08.660
+and like what might've been
+
+00:57:08.660 --> 00:57:09.320
+created that's getting
+
+00:57:09.320 --> 00:57:09.580
+impacted.
+
+00:57:09.740 --> 00:57:10.980
+It's good for humans and
+
+00:57:10.980 --> 00:57:11.980
+also good for AI.
+
+00:57:12.160 --> 00:57:12.300
+Right.
+
+00:57:12.600 --> 00:57:12.880
+Rebecca.
+ +00:57:12.980 --> 00:57:13.700 +Because I don't have much + +00:57:14.240 --> 00:57:15.920 +need to, and I'll say I am + +00:57:15.920 --> 00:57:18.320 +maybe a little more skeptical + +00:57:18.320 --> 00:57:20.680 +than most of my coworkers + +00:57:20.680 --> 00:57:22.040 +about the quality of AI + +00:57:22.040 --> 00:57:23.360 +generated code. + +00:57:23.500 --> 00:57:25.500 +But that means I think I am + +00:57:25.500 --> 00:57:27.700 +particularly gung-ho about, + +00:57:27.800 --> 00:57:29.860 +you know, like get AI to use + +00:57:29.860 --> 00:57:32.200 +types, type checkers, keep + +00:57:32.200 --> 00:57:33.400 +the guardrails there. + +00:57:33.600 --> 00:57:34.300 +I think that'll be very + +00:57:34.300 --> 00:57:34.540 +important. + +00:57:34.540 --> 00:57:35.160 +Yeah, if it's going to make + +00:57:35.160 --> 00:57:36.440 +a mistake, don't let it at + +00:57:36.440 --> 00:57:38.180 +least like make the type + +00:57:38.180 --> 00:57:39.720 +system become disconnected + +00:57:39.720 --> 00:57:40.820 +and not working. + +00:57:40.980 --> 00:57:42.600 +Like it has to keep the types + +00:57:42.600 --> 00:57:43.560 +hanging together as a + +00:57:43.560 --> 00:57:44.360 +minimum bar, right? + +00:57:44.360 --> 00:57:45.760 +And you can easily set that + +00:57:45.760 --> 00:57:46.500 +up as an automation. + +00:57:46.980 --> 00:57:47.140 +Yeah. + +00:57:47.220 --> 00:57:48.220 +Interesting to think of it as + +00:57:48.220 --> 00:57:49.740 +guardrails rather than an + +00:57:49.740 --> 00:57:50.280 +accelerant. + +00:57:50.520 --> 00:57:51.860 +But yeah, 100% it is. + +00:57:52.140 --> 00:57:52.760 +All right, folks. + +00:57:52.900 --> 00:57:54.120 +I think that's it for all + +00:57:54.120 --> 00:57:55.240 +the time that we have. + +00:57:55.640 --> 00:57:55.960 +Thank you. + +00:57:56.000 --> 00:57:56.800 +Thank you for being here. + +00:57:57.240 --> 00:57:59.040 +Final thoughts before we go. 
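The "guardrails you can easily set up as an automation" idea — fail the pipeline whenever the types stop hanging together — might look like the sketch below. This is a hedged illustration, not anything from the episode: mypy is an assumed choice of checker, and both helper names are made up.

```python
import subprocess
import sys


def run_type_gate(paths):
    """Run a type checker over the given paths; a non-zero exit code fails the gate.

    mypy is only an assumed choice here -- any checker that reports
    findings via its exit code can slot in the same way.
    """
    proc = subprocess.run(
        [sys.executable, "-m", "mypy", *paths],
        capture_output=True,
        text=True,
    )
    return proc.returncode, proc.stdout


def count_errors(report: str) -> int:
    """Count findings in a mypy-style report (one ' error: ' line each)."""
    return sum(1 for line in report.splitlines() if " error: " in line)
```

Wired into CI or a pre-commit hook, this is the minimum bar described above: an AI agent can still make design mistakes, but it cannot land code whose types no longer check.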
+
+00:57:59.340 --> 00:58:00.280
+Carl, I'll let you go first.
+
+00:58:00.600 --> 00:58:01.260
+Final thoughts for people
+
+00:58:01.260 --> 00:58:02.020
+out there interested in
+
+00:58:02.020 --> 00:58:02.520
+Python typing.
+
+00:58:02.740 --> 00:58:02.900
+Yeah.
+
+00:58:02.960 --> 00:58:03.740
+Well, first of all, thanks
+
+00:58:03.740 --> 00:58:05.540
+for having us on the podcast.
+
+00:58:05.640 --> 00:58:06.520
+Really appreciate it.
+
+00:58:06.860 --> 00:58:07.840
+And thoughts for people
+
+00:58:07.840 --> 00:58:08.340
+out there.
+
+00:58:08.680 --> 00:58:10.660
+I guess if you have ideas
+
+00:58:10.660 --> 00:58:12.400
+of how Python typing could
+
+00:58:12.400 --> 00:58:14.340
+be improved, discuss.python.org
+
+00:58:14.360 --> 00:58:15.300
+is a good
+
+00:58:15.300 --> 00:58:16.760
+place to bring up ideas
+
+00:58:16.760 --> 00:58:18.180
+and discuss them with the
+
+00:58:18.180 --> 00:58:19.900
+typing community and see
+
+00:58:19.900 --> 00:58:21.160
+what positive changes we
+
+00:58:21.160 --> 00:58:21.420
+can make.
+
+00:58:22.080 --> 00:58:22.400
+Rebecca.
+
+00:58:22.580 --> 00:58:24.040
+First, thank you, Michael.
+
+00:58:24.120 --> 00:58:25.760
+This is a lot of fun.
+
+00:58:26.760 --> 00:58:27.680
+Last thoughts?
+
+00:58:28.240 --> 00:58:30.180
+Hey, so, you know, like
+
+00:58:30.180 --> 00:58:31.260
+we'll look at the typing
+
+00:58:31.260 --> 00:58:32.500
+council and sometimes think,
+
+00:58:32.780 --> 00:58:33.480
+oh, you know, like the
+
+00:58:33.480 --> 00:58:34.280
+PEP has like governance
+
+00:58:34.280 --> 00:58:35.640
+in its name, but I
+
+00:58:35.640 --> 00:58:37.200
+wouldn't say we're really
+
+00:58:37.200 --> 00:58:39.300
+a governing body or
+
+00:58:39.300 --> 00:58:39.720
+anything.
+
+00:58:39.720 --> 00:58:42.120
+It's like people who are
+
+00:58:42.120 --> 00:58:43.300
+using the type system,
+
+00:58:43.480 --> 00:58:44.780
+like users, they're the
+
+00:58:44.780 --> 00:58:45.520
+ones who come up with,
+
+00:58:45.580 --> 00:58:46.140
+you know, like all the
+
+00:58:46.140 --> 00:58:47.840
+best ideas, propose them,
+
+00:58:47.940 --> 00:58:48.640
+discuss them.
+
+00:58:48.640 --> 00:58:50.500
+And we're just here to
+
+00:58:50.500 --> 00:58:52.940
+sort of be like, hey,
+
+00:58:53.040 --> 00:58:53.840
+you know, like we have
+
+00:58:53.840 --> 00:58:55.140
+some background and like
+
+00:58:55.140 --> 00:58:56.560
+how type checkers work and
+
+00:58:56.560 --> 00:58:57.420
+maybe some of the history
+
+00:58:57.420 --> 00:58:58.940
+and we can provide input.
+
+00:58:58.940 --> 00:59:00.040
+But I just encourage
+
+00:59:00.040 --> 00:59:00.740
+people, if there's a
+
+00:59:00.740 --> 00:59:02.120
+change you want to see in
+
+00:59:02.120 --> 00:59:03.740
+the type system, you know,
+
+00:59:03.740 --> 00:59:05.400
+like propose it yourself.
+
+00:59:05.400 --> 00:59:06.980
+It's a very friendly and
+
+00:59:06.980 --> 00:59:07.680
+open community.
+
+00:59:07.880 --> 00:59:07.940
+Yeah.
+
+00:59:08.020 --> 00:59:09.080
+Now people who have
+
+00:59:09.080 --> 00:59:10.560
+listened know a little bit
+
+00:59:10.560 --> 00:59:11.460
+more about how to do so.
+
+00:59:11.680 --> 00:59:11.920
+Awesome.
+
+00:59:12.100 --> 00:59:12.320
+Thanks.
+
+00:59:12.640 --> 00:59:13.440
+Jelle, final word.
+
+00:59:13.620 --> 00:59:13.760
+Yeah.
+
+00:59:13.840 --> 00:59:14.920
+Also, again, thank you for
+
+00:59:14.920 --> 00:59:15.600
+having me here.
+
+00:59:16.000 --> 00:59:17.040
+It's been great talking to
+
+00:59:17.040 --> 00:59:17.480
+all of you.
+
+00:59:17.700 --> 00:59:18.360
+I guess what I want to say
+
+00:59:18.360 --> 00:59:18.940
+is similar to what
+
+00:59:18.940 --> 00:59:19.920
+Carl and Rebecca just said.
+
+00:59:20.300 --> 00:59:21.220
+If you want to have
+
+00:59:21.220 --> 00:59:22.220
+something changed in the
+
+00:59:22.220 --> 00:59:23.140
+type system, I'd really
+
+00:59:23.140 --> 00:59:24.540
+encourage you to sign up
+
+00:59:24.540 --> 00:59:25.680
+for discuss.python.org,
+
+00:59:25.800 --> 00:59:26.920
+make a proposal, go
+
+00:59:26.920 --> 00:59:27.680
+through the process.
+
+00:59:27.920 --> 00:59:28.620
+It can be somewhat
+
+00:59:28.620 --> 00:59:29.500
+daunting, perhaps,
+
+00:59:29.620 --> 00:59:30.700
+especially if you have to
+
+00:59:30.700 --> 00:59:32.160
+create a PEP, but it is
+
+00:59:32.160 --> 00:59:32.500
+doable.
+
+00:59:32.960 --> 00:59:34.060
+Several recent
+
+00:59:34.060 --> 00:59:34.980
+typing PEPs have just
+
+00:59:34.980 --> 00:59:36.280
+been community members
+
+00:59:36.280 --> 00:59:37.880
+who saw something they
+
+00:59:37.880 --> 00:59:38.580
+wanted to improve,
+
+00:59:38.720 --> 00:59:40.200
+proposed a PEP, and saw
+
+00:59:40.200 --> 00:59:40.800
+it to completion.
+
+00:59:41.180 --> 00:59:41.900
+If there's something you
+
+00:59:41.900 --> 00:59:42.480
+want to see in the type
+
+00:59:42.480 --> 00:59:44.240
+system, then you can do
+
+00:59:44.240 --> 00:59:44.480
+it too.
+
+00:59:44.580 --> 00:59:46.180
+Thank you all for keeping
+
+00:59:46.180 --> 00:59:47.120
+Python typing going
+
+00:59:47.120 --> 00:59:47.600
+strong.
+
+00:59:48.040 --> 00:59:48.880
+Really appreciate your
+
+00:59:48.880 --> 00:59:49.560
+time on the show.
+
+00:59:49.980 --> 00:59:50.540
+See you all later.
+
+00:59:50.640 --> 00:59:50.940
+Bye.
+
+00:59:51.080 --> 00:59:51.260
+Bye.
+
+00:59:51.260 --> 00:59:54.180
+This has been another episode
+
+00:59:54.180 --> 00:59:55.140
+of Talk Python To Me.
+
+00:59:55.280 --> 00:59:56.260
+Thank you to our sponsors.
+
+00:59:56.440 --> 00:59:57.280
+Be sure to check out what
+
+00:59:57.280 --> 00:59:57.720
+they're offering.
+ +00:59:57.920 --> 00:59:59.100 +It really helps support the + +00:59:59.100 --> 00:59:59.280 +show. + +00:59:59.980 --> 01:00:01.000 +Take some stress out of + +01:00:01.000 --> 01:00:01.420 +your life. + +01:00:01.760 --> 01:00:02.920 +Get notified immediately + +01:00:02.920 --> 01:00:04.600 +about errors and performance + +01:00:04.600 --> 01:00:06.140 +issues in your web or mobile + +01:00:06.140 --> 01:00:07.200 +applications with Sentry. + +01:00:07.680 --> 01:00:09.680 +Just visit talkpython.fm + +01:00:09.680 --> 01:00:11.420 +slash Sentry and get + +01:00:11.420 --> 01:00:12.180 +started for free. + +01:00:12.660 --> 01:00:13.900 +Be sure to use our code + +01:00:13.900 --> 01:00:15.140 +talkpython26. + +01:00:15.740 --> 01:00:17.360 +That's talkpython, the + +01:00:17.360 --> 01:00:19.220 +numbers two, six, all one + +01:00:19.220 --> 01:00:19.440 +word. + +01:00:19.820 --> 01:00:20.960 +And it's brought to you by + +01:00:20.960 --> 01:00:22.900 +our Agentic AI programming + +01:00:22.900 --> 01:00:24.000 +for Python course. + +01:00:24.420 --> 01:00:25.600 +Learn to work with AI that + +01:00:25.600 --> 01:00:26.920 +actually understands your + +01:00:26.920 --> 01:00:28.500 +code base and build real + +01:00:28.500 --> 01:00:28.980 +features. + +01:00:29.480 --> 01:00:30.940 +Visit talkpython.fm + +01:00:30.940 --> 01:00:33.000 +slash agentic dash AI. + +01:00:33.560 --> 01:00:35.220 +If you or your team needs + +01:00:35.220 --> 01:00:36.240 +to learn Python, we have + +01:00:36.240 --> 01:00:38.220 +over 270 hours of beginner + +01:00:38.220 --> 01:00:39.740 +and advanced courses on + +01:00:39.740 --> 01:00:40.820 +topics ranging from + +01:00:40.820 --> 01:00:42.280 +complete beginners to + +01:00:42.280 --> 01:00:44.020 +async code, Flask, Django, + +01:00:44.220 --> 01:00:46.020 +HTMX, and even LLMs. + +01:00:46.260 --> 01:00:47.340 +Best of all, there's no + +01:00:47.340 --> 01:00:48.680 +subscription in sight. 
+
+01:00:48.680 --> 01:00:49.920
+Browse the catalog at
+
+01:00:49.920 --> 01:00:50.880
+talkpython.fm.
+
+01:00:51.520 --> 01:00:52.360
+And if you're not already
+
+01:00:52.360 --> 01:00:53.820
+subscribed to the show on
+
+01:00:53.820 --> 01:00:54.660
+your favorite podcast
+
+01:00:54.660 --> 01:00:55.800
+player, what are you
+
+01:00:55.800 --> 01:00:56.200
+waiting for?
+
+01:00:56.660 --> 01:00:57.960
+Just search for Python in
+
+01:00:57.960 --> 01:00:58.680
+your podcast player.
+
+01:00:58.780 --> 01:00:59.420
+We should be right at the
+
+01:00:59.420 --> 01:00:59.660
+top.
+
+01:00:59.980 --> 01:01:01.080
+If you enjoy that geeky
+
+01:01:01.080 --> 01:01:01.960
+rap song, you can
+
+01:01:01.960 --> 01:01:02.960
+download the full track.
+
+01:01:03.060 --> 01:01:03.860
+The link is actually in
+
+01:01:03.860 --> 01:01:04.760
+your podcast player show
+
+01:01:04.760 --> 01:01:04.980
+notes.
+
+01:01:05.540 --> 01:01:06.800
+This is your host, Michael
+
+01:01:06.800 --> 01:01:07.100
+Kennedy.
+
+01:01:07.300 --> 01:01:08.320
+Thank you so much for
+
+01:01:08.320 --> 01:01:08.600
+listening.
+
+01:01:08.780 --> 01:01:09.560
+I really appreciate it.
+
+01:01:09.960 --> 01:01:10.720
+I'll see you next time.
+
+01:01:18.680 --> 01:01:19.680
+Voyager.
+
+01:01:19.920 --> 01:01:21.980
+Voyager.
+
+01:01:21.980 --> 01:01:24.760
+And we ready to roll
+
+01:01:24.760 --> 01:01:27.540
+Upgrading the code
+
+01:01:27.540 --> 01:01:29.960
+No fear of getting whole
+
+01:01:29.960 --> 01:01:33.560
+We tapped into that modern vibe
+
+01:01:33.560 --> 01:01:34.940
+Overcame each storm
+
+01:01:34.940 --> 01:01:36.940
+Talk Python To Me
+
+01:01:36.940 --> 01:01:38.400
+Async is the norm

From e6c6a6924603cfb0dbdc2697e043307774015ccf Mon Sep 17 00:00:00 2001
From: Michael Kennedy
Date: Wed, 18 Mar 2026 09:21:32 -0700
Subject: [PATCH 06/16] Transcripts

---
 ...n-monorepo-with-uv-and-prek-transcript.txt | 1734 ++++++++++
 ...n-monorepo-with-uv-and-prek-transcript.vtt | 2722 +++++++++++++++
 ...tic-site-generator-transcript-original.vtt | 2959 +++++++++++++++++
 3 files changed, 7415 insertions(+)
 create mode 100644 transcripts/540-modern-python-monorepo-with-uv-and-prek-transcript.txt
 create mode 100644 transcripts/540-modern-python-monorepo-with-uv-and-prek-transcript.vtt
 create mode 100644 youtube_transcripts/541-zensical-a-modern-static-site-generator-transcript-original.vtt

diff --git a/transcripts/540-modern-python-monorepo-with-uv-and-prek-transcript.txt b/transcripts/540-modern-python-monorepo-with-uv-and-prek-transcript.txt
new file mode 100644
index 0000000..bcfce13
--- /dev/null
+++ b/transcripts/540-modern-python-monorepo-with-uv-and-prek-transcript.txt
@@ -0,0 +1,1734 @@
+00:00:00 Monorepos. You've heard the talks, you've read the blog posts, maybe you've seen a few glimpses into how Google or Meta organize their massive code bases, but it's often in the abstract and behind closed doors.
+
+00:00:11 What if you could crack open a real production monorepo, one with over a million lines of Python code and over a hundred sub-packages, and actually see what's being built step-by-step using modern tools and standards?
+
+00:00:24 Well, that's exactly what Apache Airflow gives us. On this episode, I sit down with Jarek Potiuk and Amogh Desai, two of Airflow's top contributors, to go inside one of the largest open-source Python monorepos in the world and learn how they manage it with uv, pyproject.toml, and the latest packaging standards, so you can apply the same patterns to your own projects.
+
+00:00:47 This is Talk Python To Me, episode 540, recorded February 10th, 2026.
+
+00:00:52 Welcome to Talk Python To Me, the number one Python podcast for developers and data scientists.
+
+00:01:14 This is your host, Michael Kennedy.
I'm a PSF fellow who's been coding for over 25 years. Let's connect on social media.
+
+00:01:21 You'll find me and Talk Python on Mastodon, Bluesky, and X. The social links are all in your show notes.
+
+00:01:28 You can find over 10 years of past episodes at talkpython.fm, and if you want to be part of the show, you can join our recording live streams.
+
+00:01:35 That's right, we live stream the raw, uncut version of each episode on YouTube.
+
+00:01:40 Just visit talkpython.fm/youtube to see the schedule of upcoming events.
+
+00:01:44 Be sure to subscribe there and press the bell so you'll get notified anytime we're recording.
+
+00:01:48 This episode is brought to you by our Agentic AI Programming for Python course.
+
+00:01:53 Learn to work with AI that actually understands your code base and build real features.
+
+00:01:58 Visit talkpython.fm/Agentic-AI.
+
+00:02:02 Hello, hello, Jarek, Amogh. Welcome to Talk Python To Me.
+
+00:02:06 Awesome to have Amogh, you here, and Jarek, you back.
+
+00:02:09 Very nice to be again at Talk Python To Me. It's one of my favorite podcasts I listen to all the time.
+
+00:02:15 Thank you, thank you.
+
+00:02:15 It's my first, but yeah, thanks for having me, Mike.
+
+00:02:18 Happy to have you here.
+
+00:02:19 You and a team of people, given the scale of this project, have built an amazing, amazing product with Apache Airflow.
+
+00:02:26 It's going to be really fun to dive into it, and specifically, we're going to focus on not building workflows exactly,
+
+00:02:33 although I'm sure we'll talk about that somewhat.
+
+00:02:35 The real goal, the thing that we're going to focus on, is how do you manage such a big project
+
+00:02:40 with so many different internal packages that all depend upon each other and so on,
+
+00:02:45 and monorepos, and that.
+
+00:02:47 I've touched on monorepos before, but two things.
+
+00:02:50 I think this makes a really interesting discussion for listeners out there.
+
+00:02:53 One, this is going to be very concrete with exact steps, and it's even open source.
+
+00:02:58 You can go check it out and play with it.
+
+00:03:00 And two, the tooling and the standards have changed significantly since I talked about this three or four years ago,
+
+00:03:06 making much of what we're going to talk about possible, right?
+
+00:03:09 Absolutely.
+
+00:03:09 Yeah.
+
+00:03:10 Now, before we dive into that, of course, let's do quick introductions.
+
+00:03:15 Jarek, it's been a while since you've been on the show.
+
+00:03:16 Who are you?
+
+00:03:17 Tell people who you are.
+
+00:03:18 I'm an Apache Airflow maintainer, one of the PMC members as well, and also one of the Apache Software Foundation members.
+
+00:03:26 I've got this nice, thin new logo of Apache Software Foundation that we got at FOSDEM.
+
+00:03:30 I'm also an Apache Airflow security committee member, which is an important aspect for what we are discussing today
+
+00:03:38 because of supply chain and dependencies and lots of security, potential security issues these dependencies bring.
+
+00:03:48 One of the few lucky people to contribute to open source full-time and get paid for it,
+
+00:03:53 which is amazing.
+
+00:03:54 Maybe another podcast one day about that, because I think that's also an interesting one.
+
+00:03:59 Yeah, I have something like that, a topic somewhat like that brewing.
+
+00:04:02 So, yeah, potentially to have you back for that.
+
+00:04:04 Hey, I'm Amogh Desai.
+
+00:04:05 Again, similar to Jarek, I'm a PMC member and a committer at Apache Airflow.
+
+00:04:10 And I'm also part of, I'm one of the top 10 contributors to the project,
+
+00:04:15 top 10 all-time contributors to the project, Jarek being number one.
+
+00:04:18 So I work at Astronomer as a senior software engineer, where I get to live in both worlds.
+
+00:04:24 One is contributing to Airflow's code development and also supporting the companies that are trying to run Airflow at scale.
+
+00:04:31 Awesome. What is Astronomer? Tell people about that.
+
+00:04:33 It's a company where most of our, we're a company which is almost one of the leading contributors
+
+00:04:39 to Apache Airflow and also the leading consumer of it.
+
+00:04:42 We supply and we provide a managed distribution of, corporate managed distribution of Apache Airflow inside Astro.
+
+00:04:49 And yeah, I think we have a data platform as well to try and make your lives easier to use Airflow at scale.
+
+00:04:56 And let me add to it, two comments.
+
+00:04:59 So Airflow has a number of stakeholders and commercial stakeholders who are hosting Airflow as a service as well.
+
+00:05:06 And, you know, like using Airflow, we have contributions from all over the place.
+
+00:05:10 Astronomer by far, like the biggest number of contributions and fantastic open source stakeholder.
+
+00:05:15 We're like very much focused on making Apache Airflow, like truly vendor neutral Apache project.
+
+00:05:21 Like I'm always amazed how well this works.
+
+00:05:25 And the second thing, the number one, I'm cheating a bit.
+
+00:05:27 Like, you know, I do a lot of small PRs.
+
+00:05:29 This is how you get the number one.
+
+00:05:30 I guess it depends how you measure it, huh?
+
+00:05:32 You know, you could always just do one ginormous AI PR that's like a hundred thousand lines of code in your PR
+
+00:05:39 and people would love you for it.
+
+00:05:40 And you'd be a mega contributor.
+
+00:05:42 Oh, yeah.
+
+00:05:43 Well, not.
+
+00:05:44 He does both.
+
+00:05:45 The funny part is Jarek does both.
+
+00:05:47 His velocity amazes me or, I don't know, shocks me sometimes.
+
+00:05:50 He does massive PRs and also like a lot of tiny ones.
+
+00:05:54 And by the time I'm looking, there are like three more out of it.
+
+00:05:56 I don't know how he does it.
+
+00:05:57 We're going to get to a bit of how much traffic there is on Airflow in terms of like open source activity.
+
+00:06:04 It's some, it's a little bit.
+
+00:06:05 Before we move on though, Jarek, what is the Apache Software Foundation?
+
+00:06:09 What is this Apache thing that you're talking about?
+
+00:06:12 And why is Airflow part of it?
+
+00:06:13 Very quickly.
+
+00:06:14 It's a foundation.
+
+00:06:15 One of the oldest foundations, open source foundation in the world.
+
+00:06:18 25, 26, 27 years now.
+
+00:06:20 I think the main thing about Apache Software Foundation is that it's individual driven.
+
+00:06:24 So every member is an individual, not a corporate, as opposed to like Linux Software Foundation,
+
+00:06:29 where members are corporates.
+
+00:06:30 And people make decisions in both foundation and projects or in PMCs, so-called project management committees.
+
+00:06:39 And Airflow is one of the PMCs.
+
+00:06:41 So one of the project management committees, which has PMC members.
+
+00:06:45 We are both PMC members and we have like 50 other individuals or 60.
+
+00:06:50 I can't remember like the number of changes.
+
+00:06:52 Like we are inviting new ones all the time.
+
+00:06:54 We make decisions as humans, as individuals, not the corporates who are employing us, for example,
+
+00:06:59 because it's a meritocracy-based system where people have merit and the merit doesn't expire
+
+00:07:07 and the merit belongs to individuals, not to the corporates.
+
+00:07:11 That's one of the big, like pretty much all the open source software out there, like has some Apache Foundation
+
+00:07:18 or Apache component in it.
+
+00:07:20 It started with Apache Server 20, 30 years ago almost.
+
+00:07:24 But now we have more than 200 PMCs.
+
+00:07:27 We just passed 10,000 committers mark two months ago, I think.
+
+00:07:32 So like lots of individuals, lots of people contributing to the foundation.
+
+00:07:36 And the main thing about foundation is community over code.
+
+00:07:39 So we value building communities more than actually producing code.
+ +00:07:42 We believe producing code is just byproduct of great communities working together. + +00:07:48 And ASF is a charity, is a public good charity in the US registered in Delaware. + +00:07:53 So we actually cannot be sold. + +00:07:55 We cannot change our license. + +00:07:56 Nothing like that can happen because of the status of foundation. + +00:07:59 And a really positive force for open source, right? + +00:08:01 Oh, absolutely. + +00:08:02 Absolutely. + +00:08:02 When I first got into like learning how ASF works, I said like that it has no chance to work. + +00:08:08 Like there is no way it works. + +00:08:10 It's too idealistic. + +00:08:11 There's no way. + +00:08:12 Absolutely. + +00:08:12 And nobody in the foundation who makes decisions gets any money. + +00:08:16 So like everyone is a volunteer. + +00:08:18 All the PMC members, all the committers, all the board members, all president, all the VVPs, + +00:08:23 those are all volunteer driven roles. + +00:08:26 And those are the people who make decisions. + +00:08:28 We just pay a few people in infrastructure and security. + +00:08:31 That's basically it. + +00:08:31 Let's start by just talking about high level abstract. + +00:08:35 What is a monorepo? + +00:08:36 I think it's so easy to make that sound like the same thing as a monolith. + +00:08:42 You're like, oh yeah, monorepo, monolith, same thing, right? + +00:08:44 And yet you're shaking your head. + +00:08:46 The first time I met personally a monorepo, maybe I can continue with that, but that was + +00:08:50 like at Google. + +00:08:51 I worked at Google years ago and I was surprised coming to Google that all the code there is + +00:08:57 in a single monorepo. + +00:08:59 Even though like we have like hundreds of products and all the stuff you see. + +00:09:03 It's got to be a lot of code, right? + +00:09:05 Like a giant, giant repo. + +00:09:07 Like now they have like maybe four. + +00:09:09 I don't know. 
+
+00:09:10 Like I've heard some stories.
+
+00:09:11 I don't work there for a long time now.
+
+00:09:13 But that for me, that was a sign that like you don't really have to split and dice and
+
+00:09:17 slice your repositories into many, many small ones.
+
+00:09:21 Even if you have like non-monolithical product, it all can be kept in a single source, single
+
+00:09:27 repository, separate source trees maybe, separate like we'll talk about how we do it in
+
+00:09:32 Airflow.
+
+00:09:33 But it's a way how you can bind it together and have it tested together and have it developed
+
+00:09:38 together.
+
+00:09:39 Even though each piece is pretty much separate and you can work on them separately.
+
+00:09:44 That's the monorepo.
+
+00:09:45 As opposed to multirepo, which is like when you have multiple repositories consisting of
+
+00:09:49 whatever comes up as a product.
+
+00:09:51 Yeah.
+
+00:09:52 Everything that Jarek said plus just a small addition, which is each of the component or
+
+00:09:57 the tiny bit of a monorepo can have its own build artifacts, its dependencies.
+
+00:10:03 It can also have its own release cycle or a release vehicle.
+
+00:10:06 That's the only addition, but everything is put together as a big puzzle just to keep the
+
+00:10:10 puzzle together.
+
+00:10:11 You know, not every monorepo is Python, but in Python terms, it could have its own pyproject.toml,
+
+00:10:16 potentially its own virtual environment.
+
+00:10:18 The nomenclature irony of this is that the monorepo, I think, often makes more sense when
+
+00:10:24 you are working with lots of small parts, right?
+
+00:10:27 Where the monolith, maybe it has a couple of things, but it doesn't depend real deeply.
+
+00:10:31 The more interconnections you have and the harder it is to manage those versions, the more something
+
+00:10:36 like this makes sense, right?
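The "each piece has its own pyproject.toml, dependencies, and release cycle" idea maps directly onto a uv workspace. A minimal sketch, with illustrative package names (acme-core and acme-billing are made up, not Airflow's actual layout):

```toml
# pyproject.toml at the repo root -- declares the workspace
[tool.uv.workspace]
members = ["packages/*"]

# packages/billing/pyproject.toml -- each member keeps its own
# metadata, dependencies, and version (i.e., its own release cycle)
[project]
name = "acme-billing"
version = "1.4.0"
dependencies = ["acme-core"]

# During development, resolve acme-core from the workspace checkout
# instead of PyPI
[tool.uv.sources]
acme-core = { workspace = true }
```

With that in place, `uv sync` produces one shared lockfile for the whole repo while each package can still be built and published on its own schedule.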
+
+00:10:38 People really make a connection between isolated work on part of the system into having to have
+
+00:10:45 separate repository for that, which is completely not the case.
+
+00:10:48 Like you can actually have an isolated sub part of the repository, even if it's Git.
+
+00:10:53 Git doesn't have like, you have some modules and sub repos and all that stuff.
+
+00:10:56 But even like in a single Git repository, you can easily have like start working and focusing
+
+00:11:01 on a small part of the whole monorepo and only care about that.
+
+00:11:06 That's what the monorepo is.
+
+00:11:09 I'm going to go ahead and put it out there.
+
+00:11:10 I'm not a big fan of microservice architectures.
+
+00:11:13 I kind of find it's trading code complexity for DevOps and deployment complexity.
+
+00:11:20 And I think we have better tools to manage code complexity than DevOps complexity.
+
+00:11:24 But something like this does help you manage those kinds of deployments as well better, right?
+
+00:11:30 I use the term mini-services, not microservices.
+
+00:11:33 Microservices is just too much.
+
+00:11:35 But then you can have a lot of mini-services, a number of mini-services, but not micro.
+
+00:11:39 Like micro was just too much of a mainstream.
+
+00:11:42 I can get on board with that.
+
+00:11:43 Amogh, what do you think?
+
+00:11:44 I like that as well, mini-services.
+
+00:11:46 Maybe you should coin that too.
+
+00:11:47 It's the microservices that are too small.
+
+00:11:49 It feels to me like the equivalent of when you're trying to write unit tests and you're
+
+00:11:53 like, oh, what if I get a customer and I set their first name?
+
+00:11:57 And then I check that their first name is set.
+
+00:11:58 Like, what are you doing?
+
+00:11:59 You don't need to check that assignment works.
+
+00:12:00 This is too, you're just too much in the weeds.
+
+00:12:03 You know what I mean?
+
+00:12:03 This is what AI agents do now all the time.
+
+00:12:06 Like, no.
+ +00:12:08 Yeah, think of the code coverage. + +00:12:09 Just think of the code coverage. + +00:12:10 Come on. + +00:12:10 You've got some goals to hit. + +00:12:12 You said 80% code coverage. + +00:12:13 It's on top of it. + +00:12:14 Yeah. + +00:12:15 That sets the stage. + +00:12:16 Let's talk a little bit about specifically how Apache Airflow has come to need this, basically. + +00:12:23 Right? + +00:12:23 Like, you shared with me the pulse, the GitHub pulse for Apache Airflow. + +00:12:27 And it's kind of worth looking at just how much open source interest and traffic there + +00:12:33 is. + +00:12:33 Who wants to kind of summarize this weekly pulse here? + +00:12:37 This is not the best week in terms of the number of comments. + +00:12:40 We have had even more red, but in the week of... + +00:12:43 Wow. + +00:12:43 Just one of those weeks. + +00:12:45 Yeah. + +00:12:46 One of the usual weeks. + +00:12:47 Between Feb 3 and Feb 10, we have had about 310 active pull requests. + +00:12:53 So, you can imagine that's about 40 plus pull requests a day. + +00:12:57 A lot of them are being assisted by the AI revolution going on, but that's a lot of pull requests. + +00:13:04 And we have merged about 200 of them. + +00:13:06 About 100 are open. + +00:13:08 And similarly with issues, right? + +00:13:10 35 new issues. + +00:13:11 Five issues per day. + +00:13:12 That's a lot of traffic. + +00:13:13 So, you can imagine the amount of review pressure each of the maintainers has here. + +00:13:19 There's 300 pull requests spread across, I don't know, 120, 130, maybe 140 distributions. + +00:13:26 And each of the distributions having like a swim lane owner who is actively trying to take a look at these pull requests. + +00:13:33 So, it's just another week to be very honest. + +00:13:36 It's more than 25 PRs a day, including weekends. + +00:13:39 How many of these people are high value? + +00:13:40 How many of these PRs are high value? 
+
+00:13:43 I guess I'm trying to get the sense of like, how much does this get accepted?
+
+00:13:46 Are these just people throwing stuff out there that doesn't make sense for the direction of Airflow?
+
+00:13:50 Well, those merged all make sense because they are reviewed and merged by Airflow maintainers.
+
+00:13:55 And we are very serious about that.
+
+00:13:56 So, like we don't merge anything that doesn't pass our bar, which is like very high and extremely high.
+
+00:14:01 Like we have 170 prek hooks which are checking if the PR is doing what we, if the code is doing what it was supposed to be doing and if it's architected properly.
+
+00:14:11 And on top of that, we have individuals, people like, like Amogh, myself and maybe 50 other PMC members and committers who are reviewing it and making their comments and know the system enough to direct people.
+
+00:14:25 So, they may make sense.
+
+00:14:26 We do have recently, and that was a recurring theme at the FOSDEM conference last week when I was there about like AI generated contributions.
+
+00:14:34 And many of the AI generated contributions are not the best quality.
+
+00:14:38 It's not like AI is bad quality.
+
+00:14:41 Many of those are easier to produce and they might have bad quality.
+
+00:14:45 So we are now learning how to filter them out and how to handle them quickly.
+
+00:14:50 But those are the actual high value PRs that we merged.
+
+00:14:53 In terms of numbers, if I may, it would be maybe a third of the open pull requests, which is a nice general trend.
+
+00:15:01 That's pretty good, honestly.
+
+00:15:02 Yep.
+
+00:15:03 We have some guidelines published very recently.
+
+00:15:05 And due to that, we have seen a dip in such quality of PRs.
+
+00:15:09 We published some guidelines in our contribution guides about what action will be taken if, you know, bad-quality PRs are raised, or PRs are raised where the author does not know the context, but the AI does.
+
+00:15:22 I don't want to go down this rat hole.
+
+00:15:24 People hear this enough lately, but it's been in the news lately.
+
+00:15:27 Open source projects have been kind of getting a barrage of AI submissions.
+
+00:15:32 And I think that comes in a couple of flavors.
+
+00:15:35 One, people who just want to get their name listed as a contributor; maybe it helps them with their job or whatever.
+
+00:15:40 So there's a small incentive there, but it's been really bad for bug bounties.
+
+00:15:45 Like, curl closed its bug bounty program because people were trying to make the $50 or $250 by finding some issue with AI.
+
+00:15:52 Is that a problem for you all, just taking the pulse of a big project like that?
+
+00:15:56 It is.
+
+00:15:57 I actually had a talk about that at the Global Vulnerability Intelligence Platform Summit just before FOSDEM.
+
+00:16:03 So that was exactly it. I even quoted Daniel Stenberg, and I met him there at FOSDEM, which, like, that was really cool.
+
+00:16:09 There are some different motivations of people who are submitting those AI issues, and we should fight them in different ways, with different approaches that, you know, respond to those motivations.
+
+00:16:21 Somehow we have some ideas.
+
+00:16:22 We have an open discussion in the GitHub maintainers list right now.
+
+00:16:27 And GitHub is trying to address it by just discussing what they can do right now.
+
+00:16:32 And that's the highest priority for them.
+
+00:16:34 We have a discussion with OpenSSF for security kind of guidelines, or policies, for open source maintainers, how to deal with those issues.
+
+00:16:42 And I'm sure we will work out some ways and toolings and, most of all, processes. And being assertive is one thing, like just saying no when the report doesn't meet all the bars immediately.
+
+00:16:53 And, you know, directing people to the description is a good enough barrier against, you know, completely broken PRs, because we have to just make it more expensive for the reporters than for the maintainers to diagnose the issues or decide if the issues are bad or good.
+
+00:17:11 And I'm not necessarily saying that there's something inherently bad because AI wrote some of the code rather than a person.
+
+00:17:17 AI can write really good code, better than a lot of people I've seen.
+
+00:17:20 But it has this sort of shotgun effect often, of just like, I'm going to change all these files, and it's not as focused and clear.
+
+00:17:28 A lot of times, it just doesn't get the Zen of it.
+
+00:17:31 You know, Amogh, what do you think?
+
+00:17:33 It'll generate code which it thinks is good, but we don't really know the ripple effect, and we want to avoid such things.
+
+00:17:40 It's such a long-living app with lots of complexity, right?
+
+00:17:44 And we all are using AI for generating code, to be honest, like most of my code.
+
+00:17:50 You should. Yeah, it's incredible. I pulled up this graphic here, and I'll link to it in the show notes.
+
+00:17:56 Just to give people a sense, I've got this little utility that I released this week called Tallymon, which analyzes code and gives you more of a breakdown than just this many lines or whatever.
+
+00:18:07 So I want to just highlight, maybe you all can riff on this a little bit to give a sense.
+
+00:18:12 So 1.2 million lines of Python, 918,000 excluding comments, maybe a little overcounting the way this thing works, but still. 200,000 lines of reStructuredText.
+
+00:18:23 The one that really stood out to me: 81,000 lines of YAML and 16,000 lines of TOML.
+
+00:18:28 You guys, that's impressive. And you know what?
+
+00:18:32 Hat tip to just a sprinkle, just a hint of Java at 42 lines of Java.
+
+00:18:38 But, you know, just over a million lines of code without comments.
+
+00:18:43 That's a big project. What do you think?
+
+00:18:45 What happened when you joined?
+
+00:18:47 I don't know. I think it was much less.
+
+00:18:50 You did contribute a lot.
+
+00:18:51 You can imagine so, because of the number of packages. Remember the monorepo discussion from earlier?
+
+00:18:56 We have a lot of packages, and the YAML might surprise you at first.
+
+00:19:00 But if you actually go and see why the YAML, it's mostly for our providers.
+
+00:19:05 So integration with other systems is something we call providers.
+
+00:19:08 And the spec of the providers is written in YAML.
+
+00:19:11 And TOML, sure, we'll come to it very, very soon.
+
+00:19:14 That's kind of why I pulled this up, actually. The TOML aspect is quite interesting, so let's keep that number in mind as we move on.
+
+00:19:23 16,000 lines of TOML. That's a lot of pyproject.toml going on right there, folks.
+
+00:19:28 Oh, yes. And lots of it is generated, actually.
+
+00:19:31 Because we actually generate quite a lot of the YAML and TOML that we have and keep it in the repo.
+
+00:19:37 So we don't want to regenerate every time.
+
+00:19:39 So, like, we don't write YAML by hand.
+
+00:19:43 Maybe we can start by introducing this by just giving a shout-out to this series that you wrote over here on Medium.
+
+00:19:50 Jarek, "Modern Python Repo for Apache Airflow," parts one through four.
+
+00:19:55 Yes, I initially started discussing this blog post idea with a few people.
+
+00:20:00 But, you know, people are busy, and I couldn't get anyone to write it.
+
+00:20:04 So I decided to write it myself.
+
+00:20:06 Well, with a lot of AI help, of course.
+
+00:20:09 It's not that everything is written by hand.
+
+00:20:11 And when I wrote it, I realized it was too big, and I had to split it into four.
+
+00:20:17 But the idea was to document what we've done, because I think that a lot of people are struggling with monorepo versus multirepo, or how they should do their repository when the project grows.
+
+00:20:29 And there were lots of discussions in the past, including here; one of your podcasts was about monorepo versus multirepo.
+
+00:20:37 And I can't remember who that was, but there was discussion about going back and forth, and finding that people sometimes go back and then go forth, in different directions, because there are different problems or approaches.
+
+00:20:50 So I just wanted to document the reasoning why we are doing it, like why it's possible now because of the packaging ecosystem maturing for Python, and uv and other tools coming into the space.
+
+00:21:03 And then the last part was really the kind of a little bit innovative approach that we do, where the tooling is still not catching up with what we need and what we did.
+
+00:21:12 So that's the history of why we are doing it.
+
+00:21:16 Then, you know, the packaging, the automated verification with prek.
+
+00:21:22 So that was the third part.
+
+00:21:23 And the fourth part was about the shared libraries, an innovative concept that we added.
+
+00:21:29 I'll link to the series, as well as to a talk that you gave at FOSDEM that just got published, right?
+
+00:21:34 Yes.
+
+00:21:35 Yes.
+
+00:21:36 And they have an amazing system of recording and publishing stuff.
+
+00:21:40 Like, for a volunteer-driven conference with a thousand speakers.
+
+00:21:44 Oh, that's amazing.
+
+00:21:45 That works.
+
+00:21:46 Like, probably some automation going on there.
+
+00:21:48 Let's talk a little bit about, I guess, the problems that you ran into, because initially there were some challenges with the standards and tooling not being there.
+
+00:21:56 And actually, one of the takeaways, if people read the series or watch the talk, is you actually had to work with some of the tool providers to make this possible.
+
+00:22:05 So not only is it like, well, the tools have changed, so we could do this.
+
+00:22:09 It's that you all have changed the tools a little bit through, you know, working closely. Like, hey, we've got this 1-million-line project with a hundred submodules or more.
+
+00:22:17 Help. Like, adjust your tools to support this.
+
+00:22:22 Help me make this work.
+
+00:22:23 Right.
+
+00:22:24 What were some of the problems?
+
+00:22:25 Let me start with this cooperation, and maybe, you know, Amogh can also explain what was before and after, because he experienced this kind of repository structure firsthand, as a user.
+
+00:22:34 But for me, the idea was, like, I was working on it for years.
+
+00:22:39 Like, when we went to Airflow 2, five years ago, or four years ago, I can't remember.
+
+00:22:44 That's a long time.
+
+00:22:45 And we didn't have all the tooling, and we had to do pretty much everything that we do now with the monorepo and uv by hand, with bash scripts, at that time.
+
+00:22:55 At that time? Crazy.
+
+00:22:57 So, like, if you ran it three years ago on our code, you would see more than 10,000 lines of bash code, which I wrote.
+
+00:23:04 But we since removed it.
+
+00:23:05 We since removed it.
+
+00:23:06 That is not joyful.
+
+00:23:07 That doesn't spark joy.
+
+00:23:08 That's why we removed it, with an Outreachy internship, actually.
+
+00:23:11 And shout out to Edit and Borna, who were our Outreachy mentors, who helped us to convert it to Python, which was really helpful.
+
+00:23:19 That's how it started.
+
+00:23:20 No tooling existed. Because we grew, we wanted to have more providers, more integrations, and it already was quite difficult to manage when they were all part of a single distribution.
+
+00:23:30 So we had to split into many distributions, 60, I think, at the beginning.
+
+00:23:34 Now we have more than a hundred.
+
+00:23:36 Now, when we did that, we had to do it all manually, and working with that was really cumbersome.
+
+00:23:43 Maybe, you know, I can switch to Amogh.
+
+00:23:45 So he can describe the past experience and the new experience, because he experienced the change himself.
+
+00:23:50 Yeah.
+
+00:23:51 The past experience was scary, to say the least.
+
+00:23:55 Whenever I switched branches or had to rebase for whatever reason, I had a nightmare, a very bad time trying to, you know, package things together and try to run something.
+
+00:24:05 And I think Jarek found me often, you know, ranting on the Slack channels that, hey, this doesn't work. Hey, that doesn't work.
+
+00:24:10 What do we do?
+
+00:24:11 Now it's very easy.
+
+00:24:13 It's effortless, almost effortless, compared to what we had maybe five years ago, four years ago.
+
+00:24:19 Yeah. Amazing.
+
+00:24:20 How does GitHub deal?
+
+00:24:21 I was the only one who actually managed the whole thing for years.
+
+00:24:24 And I was overwhelmed as well when people had problems, of course.
+
+00:24:27 So the change that we've done was not only with the tooling.
+
+00:24:31 And as you mentioned, we were actually cooperating with Charlie from Astral, Charlie Marsh, and with Joe from prek, because we had this need.
+
+00:24:39 We had it implemented ourselves, and then they could look at how we've done that, and they could implement it properly in their tooling.
+
+00:24:46 And we've been exchanging ideas. You know, Charlie was even interviewing me at some point about what our needs are.
+
+00:24:53 For a long time, I have had this motto that the best way to foresee the future is to shape it.
+
+00:25:00 And so we did shape the future by, you know, talking to those tool providers, or builders, so that they could build it for us and work with us.
+
+00:25:08 And we helped them to test it and everything like that.
+
+00:25:10 But also, it was listening to Amogh and other contributors, all the problems they had. And then when I solved it, I didn't solve it alone with the new tooling; we also engaged more people from the team, like Amogh and a few other active contributors.
+
+00:25:27 And they were actually part of the whole process of conversion.
+
+00:25:29 And they are now part of the team.
+
+00:25:31 And now we can have this podcast while things are breaking in Airflow right now.
+
+00:25:36 And somebody is probably fixing it right as we speak.
+
+00:25:38 So, like, not me anymore.
+
+00:25:40 So those are really great things.
+
+00:25:42 This portion of Talk Python To Me is brought to you by us.
+
+00:25:46 I want to tell you about a course I put together that I'm really proud of.
+
+00:25:50 Agentic AI Programming for Python Developers.
+
+00:25:53 I know a lot of you have tried AI coding tools and come away thinking, well, this is more hassle than it's worth.
+
+00:26:00 And honestly, all the vibe coding hype isn't helping.
+
+00:26:03 It's a smoke screen that hides what these tools can actually do.
+
+00:26:07 This course is about agentic engineering.
+
+00:26:09 Applying real software engineering practices with AI that understands your entire code base, runs your tests, and builds complete features under your direction.
+
+00:26:19 I've used these techniques to ship real production code across Talk Python, Python Bytes, and completely new projects.
+
+00:26:27 I migrated an entire CSS framework on a production site with thousands of lines of HTML in a few hours.
+
+00:26:33 I shipped a new search feature with caching and async in under an hour.
+
+00:26:38 I built a complete CLI tool for Talk Python from scratch, tested, documented, and published to PyPI in an afternoon.
+
+00:26:47 Real projects, real production code, both greenfield and legacy.
+
+00:26:51 No toy demos, no fluff.
+
+00:26:53 I'll show you the guardrails, the planning techniques, and the workflows that turn AI into a genuine engineering partner.
+
+00:27:00 Check it out at talkpython.fm/agentic-engineering.
+
+00:27:04 That's talkpython.fm/agentic-engineering.
+
+00:27:07 The link is in your podcast player's show notes.
+
+00:27:10 How does GitHub deal with so many files and such a big project? Is it fine, or is it a challenge?
+
+00:27:18 Except yesterday, when half of the time GitHub was not responding.
+
+00:27:21 Except yesterday.
+
+00:27:22 Yeah, for people who don't know, yesterday morning, at least morning US time, GitHub was having a moment.
+
+00:27:27 Like, I couldn't clone stuff.
+
+00:27:29 I pulled up a random page on GitHub and got the 503 unicorn.
+
+00:27:35 It was not good, right?
+
+00:27:36 Besides that, not counting that time.
+
+00:27:38 The unicorn is actually looking a little bit angry at you.
+
+00:27:41 That's one of the observations I had from yesterday.
+
+00:27:44 I saw it so many times that, like, it doesn't look nice.
+
+00:27:47 But maybe that's GitHub.
+
+00:27:48 I agree.
+
+00:27:49 That's not a great error page.
+
+00:27:50 Like, some error pages are amazing, where it's like, you know, the coyote fell off a cliff.
+
+00:27:55 Woo!
+
+00:27:56 You know, that one just looks like it's angry back at you.
+
+00:27:58 Besides that, it's perfect.
+
+00:28:00 Like, it works seamlessly, no problems whatsoever with the size, with the numbers.
+
+00:28:04 Like, we are very, very happy in general.
+
+00:28:06 And of course, things like that happen.
+
+00:28:08 There is nothing wrong.
+
+00:28:09 Like, there is something wrong sometimes, but it's not like it happens all the time.
+
+00:28:12 Not really, with GitHub.
+
+00:28:13 It's super rare.
+
+00:28:14 GitHub is an incredible service.
+
+00:28:15 I mean, I know there's been some grief about GitHub Actions, but that's a different conversation.
+
+00:28:22 Right?
+
+00:28:23 So let's talk next about how the packaging standards have changed, and how basically some of those things have made this possible.
+
+00:28:31 And so in your talk, you pulled up a bunch of different PEPs, nine of them or something like that, about packaging, recent packaging standards and different things like that, that have made the structure that you're working with, and the tools that do it, possible.
+
+00:28:45 Do you want to maybe highlight, either of you, some of these things that stand out as, this one is really important?
+
+00:28:50 The one which is maybe not super related to monorepo, but it actually helped us a lot, is PEP 723, the one-but-last one, inline script metadata, which is one of the biggest successes and the biggest usages I've seen of a PEP implemented.
+
+00:29:07 It caught on very, very quickly. It allows us to, you know, embed inline script metadata into the Python scripts, which is something that we've been dreaming of for years, especially for this kind of tooling, the CI environment, et cetera, et cetera.
+
+00:29:21 This is really, really helpful.
+
+00:29:22 So that's the one that I would like to highlight.
+
+00:29:24 But I, you know, I read all of them many times, all the PEPs, and they are difficult things to read and understand. But we actually did all that we could to be fully compliant, not only with the specification of those PEPs, but also with the kind of spirit of the specification, because sometimes things are not very precisely described, and there are some interpretations and stuff.
+
+00:29:46 So we just made sure, and this is our goal as well, that all the PEP standards that are being published are actually very meticulously followed.
+
+00:29:56 And we just try to adapt to any changes that are coming in the environment.
+
+00:30:00 Because we know how difficult it is if people are sticking to the old ways; that makes it difficult for Python maintainers.
+
+00:30:06 Amogh, any other thoughts?
+
+00:30:07 This one is a particularly important one for us also, because it simplifies our pre-commit configurations, where earlier we had to, you know, specify the dependencies as required.
+
+00:30:19 So, like, whatever the particular version was. But now it's all in the script.
+
+00:30:23 And the pre-commit config remains as clean as it could be, just with the hook name and, you know, the regex for the file filter and minimal configuration for it to work well.
+
+00:30:34 And I think dependency groups is also the other PEP.
+
+00:30:37 I don't recall the name, but I recall the number.
+
+00:30:40 I think it's six.
+
+00:30:41 Oh, I can't remember all the numbers, but one of those.
+
+00:30:43 That would be 735, folks. 735.
+
+00:30:46 That's also particularly nice for us.
+
+00:30:48 We can define the dependency groups in our pyproject.toml files, and it's really nice how it works with uv.
+
+00:30:55 We're very happy with dependency groups, as well as the inline scripts.
+
+00:30:59 Right.
+
+00:31:00 The inline scripts are cool.
+
+00:31:01 You know, especially with uv these days, it really makes running some kind of Python code so much easier.
+
+00:31:08 It's almost as if everything is standard library.
+
+00:31:12 I can give somebody a file.
+
+00:31:13 I can say, the way you run it...
+
+00:31:14 No, no, no, no, no.
+
+00:31:15 Don't.
+
+00:31:16 I know it looks like you say python, but don't say that.
+
+00:31:18 You say uv run this, and that's it.
+
+00:31:21 They don't even have to have Python.
+
+00:31:22 It might need ten dependencies and so on, but it doesn't matter.
+
+00:31:26 Right.
+
+00:31:27 Yeah.
+
+00:31:28 And it's a standard.
+
+00:31:29 It means, you know, other tools are doing the same, like hatch run.
+
+00:31:31 That's the same.
+
+00:31:32 There is even support for inline script metadata just released in the latest pip, 26.
+
+00:31:38 So it's all good, because of the standards and not because a single particular tool does it in an opinionated way.
+
+00:31:44 So this is really, really cool.
+
+00:31:46 And there is one big benefit of those kinds of PEPs, and this one particularly, inline script metadata.
+
+00:31:52 It's like, we have less YAML.
+
+00:31:54 Yeah.
+
+00:31:55 You still have a lot of YAML, but less is better.
+
+00:31:58 We have a lot still.
+
+00:31:59 We can't get away from that.
+
+00:32:00 It's better than it was.
+
+00:32:03 Yeah.
+
+00:32:04 And so the dependency groups are, like, you know, for dev or for tests or something like that.
+
+00:32:09 Right.
+
+00:32:10 So you can say uv sync or uv pip install, and you can say, like, thing-bracket-dev or something like that.
+
+00:32:19 Right.
+
+00:32:20 The nice thing about uv sync is that it syncs the dev dependencies automatically, without you even specifying that, which is the best thing for development, because you actually always want to have the development tools with you.
+
+00:32:31 That's a good point.
+
+00:32:32 Yeah.
+
+00:32:33 That's really cool.
+
+00:32:34 Those were the changes to Python itself, through the PEPs.
+
+00:32:37 But there are also tools, and you've already mentioned some of them, both of you, but tools that make this possible, which, I mean, I think uv has to be number one on this list, right?
+
+00:32:47 Like, uv has really done some powerful stuff here.
+
+00:32:50 Right.
+
+00:32:51 Again, Amogh can say. Like, I introduced it, but Amogh was the one to switch us to uv at some point.
+
+00:32:56 Yep.
+
+00:32:57 uv has been a game changer.
+
+00:32:58 I think we were using Poetry before this, or Hatch.
+
+00:33:01 I don't know.
+
+00:33:02 No, not even that.
+
+00:33:03 Just pip.
+
+00:33:04 Just pip.
+
+00:33:06 Just pip.
+
+00:33:07 Just the image.
+
+00:33:08 It's so good.
+
+00:33:09 The game-changing aspect that uv brought in was this notion of workspaces.
+
+00:33:13 It's something very simple.
+
+00:33:14 You can compare it to, you know, a co-working space or something similar, where it's a unified environment where multiple interconnected pieces coexist, and they're very easy to manage.
+
+00:33:26 And that's something that eventually led us to splitting the whole repository across our distributions.
+
+00:33:32 And that's the reason you see so many TOML files.
+
+00:33:35 So everything has a pyproject.toml.
+
+00:33:37 Everything defines the dependency groups it needs, and development of a particular package is restricted only to its dependencies.
+
+00:33:46 So you develop it, you run uv sync, you can run your pytest using uv, and everything that is supposed to run with it is running with it.
+
+00:33:55 And any bad, you know, cross imports are caught really easily.
+
+00:33:59 So I think the workspace feature was the most important one for me.
+
+00:34:05 And obviously the speed that it brings with it.
+
+00:34:07 And that's impressive.
+
+00:34:08 It is.
+
+00:34:09 And I think this workspace concept, it's new to me.
+
+00:34:13 I'll say it's new to me.
+
+00:34:14 I don't know how new it is to other folks.
+
+00:34:16 So you've got this giant monorepo, and how many conceptually different packages or projects are in there right now?
+
+00:34:26 120 plus.
+
+00:34:27 It changes by the day, because Amogh is doing a lot to increase the number very quickly, because we are just now in the middle of finishing some isolation kind of restructuring.
+
+00:34:38 And Amogh is the one, that's why he's here also, leading the introduction of new packages, or new distributions, that we have, like the shared libraries that we will talk about later.
+
+00:34:48 So we have a lot of those.
+
+00:34:49 Yes.
+
+00:34:50 I think this is super important to dive into, and how uv makes this possible.
+
+00:34:54 And I think you said also Hatch; you talked with Ofek, who runs Hatch, about this as well, right?
+
+00:35:00 Yes.
+
+00:35:01 Yes.
+
+00:35:02 Hatch is also supporting workspaces, which are modeled mainly after what uv has done.
+
+00:35:07 We haven't tried it yet, but I've heard it's very, very similar, or even that you can use it as a one-to-one replacement in some cases, or maybe even in all.
+
+00:35:16 But generally, I would love this eventually to become some kind of standard, so that multiple tools are supporting this.
+
+00:35:21 But yes, there are a few other tools that we were considering before, but uv is by far... yeah, well, we worked together.
+
+00:35:28 We shaped it together with the uv team.
+
+00:35:30 So it definitely works well for us.
+
+00:35:33 Yeah.
+
+00:35:34 Amazing.
+
+00:35:35 So let me describe this a little bit, and then you all can actually introduce it.
+
+00:35:39 So the idea is, we've got this monorepo with a bunch of different folders for the sections, right?
+
+00:35:45 Like airflow-ctl and airflow-core and so on.
+
+00:35:49 And you'd like to be able to kind of just jump into one section and treat it as a top-level project, right?
+
+00:35:56 It's got a pyproject.toml.
+
+00:35:57 It's got source files, tests, and so on.
+
+00:35:59 But the challenge is, you can't just have a bunch of disconnected pieces. Maybe airflow-core depends on five other parts, which themselves have their own pyproject.toml and different things.
+
+00:36:12 And if you jump into airflow-core, you've got to set up the environment just right to be working on those other parts, right?
+
+00:36:19 It sounds pretty tricky.
+
+00:36:20 So how does that work?
+
+00:36:22 Who wants to make sense of this for us?
+
+00:36:24 It works perfectly.
+
+00:36:25 Like, it's super, super simple, actually.
+
+00:36:27 You know, the whole thing about uv is its simplicity, not of the concept.
+
+00:36:33 The implementation is actually quite tricky.
+
+00:36:35 But the way you use it is very simple.
+
+00:36:37 Just go to the directory and run uv sync.
+
+00:36:39 That's basically it.
+
+00:36:40 This is the directory you want to work on.
+
+00:36:43 And it does exactly what you would expect it to do, which means that it syncs.
+
+00:36:47 It updates, or recreates basically, the virtual environment that you're using, with all the dependencies that this particular distribution needs, and anything that it needs as a transitive dependency as well.
+
+00:36:56 So if it refers to another project inside the workspace, it will also use it from there, not from one installed from PyPI.
+
+00:37:05 So you can immediately start working on this, because after uv sync, everything is exactly as you expect for this particular subset of the repository that you are in.
+
+00:37:14 And that's basically it.
+
+00:37:16 This is all.
+
+00:37:17 There is nothing more, basically.
+
+00:37:18 That's it.
+
+00:37:19 It works.
+
+00:37:20 And when you run uv run pytest, it will do exactly what you want in this folder, because even uv run will automatically sync the virtual environment, very quickly, to the one that your project needs.
+
+00:37:38 And then it will just run pytest in this virtual environment, and it will run all the tests in your project.
+
+00:37:43 And that's basically it.
+
+00:37:44 So conceptually, for the users, you don't have to do much. Just uv sync.
+
+00:37:49 And that's it.
+
+00:37:50 I think one of the big challenges here is, how do different parts of the project know about each other, right?
+
+00:37:57 Yeah.
+
+00:37:58 You said that it symlinks the different elements in.
+
+00:38:01 The basic workspace implementation is just a workspace definition.
+
+00:38:05 So you have to have the definition of the workspace in the top-level pyproject.toml.
+
+00:38:09 There you have all of them listed.
+
+00:38:11 You have links to them.
+
+00:38:12 They describe where they are, and uv will read the pyproject.toml from the top level and will know what they are.
+
+00:38:18 It will know where to look for particular distributions.
+
+00:38:21 So that's the simple discovery, and the way we know that we are using it from the sources and not from PyPI.
+
+00:38:29 But then, the shared libraries are something that we added on top of it, and the symlinks are on top of it.
+
+00:38:36 And this is the kind of extra innovative thing that we are doing for something else that we need, but, you know, we can talk about that now, or Amogh can talk about it.
+
+00:38:44 This is really cool.
+
+00:38:45 So one of the things that happens here is, in these different slices or subsections of the monorepo, each pyproject.toml defines its true dependencies and its dev dependencies and so on.
+
+00:38:59 So when you jump into a section, uv will basically realign the virtual environment with whatever dependencies are supposed to be there.
+
+00:39:10 That means installing stuff, obviously, but what actually surprised me a little bit, not a lot, more like, oh yeah, I guess it does do that, that's cool, is it actually uninstalls stuff that's not explicitly put there.
+
+00:39:17 I can imagine, before that, you could be like, well, this one part way down here depends on this weird library.
+
+00:39:29 And somehow, I used to be over there, then I went back to this other piece, and then I came back, and I forgot where that even came from.
+
+00:39:36 Like, why is that in my virtual environment? And how do I specify that? Probably juggling that was a big problem, right?
+
+00:39:41 This loading and unloading of dependencies based on what part of the monorepo you're in.
+
+00:39:47 And I think that actually makes it really much easier to deal with this type of code structure.
+
+00:39:52 Let me add to that one more thing, because it's not only the dependencies that you might have from somewhere else, but also the cross dependencies between different distributions inside.
+
+00:40:02 So, for example, if airflow-ctl does not use airflow-core, when you go there and sync, you will not be able to import and use any of the source code which is in airflow-core, because it's not a dependency of airflow-ctl.
+
+00:40:14 So uv sync will not only uninstall the dependencies that you have, but also uninstall the source code that you have from other parts of the repo, which is a fantastic thing for us.
+
+00:40:23 And that was exactly the kind of isolation between those parts that was missing before.
+
+00:40:27 From your source, you can only refer to the source code of those distributions that you depend on, and nothing else from the monorepo.
+
+00:40:36 So this means that you can slice and dice your repository as you want.
+
+00:40:41 So depending on which directory you are in when you run uv sync, you will have a subset, the actual useful and used subset, of your repository.
+
+00:40:51 And it can be completely different if you go to another directory; some of it can be overlapping, some of it can be completely different.
+
+00:40:58 It depends which dependencies are defined.
+
+00:41:00 And this all magically happens by just defining the dependency in pyproject.toml.
+
+00:41:06 And uv sync will handle it for you in the workspace.
+
+00:41:09 It's exactly the reason why it's so useful for developers.
+
+00:41:13 It helped us in our vision to actually, you know, decompose the project into multiple parts and avoid the classic problem of coupling, which every monorepo faces at some point in its lifecycle, because everything is out there, and you end up with code leaks all over the place.
+
+00:41:31 So this helps us prevent that.
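The isolation they describe falls out of the environment itself: if a sibling distribution is not declared as a dependency, it simply is not importable after `uv sync`, so accidental coupling fails immediately rather than lingering. A tiny sketch of checking that from Python (the undeclared module name below is a placeholder, not a real Airflow package):

```python
import importlib.util


def can_import(name: str) -> bool:
    """Report whether `name` resolves in the current environment."""
    return importlib.util.find_spec(name) is not None


# In a synced environment, the stdlib and declared dependencies resolve...
print(can_import("json"))
# ...while an undeclared sibling distribution (placeholder name) does not,
# so a cross-distribution import would raise ModuleNotFoundError at once.
print(can_import("some_undeclared_sibling"))
```

This is the practical difference from lint-based boundary rules: the boundary is enforced by what is physically installed, not by a checker that someone has to keep up to date.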
+
+00:41:33 And I cannot imagine a time how we did it earlier before uv.
+
+00:41:36 I don't know if we did it, but if we did it, it would have been a really tough thing.
+
+00:41:41 Yeah, there's a bunch of tools that you can, linters and code analysis things you can run on your code that break it down for these different modules and these layers.
+
+00:41:50 Here's like a directed graph of how this thing, and you can set up rules to say this should never cross that boundary, but these are just very, very vague things.
+
+00:42:00 And this setup actually makes it so it's not accessible to your code.
+
+00:42:04 If you didn't say it should be.
+
+00:42:05 It's just built in exactly the definition of your distribution, which you anyhow have to do because like you have to define what the, what the dependencies are.
+
+00:42:13 And yes, we did something like that before.
+
+00:42:15 So we had a number of like rough rules or whatever.
+
+00:42:18 Don't import here, import here.
+
+00:42:21 We still have them for shared libraries, which we can talk about now, because I think this is an important modification of the concept.
+
+00:42:28 So we do have some automated checks for quality and for imports with prek, our pre-commit hook implementation.
+
+00:42:36 But before that, it was just completely, completely like handwritten and unmaintainable.
+
+00:42:42 People were not actually updating it with all the distributions, you couldn't really, you know, follow when things change.
+
+00:42:48 With pyproject.toml for each distribution being the single source of truth, you don't have to do anything because the dependency is declared there.
+
+00:42:56 And this is like the best part of, of uv understanding that and, and doing everything that is like reasonable in this case.
+
+00:43:03 The other major tool involved here was prek, which is a pre-commit framework for running hooks, for many languages, but especially Python relevant here, written in Rust.
+
+00:43:15 So it pairs well with uv, I suppose.
+
+00:43:17 Oh yeah.
+
+00:43:18 It was inspired by uv as well.
+
+00:43:20 And, and Joe was mentioning, mentioned that, that he was actually contributing to uv before.
+
+00:43:25 Great.
+
+00:43:26 How's prek show up here?
+
+00:43:27 I feel like this is leading towards what you were hinting at earlier.
+
+00:43:30 It's a new name, prek.
+
+00:43:31 So, yep.
+
+00:43:32 This allows us to do a few things which pre-commit did not do, or, you know, did not accept as suggestions.
+
+00:43:39 So, one certain thing that prek offers is obviously it's written in Rust.
+
+00:43:45 So speed is the obvious one that we get.
+
+00:43:48 But apart from that, we also get this notion of it pairing well with uv in terms of modularized hooks.
+
+00:43:53 Earlier, we had all the hooks in one place in that, in the top-level pre-commit YAML, right?
+
+00:44:00 And it was a big file.
+
+00:44:01 It was really big.
+
+00:44:03 You can imagine.
+
+00:44:04 So, yeah.
+
+00:44:05 So this prek allowed us to, prek again, you know, it, it consumed the concept of workspaces here, I would say.
+
+00:44:11 So it allowed you to define pre-commit hooks or prek hooks within a module itself.
+
+00:44:18 And this paired well with uv in the sense that when you have to run hooks that are bound to a certain distribution,
+
+00:44:26 all you have to do is change into the, you know, the submodule and just do a prek run.
+
+00:44:31 It will run the relevant hooks for that particular module.
+
+00:44:34 And the other, other thing that I really love about prek is autocompletion, which is not something pre-commit had.
+
+00:44:41 So you can imagine that something fails in the CI, you have to copy that and copy the ID and try to kind of backtrack it in your repo as to which one is failing.
+
+00:44:50 So it's, it used to be a nightmare, but now with the, you know, the tab completion, it's, it's amazing.
+
+00:44:56 Nice.
Are you talking about like shell autocomplete integration?
+
+00:45:00 Yeah. Yeah. So, okay. I've seen.
+
+00:45:02 I have some story about that, very, very short.
+
+00:45:04 So like we actually tried to get autocompletion for hook names with, with pre-commit, which was the predecessor of prek.
+
+00:45:11 Like prek was largely based on pre-commit, but somehow the author of it didn't accept even the idea of us contributing it, or actually had some very, very excessive expectations for that.
+
+00:45:22 And we, you know, discussed and like, there were like, other people were also trying to convince the author to do that, but they refused.
+
+00:45:29 He refused basically and refused to accept contributions.
+
+00:45:32 Even when we spoke to Joe, that was like a completely different story.
+
+00:45:36 Like we need that.
+
+00:45:37 And next day it was there.
+
+00:45:38 Like it's like a completely different approach.
+
+00:45:41 So, so this is, and then we said like, we need workspaces and like a few weeks later, because it took a little bit of time, it was there and we worked together and we tested that.
+
+00:45:49 And like, I raised, I don't know how many issues in the initial kind of pre-release version when, when we wanted to use it.
+
+00:45:55 So I think the collaboration and being, you know, working together, listening to your users and responding and actually working as open source maintainers together.
+
+00:46:04 This actually worked perfectly well here, both, both in uv and prek.
+
+00:46:08 And this is why we love prek actually, because, because we know we can rely, if something is not working, that it's going to be like, we can discuss and either submit a fix or, or, you know, Joe will do this or even like lots of other people can do it.
+
+00:46:22 Because there were a few features that we wanted and somebody else implemented it.
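+
+[Editor's note: a per-module hook file in the pre-commit-compatible YAML format that prek consumes might look roughly like this; the hook id and the helper script are hypothetical illustrations, not Airflow's actual hooks:]
+
```yaml
# airflow-ctl/.pre-commit-config.yaml (illustrative sketch)
repos:
  - repo: local
    hooks:
      - id: check-airflow-ctl-imports
        name: Verify airflow-ctl only imports from its declared dependencies
        entry: python scripts/check_imports.py  # hypothetical helper script
        language: python
        files: ^src/
```
+
+[With hooks split out per module like this, running prek from inside a module executes only the hooks relevant to that module, as described earlier.]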
+
+00:46:26 And that wasn't Joe, they contributed to prek because of this openness and, you know, being able to accept the needs of the users.
+
+00:46:35 That was a very, very important part, like why we moved to prek.
+
+00:46:38 Yeah. I think Airflow was also one of the initial case studies for prek.
+
+00:46:41 It's a project of that scale. And if you kind of satisfy that project's needs, you are, you're pretty good with most use cases.
+
+00:46:48 I think that's true of both prek and uv.
+
+00:46:51 Yeah. Right there at the top of the prek repo, it says, although prek is pretty new, it's already powering real projects, you know, little things like CPython, Apache Airflow and FastAPI.
+
+00:47:01 I know Hugo van Kemenade, the release manager of Python. So we met at FOSDEM as well. And like, he was actually listening to our prek discussion and he converted, you know, CPython to use prek because of the, of the needs they had.
+
+00:47:13 So like, it was all about, you know, people talking to each other, word of mouth and things like that.
+
+00:47:19 You know, there's a feature listed here that just makes me jealous. One of the features of prek is a single binary with no dependencies that doesn't require Python or any other runtime to be installed.
+
+00:47:29 Like how incredible would it be with Python if we had a, a Python --build app or something, you know what I mean?
+
+00:47:36 You can point it at your thing and you get something you could distribute. I know uv solves a lot, but you still got to have uv installed.
+
+00:47:43 And then, you know, like this, that is a huge advantage of things like Rust and Go and some other languages.
+
+00:47:49 It's both good and bad in some cases. So it's like, there are always trade-offs, a different choice made by Python here.
+
+00:47:55 I don't think it's like the best choice for, for Python.
I think Python being a scripting language, it's okay to have, you know, like dependencies, and especially like inline script metadata almost did it because you just, you know, can install stuff.
+
+00:48:09 And uv also, and that kind of tooling is also doing all the stuff like uv install or uv tool install, whatever.
+
+00:48:16 And it would not only install the project, its dependencies, but also install Python that is needed to run it.
+
+00:48:22 So like all this is really a matter of two weeks and it has improved dramatically over the last few years.
+
+00:48:27 Yeah. I was pining for an option, not an only-binary thing.
+
+00:48:32 All right. So one thing I actually want to talk about, going back to this workspaces thing real quick, is what does it look like from an IDE or editor experience to work on this?
+
+00:48:43 All right. Like you've got Python projects, you've got maybe VS Code workspaces where you can pull in different pieces. How do you all manage that?
+
+00:48:52 I cannot talk for VS Code. I'm a PyCharm user here, but we had to do a little bit of hacking, I would say, or more like a helper script for the IDEs, right?
+
+00:49:02 Because so we have an IDE helper script right in the repo and we recommend the users to run it so that the IDE knows what is where in terms of maintaining things, right?
+
+00:49:12 Because in normal projects, there's usually just one source, one tests at the top level, but this has 120 plus.
+
+00:49:19 And the helper script is, it does a pretty simple thing. It just auto discovers all the packages in the monorepo and adds this.
+
+00:49:27 So IntelliJ and PyCharm both have a .idea, a hidden folder within each of the projects that it opens.
+
+00:49:35 And it has a, and it supports an XML-like format, the .iml, where you can define certain things.
+
+00:49:42 So this essentially does a very simple thing.
+
+00:49:44 It just, for each package, it adds the module slash source as the source root and the module slash tests as the test root.
+
+00:49:52 It's as if you went through all 120 things and right clicked and said mark as sources root or something like that.
+
+00:49:56 Yeah, we had this PyCharm script and then we have the same approach for VS Code.
+
+00:50:00 So we have another script for VS Code as well, which was contributed by someone who uses VS Code because neither me nor Amogh are VS Code users.
+
+00:50:08 PyCharm users, both of us.
+
+00:50:09 But, you know, community also, and like somebody said, OK, I'll do it.
+
+00:50:13 And there it was.
+
+00:50:14 And they tested it.
+
+00:50:15 And, you know, like that's, that was super cool actually.
+
+00:50:17 So, yeah, it works well.
+
+00:50:19 Also, the, you know, a little bit of words, probably we don't talk, we won't talk too much about like the, we don't have too much time, but the shared libraries concept a little bit, maybe it's the right time to introduce the concept.
+
+00:50:32 Because, because we like one thing that Amogh mentioned is like the, we have, we solved this coupling problem, but also we wanted to solve the DRY problem.
+
+00:50:41 And those two are always kind of a mixture, like you get more DRY and less coupling, or more DRY and more coupling, and like all these things are complex when you have lots of code interacting with each other.
+
+00:50:54 DRY being the architectural philosophy of don't repeat yourself.
+
+00:50:57 But if you're not repeating yourself, everything where if it exists somewhere, everything's got to depend on that somewhere and it starts to become more linked together.
+
+00:51:04 Right.
+
+00:51:05 So it's a little bit of like a, eat cake and have it too.
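+
+[Editor's note: the IDE helper script described above boils down to something like this sketch — walk the repo, find every directory holding a pyproject.toml, and collect its src/ and tests/ folders so an IDE can mark them as source and test roots. The real Airflow scripts then write PyCharm .iml entries or VS Code settings; all names here are illustrative:]
+
```python
from pathlib import Path

def discover_package_roots(repo_root: Path) -> list[dict]:
    """Find every package (a directory with pyproject.toml) and its src/tests dirs."""
    packages = []
    for pyproject in sorted(repo_root.rglob("pyproject.toml")):
        pkg_dir = pyproject.parent
        entry = {"package": pkg_dir, "sources": [], "tests": []}
        # Mark <package>/src as a sources root if it exists...
        if (pkg_dir / "src").is_dir():
            entry["sources"].append(pkg_dir / "src")
        # ...and <package>/tests as a test root if it exists.
        if (pkg_dir / "tests").is_dir():
            entry["tests"].append(pkg_dir / "tests")
        packages.append(entry)
    return packages
```
+
+[An IDE-specific writer would then turn each entry into the editor's native config, the equivalent of right-clicking 120-plus folders and choosing "Mark Directory as Sources Root".]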
+
+00:51:08 Like we want to have DRY code and not to repeat it for like common utilities, like logging, configuration, whatever, all the things that are kind of common between all the different distributions.
+
+00:51:18 But also we didn't want to depend on a single version of those, because if we do, then it means that we have to make sure that the backwards compatibility is maintained.
+
+00:51:26 Because like when we install different versions of different distributions coming from different times of the repository, they might use different versions of those shared libraries.
+
+00:51:35 And like how to make sure that they don't have breaking changes and stuff, so this is a whole level of complexity between like how to manage the dependencies there and manage versions, especially manage the backwards compatibility.
+
+00:51:48 So we figured out that with some very simple approach, we tried a few different approaches, but like, like one of the approaches was using the vendoring approach from pip, from pip.
+
+00:52:01 And the second one, and that's the one we came up with, we finally implemented, was like using symlinks to share the code between different distributions.
+
+00:52:09 And that's a very innovative approach that I hope will make it into some kind of standard eventually.
+
+00:52:14 So like we came up with this approach where we actually have cake and eat it too, like, which is like pretty amazing if you fought for years with this kind of common dependency issues and backwards compatibility.
+
+00:52:27 So in our case, like the symlink approach we have, it needs some pre-processing of pyproject.toml.
+
+00:52:33 Some parts of the pyproject.toml are generated to make it actually work.
+
+00:52:37 But this is all automated with prek, which is like, we don't have to think about that even.
+
+00:52:41 And once we do that, and once we create some symlinks between different parts of code, like one library, one distribution symlinks in code from the shared distribution.
+
+00:52:50 The end result is that this code gets automatically vendored in during the building of the package, which means that we actually can have the same library in a different version in different distributions.
+
+00:53:02 So a distribution released a week ago will have the shared configuration from a week ago.
+
+00:53:08 But another distribution will have the same shared configuration code from today if it's released today.
+
+00:53:14 And we can install them together.
+
+00:53:16 And all of them have, effectively, it's like if they had a different version of the same library installed.
+
+00:53:22 It's as if the airflow-ctl said it had a dependency on core and it pinned that version to something, but a different part of the repo pinned it to a different one.
+
+00:53:31 And they can both kind of coexist.
+
+00:53:34 But it's actually all within the same code file.
+
+00:53:36 That's insane.
+
+00:53:37 OK.
+
+00:53:38 And this is like largely, like it's nothing new.
+
+00:53:40 It's largely inspired by how the libraries work in C and like traditional kind of building code.
+
+00:53:46 Like you have dynamic libraries and static libraries.
+
+00:53:49 So this is like essentially the equivalent of static libraries, where you take the code of the version that you compile the stuff in and put it inside the final binary.
+
+00:53:59 And then it results, like in Rust, in the kind of single binary thing.
+
+00:54:02 So it's a little bit like, so we have a little bit of this single binary by doing that, in the sense that we automatically vendor in all the, you know, shared dependencies that we have in the same distribution.
+
+00:54:15 So it's kind of hybrid, but it's always like, so Rust is a little bit too far because everything is a single binary.
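+
+[Editor's note: mechanically, the symlink vendoring described above amounts to something like the sketch below; the paths, the shared-library name, and the force-include mapping are illustrative assumptions, not the exact Airflow configuration:]
+
```toml
# A consuming distribution symlinks the shared sources into its own tree:
#   airflow-ctl/src/airflowctl/_shared -> ../../../shared/timezones/src
#
# pyproject.toml of the consuming distribution (illustrative sketch).
# At build time the backend copies what the symlink points at, so the
# wheel carries its own snapshot of the shared code as of that release.
[tool.hatch.build.targets.wheel.force-include]
"src/airflowctl/_shared" = "airflowctl/_shared"
```
+
+[That is the static-library analogy from above: two distributions built a week apart can embed, and co-install, different snapshots of the same shared library, because each wheel vendors the copy that existed when it was built.]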
+
+00:54:21 In our case, we have a bit of both.
+
+00:54:23 Like we can use libraries dynamically, but we can also embed libraries as shared inside the single distribution.
+
+00:54:29 That's very cool.
+
+00:54:30 That's wild.
+
+00:54:31 Amogh?
+
+00:54:32 Sounds like you were instrumental in this part.
+
+00:54:34 That's the nice thing about the approach that was chosen, right?
+
+00:54:37 We all came together as a community on this one.
+
+00:54:40 And we had one email, a dev list discussion one fine day that, hey, we want to achieve something like this, which more or less was something everyone agreed upon.
+
+00:54:50 So people started chiming in and we started trying different things out.
+
+00:54:53 The first one, obviously, using the vendoring tool from pip.
+
+00:54:57 Somebody did a POC on that, but it felt like it's going to be difficult to achieve that long term.
+
+00:55:02 And also it could be brittle.
+
+00:55:04 So Jarek came up with this particular option with symlinks, which again was discussed within the community.
+
+00:55:10 A few of us picked this PR up, tested it locally, played around and gave the feedback.
+
+00:55:16 So I don't think this would be possible with AI in the sense that this has never been done before.
+
+00:55:23 Or something like this, where a community comes together and solves a rather difficult problem, is something that makes me really happy.
+
+00:55:31 And also something that all of us are working towards a common goal while also bound by our corporate hats, right?
+
+00:55:38 Is something that is, again, really nice to see.
+
+00:55:41 We have about, how many, 11, I think at this point, we have about 11 to 12 shared libraries, where the main notion here is to reimagine Airflow as an independent server and more like a control plane and execution plane.
+
+00:55:54 What we did with Airflow 3 and these shared libraries is helping us achieve that model.
+
+00:56:01 And we have about 11 to 12 of them.
+
+00:56:03 And I think a few more coming very soon.
+
+00:56:05 But yeah, that's yeah, it's been nice working on the shared libraries.
+
+00:56:10 It's yeah.
+
+00:56:11 Is this something that people can take and adopt into their monorepo if they want to live that life?
+
+00:56:16 Absolutely.
+
+00:56:17 Yeah.
+
+00:56:18 It's just that it's really like one or two kind of prek hooks which are maintaining the consistency.
+
+00:56:23 And like, so that you don't forget to add this symlink here and that kind of pyproject.toml definition here, and/or that hatch definition to actually embed your symlinked code into the final distribution.
+
+00:56:38 So like there are like a few pieces that have to be put together from existing libraries.
+
+00:56:43 So that's basically it.
+
+00:56:44 And once you do it, it's just that, the funny thing is like those shared libraries are just standalone distributions.
+
+00:56:51 You can actually build them separately as a library as well.
+
+00:56:54 We could potentially even, you know, like just use them as a library as well.
+
+00:56:58 No problem whatsoever, because they are just standard, plain distributions like any other.
+
+00:57:02 We just happen to take the source code of it and then embed it into the target distribution that wants to use it, rather than, you know, link to it by dependency.
+
+00:57:11 So that's basically it, other than that.
+
+00:57:13 It's, it's a kind of completely standard library, or standard distribution.
+
+00:57:18 And one, one more thing that is really important to add here is like this also has a side effect, but I think a very nice one.
+
+00:57:24 And Amogh can confirm that because he has been doing a lot of that, is like we actually came up with like way better internal architecture.
+
+00:57:31 Because of that, because a lot of those shared libraries, they depended on each other, sometimes in a circular fashion.
+
+00:57:38 Sometimes it really depended, like which import you did first, like what happened, like what was initialized.
+
+00:57:43 And it was like a complete spaghetti of dependencies between generally independent pieces of functionality.
+
+00:57:49 Right now, by having shared libraries, we are actually forcing ourselves to make them isolated.
+
+00:57:56 We are changing the way how we initialize them.
+
+00:57:59 For example, we are injecting all the configuration rather than using it from inside the library, because like configuration is a library and the other libraries, you don't want them to depend on the other libraries.
+
+00:58:08 So it's, and it's really nice.
+
+00:58:09 I think it comes.
+
+00:58:11 The result is that really the architecture of Airflow internally is so much better because of that.
+
+00:58:17 So fewer surprises, and explicit initialization is like something that we'll have to do rather than implicit initialization during imports, which has always been plaguing us as a big issue.
+
+00:58:28 Certainly, it also allows you to imagine each component having an entry point, per se, where you have an initial starting point and it initializes everything it needs by injecting and calling certain factories, which makes it very clean for anyone visiting the project.
+
+00:58:45 Also, they look at something and they know the entry point very clearly that, hey, this is how it starts.
+
+00:58:49 This is what it initializes.
+
+00:58:50 You know, it reminds me of like Golang or Java projects where they have a nice, nice main, where in Python, Python, it's not really the same way.
+
+00:58:59 All right. Well, I think that's about it for all the time we have.
+
+00:59:02 I guess let's close it out with one final thought.
+
+00:59:05 Here's for people who are maybe inspired by your design, by the way you put together Airflow and this monorepo concept, especially Python people.
+
+00:59:15 What do you, what do you say to them?
+
+00:59:17 Final thoughts here.
+
+00:59:18 I mean, like there was always discussion.
+
+00:59:20 Like we had lots of discussions internally, even some of the team members in Airflow.
+
+00:59:24 They said, let's split the repository into smaller ones.
+
+00:59:27 Like let's make more of them because it's going to make things easier.
+
+00:59:30 I was always the monorepo fan, and I made a lot of work to make it possible.
+
+00:59:35 But that was a very, very difficult thing.
+
+00:59:37 It's changed.
+
+00:59:38 So like the reasons why you would like to have multiple repos are gone now if you're using the right tooling.
+
+00:59:44 And only the benefits, or mostly the benefits, from having it in one place where you can test everything together and work on it together, remain.
+
+00:59:51 All the rest is basically gone.
+
+00:59:53 So for me, the discussion monorepo versus multirepo is already solved.
+
+00:59:57 Yeah, just do it.
+
+00:59:58 It's not even a question.
+
+01:00:00 So personally, I've been using the README that we have present in the shared libraries as context for my IDE.
+
+01:00:07 So it's turning out to be very nice for the shared library split, for example.
+
+01:00:12 All I have to do is just provide it the context and tell it, hey, just construct the structure for me, and I can do everything else.
+
+01:00:20 So it's that easy.
+
+01:00:21 We have all the things in place.
+
+01:00:23 We are in the right era to do it.
+
+01:00:24 So just do it.
+
+01:00:25 Very inspiring.
+
+01:00:26 Thank you for being here.
+
+01:00:28 Awesome for this look inside.
+
+01:00:31 And it's Apache Airflow.
+
+01:00:32 It's on GitHub.
+
+01:00:33 People can go look and see.
+
+01:00:34 It's not just talking vaguely about some internal project.
+
+01:00:38 Right.
+
+01:00:39 So people can go check it out.
+
+01:00:40 All right.
+
+01:00:41 See you later.
+
+01:00:42 Thanks.
+
+01:00:43 Thanks.
+
+01:00:45 This has been another episode of Talk Python To Me.
+
+01:00:48 Thank you to our sponsors.
+
+01:00:49 Be sure to check out what they're offering.
+
+01:00:50 It really helps support the show.
+
+01:00:52 This episode is brought to you by our Agentic AI Programming for Python course.
+
+01:00:57 Learn to work with AI that actually understands your code base and build real features.
+
+01:01:02 Visit talkpython.fm/agentic-ai.
+
+01:01:06 If you or your team needs to learn Python, we have over 270 hours of beginner and advanced
+
+01:01:11 courses on topics ranging from complete beginners to async code, Flask, Django, HTMX, and even
+
+01:01:18 LLMs.
+
+01:01:18 Best of all, there's no subscription in sight.
+
+01:01:21 Browse the catalog at talkpython.fm.
+
+01:01:24 And if you're not already subscribed to the show on your favorite podcast player, what
+
+01:01:28 are you waiting for?
+
+01:01:29 Just search for Python in your podcast player.
+
+01:01:31 We should be right at the top.
+
+01:01:32 If you enjoy that geeky rap song, you can download the full track.
+
+01:01:35 The link is actually in your podcast player's show notes.
+
+01:01:38 This is your host, Michael Kennedy.
+
+01:01:39 Thank you so much for listening.
+
+01:01:41 I really appreciate it.
+
+01:01:42 I'll see you next time.
+
+01:02:11 Bye.
+
+01:02:11 Thank you.
+
diff --git a/transcripts/540-modern-python-monorepo-with-uv-and-prek-transcript.vtt b/transcripts/540-modern-python-monorepo-with-uv-and-prek-transcript.vtt
new file mode 100644
index 0000000..6c9349c
--- /dev/null
+++ b/transcripts/540-modern-python-monorepo-with-uv-and-prek-transcript.vtt
@@ -0,0 +1,2722 @@
+WEBVTT
+
+00:00:00.000 --> 00:00:10.880
+Monorepos. You've heard the talks, you've read the blog posts, maybe you've seen a few glimpses into how Google or Meta organize their massive code bases, but it's often in the abstract and behind closed doors.
+
+00:00:11.500 --> 00:00:24.760
+What if you could crack open a real production monorepo, one with over a million lines of Python code and over a hundred sub-packages, and actually see what's being built step-by-step using modern tools and standards?
+
+00:00:24.760 --> 00:00:46.440
+Well, that's exactly what Apache Airflow gives us. On this episode, I sit down with Jarek Potiuk and Amogh Desai, two of Airflow's top contributors, to go inside one of the largest open-source Python monorepos in the world and learn how they manage it with uv, pyproject.toml, and the latest packaging standards, so you can apply the same patterns to your own projects.
+
+00:00:47.080 --> 00:00:52.200
+This is Talk Python To Me, episode 540, recorded February 10th, 2026.
+
+00:00:52.200 --> 00:01:14.460
+Welcome to Talk Python To Me, the number one Python podcast for developers and data scientists.
+
+00:01:14.620 --> 00:01:21.920
+This is your host, Michael Kennedy. I'm a PSF fellow who's been coding for over 25 years. Let's connect on social media.
+
+00:01:21.920 --> 00:01:27.540
+You'll find me and Talk Python on Mastodon, Bluesky, and X. The social links are all in your show notes.
+
+00:01:28.240 --> 00:01:35.280
+You can find over 10 years of past episodes at talkpython.fm, and if you want to be part of the show, you can join our recording live streams.
+
+00:01:35.480 --> 00:01:39.520
+That's right, we live stream the raw, uncut version of each episode on YouTube.
+
+00:01:40.000 --> 00:01:44.520
+Just visit talkpython.fm/youtube to see the schedule of upcoming events.
+
+00:01:44.680 --> 00:01:48.340
+Be sure to subscribe there and press the bell so you'll get notified anytime we're recording.
+
+00:01:48.340 --> 00:01:52.840
+This episode is brought to you by our Agentic AI Programming for Python course.
+
+00:01:53.300 --> 00:01:57.920
+Learn to work with AI that actually understands your code base and build real features.
+
+00:01:58.440 --> 00:02:01.820
+Visit talkpython.fm/Agentic-AI.
+
+00:02:02.460 --> 00:02:05.840
+Hello, hello, Jarek, Amogh. Welcome to Talk Python To Me.
+
+00:02:06.160 --> 00:02:09.560
+Awesome to have Amogh, you here, and Jarek, you back.
+
+00:02:09.820 --> 00:02:14.920
+Very nice to be again at Talk Python To Me. It's one of my favorite podcasts I listen to all the time.
+
+00:02:15.000 --> 00:02:15.720
+Thank you, thank you.
+
+00:02:15.720 --> 00:02:18.360
+It's my first, but yeah, thanks for having me, Mike.
+
+00:02:18.440 --> 00:02:19.280
+Happy to have you here.
+
+00:02:19.780 --> 00:02:26.380
+You and a team of people, given the scale of this project, have built an amazing, amazing product with Apache Airflow.
+
+00:02:26.680 --> 00:02:33.180
+It's going to be really fun to dive into it, and specifically, we're going to focus on not building workflows exactly,
+
+00:02:33.420 --> 00:02:34.920
+although I'm sure we'll talk about that somewhat.
+
+00:02:35.180 --> 00:02:40.440
+The real goal, the thing that we're going to focus on, is how do you manage such a big project
+
+00:02:40.440 --> 00:02:45.400
+with so many different internal packages that all depend upon each other and so on,
+
+00:02:45.680 --> 00:02:47.740
+and monorepos, and that.
+
+00:02:47.820 --> 00:02:50.340
+I've touched on monorepos before, but two things.
+
+00:02:50.420 --> 00:02:53.640
+I think this makes a really interesting discussion for listeners out there.
+
+00:02:53.740 --> 00:02:58.660
+One, this is going to be very concrete with exact steps, and it's even open source.
+
+00:02:58.740 --> 00:03:00.460
+You can go check it out and play with it.
+
+00:03:00.720 --> 00:03:06.920
+And two, the tooling and the standards have changed significantly since I talked about this three or four years ago,
+
+00:03:06.920 --> 00:03:09.160
+making much of what we're going to talk about possible, right?
+
+00:03:09.280 --> 00:03:09.560
+Absolutely.
+
+00:03:09.820 --> 00:03:09.980
+Yeah.
+
+00:03:10.280 --> 00:03:14.480
+Now, before we dive into that, of course, let's do quick introductions.
+
+00:03:15.140 --> 00:03:16.620
+Jarek, it's been a while since you've been on the show.
+
+00:03:16.680 --> 00:03:17.120
+Who are you?
+
+00:03:17.280 --> 00:03:18.100
+Tell people who you are.
+
+00:03:18.200 --> 00:03:22.620
+I'm an Apache Airflow maintainer, one of the PMC members as well,
+
+00:03:22.720 --> 00:03:25.980
+and also one of the Apache Software Foundation members.
+
+00:03:26.100 --> 00:03:30.420
+I've got this nice pin, the new logo of the Apache Software Foundation that we got at FOSDEM.
+
+00:03:30.420 --> 00:03:34.740
+I'm also an Apache Airflow security committee member,
+
+00:03:35.100 --> 00:03:38.420
+which is an important aspect for what we are discussing today
+
+00:03:38.980 --> 00:03:44.180
+because of supply chain and dependencies and lots of security,
+
+00:03:44.500 --> 00:03:47.760
+potential security issues these dependencies bring.
+
+00:03:48.020 --> 00:03:53.160
+One of the few lucky people to contribute to open source full-time and get paid for it,
+
+00:03:53.360 --> 00:03:54.300
+which is amazing.
+
+00:03:54.860 --> 00:03:59.200
+Maybe another podcast one day about that, because I think that's also an interesting one.
+
+00:03:59.200 --> 00:04:01.980
+Yeah, I have something like that, a topic somewhat like that brewing.
+
+00:04:02.220 --> 00:04:03.920
+So, yeah, potentially to have you back for that.
+
+00:04:04.020 --> 00:04:05.080
+Hey, I'm Amogh Desai.
+
+00:04:05.220 --> 00:04:10.040
+Again, similar to Jarek, I'm a PMC member and a committer at Apache Airflow.
+
+00:04:10.440 --> 00:04:15.660
+And I'm also part of, I'm one of the top 10 contributors to the project,
+
+00:04:15.920 --> 00:04:18.760
+top 10 all-time contributors to the project, Jarek being number one.
+
+00:04:18.980 --> 00:04:21.780
+So I work at Astronomer as a senior software engineer,
+
+00:04:21.920 --> 00:04:24.340
+where I get to live in both worlds.
+
+00:04:24.340 --> 00:04:27.160
+One is contributing to Airflow's code development
+
+00:04:27.160 --> 00:04:31.620
+and also supporting the companies that are trying to run Airflow at scale.
+
+00:04:31.700 --> 00:04:33.860
+Awesome. What is Astronomer? Tell people about that.
+
+00:04:33.940 --> 00:04:39.200
+It's a company where most of our, we're a company which is almost one of the leading contributors
+
+00:04:39.200 --> 00:04:41.980
+to Apache Airflow and also the leading consumer of it.
+
+00:04:42.300 --> 00:04:45.620
+We supply and we provide a managed distribution of,
+
+00:04:45.880 --> 00:04:49.200
+a corporate managed distribution of Apache Airflow inside Astro.
+
+00:04:49.200 --> 00:04:56.800
+And yeah, I think we have a data platform as well to try and make your lives easier to use Airflow at scale.
+
+00:04:56.940 --> 00:04:58.900
+And let me add to it, two comments.
+
+00:04:59.540 --> 00:05:03.080
+So Airflow has a number of stakeholders and commercial stakeholders
+
+00:05:03.080 --> 00:05:05.860
+who are hosting Airflow as a service as well.
+
+00:05:06.080 --> 00:05:10.020
+And, you know, like using Airflow, we have contributions from all over the place.
+
+00:05:10.140 --> 00:05:13.020
+Astronomer, by far, like the biggest number of contributions
+
+00:05:13.020 --> 00:05:15.060
+and a fantastic open source stakeholder.
+
+00:05:15.060 --> 00:05:18.560
+We are like very much focused on making Apache Airflow
+
+00:05:18.740 --> 00:05:21.300
+like a truly vendor-neutral Apache project.
+
+00:05:21.560 --> 00:05:24.260
+Like I'm always amazed how well this works.
+
+00:05:25.000 --> 00:05:27.660
+And the second thing, the number one, I'm cheating a bit.
+
+00:05:27.800 --> 00:05:29.640
+Like, you know, I do a lot of small PRs.
+
+00:05:29.680 --> 00:05:30.840
+This is how you get the number one.
+
+00:05:30.960 --> 00:05:32.560
+I guess it depends how you measure it, huh?
+
+00:05:32.880 --> 00:05:36.880
+You know, you could always just do one ginormous AI PR
+
+00:05:36.880 --> 00:05:39.580
+that's like a hundred thousand lines of code in your PR
+
+00:05:39.580 --> 00:05:40.700
+and people would love you for it.
+
+00:05:40.700 --> 00:05:42.060
+And you'd be a mega contributor.
+
+00:05:42.340 --> 00:05:43.480
+Oh, yeah.
+
+00:05:43.740 --> 00:05:44.280
+Well, not.
+
+00:05:44.280 --> 00:05:44.840
+He does both.
+
+00:05:45.460 --> 00:05:46.900
+The funny part is Jarek does both.
+
+00:05:47.260 --> 00:05:50.340
+His velocity amazes me or, I don't know, shocks me sometimes.
+
+00:05:50.740 --> 00:05:54.400
+He does massive PRs and also like a lot of tiny ones.
+
+00:05:54.500 --> 00:05:56.820
+And by the time I'm looking, there are like three more out of it.
+
+00:05:56.860 --> 00:05:57.640
+I don't know how he does it.
+
+00:05:57.640 --> 00:06:01.800
+We're going to get to a bit of how much traffic there is on Airflow
+
+00:06:01.800 --> 00:06:03.740
+in terms of like open source activity.
+
+00:06:04.060 --> 00:06:05.280
+It's some, it's a little bit.
+
+00:06:05.640 --> 00:06:09.360
+Before we move on though, Jarek, what is the Apache Software Foundation?
+
+00:06:09.460 --> 00:06:12.560
+What is this Apache thing that you're talking about?
+
+00:06:12.620 --> 00:06:13.900
+And why is Airflow part of it?
+
+00:06:13.900 --> 00:06:14.420
+Very quickly.
+
+00:06:14.520 --> 00:06:15.160
+It's a foundation.
+
+00:06:15.280 --> 00:06:18.020
+One of the oldest foundations, open source foundation in the world.
+
+00:06:18.360 --> 00:06:20.420
+25, 26, 27 years now.
+
+00:06:20.420 --> 00:06:24.440
+I think the main thing about Apache Software Foundation is that it's individual driven.
+
+00:06:24.440 --> 00:06:29.280
+So every member is an individual, not a corporate, as opposed to like Linux Software Foundation,
+
+00:06:29.420 --> 00:06:30.760
+where members are corporates.
+
+00:06:30.760 --> 00:06:38.720
+And people make decisions in both foundation and projects or in PMCs, so-called project management committees.
+
+00:06:39.280 --> 00:06:41.160
+And Airflow is one of the PMCs.
+
+00:06:41.280 --> 00:06:45.580
+So one of the project management committees, which has PMC members.
+
+00:06:45.580 --> 00:06:50.600
+We are both PMC members and we have like 50 other individuals or 60.
+
+00:06:50.700 --> 00:06:52.140
+I can't remember like the number, it changes.
+
+00:06:52.140 --> 00:06:54.040
+Like we are inviting new ones all the time.
+
+00:06:54.040 --> 00:06:59.480
+We make decisions as humans, as individuals, not the corporates who are employing us, for example,
+
+00:06:59.480 --> 00:07:07.280
+because it's a meritocracy-based system where people have merit and the merit doesn't expire
+
+00:07:07.280 --> 00:07:11.460
+and the merit belongs to the individuals, not to the corporates.
+
+00:07:11.460 --> 00:07:18.580
+That's one of the big, like pretty much all the open source software out there, like has some Apache Foundation
+
+00:07:18.580 --> 00:07:20.100
+or Apache component in it.
+
+00:07:20.160 --> 00:07:23.780
+It started with Apache Server 20, 30 years ago almost.
+
+00:07:24.160 --> 00:07:27.240
+But now we have more than 200 PMCs.
+
+00:07:27.740 --> 00:07:32.200
+We just passed the 10,000 committers mark two months ago, I think.
+
+00:07:32.580 --> 00:07:36.400
+So like lots of individuals, lots of people contributing to the foundation.
+
+00:07:36.700 --> 00:07:39.540
+And the main thing about foundation is community over code.
+
+00:07:39.540 --> 00:07:42.800
+So we value building communities more than actually producing code.
+
+00:07:42.800 --> 00:07:47.680
+We believe producing code is just byproduct of great communities working together.
+
+00:07:48.160 --> 00:07:53.220
+And ASF is a charity, is a public good charity in the US registered in Delaware.
+
+00:07:53.500 --> 00:07:55.080
+So we actually cannot be sold.
+
+00:07:55.220 --> 00:07:56.360
+We cannot change our license.
+
+00:07:56.520 --> 00:07:58.980
+Nothing like that can happen because of the status of foundation.
+
+00:07:59.120 --> 00:08:01.300
+And a really positive force for open source, right?
+
+00:08:01.440 --> 00:08:02.000
+Oh, absolutely.
+
+00:08:02.240 --> 00:08:02.640
+Absolutely.
+
+00:08:02.960 --> 00:08:08.580
+When I first got into like learning how ASF works, I said like that it has no chance to work.
+
+00:08:08.580 --> 00:08:10.080
+Like there is no way it works.
+
+00:08:10.420 --> 00:08:11.220
+It's too idealistic.
+
+00:08:11.340 --> 00:08:11.880
+There's no way.
+
+00:08:12.040 --> 00:08:12.400
+Absolutely.
+
+00:08:12.660 --> 00:08:15.940
+And nobody in the foundation who makes decisions gets any money.
+
+00:08:16.080 --> 00:08:17.740
+So like everyone is a volunteer.
+
+00:08:18.060 --> 00:08:23.740
+All the PMC members, all the committers, all the board members, the president, all the VPs,
+
+00:08:23.960 --> 00:08:26.100
+those are all volunteer driven roles.
+
+00:08:26.300 --> 00:08:27.980
+And those are the people who make decisions.
+
+00:08:28.300 --> 00:08:30.860
+We just pay a few people in infrastructure and security.
+
+00:08:31.100 --> 00:08:31.720
+That's basically it.
+
+00:08:31.720 --> 00:08:35.220
+Let's start by just talking about high level abstract.
+
+00:08:35.820 --> 00:08:36.900
+What is a monorepo?
+
+00:08:36.900 --> 00:08:41.780
+I think it's so easy to make that sound like the same thing as a monolith.
+
+00:08:42.020 --> 00:08:44.220
+You're like, oh yeah, monorepo, monolith, same thing, right?
+
+00:08:44.580 --> 00:08:46.140
+And yet you're shaking your head.
+
+00:08:46.220 --> 00:08:50.420
+The first time I met personally a monorepo, maybe I can continue with that, but that was
+
+00:08:50.420 --> 00:08:51.120
+like at Google.
+
+00:08:51.460 --> 00:08:57.760
+I worked at Google years ago and I was surprised coming to Google that all the code there is
+
+00:08:57.760 --> 00:08:58.840
+in a single monorepo.
+
+00:08:59.340 --> 00:09:03.280
+Even though like we have like hundreds of products and all the stuff you see.
+
+00:09:03.280 --> 00:09:04.980
+It's got to be a lot of code, right?
+
+00:09:05.020 --> 00:09:07.180
+Like a giant, giant repo.
+
+00:09:07.420 --> 00:09:09.500
+Like now they have like maybe four.
+
+00:09:09.840 --> 00:09:10.360
+I don't know.
+
+00:09:10.400 --> 00:09:11.680
+Like I've heard some stories.
+
+00:09:11.980 --> 00:09:13.440
+I haven't worked there for a long time now.
+
+00:09:13.660 --> 00:09:17.940
+But that for me, that was a sign that like you don't really have to split and dice and
+
+00:09:17.940 --> 00:09:21.380
+slice your repositories into many, many small ones.
+
+00:09:21.460 --> 00:09:27.840
+Even if you have like non-monolithical product, it all can be kept in a single source, single
+
+00:09:27.840 --> 00:09:32.720
+repository, separate source trees maybe, separate like we'll talk about how we do it in
+
+00:09:32.720 --> 00:09:32.960
+Airflow.
+
+00:09:33.080 --> 00:09:38.980
+But it's a way how you can bind it together and have it tested together and have it developed
+
+00:09:38.980 --> 00:09:39.480
+together.
+
+00:09:39.480 --> 00:09:43.800
+Even though each piece is pretty much separate and you can work on them separately.
+
+00:09:44.160 --> 00:09:44.900
+That's the monorepo.
+
+00:09:45.120 --> 00:09:49.440
+As opposed to multirepo, which is like when you have multiple repositories consisting of
+
+00:09:49.440 --> 00:09:51.340
+whatever comes up as a product.
+
+00:09:51.840 --> 00:09:52.140
+Yeah.
+
+00:09:52.240 --> 00:09:57.180
+Everything that Jarek said plus just a small addition, which is each of the component or
+
+00:09:57.180 --> 00:10:02.520
+the tiny bit of a monorepo can have its own build artifacts, its dependencies.
+
+00:10:03.100 --> 00:10:06.180
+It can also have its own release cycle or a release vehicle.
+
+00:10:06.560 --> 00:10:10.920
+That's the only addition, but everything is put together as a big puzzle just to keep the
+
+00:10:10.920 --> 00:10:11.400
+puzzle together.
+
+00:10:11.480 --> 00:10:16.220
+You know, not every monorepo is Python, but in Python terms, it could have its own pyproject.toml,
+
+00:10:16.560 --> 00:10:18.000
+potentially its own virtual environment.
+
+00:10:18.000 --> 00:10:24.960
+The nomenclature irony of this is that often the monorepo, I think, makes more sense when
+
+00:10:24.960 --> 00:10:27.660
+you are working with lots of small parts, right?
+
+00:10:27.880 --> 00:10:31.680
+Where the monolith, maybe it has a couple of things, but it doesn't depend real deeply.
+
+00:10:31.880 --> 00:10:36.900
+The more interconnections you have and the harder it is to manage those versions, the more something
+
+00:10:36.900 --> 00:10:38.180
+like this makes sense, right?
+
+00:10:38.180 --> 00:10:45.220
+People really make a connection between isolated work on part of the system into having to have
+
+00:10:45.220 --> 00:10:48.700
+separate repository for that, which is completely not the case.
+
+00:10:48.880 --> 00:10:53.140
+Like you can actually have an isolated sub part of the repository, even if it's Git.
+
+00:10:53.200 --> 00:10:56.600
+Git does have like, you have submodules and subrepos and all that stuff.
+
+00:10:56.880 --> 00:11:01.720
+But even like in a single Git repository, you can easily have like start working and focusing
+
+00:11:01.720 --> 00:11:06.720
+on a small part of the whole monorepo and only care about that.
+
+00:11:06.720 --> 00:11:09.160
+That's what the monorepo is.
+
+00:11:09.280 --> 00:11:10.140
+I'm going to go ahead and put it out there.
+
+00:11:10.220 --> 00:11:12.980
+I'm not a big fan of microservice architectures.
+
+00:11:13.340 --> 00:11:19.920
+I kind of find it's trading code complexity for DevOps and deployment complexity.
+
+00:11:20.100 --> 00:11:23.560
+And I think we have better tools to manage code complexity than DevOps complexity.
+
+00:11:24.080 --> 00:11:30.340
+But something like this does help you manage those kinds of deployments as well better, right?
+
+00:11:30.460 --> 00:11:33.420
+I use the term mini-services, not microservices.
+
+00:11:33.620 --> 00:11:35.020
+Microservices is just too much.
+
+00:11:35.020 --> 00:11:39.300
+But then you can have a lot of mini-services, a number of mini-services, but not micro.
+
+00:11:39.700 --> 00:11:42.320
+Like micro was just too much of a mainstream.
+
+00:11:42.440 --> 00:11:43.260
+I can get on board with that.
+
+00:11:43.400 --> 00:11:44.040
+Amogh, what do you think?
+
+00:11:44.100 --> 00:11:46.180
+I like that as well, mini-services.
+
+00:11:46.440 --> 00:11:47.460
+Maybe you should coin that too.
+
+00:11:47.540 --> 00:11:49.040
+It's the microservices that are too small.
+
+00:11:49.080 --> 00:11:53.340
+It feels to me like the equivalent of when you're trying to write unit tests and you're
+
+00:11:53.340 --> 00:11:56.820
+like, oh, what if I get a customer and I set their first name?
+
+00:11:57.000 --> 00:11:58.580
+And then I check that their first name is set.
+
+00:11:58.660 --> 00:11:59.340
+Like, what are you doing?
+
+00:11:59.380 --> 00:12:00.880
+You don't need to check that assignment works.
+
+00:12:00.960 --> 00:12:03.200
+This is too, you're just too much in the weeds.
+
+00:12:03.260 --> 00:12:03.800
+You know what I mean?
+
+00:12:03.900 --> 00:12:06.460
+This is what AI agents do now all the time.
+
+00:12:06.680 --> 00:12:07.660
+Like, no.
+
+00:12:08.000 --> 00:12:09.240
+Yeah, think of the code coverage.
+
+00:12:09.400 --> 00:12:10.460
+Just think of the code coverage.
+
+00:12:10.560 --> 00:12:10.820
+Come on.
+
+00:12:10.980 --> 00:12:12.140
+You've got some goals to hit.
+ +00:12:12.180 --> 00:12:13.740 +You said 80% code coverage. + +00:12:13.740 --> 00:12:14.820 +It's on top of it. + +00:12:14.920 --> 00:12:15.000 +Yeah. + +00:12:15.080 --> 00:12:16.120 +That sets the stage. + +00:12:16.300 --> 00:12:22.820 +Let's talk a little bit about specifically how Apache Airflow has come to need this, basically. + +00:12:23.100 --> 00:12:23.260 +Right? + +00:12:23.380 --> 00:12:27.160 +Like, you shared with me the pulse, the GitHub pulse for Apache Airflow. + +00:12:27.680 --> 00:12:33.380 +And it's kind of worth looking at just how much open source interest and traffic there + +00:12:33.380 --> 00:12:33.580 +is. + +00:12:33.720 --> 00:12:37.280 +Who wants to kind of summarize this weekly pulse here? + +00:12:37.440 --> 00:12:40.800 +This is not the best week in terms of the number of comments. + +00:12:40.800 --> 00:12:43.300 +We have had even more red, but in the week of... + +00:12:43.300 --> 00:12:43.380 +Wow. + +00:12:43.380 --> 00:12:44.700 +Just one of those weeks. + +00:12:45.440 --> 00:12:45.880 +Yeah. + +00:12:46.120 --> 00:12:47.300 +One of the usual weeks. + +00:12:47.640 --> 00:12:52.720 +Between Feb 3 and Feb 10, we have had about 310 active pull requests. + +00:12:53.280 --> 00:12:57.540 +So, you can imagine that's about 40 plus pull requests a day. + +00:12:57.800 --> 00:13:04.080 +A lot of them are being assisted by the AI revolution going on, but that's a lot of pull requests. + +00:13:04.380 --> 00:13:06.760 +And we have merged about 200 of them. + +00:13:06.760 --> 00:13:08.140 +About 100 are open. + +00:13:08.420 --> 00:13:09.800 +And similarly with issues, right? + +00:13:10.240 --> 00:13:11.340 +35 new issues. + +00:13:11.620 --> 00:13:12.660 +Five issues per day. + +00:13:12.660 --> 00:13:13.940 +That's a lot of traffic. + +00:13:13.940 --> 00:13:19.420 +So, you can imagine the amount of review pressure each of the maintainers has here. 
+
+00:13:19.820 --> 00:13:26.100
+There's 300 pull requests spread across, I don't know, 120, 130, maybe 140 distributions.
+
+00:13:26.100 --> 00:13:33.260
+And each of the distributions having like a swim lane owner who is actively trying to take a look at these pull requests.
+
+00:13:33.260 --> 00:13:36.260
+So, it's just another week to be very honest.
+
+00:13:36.260 --> 00:13:39.420
+It's more than 25 PRs a day, including weekends.
+
+00:13:39.420 --> 00:13:40.420
+How many of these people are high value?
+
+00:13:40.420 --> 00:13:43.420
+How many of these PRs are high value?
+
+00:13:43.420 --> 00:13:46.180
+I guess I'm trying to get the sense of like, how much does this get accepted?
+
+00:13:46.180 --> 00:13:50.420
+Are these just people throwing stuff out there that doesn't make sense for the direction of Airflow?
+
+00:13:50.420 --> 00:13:55.700
+Well, those merged all make sense because they are reviewed and merged by Airflow maintainers.
+
+00:13:55.700 --> 00:13:56.700
+And we are very serious about that.
+
+00:13:56.700 --> 00:14:01.700
+So, like we don't merge anything that doesn't pass our bar, which is like very high and extremely high.
+
+00:14:01.700 --> 00:14:11.700
+Like we have 170 pre-commit hooks which are checking if the PR is doing what we, if the code is doing what it was supposed to be doing and if it's architected properly.
+
+00:14:11.700 --> 00:14:25.700
+And on top of that, we have individuals, people like, like Amogh, myself, and maybe 50 other PMC members and committers who are reviewing it and making their comments and know the system enough to direct people.
+
+00:14:25.700 --> 00:14:26.700
+So, they may make sense.
+
+00:14:26.700 --> 00:14:34.700
+We do have recently, and that was a recurring theme at the FOSDEM conference last week when I was there, about like AI generated contributions.
+
+00:14:34.700 --> 00:14:38.700
+And many of the AI generated contributions are not the best quality.
+
+00:14:38.700 --> 00:14:41.700
+It's not like AI is bad quality.
+
+00:14:41.700 --> 00:14:45.700
+Many of those are easier to produce and they might have bad quality.
+
+00:14:45.700 --> 00:14:50.700
+So, we are now learning how to filter them out and how to handle them quickly.
+
+00:14:50.700 --> 00:14:53.700
+But those are the actual high value PRs that we merged.
+
+00:14:53.700 --> 00:15:01.700
+In terms of numbers, if I may, it would be maybe a third of the open pull requests. That's a nice general trend.
+
+00:15:01.700 --> 00:15:02.700
+That's pretty good, honestly.
+
+00:15:02.700 --> 00:15:03.700
+Yep.
+
+00:15:03.700 --> 00:15:05.700
+We have some guidelines published very recently.
+
+00:15:05.700 --> 00:15:09.700
+And due to that, we have seen a dip in such low quality PRs.
+
+00:15:09.700 --> 00:15:22.700
+We published some guidelines in our contribution guides about what will be the action taken if, you know, bad quality PRs are raised, or PRs are raised where the author does not know the context, but the AI does.
+
+00:15:22.700 --> 00:15:24.700
+I don't want to go down this rat hole.
+
+00:15:24.700 --> 00:15:27.700
+People hear this enough lately, but I just, it's been in the news lately.
+
+00:15:27.700 --> 00:15:32.700
+Open source projects have been kind of getting a barrage of AI submissions.
+
+00:15:32.700 --> 00:15:35.700
+And I think that comes in a couple of flavors.
+
+00:15:35.700 --> 00:15:40.700
+One, people who just want to get their name listed as a contributor, maybe it helps them with their job or whatever.
+
+00:15:40.700 --> 00:15:45.700
+So there's like a small incentive there, but it's been really bad for bug bounties.
+
+00:15:45.700 --> 00:15:52.700
+Like curl closed its bug bounty program because people were trying to make the $50 or $250 by finding some issue with AI.
+
+00:15:52.700 --> 00:15:56.700
+Is that a problem for you all just taking the pulse of a big project like that?
+
+00:15:56.700 --> 00:15:57.700
+It is.
+
+00:15:57.700 --> 00:16:03.700
+I actually had a talk about that at the Global Vulnerability Intelligence Platform Summit just before FOSDEM.
+
+00:16:03.700 --> 00:16:09.700
+So that was exactly like, I even quoted Daniel Stenberg and I met him there at FOSDEM, which like, that was really cool.
+
+00:16:09.700 --> 00:16:21.700
+There are some different motivations of people who are submitting those AI issues, and we should fight them in different ways, with different approaches, or like, you know, respond to those motivations.
+
+00:16:21.700 --> 00:16:22.700
+Somehow we have some ideas.
+
+00:16:22.700 --> 00:16:27.700
+We have an open discussion in the GitHub maintainers list right now.
+
+00:16:27.700 --> 00:16:32.700
+And GitHub is trying to address it by like just discussing what they can do right now.
+
+00:16:32.700 --> 00:16:34.700
+And that's the highest priority for them.
+
+00:16:34.700 --> 00:16:42.700
+We have a discussion with OSSF for security kind of guidelines or policies for open source maintainers, how to deal with those issues.
+
+00:16:42.700 --> 00:16:53.700
+And I'm sure we will work out some ways and toolings and most of all processes and like being assertive is one thing, like just saying no when the report doesn't meet all the bars immediately.
+
+00:16:53.700 --> 00:17:11.700
+And, you know, directing people to the description is good enough of a, you know, barrier for, you know, getting kind of completely broken PRs because we have to just make it more expensive for the reporters than for the maintainers to diagnose the issues or decide if the issues are bad or good.
+
+00:17:11.700 --> 00:17:17.700
+And I'm not necessarily saying that there's something inherently bad because AI wrote some of the code rather than a person.
+
+00:17:17.700 --> 00:17:20.700
+AI can write really good code, better than a lot of people I've seen.
+
+00:17:20.700 --> 00:17:28.700
+But it has this sort of shotgun effect often of just like, I'm going to change all these files and it's not as focused and clear.
+
+00:17:28.700 --> 00:17:31.700
+A lot of times it just, it doesn't get the Zen of it.
+
+00:17:31.700 --> 00:17:33.700
+You know, Amogh, what do you think?
+
+00:17:33.700 --> 00:17:40.700
+It'll generate code, which it thinks is good, but we don't really know the ripple effect and we want to avoid such things.
+
+00:17:40.700 --> 00:17:44.700
+Such a long-living app with lots of complexity, right?
+
+00:17:44.700 --> 00:17:50.700
+And we all are using the AI for generating the code, to be honest, like so like most of my code.
+
+00:17:50.700 --> 00:17:56.700
+You should. Yeah, it's incredible. It's, it's, I pulled up this graphic here and I'll link to it in the show notes.
+
+00:17:56.700 --> 00:18:07.700
+Just to give people a sense, I got this little utility that I released this week called Tallymon, which like analyzes code and gives you sort of more of a breakdown than just like this many lines or whatever.
+
+00:18:07.700 --> 00:18:12.700
+So I want to just highlight, maybe you all like can riff on this a little bit to give a sense.
+
+00:18:12.700 --> 00:18:23.700
+So, 1.2 million lines of Python, 918,000 excluding comments, maybe a little overcounting the way this thing works, but still 200,000 lines of reStructuredText.
+
+00:18:23.700 --> 00:18:28.700
+The one that really stood out to me, 81,000 lines of YAML and 16,000 lines of TOML.
+
+00:18:28.700 --> 00:18:32.700
+You guys, that's impressive. And you know what?
+
+00:18:32.700 --> 00:18:38.700
+Hat tip to just a just a sprinkle, just a hint of Java at 42 lines of Java.
+
+00:18:38.700 --> 00:18:43.700
+But, you know, almost a million, just over a million lines of code without comments.
+
+00:18:43.700 --> 00:18:45.700
+That's a big project. What do you think?
+
+00:18:45.700 --> 00:18:47.700
+What happened when you joined?
+
+00:18:47.700 --> 00:18:50.700
+I don't know. I think it was much less.
+
+00:18:50.700 --> 00:18:51.700
+You did contribute a lot.
+
+00:18:51.700 --> 00:18:56.700
+You can imagine so because of the number of packages we have, as in the monorepo discussion from earlier.
+
+00:18:56.700 --> 00:19:00.700
+We have a lot of packages and the YAML might surprise you at first.
+
+00:19:00.700 --> 00:19:05.700
+But if you actually go and see why the YAML, it's mostly for our providers.
+
+00:19:05.700 --> 00:19:08.700
+So integration with other systems is something we call as providers.
+
+00:19:08.700 --> 00:19:11.700
+And the spec of the providers is written in YAML.
+
+00:19:11.700 --> 00:19:14.700
+And TOML, sure, we'll come to it very, very, very soon.
+
+00:19:14.700 --> 00:19:23.700
+That's kind of why I pulled this up, actually, is the TOML aspect is quite interesting, which leaves us with that number as we move on.
+
+00:19:23.700 --> 00:19:28.700
+16,000 lines of TOML. That's a lot of pyproject.toml going on right there, folks.
+
+00:19:28.700 --> 00:19:31.700
+Oh, yes. And lots of it is generated, actually.
+
+00:19:31.700 --> 00:19:37.700
+So like, because we actually generate quite a lot of the YAML and TOML that we have and keep it in the repo.
+
+00:19:37.700 --> 00:19:39.700
+So we don't want to regenerate every time.
+
+00:19:39.700 --> 00:19:43.700
+So like, we don't write YAML by hand.
+
+00:19:43.700 --> 00:19:50.700
+Maybe we can start by introducing this by just giving a shout out to this series that you wrote over here on Medium.
+
+00:19:50.700 --> 00:19:55.700
+Jarek, modern Python repo for Apache Airflow, parts one through four.
+
+00:19:55.700 --> 00:20:00.700
+Yes, I initially started discussing this blog post idea with a few people.
+
+00:20:00.700 --> 00:20:04.700
+Like, you know, like people are busy and I couldn't get people like to write it.
+
+00:20:04.700 --> 00:20:06.700
+So I decided to write it myself.
+
+00:20:06.700 --> 00:20:09.700
+Well, with a lot of AI help, of course.
+
+00:20:09.700 --> 00:20:11.700
+It's not that everything is written by hand.
+
+00:20:11.700 --> 00:20:17.700
+And when I wrote it, I realized it's like too big and I had to split it into four.
+
+00:20:17.700 --> 00:20:29.700
+But the idea was like to document what we've done, because I think that a lot of people are struggling with like monorepo versus multirepo, or like how they should do their repository when the project grows.
+
+00:20:29.700 --> 00:20:37.700
+And there were lots of discussions in the past, including here, one of the, you know, one of the podcasts of yours was monorepo versus multirepo.
+
+00:20:37.700 --> 00:20:50.700
+And I can't remember who that was, but there was discussion about like going back and forth and like finding that people sometimes go back and then go forth and like in different directions because there are different problems or approaches.
+
+00:20:50.700 --> 00:21:03.700
+So I just wanted to document the reasoning why we are doing it, like why it's possible now because of the packaging ecosystem maturing for Python and uv and other tools coming into the space.
+
+00:21:03.700 --> 00:21:12.700
+And then the last part was like really the kind of a little bit innovative approach that we do where the tooling is still not catching up with what we need and what we did.
+
+00:21:12.700 --> 00:21:16.700
+So those are the kind of history why we are doing it.
+
+00:21:16.700 --> 00:21:22.700
+The, you know, the packaging, the automated verification with pre-commit.
+
+00:21:22.700 --> 00:21:23.700
+So that was the third part.
+
+00:21:23.700 --> 00:21:29.700
+And the fourth part was about like the shared libraries, an innovative concept that we added.
+
+00:21:29.700 --> 00:21:34.700
+I'll link to the series as well as to a talk that you gave at FOSDEM that just got published, right?
+
+00:21:34.700 --> 00:21:35.700
+Yes.
+
+00:21:35.700 --> 00:21:36.700
+Yes.
+
+00:21:36.700 --> 00:21:40.700
+And, they are, they have an amazing system of recording and publishing stuff.
+
+00:21:40.700 --> 00:21:44.700
+Like, like for a volunteer driven conference, a thousand speakers.
+
+00:21:44.700 --> 00:21:45.700
+Oh, that was, that's amazing.
+
+00:21:45.700 --> 00:21:46.700
+That works.
+
+00:21:46.700 --> 00:21:48.700
+Like probably some automation going on there.
+
+00:21:48.700 --> 00:21:56.700
+Let's talk a little bit about, I guess, the problems that you ran into, because initially there were some challenges with the standards and tooling not being there.
+
+00:21:56.700 --> 00:22:05.700
+And you actually, one of the takeaways, if people read the series or watch the talk, is you actually had to work with some of the tool providers to make this possible.
+
+00:22:05.700 --> 00:22:09.700
+So not only is it like, well, the tools have changed so we could do this.
+
+00:22:09.700 --> 00:22:17.700
+It's you all have changed the tools a little bit through, you know, working closely, like, hey, we've got this 1 million line project.
+
+00:22:17.700 --> 00:22:20.700
+So not only is it a hundred sub modules or more. Help.
+
+00:22:20.700 --> 00:22:22.700
+Like, adjust your tools to support this.
+
+00:22:22.700 --> 00:22:23.700
+Help me make this work.
+
+00:22:23.700 --> 00:22:24.700
+Right.
+
+00:22:24.700 --> 00:22:25.700
+What were some of the problems?
+
+00:22:25.700 --> 00:22:34.700
+Let me start with this cooperation and maybe, you know, Amogh can also explain like what was before and after, because like he experienced that firsthand as a, as a user of this kind of repository structure.
+
+00:22:34.700 --> 00:22:39.700
+But for me, the idea was like, I was working on it for years.
+
+00:22:39.700 --> 00:22:44.700
+Like when we went to Airflow 2, five years ago, or four years ago, I can't remember.
+
+00:22:44.700 --> 00:22:45.700
+That's a long time.
+
+00:22:45.700 --> 00:22:55.700
+And we didn't have all the tooling and we had to do pretty much everything that we do now with the, with monorepo and uv, by hand, by bash scripts by that time.
+
+00:22:55.700 --> 00:22:57.700
+By that time, by that time, crazy.
+
+00:22:57.700 --> 00:23:04.700
+So like, if you ran it three years ago on the code, you would see more than 10,000 lines of bash code, which I wrote.
+
+00:23:04.700 --> 00:23:05.700
+But we, we since removed.
+
+00:23:05.700 --> 00:23:06.700
+We since removed.
+
+00:23:06.700 --> 00:23:07.700
+That is not joyful.
+
+00:23:07.700 --> 00:23:08.700
+That doesn't spark joy.
+
+00:23:08.700 --> 00:23:11.700
+That's why we removed it with an Outreachy internship actually.
+
+00:23:11.700 --> 00:23:19.700
+And shout out to Edit and, and Borna who were our Outreachy mentors who helped us to convert it to, to Python, which was really helpful.
+
+00:23:19.700 --> 00:23:20.700
+That's how it started.
+
+00:23:20.700 --> 00:23:30.700
+Now the tooling need, because we grew, we wanted to have more providers, more integrations, and it already was quite difficult to manage when they were all part of a single distribution.
+
+00:23:30.700 --> 00:23:34.700
+So we had to split into many distributions, 60, I think, at the beginning.
+
+00:23:34.700 --> 00:23:36.700
+Now we have more than a hundred.
+
+00:23:36.700 --> 00:23:43.700
+Now, when we did that, we had to do it all manually, and like working with that was like really cumbersome.
+
+00:23:43.700 --> 00:23:45.700
+Maybe, you know, like I can switch to, to Amogh.
+
+00:23:45.700 --> 00:23:50.700
+So he can say like the past experience and new experience because like he experienced the change himself.
+
+00:23:50.700 --> 00:23:51.700
+Yeah.
+
+00:23:51.700 --> 00:23:55.700
+The, the past experience was scary, to say the least.
+
+00:23:55.700 --> 00:24:05.700
+Whenever I switch branches or have to rebase for whatever reason, I had a nightmare, a very bad time trying to, you know, package things together and try to run something.
+
+00:24:05.700 --> 00:24:09.700
+And I think Jarek found me often, you know, ranting on the Slack channels that, Hey, this doesn't work.
+
+00:24:09.700 --> 00:24:10.700
+Hey, that doesn't work.
+
+00:24:10.700 --> 00:24:11.700
+What do we do?
+
+00:24:11.700 --> 00:24:13.700
+Now it's, it's very easy.
+
+00:24:13.700 --> 00:24:19.700
+It's, it's effortless, almost effortless compared to what we had years, maybe like five years ago, four years ago.
+
+00:24:19.700 --> 00:24:20.700
+Yeah. Amazing.
+
+00:24:20.700 --> 00:24:21.700
+How does GitHub deal?
+
+00:24:21.700 --> 00:24:24.700
+I was the only one who actually managed the whole thing for years.
+
+00:24:24.700 --> 00:24:27.700
+And I was like overwhelmed as well when people have problems, of course.
+
+00:24:27.700 --> 00:24:31.700
+So then the change that we've done was not only with the tooling.
+
+00:24:31.700 --> 00:24:39.700
+And as you mentioned, we were actually cooperating with Charlie from Astral, Charlie Marsh and with Joe from FEC because we had this need.
+
+00:24:39.700 --> 00:24:46.700
+We had it implemented ourselves and then they could look at how we've done that and they could implement it properly in their tooling.
+
+00:24:46.700 --> 00:24:53.700
+And we've been like exchanging the, you know, like Charlie was even interviewing me at some point of time, like, what are our needs?
+
+00:24:53.700 --> 00:25:00.700
+So I have for a long time, I have this, this motto that the best way to foresee the future is to shape it.
+
+00:25:00.700 --> 00:25:08.700
+And like, so we did shape the future by, you know, talking to those tool providers so that they can, or builders so that they could build it for us and work with us.
+
+00:25:08.700 --> 00:25:10.700
+And we helped them to test them and everything like that.
+
+00:25:10.700 --> 00:25:27.700
+But also it was like listening to Amogh and other contributors, like all the problems they had, and then when I solved it, I didn't solve it alone with the new tooling, but we also engaged more people from the, from the team, like Amogh and a few other active contributors.
+
+00:25:27.700 --> 00:25:29.700
+And they were actually part of the whole process of conversion.
+
+00:25:29.700 --> 00:25:31.700
+And they are now part of the team.
+
+00:25:31.700 --> 00:25:36.700
+And now we can have this podcast while things are being broken in Airflow right now.
+
+00:25:36.700 --> 00:25:38.680
+And somebody is probably fixing it right as we speak.
+
+00:25:38.700 --> 00:25:40.700
+So like, not me anymore.
+
+00:25:40.700 --> 00:25:42.700
+So that's, those are all things that are really great.
+
+00:25:42.700 --> 00:25:46.700
+This portion of Talk Python To Me is brought to you by us.
+
+00:25:46.700 --> 00:25:50.700
+I want to tell you about a course I put together that I'm really proud of.
+
+00:25:50.700 --> 00:25:53.700
+Agentic AI programming for Python developers.
+
+00:25:53.700 --> 00:26:00.700
+I know a lot of you have tried AI coding tools and come away thinking, well, this is more hassle than it's worth.
+
+00:26:00.700 --> 00:26:03.700
+And honestly, all the vibe coding hype isn't helping.
+
+00:26:03.700 --> 00:26:07.700
+It's a smoke screen that hides what these tools can actually do.
+
+00:26:07.700 --> 00:26:09.700
+This course is about agentic engineering.
+
+00:26:09.700 --> 00:26:19.700
+Applying real software engineering practices with AI that understands your entire code base, runs your tests, and builds complete features under your direction.
+
+00:26:19.700 --> 00:26:27.700
+I've used these techniques to ship real production code across Talk Python, Python Bytes, and completely new projects.
+
+00:26:27.700 --> 00:26:33.700
+I migrated an entire CSS framework on a production site with thousands of lines of HTML in a few hours.
+
+00:26:33.700 --> 00:26:38.700
+I shipped a new search feature with caching and async in under an hour.
+
+00:26:38.700 --> 00:26:47.700
+I built a complete CLI tool for Talk Python from scratch, tested, documented, and published to PyPI in an afternoon.
+
+00:26:47.700 --> 00:26:51.700
+Real projects, real production code, both greenfield and legacy.
+
+00:26:51.700 --> 00:26:53.700
+No toy demos, no fluff.
+
+00:26:53.700 --> 00:27:00.700
+I'll show you the guardrails, the planning techniques, and the workflows that turn AI into a genuine engineering partner.
+
+00:27:00.700 --> 00:27:04.700
+Check it out at talkpython.fm/agentic dash engineering.
+
+00:27:04.700 --> 00:27:07.700
+That's talkpython.fm/agentic dash engineering.
+
+00:27:07.700 --> 00:27:10.700
+The link is in your podcast player's show notes.
+
+00:27:10.700 --> 00:27:18.700
+How does GitHub deal with so many files and such a big project? Is it fine or is it a challenge?
+
+00:27:18.700 --> 00:27:21.700
+Except yesterday, when half of the time GitHub was not available.
+
+00:27:21.700 --> 00:27:22.700
+Except yesterday.
+
+00:27:22.700 --> 00:27:27.700
+Yeah, for people who don't know, yesterday morning, at least morning US time, GitHub was having a moment.
+
+00:27:27.700 --> 00:27:29.700
+Like, I couldn't clone stuff.
+
+00:27:29.700 --> 00:27:35.700
+I pulled up a random page on GitHub and got the 503 unicorn.
+
+00:27:35.700 --> 00:27:36.700
+It was not good, right?
+
+00:27:36.700 --> 00:27:38.700
+Besides that, not including that time.
+
+00:27:38.700 --> 00:27:41.700
+The unicorn is actually a little bit, like, looking kind of angry at you.
+
+00:27:41.700 --> 00:27:44.700
+That's one of the observations I had from yesterday.
+
+00:27:44.700 --> 00:27:47.700
+I saw it so many times that it's like, it doesn't look nice.
+
+00:27:47.700 --> 00:27:48.700
+But maybe that's GitHub.
+
+00:27:48.700 --> 00:27:49.700
+I agree.
+
+00:27:49.700 --> 00:27:50.700
+That's not a great error page.
+
+00:27:50.700 --> 00:27:55.700
+Like, some error pages are amazing, where it's like, you know, the coyote fell off of a cliff.
+
+00:27:55.700 --> 00:27:56.700
+Woo!
+
+00:27:56.700 --> 00:27:58.700
+You know, that one just looks like it's angry back at you.
+
+00:27:58.700 --> 00:28:00.700
+Besides that, it's perfect.
+
+00:28:00.700 --> 00:28:04.700
+Like, it works seamlessly, no problems whatsoever with the size, with the numbers.
+
+00:28:04.700 --> 00:28:06.700
+Like, we are very, very happy in general.
+
+00:28:06.700 --> 00:28:08.700
+And of course, things like that happen.
+
+00:28:08.700 --> 00:28:09.700
+There is nothing wrong.
+
+00:28:09.700 --> 00:28:12.700
+Like, there is something wrong, but it's not like it happens all the time.
+
+00:28:12.700 --> 00:28:13.700
+Not really, with GitHub.
+
+00:28:13.700 --> 00:28:14.700
+It's super rare.
+
+00:28:14.700 --> 00:28:15.700
+GitHub is an incredible service.
+
+00:28:15.700 --> 00:28:22.700
+I mean, I know there's been some grief about GitHub Actions, but that's a different, different conversation.
+
+00:28:22.700 --> 00:28:23.700
+Right?
+
+00:28:23.700 --> 00:28:31.700
+So let's talk next about how the packaging standards have changed and how basically some of those things have made this possible.
+
+00:28:31.700 --> 00:28:45.700
+And so in your talk, you pulled up a bunch of different PEPs, nine of them or something like that, about recent packaging standards and different things like that, that have made possible the structure that you're working with and the tools that do it.
+
+00:28:45.700 --> 00:28:50.700
+Do you want to maybe highlight, either of you, some of these things that stand out as, this one is really important?
+
+00:28:50.700 --> 00:29:07.700
+The one which is maybe not super related to the monorepo, but it actually helped us a lot, is PEP 723, inline script metadata, which is like one of the biggest successes and the most widely used PEPs I've seen implemented.
+
+00:29:07.700 --> 00:29:21.700
+It caught on very, very quickly. It allows us to, you know, embed inline script metadata into the Python scripts themselves, which is something that we've been dreaming of for years, especially for this kind of tooling, the CI environment, et cetera, et cetera.
+
+00:29:21.700 --> 00:29:22.700
+This is really, really helpful.
+
+00:29:22.700 --> 00:29:24.700
+So that's the one that I would like to highlight.
+
+00:29:24.700 --> 00:29:46.700
+But, you know, I read all of them many times, all the PEPs, and they are difficult things to read and understand. But we actually did all that we could to, you know, be fully compliant, not only with the specification of those PEPs, but also with the kind of spirit of the specification, because sometimes things are not very precisely described and there are some interpretations and stuff.
+
+00:29:46.700 --> 00:29:50.700
+So we just made sure, and this is our goal as well.
+
+00:29:50.700 --> 00:29:56.700
+Like, we just make sure that all the PEP standards that are being published are actually very meticulously followed.
+
+00:29:56.700 --> 00:30:00.700
+And we just try to adapt to any changes that are coming in the environment.
+
+00:30:00.700 --> 00:30:06.700
+Because we know how difficult it is if people are sticking to the old ways, and that makes it difficult for Python maintainers.
+
+00:30:06.700 --> 00:30:07.700
+Amogh, any other thoughts?
+
+00:30:07.700 --> 00:30:15.700
+This one is a particularly important one for us also, because it simplifies our pre-commit configurations, where earlier we had to,
+
+00:30:15.700 --> 00:30:19.700
+you know, specify the dependencies as required.
+
+00:30:19.700 --> 00:30:23.700
+So, like, whatever the particular version was, but now it's all in the script.
+
+00:30:23.700 --> 00:30:28.700
+And the pre-commit config remains as clean as it could be, just with the hook name and,
+
+00:30:28.700 --> 00:30:34.700
+you know, the regex for the file filter and minimal configuration for it to work well.
+
+00:30:34.700 --> 00:30:37.700
+And I think dependency groups is also the other PEP.
+
+00:30:37.700 --> 00:30:40.700
+I don't recall the name, but I recall the number.
+
+00:30:40.700 --> 00:30:41.700
+I think it's six.
+
+00:30:41.700 --> 00:30:43.700
+Oh, I can't remember all the numbers, but one of those.
+
+00:30:43.700 --> 00:30:46.700
+That would be 735, folks, 735.
+
+00:30:46.700 --> 00:30:48.700
+That's also particularly nice for us.
+
+00:30:48.700 --> 00:30:55.700
+We can define the dependency groups in our pyproject.toml files, and it's really nice how it works with uv.
+
+00:30:55.700 --> 00:30:59.700
+We're very happy with this particular PEP for dependency groups, as well as the inline scripts.
+
+00:30:59.700 --> 00:31:00.700
+Right.
+
+00:31:00.700 --> 00:31:01.700
+The inline scripts are cool.
+
+00:31:01.700 --> 00:31:08.700
+I, you know, especially with uv these days, it really makes running some kind of Python code so much easier.
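[Editor's note: as a sketch of what a PEP 735 table looks like in a pyproject.toml (the group contents here are illustrative, not Airflow's actual configuration):]

```toml
[project]
name = "my-distribution"
version = "0.1.0"

# PEP 735 dependency groups: development-only dependencies that are
# not part of the package metadata published to PyPI.
[dependency-groups]
test = ["pytest>=8", "pytest-asyncio"]
lint = ["ruff"]
# Groups can include other groups.
dev = [{ include-group = "test" }, { include-group = "lint" }]
```

[By default, `uv sync` installs the `dev` group along with the project, which is the behavior mentioned in the conversation: your development tools come along without asking for them.]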
+
+00:31:08.700 --> 00:31:12.700
+It's almost as if everything is standard library.
+
+00:31:12.700 --> 00:31:13.700
+I can give somebody a file.
+
+00:31:13.700 --> 00:31:14.700
+I can say the way you run it.
+
+00:31:14.700 --> 00:31:15.700
+No, no, no, no, no.
+
+00:31:15.700 --> 00:31:16.700
+Don't.
+
+00:31:16.700 --> 00:31:18.700
+I know it looks like you say Python, but don't say that.
+
+00:31:18.700 --> 00:31:21.700
+You say uv run this, and that's it.
+
+00:31:21.700 --> 00:31:22.700
+They didn't even have to have Python.
+
+00:31:22.700 --> 00:31:26.700
+It might need 10 dependencies and so on, but it doesn't matter.
+
+00:31:26.700 --> 00:31:27.700
+Right.
+
+00:31:27.700 --> 00:31:28.700
+Yeah.
+
+00:31:28.700 --> 00:31:29.700
+And it's a standard.
+
+00:31:29.700 --> 00:31:31.700
+It makes it so, you know, other tools are doing the same, or hatch run.
+
+00:31:31.700 --> 00:31:32.700
+That's the same.
+
+00:31:32.700 --> 00:31:37.700
+That's like, yeah, there is even, like, support for inline script metadata just released in latest
+
+00:31:37.700 --> 00:31:38.700
+tip 26.
+
+00:31:38.700 --> 00:31:44.700
+So it's all good because of the standards, and not because a single particular tool does it in an opinionated way.
+
+00:31:44.700 --> 00:31:46.700
+So this is really, really, really cool.
+
+00:31:46.700 --> 00:31:52.700
+And there is one big benefit of those kinds of PEPs, and this one in particular, inline script metadata.
+
+00:31:52.700 --> 00:31:54.700
+It's like, we have less YAML.
+
+00:31:54.700 --> 00:31:55.700
+Yeah.
+
+00:31:55.700 --> 00:31:58.700
+You still have a lot of YAML, but less is better.
+
+00:31:58.700 --> 00:31:59.700
+We have a lot still.
+
+00:31:59.700 --> 00:32:00.700
+We can't get away from that.
+
+00:32:00.700 --> 00:32:03.700
+It's better than it was.
+
+00:32:03.700 --> 00:32:04.700
+Yeah.
+
+00:32:04.700 --> 00:32:09.700
+And so the dependency groups are, like, you know, for dev or for tests or something like that.
+
+00:32:09.700 --> 00:32:10.700
+Right.
+
+00:32:10.700 --> 00:32:19.700
+So you can say, like, uv sync or uv pip install, and you can say, like, thing bracket dev or something like that.
+
+00:32:19.700 --> 00:32:20.700
+Right.
+
+00:32:20.700 --> 00:32:31.700
+The nice thing about uv sync is that it syncs the dev dependencies automatically without you even specifying that, which is, like, the best thing for development, because you actually always want to have your development tools with you.
+
+00:32:31.700 --> 00:32:32.700
+That's a good point.
+
+00:32:32.700 --> 00:32:33.700
+Yeah.
+
+00:32:33.700 --> 00:32:34.700
+That's really cool.
+
+00:32:34.700 --> 00:32:37.700
+So those were the changes to Python itself through the PEPs.
+
+00:32:37.700 --> 00:32:47.700
+But there's also tools, and you've already mentioned some of them, both of you, but tools that make this possible, which, I mean, I think uv has to be number one that goes on this list, right?
+
+00:32:47.700 --> 00:32:50.700
+Like, uv has really done some powerful stuff here.
+
+00:32:50.700 --> 00:32:51.700
+Right.
+
+00:32:51.700 --> 00:32:56.700
+Again, Amogh can say, like, I introduced it, but Amogh was the one to switch us to use uv at some point in time.
+
+00:32:56.700 --> 00:32:57.700
+Yep.
+
+00:32:57.700 --> 00:32:58.700
+uv has been a game changer.
+
+00:32:58.700 --> 00:33:01.700
+I think we were using Poetry before this, or Hatch.
+
+00:33:01.700 --> 00:33:02.700
+I don't know.
+
+00:33:02.700 --> 00:33:03.700
+No, not even that.
+
+00:33:03.700 --> 00:33:04.700
+Just pip.
+
+00:33:04.700 --> 00:33:05.700
+Just pip.
+
+00:33:05.700 --> 00:33:06.700
+Just pip.
+
+00:33:06.700 --> 00:33:07.700
+Just pip.
+
+00:33:07.700 --> 00:33:08.700
+Just the image.
+
+00:33:08.700 --> 00:33:09.700
+It's so good.
+
+00:33:09.700 --> 00:33:13.700
+I don't even remember. The last, you know, game-changing aspect that uv brought in was this notion of workspaces.
+
+00:33:13.700 --> 00:33:14.700
+It's something very simple.
+
+00:33:14.700 --> 00:33:26.700
+You can compare it to, you know, a co-working space or something similar, where it's a unified environment where multiple interconnected pieces coexist, and they're very easy to manage.
+
+00:33:26.700 --> 00:33:32.700
+And that's something that eventually led us to splitting the whole repository across our distributions.
+
+00:33:32.700 --> 00:33:35.700
+And that's the reason you see so many TOML files.
+
+00:33:35.700 --> 00:33:37.700
+So everything has a pyproject.toml.
+
+00:33:37.700 --> 00:33:46.700
+Everything defines the dependency groups it needs, and development of a particular package is restricted only to its dependencies.
+
+00:33:46.700 --> 00:33:55.700
+So you develop it, you run uv sync, you can run your pytest using uv, and everything that is supposed to run with it is running with it.
+
+00:33:55.700 --> 00:33:59.700
+And any bad or, you know, cross imports are caught really easily.
+
+00:33:59.700 --> 00:34:05.700
+So I think the workspace feature, at least, was the most important one for me.
+
+00:34:05.700 --> 00:34:07.700
+And obviously the speed that it brings with it.
+
+00:34:07.700 --> 00:34:08.700
+And that's impressive.
+
+00:34:08.700 --> 00:34:09.700
+It is.
+
+00:34:09.700 --> 00:34:13.700
+And I think this workspace concept, it's new to me.
+
+00:34:13.700 --> 00:34:14.700
+I'll say it's new to me.
+
+00:34:14.700 --> 00:34:16.700
+I don't know how new it is to other folks.
+
+00:34:16.700 --> 00:34:26.700
+So you've got this giant monorepo, and how many conceptually different packages or projects are in there right now?
+
+00:34:26.700 --> 00:34:27.700
+120 plus.
+
+00:34:27.700 --> 00:34:38.700
+It changes by the day, because Amogh is doing a lot to increase the number very, very quickly, because we are just now in the middle of finishing some isolation kind of restructuring.
+
+00:34:38.700 --> 00:34:48.700
+And Amogh is the one, that's why he's here also, leading the introduction of new packages, or new distributions, that we have, like the shared libraries that we will talk about later.
+
+00:34:48.700 --> 00:34:49.700
+So we have a lot of those.
+
+00:34:49.700 --> 00:34:50.700
+Yes.
+
+00:34:50.700 --> 00:34:54.700
+I think this is super important to dive into, and how uv makes this possible.
+
+00:34:54.700 --> 00:35:00.700
+And I think you said also Hatch. You talked with Ofek, who runs Hatch, about this as well, right?
+
+00:35:00.700 --> 00:35:01.700
+Yes.
+
+00:35:01.700 --> 00:35:02.700
+Yes.
+
+00:35:02.700 --> 00:35:07.700
+Hatch is also supporting workspaces, which are modeled mainly after what uv has done.
+
+00:35:07.700 --> 00:35:16.700
+We haven't tried it yet, but I've heard it's very, very similar, or even, like, you can use it as a one-to-one replacement in some cases, or maybe even in all.
+
+00:35:16.700 --> 00:35:21.700
+But generally, I would love this eventually to become some kind of standard, so that multiple tools are supporting this.
+
+00:35:21.700 --> 00:35:28.700
+But yes, there are a few other tools that we were considering before, but uv is by far the kind of, like, yeah, well, we work together.
+
+00:35:28.700 --> 00:35:30.700
+We shaped it together with the uv team.
+
+00:35:30.700 --> 00:35:33.700
+So it definitely works well for us.
+
+00:35:33.700 --> 00:35:34.700
+Yeah.
+
+00:35:34.700 --> 00:35:35.700
+Amazing.
+
+00:35:35.700 --> 00:35:39.700
+So let me describe this a little bit, and then you all can actually expand on it.
+
+00:35:39.700 --> 00:35:45.700
+So the idea is we've got this monorepo with a bunch of different folders for the sections, right?
+
+00:35:45.700 --> 00:35:49.700
+Like airflow dash CLI, or CTL, and airflow dash core and so on.
+
+00:35:49.700 --> 00:35:56.700
+And you'd like to be able to kind of just jump into one section and treat it as a top-level project, right?
+
+00:35:56.700 --> 00:35:57.700
+It's got a pyproject.toml.
+
+00:35:57.700 --> 00:35:59.700
+It's got a source folder, tests, and so on.
+
+00:35:59.700 --> 00:36:12.700
+But the challenge is you can't just have a bunch of disconnected pieces. Like, maybe airflow-core depends on five other parts of it that also themselves have their own pyproject.toml and different things.
+
+00:36:12.700 --> 00:36:14.700
+And you've got to set up, you know, set up.
+
+00:36:14.700 --> 00:36:19.700
+If you jump into the airflow core, you've got to set up the environment just right to be working on those other parts, right?
+
+00:36:19.700 --> 00:36:20.700
+It sounds pretty tricky.
+
+00:36:20.700 --> 00:36:22.700
+So how does that work?
+
+00:36:22.700 --> 00:36:24.700
+Who wants to make sense of this for us?
+
+00:36:24.700 --> 00:36:25.700
+It works perfectly.
+
+00:36:25.700 --> 00:36:27.700
+Like, it's super, super simple, actually.
+
+00:36:27.700 --> 00:36:33.700
+You know, the whole thing about uv is its simplicity, though not of the concept.
+
+00:36:33.700 --> 00:36:35.700
+The implementation is actually quite tricky.
+
+00:36:35.700 --> 00:36:37.700
+But the way you use it is very simple.
+
+00:36:37.700 --> 00:36:39.700
+Just go to the directory and run uv sync.
+
+00:36:39.700 --> 00:36:40.700
+That's basically it.
+
+00:36:40.700 --> 00:36:43.700
+This is the directory you want to work on.
+
+00:36:43.700 --> 00:36:47.700
+And it does exactly what you would expect it to do, which means that it syncs.
+
+00:36:47.700 --> 00:36:56.700
+It actually updates, or recreates basically, the virtual environment that you're using, with all the dependencies that this particular distribution needs and anything that it needs.
+
+00:36:56.700 --> 00:36:58.700
+As a transitive dependency as well.
+
+00:36:58.700 --> 00:37:05.700
+So if it refers to another project inside the workspace, it will also use it from there, not, like, installed from PyPI.
+
+00:37:05.700 --> 00:37:14.700
+So you can immediately start working on this, because after uv sync, everything is exactly as you expect for this particular subset of the repository that you were on.
+
+00:37:14.700 --> 00:37:16.700
+And that's basically it.
+
+00:37:16.700 --> 00:37:17.700
+This is all.
+
+00:37:17.700 --> 00:37:18.700
+Like, there is nothing more, basically.
+
+00:37:18.700 --> 00:37:19.700
+That's it.
+
+00:37:19.700 --> 00:37:20.700
+It works.
+
+00:37:20.700 --> 00:37:25.700
+And when you run uv run pytest, it will do exactly what you want.
+
+00:37:25.700 --> 00:37:38.700
+So in this folder, uv run pytest will do exactly what you want, because even uv run will automatically sync the virtual environment, very, very quickly, to the one that your project needs.
+
+00:37:38.700 --> 00:37:43.700
+And then it will just run pytest in this virtual environment, and it will run all the tests in your project.
+
+00:37:43.700 --> 00:37:44.700
+And that's basically it.
+
+00:37:44.700 --> 00:37:49.700
+So conceptually, for the users, it's like, you don't have to do much, just uv sync.
+
+00:37:49.700 --> 00:37:50.700
+And that's it.
+
+00:37:50.700 --> 00:37:57.700
+I think one of the big challenges here is how do different parts of the project know about each other, right?
+
+00:37:57.700 --> 00:37:58.700
+Yeah.
+
+00:37:58.700 --> 00:38:01.700
+You said that it, it symlinks the different elements in.
+
+00:38:01.700 --> 00:38:05.700
+The basic kind of workspace implementation is just a workspace definition.
+
+00:38:05.700 --> 00:38:09.700
+So you have to have the definition of the workspace in the top-level pyproject.toml.
+
+00:38:09.700 --> 00:38:11.700
+So there you have all of them listed.
+
+00:38:11.700 --> 00:38:12.700
+You have links to it.
+
+00:38:12.700 --> 00:38:18.700
+They have described where they are, and uv will read the pyproject.toml from the top level and will know what they are.
+
+00:38:18.700 --> 00:38:21.700
+It will know where to look for particular distributions.
+
+00:38:21.700 --> 00:38:29.700
+So that's the simple discovery, and the way how we know that we are using it from the sources and not from PyPI.
+
+00:38:29.700 --> 00:38:36.700
+But then the shared libraries are something that we added on top of it, and the symlinks are on top of it.
+
+00:38:36.700 --> 00:38:44.700
+And this is kind of an extra innovative thing that we are doing for something else that we need, but, you know, we can talk about that now, or Amogh can talk about it.
+
+00:38:44.700 --> 00:38:45.700
+This is really cool.
+
+00:38:45.700 --> 00:38:59.700
+So one of the things that happens here is these different slices or subsections of the monorepo each have a pyproject.toml, and that pyproject.toml defines its true dependencies and its dev dependencies and so on.
+
+00:38:59.700 --> 00:39:10.700
+So when you go and jump into a section, uv will basically realign the virtual environment with whatever dependencies are supposed to be there from those things.
+
+00:39:10.700 --> 00:39:17.700
+Right. So that means installing stuff, obviously, but actually what surprised me a little bit, not a lot, but like, oh yeah, I guess it does do that.
+
+00:39:17.700 --> 00:39:29.700
+That's cool. It actually uninstalls stuff 
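[Editor's note: as a rough sketch of the mechanics described here, with illustrative member and package names rather than Airflow's exact layout, a uv workspace is declared in the top-level pyproject.toml, and a member resolves a sibling from the checkout instead of PyPI via `tool.uv.sources`:]

```toml
# Top-level pyproject.toml: declare which directories are workspace members.
[tool.uv.workspace]
members = ["airflow-core", "airflow-ctl", "task-sdk", "providers/*"]

# In a member's pyproject.toml (say, airflow-core/pyproject.toml),
# a dependency on a sibling distribution...
[project]
name = "apache-airflow-core"
version = "0.1.0"
dependencies = ["apache-airflow-task-sdk"]

# ...and a source entry telling uv to use the workspace checkout,
# not the published package on PyPI.
[tool.uv.sources]
apache-airflow-task-sdk = { workspace = true }
```

[Running `uv sync` inside a member directory then builds that member's environment from exactly these declarations, which is the discovery step described above.]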
that's not explicitly put there, which I can imagine, before, you could be like, well, this one part way down here depends on this weird library.
+
+00:39:29.700 --> 00:39:36.700
+And somehow it used to be over there. Then I went back to this other piece, and then I came back, and I forgot where that even came from.
+
+00:39:36.700 --> 00:39:41.700
+Like, why is that in my virtual environment? And how do I specify that? Probably juggling that was a big problem, right?
+
+00:39:41.700 --> 00:39:47.700
+This, like, loading and unloading of dependencies based on what part of the monorepo you're in.
+
+00:39:47.700 --> 00:39:52.700
+And I think that actually makes it really much easier to deal with this type of code structure.
+
+00:39:52.700 --> 00:40:02.700
+Let me add to that one more thing, because it's not only the dependencies that you might have from somewhere else, but also the cross dependencies between different distributions inside.
+
+00:40:02.700 --> 00:40:14.700
+So for example, if airflow-ctl does not use airflow-core, if you go there and you sync, you will not be able to import and use any of the source code which is in airflow-core, because it's not a dependency of airflow-ctl.
+
+00:40:14.700 --> 00:40:23.700
+So uv sync will not only uninstall the dependencies that you have, but also uninstall the source code that you have from other parts of the repo, which is a fantastic thing for us.
+
+00:40:23.700 --> 00:40:27.700
+And that was exactly the kind of isolation between those parts that was missing before.
+
+00:40:27.700 --> 00:40:36.700
+From your source, you can only refer to the source code of those distributions that you depend on, and nothing else from the monorepo.
+
+00:40:36.700 --> 00:40:41.700
+So this means that you can slice and dice your repository as you want.
+
+00:40:41.700 --> 00:40:51.700
+So depending on which directory you are in when you run uv sync, you will have, like, the actual useful and used subset of your repository.
+
+00:40:51.700 --> 00:40:58.700
+And it can be completely different if you go to another directory. Some of that can be overlapping, some of that can be completely different.
+
+00:40:58.700 --> 00:41:00.700
+It depends which dependencies are defined.
+
+00:41:00.700 --> 00:41:06.700
+And this all magically happens, like, by just defining the dependency in pyproject.toml.
+
+00:41:06.700 --> 00:41:09.700
+And uv sync will handle it for you in the workspace.
+
+00:41:09.700 --> 00:41:13.700
+It's exactly the reason why it's so useful for developers.
+
+00:41:13.700 --> 00:41:27.700
+It helped us in our vision to actually, you know, decompose the project into multiple parts and avoid the classic problem of coupling, which every monorepo faces at some point in its lifecycle, because everything is out there.
+
+00:41:27.700 --> 00:41:31.700
+Why don't we just, you know, have code leaks all over the place?
+
+00:41:31.700 --> 00:41:33.700
+So this helps us prevent that.
+
+00:41:33.700 --> 00:41:36.700
+And I cannot imagine how we did it earlier, before uv.
+
+00:41:36.700 --> 00:41:41.700
+I don't know if we did it, but if we did, it would have been a really tough thing.
+
+00:41:41.700 --> 00:41:50.700
+Yeah, there's a bunch of tools, linters and code analysis things, you can run on your code that break it down into these different modules and these layers.
+
+00:41:50.700 --> 00:42:00.700
+Here's, like, a directed graph of how this thing works, and you can set up rules to say this should never cross that boundary, but these are just very, very vague things.
+
+00:42:00.700 --> 00:42:04.700
+And this setup actually makes it so it's not accessible to your code,
+
+00:42:04.700 --> 00:42:05.700
+if you didn't say it should be.
+
+00:42:05.700 --> 00:42:13.700
+It's just built into exactly the definition of your distribution, which you anyhow have to write, because you have to define what the dependencies are.
+
+00:42:13.700 --> 00:42:15.700
+And yes, we did something like that before.
+
+00:42:15.700 --> 00:42:18.700
+So we had a number of, like, Ruff rules or whatever.
+
+00:42:18.700 --> 00:42:21.700
+Don't import here, import here.
+
+00:42:21.700 --> 00:42:28.700
+We still have them for shared libraries, which we can talk about now, because I think this is an important modification of the concept.
+
+00:42:28.700 --> 00:42:36.700
+So we do have some automated checks for quality and for imports with prek, our pre-commit hook implementation.
+
+00:42:36.700 --> 00:42:42.700
+But before that, it was just completely handwritten and unmaintainable.
+
+00:42:42.700 --> 00:42:48.700
+People were not actually updating it with all the distributions, and you couldn't really, you know, follow when things changed.
+
+00:42:48.700 --> 00:42:56.700
+With the pyproject.toml for each distribution being the single source of truth, you don't have to do anything, because the dependency is declared there.
+
+00:42:56.700 --> 00:43:03.700
+And this is, like, the best part of uv: understanding that, and doing everything that is reasonable in this case.
+
+00:43:03.700 --> 00:43:15.700
+The other major tool involved here was prek, which is a pre-commit framework for running hooks, for many languages, but especially Python, relevant here, written in Rust.
+
+00:43:15.700 --> 00:43:17.700
+So it pairs well with uv, I suppose.
+
+00:43:17.700 --> 00:43:18.700
+Oh yeah.
+
+00:43:18.700 --> 00:43:20.700
+It was inspired by uv as well.
+
+00:43:20.700 --> 00:43:25.700
+And Joe mentioned that he was actually contributing to uv before.
+
+00:43:25.700 --> 00:43:26.700
+Great.
+
+00:43:26.700 --> 00:43:27.700
+How's prek show up here?
+
+00:43:27.700 --> 00:43:30.700
+I feel like this is leading towards what you were hinting at earlier.
+
+00:43:30.700 --> 00:43:31.700
+It's a new name, prek.
+
+00:43:31.700 --> 00:43:32.700
+So, yep.
+
+00:43:32.700 --> 00:43:39.700
+This allows us to do a few things which pre-commit did not do, or, you know, did not accept as suggestions.
+
+00:43:39.700 --> 00:43:45.700
+So, one certain thing that prek offers: obviously, it's written in Rust.
+
+00:43:45.700 --> 00:43:48.700
+So speed is the obvious thing that we get.
+
+00:43:48.700 --> 00:43:53.700
+But apart from that, we also get this notion of it pairing well with uv in terms of modularized hooks.
+
+00:43:53.700 --> 00:44:00.700
+Earlier, we had all the hooks in one place, in the top-level pre-commit YAML, right?
+
+00:44:00.700 --> 00:44:01.700
+And it was a big file.
+
+00:44:01.700 --> 00:44:03.700
+It was really big.
+
+00:44:03.700 --> 00:44:04.700
+You can imagine.
+
+00:44:04.700 --> 00:44:05.700
+So, yeah.
+
+00:44:05.700 --> 00:44:11.700
+So prek allowed us to, prek again, you know, it consumed the concept of workspaces here, I would say.
+
+00:44:11.700 --> 00:44:18.700
+So it allowed you to define pre-commit hooks, or prek hooks, within a module itself.
+
+00:44:18.700 --> 00:44:26.700
+And this paired well with uv in the sense that when you have to run hooks that are bound to a certain distribution,
+
+00:44:26.700 --> 00:44:31.700
+all you have to do is change into, you know, the submodule and just do a prek run.
+
+00:44:31.700 --> 00:44:34.700
+It will run the relevant hooks for that particular module.
+
+00:44:34.700 --> 00:44:41.700
+And the other thing that I really love about prek is autocompletion, which is not something pre-commit had.
+
+00:44:41.700 --> 00:44:50.700
+So you can imagine that something fails in the CI, you have to copy the ID and try to kind of backtrack it in your repo as to which one is failing.
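[Editor's note: prek reads the same configuration format as pre-commit, so the "minimal hook plus inline script metadata" pattern described here might look something like the sketch below. The hook id, script path, and file regex are hypothetical, not Airflow's actual configuration:]

```yaml
# .pre-commit-config.yaml (prek consumes the pre-commit format)
repos:
  - repo: local
    hooks:
      - id: check-provider-metadata      # hypothetical hook name
        name: Check provider metadata
        language: system
        # The script declares its own dependencies via a PEP 723 inline
        # metadata block, so no version pins need to live in this YAML.
        entry: uv run scripts/check_provider_metadata.py
        files: ^providers/.*/provider\.yaml$
```

[The YAML stays down to a hook name, an entry point, and a file filter; everything version-specific travels with the script itself.]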
+
+00:44:50.700 --> 00:44:56.700
+So it used to be a nightmare, but now with the, you know, the tab completion, it's amazing.
+
+00:44:56.700 --> 00:45:00.700
+Nice. Are you talking about, like, shell autocomplete integration?
+
+00:45:00.700 --> 00:45:02.700
+Yeah. Yeah. So, okay. I've seen.
+
+00:45:02.700 --> 00:45:04.700
+I have some story about that, very, very short.
+
+00:45:04.700 --> 00:45:11.700
+So we actually tried to get the completion for hook names into pre-commit, which was the predecessor of prek.
+
+00:45:11.700 --> 00:45:22.700
+Like, prek was largely based on pre-commit, but somehow the author of it didn't accept even the idea of us contributing it, or actually had some very, very excessive expectations for that.
+
+00:45:22.700 --> 00:45:29.700
+And we, you know, discussed, and other people were also trying to convince the author to do that, but they refused.
+
+00:45:29.700 --> 00:45:32.700
+He refused, basically, and refused to accept contributions.
+
+00:45:32.700 --> 00:45:36.700
+When we spoke to Joe, that was a completely different story.
+
+00:45:36.700 --> 00:45:37.700
+Like, we need that.
+
+00:45:37.700 --> 00:45:38.700
+And the next day it was there.
+
+00:45:38.700 --> 00:45:41.700
+It's a completely different approach.
+
+00:45:41.700 --> 00:45:49.700
+So then we said, we need workspaces, and a few weeks later, because it took a little bit of time, it was there, and we worked together and we tested that.
+
+00:45:49.700 --> 00:45:55.700
+And I raised, I don't know how many, issues in the initial kind of pre-release version when we wanted to use it.
+
+00:45:55.700 --> 00:46:04.700
+So I think the collaboration and, you know, working together, listening to your users and responding, and actually working as open source maintainers together.
+
+00:46:04.700 --> 00:46:08.700
+This actually worked perfectly well here, both in uv and prek.
+
+00:46:08.700 --> 00:46:22.700
+And this is why we love prek, actually: because we know we can rely on it. If something is not working, we can discuss and either submit a fix, or, you know, Joe will do it, or even lots of other people can do it.
+
+00:46:22.700 --> 00:46:26.700
+Because there were a few features that we wanted, and somebody else implemented them.
+
+00:46:26.700 --> 00:46:35.700
+And it wasn't Joe; they contributed to prek because of this openness and, you know, being able to accept the needs of the users.
+
+00:46:35.700 --> 00:46:38.700
+That was a very, very important part of why we moved to prek.
+
+00:46:38.700 --> 00:46:41.700
+Yeah. I think Airflow was also one of the initial case studies for prek.
+
+00:46:41.700 --> 00:46:48.700
+It's a project of that scale. And if you kind of satisfy that project's needs, you're pretty good with most use cases.
+
+00:46:48.700 --> 00:46:51.700
+I think that's true of both prek and uv.
+
+00:46:51.700 --> 00:47:01.700
+Yeah. Right there at the top of the prek repo, it says, although prek is pretty new, it's already powering real projects, you know, little things like CPython, Apache Airflow and FastAPI.
+
+00:47:01.700 --> 00:47:13.700
+I know Hugo van Kemenade, the release manager of Python. We met at FOSDEM as well. And he was actually listening to our prek discussion, and he converted, you know, CPython to use prek because of the needs they had.
+
+00:47:13.700 --> 00:47:19.700
+So it was all about, you know, people talking to each other, word of mouth, and things like that.
+
+00:47:19.700 --> 00:47:29.700
+You know, there's a feature listed here that just makes me jealous. One of the features of prek is a single binary with no dependencies that doesn't require Python or any other runtime to be installed.
+
+00:47:29.700 --> 00:47:36.700
+Like how incredible would it be with Python if we had a python --build app or something, you know what I mean?
+
+00:47:36.700 --> 00:47:43.700
+You could point it at your thing and you get something you could distribute. I know uv solves a lot, but you still have to have uv installed.
+
+00:47:43.700 --> 00:47:49.700
+And then, you know, that is a huge advantage of things like Rust and Go and some other languages.
+
+00:47:49.700 --> 00:47:55.700
+It's both good and bad in some cases. So there are always trade-offs, a different choice made by Python here.
+
+00:47:55.700 --> 00:48:09.700
+I don't think it's like the best choice for Python. I think Python being a scripting language, it's okay to have, you know, dependencies, and especially inline script metadata almost did it, because you just, you know, can install stuff.
+
+00:48:09.700 --> 00:48:16.700
+And uv and that kind of tooling is also doing all the stuff like uv install or uv tool install, whatever.
+
+00:48:16.700 --> 00:48:22.700
+And it would not only install the project and its dependencies, but also install the Python that is needed to run it.
+
+00:48:22.700 --> 00:48:27.700
+So all this is really a matter of two weeks, and it has improved dramatically over the last few years.
+
+00:48:27.700 --> 00:48:32.700
+Yeah. I was pining for an option, not an only-binary thing.
+
+00:48:32.700 --> 00:48:43.700
+All right. So one thing I actually want to talk about, going back to this workspaces thing real quick, is what does it look like from an IDE or editor experience to work on this?
+
+00:48:43.700 --> 00:48:52.700
+All right. Like you've got Python projects, you've got maybe VS Code workspaces where you can pull in different pieces. How do you all manage that?
+
+00:48:52.700 --> 00:49:02.700
+I cannot talk for VS Code. I'm a PyCharm user here, but we had to do a little bit of hacking, I would say, or more like a helper script for the IDEs, right?
+
+00:49:02.700 --> 00:49:12.700
+Because we have an IDE helper script right in the repo, and we recommend the users run it so that the IDE knows what is where in terms of maintaining things, right?
+
+00:49:12.700 --> 00:49:19.700
+Because in normal projects, there's usually just one source, one tests folder at the top level, but here it has 120 plus.
+
+00:49:19.700 --> 00:49:27.700
+And the helper script does a pretty simple thing. It just auto-discovers all the packages in the monorepo and adds them.
+
+00:49:27.700 --> 00:49:35.700
+So IntelliJ and PyCharm both have a .idea hidden folder within each of the projects that it opens.
+
+00:49:35.700 --> 00:49:41.700
+And it supports an XML-like format, IML, where you can define certain things.
+
+00:49:42.700 --> 00:49:44.700
+So this essentially does a very simple thing.
+
+00:49:44.700 --> 00:49:51.700
+It just, for each package, adds the module slash source as the source root and the module slash tests as the test root.
+
+00:49:52.700 --> 00:49:56.700
+It's as if you went through all 120 things and right-clicked and said mark as sources root or something like that.
+
+00:49:56.700 --> 00:50:00.700
+Yeah, we had this PyCharm script, and then we have the same approach for VS Code.
+
+00:50:00.700 --> 00:50:08.700
+So we have another script for VS Code as well, which was contributed by someone who uses VS Code, because neither me nor Amogh are VS Code users.
+
+00:50:08.700 --> 00:50:09.700
+We're PyCharm users, both of us.
+
+00:50:09.700 --> 00:50:13.700
+But, you know, community is also, and like somebody said, okay, I'll do it.
+
+00:50:13.700 --> 00:50:14.700
+And there it was.
+
+00:50:14.700 --> 00:50:15.700
+And they tested it.
+
+00:50:15.700 --> 00:50:17.700
+And, you know, that was super cool actually.
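The helper-script idea described here can be sketched roughly like this. This is a hypothetical illustration, not Airflow's actual tooling: the repo layout (`<package>/src`, `<package>/tests`) and the `python.analysis.extraPaths` target are assumptions, but they show the pattern of auto-discovering packages in a monorepo and telling the editor where the source roots are.

```python
import json
from pathlib import Path

def discover_source_roots(repo_root: Path) -> list[str]:
    """Find every package directory that has a src/ or tests/ subfolder."""
    roots = []
    for package in sorted(p for p in repo_root.iterdir() if p.is_dir()):
        for sub in ("src", "tests"):
            if (package / sub).is_dir():
                roots.append(f"{package.name}/{sub}")
    return roots

def write_vscode_settings(repo_root: Path) -> Path:
    """Write the discovered roots into .vscode/settings.json so the
    editor resolves imports from every package in the monorepo."""
    settings_dir = repo_root / ".vscode"
    settings_dir.mkdir(exist_ok=True)
    settings_file = settings_dir / "settings.json"
    settings_file.write_text(json.dumps(
        {"python.analysis.extraPaths": discover_source_roots(repo_root)},
        indent=2,
    ))
    return settings_file
```

The PyCharm variant would emit `.iml` XML entries instead of JSON, but the discovery step is the same.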
+
+00:50:17.700 --> 00:50:19.700
+So, yeah, it works well.
+
+00:50:19.700 --> 00:50:32.700
+Also, you know, a few words about the shared libraries concept. We probably don't have too much time, but maybe it's the right time to introduce the concept.
+
+00:50:32.700 --> 00:50:41.700
+Because one thing that Amogh mentioned is, we solved this coupling problem, but we also wanted to solve the DRY problem.
+
+00:50:41.700 --> 00:50:54.700
+And those two are always kind of a mixture, like you get more DRY and less coupling, or more DRY and more coupling, and all these things are complex when you have lots of code interacting with each other.
+
+00:50:54.700 --> 00:50:57.700
+DRY being the architectural philosophy of don't repeat yourself.
+
+00:50:57.700 --> 00:51:04.700
+But if you're not repeating yourself, everything, wherever it exists somewhere, everything's got to depend on that somewhere, and it starts to become more linked together.
+
+00:51:04.700 --> 00:51:05.700
+Right.
+
+00:51:05.700 --> 00:51:08.700
+So it's a little bit of, eat cake and have it too.
+
+00:51:08.700 --> 00:51:18.700
+Like we want to have DRY code and not repeat it for common utilities, like logging, configuration, whatever, all the things that are kind of common between all the different distributions.
+
+00:51:18.700 --> 00:51:26.700
+But also we didn't want to depend on a single version of those, because if we do, then it means that we have to make sure that backwards compatibility is maintained.
+
+00:51:26.700 --> 00:51:35.700
+Because when we install different versions of different distributions, coming from different times of the repository, they might use different versions of those shared libraries.
+
+00:51:35.700 --> 00:51:48.700
+And like, how to make sure that they don't have breaking changes and stuff? So this is a whole level of complexity, between how to manage the dependencies there and manage versions, and especially manage the backwards compatibility.
+
+00:51:48.700 --> 00:52:01.700
+So we figured out some very simple approach. We tried a few different approaches, but one of the approaches was using the vendoring library from pip.
+
+00:52:01.700 --> 00:52:09.700
+And the second one, and that's the one we came up with and finally implemented, was using symlinks to share the code between different distributions.
+
+00:52:09.700 --> 00:52:14.700
+And that's a very innovative approach that I hope will make it into some kind of standard eventually.
+
+00:52:14.700 --> 00:52:27.700
+So we came up with this approach where we actually have cake and eat it too, which is pretty amazing if you've fought for years with these kinds of common dependency issues and backwards compatibility.
+
+00:52:27.700 --> 00:52:33.700
+So in our case, the symlink approach we have, it needs some pre-processing of pyproject.toml.
+
+00:52:33.700 --> 00:52:37.700
+Some parts of the pyproject.toml are generated to make it actually work.
+
+00:52:37.700 --> 00:52:41.700
+But this is all automated with prek, which means we don't even have to think about that.
+
+00:52:41.700 --> 00:52:50.700
+And once we do that, and once we create some symlinks between different parts of code, like one library, one distribution symlinks in code from the shared distribution.
+
+00:52:50.700 --> 00:53:02.700
+The end result is that this code gets automatically vendored in during the building of the package, which means that we actually have the same library, in different versions, in different distributions.
+
+00:53:02.700 --> 00:53:08.700
+So a distribution released a week ago will have the shared configuration from a week ago.
+
+00:53:08.700 --> 00:53:14.700
+But another distribution will have the same shared configuration code from today if it's released today.
+
+00:53:14.700 --> 00:53:16.700
+And we can install them together.
+
+00:53:16.700 --> 00:53:22.700
+And all of them behave effectively as if they had a different version of the same library installed.
+
+00:53:22.700 --> 00:53:31.700
+It's as if Airflow CTL said it had a dependency on core and it pinned that version to something, but a different part of the repo pinned it to a different one.
+
+00:53:31.700 --> 00:53:34.700
+And they can both kind of coexist.
+
+00:53:34.700 --> 00:53:36.700
+But it's actually all within the same code file.
+
+00:53:36.700 --> 00:53:37.700
+That's insane.
+
+00:53:37.700 --> 00:53:38.700
+OK.
+
+00:53:38.700 --> 00:53:40.700
+And this is largely, like, it's nothing new.
+
+00:53:40.700 --> 00:53:46.700
+It's largely inspired by how libraries work in C and traditional kinds of compiled code.
+
+00:53:46.700 --> 00:53:49.700
+Like you have dynamic libraries and static libraries.
+
+00:53:49.700 --> 00:53:59.700
+So this is essentially the equivalent of static libraries, where you take the code of the version that you compile the stuff with and put it inside the final binary.
+
+00:53:59.700 --> 00:54:02.700
+And then it results, like in Rust, in the kind of single binary thing.
+
+00:54:02.700 --> 00:54:15.700
+So we have a little bit of this single binary by doing that, in the sense that we automatically vendor in all the, you know, shared dependencies that we have in the same distribution.
+
+00:54:15.700 --> 00:54:21.700
+So it's kind of a hybrid, because Rust is a little bit too far, everything is a single binary.
+
+00:54:21.700 --> 00:54:23.700
+In our case, we have a bit of both.
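The symlink-plus-vendoring idea can be shown with a toy sketch. This is not Airflow's actual build setup (which goes through hatchling and prek hooks): it just demonstrates the mechanism described above, where a shared library lives once in the repo, each distribution symlinks it into its own source tree, and the build step resolves the symlink so every artifact carries the shared code as it existed at its own build time.

```python
import shutil
from pathlib import Path

def link_shared(shared: Path, distribution_src: Path) -> Path:
    """Symlink the shared library into a distribution's source tree."""
    target = distribution_src / shared.name
    target.symlink_to(shared.resolve(), target_is_directory=True)
    return target

def build(distribution_src: Path, out_dir: Path) -> None:
    """'Build' the distribution: copy the tree while following symlinks,
    so the shared code is vendored into the artifact as real files."""
    shutil.copytree(distribution_src, out_dir, symlinks=False)
```

Two artifacts built at different times then contain different snapshots of the same shared library, yet both can be installed side by side, exactly the effect described in the conversation.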
+
+00:54:23.700 --> 00:54:29.700
+Like we can use libraries dynamically, but we can also embed libraries as shared inside the single distribution.
+
+00:54:29.700 --> 00:54:30.700
+That's very cool.
+
+00:54:30.700 --> 00:54:31.700
+That's wild.
+
+00:54:31.700 --> 00:54:32.700
+Amogh?
+
+00:54:32.700 --> 00:54:34.700
+Sounds like you were instrumental in this part.
+
+00:54:34.700 --> 00:54:37.700
+That's the nice thing about the approach that was chosen, right?
+
+00:54:37.700 --> 00:54:40.700
+We all came together as a community on this one.
+
+00:54:40.700 --> 00:54:50.700
+We had one email, a dev list discussion one fine day, that, hey, we want to achieve something like this, which more or less was something everyone agreed upon.
+
+00:54:50.700 --> 00:54:53.700
+So people started chiming in and we started trying different things out.
+
+00:54:53.700 --> 00:54:57.700
+The first one, obviously, using the vendoring tool from pip.
+
+00:54:57.700 --> 00:55:02.700
+Somebody did a POC on that, but it felt like it's going to be difficult to achieve that long term.
+
+00:55:02.700 --> 00:55:04.700
+And also it could be brittle.
+
+00:55:04.700 --> 00:55:10.700
+So Jarek came up with this particular option with symlinks, which again was discussed within the community.
+
+00:55:10.700 --> 00:55:16.700
+A few of us picked this PR up, tested it locally, played around and gave feedback.
+
+00:55:16.700 --> 00:55:23.700
+So I don't think this would be possible with AI, in the sense that this has never been done before.
+
+00:55:23.700 --> 00:55:31.700
+Something like this, where a community comes together and solves a rather difficult problem, is something that makes me really happy.
+
+00:55:31.700 --> 00:55:38.700
+And also that all of us are working towards a common goal while also bound by our corporate hats, right?
+
+00:55:38.700 --> 00:55:41.700
+It's something that is, again, really nice to see.
+
+00:55:41.700 --> 00:55:54.700
+We have about, I think at this point, 11 to 12 shared libraries, where the main notion here is to reimagine Airflow as an independent server, more like a control plane and execution plane.
+
+00:55:54.700 --> 00:56:01.700
+What we did with Airflow 3 and these shared libraries is helping us achieve that model.
+
+00:56:01.700 --> 00:56:03.700
+And we have about 11 to 12 of them.
+
+00:56:03.700 --> 00:56:05.700
+And I think a few more coming very soon.
+
+00:56:05.700 --> 00:56:10.700
+But yeah, it's been nice working on the shared libraries.
+
+00:56:10.700 --> 00:56:11.700
+It's, yeah.
+
+00:56:11.700 --> 00:56:16.700
+Is this something that people can take and adopt into their monorepo if they want to live that life?
+
+00:56:16.700 --> 00:56:17.700
+Absolutely.
+
+00:56:17.700 --> 00:56:18.700
+Yeah.
+
+00:56:18.700 --> 00:56:23.700
+It's just that it's really like one or two kinds of prek hooks which are maintaining the consistency.
+
+00:56:23.700 --> 00:56:38.700
+So that you don't forget to add this symlink here, and that kind of pyproject.toml definition here, or that hatch definition for hatchling to actually embed your symlinked code into the final distribution.
+
+00:56:38.700 --> 00:56:43.700
+So there are a few pieces that have to be put together from existing libraries.
+
+00:56:43.700 --> 00:56:44.700
+So that's basically it.
+
+00:56:44.700 --> 00:56:51.700
+And once you do it, the funny thing is, those shared libraries are just standalone distributions.
+
+00:56:51.700 --> 00:56:54.700
+You can actually build them separately as a library as well.
+
+00:56:54.700 --> 00:56:58.700
+We could potentially even, you know, just use them as a library as well.
+
+00:56:58.700 --> 00:57:02.700
+No problem whatsoever, because they are just standard plain distributions like any other.
+
+00:57:02.700 --> 00:57:11.700
+We just happen to take the source code of it and then embed it into the target distribution that wants to use it, rather than, you know, link to it by dependency.
+
+00:57:11.700 --> 00:57:13.700
+So that's basically it, other than that.
+
+00:57:13.700 --> 00:57:18.700
+It's a kind of completely standard library, or standard distribution.
+
+00:57:18.700 --> 00:57:24.700
+And one more thing that is really important to add here, this also has a side effect, but I think a very nice one.
+
+00:57:24.700 --> 00:57:31.700
+And Amogh can confirm that, because he has been doing a lot of that, is that we actually came up with a way better internal architecture.
+
+00:57:31.700 --> 00:57:38.700
+Because of that, because a lot of those shared libraries, they depended on each other, sometimes in a circular fashion.
+
+00:57:38.700 --> 00:57:43.700
+Sometimes it really depended on which import you did first, like what happened, what was initialized.
+
+00:57:43.700 --> 00:57:49.700
+And it was a complete spaghetti of dependencies between generally independent pieces of functionality.
+
+00:57:49.700 --> 00:57:56.700
+Right now, by having shared libraries, we are actually forcing ourselves to make them isolated.
+
+00:57:56.700 --> 00:57:59.700
+We are changing the way we initialize them.
+
+00:57:59.700 --> 00:58:06.700
+For example, we are injecting all the configuration rather than using it from inside the library, because, like, configuration libraries and other libraries.
+
+00:58:06.700 --> 00:58:08.700
+So you don't want to depend on the other libraries.
+
+00:58:08.700 --> 00:58:09.700
+And it's really nice.
+
+00:58:09.700 --> 00:58:11.700
+I think it comes.
+
+00:58:11.700 --> 00:58:17.700
+The result is that really the architecture of Airflow internally is so much better because of that.
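The "inject configuration instead of importing it" point can be illustrated with a minimal sketch. The names here (`LogConfig`, `SharedLogger`) are hypothetical, not Airflow's real API: the idea is simply that a shared-library component reads nothing at import time and receives everything it needs as an explicit argument, so it has no hidden dependency on a configuration module.

```python
from dataclasses import dataclass

@dataclass
class LogConfig:
    """Settings the shared library needs, supplied by the caller."""
    level: str = "INFO"
    fmt: str = "%(levelname)s %(message)s"

class SharedLogger:
    """Shared-library component: nothing happens at import time;
    everything it needs is injected explicitly."""
    def __init__(self, config: LogConfig) -> None:
        self.config = config

    def render(self, level: str, message: str) -> str:
        return self.config.fmt % {"levelname": level, "message": message}

def main() -> str:
    # Entry point: the application wires configuration in one place,
    # instead of each library importing config at module load.
    logger = SharedLogger(LogConfig(level="DEBUG"))
    return logger.render("DEBUG", "scheduler started")
```

The payoff is exactly what's described next in the conversation: one clear entry point, explicit initialization, and no surprises from import order.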
+
+00:58:17.700 --> 00:58:28.700
+So fewer surprises, and explicit initialization is something that we'll have to do, rather than implicit initialization during imports, which has always been plaguing us as a big issue.
+
+00:58:28.700 --> 00:58:45.700
+Certainly, it also allows you to imagine each component having an entry point, per se, where you have an initial starting point and it initializes everything it needs by injecting and calling certain factories, which makes it very clean for anyone visiting the project.
+
+00:58:45.700 --> 00:58:49.700
+Also, when they look at something, they know the entry point very clearly, that, hey, this is how it starts.
+
+00:58:49.700 --> 00:58:50.700
+This is what it initializes.
+
+00:58:50.700 --> 00:58:59.700
+You know, it reminds me of Golang or Java projects where they have a nice main, where in Python, it's not really the same way.
+
+00:58:59.700 --> 00:59:02.700
+All right. Well, I think that's about it for all the time we have.
+
+00:59:02.700 --> 00:59:05.700
+I guess let's close it out with one final thought.
+
+00:59:05.700 --> 00:59:15.700
+For people who are maybe inspired by your design, by the way you put together Airflow and this monorepo concept, especially Python people.
+
+00:59:15.700 --> 00:59:17.700
+What do you say to them?
+
+00:59:17.700 --> 00:59:18.700
+Final thoughts here.
+
+00:59:18.700 --> 00:59:20.700
+I mean, like, there was always discussion.
+
+00:59:20.700 --> 00:59:24.700
+Like we had lots of discussions internally, even some of the team members in Airflow.
+
+00:59:24.700 --> 00:59:27.700
+They said, let's split the repository into smaller ones.
+
+00:59:27.700 --> 00:59:30.700
+Like let's make more of them, because it's going to make things easier.
+
+00:59:30.700 --> 00:59:35.700
+I was always the monorepo fan, and I did a lot of work to make it possible.
+
+00:59:35.700 --> 00:59:37.700
+But that was a very, very difficult thing.
+
+00:59:37.700 --> 00:59:38.700
+It's changed.
+
+00:59:38.700 --> 00:59:44.700
+So the reasons why you would like to have multiple repos are gone now if you're using the right tooling.
+
+00:59:44.700 --> 00:59:51.700
+And only the benefits, or mostly the benefits, from having it in one place, where you can test everything together and work on it together, remain.
+
+00:59:51.700 --> 00:59:53.700
+All the rest is basically gone.
+
+00:59:53.700 --> 00:59:57.700
+So for me, the discussion monorepo versus multirepo is already solved.
+
+00:59:57.700 --> 00:59:58.700
+Yeah, just do it.
+
+00:59:58.700 --> 01:00:00.700
+It's not even a question.
+
+01:00:00.700 --> 01:00:07.700
+So personally, I've been using the README that we have present in the shared libraries as context for my IDE.
+
+01:00:07.700 --> 01:00:12.700
+So it's turning out to be very nice for the shared library split, for example.
+
+01:00:12.700 --> 01:00:20.700
+All I have to do is just provide it the context and tell it, hey, just construct the structure for me, and I can do everything else.
+
+01:00:20.700 --> 01:00:21.700
+So it's that easy.
+
+01:00:21.700 --> 01:00:23.700
+We have all the things in place.
+
+01:00:23.700 --> 01:00:24.700
+We are in the right era to do it.
+
+01:00:24.700 --> 01:00:25.700
+So just do it.
+
+01:00:25.700 --> 01:00:26.700
+Very inspiring.
+
+01:00:26.700 --> 01:00:28.700
+Thank you for being here.
+
+01:00:28.700 --> 01:00:31.700
+Awesome, for this look inside.
+
+01:00:31.700 --> 01:00:32.700
+And it's Apache Airflow.
+
+01:00:32.700 --> 01:00:33.700
+It's on GitHub.
+
+01:00:33.700 --> 01:00:34.700
+People can go look and see.
+
+01:00:34.700 --> 01:00:38.700
+It's not just talking vaguely about some internal project.
+
+01:00:38.700 --> 01:00:39.700
+Right.
+
+01:00:39.700 --> 01:00:40.700
+So people can go check it out.
+
+01:00:40.700 --> 01:00:41.700
+All right.
+
+01:00:41.700 --> 01:00:42.700
+See you later.
+
+01:00:42.700 --> 01:00:43.700
+Thanks.
+
+01:00:43.700 --> 01:00:44.700
+Thanks.
+
+01:00:45.480 --> 01:00:47.960
+This has been another episode of Talk Python To Me.
+
+01:00:48.120 --> 01:00:49.080
+Thank you to our sponsors.
+
+01:00:49.280 --> 01:00:50.560
+Be sure to check out what they're offering.
+
+01:00:50.740 --> 01:00:52.100
+It really helps support the show.
+
+01:00:52.560 --> 01:00:56.780
+This episode is brought to you by our Agentic AI Programming for Python course.
+
+01:00:57.280 --> 01:01:01.860
+Learn to work with AI that actually understands your code base and build real features.
+
+01:01:02.380 --> 01:01:05.760
+Visit talkpython.fm/agentic-ai.
+
+01:01:06.300 --> 01:01:11.700
+If you or your team needs to learn Python, we have over 270 hours of beginner and advanced
+
+01:01:11.700 --> 01:01:18.140
+courses on topics ranging from complete beginners to async code, Flask, Django, HTMX, and even
+
+01:01:18.140 --> 01:01:18.720
+LLMs.
+
+01:01:18.960 --> 01:01:21.380
+Best of all, there's no subscription in sight.
+
+01:01:21.820 --> 01:01:23.580
+Browse the catalog at talkpython.fm.
+
+01:01:24.220 --> 01:01:28.380
+And if you're not already subscribed to the show on your favorite podcast player, what
+
+01:01:28.380 --> 01:01:28.900
+are you waiting for?
+
+01:01:29.500 --> 01:01:31.380
+Just search for Python in your podcast player.
+
+01:01:31.480 --> 01:01:32.340
+We should be right at the top.
+
+01:01:32.720 --> 01:01:35.660
+If you enjoy that geeky rap song, you can download the full track.
+
+01:01:35.780 --> 01:01:37.680
+The link is actually in your podcast player's show notes.
+
+01:01:38.380 --> 01:01:39.800
+This is your host, Michael Kennedy.
+
+01:01:39.980 --> 01:01:41.300
+Thank you so much for listening.
+
+01:01:41.300 --> 01:01:42.300
+I really appreciate it.
+
+01:01:42.660 --> 01:01:43.480
+I'll see you next time.
+
+01:02:11.300 --> 01:02:11.960
+Bye.
+
+01:02:11.960 --> 01:02:12.960
+Thank you.
diff --git a/youtube_transcripts/541-zensical-a-modern-static-site-generator-transcript-original.vtt b/youtube_transcripts/541-zensical-a-modern-static-site-generator-transcript-original.vtt
new file mode 100644
index 0000000..85abfe2
--- /dev/null
+++ b/youtube_transcripts/541-zensical-a-modern-static-site-generator-transcript-original.vtt
@@ -0,0 +1,2959 @@
+WEBVTT
+
+00:00:00.880 --> 00:00:03.560
+Martin, welcome to Talk Python To Me. Great to have you here.
+
+00:00:04.280 --> 00:00:05.000
+Thanks for having me.
+
+00:00:06.080 --> 00:00:16.180
+I'm excited to talk about static sites and the next big platform for building them here in Python and beyond.
+
+00:00:16.660 --> 00:00:20.680
+So really excited to talk about Zensical. Am I saying that right?
+
+00:00:21.300 --> 00:00:22.820
+Yeah, pretty much. Zensical.
+
+00:00:23.480 --> 00:00:25.440
+Zensical. Okay. Great.
+
+00:00:25.440 --> 00:00:32.140
+Yeah, I know MKDocs, the Material for MKDocs has been really, really popular.
+
+00:00:33.040 --> 00:00:37.460
+And you all have made a big splash announcing this new project.
+
+00:00:38.020 --> 00:00:40.100
+So I'm really looking forward to diving into it.
+
+00:00:40.400 --> 00:00:44.420
+Before we do, though, let's just get a little bit of background on you.
+
+00:00:44.540 --> 00:00:45.020
+Who is Martin?
+
+00:00:45.660 --> 00:00:47.960
+Of course. So hi, my name is Martin Donath.
+
+00:00:48.760 --> 00:00:51.180
+Most people probably know me as Squidfunk.
+
+00:00:51.180 --> 00:00:56.300
+I've been an independent developer and consultant for the last 20 years now.
+
+00:00:56.960 --> 00:01:01.880
+And I mostly write in TypeScript, Python, and lately a lot of Rust.
+
+00:01:02.080 --> 00:01:04.380
+So I've become a huge fan of Rust, actually.
+
+00:01:05.280 --> 00:01:06.960
+I'm kind of a free spirit.
+
+00:01:07.400 --> 00:01:12.460
+So I love doing my own thing and building products from front to back, basically.
+
+00:01:12.780 --> 00:01:14.640
+So doing the front end as well as the back end.
+
+00:01:15.400 --> 00:01:18.940
+And for the past 15 years, I've contributed a lot to open source.
+
+00:01:18.940 --> 00:01:24.020
+As I already mentioned, my most popular project so far is Material for MKDocs.
+
+00:01:25.000 --> 00:01:31.960
+And it's, well, millions of people basically look at sites that are built with it every day.
+
+00:01:32.580 --> 00:01:36.540
+Yeah, well, and Zensical, my latest project, will hopefully go far beyond that.
+
+00:01:36.600 --> 00:01:37.800
+So we're working very hard on it.
+
+00:01:38.020 --> 00:01:39.180
+And this is why I'm here today.
+
+00:01:39.420 --> 00:01:41.020
+So excited to talk about it.
+
+00:01:42.160 --> 00:01:43.360
+Yeah, I am as well.
+
+00:01:43.920 --> 00:01:48.660
+And let's just start by admiring your website a little bit.
+
+00:01:48.660 --> 00:01:49.140
+Thanks.
+
+00:01:50.860 --> 00:01:54.500
+Brian and I spoke about this over on our Python Bytes podcast.
+
+00:01:55.800 --> 00:02:00.560
+And we kind of just got distracted just staring at the website.
+
+00:02:00.720 --> 00:02:05.000
+It's this beautiful flow of, I don't know, colors.
+
+00:02:05.120 --> 00:02:09.520
+It looks a little bit like a black hole worm, a white wormhole sort of experience.
+
+00:02:09.640 --> 00:02:09.940
+I don't know.
+
+00:02:10.000 --> 00:02:13.500
+What was the inspiration for this cool design?
+
+00:02:14.220 --> 00:02:16.260
+Yeah, this is actually a strange attractor.
+
+00:02:16.420 --> 00:02:17.800
+So this is something from physics.
+
+00:02:18.420 --> 00:02:20.620
+I'm not very, very proficient in physics.
+
+00:02:20.620 --> 00:02:25.580
+But those strange attractors, I've had a fascination with them for a very long time.
+
+00:02:26.360 --> 00:02:28.480
+And they follow very simple rules.
+
+00:02:28.640 --> 00:02:34.860
+So it's just three equations that define how the points move in three-dimensional space.
+
+00:02:36.140 --> 00:02:40.760
+And yeah, but still, with those simple rules, a very complex shape can emerge.
+
+00:02:40.760 --> 00:02:47.000
+And this for us actually symbolizes the process of evolving ideas through writing.
+
+00:02:47.220 --> 00:02:55.320
+So if you have slightly different conditions from the start, it's still orbiting around the same shape.
+
+00:02:55.320 --> 00:02:56.940
+But it might look a little bit different.
+
+00:02:56.940 --> 00:03:00.860
+And there's actually, I can share this now, there's actually a little Easter egg.
+
+00:03:00.940 --> 00:03:02.220
+Nobody has found it so far.
+
+00:03:02.440 --> 00:03:15.640
+So if you hover over the homepage on zensical.org with the mouse in the left bottom corner, you can actually change the coefficients of the animation.
+
+00:03:16.080 --> 00:03:20.280
+And if you do this, you can click on them and then you can use your cursor.
+
+00:03:20.280 --> 00:03:21.920
+I'm changing beta.
+
+00:03:22.120 --> 00:03:24.420
+We're running beta 0.22 right now.
+
+00:03:24.420 --> 00:03:26.220
+Oh, it really does change it.
+
+00:03:26.300 --> 00:03:26.500
+Yeah.
+
+00:03:26.580 --> 00:03:27.220
+Oh, my goodness.
+
+00:03:27.820 --> 00:03:27.960
+Yeah.
+
+00:03:28.080 --> 00:03:32.320
+So it takes a little time.
+
+00:03:32.560 --> 00:03:38.860
+But if you change the coefficients in a specific way, it might be completely chaotic and become unstable.
+
+00:03:39.300 --> 00:03:42.460
+So this is what I really find fascinating about those strange attractors.
+
+00:03:43.180 --> 00:03:44.720
+And it's also the inspiration for the logo.
+
+00:03:45.480 --> 00:03:47.940
+So we're building on this image a lot.
+
+00:03:49.840 --> 00:03:50.280
+Okay.
+
+00:03:50.400 --> 00:03:51.660
+I thought it was just a cool design.
+
+00:03:51.660 --> 00:03:56.400
+I didn't realize it had all this meaning and actual math and physics behind it.
+
+00:03:56.480 --> 00:03:57.360
+That's super cool.
+
+00:03:57.420 --> 00:03:57.500
+Yeah.
+
+00:03:57.500 --> 00:04:02.680
+I love chaos theory and all of this, these fractal types of ideas here.
+
+00:04:02.960 --> 00:04:04.600
+And yeah, it's super neat.
+
+00:04:05.780 --> 00:04:06.180
+Okay.
+
+00:04:06.260 --> 00:04:08.380
+So what is Zensical?
+
+00:04:09.120 --> 00:04:10.060
+Why did you build it?
+
+00:04:10.100 --> 00:04:11.400
+Why not just more Material?
+
+00:04:11.400 --> 00:04:14.980
+So there are a lot of questions in there, actually.
+
+00:04:15.320 --> 00:04:18.820
+Maybe let me just start by shortly speaking about what it is.
+
+00:04:19.140 --> 00:04:24.420
+So in very simple terms, it's a tool to build beautiful websites from a folder of text files.
+
+00:04:24.780 --> 00:04:28.420
+So you just write in Markdown and can generate a static site.
+
+00:04:29.020 --> 00:04:30.360
+You don't need a database for it.
+
+00:04:30.440 --> 00:04:34.980
+So to those that don't know what a static site is, you don't need a database or server.
+
+00:04:34.980 --> 00:04:40.860
+It's just static HTML, which means you just pip install zensical and you're ready to go within a few minutes.
+
+00:04:41.560 --> 00:04:43.840
+And it's fully open source, MIT licensed.
+
+00:04:44.560 --> 00:04:48.520
+And to maybe explain a little bit more about static sites.
+
+00:04:48.680 --> 00:04:52.040
+So the big benefit of it, you can host it for free in many places.
+
+00:04:52.040 --> 00:04:54.400
+For instance, on GitHub Pages or Cloudflare.
+
+00:04:54.400 --> 00:04:58.880
+And they're secure and fast by default because there's only static file serving involved.
+
+00:04:59.480 --> 00:04:59.900
+And Zensical.
+
+00:05:00.060 --> 00:05:08.280
+So we try to make it pretty with a modern design, many built-in features, and fun, according to the feedback of our users, which is kind of unusual for writing documentation.
+
+00:05:08.820 --> 00:05:09.880
+So, yeah.
+
+00:05:11.240 --> 00:05:11.720
+Yeah.
+
+00:05:11.900 --> 00:05:12.540
+Very cool.
+
+00:05:12.540 --> 00:05:26.280
+And if anyone's tried to manually create a static site, it quickly becomes a challenge if you're just writing.
+
+00:05:26.860 --> 00:05:29.200
+I say, hey, it's only five HTML pages.
+
+00:05:29.280 --> 00:05:30.700
+I can just write the HTML.
+
+00:05:30.920 --> 00:05:31.480
+You know what I mean?
+
+00:05:32.780 --> 00:05:37.740
+But, well, what if you want to have common navigation or you want to change the look and feel?
+
+00:05:38.620 --> 00:05:41.800
+Oh, well, now I've got to go edit that in five places, right?
+
+00:05:41.800 --> 00:05:53.320
+And so even just beyond one page, having something that generates the static site is super valuable, right?
+
+00:05:53.360 --> 00:06:01.640
+Because it'll generate the wrapper navigation, the common CSS, the footer, all those kinds of things, right?
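What a static site generator does can be boiled down to a few lines. This is a toy sketch, not how Zensical actually works (real tools use a full Markdown parser and a templating engine): read every Markdown file in a folder, wrap each one in a shared HTML shell with common navigation, and write plain .html files that any static host can serve.

```python
from pathlib import Path

# Shared HTML shell: every page gets the same nav wrapper.
SHELL = """<html><body>
<nav>{nav}</nav>
<main>{body}</main>
</body></html>"""

def render_markdown(text: str) -> str:
    """Toy Markdown conversion: only '# ' headings and paragraphs."""
    chunks = []
    for block in text.strip().split("\n\n"):
        if block.startswith("# "):
            chunks.append(f"<h1>{block[2:]}</h1>")
        else:
            chunks.append(f"<p>{block}</p>")
    return "\n".join(chunks)

def build_site(src: Path, out: Path) -> list[Path]:
    """Turn a folder of .md files into a folder of .html files."""
    pages = sorted(src.glob("*.md"))
    nav = " ".join(f'<a href="{p.stem}.html">{p.stem}</a>' for p in pages)
    out.mkdir(exist_ok=True)
    written = []
    for page in pages:
        html = SHELL.format(nav=nav, body=render_markdown(page.read_text()))
        target = out / f"{page.stem}.html"
        target.write_text(html)
        written.append(target)
    return written
```

Because the navigation is generated from the source folder, adding a sixth page updates the nav on every page in one rebuild, which is exactly the edit-in-five-places problem mentioned above.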
+
+00:06:02.720 --> 00:06:03.160
+Yes.
+
+00:06:03.240 --> 00:06:04.600
+So it depends on what you want to do.
+
+00:06:04.600 --> 00:06:11.360
+So, of course, if you have a small site, like a personal website or so, you can just write basic HTML if you're proficient in it.
+
+00:06:11.360 --> 00:06:18.540
+For instance, of the users of Material, only 7% of them are front-end developers.
+
+00:06:19.980 --> 00:06:23.660
+We will dive a little bit into how Zensical relates to Material later.
+
+00:06:23.660 --> 00:06:29.540
+And what Zensical is being used for primarily is documentation.
+
+00:06:29.920 --> 00:06:37.360
+So it builds on the docs-as-code philosophy, which means that you treat your documentation exactly like your source code.
+
+00:06:37.440 --> 00:06:38.860
+So you primarily write documentation.
+
+00:06:39.080 --> 00:06:43.380
+You don't want to fight front-end development problems.
+
+00:06:43.720 --> 00:06:46.820
+You just want to get the content out.
+
+00:06:46.820 --> 00:06:59.200
+And with this docs-as-code, the cool thing about it is you can use the same tools and processes and workflows as you use for code, like versioning and PRs to make changes.
+
+00:06:59.820 --> 00:07:07.620
+And the adoption is growing really fast, actually, among companies in recent years as they're moving away from proprietary tools to open-source solutions.
+
+00:07:07.620 --> 00:07:15.360
+So Zensical, or a static site generator in general, is for you if you just want to get your writing out.
+
+00:07:15.780 --> 00:07:19.340
+And, of course, you can also customize it and make it as pretty as you want.
+
+00:07:19.640 --> 00:07:24.240
+But you don't necessarily need to know HTML, CSS, and JavaScript.
+
+00:07:24.560 --> 00:07:26.520
+And that's quite practical.
+
+00:07:26.520 --> 00:07:32.100
+And you talked about writing, and you even have your metaphor with strange attractors.
+
+00:07:33.940 --> 00:07:42.980
+I personally find if I'm just in a clean space where it's really just about the ideas, I don't have to worry about the design.
+
+00:07:43.260 --> 00:07:47.620
+It makes it so much easier to just focus on the actual writing.
+
+00:07:47.860 --> 00:07:49.180
+You're in a markdown editor.
+
+00:07:49.700 --> 00:07:54.560
+My favorite is Typora, but you can use whatever variety that you want, right?
+
+00:07:54.560 --> 00:07:56.400
+And you're just there.
+
+00:07:56.700 --> 00:07:59.160
+You're hardly even worried about the formatting of the markdown.
+
+00:07:59.260 --> 00:07:59.860
+You're just writing.
+
+00:08:00.160 --> 00:08:04.440
+And I find that a very good creative space, I guess.
+
+00:08:06.080 --> 00:08:07.540
+Yeah, that's the beauty of markdown.
+
+00:08:07.900 --> 00:08:11.320
+So you can just write, as you mentioned.
+
+00:08:11.600 --> 00:08:16.140
+And how you, in the end, use it, you can still decide that afterwards.
+
+00:08:16.260 --> 00:08:19.020
+So if you want to build a website, if you want to create a PDF of it,
+
+00:08:19.560 --> 00:08:21.940
+if you just want to use it for internal note-taking or so.
+
+00:08:21.940 --> 00:08:26.760
+And this is the big benefit of markdown.
+
+00:08:26.860 --> 00:08:34.640
+It takes away a lot of the headache of having to remember a lot of markup in order to get your ideas out of the door.
+
+00:08:35.700 --> 00:08:38.520
+Can you actually put markup in it if you need to?
+
+00:08:39.000 --> 00:08:46.340
+For example, maybe you need a particular image, two of them side by side that are links,
+
+00:08:46.340 --> 00:08:48.780
+and you want them to open in a new tab if somebody clicks them.
+
+00:08:49.220 --> 00:08:53.620
+Can you set it into basically an unsafe mode and let it do embedded markup?
+
+00:08:54.620 --> 00:08:55.800
+Yeah, that's a great question.
+
+00:08:56.320 --> 00:08:57.380
+So, yes, it's possible.
+
+00:08:57.520 --> 00:09:00.040
+You can just use HTML within markdown.
+
+00:09:00.260 --> 00:09:04.460
+We currently depend on Python-Markdown, which we inherited from Material for MkDocs.
+
+00:09:04.460 --> 00:09:10.480
+We are gradually moving towards CommonMark, which, so just as context,
+
+00:09:10.680 --> 00:09:14.860
+Python-Markdown has some oddities when you use HTML within markdown.
+
+00:09:14.960 --> 00:09:19.600
+For instance, it won't replace relative URLs correctly.
+
+00:09:19.720 --> 00:09:21.200
+This is like an annoying thing.
+
+00:09:21.200 --> 00:09:28.520
+But once we move to CommonMark, we will also have predefined components that you can use
+
+00:09:28.520 --> 00:09:33.680
+because you can't express everything, like more complex things, in plain markdown.
+
+00:09:33.860 --> 00:09:38.180
+So there are only things like you can make text bold, you can have lists, tables, et cetera.
+
+00:09:38.240 --> 00:09:45.600
+But if it's more complex, as you mentioned, aligning two images or having an image with a caption or so,
+
+00:09:45.960 --> 00:09:47.200
+you need basically HTML.
+
+00:09:47.200 --> 00:09:47.720
+HTML.
+
+00:09:48.280 --> 00:09:52.000
+And this is possible already, but we will make it much easier in the future.
+
+00:09:52.220 --> 00:09:54.020
+The frontend world already knows this.
+
+00:09:54.300 --> 00:09:55.200
+So they use MDX.
+
+00:09:55.260 --> 00:09:59.480
+They've been using MDX for quite a while, which is a dialect on top of markdown,
+
+00:09:59.860 --> 00:10:04.380
+which adds more liberty with components and so on.
+
+00:10:04.380 --> 00:10:07.140
+So you can create reusable components that you can use.
+
+00:10:08.400 --> 00:10:08.520
+Yeah.
+
+00:10:08.780 --> 00:10:09.980
+But, yeah.
+
+00:10:10.200 --> 00:10:11.540
+So it's possible.
+
+00:10:12.360 --> 00:10:15.500
+Our users already do it.
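The HTML-in-Markdown passthrough being discussed here can be pictured with a toy converter: Markdown syntax gets translated, while raw HTML such as a `target="_blank"` link rides through untouched. This is only a sketch of the idea, not how Python-Markdown is actually implemented:

```python
import re

def md_to_html(text: str) -> str:
    """Toy Markdown renderer: converts **bold** spans to <strong>,
    but leaves raw embedded HTML untouched in the output."""
    return re.sub(r"\*\*(.+?)\*\*", r"<strong>\1</strong>", text)

html = md_to_html('Open in a **new tab**: <a href="/a.png" target="_blank">image</a>')
```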
+
+00:10:15.500 --> 00:10:20.040
+We also have some examples in the documentation, and we will make it much more powerful in the future.
+
+00:10:20.980 --> 00:10:21.180
+Yeah.
+
+00:10:21.380 --> 00:10:21.860
+Very nice.
+
+00:10:22.240 --> 00:10:28.220
+I do think regular markdown is just missing a few things.
+
+00:10:28.320 --> 00:10:29.720
+I love the simplicity of it.
+
+00:10:30.200 --> 00:10:32.800
+And hat tip, John Gruber, for creating it.
+
+00:10:32.800 --> 00:10:40.360
+But it's just like, I just need to maybe put a class here or just do a little, if I could just control this a little bit more,
+
+00:10:40.580 --> 00:10:43.560
+then you could basically escape HTML.
+
+00:10:43.900 --> 00:10:49.260
+With obviously being careful to not just recreate HTML with square brackets instead of angle brackets, right?
+
+00:10:49.260 --> 00:10:52.420
+Yeah, there's been a lot of work on Python-Markdown.
+
+00:10:52.500 --> 00:10:57.860
+So in Python-Markdown, there are some extensions that allow you to add classes at least to block elements.
+
+00:10:58.280 --> 00:11:03.180
+So in markdown, you need to distinguish between inline and block elements.
+
+00:11:03.320 --> 00:11:04.080
+Oh, no, it also works.
+
+00:11:04.080 --> 00:11:04.240
+Sorry.
+
+00:11:04.300 --> 00:11:06.480
+It also works on inline elements like links and so on.
+
+00:11:06.980 --> 00:11:08.300
+But this is special syntax.
+
+00:11:08.300 --> 00:11:12.280
+So Python-Markdown is a dialect that is not standardized like CommonMark.
+
+00:11:12.420 --> 00:11:16.240
+In CommonMark, it is not easily possible to add specific classes.
+
+00:11:16.580 --> 00:11:21.200
+But with CommonMark, as I mentioned, you have MDX, which is a de facto standard.
+
+00:11:21.320 --> 00:11:23.080
+I don't know if they've standardized it already.
+
+00:11:23.840 --> 00:11:25.320
+That allows for much, much more.
+
+00:11:26.320 --> 00:11:26.800
+Nice.
+
+00:11:28.320 --> 00:11:31.400
+So what is Zensical for?
+
+00:11:31.520 --> 00:11:34.980
+Is this a documentation-generating tool?
+
+00:11:34.980 --> 00:11:39.860
+Is it just an open-ended static site generator?
+
+00:11:41.400 --> 00:11:47.340
+What is possible and what is your goal or your target with this project?
+
+00:11:49.500 --> 00:11:53.260
+Yeah, so as I mentioned, right now we're focusing on documentation.
+
+00:11:53.660 --> 00:11:56.200
+Because this is the thing we're coming from.
+
+00:11:56.700 --> 00:11:59.240
+But we're building Zensical for much, much more.
+
+00:11:59.240 --> 00:12:05.980
+So our stretch goal is to have a fully-fledged knowledge management and documentation solution.
+
+00:12:06.840 --> 00:12:11.120
+There are already a lot of companies that use it internally for knowledge management.
+
+00:12:12.080 --> 00:12:16.480
+Basically, as an alternative to SaaS-based solutions like Confluence and Notion.
+
+00:12:16.900 --> 00:12:19.360
+We are aware that for this, we need WYSIWYG.
+
+00:12:19.480 --> 00:12:20.760
+So what you see is what you get.
+
+00:12:20.880 --> 00:12:23.240
+A visual editor that is also usable by non-technical people.
+
+00:12:23.240 --> 00:12:30.580
+And if you check out our roadmap and scroll down all the way, you will see it as a stretch goal.
+
+00:12:31.220 --> 00:12:34.580
+Which is basically something we're working towards.
+
+00:12:35.160 --> 00:12:41.040
+Because this would actually allow so many more people within organizations to use it.
+
+00:12:41.040 --> 00:12:52.880
+And in general, with Zensical, we focus on three key areas that make us different from other static site generators.
+
+00:12:52.880 --> 00:12:55.440
+Which is, well, a modern design.
+
+00:12:55.540 --> 00:12:57.580
+So, of course, some others also have a modern design.
+
+00:12:57.740 --> 00:13:04.740
+But within the Python ecosystem, some options might look a little bit dated.
+
+00:13:04.740 --> 00:13:08.880
+So we try to be a little bit more on the edge, actually.
+
+00:13:09.580 --> 00:13:12.280
+And it should be flexible and it should be fast.
+
+00:13:12.340 --> 00:13:13.340
+So those three things.
+
+00:13:13.420 --> 00:13:17.880
+Because the design, actually, is the thing that people notice first.
+
+00:13:18.520 --> 00:13:22.860
+So what we offer is a design that is customizable, brandable.
+
+00:13:23.040 --> 00:13:26.740
+You have tons of options with which you can change how navigation is laid out.
+
+00:13:28.200 --> 00:13:30.460
+You can also change colors, fonts, etc.
+
+00:13:30.460 --> 00:13:36.300
+And we have a lot of components that make it ready for technical writing.
+
+00:13:36.400 --> 00:13:38.520
+As you mentioned, you just want to start writing.
+
+00:13:38.860 --> 00:13:41.560
+So we have stuff like admonitions, tabs.
+
+00:13:42.160 --> 00:13:48.020
+And one very specific feature that we have is code annotations that we inherited from Material for MkDocs.
+
+00:13:48.080 --> 00:13:50.080
+Which is quite unique among static site generators.
+
+00:13:50.080 --> 00:13:57.460
+Which allows you to put a little bubble onto any line of code.
+
+00:13:57.960 --> 00:13:59.420
+You have to visit our documentation.
+
+00:13:59.420 --> 00:14:03.680
+This is our, you're currently browsing our, the other site.
+
+00:14:04.000 --> 00:14:04.660
+All right, all right.
+
+00:14:04.660 --> 00:14:05.000
+Hold on.
+
+00:14:05.140 --> 00:14:05.600
+I got it.
+
+00:14:05.640 --> 00:14:05.980
+Keep going.
+
+00:14:06.040 --> 00:14:06.780
+I'll get to stay.
+
+00:14:07.220 --> 00:14:07.680
+Right, right.
+
+00:14:07.720 --> 00:14:08.120
+No worries.
+
+00:14:08.640 --> 00:14:08.820
+Yeah.
+
+00:14:09.120 --> 00:14:11.140
+And there you have to search for code annotations.
+
+00:14:11.800 --> 00:14:19.160
+Yeah, so code annotations, which allow you to create a bubble in any line of code.
+
+00:14:19.360 --> 00:14:22.060
+And if you click that bubble, a tooltip opens.
+
+00:14:22.140 --> 00:14:24.180
+And within this tooltip, you can use any rich content.
+
+00:14:24.180 --> 00:14:28.360
+So you can have lists, any formatted markdown, tables, diagrams.
+
+00:14:29.620 --> 00:14:34.160
+Basically anything you can use anyway within markdown.
+
+00:14:34.680 --> 00:14:36.500
+And this is a very popular feature in Material.
+
+00:14:36.880 --> 00:14:38.840
+And so, of course, we brought it over.
+
+00:14:39.340 --> 00:14:41.080
+So users can still use it.
+
+00:14:41.080 --> 00:14:44.940
+So the second thing I talked about is it should be flexible.
+
+00:14:45.140 --> 00:14:47.600
+So what makes Zensical different is we have a modular architecture.
+
+00:14:48.020 --> 00:14:50.060
+Or say we're working towards a modular architecture.
+
+00:14:50.240 --> 00:14:51.760
+We're still a little, we're still in alpha.
+
+00:14:51.760 --> 00:14:55.540
+So we're close to finishing the module system.
+
+00:14:56.620 --> 00:15:01.620
+And in Zensical, it's modules all the way down, which means all core functionality is implemented as modules,
+
+00:15:01.620 --> 00:15:09.680
+which is different from other solutions where the plugin system sometimes is more or less an afterthought.
+
+00:15:09.860 --> 00:15:14.420
+So there's a plugin system added with specific hooks, extension points where you can hook into.
+
+00:15:14.420 --> 00:15:23.680
+And this might seem sufficient at first, but in the end, so for us, for instance, MkDocs in the end was a little bit limiting.
+
+00:15:24.260 --> 00:15:28.320
+And this allows you to basically swap, extend, replace all modules.
+
+00:15:28.480 --> 00:15:29.480
+You can use our modules.
+
+00:15:29.700 --> 00:15:31.660
+You can write your own, pull in third-party modules.
+
+00:15:31.980 --> 00:15:33.660
+And as I mentioned, Rust.
+
+00:15:33.860 --> 00:15:34.620
+So don't worry.
+
+00:15:34.820 --> 00:15:35.820
+You don't need to learn Rust.
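The "modules all the way down" idea, where core functionality and third-party extensions share one interface so any module can be swapped or replaced, could be pictured like this in Python. All names here are hypothetical; Zensical's real module system is different and lives in Rust:

```python
from typing import Callable

# Registry of named page transforms; core and third-party modules register alike.
MODULES: dict[str, Callable[[str], str]] = {}

def register(name: str) -> Callable[[Callable[[str], str]], Callable[[str], str]]:
    """Decorator that adds a module under a name; re-registering the name swaps it out."""
    def wrap(fn: Callable[[str], str]) -> Callable[[str], str]:
        MODULES[name] = fn
        return fn
    return wrap

@register("markdown")
def render_markdown(page: str) -> str:
    return f"<p>{page}</p>"

@register("minify")
def minify(page: str) -> str:
    return page.replace("\n", "")

def run_pipeline(page: str) -> str:
    """Run the page through every registered module, in registration order."""
    for module in MODULES.values():
        page = module(page)
    return page
```

Because the core "markdown" step sits in the same registry as everything else, replacing it is one re-registration rather than fighting fixed extension points.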
+
+00:15:36.120 --> 00:15:42.100
+You will also be able to write modules in Python because we are super happy users of PyO3, which is an absolutely amazing library.
+
+00:15:42.100 --> 00:15:48.280
+And PyO3 has really become a super important foundation of Python these days.
+
+00:15:48.400 --> 00:15:52.760
+It's almost like the C bindings for CPython.
+
+00:15:53.500 --> 00:15:53.740
+Exactly.
+
+00:15:54.200 --> 00:15:55.400
+So, yeah.
+
+00:15:55.560 --> 00:15:58.760
+So with PyO3, it allows us to have a Rust runtime.
+
+00:15:59.280 --> 00:16:07.000
+So all of the orchestration, the order in which things are run, threading, caching, parallelization, et cetera,
+
+00:16:07.000 --> 00:16:08.140
+all is happening in Rust.
+
+00:16:08.140 --> 00:16:13.980
+And we will provide Python bindings so that you still can use Python to write modules.
+
+00:16:14.380 --> 00:16:16.000
+And they're still running fast.
+
+00:16:16.620 --> 00:16:16.740
+Yeah.
+
+00:16:16.920 --> 00:16:19.200
+Which brings me to the last point where we're different.
+
+00:16:19.460 --> 00:16:21.500
+We have a very heavy focus on performance.
+
+00:16:21.800 --> 00:16:29.520
+So our goal is to let you start with one page because, of course, all documentation sites or projects start small.
+
+00:16:29.820 --> 00:16:32.860
+And let you scale that to something like 100,000 pages.
+
+00:16:32.860 --> 00:16:36.160
+How we do it is through differential builds.
+
+00:16:36.360 --> 00:16:39.740
+We have created our own runtime, which is called ZRX.
+
+00:16:40.340 --> 00:16:43.720
+And differential builds mean that we are only rebuilding what changed.
+
+00:16:43.800 --> 00:16:49.740
+So, for instance, if you only change the page title, only that page and all instances where the page title is used are being rebuilt.
+
+00:16:49.940 --> 00:16:53.060
+And this means that changes are visible in milliseconds and not minutes.
+
+00:16:53.060 --> 00:16:53.460
+Yeah.
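The differential-build idea, hash each page and re-render only what changed, can be sketched as follows. This is an illustration of the concept only, not the ZRX runtime itself, which also has to track cross-page dependencies like reused page titles:

```python
import hashlib

def fingerprint(content: str) -> str:
    """Content hash used to detect whether a page changed since the last build."""
    return hashlib.sha256(content.encode()).hexdigest()

def differential_build(pages, cache, render):
    """Re-render only the pages whose content hash differs from the cached one."""
    rebuilt = []
    for name, content in pages.items():
        digest = fingerprint(content)
        if cache.get(name) != digest:
            render(name, content)
            cache[name] = digest
            rebuilt.append(name)
    return rebuilt

# First run builds everything; the second run only rebuilds the changed page.
cache: dict[str, str] = {}
first = differential_build({"index": "v1", "about": "x"}, cache, lambda n, c: None)
second = differential_build({"index": "v2", "about": "x"}, cache, lambda n, c: None)
```

With most pages skipped, rebuild time scales with the size of the change rather than the size of the site, which is what makes millisecond feedback possible.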
+
+00:16:54.260 --> 00:16:54.660
+Yeah.
+
+00:16:55.940 --> 00:16:56.860
+That's super cool.
+
+00:16:57.860 --> 00:17:01.540
+And so I'm presuming the build system itself is Rust-based, right?
+
+00:17:02.220 --> 00:17:02.720
+Yeah, exactly.
+
+00:17:02.840 --> 00:17:03.940
+It's 100% Rust, yeah.
+
+00:17:04.440 --> 00:17:05.000
+Yeah, yeah.
+
+00:17:06.220 --> 00:17:10.600
+Coming from a Python background, what was that experience like building that?
+
+00:17:10.600 --> 00:17:20.440
+Yeah, so that's kind of a tricky question because I'm not really coming from a long history of Python.
+
+00:17:20.440 --> 00:17:22.700
+So I don't have a long Python background.
+
+00:17:23.500 --> 00:17:25.480
+I wrote mainly in TypeScript.
+
+00:17:26.240 --> 00:17:30.240
+And I only started writing Python in 2021.
+
+00:17:31.020 --> 00:17:36.300
+So this is actually the history, how Material started and how all of this unfolded.
+
+00:17:36.300 --> 00:17:40.300
+But I've written in several languages.
+
+00:17:41.240 --> 00:17:44.840
+So I also have written in C, Erlang, Ruby, Python, TypeScript.
+
+00:17:45.660 --> 00:17:47.180
+Rust was still extremely hard to learn.
+
+00:17:47.580 --> 00:17:51.040
+So I basically banged my head against the keyboard for a month.
+
+00:17:51.140 --> 00:17:54.500
+I wasn't making any progress at all because, yeah, you know, fighting with the borrow checker.
+
+00:17:54.920 --> 00:18:02.120
+So once you get past that and then, of course, lifetimes and higher-ranked trait bounds and some other features,
+
+00:18:02.120 --> 00:18:11.340
+I'm now some kind of like 3,000 or 4,000 hours in, something like that, it gets really good.
+
+00:18:11.340 --> 00:18:20.860
+So I think Rust is seriously one of the best languages ever made because it allows you to express ideas with extreme clarity.
+
+00:18:21.860 --> 00:18:28.760
+And this is due to the very good type system, of course, and you get bare metal performance.
+
+00:18:28.760 --> 00:18:36.700
+And so I find it kind of insane having a language like Rust because it's so easy to write once you're used to it.
+
+00:18:36.700 --> 00:18:42.020
+You will be very productive and still have bare metal performance.
+
+00:18:42.440 --> 00:18:43.280
+It's completely insane.
+
+00:18:44.100 --> 00:18:44.780
+Yeah, that's wild.
+
+00:18:45.060 --> 00:18:50.820
+But it's got a little bit of a learning curve compared to like Python or TypeScript or something like that.
+
+00:18:51.840 --> 00:18:52.140
+Yeah.
+
+00:18:52.140 --> 00:18:57.160
+So I had, I think, 18 years of experience with many languages.
+
+00:18:58.140 --> 00:19:03.660
+As I mentioned, I also did a lot of C and I still found it very hard to learn.
+
+00:19:04.460 --> 00:19:04.660
+Yeah.
+
+00:19:04.660 --> 00:19:07.580
+But it's worth it.
+
+00:19:07.960 --> 00:19:08.820
+It's worth it.
+
+00:19:08.880 --> 00:19:14.860
+And my recommendation probably would be to learn it on something that you really care about,
+
+00:19:14.860 --> 00:19:23.720
+that you want to build, because otherwise you will probably lose the drive since you're running against those walls.
+
+00:19:24.300 --> 00:19:28.360
+Maybe for you or for somebody else, it's much easier to learn.
+
+00:19:28.660 --> 00:19:33.300
+So maybe I'm just a bad example in that I needed so long.
+
+00:19:33.400 --> 00:19:33.660
+I don't know.
+
+00:19:33.860 --> 00:19:38.940
+But because after that month, it wasn't that I was completely up to speed.
+
+00:19:38.940 --> 00:19:45.920
+So it was just that I was making very, very tiny progress, at least progress, because for a month I wasn't making progress at all.
+
+00:19:47.020 --> 00:19:47.580
+Yeah.
+
+00:19:47.680 --> 00:19:47.840
+Wow.
+ +00:19:49.780 --> 00:19:57.000 +The next show that I'm doing after this one, which actually is in real clock time, wall time, + +00:19:57.140 --> 00:20:02.240 +it's happening in like two hours or less from now is with Samuel Colvin from Pydantic. + +00:20:03.380 --> 00:20:03.500 +Yeah. + +00:20:03.580 --> 00:20:07.020 +Talking about Monty, a Python runtime. + +00:20:07.020 --> 00:20:12.640 +He and his team are rewriting in Rust, specifically targeting AI. + +00:20:12.900 --> 00:20:15.040 +So the Rust theme will continue. + +00:20:15.320 --> 00:20:21.400 +It's definitely a very – it caught me a little bit off guard, like how much people love it. + +00:20:21.480 --> 00:20:29.220 +But it's also – it makes perfect sense that we want this nice modern language for writing lower level things, + +00:20:29.580 --> 00:20:31.560 +even if it plugs into Python, right? + +00:20:31.560 --> 00:20:31.960 +Yeah. + +00:20:32.160 --> 00:20:32.560 +Yeah. + +00:20:32.640 --> 00:20:39.360 +So the fun thing is I also talked to Samuel a long time ago, and he was the one recommending to me to write it in Rust. + +00:20:40.100 --> 00:20:40.420 +Okay. + +00:20:40.980 --> 00:20:41.760 +So it's his fault. + +00:20:42.180 --> 00:20:45.900 +It's one of the reasons I – yeah, definitely I looked into it. + +00:20:46.920 --> 00:20:47.180 +Nice. + +00:20:47.240 --> 00:20:47.420 +Okay. + +00:20:47.560 --> 00:20:51.780 +And it made a lot of sense also during the time, the progress we're making and so on, + +00:20:51.780 --> 00:20:55.280 +and the walls we're hitting, that's to reconsider learning Rust. + +00:20:56.400 --> 00:20:57.200 +Best investment. + +00:20:57.880 --> 00:20:58.220 +Yeah. + +00:20:58.400 --> 00:20:58.800 +Amazing. + +00:20:59.160 --> 00:20:59.520 +Amazing. + +00:21:00.140 --> 00:21:04.700 +So I want to dig into your component structure and some of those things. + +00:21:04.800 --> 00:21:08.720 +But maybe before we do, let's talk about the origins a little bit. 
+
+00:21:08.720 --> 00:21:14.340
+So let's talk about how you went from Material for MkDocs.
+
+00:21:15.740 --> 00:21:17.100
+Why even change?
+
+00:21:17.160 --> 00:21:18.800
+Why not just more Material?
+
+00:21:20.660 --> 00:21:21.100
+Yeah.
+
+00:21:21.200 --> 00:21:25.140
+So this is a great question, and this is a little bit of a story.
+
+00:21:25.240 --> 00:21:26.980
+So there are several stories in there, actually.
+
+00:21:27.220 --> 00:21:27.360
+Yeah.
+
+00:21:27.360 --> 00:21:29.500
+So it's 10 years.
+
+00:21:29.640 --> 00:21:35.200
+I'll try to make it as compact as possible while keeping the most important things.
+
+00:21:35.480 --> 00:21:39.740
+So to those who don't know, Material for MkDocs is a very popular documentation framework.
+
+00:21:39.860 --> 00:21:41.420
+It's used by tens of thousands of projects.
+
+00:21:42.040 --> 00:21:45.080
+There are prominent users like AWS, Microsoft, OpenAI.
+
+00:21:46.320 --> 00:21:51.340
+Also, large open source projects use it, like, for instance, FastAPI, uv, Knative.
+
+00:21:51.620 --> 00:21:57.080
+And it's built on top of MkDocs, as the name says, which became one of the most popular static site generators.
+
+00:21:57.360 --> 00:21:59.720
+And it also eventually became my job.
+
+00:22:00.100 --> 00:22:02.220
+So I could make it my job.
+
+00:22:02.280 --> 00:22:05.740
+I could work in open source and earn a living somehow.
+
+00:22:06.040 --> 00:22:07.900
+I'm getting there, how that worked.
+
+00:22:09.320 --> 00:22:12.700
+But at some point, we needed a new foundation.
+
+00:22:13.120 --> 00:22:17.180
+We've kind of outgrown MkDocs because it was not evolving at the pace that we needed.
+
+00:22:17.380 --> 00:22:18.920
+So we began exploring alternatives.
+
+00:22:19.560 --> 00:22:21.000
+And, yeah.
+
+00:22:21.100 --> 00:22:23.580
+So there's a lot of lessons learned in Material.
+
+00:22:23.580 --> 00:22:27.320
+So let me shortly maybe talk about how it started.
+
+00:22:27.860 --> 00:22:32.180
+Because it started as a side project in 2015, like many things start.
+
+00:22:32.420 --> 00:22:38.740
+Because I wanted to release actually a C library, a zero-copy protocol buffers library I wrote called Protobluff.
+
+00:22:39.340 --> 00:22:42.480
+But then I realized that it needed more than a readme.
+
+00:22:42.660 --> 00:22:48.620
+So I looked at the existing static site generators, which were Hugo, Jekyll, Sphinx, MkDocs, something like that.
+
+00:22:49.240 --> 00:22:50.720
+And they all looked a little bit dated.
+
+00:22:51.340 --> 00:22:52.240
+I'm not a designer.
+
+00:22:52.440 --> 00:22:54.180
+But I wanted something more modern.
+
+00:22:54.320 --> 00:22:59.680
+And Google was pushing Material Design quite hard for app development at the time.
+
+00:23:00.060 --> 00:23:02.460
+And I've also seen it being used on the web.
+
+00:23:02.580 --> 00:23:04.340
+So I thought, well, maybe combine this.
+
+00:23:05.300 --> 00:23:06.980
+I quickly settled on MkDocs.
+
+00:23:07.040 --> 00:23:07.700
+It was easy to use.
+
+00:23:07.760 --> 00:23:08.360
+Simple templating.
+
+00:23:09.560 --> 00:23:10.920
+Enough for a side project, basically.
+
+00:23:11.120 --> 00:23:12.020
+So it was a side project.
+
+00:23:12.540 --> 00:23:13.440
+Did what most devs do.
+
+00:23:13.440 --> 00:23:14.360
+Checked the license.
+
+00:23:14.360 --> 00:23:17.100
+But didn't do any further due diligence.
+
+00:23:17.560 --> 00:23:21.820
+So I even put MkDocs in the name to show the connection, which is common for themes.
+
+00:23:22.080 --> 00:23:26.300
+And that actually turned out to be one of the biggest decisions I made in my career.
+
+00:23:26.520 --> 00:23:30.620
+Since I was basing my complete work on something I don't control.
+
+00:23:31.320 --> 00:23:35.520
+And it shaped the next 10 years of all of the work I was doing.
+
+00:23:35.600 --> 00:23:38.660
+And it's actually the reason why Zensical exists today.
+
+00:23:39.720 --> 00:23:39.980
+I see.
+
+00:23:39.980 --> 00:23:47.140
+So after I started developing it, I, like nine months later, released the first version.
+
+00:23:47.280 --> 00:23:48.040
+And it immediately got users.
+
+00:23:48.220 --> 00:23:49.280
+A lot of feature requests.
+
+00:23:50.140 --> 00:23:52.140
+And, you know, it was a side project.
+
+00:23:52.260 --> 00:23:54.380
+So I was doing client work at the time.
+
+00:23:54.600 --> 00:24:00.240
+As I mentioned, I've been like a consultant and developer, freelancer for 20 years.
+
+00:24:01.480 --> 00:24:04.080
+And I only had Sundays to work on it.
+
+00:24:04.520 --> 00:24:06.900
+Which at first was sufficient.
+
+00:24:07.180 --> 00:24:09.700
+But the more popular it got, the more maintenance there came.
+
+00:24:09.700 --> 00:24:12.920
+So it kind of crept into my mornings and evenings.
+
+00:24:12.920 --> 00:24:19.300
+And I was doing triage, like answering questions and trying to fix bugs before I went to the client.
+
+00:24:19.300 --> 00:24:22.460
+And it was getting harder and harder to justify in front of my partner, actually.
+
+00:24:22.460 --> 00:24:25.100
+Because I was doing it in my spare time.
+
+00:24:25.660 --> 00:24:30.880
+And so I did what eventually happens with all projects that started as side projects,
+
+00:24:30.880 --> 00:24:36.220
+where you don't have the full time to work on them, yeah.
+
+00:24:36.300 --> 00:24:39.960
+So what basically happens is you start turning down feature requests.
+
+00:24:40.320 --> 00:24:42.600
+And many open source projects don't cross this line.
+
+00:24:42.720 --> 00:24:43.940
+And for me, it was a first.
+
+00:24:44.520 --> 00:24:46.440
+So, yeah.
+
+00:24:46.520 --> 00:24:50.940
+And also, additionally, so I mentioned before that I started writing Python in 2021.
+
+00:24:51.320 --> 00:24:52.940
+At the time, I was focusing.
+
+00:24:53.800 --> 00:24:55.740
+So I only had Sundays to work on it.
+
+00:24:55.780 --> 00:24:56.480
+I didn't know Python.
+
+00:24:56.780 --> 00:25:00.080
+So I said that, okay, I will focus on the templating stuff.
+
+00:25:00.200 --> 00:25:02.640
+I will do the HTML, CSS, JavaScript, all of this, make it beautiful.
+
+00:25:03.000 --> 00:25:07.300
+And try to solve as many problems as possible in the front end.
+
+00:25:07.740 --> 00:25:09.760
+But I won't start learning Python.
+
+00:25:09.840 --> 00:25:13.420
+Because it wasn't a language that I was using at that time.
+
+00:25:13.420 --> 00:25:16.320
+And I couldn't make up the time for it.
+
+00:25:16.420 --> 00:25:18.120
+So that's where I drew the line.
+
+00:25:19.780 --> 00:25:22.340
+It's probably going to be a fad, that Python thing anyway.
+
+00:25:23.620 --> 00:25:24.780
+I don't think so.
+
+00:25:25.820 --> 00:25:31.200
+Well, at the time, in 2015, it wasn't clear that it was going to be as popular as it was.
+
+00:25:31.640 --> 00:25:33.000
+As it is now, right?
+
+00:25:33.060 --> 00:25:36.060
+It's really, it started to become popular then.
+
+00:25:36.640 --> 00:25:38.700
+But it's really taken over the world.
+
+00:25:39.400 --> 00:25:39.840
+Absolutely.
+
+00:25:40.680 --> 00:25:41.400
+For a lot of reasons.
+
+00:25:41.400 --> 00:25:43.240
+Of course, yeah.
+
+00:25:44.040 --> 00:25:49.540
+I think one of the main reasons is because it's very popular in the ML community.
+
+00:25:49.720 --> 00:25:53.020
+And all of the LLM AI work that's happening and so on made it extremely popular.
+
+00:25:54.240 --> 00:26:00.660
+And I also think that Rust is doing a very good job of keeping it that way.
+
+00:26:00.960 --> 00:26:06.320
+Because finally, you have a very easy way to offload work to native code.
+
+00:26:06.320 --> 00:26:11.240
+Which is much easier than fiddling with C and C++ and void pointers and whatever.
+
+00:26:11.400 --> 00:26:14.760
+So as I mentioned, PyO3 is just an absolutely amazing library.
+
+00:26:14.880 --> 00:26:16.720
+It's so easy to write Rust code.
+
+00:26:17.520 --> 00:26:18.520
+Yeah, I think you're right.
+
+00:26:18.660 --> 00:26:22.640
+I think Rust has really provided an important escape hatch for it.
+
+00:26:22.680 --> 00:26:23.440
+I wrote it this way.
+
+00:26:23.500 --> 00:26:24.260
+It's not fast enough.
+
+00:26:24.440 --> 00:26:27.580
+Like, well, this part, we're going to make it as fast as it can be, basically.
+
+00:26:28.280 --> 00:26:28.460
+Yeah.
+
+00:26:29.460 --> 00:26:29.820
+Yeah.
+
+00:26:30.100 --> 00:26:30.940
+So...
+
+00:26:30.940 --> 00:26:32.600
+Sorry, I interrupted you.
+
+00:26:32.640 --> 00:26:32.900
+Keep going.
+
+00:26:32.980 --> 00:26:33.440
+No worries.
+
+00:26:33.540 --> 00:26:33.820
+No worries.
+
+00:26:34.080 --> 00:26:34.640
+Yeah, no, no.
+
+00:26:35.220 --> 00:26:38.960
+Yeah, so as I mentioned, I tried to keep it basically afloat for the first four years.
+
+00:26:40.080 --> 00:26:42.560
+And at the time, I didn't see the potential at all.
+
+00:26:42.680 --> 00:26:45.460
+It was just a theme, not a kind of product or so.
+
+00:26:45.860 --> 00:26:48.520
+But yet I felt responsible and kept on maintaining it.
+
+00:26:48.520 --> 00:26:52.220
+And my developer friends didn't understand why I was doing that.
+
+00:26:52.620 --> 00:26:53.020
+So...
+
+00:26:53.020 --> 00:26:57.660
+But for me, it was like, you know, it was kind of cool because I had a growing project.
+
+00:26:57.840 --> 00:26:58.680
+I had no immediate plans.
+
+00:26:58.800 --> 00:26:59.120
+I don't know.
+
+00:26:59.420 --> 00:27:02.540
+Let's see where I can take it.
+
+00:27:03.340 --> 00:27:04.900
+And yeah, so...
+
+00:27:04.900 --> 00:27:09.000
+And with this steady and slow growth over the years, then companies and organizations started using it.
+
+00:27:09.000 --> 00:27:17.200
+So they were basing their public-facing documentation on me, like the guy that maybe works on this project on a Sunday.
+ +00:27:18.160 --> 00:27:23.420 +And yet I felt responsible enough to trying to fix the bugs reported as quickly as possible. + +00:27:24.260 --> 00:27:24.400 +Yeah. + +00:27:25.000 --> 00:27:28.140 +And yeah, then in 2020 actually came the turning point. + +00:27:28.220 --> 00:27:31.460 +So when I was working on version five of it, I shared my progress publicly as I did before. + +00:27:31.520 --> 00:27:33.040 +And somebody mentioned a donate button. + +00:27:33.040 --> 00:27:42.600 +So I think the wording was something like, so that I can order pizza to survive the long Sunday coding sessions. + +00:27:44.040 --> 00:27:50.760 +But I heard from another developer who did this on his project, successful project for five years, a donate button. + +00:27:51.040 --> 00:27:52.260 +And he made $90. + +00:27:52.620 --> 00:27:56.360 +So I immediately said, that's not going to work. + +00:27:56.480 --> 00:27:59.760 +But I said, let's try an Amazon wish list. + +00:27:59.760 --> 00:28:08.400 +You know, I just put some stuff on there and maybe if somebody thinks my work is useful, then he can order me, like, make me a present, something, send me a present. + +00:28:09.240 --> 00:28:13.120 +So, yeah, and I basically received everything on that wish list. + +00:28:13.420 --> 00:28:14.420 +It was completely insane. + +00:28:14.540 --> 00:28:16.760 +So there were two consecutive days that felt like Christmas. + +00:28:17.000 --> 00:28:17.940 +I even put like... + +00:28:17.940 --> 00:28:20.760 +So I put some, you know, books and... + +00:28:21.360 --> 00:28:23.320 +But then also a single malt. + +00:28:23.520 --> 00:28:25.720 +I love Scottish single malt. + +00:28:26.720 --> 00:28:28.720 +It was a whiskey that cost $120. + +00:28:28.720 --> 00:28:30.940 +And I received that as well. + +00:28:31.500 --> 00:28:33.560 +So it was like, what's happening? + +00:28:34.380 --> 00:28:37.320 +And that led me to start thinking actually about demographics. 
+
+00:27:38.020 --> 00:27:43.100
+So that I needed to better understand the audience of Material for MkDocs.
+
+00:27:43.420 --> 00:27:44.580
+And I did a poll.
+
+00:27:44.880 --> 00:27:46.740
+And the results were absolutely eye-opening.
+
+00:27:46.980 --> 00:27:52.220
+I mentioned before, only 7% of users are front-end developers.
+
+00:27:52.480 --> 00:27:53.060
+Which means...
+
+00:27:53.060 --> 00:27:54.640
+And Material is a front-end heavy project.
+
+00:27:54.640 --> 00:27:59.340
+So I kind of had an edge there in the Python space.
+
+00:27:59.920 --> 00:28:01.940
+Because, yeah, you know, it's based on Python.
+
+00:28:02.080 --> 00:28:08.320
+So front-end developers that write in JavaScript, they rather go for something like Docusaurus or something React-based or whatever.
+
+00:28:09.000 --> 00:28:11.100
+And technical writers were quite happy with the project.
+
+00:28:11.420 --> 00:28:13.640
+I didn't even know technical writers existed.
+
+00:28:13.640 --> 00:28:17.560
+So I had no clue that this job, that this is a job.
+
+00:28:17.860 --> 00:28:21.380
+Because I thought at the time, and it's in hindsight completely naive, of course.
+
+00:28:21.540 --> 00:28:24.820
+I thought that as a developer, you need to write the documentation, you know.
+
+00:28:25.300 --> 00:28:30.660
+So I learned about that and accidentally built a product for technical writers.
+
+00:28:31.240 --> 00:28:36.960
+And by the way, when I say product, I mean something that is not necessarily something you pay for.
+
+00:28:37.020 --> 00:28:38.600
+But something that doesn't feel engineered.
+
+00:28:38.600 --> 00:28:44.360
+So something that is like polished and designed and that you actually want to use.
+
+00:28:45.920 --> 00:28:46.340
+And, yeah.
+
+00:28:46.580 --> 00:28:51.280
+So I had a product that has like product-market fit.
+
+00:28:51.600 --> 00:28:54.040
+But at the time, I didn't earn any money off it.
+

00:29:54.440 --> 00:29:56.600
So at the same time, I read about Sponsorware.

00:29:57.340 --> 00:30:00.380
And this, like, I'm not sure if you heard of it.

00:30:00.420 --> 00:30:03.680
But it's like a new model of monetization for open source.

00:30:03.740 --> 00:30:04.900
At the time, it was quite new.

00:30:04.900 --> 00:30:07.560
So that you can get paid for your work.

00:30:07.740 --> 00:30:14.760
So you can, so some developers, for instance, they sell course material or access to gated content or code or nothing at all.

00:30:14.920 --> 00:30:22.380
So if you have a popular project, you can just try to raise sponsorships, and some companies are very generous when it comes to open source.

00:30:23.000 --> 00:30:28.820
And what we did with Material was we gave away early access to the latest features to the sponsors.

00:30:28.820 --> 00:30:31.620
And each feature was tied to a funding goal.

00:30:31.720 --> 00:30:35.060
And when that funding goal was met, it became free for everyone.

00:30:35.380 --> 00:30:40.860
So it was like kind of a funded feature development in multiple stages.

00:30:41.400 --> 00:30:43.420
And that's what I thought of it.

00:30:44.320 --> 00:30:44.880
Sorry?

00:30:45.100 --> 00:30:45.180
Yeah.

00:30:45.760 --> 00:30:46.780
That's super clever.

00:30:47.000 --> 00:30:52.480
I really love the idea of providing something for the sponsors.

00:30:52.480 --> 00:30:58.600
But still not turning it into, well, here's a paid version of our product and here's the open source version.

00:30:58.820 --> 00:31:07.140
But there's always this tension of how do you reward the people who support you without undermining the open source project?

00:31:07.340 --> 00:31:08.460
And that's a clever angle.

00:31:08.460 --> 00:31:12.020
Yeah, so that's extremely challenging.

00:31:12.780 --> 00:31:15.960
So as I'm telling this, so this is what I came up with.
+

00:31:16.180 --> 00:31:18.720
And I thought maybe it could work, something like that.

00:31:18.760 --> 00:31:22.660
And again, my developer friends, they said, well, never work.

00:31:22.780 --> 00:31:24.340
Nobody will pay for open source.

00:31:24.440 --> 00:31:25.020
You're insane.

00:31:25.660 --> 00:31:27.200
Spoiler alert, it did work.

00:31:27.360 --> 00:31:31.380
And in the end, we made 200K a year of it and could build a team and everything.

00:31:31.520 --> 00:31:34.140
So I know in Silicon Valley terms, this is probably minimum wage.

00:31:34.140 --> 00:31:39.760
But in Europe, it's quite an amount with which you can work very well.

00:31:40.700 --> 00:31:44.900
And yeah, so I started this program in 2020 and it grew steadily.

00:31:45.340 --> 00:31:49.900
And it finally allowed me to work on features outside of the Sunday.

00:31:50.100 --> 00:31:55.000
So invest more hours into it and finally learn Python in 2021.

00:31:55.000 --> 00:32:05.060
Wrote my first plugin and started hacking on the MkDocs features that, well, that we tried to upstream but got turned down.

00:32:05.140 --> 00:32:08.920
Where the maintainer said, ah, it's maybe not a good fit or we don't have the time for it.

00:32:09.520 --> 00:32:12.140
And yeah, in total, I wrote 12 MkDocs plugins.

00:32:12.900 --> 00:32:18.060
So it started as a theme, but it turned into a popular, sorry, into a powerful docs framework in the end.

00:32:18.060 --> 00:32:23.340
And this worked quite well for several years until it didn't anymore.

00:32:23.740 --> 00:32:27.720
And that's the reason why Zensical then came into being.

00:32:28.640 --> 00:32:35.960
So the way it didn't work is that, like, just where you want to take it started to diverge from MkDocs

00:32:35.960 --> 00:32:40.720
or you couldn't get your changes upstreamed or committed back?
+

00:32:40.720 --> 00:32:48.060
So the thing was that MkDocs was not evolving as we needed it.

00:32:49.000 --> 00:32:52.600
So historically, MkDocs had a sequence of single maintainers.

00:32:53.000 --> 00:32:59.240
And as far as I know, all of them worked on it in their spare time because they had regular jobs.

00:32:59.940 --> 00:33:02.800
And Material was evolving quickly because, you know, we had funding.

00:33:03.180 --> 00:33:05.700
We could invest much more time in it.

00:33:05.700 --> 00:33:11.000
We could, of course, move much faster than an open source project that is only maintained in the spare time.

00:33:11.760 --> 00:33:12.960
And so it was changing too slowly.

00:33:13.100 --> 00:33:19.780
So we started a lot of discussions on necessary API changes because for many users, Material for MkDocs was MkDocs.

00:33:20.040 --> 00:33:27.660
So we were kind of like the storefront where most of the issues and, like, bug reports and feature requests came in

00:33:27.660 --> 00:33:34.620
because many people are using Material for MkDocs and, with this, MkDocs, basically.

00:33:35.700 --> 00:33:39.960
And the main challenges that we faced were performance and plugin orchestration.

00:33:40.060 --> 00:33:41.280
I mentioned I wrote 12 plugins.

00:33:41.940 --> 00:33:45.580
And it's very hard to make them cooperate.

00:33:46.320 --> 00:33:52.940
And if you look at any popular MkDocs plugin's issue tracker, you will find issues that go something like,

00:33:53.340 --> 00:33:55.140
well, this plugin is incompatible with this plugin.

00:33:55.700 --> 00:34:00.220
Well, if I change the order of the plugins in the configuration, this and this happens.

00:34:00.220 --> 00:34:08.020
So, and both of those problems were brought to us again and again by the users with which we talked.

00:34:08.360 --> 00:34:10.540
And so, you know, it was coming up a lot.
+

00:34:11.260 --> 00:34:14.780
Then suddenly after nine years, the original maintainer returned to MkDocs.

00:34:14.900 --> 00:34:18.340
And we were super optimistic because the project was, like, maintained again.

00:34:18.420 --> 00:34:20.420
He also started a sponsorship program.

00:34:20.580 --> 00:34:23.640
We upstreamed some of our funding immediately and supported his work.

00:34:23.640 --> 00:34:28.240
So, before that, there was no way to sponsor MkDocs.

00:34:28.940 --> 00:34:33.640
And the moment this went live, we immediately supported it.

00:34:34.200 --> 00:34:36.740
And some PRs were finally merged and issues were closed.

00:34:37.000 --> 00:34:45.580
But, yeah, then things went silent and he started working basically in the quiet.

00:34:45.580 --> 00:34:48.600
And three months later, we were invited to a video call.

00:34:49.380 --> 00:34:56.880
So, we as maintainers, so I as the maintainer of Material for MkDocs and some other key ecosystem maintainers.

00:34:57.240 --> 00:35:05.680
And we learned that MkDocs, that the plans for MkDocs 2.0 were completely different from what existed.

00:35:05.680 --> 00:35:15.880
So, what currently exists, MkDocs 1.x, which primarily means no plugin API and customization via templating alone.

00:35:16.320 --> 00:35:22.460
So, we already knew this is not enough because that's what we've done the first four years where, as I mentioned, I was only doing the templating.

00:35:23.200 --> 00:35:26.140
And some things you can't just do with templates.

00:35:26.140 --> 00:35:36.340
For instance, having a tag support where you need to pull in different tags from different pages and then render them on another page or so.

00:35:36.400 --> 00:35:40.100
So, you need synchronization efforts and you can't do this with templating.

00:35:40.820 --> 00:35:42.260
By the way, all of this information is public.
+

00:35:42.540 --> 00:35:44.940
So, you can read it on the MkDocs issue tracker.

00:35:45.360 --> 00:35:47.860
So, yeah, I'm not telling anything secret or so.

00:35:48.540 --> 00:35:51.920
Yeah, so it's a completely different direction than the one that we worked on.

00:35:51.920 --> 00:35:56.940
And we raised objections in the call, but, yeah, still they were dismissed.

00:35:57.440 --> 00:36:02.900
So, MkDocs 2.0, as it looks right now, is incompatible with Material for MkDocs.

00:36:03.220 --> 00:36:08.600
300 plugins in the ecosystem will become useless and tens of thousands of projects will be affected.

00:36:08.760 --> 00:36:13.980
So, for us, we had absolutely no choice but to start building something.

00:36:13.980 --> 00:36:22.380
So, to make something of this, because at the time, we already had 50,000 projects, 50,000 public projects, depending on us.

00:36:22.900 --> 00:36:28.300
We were talking to enterprise users and we knew that this number is much, much higher.

00:36:28.440 --> 00:36:33.100
So, for instance, one of our professional users, they already also sponsored Material.

00:36:33.660 --> 00:36:37.760
They have two and a half thousand projects internally.

00:36:37.760 --> 00:36:40.800
So, only one company.

00:36:41.160 --> 00:36:52.760
And they have a dedicated team of individuals that maintain their customizations on top of Material for MkDocs for all of the teams inside the company.

00:36:52.860 --> 00:36:53.620
It's a very big company.

00:36:53.980 --> 00:36:58.320
So, that's what you could infer from the...

00:36:58.320 --> 00:37:01.180
I could believe it.

00:37:01.540 --> 00:37:03.040
I couldn't believe it at all.

00:37:03.180 --> 00:37:04.340
So, absolutely insane.

00:37:04.940 --> 00:37:07.320
Yeah, so, as I mentioned, we had no choice.
+

00:37:07.320 --> 00:37:15.320
So, what we did was we immediately went back to the drawing board with the learnings from the almost 10 years that passed since I started Material.

00:37:16.060 --> 00:37:20.440
We built a lot of prototypes in TypeScript and Python, iterated on them.

00:37:20.540 --> 00:37:28.880
We did a lot of conceptual work and realized within weeks what could actually be done with a radically different architecture.

00:37:28.880 --> 00:37:33.600
Because writing 12 plugins, I know the ins and outs of MkDocs.

00:37:33.600 --> 00:37:41.720
So, I had to do a lot of hacks, for instance, to make the blog plugin of Material work with the way navigation works in MkDocs.

00:37:41.720 --> 00:37:48.840
And the number one complaint, as I mentioned, was MkDocs is slow and it doesn't scale.

00:37:49.000 --> 00:37:51.860
So, like fixing a typo, you're doing a full rebuild.

00:37:52.040 --> 00:37:53.280
And this can take minutes.

00:37:53.440 --> 00:37:57.200
So, our design work centered exactly around this problem.

00:37:58.080 --> 00:38:03.940
And after a short while, so, we knew exactly what MkDocs should look like.

00:38:03.940 --> 00:38:06.660
And we didn't want to let our users down.

00:38:06.940 --> 00:38:09.200
And so, in essence, we had two options.

00:38:09.880 --> 00:38:11.680
We know what it should look like.

00:38:11.860 --> 00:38:14.580
We could fork it or we could start from scratch.

00:38:15.300 --> 00:38:21.580
And forking is not really possible because of the way Python dependencies work.

00:38:21.680 --> 00:38:24.660
So, all of the plugins have a dependency on MkDocs.

00:38:24.660 --> 00:38:28.860
And this means that we would also need to fork all of the...

00:38:28.860 --> 00:38:35.160
So, without doing black magic with imports, which might not be the best idea.
+

00:38:36.560 --> 00:38:41.880
So, we would also need to fork all plugins or all plugins would need to switch to the fork.

00:38:41.980 --> 00:38:45.500
So, this would be like moving an entire city at once.

00:38:45.800 --> 00:38:47.500
And it's frankly impossible.

00:38:47.500 --> 00:38:56.900
So, and if we would fork it, we wouldn't be able to realize the learnings that we gained in the groundwork that we did.

00:38:56.960 --> 00:38:58.280
So, we had to start from scratch, actually.

00:38:58.340 --> 00:38:58.460
Right.

00:38:58.540 --> 00:39:04.620
Plus, you'd have to convince the entire community to at least create a parallel package.

00:39:05.100 --> 00:39:12.500
Because when you pip install that other plugin, it's going to say, hey, PyPI, I need MkDocs.

00:39:13.520 --> 00:39:13.700
Yeah.

00:39:13.700 --> 00:39:18.380
And now you'd need the forked version, you know, whatever that's going to be called, right?

00:39:18.500 --> 00:39:22.040
So, yeah, it would be a big battle, wouldn't it?

00:39:22.600 --> 00:39:23.700
Just technically with...

00:39:24.580 --> 00:39:29.080
Or you'd have to move the community, which is a very challenging thing to do.

00:39:30.180 --> 00:39:30.580
Yeah.

00:39:30.740 --> 00:39:36.740
And so, for us, the most sensible thing was to just, you know, we just start from scratch.

00:39:37.080 --> 00:39:38.980
We make it as compatible as possible.

00:39:38.980 --> 00:39:45.620
It became quite clear very quickly that we need to optimize for compatibility.

00:39:45.960 --> 00:39:57.620
Because if you create something that is not compatible and that forces users to migrate documentation manually and to do a lot of work to get over to something else, you won't get a lot of adoption.

00:39:58.360 --> 00:40:02.160
So, all you got to do is think about that 2,500 project team.

00:40:02.440 --> 00:40:02.780
Like, okay.

00:40:03.420 --> 00:40:03.840
Exactly.
+

00:40:03.980 --> 00:40:05.600
How do I keep them working with this, right?

00:40:06.140 --> 00:40:06.460
Yes.

00:40:06.660 --> 00:40:06.860
Yes.

00:40:06.860 --> 00:40:06.940
Yes.

00:40:07.560 --> 00:40:07.780
Yeah.

00:40:07.840 --> 00:40:13.300
So, what we then did is we had an idea how it should look.

00:40:14.120 --> 00:40:17.200
Then we started with Rust because it was recommended to us.

00:40:17.300 --> 00:40:19.080
So, it was very hard at first.

00:40:19.820 --> 00:40:25.120
And in total, it took us 16 months to build all of this.

00:40:25.340 --> 00:40:26.840
But it was not only writing code.

00:40:26.960 --> 00:40:29.860
It was also exactly knowing where we want to go.

00:40:29.860 --> 00:40:32.620
Because, you know, we're starting fresh.

00:40:32.620 --> 00:40:40.420
So, we better be sure that we are going into a direction where we actually want to go for the next 10 to 20 to 30 years.

00:40:41.080 --> 00:40:41.480
Depends.

00:40:41.780 --> 00:40:41.720
Yeah.

00:40:41.720 --> 00:40:44.760
We are really in for this for the long game.

00:40:45.460 --> 00:40:50.060
So, the 10 years that I've been doing this, I see that this is only the start.

00:40:50.640 --> 00:40:52.700
And we wrote a lot of things from scratch.

00:40:52.700 --> 00:40:56.580
So, the runtime, as I mentioned, it's like the heart of Zensical.

00:40:57.160 --> 00:41:00.160
It already has something like 15,000 lines of code.

00:41:00.720 --> 00:41:05.100
A tiny HTTP middleware framework for file serving because we don't want to.

00:41:05.160 --> 00:41:13.780
So, we also want to make the file server extensible and don't want to force users into async Rust and also don't have a dependency on Tokio.

00:41:13.780 --> 00:41:22.040
And also like a monorepo management tool for Rust and JavaScript that we also open sourced, which I'm not sure if you've worked with monorepos.
+

00:41:22.280 --> 00:41:26.840
But in JavaScript, for instance, there's Lerna and it has 800 dependencies.

00:41:27.100 --> 00:41:30.520
So, when you install it, what you pull down is just insane.

00:41:31.320 --> 00:41:39.340
So, we worked a lot on the processes as well that we can make releases very easy and that we have a good way of working, basically.

00:41:39.460 --> 00:41:42.380
And we're very careful about our choice of dependencies.

00:41:42.380 --> 00:41:46.600
So, if it's not something that – let me put it another way.

00:41:46.620 --> 00:41:55.480
If it's something that you can write quite quickly, actually, and that we'd rather own in order to make changes ourselves, we rather write it from scratch.

00:41:56.360 --> 00:41:58.400
I think that's a very healthy philosophy.

00:41:59.240 --> 00:42:07.660
And also, I think this agentic AI world that we're in these days, if you just need one or two functions and you used to think,

00:42:07.660 --> 00:42:12.880
well, maybe I'll lean on this, in your case, a crate or maybe a PyPI package or something.

00:42:13.940 --> 00:42:18.660
But if it's just one or two functions, maybe you really can just write it yourself without much effort.

00:42:18.940 --> 00:42:21.960
And it just – it saves you so much trouble, you know.

00:42:21.960 --> 00:42:27.380
So, I started using pip-audit for a lot of my projects.

00:42:28.680 --> 00:42:40.500
And I would say for my bigger projects, every two weeks, I get at least one CVE vulnerability notification for something I'm using.

00:42:40.700 --> 00:42:41.960
I'm like –

00:42:41.960 --> 00:42:43.400
But here's the thing.

00:42:43.580 --> 00:42:50.160
It's in a situation of that – probably a piece of code or functionality of that package that I don't even use or care about.

00:42:50.580 --> 00:42:52.120
So, it doesn't really apply to me.
+

00:42:52.440 --> 00:42:56.500
But then I've got all these, like, here's an issue – like, a latent issue that is in my code.

00:42:56.880 --> 00:42:57.240
Yeah.

00:42:57.320 --> 00:42:59.120
I'm going to have to figure out and deal with.

00:42:59.360 --> 00:43:06.280
But it's because I've taken in so much as part of this package where if I had just written the one or two functions, then it'd be fine.

00:43:06.400 --> 00:43:06.920
You know what I mean?

00:43:06.920 --> 00:43:07.400
Yeah.

00:43:07.880 --> 00:43:08.400
Absolutely.

00:43:08.400 --> 00:43:16.440
I think there's – I think things are swinging back a little bit from, like, let's just pull in everything because it's going to help us to, like, well, maybe not everything.

00:43:17.120 --> 00:43:17.360
Yeah.

00:43:17.820 --> 00:43:26.400
And also, you're not – you can't just change things easily and you depend on other APIs.

00:43:26.400 --> 00:43:34.280
So, for instance, one of the reasons why we chose to build a lot of things from scratch is that we want to control the public APIs.

00:43:34.280 --> 00:43:43.100
So, the worst thing for us would probably just be to export a third-party API that we're using as part of our public interface because it's Rust.

00:43:43.320 --> 00:43:48.320
So, it would mean that if this public API would change, the entire ecosystem would break.

00:43:48.480 --> 00:43:58.820
So, we're very careful what APIs we expose and rather wrap it in order to be safe so we can replace things.

00:43:58.820 --> 00:43:59.000
I see.

00:43:59.340 --> 00:44:00.360
Keep things replaceable.

00:44:00.360 --> 00:44:13.160
So, maybe you have the philosophy of it might be okay to use this crate, but we don't expose its types as part of our public API or something along those lines.

00:44:13.640 --> 00:44:14.640
Yeah, we don't expose it.
+

00:44:15.100 --> 00:44:24.620
So, we – in some instances, the wrappers that I wrote are identical to the types that we use from another crate.

00:44:24.620 --> 00:44:31.040
But by using our own types or just wrapping them, because in Rust, the nice benefit is you have zero-cost abstraction.

00:44:31.260 --> 00:44:33.200
So, all the code is monomorphized and inlined.

00:44:33.280 --> 00:44:36.320
So, you don't pay for wrapping code.

00:44:36.820 --> 00:44:38.560
That's the absolute crazy thing.

00:44:38.920 --> 00:44:46.180
So, you can finally create a really clean architecture without runtime penalties if you do it right.

00:44:46.700 --> 00:44:47.300
Oh, that's wild.

00:44:47.400 --> 00:44:47.560
Yeah.

00:44:47.720 --> 00:44:47.940
Yeah.

00:44:47.940 --> 00:44:48.080
Yeah.

00:44:48.080 --> 00:44:48.580
Very interesting.

00:44:48.580 --> 00:44:57.000
So, you can see I have this huge list of topics we've basically just barely cracked the surface of.

00:44:57.440 --> 00:45:00.080
But I'd like to go back to these components.

00:45:02.740 --> 00:45:05.000
Wrong search there.

00:45:06.380 --> 00:45:08.080
You have components.

00:45:08.860 --> 00:45:09.900
That was in the other part, wasn't it?

00:45:09.900 --> 00:45:13.300
Let's just talk down – talk through some of these things here.

00:45:13.300 --> 00:45:18.040
So, you've got, like, admonitions, buttons, code blocks.

00:45:18.140 --> 00:45:21.620
Like, let's talk through some of the building blocks, I guess, that you think are interesting here.

00:45:23.040 --> 00:45:23.280
Yeah.

00:45:23.360 --> 00:45:31.720
So, I think most of the – so, if you're not new to technical writing, most of the stuff shouldn't be quite new.

00:45:31.820 --> 00:45:34.260
So, like, admonitions, code blocks, stuff like that.

00:45:34.260 --> 00:45:35.760
You've probably seen our data tables.
+

00:45:35.760 --> 00:45:41.360
Diagrams are just Mermaid diagrams, as they are – as you can use them on GitHub.

00:45:42.400 --> 00:45:52.540
One of the – so, like, the flagship features in Material, and now Zensical, as I mentioned, like, code annotations, which is a part of code blocks.

00:45:53.860 --> 00:45:57.480
Otherwise, we also have an icon and emoji integration.

00:45:57.480 --> 00:46:04.680
So, you can use one of – I think we have something like over 10,000 icons now with a quite simple syntax.

00:46:05.120 --> 00:46:06.180
That's not standard Markdown.

00:46:06.320 --> 00:46:06.840
That's the problem.

00:46:06.940 --> 00:46:08.400
So, that's, like, a Python Markdown extension.

00:46:09.240 --> 00:46:16.200
And we're working on moving this over to CommonMark and finding a way to migrate this over.

00:46:17.140 --> 00:46:27.280
Because, you know, right now, it's – Zensical uses Python Markdown for compatibility with Material for MkDocs, which means that for Markdown rendering, we need to go through Python.

00:46:28.140 --> 00:46:35.600
And this is a temporary limitation that we have because I mentioned we are focusing really hard on compatibility.

00:46:37.060 --> 00:46:44.740
And all of those components will also, of course, be available within our CommonMark solution that we're working on that we will ship later this year.

00:46:45.980 --> 00:46:46.380
Yeah.

00:46:46.980 --> 00:46:51.480
But, yeah, right now, of course, you can use them as they're mentioned on our documentation.

00:46:51.620 --> 00:46:55.260
And we will, of course, provide automated tooling to get them over to CommonMark.

00:46:55.260 --> 00:46:55.700
Yeah.

00:46:57.120 --> 00:46:57.560
Yeah.

00:46:57.680 --> 00:47:11.160
I guess it's interesting that you've got to not just consider the API and the syntax and stuff, but maybe even the same parsing engine to have this strong compatibility, right?
+

00:47:11.880 --> 00:47:12.040
Yeah.

00:47:12.040 --> 00:47:14.040
We can even read MkDocs YML configuration.

00:47:14.360 --> 00:47:17.260
So, you can build an MkDocs project with Zensical as it stands.

00:47:17.840 --> 00:47:23.040
The thing that we currently don't support in its entirety is the plugins from the ecosystem.

00:47:23.040 --> 00:47:26.440
We already support some plugins.

00:47:26.940 --> 00:47:28.640
For instance, the mkdocstrings plugin.

00:47:29.120 --> 00:47:36.220
The author is also part of the Zensical team now, with mkdocstrings being the second biggest project in the MkDocs space.

00:47:36.220 --> 00:47:38.220
So, we're very happy to have Tim on board.

00:47:39.980 --> 00:47:41.540
And several other plugins.

00:47:41.800 --> 00:47:44.920
But, as I mentioned, so Zensical uses modules.

00:47:45.260 --> 00:47:54.280
So, what we will do in the end is we will still always be able to read MkDocs configuration and map the plugin configurations to equivalent Zensical modules.

00:47:54.280 --> 00:48:01.720
So, the logic will be completely rewritten, but you will be able to migrate your project with a command.

00:48:02.820 --> 00:48:04.160
That's our goal.

00:48:04.240 --> 00:48:11.680
Because, you know, so much work has gone into projects built with Material and MkDocs.

00:48:11.760 --> 00:48:16.200
So, we need to make it easy for users and organizations to switch.

00:48:16.640 --> 00:48:22.300
And this is the main part we're working on in 2026.

00:48:22.300 --> 00:48:25.360
I think this is critical, right?

00:48:26.020 --> 00:48:26.420
Yeah.

00:48:26.880 --> 00:48:35.920
Your absolute best users, you know, like that big company, but many others, of course, they're not going to rewrite everything.

00:48:36.000 --> 00:48:36.720
Well, maybe they will.

00:48:36.840 --> 00:48:38.700
But many of them won't rewrite everything.
+

00:48:38.840 --> 00:48:42.700
They'll just use an old version and grin and bear it as long as they have to.

00:48:42.800 --> 00:48:43.320
You know what I mean?

00:48:43.360 --> 00:48:47.480
Like this idea of doing it from scratch.

00:48:47.480 --> 00:48:54.800
But if you provide a path for them that's very easy, then all of a sudden they get this way better experience, right?

00:48:54.860 --> 00:48:59.060
I can only imagine, you know, the build speed helping out the bigger projects the most.

00:49:00.180 --> 00:49:00.380
Yeah.

00:49:00.620 --> 00:49:03.840
And the compatibility part is one of the hardest engineering parts, actually.

00:49:04.260 --> 00:49:09.560
So, that you have to think about that, you know, because we don't want to paint ourselves into a corner.

00:49:09.560 --> 00:49:22.640
So, we need to think about where do we want to go, but how can we go there faster right now without making sacrifices in a way that we can't, in the end, replace things.

00:49:22.880 --> 00:49:25.140
And we have a pretty elaborate plan how to do all of this.

00:49:25.880 --> 00:49:26.380
And, yeah.

00:49:26.780 --> 00:49:29.660
So, we're working very hard on it to make it.

00:49:30.040 --> 00:49:32.200
So, right now, you can just use Material, of course.

00:49:32.340 --> 00:49:33.400
You can keep using it.

00:49:33.400 --> 00:49:38.480
Or if your site already builds in Zensical, you will have better speed and the modern design and the better search.

00:49:38.680 --> 00:49:43.900
So, the search has been completely rewritten from Material to Zensical.

00:49:44.140 --> 00:49:46.060
It's also, it's currently integrated.

00:49:46.240 --> 00:49:47.560
It's integrated with Zensical.

00:49:48.020 --> 00:49:51.960
And we will open source it as a dedicated open source project.

00:49:52.880 --> 00:49:54.020
It's called Disco.
+

00:49:54.620 --> 00:49:57.040
So, you will also be able to use the search in other projects.

00:49:57.040 --> 00:50:03.140
And just as a number, to get a feel for it, it's 20 times faster than the search in Material for MkDocs.

00:50:03.400 --> 00:50:03.920
Wow.

00:50:04.020 --> 00:50:05.520
So, it's a ground-up rewrite.

00:50:06.140 --> 00:50:10.360
And we actually started working on the search before we started working on Zensical.

00:50:10.940 --> 00:50:15.680
Yeah, I noticed how nice the search was when I was playing with it.

00:50:16.900 --> 00:50:17.600
We're in.

00:50:19.760 --> 00:50:23.800
So, is Zensical.org itself built in Zensical?

00:50:24.560 --> 00:50:25.100
Yeah, of course.

00:50:25.500 --> 00:50:28.880
And it's actually built with an MkDocs YML because we're dogfooding.

00:50:28.880 --> 00:50:36.060
So, you can also build it with MkDocs, with Material for MkDocs.

00:50:36.200 --> 00:50:37.560
The project layout is exactly the same.

00:50:38.260 --> 00:50:38.400
Yeah.

00:50:39.080 --> 00:50:47.840
You know, I find that there's just a bunch of static sites that seem to have, I don't know what's going on with them, but their search is really bad.

00:50:47.840 --> 00:50:57.040
And, you know, either they've just integrated some kind of Google thing where it says site colon and they use your URL and then the search, which is a real bad experience.

00:50:57.220 --> 00:51:02.440
Or you go search and it sits there and it spins and it spins and then eventually it pulls up.

00:51:02.440 --> 00:51:10.540
So, it looks like you are pre-computing these types of things or something with your search engine or you've got some cool data structure to make that fast, right?

00:51:11.520 --> 00:51:16.720
Well, it's not one cool data structure that would be great because then everybody could just use it.

00:51:17.040 --> 00:51:17.840
But, no.
+

00:51:18.280 --> 00:51:19.380
A series of algorithms.

00:51:19.700 --> 00:51:24.380
Several months of work went into the search.

00:51:24.800 --> 00:51:25.200
Of course.

00:51:25.200 --> 00:51:30.020
So, it's a project of its own, as I mentioned.

00:51:30.160 --> 00:51:31.640
It's also completely modular.

00:51:32.220 --> 00:51:47.240
And the reason why most of the search engines that are out there, that are open source, so like the libraries that you can use, not services you have to pay for, don't provide results that are really relevant is that they use BM25,

00:51:47.240 --> 00:51:54.940
which is like the standard bag-of-words ranking algorithm for information retrieval.

00:51:55.200 --> 00:51:57.780
And this doesn't nicely pair with autocomplete.

00:51:57.920 --> 00:52:01.220
So, what you get is you start typing and you get a lot of dancing results.

00:52:02.220 --> 00:52:12.980
And also, if you add further documents to your index, the balancing will be off because the relevance is computed based on the occurrence of a word in the entire corpus.

00:52:13.220 --> 00:52:16.360
So, you add a new document, those weights change again.

00:52:16.360 --> 00:52:21.940
So, the search that we have, we, of course, as a baseline also have a BM25 implementation.

00:52:22.400 --> 00:52:28.200
But the implementation you're seeing is a tie-breaking implementation, which provides much, much better accuracy.

00:52:28.940 --> 00:52:30.060
And you can configure it.

00:52:30.060 --> 00:52:39.720
So, tie-breaking means, okay, we first look into the title of the document and see if we have matches, then how many matches, then where they are.

00:52:39.840 --> 00:52:45.040
Then we look into the path and then in the body of the document and so on.

00:52:45.240 --> 00:52:46.320
All of this is configurable.
+

00:52:46.320 --> 00:52:54.280
And this is also why we believe that Disco alone will also be a very interesting project for other, for instance, static site generators to integrate.

00:52:55.300 --> 00:52:57.360
And you asked about, like, pre-computing.

00:52:57.520 --> 00:53:01.960
So, no, this is a search from the documents.

00:53:02.060 --> 00:53:09.060
We build a search index, which is a stripped-down version of the HTML that is rendered when you load the page.

00:53:09.060 --> 00:53:12.040
It's one JSON that we ship to the client.

00:53:12.460 --> 00:53:15.480
And for most pages, actually, this JSON is below one megabyte.

00:53:15.580 --> 00:53:16.780
You can gzip it.

00:53:17.000 --> 00:53:17.960
So, compress it.

00:53:18.460 --> 00:53:19.840
Then it's something like 200K.

00:53:20.040 --> 00:53:24.400
And you have extremely fast search on the client with no cost.

00:53:25.480 --> 00:53:33.520
And so, we believe that for 90, 95, maybe 99% of documentation sites or sites in general,

00:53:33.520 --> 00:53:41.640
this client-side search is basically the way to go because it's fast and it doesn't require you to pay for anything.

00:53:41.760 --> 00:53:48.480
And there are several SaaS-based services that can be extremely expensive when you do the math.

00:53:49.380 --> 00:53:57.260
So, yeah, you only need to use a server, basically, when the index becomes too big to ship to the client.

00:53:57.860 --> 00:53:59.000
And we're also working on that, by the way.

00:53:59.600 --> 00:54:00.660
Okay, that's really cool.

00:54:00.660 --> 00:54:05.580
You could shard the index or something like that, right, I suppose?

00:54:05.800 --> 00:54:10.300
Like, you could say, we're going to have 26 index bits.

00:54:10.460 --> 00:54:14.920
And only if the word starts with an A do you pull that piece down or something.

00:54:15.240 --> 00:54:17.660
But, yeah, a lot of cool aspects.
+
+00:54:18.480 --> 00:54:18.580
+Yeah.
+
+00:54:19.360 --> 00:54:20.540
+It's not that simple.
+
+00:54:21.060 --> 00:54:25.040
+But there are also some other interesting solutions.
+
+00:54:25.200 --> 00:54:26.940
+Like, Pagefind is a pretty interesting library.
+
+00:54:27.040 --> 00:54:28.960
+It does a completely different approach.
+
+00:54:28.960 --> 00:54:36.340
+But it's not as snappy as the search that we ship to the client.
+
+00:54:36.840 --> 00:54:37.020
+Yeah.
+
+00:54:37.260 --> 00:54:40.040
+I use Pagefind for my personal website, which is a static site.
+
+00:54:40.600 --> 00:54:40.820
+Yeah.
+
+00:54:41.340 --> 00:54:42.620
+It's also a great, great solution.
+
+00:54:42.840 --> 00:54:47.580
+But some things you won't be able to implement in Pagefind properly.
+
+00:54:48.440 --> 00:54:48.840
+Sure.
+
+00:54:48.840 --> 00:54:52.180
+So, it's, you know, it's with software, it's trade-offs all the way.
+
+00:54:52.880 --> 00:54:56.340
+Well, I'm already thinking, like, I better pay attention to Disco when it comes out.
+
+00:54:56.600 --> 00:54:59.500
+So, maybe adopt it for some stuff.
+
+00:55:00.500 --> 00:55:00.980
+Beautiful.
+
+00:55:01.180 --> 00:55:01.360
+Okay.
+
+00:55:01.360 --> 00:55:01.400
+Okay.
+
+00:55:01.800 --> 00:55:08.240
+We got a couple interesting questions sort of following up from the component side of things.
+
+00:55:09.280 --> 00:55:13.300
+Jamstack says, do you foresee community-led templates or themes for Zensical?
+
+00:55:14.380 --> 00:55:19.640
+I know you have, like, two themes that I see something along those lines, a couple of themes that you can choose now.
+
+00:55:19.640 --> 00:55:22.580
+But what is the theme story, I guess?
+
+00:55:22.920 --> 00:55:24.200
+I want to ask you more broadly.
+
+00:55:25.940 --> 00:55:26.340
+Yeah.
+
+00:55:26.460 --> 00:55:27.120
+So, absolutely.
+
+00:55:27.480 --> 00:55:30.240
+So, right now, we have only this one theme.
+
+00:55:30.580 --> 00:55:38.520
+We have this variant setting where you can choose, like, the classic variant, which is when you move over from Material for MkDocs.
+
+00:55:38.600 --> 00:55:39.800
+It looks exactly the same.
+
+00:55:39.800 --> 00:55:47.960
+This is also why we needed to keep the HTML as it is, also with the modern design that we provided, and the modern variant, which is the standard for Zensical.
+
+00:55:47.960 --> 00:56:02.300
+Once we move to the component system, we will make it possible to, one, use components within Markdown, and, two, also create a template engine that is based on components.
+
+00:56:02.300 --> 00:56:13.900
+This will allow us much, much faster rendering, because, for instance, if you render the header for a site, it's a lot of HTML, because, you know, there's the search box in it and some other stuff.
+
+00:56:14.100 --> 00:56:15.260
+But only the title changes.
+
+00:56:15.400 --> 00:56:18.740
+So, we will also make the rendering differential as part of the build.
+
+00:56:18.840 --> 00:56:19.780
+That's the plan.
+
+00:56:20.240 --> 00:56:24.460
+And with this, we will also make it open to theme developers, of course.
+
+00:56:24.460 --> 00:56:34.100
+So, there will be the, like, packaging, for instance, compilation of Sass styles or TypeScript or so will be part of Zensical.
+
+00:56:34.380 --> 00:56:40.200
+So, you don't need to pre-compile the theme like we need to do for, like, the last 10 years for Material.
+
+00:56:41.120 --> 00:56:42.980
+So, it will have a proper asset pipeline.
+
+00:56:43.100 --> 00:56:45.500
+It will have a proper process to install themes.
+
+00:56:45.800 --> 00:56:46.600
+All of this is planned.
+
+00:56:46.600 --> 00:56:49.140
+But right now, we focus on feature parity.
+
+00:56:49.140 --> 00:56:54.380
+So, in order to make it possible for more users to migrate right now.
+
+00:56:55.180 --> 00:57:07.020
+That's really interesting that you would deliver the theme as, basically, its original source, not its rendered, you know, compiled or transpiled version, right?
+
+00:57:07.420 --> 00:57:11.400
+To keep it, I guess, a part of the Zensical build step, right?
+
+00:57:11.400 --> 00:57:13.320
+Yes, exactly.
+
+00:57:13.460 --> 00:57:20.260
+Because we had a lot of requests for something like, hey, can we change the media queries a little bit?
+
+00:57:20.300 --> 00:57:24.720
+Because the sidebar disappears too early for my taste.
+
+00:57:25.300 --> 00:57:27.040
+And this is not...
+
+00:57:27.040 --> 00:57:33.020
+So, for this, you have to go through the compilation step again and, basically, fork the theme and recompile it.
+
+00:57:33.020 --> 00:57:37.020
+We want to make this configurable so that you can use...
+
+00:57:37.980 --> 00:57:44.360
+Yeah, so, you know, configure the theme and build it and it just works.
+
+00:57:44.700 --> 00:57:46.520
+So, this, like, you know, it just works.
+
+00:57:46.640 --> 00:57:48.460
+That's, like, the thing we're working towards.
+
+00:57:48.900 --> 00:57:50.100
+Make it as simple as possible.
+
+00:57:50.900 --> 00:57:51.040
+Yeah.
+
+00:57:51.540 --> 00:57:52.380
+Yeah, very cool.
+
+00:57:53.380 --> 00:57:54.020
+Let's maybe...
+
+00:57:54.560 --> 00:57:55.760
+I'm getting short on time here.
+
+00:57:55.780 --> 00:58:00.460
+Maybe wrap up our chat talking about two things.
+
+00:58:00.460 --> 00:58:01.680
+The future.
+
+00:58:02.900 --> 00:58:03.760
+Where are you going?
+
+00:58:03.860 --> 00:58:10.740
+You talked about compatibility being a big part of things going forward in 2026.
+
+00:58:10.960 --> 00:58:13.840
+But also sustainability, right?
+
+00:58:14.240 --> 00:58:23.860
+You had all these great supporters for Material for MkDocs, which you must have just been absolutely thrilled to realize how successful that was, right?
+
+00:58:23.960 --> 00:58:28.820
+I mean, going from the wall, put up a wish list, and then, actually, people love this.
+
+00:58:28.820 --> 00:58:30.680
+I can put all my energy into it.
+
+00:58:30.760 --> 00:58:32.500
+I mean, I know how great of a feeling that is, right?
+
+00:58:33.360 --> 00:58:34.360
+That's completely insane.
+
+00:58:34.720 --> 00:58:35.260
+And I would...
+
+00:58:35.260 --> 00:58:35.280
+Yeah.
+
+00:58:35.460 --> 00:58:40.040
+When I started it, I would never believe that this would be my job at some point.
+
+00:58:40.660 --> 00:58:42.060
+Yeah, I feel the same way about the podcast.
+
+00:58:42.540 --> 00:58:43.460
+And it's just...
+
+00:58:43.460 --> 00:58:44.320
+I'm so grateful for it.
+
+00:58:44.340 --> 00:58:44.800
+It's amazing.
+
+00:58:45.580 --> 00:58:45.700
+Yeah.
+
+00:58:45.700 --> 00:58:46.020
+I can imagine.
+
+00:58:46.740 --> 00:58:47.000
+Yeah.
+
+00:58:47.000 --> 00:58:53.120
+But then with this transition to Zensical, how does that change?
+
+00:58:53.220 --> 00:58:54.120
+Does that change anything?
+
+00:58:54.500 --> 00:58:55.660
+Or what's the story?
+
+00:58:56.500 --> 00:58:57.080
+Yeah, so...
+
+00:58:57.080 --> 00:58:58.640
+How do you bring that support over to Zensical?
+
+00:58:58.640 --> 00:59:03.440
+Well, as we don't have a lot of time, I try to explain it as compact as possible.
+
+00:59:03.640 --> 00:59:09.440
+So, we are saying goodbye to this pay for features, pay for extra features.
+
+00:59:09.800 --> 00:59:15.400
+So, in Material, you needed to be a sponsor in order to get the latest features earlier.
+
+00:59:15.660 --> 00:59:18.000
+What we will do is everything is open source from the start.
+
+00:59:18.000 --> 00:59:20.360
+So, for users, it's completely free.
+
+00:59:21.100 --> 00:59:26.880
+And we are shifting our model from the sponsorships to something we call Zensical Spark.
+
+00:59:27.300 --> 00:59:32.240
+Because what we discovered, talking a lot to our professional users, is that the more we
+
+00:59:32.240 --> 00:59:36.740
+know about the problem space, and the better we understand the problem space, and the more
+
+00:59:36.740 --> 00:59:39.900
+we can collaborate with them, the more we can...
+
+00:59:39.900 --> 00:59:42.200
+The better degrees of freedom we can provide.
+
+00:59:42.200 --> 00:59:45.600
+So, we don't intend to just ship feature, feature, feature.
+
+00:59:45.600 --> 00:59:52.080
+But we intend to create degrees of freedom, so that you can adapt Zensical to the processes
+
+00:59:52.080 --> 00:59:56.220
+within your organization, how they work, to the workflows, etc., which are all different,
+
+00:59:56.360 --> 00:59:58.480
+which is all very diverse, basically.
+
+00:59:59.040 --> 01:00:04.060
+So, Spark is a space where you, as a company, can basically get a seat.
+
+01:00:04.720 --> 01:00:08.720
+And together with us, shape Zensical as part of high-level discussions, where we explore
+
+01:00:08.720 --> 01:00:10.000
+the problem space.
+
+01:00:10.460 --> 01:00:11.300
+We create proposals.
+
+01:00:11.300 --> 01:00:13.940
+So, on the website, you will have clicked on the Spark section.
+
+01:00:13.940 --> 01:00:15.620
+There's this ZAPs in progress.
+
+01:00:15.780 --> 01:00:18.660
+We call them ZAPs, Zensical Advancement Proposals.
+
+01:00:18.920 --> 01:00:19.620
+It's on the left side.
+
+01:00:20.320 --> 01:00:26.020
+We write very elaborate, detailed proposals on specific topics that we intend to work on.
+
+01:00:26.640 --> 01:00:32.780
+And then, with the feedback that we get, iterate on them and create an authoring...
+
+01:00:32.780 --> 01:00:37.860
+Like, the ideal authoring experience that caters to the most cases possible.
+
+01:00:37.860 --> 01:00:42.560
+Because we want to build Zensical, as I mentioned, for the very long term.
+
+01:00:42.900 --> 01:00:46.740
+And not just a solution that is opinionated, but that is as unopinionated as possible.
+
+01:00:47.560 --> 01:00:53.380
+And the third thing that you get, besides the opportunity to discuss high-level discussions
+
+01:00:53.380 --> 01:00:57.840
+with us and create the proposals with us, is, of course, professional support.
+
+01:00:57.840 --> 01:01:01.220
+So, this is also, we've been asking, we've been asked for quite a lot by companies.
+
+01:01:02.080 --> 01:01:07.940
+So, in Spark, you, yeah, you can basically get our time.
+
+01:01:08.080 --> 01:01:11.240
+You can, we will, you can get direct access to the team.
+
+01:01:11.240 --> 01:01:17.920
+And also, we have, like, those open video calls where we share our progress and where you can get a window of support.
+
+01:01:18.060 --> 01:01:21.900
+And we talk about any problem that is keeping you up at night, basically.
+
+01:01:22.280 --> 01:01:27.240
+And stuff like migrations or how do you do this and this in Zensical.
+
+01:01:27.740 --> 01:01:30.040
+And, yeah, it's been a blast.
+
+01:01:30.360 --> 01:01:34.700
+So, we're really happy that the organizations are enrolling into this new model.
+
+01:01:34.700 --> 01:01:41.880
+And we think it could also be a model that might translate quite well to other projects because you get a huge competitive advantage.
+
+01:01:42.040 --> 01:01:43.500
+You know exactly what to build.
+
+01:01:44.820 --> 01:01:47.800
+Yeah, you're on, you're talking to the actual users.
+
+01:01:48.160 --> 01:01:51.140
+They're saying, this is the thing that really is hard for us.
+
+01:01:51.220 --> 01:01:54.240
+Or you just get, maybe they don't say it, but you see it, right?
+
+01:01:55.160 --> 01:01:56.100
+Exactly, yes, yes.
+
+01:01:56.220 --> 01:01:58.320
+And talking to the users is the best thing you can do.
+
+01:01:58.320 --> 01:02:05.900
+So, what we learned from those, from the many times we talked to them is always something like, wow, we never would have come up with this.
+
+01:02:07.280 --> 01:02:08.280
+Yeah, incredible.
+
+01:02:09.060 --> 01:02:16.180
+Well, congratulations on the success for Material for MkDocs and then this new project.
+
+01:02:16.360 --> 01:02:18.140
+I'm very excited to see it coming along.
+
+01:02:18.380 --> 01:02:20.380
+And it looks like it's going to be great.
+
+01:02:21.420 --> 01:02:23.300
+Maybe a final call to action for people.
+
+01:02:23.680 --> 01:02:26.120
+Like, can they go ahead and start using Zensical?
+
+01:02:26.120 --> 01:02:27.760
+If they're interested, what do they do?
+
+01:02:28.000 --> 01:02:28.340
+So on.
+
+01:02:30.760 --> 01:02:31.280
+Yeah, of course.
+
+01:02:31.660 --> 01:02:35.560
+So, you can, so, we mentioned Material for MkDocs a lot.
+
+01:02:35.640 --> 01:02:39.360
+And this is because we are coming from this direction.
+
+01:02:39.640 --> 01:02:44.420
+So, it means if you have a Material for MkDocs project, you should definitely try out Zensical and see if you can build your project.
+
+01:02:44.540 --> 01:02:48.380
+But if you haven't used it, you can also just jumpstart a new project.
+
+01:02:48.640 --> 01:02:50.580
+It has a lot of built-in functionality already.
+
+01:02:50.580 --> 01:02:59.320
+You get, like, all of these components that we talked about, free search that you don't have to host, a very modern static site that is great on mobile.
+
+01:02:59.640 --> 01:03:01.000
+So, just give it a try.
+
+01:03:01.600 --> 01:03:02.980
+And we have a newsletter.
+
+01:03:03.340 --> 01:03:06.600
+So, where we, once a month, share the latest updates.
+
+01:03:07.000 --> 01:03:10.160
+And that might also be worth checking out.
+
+01:03:10.160 --> 01:03:15.980
+But, yeah, and otherwise, we'd be happy to see you, to get any feedback.
+
+01:03:16.540 --> 01:03:21.420
+By the way, we also have a public Discord, a community Discord, which is growing very well.
+
+01:03:21.720 --> 01:03:24.620
+So, if you have any problems or so, then you will get help there.
+
+01:03:25.280 --> 01:03:25.400
+Yeah.
+
+01:03:26.260 --> 01:03:34.620
+Would be great to see as many users as possible, of course, and shape the future of Zensical together with all of you.
+
+01:03:34.620 --> 01:03:35.020
+Yeah.
+
+01:03:35.460 --> 01:03:35.660
+Yeah.
+
+01:03:36.600 --> 01:03:36.960
+Fantastic.
+
+01:03:37.580 --> 01:03:38.620
+Martin, thanks for coming on the show.
+
+01:03:39.400 --> 01:03:40.220
+Congrats on the project.
+
+01:03:41.300 --> 01:03:42.280
+Thanks for the invitation.
+
+01:03:42.600 --> 01:03:45.300
+And happy any time to come back.
+
+01:03:46.020 --> 01:03:46.320
+Yeah.
+
+01:03:46.420 --> 01:03:46.960
+Sounds good.
+
+01:03:46.960 --> 01:03:47.020
+Yeah.
+
+01:03:47.020 --> 01:03:47.000
+Sounds good.

From 5ae7208efbaa426d4ef07e2902d6cdea51433175 Mon Sep 17 00:00:00 2001
From: Michael Kennedy
Date: Wed, 18 Mar 2026 14:58:10 -0700
Subject: [PATCH 07/16] Add/update some transcripts.
--- ...hing-up-with-the-python-typing-council.txt | 18 +- .../541-monty-python-in-rust-for-ai.vtt | 3290 +++++++++++++++ ...tic-site-generator-transcript-original.vtt | 2959 ------------- youtube_transcripts/542-zensical.vtt | 3263 +++++++++++++++ .../543-langchain-deep-agents.vtt | 2942 +++++++++++++ youtube_transcripts/544-wheel-next.vtt | 3647 +++++++++++++++++ 6 files changed, 13151 insertions(+), 2968 deletions(-) create mode 100644 youtube_transcripts/541-monty-python-in-rust-for-ai.vtt delete mode 100644 youtube_transcripts/541-zensical-a-modern-static-site-generator-transcript-original.vtt create mode 100644 youtube_transcripts/542-zensical.vtt create mode 100644 youtube_transcripts/543-langchain-deep-agents.vtt create mode 100644 youtube_transcripts/544-wheel-next.vtt diff --git a/transcripts/539-catching-up-with-the-python-typing-council.txt b/transcripts/539-catching-up-with-the-python-typing-council.txt index 1714687..25d6386 100644 --- a/transcripts/539-catching-up-with-the-python-typing-council.txt +++ b/transcripts/539-catching-up-with-the-python-typing-council.txt @@ -244,7 +244,7 @@ 00:06:33 So let me ask you, do you feel like it would be different? -00:06:36 Would it have gone different now if tools like TY and Pyrefly existed back then? +00:06:36 Would it have gone different now if tools like ty and Pyrefly existed back then? 00:06:42 Is Python typing different now than it was then? @@ -662,7 +662,7 @@ 00:18:21 So get started today with Sentry. -00:18:23 Just visit talkpython.fm slash Sentry and get $100 in Sentry credits. +00:18:23 Just visit talkpython.fm/sentry and get $100 in Sentry credits. 00:18:28 Please use that link. @@ -742,9 +742,9 @@ 00:20:31 Yeah, that made it hard because often there's peps built on top of each other. 
-00:20:35 So then in the extreme, you might see like one thing in one pep and then there's +00:20:35 So then in the extreme, you might see like one thing in one PEP and then there's -00:20:39 another pep that adds an aspect of it, another one that adds another aspect. +00:20:39 another PEP that adds an aspect of it, another one that adds another aspect. 00:20:43 And overall it makes it very hard to follow. @@ -1016,7 +1016,7 @@ 00:28:12 you can run across and get that? -00:28:14 I mean, you could do that with Powerfly or TY and the CLI as well, but you know, +00:28:14 I mean, you could do that with Powerfly or ty and the CLI as well, but you know, 00:28:18 thinking more mypy is like kind of being real strict on some of that stuff. @@ -1150,9 +1150,9 @@ 00:31:51 turn AI into a genuine engineering partner. -00:31:54 Check it out at talkpython.fm slash agentic dash engineering. +00:31:54 Check it out at talkpython.fm/agentic dash engineering. -00:31:58 That's talkpython.fm slash agentic dash engineering. +00:31:58 That's talkpython.fm/agentic dash engineering. 00:32:01 The link is in your podcast player's show notes. @@ -2306,7 +2306,7 @@ 01:00:06 applications with Sentry. -01:00:07 Just visit talkpython.fm slash Sentry and get started for free. +01:00:07 Just visit talkpython.fm/sentry and get started for free. 01:00:12 Be sure to use our code talkpython26. @@ -2318,7 +2318,7 @@ 01:00:28 features. -01:00:29 Visit talkpython.fm slash agentic dash AI. +01:00:29 Visit talkpython.fm/agentic dash AI. 01:00:33 If you or your team needs to learn Python, we have over 270 hours of beginner diff --git a/youtube_transcripts/541-monty-python-in-rust-for-ai.vtt b/youtube_transcripts/541-monty-python-in-rust-for-ai.vtt new file mode 100644 index 0000000..d061108 --- /dev/null +++ b/youtube_transcripts/541-monty-python-in-rust-for-ai.vtt @@ -0,0 +1,3290 @@ +WEBVTT + +00:00:01.519 --> 00:00:05.040 +Samuel, welcome back to Talk Python To Me. Great to have you here as always. 
+ +00:00:05.800 --> 00:00:07.940 +Thank you so much for having me back. Yeah, it's good to be here. + +00:00:09.460 --> 00:00:14.060 +I saw your project and I immediately sent you a message. You need to come on the show and talk + +00:00:14.180 --> 00:00:19.200 +about this. What is going on? What is Monty? Hat tip to the name. I want to hear the origin of the + +00:00:19.300 --> 00:00:25.100 +name, but you might be able to guess it. I think I can guess it. I think I can guess it. + +00:00:26.740 --> 00:00:29.700 +But yeah, it's awesome to be here talking about this. + +00:00:30.840 --> 00:00:35.180 +You've been on a bunch of times, but there's a bunch of new listeners or they don't listen to every show. + +00:00:36.220 --> 00:00:36.800 +Give us your background. + +00:00:38.040 --> 00:00:38.180 +Yeah. + +00:00:38.440 --> 00:00:39.280 +So I'm Samuel. + +00:00:39.660 --> 00:00:45.520 +I'm probably best known as creating the Pydantic Validation Library way back in the annals of time in 2017. + +00:00:47.700 --> 00:00:52.480 +That is kind of infrastructure, but Python today, we just crossed 10 billion downloads in total. + +00:00:53.180 --> 00:00:55.780 +We're at like 580 million downloads a month. + +00:00:55.820 --> 00:00:57.680 +So that gets a lot of usage. + +00:00:58.980 --> 00:01:04.839 +Very lucky that Sequoia Capital came along and invested in Pydantic to start a company at the beginning of 2023. + +00:01:05.760 --> 00:01:09.740 +So now we have a kind of stable of different things we do, what we call the Pydantic stack. + +00:01:09.960 --> 00:01:11.600 +So there's Pydantic validation. + +00:01:12.040 --> 00:01:17.540 +We talked about Pydantic AI, which is an agent framework where Monty kind of fits in best. + +00:01:18.200 --> 00:01:26.360 +And then there's Pydantic Logfire, the observability platform for AI and general observability, which is the commercial bit of what we do. 
+ +00:01:26.960 --> 00:01:31.400 +So I suppose I'm supposed to be being CEOing most of the time. + +00:01:31.460 --> 00:01:33.280 +I actually spend far too much of my time clauding. + +00:01:33.500 --> 00:01:34.620 +I seem to be in good company. + +00:01:34.800 --> 00:01:38.840 +I keep seeing people on Twitter, lots of CEOs of much bigger companies and writing lots of code. + +00:01:38.980 --> 00:01:40.100 +So apparently I'm allowed to again. + +00:01:41.520 --> 00:01:47.840 +It is an insanely exciting time with just the agentic AI in general. + +00:01:47.900 --> 00:01:53.060 +And Claude, Claude Opus, Claude Sonnet in particular, they are so good. + +00:01:53.360 --> 00:01:53.960 +I don't know about you. + +00:01:54.040 --> 00:01:58.160 +I'm sure at least half the people, at least half of the people listening are like, + +00:01:59.020 --> 00:02:01.060 +they've got a backlog of ideas they want to try, + +00:02:01.300 --> 00:02:03.340 +things they've always wanted to build and not the time. + +00:02:03.460 --> 00:02:04.760 +Or maybe it's a bit of a stretch. + +00:02:04.900 --> 00:02:06.120 +Like, I don't really know mobile. + +00:02:06.180 --> 00:02:07.380 +I can't really build a mobile app. + +00:02:07.400 --> 00:02:08.899 +But if I could, I would build this. + +00:02:09.539 --> 00:02:10.880 +And now you kind of can, right? + +00:02:11.260 --> 00:02:26.640 +Yeah, I mean, I think it's got scary bits of it, too. I mean, maybe we're experiencing the bonfire of the thing we all, you know, I was speaking to Zach Hatfield-Dodds just before Christmas, and he was like, we have had this weird time period when the thing I love doing happens to be incredibly financially lucrative. + +00:02:26.840 --> 00:02:56.600 +I mean, he's anthropic, so it's probably more financially. But hey, and maybe that that time is going to come to an end. But I still feel very privileged to have had that time. I don't know exactly what's going to go. 
I mean, and definitely the jobs of software developers are changing. And some of that is scary. But as you say, it's also super exciting projects from go build a mobile app, which you didn't know how to do. But there were others who did through to building Monty, which I think we were relatively well placed to do it as a team. + +00:02:56.700 --> 00:03:01.600 +of people, but we would never have had the resources or the time to do it if it wasn't + +00:03:01.700 --> 00:03:06.340 +for LLMs being especially good at tasks like that. + +00:03:07.040 --> 00:03:07.260 +Interesting. + +00:03:07.630 --> 00:03:07.760 +Okay. + +00:03:07.820 --> 00:03:12.040 +I do want to dive into that later, but we haven't even introduced what Monty is yet. + +00:03:12.110 --> 00:03:14.880 +So let's hold off on that deep dive. + +00:03:15.000 --> 00:03:21.040 +But when I saw this, I'm like, I wonder what role that agentic coding sort of made this + +00:03:21.360 --> 00:03:22.500 +possible for a small team. + +00:03:22.980 --> 00:03:25.820 +That was certainly one of the thoughts I had. + +00:03:26.580 --> 00:03:29.200 +- Yeah, I mean, I can dive into it, but yeah, + +00:03:29.230 --> 00:03:32.700 +I mean, I've got a bit of help now from David Hewitt, + +00:03:32.780 --> 00:03:35.520 +who is a great deal better Rust developer + +00:03:35.610 --> 00:03:38.560 +and knows more of the Python internals than many people, + +00:03:39.080 --> 00:03:39.920 +definitely more than me. + +00:03:41.280 --> 00:03:43.020 +But for the most part, it was just me + +00:03:43.020 --> 00:03:44.240 +in my spare time building it, + +00:03:44.310 --> 00:03:46.980 +which I'll talk in a bit about like why I think + +00:03:47.500 --> 00:03:49.980 +this is such an eligible project for LLM acceleration. + +00:03:50.740 --> 00:03:51.600 +- Yeah, yeah. 
+
+00:03:51.730 --> 00:03:53.620
+So you're playing both sides of the fence here,
+
+00:03:54.220 --> 00:04:00.840
+It sounds like both maybe using a little AI, but also building for AI, which I think is quite interesting.
+
+00:04:01.380 --> 00:04:11.940
+Yeah, I think we're, I mean, yeah, we're doing, we're building Pydantic AI as a way for LLMs to power applications or be part of applications.
+
+00:04:12.660 --> 00:04:14.680
+We're also using AI to build that.
+
+00:04:15.280 --> 00:04:21.440
+More and more, I think of people's usage of Logfire is through their coding agent, as in, sure, people can log into Logfire.
+
+00:04:21.660 --> 00:04:23.060
+We love our tracing view, et cetera.
+
+00:04:23.150 --> 00:04:25.000
+But I acknowledge there's a lot of people
+
+00:04:25.030 --> 00:04:27.260
+who are just gonna point Claude Code at it
+
+00:04:27.440 --> 00:04:29.340
+and ask it to go and work out what's wrong
+
+00:04:29.350 --> 00:04:30.000
+and fix their bug.
+
+00:04:30.220 --> 00:04:33.800
+So yeah, we're in contact with what's going on in LLMs
+
+00:04:33.940 --> 00:04:34.440
+all over the place.
+
+00:04:35.500 --> 00:04:36.960
+- How did you facilitate that?
+
+00:04:38.200 --> 00:04:40.140
+Like how can the AI get that information?
+
+00:04:40.659 --> 00:04:43.300
+- We made this weird, esoteric, odd decision
+
+00:04:43.540 --> 00:04:45.140
+back when we first started Logfire
+
+00:04:45.720 --> 00:04:49.640
+to allow users to write arbitrary SQL against their data.
+
+00:04:50.100 --> 00:04:54.240
+We did that really because we thought it was too much hard work to build a query builder.
+
+00:04:54.840 --> 00:04:56.680
+And SQL seemed like the thing we would want.
+
+00:04:57.080 --> 00:05:01.060
+And it seemed like a pretty esoteric odd decision back when we started it in 2023.
+
+00:05:02.240 --> 00:05:06.260
+Now it is the most powerful, most defensible thing we have.
+
+00:05:06.660 --> 00:05:11.960
+Because we've spent two years learning how to build effectively an analytical database
+
+00:05:12.180 --> 00:05:16.720
+that anyone can go and query and run any query against and dealing with all of the side effects of that.
+
+00:05:17.180 --> 00:05:19.100
+But look, everyone has an MCP server, fine.
+
+00:05:19.220 --> 00:05:24.560
+but what's powerful about Logfire is LLMs are very, very, very good at writing SQL when they have a schema.
+
+00:05:25.120 --> 00:05:30.840
+And so you ask it something that no one's ever asked it before, say, find me the five slowest endpoints by P95.
+
+00:05:31.040 --> 00:05:35.960
+Now, that's a reasonable one, but you can imagine some incredibly complex question that no one's ever answered before
+
+00:05:36.220 --> 00:05:39.180
+that no other kind of query builder dialect could do.
+
+00:05:39.220 --> 00:05:41.420
+But because you have full SQL, you can go and write this.
+
+00:05:41.780 --> 00:05:43.560
+LLM will write the SQL to give you back the answer.
+
+00:05:43.940 --> 00:05:47.740
+I want the P95 worst top five there.
+ +00:06:21.140 --> 00:06:21.920 +I saw you write it. + +00:06:22.460 --> 00:06:23.480 +Yeah, yeah, yeah. + +00:06:24.040 --> 00:06:27.100 +And I mean, Pydantic's a perfect fit for that style. + +00:06:27.300 --> 00:06:29.940 +It's like if you could write your actual queries + +00:06:30.060 --> 00:06:32.980 +in native syntax and then transform it to a rich class, + +00:06:33.080 --> 00:06:35.260 +like a Pydantic or a Data class or something like that, + +00:06:37.280 --> 00:06:41.980 +these AIs, they are so trained on SQL or MongoDB native query + +00:06:42.180 --> 00:06:45.020 +syntax or whatever vanilla lowest level thing. + +00:06:45.160 --> 00:06:47.960 +They see more of that than anything because it's across all the technologies. + +00:06:48.740 --> 00:06:49.680 +I think that's going to be a thing. + +00:06:49.760 --> 00:06:56.840 +And it's interesting how you sort of set the stage so that was already present for you and your product, right? + +00:06:57.120 --> 00:06:57.340 +Yeah. + +00:06:57.520 --> 00:07:02.720 +But even when we started building the Logfire platform, I remember saying, everyone was like, which ORM are we going to use? + +00:07:03.120 --> 00:07:06.100 +We're building a FastAPI, so there was some debate about how we do it. + +00:07:06.100 --> 00:07:07.600 +And I was like, let's just write SQL. + +00:07:07.960 --> 00:07:14.480 +And everyone, you know, it seemed like an odd thing to do because, sure, it's like six lines of SQL to do a simple, like, what would be a, like, get in Django. + +00:07:15.560 --> 00:07:20.360 +ORM. But I mean, I think even before LLMs, people were compelled enough because they were like, + +00:07:20.390 --> 00:07:24.920 +yeah, the autocomplete kind of LLM will do a lot of the work for me. And now I have complete + +00:07:25.120 --> 00:07:30.180 +control. Now I think where the majority of code is being written by AIs, having full control, + +00:07:30.480 --> 00:07:35.440 +full SQL is incredibly useful. 
And you can optimize it, right? You can only get the particular column + +00:07:35.620 --> 00:07:39.839 +that you want. You can be very careful about which indexes are being used. You can copy paste the + +00:07:39.860 --> 00:07:43.000 +SQL into whatever and work out the plan, + +00:07:43.760 --> 00:07:45.340 +that's much harder when you're using an ORM. + +00:07:45.400 --> 00:07:46.040 +So, yeah. + +00:07:46.280 --> 00:07:46.400 +Yeah. + +00:07:46.720 --> 00:07:49.980 +And you could just star star the dictionary + +00:07:50.120 --> 00:07:52.020 +that comes back right into a pedantic class + +00:07:52.180 --> 00:07:54.640 +and then you put that behind a function. + +00:07:54.760 --> 00:07:55.600 +You don't mess with it. + +00:07:55.740 --> 00:07:56.100 +It's safe. + +00:07:56.700 --> 00:07:56.960 +Exactly. + +00:07:57.540 --> 00:07:57.640 +Yeah. + +00:07:57.660 --> 00:07:59.500 +You kind of get the programmer benefits + +00:08:00.740 --> 00:08:02.580 +of programming against typed classes + +00:08:02.960 --> 00:08:06.180 +and the AI benefits of it can just talk like vanilla + +00:08:06.560 --> 00:08:07.520 +and the performance as well. + +00:08:08.240 --> 00:08:08.480 +All right. 
+
+00:08:09.140 --> 00:08:13.020
+don't necessarily want to go too far down that rat hole we got a different one to go down
+
+00:08:13.860 --> 00:08:22.580
+let's talk about python interpreters so you built monty a specialized python interpreter written in
+
+00:08:22.840 --> 00:08:29.520
+rust and i i just want to just do a little historical journey to show like for people
+
+00:08:29.520 --> 00:08:36.119
+who don't know like this is not the first one of these i actually i'm happy to riff on this but i'll
+
+00:08:36.140 --> 00:08:43.080
+you take the lead i heard a conversation about two from two programmers an interchange exchange
+
+00:08:43.289 --> 00:08:49.160
+between those two talking about c python they're like what is c python is it like python that
+
+00:08:49.260 --> 00:08:55.520
+compiles to c or like you know so maybe just a little bit of a chat about what the heck is an
+
+00:08:55.680 --> 00:09:00.100
+interpreter yeah go ahead i remember being confused about that too and and you know Cython which i
+
+00:09:00.180 --> 00:09:05.079
+don't think we hear about so much anymore but that confused me as well i remember yeah so it's
+
+00:09:05.100 --> 00:09:11.740
+interesting that even from as you know far back as cpython's uh origination there was a acknowledgement
+
+00:09:11.860 --> 00:09:16.160
+that there might be other pythons and the python is a is a language not an implementation but yeah
+
+00:09:17.020 --> 00:09:25.080
+go ahead yeah so well we've got the python interpreter and we got python code we write
+
+00:09:25.280 --> 00:09:33.799
+often we write well python the language but when it executes it doesn't actually execute in python
+
+00:09:33.900 --> 00:09:39.640
+it might execute because C understands it and a C compiled thing runs or in your case rust
+
+00:09:40.540 --> 00:09:46.700
+understands the bytecode right so the interpreter parses our python into python bytecodes
+
+00:09:46.970 --> 00:09:51.020
+which you can get through to 
with the dis module. You can disassemble it and look at the actual

00:09:51.280 --> 00:09:55.960
bytecodes you got back, and then those are sent off to, like, a giant loop that interprets them,

00:09:56.620 --> 00:10:03.200
hence the term interpreter. So we've got CPython. We have the defunct IronPython for .NET,

00:10:03.180 --> 00:10:04.600
which made it all the way to 3.4.

00:10:04.860 --> 00:10:07.060
We've got the defunct Jython,

00:10:07.180 --> 00:10:08.520
which made it all the way to 2.7.

00:10:09.380 --> 00:10:12.380
And we've got the much more exciting and modern Pyodide,

00:10:12.920 --> 00:10:15.700
which- - Pyodide is still CPython, so-

00:10:16.500 --> 00:10:17.960
- Yes, but compiled for WebAssembly,

00:10:18.040 --> 00:10:19.320
which I feel, I don't know,

00:10:19.320 --> 00:10:21.160
I feel like Rust and WebAssembly have this kinship.

00:10:21.180 --> 00:10:22.700
So it's like, I don't know,

00:10:22.700 --> 00:10:23.700
it feels closer to Rust than the others.

00:10:23.700 --> 00:10:25.640
- I agree, there's also RustPython,

00:10:25.820 --> 00:10:26.520
which is in active development.

00:10:26.520 --> 00:10:28.140
I don't know what that's currently pointing at.

00:10:29.980 --> 00:10:35.520
There's also GraalPy, which is another Python interpreter.

00:10:36.010 --> 00:10:38.460
And the second biggest really is PyPy,

00:10:39.580 --> 00:10:42.020
probably the best known one of all.

00:10:42.640 --> 00:10:44.140
So there's, I mean, there's a,

00:10:45.030 --> 00:10:47.340
without meaning to cause offense to those that are still active,

00:10:47.660 --> 00:10:50.760
there's also Unladen Swallow, which was another attempt.
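The disassembly step Michael describes is easy to try; CPython's stdlib dis module shows the bytecode instructions that the interpreter's eval loop executes:

```python
import dis

def add(a, b):
    return a + b

# Disassemble the function to see the bytecode the eval loop interprets.
dis.dis(add)

# The same instructions are available programmatically:
opnames = [instr.opname for instr in dis.Bytecode(add)]
print(opnames)
```

The exact opcode names vary between CPython versions, which is itself a reminder that bytecode is an implementation detail, not part of the language.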

00:10:51.300 --> 00:10:51.860
There's a whole, but look,

00:10:52.370 --> 00:10:55.040
without meaning to cause offense to any of those that are still alive,

00:10:55.070 --> 00:10:58.580
there was a kind of graveyard of other Python implementations.

00:10:59.020 --> 00:11:06.440
And so I went into this knowing that it's a space where lots of people have tried to build things, put in, bluntly, a great deal more effort than we have.

00:11:08.679 --> 00:11:13.740
And for the most part, I wouldn't say they failed, but they haven't got the same kind of adoption that CPython has.

00:11:13.840 --> 00:11:19.100
I mean, I think CPython is 99.9% of usage of Python.

00:11:21.120 --> 00:11:33.080
And my take is that the reason for that is you need almost complete, perfect consistency with CPython to use something else.

00:11:33.410 --> 00:11:40.900
Again, you need five nines of perfection, of identical behavior, before you would go and switch in any real application.

00:11:41.500 --> 00:11:45.080
I remember trying to use PyPy.

00:11:45.190 --> 00:11:46.440
And even if I could get it to run,

00:11:46.620 --> 00:11:48.340
well, it turns out its foreign function interface,

00:11:48.530 --> 00:11:53.760
with, like, asyncpg, was slower than CPython's.

00:11:53.760 --> 00:11:55.160
And so actually it didn't perform as well.

00:11:55.250 --> 00:11:59.640
And so the threshold to switch from CPython to something else

00:11:59.690 --> 00:12:02.400
or to choose something else was incredibly high.

00:12:02.840 --> 00:12:07.000
And so we are not trying to build another Python interpreter

00:12:07.130 --> 00:12:09.520
that you might credibly move your application across.

00:12:10.100 --> 00:12:16.000
We're using Python as a syntax for a very specific thing where LLMs write code.

00:12:16.320 --> 00:12:23.200
And the fact that we have a different goal is one of the reasons that we thought this was a credible project to take on.

00:12:24.240 --> 00:12:34.240
You know, the real challenge, I think, that I saw with all of those is there are so many different use cases.

00:12:34.860 --> 00:12:39.560
And it's both a big benefit of all the Python packages and stuff.

00:12:39.740 --> 00:12:43.320
But this package pulls in this compiled thing,

00:12:43.390 --> 00:12:46.640
and this other one pulls in another compiled thing,

00:12:46.820 --> 00:12:50.060
and it assumes that the GIL works exactly in this way.

00:12:50.460 --> 00:12:53.660
And so there's all these implied behaviors

00:12:53.810 --> 00:12:55.400
that have to be carried across.

00:12:56.190 --> 00:12:59.120
And a lot of these, I think, were trying to say,

00:12:59.380 --> 00:13:01.660
let's put those to the side and see

00:13:01.700 --> 00:13:04.800
if we could build something neater that's more native to Java

00:13:05.160 --> 00:13:08.500
or .NET or whatever people were after with those different ones.

00:13:10.000 --> 00:13:12.900
But then the compatibility just hit them in the face, right?

00:13:13.120 --> 00:13:15.740
I haven't actually counted PyPI lately,

00:13:17.700 --> 00:13:21.600
but how many, we're almost just short of three-quarters of a million,

00:13:22.120 --> 00:13:27.340
two packages short of three-quarters of a million packages.

00:13:28.260 --> 00:13:30.820
I'm just going to say, yes, we're going to leave it open.

00:13:30.980 --> 00:13:32.140
We're absolutely leaving that open.

00:13:33.559 --> 00:13:36.520
But trying to be compatible with that many projects?

00:13:36.580 --> 00:13:39.320
We're actually 5,002 short of.

00:13:39.600 --> 00:13:45.040
Oh yeah, yeah. Okay, sorry, I just, sorry to be a pedant, but it comes with the, oh yeah, yeah, you

00:13:45.040 --> 00:13:51.500
know, you're right, right, seven four four, not seven. Yeah. Well, we're gonna, there's gonna be some

00:13:51.500 --> 00:13:56.560
kind of milestone reached, but it's not the one I was hoping for. Anyway, the point is, there's this, there's

00:13:56.640 --> 00:14:03.759
so many edge cases and so many specializations. Yeah, I think that's really where it hit them. And

00:14:04.460 --> 00:14:10.580
you know, maybe this is a good segue to just, you know, what, if not that, then what are

00:14:10.580 --> 00:14:17.780
you actually building? What is this Monty? So Monty tries to solve this problem where we want to

00:14:17.920 --> 00:14:21.840
allow, LLMs are very, very good at writing code. We're talking about them writing SQL earlier.

00:14:21.910 --> 00:14:27.200
They're very good at writing Python and JavaScript. I think honestly it wouldn't really

00:14:27.390 --> 00:14:32.079
matter to the implementation whether we were implementing Python or Java

00:14:32.100 --> 00:14:34.800
or JavaScript, it just turns out for a bunch of reasons

00:14:35.030 --> 00:14:38.540
Python is easier and it's also, like, where we come from.

00:14:41.139 --> 00:14:46.520
The simplest use case of Monty is what people call programmatic tool calling or code mode,

00:14:47.080 --> 00:14:51.480
where instead of my LLM calling tools in a loop,

00:14:52.730 --> 00:14:56.480
sometimes using the return value from one tool straight into the next tool,

00:14:57.260 --> 00:15:05.980
the LLM can just go and write code and thereby be more reliable and much more performant and much

00:15:06.100 --> 00:15:11.700
lower cost.
So we've seen examples of like, if you, for example, connect Pydantic AI with code

00:15:11.820 --> 00:15:20.260
mode enabled to GitHub's MCP and you say, go and find the five latest pull requests. And I forget

00:15:20.330 --> 00:15:24.980
what the question was, right? But the point was, we have to go jump through their API via MCP.

00:15:28.239 --> 00:15:29.600
and calculate some value,

00:15:29.840 --> 00:15:32.400
we've seen tasks go from kind of $2 down to 4 cents

00:15:33.900 --> 00:15:35.280
as a result of using code mode.

00:15:35.420 --> 00:15:37.980
Because one of the big reasons for that

00:15:38.100 --> 00:15:39.960
is that those MCP responses are vast.

00:15:40.320 --> 00:15:43.000
And so the LLM has to put loads of tokens into context

00:15:43.580 --> 00:15:45.300
to go and pull out, well, actually this is just like

00:15:45.440 --> 00:15:47.520
the ID of the thing I need to make the next request.

00:15:48.200 --> 00:15:51.320
- I just added an MCP server to Talk Python

00:15:51.600 --> 00:15:53.820
a few weeks ago, so people could ask questions

00:15:54.080 --> 00:15:54.580
about it and stuff.

00:15:55.120 --> 00:16:05.740
And what really surprised me is the actual return type that the MCP servers recommend is markdown, not structured data.

00:16:05.850 --> 00:16:10.320
So you basically send a giant blob of markdown back as the response.

00:16:10.510 --> 00:16:16.660
And then, like you're saying, a bunch of tokens get consumed, just trying to understand the response rather than here's a JSON document.

00:16:16.790 --> 00:16:18.560
I know it's called this. Boom. Answer.

00:16:18.740 --> 00:16:23.520
So I think in the case of GitHub's one, they do return JSON, which is useful for us because we can then go parse that JSON.

00:16:23.860 --> 00:16:27.900
But also, if you don't need the whole of that response,

00:16:28.190 --> 00:16:32.320
you can search through it and extract a particular thing you need.

00:16:32.860 --> 00:16:37.380
So the conservative threshold for what Monty can do

00:16:37.540 --> 00:16:42.340
is to allow us to implement this code mode use case.

00:16:42.850 --> 00:16:44.780
And I think it works for that for the most part now.

00:16:45.080 --> 00:16:46.580
We're working hard on some improvements.

00:16:47.740 --> 00:16:50.839
The biggest difference of it versus all of the other Python implementations

00:16:51.420 --> 00:16:54.160
is it is completely sandboxed.

00:16:54.160 --> 00:16:58.140
It is isolated from your machine.

00:16:58.300 --> 00:17:01.040
So you can't open a file

00:17:01.180 --> 00:17:02.540
or read an environment variable

00:17:03.560 --> 00:17:05.079
unless you very specifically say,

00:17:05.579 --> 00:17:06.620
here are the environment variables

00:17:06.740 --> 00:17:07.939
you're passing into this context.

00:17:08.560 --> 00:17:12.920
Or here are the pseudo files

00:17:12.959 --> 00:17:14.520
or indeed soon real files

00:17:14.939 --> 00:17:16.780
that I specifically want to expose to this runtime.

00:17:17.520 --> 00:17:18.900
That means that obviously reading a file

00:17:18.900 --> 00:17:20.719
is going to be way less performant than in CPython,

00:17:20.740 --> 00:17:23.839
where we can go and make some syscall to read a file.

00:17:24.280 --> 00:17:24.939
We're not doing that.

00:17:25.100 --> 00:17:27.980
You're calling back from the Monty runtime

00:17:28.800 --> 00:17:30.680
to the host runtime,

00:17:30.820 --> 00:17:33.480
which might be Python or might be JavaScript or Rust,

00:17:33.900 --> 00:17:36.180
to say, read me this particular file,

00:17:36.260 --> 00:17:37.340
and then it can choose what to do.

00:17:37.940 --> 00:17:39.900
But that is obviously what you want in this scenario

00:17:40.080 --> 00:17:42.080
where the LLM is writing the code.

00:17:42.480 --> 00:17:45.560
So that is the regard in which we are completely different

00:17:45.840 --> 00:17:49.760
from all of the other Python implementations.

00:17:50.020 --> 00:17:52.060
And there's a few other projects doing similar things,

00:17:52.780 --> 00:17:54.520
but we're different in that regard

00:17:54.680 --> 00:17:56.300
from all of the established programming languages,

00:17:56.540 --> 00:17:58.540
which would all have ways to read files.

00:18:00.440 --> 00:18:01.600
- Very interesting take.

00:18:02.240 --> 00:18:04.740
You know, it might be worth just a quick mention to,

00:18:06.060 --> 00:18:07.740
there's plenty of people out there listening

00:18:07.750 --> 00:18:10.960
who have not done agentic tool-using coding.

00:18:11.830 --> 00:18:15.180
So I think understanding just that the flow of that

00:18:15.320 --> 00:18:18.620
is kind of important to understanding the value of this, right?

00:18:18.740 --> 00:18:20.120
And you did definitely touch on it.

00:18:20.220 --> 00:18:24.080
But if you go and ask Claude Code to do something, or Cursor,

00:18:24.800 --> 00:18:29.500
or whatever, it's constantly like, let me run this git command.

00:18:29.550 --> 00:18:30.680
Let me run this ls command.

00:18:30.730 --> 00:18:31.820
Let me run this find.

00:18:32.260 --> 00:18:35.100
And periodically, it'll just exec Python,

00:18:35.770 --> 00:18:37.440
like little strings of Python and stuff.

00:18:38.000 --> 00:18:41.340
So one of your core ideas is, what

00:18:41.390 --> 00:18:43.960
if we could give it a better Python

00:18:45.020 --> 00:18:47.900
that it's encouraged to use for this kind of behavior?

00:18:49.260 --> 00:18:51.100
Let me describe it in a slightly different way.

00:18:51.440 --> 00:18:54.160
Okay, so we have a continuum of how much control

00:18:54.600 --> 00:18:56.280
and how much flexibility LLMs have.

00:18:56.380 --> 00:18:58.880
At one end of the spectrum, we have pure tool calling,

00:18:59.040 --> 00:19:00.220
where they can basically return JSON

00:19:00.800 --> 00:19:02.780
with the name of a tool that you're going to call.

00:19:04.419 --> 00:19:07.620
And there are agent frameworks like Pydantic AI

00:19:07.720 --> 00:19:09.020
that allow you to hook that up to functions,

00:19:09.240 --> 00:19:11.100
but ultimately, you're just getting JSON back

00:19:11.300 --> 00:19:12.680
and you're deciding what to do with that,

00:19:12.800 --> 00:19:15.160
and you may call the LLM again with some return value.

00:19:15.840 --> 00:19:17.360
At the full other end of the spectrum,

00:19:17.440 --> 00:19:18.840
we have complete computer use.

00:19:19.140 --> 00:19:20.860
Some LLM has some vision model

00:19:20.890 --> 00:19:23.440
and it's moving my cursor around on screen

00:19:23.840 --> 00:19:24.820
to do everything I want.

00:19:25.310 --> 00:19:26.020
Type onto our keyboard.

00:19:27.200 --> 00:19:28.620
In the middle, we have a bunch of options.

00:19:28.710 --> 00:19:31.560
We have Monty, which is kind of on the,

00:19:32.340 --> 00:19:33.940
near the tool calling end of the spectrum.

00:19:34.240 --> 00:19:37.840
Then we have sandboxes like Daytona and E2B and Modal.

00:19:38.380 --> 00:19:40.100
And then we have the kind of Claude Code

00:19:40.190 --> 00:19:42.800
or Codex style of like complete control of your terminal.

00:19:43.480 --> 00:19:44.300
And along that spectrum,

00:19:44.980 --> 00:19:46.719
you get more and more power

00:19:46.740 --> 00:19:50.280
in terms of capacity of what the LLM might be able to do

00:19:50.840 --> 00:19:52.220
and more and more security concerns.

00:19:52.440 --> 00:19:55.460
And generally that comes with more and more of having an adult

00:19:55.660 --> 00:19:57.820
watching what it's going to go and do and controlling it

00:19:58.240 --> 00:20:01.080
and uncrashing it when it crashes, when it goes and does the wrong thing.

00:20:01.920 --> 00:20:06.380
And so for the most part today, when we're using something in the cloud

00:20:06.620 --> 00:20:13.160
that uses an LLM, it's doing the tool calling end of the spectrum.

00:20:13.640 --> 00:20:18.440
That's what the kind of LangChain, LangGraph, Pydantic AI, CrewAI, all those guys are doing.

00:20:21.440 --> 00:20:27.960
The LLM is doing very similar things when Claude Code basically decides to go and run ls or run rm -rf.

00:20:28.680 --> 00:20:35.980
It's calling the tool, like bash command, which the Claude application running on your machine chooses to go and execute.

00:20:36.520 --> 00:20:41.880
The point is, for the most part, when we're building applications that are going to go and run in the cloud,

00:20:42.120 --> 00:20:46.780
we don't have a software developer who understands what's going on, sitting, watching every command.

00:20:47.390 --> 00:20:50.860
And so we need to be much more constrained in what we're going to allow the LLM to do.

00:20:51.880 --> 00:20:55.760
But we want to have a little bit more expressiveness than we do with pure tool calling.

00:20:56.480 --> 00:21:06.180
And at the moment, there is basically nothing in the spectrum between tool calling and go and run a sandboxing service and have access to a full sandbox.

00:21:06.330 --> 00:21:08.480
And that's powerful. You can do a bunch of things with it.

00:21:08.740 --> 00:21:12.200
But often we don't need that stuff, and that's where Monty is. That's the kind of sweet spot.

00:21:14.480 --> 00:21:15.960
Okay, there's interesting

00:21:18.960 --> 00:21:24.740
incentives or something that align with this undertaking as well, right? For example, if you

00:21:24.900 --> 00:21:31.260
don't give it a networking stack, it can't do bad things on the network. Yeah. Because it just doesn't

00:21:31.340 --> 00:21:36.779
exist, right? So it helps you, it inspires you to create, like, a more minimal version of the standard library

00:21:36.800 --> 00:21:43.640
and so on. Yeah. And you can imagine, like, we will soon have some version of HTTP requests

00:21:43.740 --> 00:21:48.700
that you can make, but you will be required to go and enable that explicitly. And even better,

00:21:48.880 --> 00:21:53.240
because you're calling through the host, you're going to have a perfect point where you can go and

00:21:53.400 --> 00:21:58.440
read the URL and go, no, you can't make a request to localhost and go and, like, start snooping on

00:21:58.520 --> 00:22:02.880
what's going on here. You have to be making a request to an external URL or whatever else it

00:22:02.900 --> 00:22:07.400
might be, or even, I'm going to go and use some third party service to proxy all HTTP requests,

00:22:08.120 --> 00:22:13.700
so it is never an untrusted HTTP request inside my network. But the point is, this is the single

00:22:13.920 --> 00:22:19.040
biggest difference of Monty: every single place where the code can interact with

00:22:19.140 --> 00:22:26.720
the real world, it must call an external function, so call back through the host. And then the other

00:22:26.980 --> 00:22:32.860
regard in which it is, I think, somewhat innovative is we are not using traditional callbacks for

00:22:32.880 --> 00:22:39.480
that.
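A host-side gate like the one Samuel describes could be sketched as below. This is an illustration of the call-back-through-the-host idea only; `host_http_get` and the blocklist are hypothetical names, not Monty's actual API:

```python
from urllib.parse import urlparse

# Hosts the host process refuses to fetch on the sandbox's behalf
# (illustrative; a real deployment would be more thorough than a name list).
BLOCKED_HOSTS = {"localhost", "127.0.0.1", "0.0.0.0"}

def host_http_get(url: str) -> str:
    """Hypothetical host callback: the sandboxed code can't open sockets
    itself, so every request funnels through here and can be vetted."""
    host = urlparse(url).hostname or ""
    if host in BLOCKED_HOSTS or host.endswith(".internal"):
        raise PermissionError(f"refusing request to {host!r}")
    # A real host would perform the request here (or hand it to a proxy).
    return f"would fetch {url}"

result = host_http_get("https://example.com/repos")
print(result)
```

The sandboxed code never sees a socket; it only ever sees the value the host chooses to hand back.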
So we're not giving the runtime a list of pointers to functions it can call on the host.

00:22:40.060 --> 00:22:46.440
Instead, the Monty runtime is effectively suspending and returning control to the host

00:22:46.920 --> 00:22:49.720
whenever you're doing a tool call. So you're basically getting a response, which is like,

00:22:50.100 --> 00:22:56.380
call the function read file with the arguments file name, or whatever else it might be. And that

00:22:56.620 --> 00:23:01.580
allows a few things, but in particular, it allows us, if that tool we're going to go and run,

00:23:02.320 --> 00:23:04.600
well, that function we're going to go and run is going to take two days to run,

00:23:05.340 --> 00:23:10.620
we can serialize the Monty runtime, go put that in a database and shut down that process

00:23:10.740 --> 00:23:12.340
and wait for the tool to come back.

00:23:12.860 --> 00:23:17.520
And that's something that CPython doesn't offer, understandably, but we are able to build

00:23:17.680 --> 00:23:19.060
because we built Monty from scratch.

00:23:19.600 --> 00:23:23.440
You can serialize the entire interpreter state, go put it into a database and retrieve it

00:23:23.560 --> 00:23:24.540
later when you want to resume.

00:23:25.100 --> 00:23:25.920
That's pretty wild.

00:23:27.180 --> 00:23:29.440
So it's got this durability aspect, right?

00:23:29.960 --> 00:23:30.120
Yeah.

00:23:31.580 --> 00:23:38.840
Which I think is, in these scenarios, often the code execution part of this is going to take milliseconds,

00:23:39.610 --> 00:23:50.020
but our tools might take minutes or hours or whatever else, both for durability and to build an application that's both more durable and easier to maintain.

00:23:50.820 --> 00:23:58.280
You don't have to have that interpreter state hanging around in memory as you would with CPython.

00:24:00.140 --> 00:24:29.020
Yeah.
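The suspend-and-resume flow described here can be illustrated with a toy runtime whose state is explicit data rather than a live CPython frame, which is exactly what makes it serializable. This is a sketch of the idea only, not Monty's actual API:

```python
import pickle

# Toy stand-in for a suspendable runtime: explicit state instead of a live
# interpreter frame, so the whole thing can be pickled and parked somewhere.
class ToyRuntime:
    def __init__(self, steps):
        self.steps = steps      # remaining "instructions"
        self.results = []       # work completed so far

    def run_until_tool_call(self):
        while self.steps:
            step = self.steps.pop(0)
            if step.startswith("tool:"):
                return step[5:]          # suspend: hand the tool name to the host
            self.results.append(step)    # "execute" an ordinary step
        return None                      # finished

rt = ToyRuntime(["compute", "tool:read_file", "compute"])
pending = rt.run_until_tool_call()       # suspends, asking for "read_file"
blob = pickle.dumps(rt)                  # park the whole runtime, e.g. in a database

rt2 = pickle.loads(blob)                 # hours later, possibly in a fresh process
rt2.results.append("file contents")      # inject the tool's return value
rt2.run_until_tool_call()                # resume to completion
```

The host decides when and where to resume; nothing about the paused "interpreter" has to stay in memory in between.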
And all the other things like timeout and just other weird oddities, right? Like, I was working on something on my laptop just yesterday and my wife's like, you ready to go? I'm like, hold on. I got to wait. I got to wait for this chat to complete. It's been going for five minutes. It's almost done. Just hold on. Then I can close my laptop and roll, you know, because who knows what it would have done to it, right?

00:24:29.540 --> 00:24:35.680
Yeah. Yeah. And talking of timeouts, the other thing that we're able to do in Monty is we're able to, look,

00:24:35.800 --> 00:24:39.160
it's not perfect yet because it's early, but we basically allow you to set resource limits.

00:24:39.460 --> 00:24:44.500
So total execution time and memory limit in particular, and recursion depth.

00:24:45.160 --> 00:24:49.520
And therefore you can run this Monty thing in some small image in the cloud.

00:24:49.550 --> 00:24:55.660
And you can say it's got 10 megabytes. And, you know, once it's hardened, you know, it's early.

00:24:55.660 --> 00:24:58.640
We have that support now, but I'm not saying there are no ways around it.

00:24:58.960 --> 00:25:01.180
It can't go and kill your machine out of memory.

00:25:02.840 --> 00:25:05.020
It can't OOM your container.

00:25:05.220 --> 00:25:09.080
You're just going to get back a resources error saying too much memory consumed.

00:25:10.520 --> 00:25:11.180
Yeah, very powerful.

00:25:12.100 --> 00:25:15.860
So I see on the GitHub page here a couple of things.

00:25:15.930 --> 00:25:19.080
First of all, it supports Python 3.10, 3.11, 3.12, 3.13, 3.14.

00:25:19.640 --> 00:25:22.880
Presumably 3.15 will take the place of 3.10 in a year or something.

00:25:23.400 --> 00:25:26.659
So that is the support for the...

00:25:26.680 --> 00:25:30.080
So we have, so the Monty runtime is written entirely in Rust.
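For comparison, of the three limits mentioned (time, memory, recursion depth), the only one CPython itself exposes as a simple interpreter knob is recursion depth; a quick sketch of capping it and catching the failure:

```python
import sys

def recurse(n=0):
    return recurse(n + 1)

old_limit = sys.getrecursionlimit()
sys.setrecursionlimit(100)         # cap recursion depth for this interpreter
try:
    recurse()
except RecursionError:
    capped = True                  # the runaway recursion was stopped
finally:
    sys.setrecursionlimit(old_limit)
print(capped)
```

Wall-clock and memory caps have no equivalent interpreter-level switch in CPython (hosts typically reach for OS mechanisms like cgroups or `resource.setrlimit`), which is the gap the Monty limits described above are aimed at.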

00:25:30.120 --> 00:25:35.620
It has no dependency on CPython or PyO3 or anything else.

00:25:35.800 --> 00:25:37.040
It is a pure Rust library.

00:25:37.440 --> 00:25:37.960
We're very lucky.

00:25:37.980 --> 00:25:48.980
We have the AST parser from Ruff from the Astral team that allows us to go from Python code to some basically structured object.

00:25:48.980 --> 00:25:51.160
So we don't have to go and do the parsing of the Python code ourselves.

00:25:51.640 --> 00:25:54.000
Right, because Ruff is already written in Rust.

00:25:54.660 --> 00:26:00.340
I feel like the Astral team is kind of a peer of yours for sure.

00:26:00.480 --> 00:26:02.660
You guys must look at each other, what you all are doing.

00:26:03.280 --> 00:26:03.600
Yeah, yeah.

00:26:03.760 --> 00:26:04.840
And we use that a lot.

00:26:04.900 --> 00:26:06.240
And also we have ty built in.

00:26:06.320 --> 00:26:11.540
So the ty type checker from Astral is again written in Rust.

00:26:11.720 --> 00:26:14.800
And so it is compiled into Monty when you use it.

00:26:14.820 --> 00:26:18.460
And so before you run your code, you can go and run type checking at the same time.

00:26:18.580 --> 00:26:26.380
And again, that feedback is incredibly useful for LLMs to get them to write reasonably reliable workflows.

00:26:28.480 --> 00:26:36.540
But to come back to your question, so we have Monty itself, which is just Rust, pure Rust, no other C dependencies, just in Rust.

00:26:36.740 --> 00:26:43.620
And then we have, you can use that as a Rust library directly in your Rust application if you so wish, and there are people already doing that.

00:26:44.000 --> 00:26:52.180
But we then have libraries for Python and for JavaScript, which use, in the case of Python, PyO3, which is amazing.

00:26:52.470 --> 00:26:57.540
In the case of JavaScript, a thing called napi, or maybe you're supposed to pronounce it N-API.

00:26:57.710 --> 00:26:58.100
I don't know.

00:27:00.460 --> 00:27:04.400
Which basically means we can then go and have JavaScript and Python packages where you can call Monty.

00:27:04.740 --> 00:27:10.960
And so slightly confusingly, that Python 3.10 through 3.14 is referring to the Python package that you're installing.

00:27:11.500 --> 00:27:15.360
The actual Monty is targeting Python 3.14 syntax only.

00:27:15.820 --> 00:27:16.380
I see.

00:27:16.740 --> 00:27:20.420
But those are the different language features

00:27:20.590 --> 00:27:22.760
that you support basically for parsing, right?

00:27:22.980 --> 00:27:23.460
Something like that.

00:27:24.240 --> 00:27:24.480
Yes.

00:27:24.590 --> 00:27:25.080
No, no, no.

00:27:25.090 --> 00:27:26.660
So that's just like we only support.

00:27:26.830 --> 00:27:29.120
So Monty itself will run as if it was 3.14

00:27:29.250 --> 00:27:30.500
or some subset of it.

00:27:30.500 --> 00:27:31.940
We don't support all the syntax yet,

00:27:31.950 --> 00:27:34.660
but like 3.14 type stuff.

00:27:34.730 --> 00:27:36.480
But yeah, when you're installing it,

00:27:36.540 --> 00:27:39.080
when you're uv add pydantic-monty,

00:27:39.220 --> 00:27:41.340
you can do that in 3.10 through 3.14.

00:27:41.620 --> 00:27:43.900
And obviously we maintain a bunch of Rust stuff.

00:27:44.100 --> 00:27:45.520
We've worked hard to have binaries

00:27:45.620 --> 00:27:46.840
for basically every environment,

00:27:47.360 --> 00:27:49.980
Python, Linux, macOS, Windows,

00:27:50.460 --> 00:27:52.260
a bunch of different architectures.

00:27:52.800 --> 00:27:55.160
And we have PGO builds, which no one else has.

00:27:55.180 --> 00:27:56.400
So that should improve performance again.

00:27:58.320 --> 00:27:58.800
Yeah, yeah.

00:28:02.200 --> 00:28:02.600
PGO?

00:28:03.180 --> 00:28:03.440
PTO?

00:28:04.539 --> 00:28:05.340
PGO is...

00:28:05.560 --> 00:28:06.100
Yeah, tell people.

00:28:07.220 --> 00:28:10.580
Yeah, so we did this first in Pydantic itself,

00:28:10.760 --> 00:28:12.080
which obviously the core is written in Rust.

00:28:12.520 --> 00:28:14.820
And it was, in fact, David Hewitt on our team,

00:28:15.000 --> 00:28:16.200
who's the PyO3 maintainer,

00:28:16.200 --> 00:28:18.380
who identified this great technique.

00:28:18.460 --> 00:28:19.900
So basically, it's part of Rust.

00:28:20.680 --> 00:28:22.320
You basically compile the library,

00:28:24.300 --> 00:28:27.280
and then you run as many different bits of code against it as you can,

00:28:27.320 --> 00:28:29.020
in our case, all of the unit tests.

00:28:29.700 --> 00:28:32.840
And then you basically recompile it with pointers

00:28:33.020 --> 00:28:37.000
as to which paths in the code, which branches are most common,

00:28:37.240 --> 00:28:39.720
and you can get up to about 50% performance improvement.

00:28:40.140 --> 00:28:41.640
The thing is, if you're building your own library,

00:28:41.800 --> 00:28:42.340
that's a real pain.

00:28:42.400 --> 00:28:44.600
If you're building your own application, that's a pain.

00:28:44.760 --> 00:28:47.140
But if you just uv add pydantic-monty,

00:28:47.160 --> 00:28:48.180
you get that stuff for free.

00:28:48.720 --> 00:28:49.560
Yeah, super cool.

00:28:49.940 --> 00:28:52.040
Yeah, I'm reoriented in my acronyms now,

00:28:52.620 --> 00:28:55.000
for profile-guided optimization, right?

00:28:55.700 --> 00:28:56.660
Yeah, so you basically,

00:28:59.880 --> 00:29:01.660
compilers, as Python people,

00:29:01.720 --> 00:29:02.940
we don't necessarily think about them a lot,

00:29:03.100 --> 00:29:05.860
but compilers have all sorts of optimizations.
+ +00:29:06.040 --> 00:29:13.920 +And I remember in the late 90s when I was working with things like GCC and stuff, you could actually break your program by asking for too many optimizations. + +00:29:14.280 --> 00:29:16.120 +You know, you could, it had these levels. + +00:29:16.280 --> 00:29:22.160 +And if you put it on the top level, there's a chance your program like literally might not run, which is a really bizarre thing for compilers to do. + +00:29:22.980 --> 00:29:31.580 +But they can, they make like decisions, like maybe we should inline this so we can avoid a stack jump and setting up the stack and all that. + +00:29:35.740 --> 00:29:41.540 +with the PGO, it actually looks at how the code runs and uses that as input for its optimization, + +00:29:41.880 --> 00:29:44.480 +which is a super cool idea. So it's awesome you're doing that. + +00:29:45.320 --> 00:29:48.320 +Yeah, and I honestly don't know what the difference is here. I think when I tried it, + +00:29:48.320 --> 00:29:52.180 +it was relatively minor, but in Pydantic, it makes for a big improvement. 
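The build-run-rebuild loop described above can be sketched with rustc's standard PGO flags. The crate, binary, and profile paths here are illustrative, not Pydantic's actual build scripts:

```shell
# 1. Build with instrumentation that records which branches actually run.
RUSTFLAGS="-Cprofile-generate=/tmp/pgo-data" cargo build --release

# 2. Exercise the binary with representative workloads (e.g. the test suite).
./target/release/my-benchmarks

# 3. Merge the raw profiles into a single file.
llvm-profdata merge -o /tmp/pgo-data/merged.profdata /tmp/pgo-data

# 4. Recompile, letting the optimizer favor the hot paths it observed.
RUSTFLAGS="-Cprofile-use=/tmp/pgo-data/merged.profdata" cargo build --release
```

The point made in the conversation is that wheel consumers skip all of this: the two-pass build happens once in CI, and `uv add` just downloads the already-optimized binary.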

00:29:54.940 --> 00:29:58.940
Yeah, going back a bit, I don't know if people remember, depending on where they were in their

00:29:58.940 --> 00:30:08.420
journey, but from Pydantic 1 to 2, you all got, like, 50x performance increases. And yeah, the

00:30:08.940 --> 00:30:14.720
Pydantic of today is not the Pydantic of 2017, right? It sure is not. It sure is not. And that was,

00:30:15.430 --> 00:30:19.380
you know, that was an enormous piece of work, the rewrite, because we didn't have LLMs. I think it

00:30:19.380 --> 00:30:22.020
would have been a job that would have been a heck of a lot easier if we'd been able to

00:30:22.680 --> 00:30:28.900
point Opus 4.6 at Pydantic and be like, do this, but in Rust. But hey, we got it done, and I learned

00:30:28.920 --> 00:30:36.160
a lot along the way. That's a challenge that we're gonna have to, I don't know how you see it, but I

00:30:36.340 --> 00:30:41.500
think as an industry, and individually, each of us is going to struggle with, like, how much Rust

00:30:41.720 --> 00:30:46.940
did you learn and how much experience and ideas did you get spending that year evolving

00:30:47.260 --> 00:30:52.980
Pydantic, versus if you just got it knocked out? Like, where's the trade-off? Like, I don't, I know

00:30:53.040 --> 00:30:56.100
there are those people who are like, now it's impossible to enter as a software engineer. I've

00:30:56.040 --> 00:31:00.200
spoken to some people, some really amazing product people who are like, I'm writing code

00:31:00.400 --> 00:31:04.320
suddenly because I have the right technical mindset. I just have never had the time to go

00:31:04.320 --> 00:31:09.360
and learn all this stuff. And now the LLM can do the, like, rote for me and I can do the innovative

00:31:09.620 --> 00:31:14.080
product stuff on top. So I get to build. So we have new people entering, but you're right.
There

00:31:14.280 --> 00:31:19.580
are going to be big challenges, because just as I don't have a clue about assembly and I'm not good

00:31:19.580 --> 00:31:23.680
at writing it, and that probably makes me a worse engineer than if I spent the first decade of my

00:31:23.700 --> 00:31:25.840
career hand writing out assembly.

00:31:26.900 --> 00:31:28.500
So as we add layers of abstraction,

00:31:28.860 --> 00:31:33.140
the layer of abstraction beneath becomes kind of in the shade

00:31:33.460 --> 00:31:34.720
to most of us.

00:31:34.720 --> 00:31:35.800
And we never look at it.

00:31:36.480 --> 00:31:38.060
Yeah, it's very interesting.

00:31:38.800 --> 00:31:41.940
I sort of think of this whole agentic coding thing

00:31:42.240 --> 00:31:45.900
as the change when design patterns became popular.

00:31:46.910 --> 00:31:48.240
Instead of talking about, here's how we're

00:31:48.240 --> 00:31:49.440
going to do the loop, or here's how we're

00:31:49.440 --> 00:31:51.780
going to construct the class, you just think, Singleton,

00:31:52.680 --> 00:31:56.940
Flyweight, and, like, you're building with these bigger conceptual building blocks. And now it's

00:31:57.040 --> 00:32:01.960
kind of like, make a login page. Okay, we've got the login. Now what? Now what else am I building? Like, you

00:32:02.040 --> 00:32:07.860
can think almost in components rather than, like, very small pieces. Yeah, kind of what PyPI does, but,

00:32:07.960 --> 00:32:15.020
like, at the next level up. Yeah, yeah, kind of, yeah. I do think there's still room for people to come

00:32:15.060 --> 00:32:19.080
into the industry. I think it's super exciting. You still just, I think it's really going to come

00:32:19.000 --> 00:32:23.680
down to, like, problem solving and breaking down things into the way you want them to work, and

00:32:23.800 --> 00:32:28.980
that, that's a programmer skill. I also think what we haven't seen yet is the things that
LLMs are bad
+
+00:32:29.060 --> 00:32:35.200
+at. Because one, if I tried to do something with an LLM and it doesn't work, that is not proof
+
+00:32:35.210 --> 00:32:39.680
+that I cannot do it with another. That's proof it didn't work that particular time. Whereas if I go
+
+00:32:39.680 --> 00:32:43.500
+and try and do something with an LLM and it does work, well, hey, that's proof it can be done. And two,
+
+00:32:44.000 --> 00:32:49.300
+no one wants to talk about the thing that failed, right? So Anthropic announced, we built a C
+
+00:32:49.520 --> 00:32:56.520
+compiler in two weeks by giving Opus loads of access. What they didn't say is, we tried to build
+
+00:32:56.590 --> 00:33:00.400
+an eBay clone and it was a complete, unmitigated failure, cost us what would have been a hundred
+
+00:33:00.560 --> 00:33:05.720
+thousand dollars of inference. I'm not saying that's happened, you know, no criticism. Yeah, we don't hear
+
+00:33:05.770 --> 00:33:10.200
+about the failures, both because they're less attractive to state and because they are not
+
+00:33:10.340 --> 00:33:15.380
+clear identifiers, as it were, in the way that, like, successes are. And I think one of the things we
+
+00:33:15.380 --> 00:33:19.260
+will learn over the next few years is, like, here are the things LLMs are really, really good at, and
+
+00:33:19.270 --> 00:33:26.320
+here are the things that no one has succeeded with them on yet. And that's probably meaningful. Yeah, I don't
+
+00:33:26.360 --> 00:33:30.060
+want to go too deep in this because I want to stay focused on Monty, but I'm also a believer in
+
+00:33:30.240 --> 00:33:36.380
+Jevons paradox. I think that this is going to create more demand for software now that people
+
+00:33:36.400 --> 00:33:40.500
+see what is possible rather than just like, well, we're going to build exactly the same amount of
+
+00:33:40.600 --> 00:33:48.200
+software with fewer people. So I think there's a lot there. So CodSpeed. So you have that. 
That's
+
+00:33:48.290 --> 00:33:53.720
+a pretty interesting tool. I just recently learned about this. You have this as a badge on
+
+00:33:53.720 --> 00:33:59.800
+your GitHub. Tell us a quick bit about this. I'm good friends with Arthur, who was the founder.
+
+00:34:00.540 --> 00:34:05.260
+I'm a big fan of CodSpeed when you're building performance-critical code. This is a nice view,
+
+00:34:05.280 --> 00:34:10.980
+but the real powerful thing is, basically, on a pull request you can see
+
+00:34:11.240 --> 00:34:17.500
+if you're getting performance regressions. So, okay. Um, and even better, so if you go to, so these are
+
+00:34:17.500 --> 00:34:21.820
+the particular benchmarks we have. So if, yeah, maybe you go to branches, or if you go to a pull
+
+00:34:21.919 --> 00:34:27.700
+request, uh, in our GitHub. Okay. Oh, if I compare all these, I've compared main against main. That's
+
+00:34:27.700 --> 00:34:33.480
+not super interesting. If you go back to, yeah, go to a PR that you
+
+00:34:33.460 --> 00:34:35.260
+guys have. To PRs.
+
+00:34:35.760 --> 00:34:37.280
+If you go, for example, to that
+
+00:34:37.820 --> 00:34:39.120
+data class one, the third one down.
+
+00:34:39.639 --> 00:34:40.620
+Gotcha. All right. Let's check that out.
+
+00:34:42.940 --> 00:34:43.460
+You'll see we have
+
+00:34:43.460 --> 00:34:45.200
+a comment from CodSpeed saying one benchmark
+
+00:34:45.419 --> 00:34:46.100
+has got more performance.
+
+00:34:47.679 --> 00:34:49.300
+More importantly, if I had
+
+00:34:49.540 --> 00:34:50.520
+a performance regression,
+
+00:34:52.279 --> 00:34:53.440
+now CodSpeed would be failing
+
+00:34:53.540 --> 00:34:55.100
+and I'd be like, I need to go fix that before
+
+00:34:55.679 --> 00:34:57.480
+I merge it. So, as long
+
+00:34:57.500 --> 00:34:58.840
+as we have enough benchmarks, we can't have
+
+00:35:00.499 --> 00:35:01.560
+silent regressions in performance.
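
The gate described here — fail a pull request when a benchmark regresses — can be sketched with nothing but the standard library. To be clear, this is an illustrative wall-clock version: CodSpeed itself counts CPU instructions via Valgrind rather than timing, and the function names below are made up for the sketch.

```python
import timeit

def build_pairs(n: int) -> list[tuple[int, int]]:
    # Toy workload standing in for a real benchmark target.
    return [(i, i * i) for i in range(n)]

def bench(fn, *args, repeat: int = 5, number: int = 50) -> float:
    # Best-of-N seconds per call; min() discards most scheduler noise.
    return min(timeit.repeat(lambda: fn(*args), repeat=repeat, number=number)) / number

baseline = bench(build_pairs, 1_000)   # in CI: the main branch's build
candidate = bench(build_pairs, 1_000)  # in CI: the pull request's build

# Fail the check on a big slowdown; the generous threshold tolerates noisy runners.
ratio = candidate / baseline
assert ratio < 5.0, f"performance regressed {ratio:.1f}x"
```

Instruction counting, as discussed next, is what makes this kind of check reliable on shared CI hardware, where wall-clock thresholds like the one above have to stay loose.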
+
+00:35:02.020 --> 00:35:03.420
+And even more powerful, if I go
+
+00:35:03.440 --> 00:35:05.100
+click on that particular one,
+
+00:35:05.140 --> 00:35:06.360
+if you click on the pair tuples.
+
+00:35:09.620 --> 00:35:10.060
+- Hold on.
+
+00:35:12.180 --> 00:35:13.660
+Here perhaps, pair tuple, yeah, got it.
+
+00:35:14.040 --> 00:35:14.160
+- Yeah.
+
+00:35:15.600 --> 00:35:17.800
+What you will see is we can now go and see
+
+00:35:19.460 --> 00:35:21.720
+in a flame graph exactly what's taking what time
+
+00:35:21.840 --> 00:35:23.480
+and where the performance changes have come from.
+
+00:35:23.820 --> 00:35:26.200
+This change is very minor, so it's not very interesting,
+
+00:35:26.380 --> 00:35:28.640
+but you can imagine if you accidentally do something slow
+
+00:35:28.740 --> 00:35:31.800
+in your code, this is Rust, but it'll work on Python as well,
+
+00:35:32.360 --> 00:35:33.940
+you would have this flame chart showing you
+
+00:35:34.060 --> 00:35:35.140
+where the performance has changed.
+
+00:35:36.440 --> 00:35:37.260
+- That's super neat.
+
+00:35:38.240 --> 00:35:39.300
+Yeah, if people who are listening
+
+00:35:39.370 --> 00:35:42.080
+just go to the Monty GitHub repo,
+
+00:35:42.380 --> 00:35:44.420
+go to any pull request, pull it down.
+
+00:35:44.490 --> 00:35:47.520
+And there's just a comment from the CodSpeed bot.
+
+00:35:48.020 --> 00:35:51.220
+And it says, the improvement changed
+
+00:35:51.380 --> 00:35:54.720
+from 97.7 milliseconds to 88.1 milliseconds.
+
+00:35:54.900 --> 00:35:57.120
+That's a 10.95% increase in performance.
+
+00:35:57.360 --> 00:36:00.440
+So, hey, this thing doesn't hurt performance, right?
+
+00:36:00.440 --> 00:36:00.880
+By adding it.
+
+00:36:01.460 --> 00:36:03.720
+- Yeah, what's even cooler is under the hood they're using,
+
+00:36:05.960 --> 00:36:07.060
+oh, I'm drawing a blank on the name,
+
+00:36:07.100 --> 00:36:08.740
+but they're not even measuring time,
+
+00:36:08.940 --> 00:36:10.760
+they're measuring like CPU instructions.
+
+00:36:10.760 --> 00:36:12.500
+- CPU instructions, okay, yeah, yeah.
+
+00:36:13.800 --> 00:36:15.840
+- So it can run in a noisy environment,
+
+00:36:15.960 --> 00:36:17.420
+like GitHub Actions, and you can still get
+
+00:36:17.600 --> 00:36:21.060
+pretty good accuracy on detecting performance changes.
+
+00:36:22.000 --> 00:36:22.800
+- Yeah, okay.
+
+00:36:22.940 --> 00:36:23.860
+- Valgrind, there we are.
+
+00:36:23.980 --> 00:36:25.660
+Valgrind is the underlying tool that, like,
+
+00:36:26.060 --> 00:36:27.700
+at the compiler level is looking at the number
+
+00:36:27.780 --> 00:36:28.540
+of CPU instructions.
+
+00:36:32.240 --> 00:36:33.440
+Let's see what this pulls up.
+
+00:36:33.780 --> 00:36:34.420
+Well, cool.
+
+00:36:35.760 --> 00:36:36.940
+I don't know what that's about, but.
+
+00:36:37.120 --> 00:36:40.860
+There's a polygonal, polygon.
+
+00:36:42.960 --> 00:36:43.360
+What?
+
+00:36:43.880 --> 00:36:45.300
+I don't know what this is, a cartoon,
+
+00:36:45.420 --> 00:36:46.420
+but there's also the app.
+
+00:36:47.920 --> 00:36:48.600
+The episode.
+
+00:36:48.840 --> 00:36:50.260
+That's its logo. Okay, I got it.
+
+00:36:50.800 --> 00:36:52.860
+At least it's like a hero image or something.
+
+00:36:56.280 --> 00:36:56.640
+Yeah.
+
+00:36:57.020 --> 00:37:03.980
+Yeah, so maybe a good segue then into performance, where the aim of Monty is not to build something faster than CPython.
+
+00:37:04.560 --> 00:37:08.400
+The aim, I suppose, is to build something that is not heinously slower.
+
+00:37:10.800 --> 00:37:15.880
+Performance seems to vary from about five times better to five times worse in most cases.
+
+00:37:15.960 --> 00:37:21.160
+I'm sure that there are edge cases we need to go and improve where it's worse than that, but that's what I seem to see.
+
+00:37:21.380 --> 00:37:24.500
+I mean, in my impression of the kind of LLM-written code
+
+00:37:24.540 --> 00:37:26.700
+that we're mostly talking about,
+
+00:37:27.320 --> 00:37:28.360
+performance is not critical.
+
+00:37:29.360 --> 00:37:32.820
+Execution is going to be in a matter of single-digit milliseconds,
+
+00:37:33.050 --> 00:37:33.840
+and that's not going to matter
+
+00:37:34.020 --> 00:37:35.420
+when your LLM requests are taking seconds.
+
+00:37:36.140 --> 00:37:37.940
+The thing where Monty really excels,
+
+00:37:38.080 --> 00:37:39.080
+if you scroll down a bit,
+
+00:37:39.350 --> 00:37:40.840
+and I can talk you through the table,
+
+00:37:42.960 --> 00:37:44.880
+it's like near the bottom of the readme.
+
+00:37:47.300 --> 00:37:48.500
+But yeah, there we are.
+
+00:37:48.640 --> 00:37:54.160
+So like the startup time here measured for Monty to go from basically code to a result.
+
+00:37:54.540 --> 00:38:00.200
+I think the code here is like one plus one is 0.06 milliseconds.
+
+00:38:00.980 --> 00:38:04.140
+So 60 microseconds.
+
+00:38:04.300 --> 00:38:10.400
+And actually in the hot loop in benchmarks, we see one plus one going from code to result
+
+00:38:10.620 --> 00:38:12.980
+in Monty taking about 900 nanoseconds.
+
+00:38:13.100 --> 00:38:14.260
+So under a microsecond.
+
+00:38:16.220 --> 00:38:19.440
+Again, that's microseconds, not milliseconds or seconds.
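
For a feel for those orders of magnitude, here is a rough, stdlib-only measurement of a code-to-result round trip in CPython itself — this is not Monty's benchmark harness, just a way to see the one-shot versus hot-loop difference the numbers above describe; absolute figures will vary by machine.

```python
import time

code = "1 + 1"

# One-shot: compile and evaluate a fresh snippet, as a cold start would.
t0 = time.perf_counter()
result = eval(compile(code, "<snippet>", "eval"))
one_shot_us = (time.perf_counter() - t0) * 1e6

# Hot loop: reuse the compiled object, amortizing the cost per call.
compiled = compile(code, "<snippet>", "eval")
n = 10_000
t0 = time.perf_counter()
for _ in range(n):
    result = eval(compiled)
per_call_ns = (time.perf_counter() - t0) / n * 1e9

print(f"one-shot: {one_shot_us:.1f} µs, hot loop: {per_call_ns:.0f} ns/call")
```

The same pattern — paying compilation once, then evaluating many times — is what separates the cold-start and hot-loop numbers quoted for Monty.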
+
+00:38:20.580 --> 00:38:23.340
+When you compare that to, like, running something in Docker,
+
+00:38:23.460 --> 00:38:26.680
+which is taking, in my example here, 195 milliseconds,
+
+00:38:29.400 --> 00:38:30.960
+Pyodide, Pyodide is an awesome project,
+
+00:38:31.260 --> 00:38:32.440
+big fan of the team,
+
+00:38:33.380 --> 00:38:34.580
+allowing you to run Python in the browser,
+
+00:38:34.660 --> 00:38:36.140
+but wasn't designed for this use case.
+
+00:38:36.860 --> 00:38:41.760
+Running, going from zero to, like, getting a result in Pyodide
+
+00:38:41.760 --> 00:38:43.460
+is 2.8 seconds.
+
+00:38:45.580 --> 00:38:50.220
+Starlark's a special case, another project, a bit like Monty, but a bit more limited.
+
+00:38:51.590 --> 00:38:54.060
+But sandboxing, I was talking earlier about that being one of the main options,
+
+00:38:54.170 --> 00:38:56.880
+like go and basically spin up a new container somewhere.
+
+00:38:57.100 --> 00:38:58.440
+There's a bunch of services that will do that.
+
+00:38:58.840 --> 00:38:59.800
+They're very popular at the moment.
+
+00:39:00.400 --> 00:39:04.380
+From scratch to creating a new container and getting a result, here it's taking over a second.
+
+00:39:05.640 --> 00:39:11.039
+So where Monty really excels is where you have a relatively small amount of Python code to call
+
+00:39:11.260 --> 00:39:15.920
+and the overhead of running it is, in realistic terms, basically zero.
+
+00:39:18.560 --> 00:39:23.700
+Yeah, so it's the cold start over and over and over again
+
+00:39:25.020 --> 00:39:26.760
+because these are all one-shot commands,
+
+00:39:26.960 --> 00:39:30.100
+like the LLM asks for this thing and it shuts down when it gets the answer, right?
+
+00:39:30.480 --> 00:39:33.580
+Yeah, and I'm sure that if you asked the sandbox providers,
+
+00:39:33.660 --> 00:39:35.480
+they would be like, yeah, but it's not about cold start.
+
+00:39:36.020 --> 00:39:39.080
+It's about reusing an existing container, and that is way faster.
+
+00:39:39.100 --> 00:39:42.560
+I agree, and, you know, they're impressive pieces of technology,
+
+00:39:43.500 --> 00:39:47.880
+but there are also lots of cases where I do want cold start. I've spoken to the big
+
+00:39:48.240 --> 00:39:55.900
+LLM providers who are interested in Monty, because if you go and ask, uh, ChatGPT, like, effectively
+
+00:39:56.180 --> 00:40:00.820
+some arithmetic, or like how many days between these two dates, in the background they're running
+
+00:40:01.020 --> 00:40:04.940
+Python code to do that calculation. They're obviously very security conscious. They can't
+
+00:40:04.880 --> 00:40:07.900
+just go run that Python code YOLO on whatever host.
+
+00:40:08.010 --> 00:40:11.600
+So they're actually using external sandboxing services often.
+
+00:40:11.850 --> 00:40:14.620
+And one, they're paying the second of overhead for that,
+
+00:40:14.760 --> 00:40:15.820
+where they do need a new container.
+
+00:40:16.420 --> 00:40:21.220
+But also they're paying the organizational complexity
+
+00:40:21.290 --> 00:40:22.640
+of another provider.
+
+00:40:23.220 --> 00:40:24.940
+They're paying the fee of running that.
+
+00:40:25.430 --> 00:40:27.140
+Whereas Monty would allow you to do that kind of thing
+
+00:40:27.210 --> 00:40:28.180
+right there in the process.
+
+00:40:30.020 --> 00:40:31.240
+That is something that's really interesting
+
+00:40:31.390 --> 00:40:33.780
+about how these LLMs are bad at math.
+
+00:40:34.220 --> 00:40:36.520
+Just add up these numbers, and it might not get it right.
+
+00:40:37.140 --> 00:40:39.180
+And so, like you said, they started, okay,
+
+00:40:40.440 --> 00:40:43.120
+I'm gonna write some bit of code
+
+00:40:43.570 --> 00:40:45.560
+that I know how to write really well and can verify.
+
+00:40:45.750 --> 00:40:47.920
+And then I'll just apply this dataset to it, right?
+
+00:40:48.020 --> 00:40:52.440
+Like you'll see it doing CSV-type things with Python
+
+00:40:52.820 --> 00:40:53.600
+and all sorts of stuff.
+
+00:40:54.090 --> 00:40:55.900
+And so that's a really good place
+
+00:40:56.160 --> 00:40:58.200
+where Monty could be the foundation of it, right?
+
+00:40:58.620 --> 00:40:59.080
+- Yeah, exactly.
+
+00:41:00.839 --> 00:41:02.780
+And the other nice thing about that is,
+
+00:41:02.800 --> 00:41:04.720
+if you have the Python code and something does go wrong,
+
+00:41:05.380 --> 00:41:07.500
+you're not having to, like, kind of guess
+
+00:41:07.650 --> 00:41:09.620
+at what's going on inside the black box of the LLM.
+
+00:41:09.920 --> 00:41:10.940
+Well, I suppose you are at some level,
+
+00:41:11.100 --> 00:41:12.240
+but you have the code,
+
+00:41:12.270 --> 00:41:13.480
+which is kind of the intermediate step
+
+00:41:13.880 --> 00:41:14.900
+where you can go and verify,
+
+00:41:15.230 --> 00:41:16.240
+yep, that code makes sense.
+
+00:41:16.350 --> 00:41:17.820
+I mean, not saying everyone will do that,
+
+00:41:17.960 --> 00:41:19.320
+but as a developer debugging it,
+
+00:41:19.480 --> 00:41:22.020
+or as a data scientist trying to work out
+
+00:41:22.150 --> 00:41:24.180
+whether or not it is likely to have got the right result,
+
+00:41:24.230 --> 00:41:26.440
+I have the kind of intermediate representation
+
+00:41:27.020 --> 00:41:28.460
+of the logic that I can go and review.
+
+00:41:28.680 --> 00:41:30.380
+And so it's that much easier to debug.
+
+00:41:31.180 --> 00:41:32.040
+- Yeah, for sure.
+
+00:41:33.560 --> 00:41:35.360
+So let's talk about some of the columns.
+
+00:41:36.160 --> 00:41:37.740
+Partial language completeness.
+
+00:41:38.620 --> 00:41:45.520
+I'm not saying it needs to be completely complete, but what does it need?
+
+00:41:45.600 --> 00:41:49.720
+For example, do you need really dynamic metaclass programming for your tool use?
+
+00:41:50.100 --> 00:41:51.060
+Probably not, right?
+
+00:41:51.300 --> 00:41:51.960
+Right, probably not.
+
+00:41:52.660 --> 00:41:53.500
+What does it need?
+
+00:41:53.920 --> 00:41:57.480
+Yeah, so the things we miss right now, I'll start with the downside.
+
+00:41:57.500 --> 00:42:07.880
+The things we miss right now are classes, context managers, so with expressions, and match
+
+00:42:08.120 --> 00:42:12.620
+expressions, which are obviously relatively new. I think classes are by far the most complex of those.
+
+00:42:13.200 --> 00:42:17.760
+We will support them at some point. They're somewhat complex to get right. I have been
+
+00:42:17.920 --> 00:42:22.240
+amazed by how much LLMs just don't need classes to do most of the stuff they're doing. So you can
+
+00:42:22.340 --> 00:42:27.460
+pass a data class into Monty, and you will have some object where you can access attributes and,
+
+00:42:27.980 --> 00:42:30.560
+as of later today, methods on that data class.
+
+00:42:30.690 --> 00:42:33.300
+But what you can't do is define a class or a data class
+
+00:42:33.620 --> 00:42:34.980
+in the Monty code itself.
+
+00:42:35.820 --> 00:42:38.120
+I'm amazed at how often that's just not necessary.
+
+00:42:38.960 --> 00:42:41.780
+Context managers will mostly be nice
+
+00:42:42.010 --> 00:42:45.480
+because we can allow the LLM to write
+
+00:42:45.480 --> 00:42:46.700
+the kind of code it might want to.
+
+00:42:46.790 --> 00:42:48.600
+So let's say we allow open;
+
+00:42:48.770 --> 00:42:49.880
+at the moment the open built-in
+
+00:42:49.890 --> 00:42:52.400
+is not provided at all for opening a file.
+
+00:42:52.820 --> 00:42:55.440
+We have basic support for pathlib
+
+00:42:55.900 --> 00:42:58.940
+via our way of allowing you
+
+00:42:59.200 --> 00:43:00.200
+like very controlled access
+
+00:43:00.280 --> 00:43:01.520
+to the outside world.
+
+00:43:01.940 --> 00:43:03.040
+But if we add open,
+
+00:43:03.720 --> 00:43:04.920
+very often LLMs want to write
+
+00:43:05.120 --> 00:43:06.200
+with open, yada, yada.
+
+00:43:06.380 --> 00:43:07.500
+And we want to be able to support that.
+
+00:43:08.180 --> 00:43:09.900
+Match expressions are neat
+
+00:43:10.060 --> 00:43:10.760
+and I think will be more
+
+00:43:10.760 --> 00:43:11.540
+and more common in Python.
+
+00:43:12.080 --> 00:43:12.920
+I think we can, you know,
+
+00:43:13.320 --> 00:43:14.360
+full support will be hard,
+
+00:43:14.360 --> 00:43:16.160
+but getting most of it there is less hard.
+
+00:43:16.640 --> 00:43:17.740
+What we will never,
+
+00:43:18.160 --> 00:43:20.300
+and then the other big part of partial
+
+00:43:20.620 --> 00:43:22.480
+is we don't have the full standard library.
+
+00:43:23.060 --> 00:43:23.740
+So we have a very,
+
+00:43:23.920 --> 00:43:25.220
+very limited standard library today
+
+00:43:25.420 --> 00:43:30.540
+of some bits of typing, some bits of the sys module,
+
+00:43:31.660 --> 00:43:36.160
+os.environ, there's a PR up from someone to add re,
+
+00:43:37.280 --> 00:43:40.920
+regexes, datetime, and I think we'll add JSON.
+
+00:43:41.610 --> 00:43:43.300
+And so those will all be supported.
+
+00:43:43.500 --> 00:43:46.060
+And to be clear, they will all be implemented in Rust.
+
+00:43:46.640 --> 00:43:50.080
+So, like, json.loads will be Rust-level performance
+
+00:43:50.190 --> 00:43:50.960
+of loading that thing.
+
+00:43:50.960 --> 00:43:54.300
+I mean, there's a bit of overhead to creating the Monty object,
+
+00:43:54.340 --> 00:43:55.640
+but very, very fast.
+
+00:43:56.130 --> 00:43:56.460
+Right, right.
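
The kind of snippet such a restricted standard library is aiming at looks like this — only json, re, and datetime, no third-party imports. This runs on CPython here; whether each construct runs inside Monty depends on its current stdlib coverage.

```python
import json
import re
from datetime import date

# A small, dependency-free tool-call snippet: parse JSON input,
# validate dates with a regex, and do the arithmetic with datetime.
payload = json.loads('{"start": "2025-06-06", "end": "2025-06-17"}')

def parse_iso(s: str) -> date:
    m = re.fullmatch(r"(\d{4})-(\d{2})-(\d{2})", s)
    if m is None:
        raise ValueError(f"not an ISO date: {s!r}")
    return date(*map(int, m.groups()))

days = (parse_iso(payload["end"]) - parse_iso(payload["start"])).days
print(days)  # 11
```

Nothing here needs classes, context managers, or third-party packages — which matches the observation that most LLM-generated glue code stays inside this subset.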
+
+00:43:59.060 --> 00:44:01.740
+But we're never going to go and support the whole standard library.
+
+00:44:01.840 --> 00:44:04.180
+It'll be on a case-by-case basis, that LLMs actually need this thing,
+
+00:44:05.320 --> 00:44:06.600
+that we can go and add them.
+
+00:44:06.630 --> 00:44:09.780
+I will say, and I know we're going to talk about this at some point,
+
+00:44:09.790 --> 00:44:13.720
+but it is amazing how this project is only made possible by LLMs.
+
+00:44:15.039 --> 00:44:17.640
+And not that we're ever aiming for the full standard library,
+
+00:44:17.880 --> 00:44:21.440
+but adding support for certain modules in the standard library
+
+00:44:21.470 --> 00:44:23.420
+is a heck of a lot easier when, again,
+
+00:44:23.540 --> 00:44:25.460
+we have a perfect record of what it's supposed to do.
+
+00:44:25.540 --> 00:44:28.340
+So we can go and ask the LLM to build that.
+
+00:44:29.120 --> 00:44:33.040
+You've got the tests, like CPython has a ton of tests.
+
+00:44:33.980 --> 00:44:38.880
+You can extract out the bits that apply to that maybe and just say, well, does it run here?
+
+00:44:39.140 --> 00:44:42.940
+I'll come on to, like, three reasons why I think this is possible with LLMs.
+
+00:44:42.980 --> 00:44:48.940
+Let me just, the last point I was going to make is what we will never support, or I think never support, is third-party libraries.
+
+00:44:49.440 --> 00:44:55.940
+So you'll never be able to pip install Pydantic or FastAPI or requests inside Monty.
+
+00:44:56.010 --> 00:45:03.300
+And the reason for that is we would need to support the CPython ABI and basically support full CPython.
+
+00:45:04.200 --> 00:45:06.260
+And if you're going to do that, you're basically back to CPython.
+
+00:45:08.360 --> 00:45:11.920
+And so, sure, there are ways of sandboxing CPython, most of which are demonstrated here.
+
+00:45:12.080 --> 00:45:12.960
+That's not the aim of this project.
+
+00:45:13.420 --> 00:45:18.800
+However, what we can allow you to do is basically have a shim where you expose, let's say,
+
+00:45:20.160 --> 00:45:25.180
+HTTPX get and post methods and patch and whatever you need through to Monty.
+
+00:45:25.960 --> 00:45:33.220
+And we're currently working out whether or not we basically provide those shims as part of the library.
+
+00:45:33.250 --> 00:45:34.620
+So you don't need to go and think about that.
+
+00:45:34.650 --> 00:45:41.980
+You can be like, yes, give it HTTP access, or yes, give it access to DuckDB's SQL engine,
+
+00:45:42.720 --> 00:45:47.320
+or give it access to Beautiful Soup, and that shim comes and you don't need to go and implement it.
+
+00:45:49.000 --> 00:45:55.660
+So you can whitelist, like, super-critical libraries that people are like, if I had this,
+
+00:45:55.820 --> 00:45:59.860
+I could really... Yes. So one of the questions we have now, that we need to probably go run evals on to
+
+00:46:00.000 --> 00:46:07.399
+find out, is if we come up with a very Pythonic, type-safe, uh, example of, let's say, an HTTP library,
+
+00:46:07.540 --> 00:46:12.860
+and we give those types to the LLM, does it do better or worse with that than just being told
+
+00:46:13.340 --> 00:46:18.080
+you can use requests? And I don't know the answer. There are genuine arguments in both cases.
+
+00:46:18.880 --> 00:46:22.680
+Some people seem to be very sure one or the other is right. I just don't know. And that's the kind
+
+00:46:22.720 --> 00:46:26.700
+of thing where we need to go and run evals and work out what an LLM will find easiest.
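
To make the shim idea concrete, here is a toy, host-side sketch: the host hands untrusted code a single whitelisted fetch function instead of a whole HTTP library. This is not Monty's API, and CPython's eval with stripped builtins is emphatically not a real sandbox — the sketch only shows the shape of the interface, with all names invented for illustration.

```python
ALLOWED_HOSTS = {"example.com"}

def fetch(url: str) -> str:
    # The host implements the shim, so it decides what the guest may reach.
    host = url.split("/")[2] if "//" in url else ""
    if host not in ALLOWED_HOSTS:
        raise PermissionError(f"host not whitelisted: {host!r}")
    return f"<response from {url}>"  # a real shim would do the network I/O here

# Expose only the shim to the guest code: no builtins, no imports.
guest_env = {"__builtins__": {}, "fetch": fetch}

allowed = eval('fetch("https://example.com/data")', guest_env)

blocked = None
try:
    eval('fetch("https://evil.test/x")', guest_env)
except PermissionError as exc:
    blocked = str(exc)

print(allowed)
print("blocked:", blocked)
```

The design point is that policy (which hosts, which verbs) lives entirely on the host side of the boundary, so the guest code the LLM writes never has to be trusted.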
+
+00:46:28.380 --> 00:46:33.680
+But yeah, we can either kind of attempt to fake the existing library's API,
+
+00:46:34.280 --> 00:46:39.940
+warts and all, or we can go and, in many cases, just say, oh, we've got this new fetch library that has
+
+00:46:39.940 --> 00:46:45.380
+a fetch method, and here's its, uh, signature. And I suspect the LLM will do a pretty good job of it.
+
+00:46:46.140 --> 00:46:53.560
+So what are the weird new, it's not quite typosquatting, but kind of typosquatting, supply chain
+
+00:46:54.240 --> 00:46:59.719
+type of issues? At least in the earlier days of LLMs, when you would ask one to write code,
+
+00:46:59.740 --> 00:47:04.440
+sometimes it would say, we're going to import some library, and that library didn't exist, and then it
+
+00:47:04.680 --> 00:47:10.840
+imagined a bunch of code that used it. So people would go and find popular ones
+
+00:47:10.980 --> 00:47:18.900
+of those and then register malicious packages that the LLMs had hallucinated, right? Um, but I guess
+
+00:47:18.930 --> 00:47:25.340
+you probably kind of got to do a similar analysis, but not for evil, where you
+
+00:47:25.180 --> 00:47:33.800
+say, well, if I just ask Claude or Codex or whatever to do a thing, what does it try to do?
+
+00:47:34.010 --> 00:47:39.380
+If you see it always asking for requests, maybe it's just better that we lie to it and say,
+
+00:47:39.960 --> 00:47:44.360
+okay, whenever it says import requests, we give it our special way to just get stuff off the internet.
+
+00:47:44.570 --> 00:47:48.280
+And it only really needs get, put, and like a couple of, it doesn't need all of requests. It
+
+00:47:48.310 --> 00:47:51.660
+just needs very basic behaviors. Is that the kind of stuff you're thinking?
+
+00:47:51.730 --> 00:47:55.100
+Yeah, exactly that. And that's one of the reasons we didn't start with Starlark,
+
+00:47:55.300 --> 00:48:03.540
+which is, uh, I think originally a Meta, Facebook project to have a, like, basically isolated Python
+
+00:48:04.700 --> 00:48:11.020
+runtime, was because Starlark has a very disciplined and principled approach to what it supports and
+
+00:48:11.120 --> 00:48:15.240
+what it doesn't. We have to be not principled. We have to be like, well, if the LLM wants to write
+
+00:48:15.340 --> 00:48:20.520
+this thing, we're gonna go and implement the csv module but not the tomllib module, because
+
+00:48:20.680 --> 00:48:24.460
+that's just what they need to go and use. And we're going to be like, our principle is give the LLM what
+
+00:48:24.460 --> 00:48:34.040
+it wants, not here's our rule. Um, so yeah, yes, exactly. And yeah, I mean, I think Boris, the, um,
+
+00:48:34.720 --> 00:48:39.040
+Claude Code creator, talked about this. I saw him speaking. He was saying, like, you know, one of the
+
+00:48:39.060 --> 00:48:45.560
+reasons they gave the LLM Bash early on was, like, you can tell it to use the mkdir, um, tool to
+
+00:48:45.680 --> 00:48:50.799
+make directories, but half the time it'll just go and call mkdir -p and make the directory
+
+00:48:50.820 --> 00:48:54.440
+that way. And, like, are we going to fight it and always return an error being like, you should do
+
+00:48:54.540 --> 00:48:57.820
+this other thing, or are we just going to make that thing work? And often you have to just make that
+
+00:48:57.820 --> 00:49:10.780
+thing work. Yeah, exactly. So is this, go ahead. Yeah, is this useful outside of this AI story? You
+
+00:49:10.780 --> 00:49:17.119
+know, like, if I'm creating something that has really high security, I want to add some
+
+00:49:17.140 --> 00:49:23.380
+mechanism for people to write scripting, but not a full-on programming language. Do you see it used
+
+00:49:23.460 --> 00:49:27.600
+in other places? Yeah, we've actually thought about this internally inside Logfire already. 
Like we
+
+00:49:27.760 --> 00:49:32.980
+want to be able to give people a way of basically entering config, config that can do things. There's
+
+00:49:33.040 --> 00:49:36.320
+no easy way of doing that right now, right? I mean, sure, I can go and use, again, one of these
+
+00:49:36.440 --> 00:49:40.400
+sandboxing services to run that code, with all the complexity of setting it up. We offer self-hosted
+
+00:49:40.520 --> 00:49:45.680
+Logfire, so they're not going to work, etc., etc. Or, once Monty is a bit more mature, we can just
+
+00:49:45.620 --> 00:49:48.360
+go and use Monty to let them define the expression.
+
+00:49:49.210 --> 00:49:50.440
+It might be as simple as, like,
+
+00:49:50.720 --> 00:49:52.340
+what field do we use from
+
+00:49:52.480 --> 00:49:54.100
+your profile to display as your name?
+
+00:49:55.059 --> 00:49:57.720
+We can let you, or an AI, write
+
+00:49:57.880 --> 00:49:59.460
+that one line of code that does that, and then we can
+
+00:49:59.580 --> 00:50:02.260
+call it lots of times. It's feasible now to have
+
+00:50:02.380 --> 00:50:05.100
+the few lines of Python code to define this;
+
+00:50:05.859 --> 00:50:08.120
+that's generally been hard until now.
+
+00:50:09.519 --> 00:50:13.080
+But of course, the best tools are the ones where people
+
+00:50:13.340 --> 00:50:15.400
+use the tool for what it was not originally designed for.
+
+00:50:15.960 --> 00:50:18.600
+So someone invents the hammer and thinks it's going to be used for nails.
+
+00:50:18.740 --> 00:50:21.640
+And then someone else realizes that you can, like,
+
+00:50:21.960 --> 00:50:25.360
+knock out, like, mistakes in the bumper of your car with a hammer.
+
+00:50:25.640 --> 00:50:27.940
+Right. And, like, what's amazing about Pydantic,
+
+00:50:28.110 --> 00:50:31.820
+why I'm so proud of it, is people have gone and used it as a general-purpose tool for
+
+00:50:31.820 --> 00:50:33.040
+a bunch of things I'd never thought of.
+
+00:50:33.140 --> 00:50:37.480
+So my, like, dream for Monty is that people come along with things to do with it
+
+00:50:37.480 --> 00:50:40.700
+that I had never heard of. And, like, RLM is a really good example of that.
+
+00:50:40.780 --> 00:50:44.900
+So recursive language models, this way in which you use, almost always, a
+
+00:50:44.920 --> 00:50:49.720
+Python REPL as a way of implementing effectively an agentic loop.
+
+00:50:50.220 --> 00:50:54.480
+And there were some people who have an example of doing that and, like, getting
+
+00:50:54.580 --> 00:50:58.300
+better results in the ARC-AGI 2 benchmarks by using RLM.
+
+00:50:58.680 --> 00:51:00.900
+I didn't even know about RLMs when I announced Monty;
+
+00:51:01.340 --> 00:51:05.040
+there are now at least four different libraries that are using Monty for RLM
+
+00:51:05.560 --> 00:51:05.780
+with,
+
+00:51:06.240 --> 00:51:09.140
+with DSPy,
+
+00:51:09.260 --> 00:51:10.800
+because people are super excited about that space.
+
+00:51:10.980 --> 00:51:11.760
+So that's,
+
+00:51:12.140 --> 00:51:12.900
+that's agentic,
+
+00:51:13.040 --> 00:51:15.540
+but it's definitely something I hadn't thought of when I announced it.
+
+00:51:15.900 --> 00:51:20.400
+Yeah, I was even thinking just, like, I have a medical device, like a CT scanner.
+
+00:51:20.480 --> 00:51:24.540
+I do want to let people script it, but we can't break it and, like, zap somebody.
+
+00:51:24.920 --> 00:51:25.500
+You know what I mean?
+
+00:51:25.620 --> 00:51:28.320
+It needs to be really very, very controlled.
+
+00:51:29.360 --> 00:51:31.300
+This could be a really interesting thing.
+
+00:51:32.000 --> 00:51:33.500
+So does it compile to WebAssembly?
+
+00:51:33.620 --> 00:51:34.980
+Can I in-browser it?
+
+00:51:35.480 --> 00:51:35.640
+Yep.
+
+00:51:35.940 --> 00:51:40.480
+And in fact, Simon Willison, the day it came out, or Simon Willison,
+
+00:51:40.820 --> 00:51:42.440
+Claude, prompted by Simon Willison, set one up.
+
+00:51:42.560 --> 00:51:51.100
+So I think if you go to Simon's blog somewhere, there's actually an example of Monty running somewhere in a browser that you can go and try.
+
+00:51:51.140 --> 00:51:52.300
+It's probably an earlier version.
+
+00:51:56.860 --> 00:51:57.160
+There you go.
+
+00:51:58.700 --> 00:51:59.340
+Yeah, somewhere here.
+
+00:51:59.340 --> 00:52:02.140
+I think he'll have a link to his version of it.
+
+00:52:03.560 --> 00:52:03.960
+Okay.
+
+00:52:05.020 --> 00:52:16.720
+So as he pointed out, you can do the really crazy thing, which is you can compile the Python package for, yes, this is his example, which is, I think, like, WebAssembly running directly in the browser.
+
+00:52:16.780 --> 00:52:27.440
+But he did something even more crazy, which is he took the Python library, compiled that to Wasm, and then called that from inside Pyodide, which is like crazy worlds within worlds.
+
+00:52:28.400 --> 00:52:30.740
+Definitely not the original plan, but interesting.
+
+00:52:31.320 --> 00:52:31.720
+Yeah.
+
+00:52:32.240 --> 00:52:32.360
+Wow.
+
+00:52:33.660 --> 00:52:33.860
+Okay.
+
+00:52:33.940 --> 00:52:34.680
+So yes.
+
+00:52:35.760 --> 00:52:37.440
+And here's your example to do it, right?
+
+00:52:37.740 --> 00:52:37.820
+Yeah.
+
+00:52:38.940 --> 00:52:39.000
+Yeah.
+
+00:52:39.400 --> 00:52:45.580
+And I think the other thing we really need to add to this table in terms of latency and complexity is calling back to the host.
+
+00:52:46.020 --> 00:52:51.760
+So one of the reasons a number of people have reached out to me, excited about this, is, sure, they're happy to have a sandboxing service.
+
+00:52:51.810 --> 00:52:53.800
+They don't even mind the second of start time.
+
+00:52:54.380 --> 00:53:04.100
+But if they want to, for example, build an agent that can go and basically run SQL against a bunch of CSV files, how do I get those CSV files into the sandbox?
+
+00:53:04.280 --> 00:53:10.540
+Well, that is painful and often slow, because we have to make a full network round trip back to the host to get those files.
+
+00:53:11.160 --> 00:53:19.040
+The network latency, sorry, the overhead of calling that function on the host in Monty is single-digit milliseconds or maybe even less.
+
+00:53:19.680 --> 00:53:26.560
+And so if you're making, if you're reading 50 different files from the local, from within
+
+00:53:26.590 --> 00:53:30.240
+the sandbox, but effectively they're registered locally, that's super easy and performant.
+
+00:53:31.820 --> 00:53:31.880
+Yeah.
+
+00:53:31.890 --> 00:53:33.520
+Because it's running right there in the same process.
+
+00:53:34.460 --> 00:53:34.580
+Yeah.
+
+00:53:36.620 --> 00:53:36.940
+Very neat.
+
+00:53:37.090 --> 00:53:38.580
+So a couple of questions.
+
+00:53:40.660 --> 00:53:44.460
+Kunita says, we have agents running on AWS Strands.
+
+00:53:44.920 --> 00:53:46.060
+Here's the crazy thing about AWS.
+
+00:53:46.280 --> 00:53:48.080
+There's like so many services.
+
+00:53:48.170 --> 00:53:49.280
+I don't even know what Strands is.
+
+00:53:49.500 --> 00:53:52.380
+I think Strands is their agent framework.
+
+00:53:53.050 --> 00:53:53.320
+Got it.
+
+00:53:54.260 --> 00:53:54.860
+Yeah, yeah.
+
+00:53:55.300 --> 00:53:58.480
+Will the use of Monty help us improve performance there?
+
+00:53:58.660 --> 00:54:00.080
+Could they use Monty there?
+
+00:54:00.080 --> 00:54:01.060
+Yes, it should be able to.
+
+00:54:02.480 --> 00:54:04.460
+Again, apologies if I don't know exactly what Strands is.
+
+00:54:04.470 --> 00:54:08.400
+If Strands is their agent framework, yes, in principle,
+
+00:54:08.620 --> 00:54:11.760
+Pydantic AI, our agent framework, will have support for Monty
+
+00:54:11.950 --> 00:54:15.920
+as a code execution environment later this week.
+
+00:54:16.100 --> 00:54:23.300
+And so you'll be able to basically, instead of running, yes, that open source agents SDK.
+
+00:54:23.780 --> 00:54:31.240
+So I don't know whether AWS intend to add specific support for Monty, but I know our agent framework will support it later this week.
+
+00:54:31.780 --> 00:54:37.540
+My guess from what we've built in the past is others will pick up on it and also integrate it into their things.
+
+00:54:37.580 --> 00:54:44.800
+And of course, the nice thing is here, because the only real requirement is Rust, we already have the Python package and JavaScript package.
+
+00:54:45.000 --> 00:54:49.620
+But if you wanted to call it from any other language, basically, where you can call Rust, that should be possible.
+
+00:54:50.559 --> 00:54:51.420
+Nice. Very nice.
+
+00:54:52.859 --> 00:54:53.800
+And data science?
+
+00:54:54.960 --> 00:54:56.780
+You mentioned DuckDB already, sort of.
+
+00:54:57.660 --> 00:55:00.920
+Yeah, NumPy would be great to have, I think, full...
+
+00:55:01.580 --> 00:55:04.840
+I mean, I think this is where we need to be a bit careful about what we add.
+
+00:55:04.870 --> 00:55:09.880
+Like, sure, if there are particular bits of NumPy that are useful, can we go and add shims for that?
+
+00:55:09.910 --> 00:55:11.740
+Or can we even go and implement that in Rust?
+
+00:55:11.920 --> 00:55:20.020
+So you can do, like, a NumPy matrix transformation that happens effectively in Rust, but we need to work out what people want.
+
+00:55:20.240 --> 00:55:25.760
+And what we can't do, unfortunately, I'd love to be able to, but we can't do is just be like, yep, click this button.
+
+00:55:25.820 --> 00:55:29.320
+And now we have the full NumPy API available.
+
+00:55:29.520 --> 00:55:38.640
+That is the, you know, that's the big, I'm not going to say Achilles heel because I'm super optimistic about Monty, but that's the biggest challenge, is that we don't just get to use all the libraries.
+
+00:55:38.900 --> 00:55:41.820
+Let me propose a slightly different path.
+
+00:55:44.900 --> 00:55:45.140
+Yep.
+
+00:55:45.640 --> 00:55:46.060
+Polars.
+
+00:55:46.800 --> 00:55:46.900
+Yep.
+
+00:55:47.980 --> 00:55:48.780
+Plus Narwhals.
+
+00:55:49.680 --> 00:55:50.220
+What's Narwhals?
+
+00:55:51.100 --> 00:56:01.100
+Narwhals is a facade across NumPy, Polars, and a few other things that gives you,
+
+00:56:01.450 --> 00:56:03.920
+like, you can program on either and it'll talk to one or the other.
+
+00:56:03.980 --> 00:56:09.620
+So basically you could use Narwhals to talk NumPy, but it translates all the calls over to Polars.
+
+00:56:10.440 --> 00:56:10.580
+Yeah.
+
+00:56:10.940 --> 00:56:15.860
+I mean, given that, you know, there's a paradigm shift happening here.
+
+00:56:16.140 --> 00:56:21.260
+What we're not trying to do is let your existing Python code run in this runtime.
+
+00:56:21.800 --> 00:56:24.580
+We're trying to give it a context for LLMs to be able to write code.
+
+00:56:24.720 --> 00:56:26.160
+And so why not?
+
+00:56:26.280 --> 00:56:27.520
+I mean, Polars is written in Rust.
+
+00:56:28.240 --> 00:56:28.400
+Exactly.
+
+00:56:28.800 --> 00:56:29.360
+That's why I said that.
+
+00:56:29.420 --> 00:56:29.640
+Yeah, yeah.
+
+00:56:29.740 --> 00:56:32.940
+Go and compile Polars into Monty.
+
+00:56:32.990 --> 00:56:37.840
+And now you have a full, very performant data frame library
+
+00:56:37.910 --> 00:56:40.700
+or analytical database effectively built into it.
+
+00:56:43.480 --> 00:56:47.120
+And we have the full Polars API available in Monty.
+
+00:56:47.240 --> 00:56:48.520
+That would be one option.
+
+00:56:50.540 --> 00:56:52.160
+Again, I'm going to be a bit restrictive,
+
+00:56:52.380 --> 00:56:54.840
+and "any color as long as it's black," about what we add,
+
+00:56:55.040 --> 00:56:57.180
+because I don't think, we don't need,
+
+00:56:57.680 --> 00:57:02.480
+I don't care about your taste of whether you prefer Polars to pandas or anything else.
+
+00:57:02.550 --> 00:57:04.480
+I care about what the LLMs find easy to do.
+
+00:57:05.160 --> 00:57:12.320
+I think the biggest point of proof of that, Samuel, is that it doesn't do Pydantic yet.
+
+00:57:14.220 --> 00:57:19.080
+If it doesn't do Pydantic, like, okay, you're walking the walk.
+
+00:57:19.920 --> 00:57:20.060
+Yeah.
+
+00:57:20.740 --> 00:57:26.620
+And to be clear, I don't think, yeah, am I going to vibe code a whole new Pydantic in Monty?
+
+00:57:26.700 --> 00:57:28.100
+I don't know whether I'm keen for that yet.
+
+00:57:30.130 --> 00:57:30.300
+Yeah.
+
+00:57:31.040 --> 00:57:31.800
+Yes, indeed.
+
+00:57:32.790 --> 00:57:36.840
+So how do I go about making my AI,
+
+00:57:37.310 --> 00:57:45.920
+like, let's say I'm doing Claude Code, Opus 4.6, some project.
+
+00:57:46.580 --> 00:57:50.220
+I'm actually, I'm not a huge fan of the terminal Claude Code.
+
+00:57:50.310 --> 00:57:53.320
+I feel like it takes me too far away from the code.
+
+00:57:53.720 --> 00:58:00.600
+Just, I prefer to kind of have it in the editor, like the extension for, say, Cursor or
+
+00:58:01.040 --> 00:58:05.000
+VS Code, where I can sort of like watch the code as it's going and sort of, no, no, no, you're going
+
+00:58:05.000 --> 00:58:08.840
+the wrong way. Anyway, it doesn't matter really which, how you run it. Suppose I'm running it
+
+00:58:10.059 --> 00:58:16.120
+somehow. How do I tell it about Monty? How does it know what Monty can and can't do? How do I make
+
+00:58:16.200 --> 00:58:22.119
+it use Monty? You know what I mean?
You wait a few weeks for us to have skills for Monty and the
+
+00:58:22.140 --> 00:58:27.800
+rest of our stack, and then you install those skills. It's something we need to do, and I think
+
+00:58:27.920 --> 00:58:33.160
+that's the number one thing. We will have proper documentation for Monty as well, and that will be an
+
+00:58:33.220 --> 00:58:38.640
+important part of it. There's, you know, there's a lot to do here. LLMs can help with some of it, but not,
+
+00:58:38.800 --> 00:58:44.080
+not by any means, do all of it. I mean, at the moment, read the README and read the issues. And
+
+00:58:44.200 --> 00:58:52.100
+I'm, I am like impressed, surprised, scared by how much people are using Monty already. Wait,
+
+00:58:52.140 --> 00:58:57.320
+already? What are you doing? You know what I saw? I saw your announcement of this on X, actually, is
+
+00:58:57.380 --> 00:59:04.380
+where I saw it, and I believe it, it's been a while since I saw it, but it said something to the
+
+00:59:04.520 --> 00:59:10.400
+effect of, like, this is way too early, but what the heck, here we go, posted the GitHub link, right?
+
+00:59:10.620 --> 00:59:20.080
+Something to that effect. And that was, what's that, last week? Here we are with 5,000 stars. Yeah.
+
+00:59:20.100 --> 00:59:37.800
+Yeah, exactly. And it shows how many people are looking, are interested in this space. I mean, look, a lot of people would have started thinking, oh, there's going to be a new Python that's just faster because it's in Rust, and it's going to do everything better in a way that, like, you might argue, you know, Ruff is like wholly better than what went before.
+
+00:59:38.620 --> 00:59:40.540
+That is not the aim for Monty.
+
+00:59:40.640 --> 00:59:43.000
+This is not going to supplant or replace in any way CPython.
+
+00:59:43.160 --> 00:59:44.480
+It's a completely separate thing.
+
+00:59:44.860 --> 00:59:46.520
+But I think there's also a lot of people who have started this
+
+00:59:46.720 --> 00:59:51.460
+because they're having a headache running stuff with existing options
+
+00:59:51.540 --> 00:59:54.220
+for sandboxing, and something like this is interesting.
+
+00:59:54.420 --> 00:59:58.340
+There's also another project that's worth calling out from Vercel
+
+00:59:58.500 --> 01:00:02.160
+called Just Bash, which is very similar conceptually.
+
+01:00:02.340 --> 01:00:06.740
+It's a bash environment written entirely in TypeScript by a team.
+
+01:00:06.760 --> 01:00:10.660
+But I met them when I was in San Francisco a few weeks ago.
+
+01:00:10.760 --> 01:00:18.380
+And the plan when I get around to finishing the JavaScript API is that they will in fact use Monty as the way of calling Python code.
+
+01:00:18.540 --> 01:00:23.660
+Because they have some way of calling Python code within this, which I think uses Pyodide at the moment.
+
+01:00:23.780 --> 01:00:29.380
+And it has some overheads and some challenges around security.
+
+01:00:32.440 --> 01:00:34.160
+But yeah, this is very similar in the sense of like,
+
+01:00:34.280 --> 01:00:37.420
+it's basically vibe coding all of the terminal methods
+
+01:00:37.660 --> 01:00:41.200
+that you might want and using a bunch of existing unit tests
+
+01:00:41.440 --> 01:00:42.420
+to check that they're correct.
+
+01:00:45.599 --> 01:00:48.300
+Interesting that obviously Vercel is a much, much bigger name
+
+01:00:48.330 --> 01:00:52.300
+than we are, and it hasn't got as much traction early on,
+
+01:00:52.330 --> 01:00:54.100
+at least in terms of GitHub stars,
+
+01:00:54.540 --> 01:00:55.960
+the worst of all vanity metrics.
+
+01:00:57.580 --> 01:00:59.640
+They've been out like two or three times as long as you have,
+
+01:00:59.680 --> 01:01:00.460
+and they have 1,000 stars.
+
+01:01:00.540 --> 01:01:05.560
+That is, I mean, that's noteworthy, honestly. Yeah. And there's another project like this, which has about
+
+01:01:06.100 --> 01:01:10.920
+20 stars, which I was looking at earlier today, which is this but in Rust completely, which already has
+
+01:01:11.140 --> 01:01:16.260
+support for Monty, which I can't remember the name of right now, but maybe I should find it quickly and
+
+01:01:16.800 --> 01:01:20.560
+call it out, because I feel like it deserves it, given that it's a really cool project. It has,
+
+01:01:21.400 --> 01:01:29.080
+as I say, about 30 stars. Let me very quickly, excuse me for one minute. It was one of
+
+01:01:28.940 --> 01:01:31.800
+One of the replies to my initial announcement.
+
+01:01:34.060 --> 01:01:38.020
+Sorry, I will not be very long.
+
+01:01:39.900 --> 01:01:41.400
+It's called BashKit.
+
+01:01:42.340 --> 01:01:42.960
+BashKit?
+
+01:01:43.620 --> 01:01:45.500
+I put the link here.
+
+01:01:49.140 --> 01:01:55.400
+This already actually has optional support for using Monty as the Python runtime.
+
+01:01:56.340 --> 01:01:56.980
+There we go.
+
+01:01:58.640 --> 01:02:02.180
+Well, if I was logged into GitHub on my streaming machine, I would have one more star, but I'll do it later.
+
+01:02:03.480 --> 01:02:04.240
+Fair enough. Fair enough.
+
+01:02:04.660 --> 01:02:08.600
+But I think what's interesting is all of these three projects, and I've heard of a few others,
+
+01:02:09.320 --> 01:02:14.700
+these are only possible, really, or they're only really challenges anyone would take on, with the advantage of an AI.
+
+01:02:14.920 --> 01:02:16.800
+And so I was mentioning this earlier.
+
+01:02:16.800 --> 01:02:19.280
+I think there are three reasons why these things have...
+ +01:02:19.280 --> 01:02:23.420 +I'll talk about Monty in particular, why it is possible now when it wasn't before, + +01:02:23.720 --> 01:02:27.340 +and why it is something where the speed up from an LLM + +01:02:27.340 --> 01:02:29.720 +is even greater than in most coding tasks. + +01:02:33.120 --> 01:02:38.300 +One, the LLM knows in its soul, in its weights, + +01:02:38.720 --> 01:02:39.580 +the internal implementation, + +01:02:39.960 --> 01:02:42.680 +how to go about implementing a bytecode interpreter + +01:02:42.910 --> 01:02:43.920 +or how to implement it. + +01:02:44.460 --> 01:02:47.180 +If I ask most even experienced Python engineers + +01:02:47.300 --> 01:02:49.540 +or Rust engineers, how do I write a bytecode interpreter? + +01:02:50.060 --> 01:02:51.220 +They would scratch their head and be like, + +01:02:51.340 --> 01:02:52.460 +yeah, I sort of know about this. + +01:02:52.600 --> 01:02:53.300 +I'll put my hand up and say, + +01:02:53.300 --> 01:02:54.720 +I didn't know what a bytecode interpreter was + +01:02:55.210 --> 01:02:58.720 +or how they worked until I and Claude built one together. + +01:02:58.870 --> 01:03:00.440 +But like they know exactly how to do it + +01:03:00.440 --> 01:03:01.580 +'cause they've read 15 different, + +01:03:02.140 --> 01:03:03.560 +well-trodden implementations. + +01:03:03.780 --> 01:03:04.960 +- And it's got a great example. + +01:03:05.070 --> 01:03:06.100 +You can say not just any, + +01:03:06.280 --> 01:03:08.040 +here's the Python, CPython one, + +01:03:08.240 --> 01:03:10.500 +just help me do that, whatever that does. + +01:03:10.820 --> 01:03:12.140 +- And the second thing is they know + +01:03:12.540 --> 01:03:14.500 +what the public interface is again in their soul, + +01:03:15.000 --> 01:03:17.200 +as in they know what Python should be like. + +01:03:17.260 --> 01:03:19.380 +They know the signature of the filter function + +01:03:19.880 --> 01:03:21.600 +without you having to go and describe it. 
+ +01:03:22.200 --> 01:03:24.760 +Thirdly, you have an amazing set of unit tests, + +01:03:24.910 --> 01:03:26.600 +which is basically just does it match CPython. + +01:03:26.760 --> 01:03:29.440 +So in our case, we basically vibe generate tests + +01:03:29.740 --> 01:03:31.280 +whenever we're adding a feature + +01:03:31.460 --> 01:03:35.380 +and then we run them with CPython and Monty + +01:03:35.390 --> 01:03:37.200 +and we confirm that they are identical output + +01:03:37.440 --> 01:03:38.100 +down to the byte. + +01:03:38.420 --> 01:03:41.360 +The exceptions have to be identical to the byte. + +01:03:43.940 --> 01:03:46.960 +But in the case of JustBash, + +01:03:49.700 --> 01:03:52.160 +they have the existing set of like + +01:03:52.180 --> 01:03:56.460 +tests somewhere for like any shell environment that they're able to leverage. And I think one + +01:03:56.470 --> 01:04:00.320 +thing we might do at some point is basically go steal a bunch of CPython tests and run them with + +01:04:00.420 --> 01:04:04.100 +both. I haven't got there yet, but that would be an interesting way ahead. And then the last thing is + +01:04:04.530 --> 01:04:08.920 +you don't have to bike shed or have any human debate about what should the function, what should + +01:04:08.920 --> 01:04:15.280 +the error message be when you try and add an int to a string. There's no debate about that. You're + +01:04:15.280 --> 01:04:20.400 +just doing whatever CPython does. And so there's a whole range of bike shedding debates that we just + +01:04:20.420 --> 01:04:26.100 +don't have to go and have because we're just like trying to target CPython. Now, of course, + +01:04:26.320 --> 01:04:29.260 +around the edge of that, there's a bunch of places where we do have to think about it. Like, how do + +01:04:29.260 --> 01:04:34.520 +we do these external function calling things? 
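The differential-testing loop described here, run the same generated snippet under CPython and under the candidate runtime and require byte-identical output, can be sketched with nothing but the standard library. This is an illustrative sketch, not Monty's actual harness: both commands below are CPython so it runs anywhere, and the comment naming a `monty` command is a hypothetical stand-in, not a documented CLI.

```python
# Sketch of differential testing against CPython: run one snippet under a
# reference interpreter and a candidate runtime, then require that stdout
# and stderr match down to the byte. Both commands here are CPython so the
# sketch is self-contained; the second would be swapped for the alternative
# runtime in a real harness.
import subprocess
import sys

SNIPPET = "print(sorted({'b': 1, 'a': 2})); print(1 + 2)"

def run(interpreter: list[str]) -> tuple[bytes, bytes]:
    proc = subprocess.run(interpreter + ["-c", SNIPPET], capture_output=True)
    return proc.stdout, proc.stderr

reference = run([sys.executable])
candidate = run([sys.executable])  # e.g. a "monty" entry point (hypothetical)

# "Identical output down to the byte" covers both streams.
assert reference == candidate
print(reference[0].decode(), end="")
```

The same harness extends naturally to exceptions: raise inside the snippet and compare the stderr bytes instead.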
And that's, that is obviously, that is honestly much,
+
+01:04:34.640 --> 01:04:41.420
+much slower, because we don't have this, like, the-LLM-already-knows-the-answer approach. But I think
+
+01:04:41.530 --> 01:04:47.640
+these are the kinds of tasks where LLMs are massively faster, or one, one set of cases where
+
+01:04:47.660 --> 01:04:53.060
+LLMs are massively faster than without. So I was speaking to a big public company in New York who
+
+01:04:53.120 --> 01:05:01.560
+was saying that one of their team had vibe-coded a Redis clone in Rust, put it into production after
+
+01:05:01.680 --> 01:05:07.520
+72 hours, and it was 30% faster than Redis. It probably worked fine, right? Yeah. And why is
+
+01:05:07.520 --> 01:05:10.860
+that possible? Well, the same things are all true. The unit test is super easy. It's just,
+
+01:05:11.110 --> 01:05:14.519
+is it the same as Redis? There's no debate about what the API is, et cetera, et cetera. And so
+
+01:05:15.000 --> 01:05:21.200
+there are these tasks which historically we would have thought were super hard. So I think often we
+
+01:05:21.400 --> 01:05:24.940
+fall into the trap of thinking that what LLMs are good at is what humans are good at, and what LLMs
+
+01:05:25.000 --> 01:05:29.180
+are bad at is what humans are bad at. I think more and more we're seeing there are things that LLMs are
+
+01:05:30.280 --> 01:05:33.380
+much better at than we are, and there are things that they are, that they're less good at. And we're
+
+01:05:33.440 --> 01:05:38.540
+still very early in learning what those things are. But it is not good enough just to use
+
+01:05:38.660 --> 01:05:44.500
+the, like, naive, simplistic approach of, what humans are good at, they're good at. Yeah.
+
+01:05:44.520 --> 01:05:45.580
+The simplest example of that is like,
+
+01:05:46.140 --> 01:05:49.320
+ask an LLM to generate you a B-tree implementation in C.
+
+01:05:50.620 --> 01:05:53.480
+And that prompt alone, it will write you 500 lines of C
+
+01:05:53.500 --> 01:05:54.800
+that work as a B-tree implementation.
+
+01:05:55.140 --> 01:05:58.280
+- It takes you 20 minutes to study it, to be sure.
+
+01:05:58.320 --> 01:05:59.340
+And it's like, you're not,
+
+01:05:59.700 --> 01:06:00.900
+oh, I think it works this way, right?
+
+01:06:01.220 --> 01:06:01.460
+- Yeah.
+
+01:06:03.860 --> 01:06:06.900
+- I honestly think the little bits of weird math
+
+01:06:07.060 --> 01:06:09.180
+and the little hallucinations and stuff
+
+01:06:09.500 --> 01:06:12.540
+have shaken a lot of people's trust in these things.
+
+01:06:12.860 --> 01:06:16.220
+And it's just like, well, I mean, how easy is this, to add five numbers?
+
+01:06:16.460 --> 01:06:16.900
+Come on.
+
+01:06:18.019 --> 01:06:19.860
+Obviously, these things are junk because they can't do that.
+
+01:06:19.930 --> 01:06:23.240
+And it's just like, well, maybe that's not the tool to use for that situation.
+
+01:06:23.390 --> 01:06:23.480
+Right.
+
+01:06:23.790 --> 01:06:23.920
+Yeah.
+
+01:06:24.080 --> 01:06:25.620
+But what you're using here is incredible.
+
+01:06:26.300 --> 01:06:26.380
+Yeah.
+
+01:06:26.500 --> 01:06:31.420
+But again, we have the guardrails of, you must write unit tests all the time that match the
+
+01:06:31.470 --> 01:06:31.600
+two.
+
+01:06:31.680 --> 01:06:33.380
+I mean, well, or we have fuzzing going on.
+
+01:06:33.460 --> 01:06:34.800
+The fuzzing is another amazing technique.
+
+01:06:35.430 --> 01:06:41.120
+So we have a JSON parser called jiter, which is about the fastest JSON parser in Rust that
+
+01:06:41.140 --> 01:06:45.040
+We also, it's built into Pydantic Core,
+
+01:06:45.100 --> 01:06:49.680
+but it's also actually independently a package on PyPI
+
+01:06:49.840 --> 01:06:50.660
+that's used an awful lot.
+
+01:06:50.720 --> 01:06:53.660
+You'll see it in the dependencies of OpenAI, for example.
+
+01:06:55.140 --> 01:06:57.420
+But jiter was where I discovered fuzzing, really.
+
+01:06:57.580 --> 01:07:01.500
+No, I found out about it through the Hypothesis project
+
+01:07:01.960 --> 01:07:04.920
+of my friend Zac Hatfield-Dodds in Python,
+
+01:07:05.020 --> 01:07:06.020
+but then fuzzing in Rust,
+
+01:07:06.480 --> 01:07:07.660
+because the performance is so much better,
+
+01:07:08.280 --> 01:07:09.400
+is really powerful.
+
+01:07:09.540 --> 01:07:12.540
+So basically it's generating random strings and using them as an input to something,
+
+01:07:13.260 --> 01:07:17.740
+but then it's using very clever stochastic techniques to work out where to try more things.
+
+01:07:18.420 --> 01:07:21.900
+And so you can basically fuzz Monty.
+
+01:07:21.900 --> 01:07:24.580
+You can just give it arbitrary strings for hour after hour,
+
+01:07:25.340 --> 01:07:27.680
+and periodically it'll find something where there's an error,
+
+01:07:27.840 --> 01:07:33.420
+where the memory usage is too high if you do the following sequence of multiplying integers together.
+
+01:07:34.460 --> 01:07:38.420
+I don't think it will find a true read-the-filesystem vulnerability,
+
+01:07:38.780 --> 01:07:40.780
+but it'll definitely find odd memory uses,
+
+01:07:40.890 --> 01:07:43.720
+or it has found stack overflows and panics
+
+01:07:43.750 --> 01:07:44.340
+and things like that.
+
+01:07:45.040 --> 01:07:46.120
+- Yeah, very interesting.
+
+01:07:48.440 --> 01:07:50.540
+Well, I think people are excited about it.
+
+01:07:50.800 --> 01:07:53.220
+It's definitely got a lot of people talking,
+
+01:07:53.340 --> 01:07:57.120
+a lot of attention, a lot of comments in the live stream.
+
+01:07:58.470 --> 01:08:01.720
+So congrats, and keep us posted on where it goes.
+
+01:08:02.800 --> 01:08:04.120
+- Will do. Thank you very much.
+
+01:08:04.540 --> 01:08:05.680
+Yeah, thanks so much for having me.
+
+01:08:06.280 --> 01:08:07.000
+- You bet. Bye.
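Stripped of the coverage-guided cleverness that tools like cargo-fuzz add, the fuzzing idea described in the conversation reduces to: generate random inputs, feed them to the target, and treat anything other than a clean result or the documented error as a bug. A stdlib-only sketch, using Python's own json parser as a stand-in target (jiter and Monty themselves are not exercised here):

```python
# Naive fuzz loop: random printable strings go into a parser, and any
# exception other than the documented JSONDecodeError counts as a crash.
# Real fuzzers layer coverage-guided input mutation on top of this idea.
import json
import random
import string

def fuzz_json(iterations: int = 2000, seed: int = 0) -> int:
    rng = random.Random(seed)
    crashes = 0
    for _ in range(iterations):
        length = rng.randrange(0, 30)
        s = "".join(rng.choice(string.printable) for _ in range(length))
        try:
            json.loads(s)
        except json.JSONDecodeError:
            pass           # expected failure mode for invalid input
        except Exception:  # anything else would be a real bug
            crashes += 1
    return crashes

print("crashes:", fuzz_json())
```

The Hypothesis library mentioned above automates the same pattern in Python, with much smarter input generation and automatic shrinking of failing cases.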
+ +01:08:07.600 --> 01:08:07.880 +- Cheers. + diff --git a/youtube_transcripts/541-zensical-a-modern-static-site-generator-transcript-original.vtt b/youtube_transcripts/541-zensical-a-modern-static-site-generator-transcript-original.vtt deleted file mode 100644 index 85abfe2..0000000 --- a/youtube_transcripts/541-zensical-a-modern-static-site-generator-transcript-original.vtt +++ /dev/null @@ -1,2959 +0,0 @@ -WEBVTT - -00:00:00.880 --> 00:00:03.560 -Martin, welcome to Talk Python To Me. Great to have you here. - -00:00:04.280 --> 00:00:05.000 -Thanks for having me. - -00:00:06.080 --> 00:00:16.180 -I'm excited to talk about static sites and the next big platform for building them here in Python and beyond. - -00:00:16.660 --> 00:00:20.680 -So really excited to talk about Zensicle. Am I saying that right? - -00:00:21.300 --> 00:00:22.820 -Yeah, pretty much. Zensicle. - -00:00:23.480 --> 00:00:25.440 -Zensicle. Okay. Great. - -00:00:25.440 --> 00:00:32.140 -Yeah, I know MKDocs, the material for MKDocs has been really, really popular. - -00:00:33.040 --> 00:00:37.460 -And you all have made a big splash announcing this new project. - -00:00:38.020 --> 00:00:40.100 -So I'm really looking forward to diving into it. - -00:00:40.400 --> 00:00:44.420 -Before we do, though, let's just get a little bit of background on you. - -00:00:44.540 --> 00:00:45.020 -Who is Martin? - -00:00:45.660 --> 00:00:47.960 -Of course. So hi, my name is Martin Donut. - -00:00:48.760 --> 00:00:51.180 -Most people probably know me as Squidfunk. - -00:00:51.180 --> 00:00:56.300 -I've been an independent developer and consultant for the last 20 years now. - -00:00:56.960 --> 00:01:01.880 -And I mostly write in TypeScript, Python, and lately a lot of Rust. - -00:01:02.080 --> 00:01:04.380 -So I've become a huge fan of Rust, actually. - -00:01:05.280 --> 00:01:06.960 -I'm kind of a free spirit. 
- -00:01:07.400 --> 00:01:12.460 -So I love doing my own thing and building products from front to back, basically. - -00:01:12.780 --> 00:01:14.640 -So doing the front end as well as the back end. - -00:01:15.400 --> 00:01:18.940 -And for the past 15 years, I contributed a lot to open source. - -00:01:18.940 --> 00:01:24.020 -As I already mentioned, my most popular project so far is Material for MKDocs. - -00:01:25.000 --> 00:01:31.960 -And it's, well, millions of people basically look at sites that are built with it every day. - -00:01:32.580 --> 00:01:36.540 -Yeah, well, and Zanzico, my latest project, will hopefully go far beyond that. - -00:01:36.600 --> 00:01:37.800 -So we're working very hard on it. - -00:01:38.020 --> 00:01:39.180 -And this is why I'm here today. - -00:01:39.420 --> 00:01:41.020 -So excited to talk about it. - -00:01:42.160 --> 00:01:43.360 -Yeah, I am as well. - -00:01:43.920 --> 00:01:48.660 -And let's just start by admiring your website a little bit. - -00:01:48.660 --> 00:01:49.140 -Thanks. - -00:01:50.860 --> 00:01:54.500 -Brian and I spoke about this over on our Python Bytes podcast. - -00:01:55.800 --> 00:02:00.560 -And we kind of just got distracted just staring at the website. - -00:02:00.720 --> 00:02:05.000 -It's this beautiful flow of, I don't know, colors. - -00:02:05.120 --> 00:02:09.520 -It looks a little bit like a black hole worm, a white wormhole sort of experience. - -00:02:09.640 --> 00:02:09.940 -I don't know. - -00:02:10.000 --> 00:02:13.500 -What was the inspiration there at this cool design? - -00:02:14.220 --> 00:02:16.260 -Yeah, this is actually a strange attractor. - -00:02:16.420 --> 00:02:17.800 -So this is something from physics. - -00:02:18.420 --> 00:02:20.620 -I'm not very, very proficient in physics. - -00:02:20.620 --> 00:02:25.580 -But those strange attractors, I had a fascination for them for a very long time. - -00:02:26.360 --> 00:02:28.480 -And they follow very simple rules. 
- -00:02:28.640 --> 00:02:34.860 -So it's just three equations that define how their points move in three-dimensional space. - -00:02:36.140 --> 00:02:40.760 -And yeah, but still with those simple rules, a very complex shape can emerge. - -00:02:40.760 --> 00:02:47.000 -And this is for us actually symbolizes the process of evolving ideas through writing. - -00:02:47.220 --> 00:02:55.320 -So if you have slightly different conditions from the start, it's still orbiting around the same shape. - -00:02:55.320 --> 00:02:56.940 -But it might look a little bit different. - -00:02:56.940 --> 00:03:00.860 -And there's actually, I can share this now, there's actually a little Easter egg. - -00:03:00.940 --> 00:03:02.220 -Nobody has found it so far. - -00:03:02.440 --> 00:03:15.640 -So if you hover over the homepage on zensicle.org with the mouse in the left bottom corner, you can actually change the coefficients of the animation. - -00:03:16.080 --> 00:03:20.280 -And if you do this, you can click on them and then you can use your cursor. - -00:03:20.280 --> 00:03:21.920 -I'm changing beta. - -00:03:22.120 --> 00:03:24.420 -We're running beta 0.22 right now. - -00:03:24.420 --> 00:03:26.220 -Oh, it really does change it. - -00:03:26.300 --> 00:03:26.500 -Yeah. - -00:03:26.580 --> 00:03:27.220 -Oh, my goodness. - -00:03:27.820 --> 00:03:27.960 -Yeah. - -00:03:28.080 --> 00:03:32.320 -So it takes a little time. - -00:03:32.560 --> 00:03:38.860 -But if you change the coefficients in a specific way, it might be completely chaotic and become unstable. - -00:03:39.300 --> 00:03:42.460 -So this is what I really find fascinating about those strange attractors. - -00:03:43.180 --> 00:03:44.720 -And it's also the inspiration for the logo. - -00:03:45.480 --> 00:03:47.940 -So we're building on this image a lot. - -00:03:49.840 --> 00:03:50.280 -Okay. - -00:03:50.400 --> 00:03:51.660 -I thought it was just a cool design. 
- -00:03:51.660 --> 00:03:56.400 -I didn't realize it had all this meaning and actual math and physics behind it. - -00:03:56.480 --> 00:03:57.360 -That's super cool. - -00:03:57.420 --> 00:03:57.500 -Yeah. - -00:03:57.500 --> 00:04:02.680 -I love chaos theory and all of this, these fractal type of ideas here. - -00:04:02.960 --> 00:04:04.600 -And yeah, it's super neat. - -00:04:05.780 --> 00:04:06.180 -Okay. - -00:04:06.260 --> 00:04:08.380 -So what is zensicle? - -00:04:09.120 --> 00:04:10.060 -Why did you build it? - -00:04:10.100 --> 00:04:11.400 -Why not just more material? - -00:04:11.400 --> 00:04:14.980 -So there are a lot of questions in there, actually. - -00:04:15.320 --> 00:04:18.820 -Maybe let me just start by shortly speaking about what it is. - -00:04:19.140 --> 00:04:24.420 -So in very simple terms, it's a tool to build beautiful websites from a folder of text files. - -00:04:24.780 --> 00:04:28.420 -So you just write in Markdown and can generate a static site. - -00:04:29.020 --> 00:04:30.360 -You don't need a database for it. - -00:04:30.440 --> 00:04:34.980 -So to those that don't know what a static site is, you don't need a database or server. - -00:04:34.980 --> 00:04:40.860 -It's just static HTML, which means you just pip install zensicle and you're ready to go within a few minutes. - -00:04:41.560 --> 00:04:43.840 -And it's fully open source, MIT licensed. - -00:04:44.560 --> 00:04:48.520 -And to maybe explain a little bit more about static sites. - -00:04:48.680 --> 00:04:52.040 -So the big benefit of it, you can host it for free in many places. - -00:04:52.040 --> 00:04:54.400 -For instance, on GitHub Pages or Cloudflare. - -00:04:54.400 --> 00:04:58.880 -And they're secure and fast by default because there's only static file serving involved. - -00:04:59.480 --> 00:04:59.900 -And zensicle. 
- -00:05:00.060 --> 00:05:08.280 -So we try to make it pretty with a modern design, many built-in features and fun, according to the feedback of our users, which is kind of unusual for writing documentation. - -00:05:08.820 --> 00:05:09.880 -So, yeah. - -00:05:11.240 --> 00:05:11.720 -Yeah. - -00:05:11.900 --> 00:05:12.540 -Very cool. - -00:05:12.540 --> 00:05:26.280 -And if anyone's tried to manually create a static site, it quickly becomes a challenge if you're just writing. - -00:05:26.860 --> 00:05:29.200 -I say, hey, it's only five HTML pages. - -00:05:29.280 --> 00:05:30.700 -I can just write the HTML. - -00:05:30.920 --> 00:05:31.480 -You know what I mean? - -00:05:32.780 --> 00:05:37.740 -But, well, what if you want to have common navigation or you want to change the look and feel? - -00:05:38.620 --> 00:05:41.800 -Oh, well, now I've got to go edit that in five places, right? - -00:05:41.800 --> 00:05:53.320 -And so if even just beyond, basically beyond one page, having something that generates the static site is super valuable, right? - -00:05:53.360 --> 00:06:01.640 -Because it'll generate the wrapper navigation, the common CSS, the footer, all those kinds of things, right? - -00:06:02.720 --> 00:06:03.160 -Yes. - -00:06:03.240 --> 00:06:04.600 -So it depends on what you want to do. - -00:06:04.600 --> 00:06:11.360 -So, of course, if you have a small site, like a personal website or so, you can just write basic HTML if you're proficient in it. - -00:06:11.360 --> 00:06:18.540 -For instance, the users of Material, only 7% of them are front-end developers. - -00:06:19.980 --> 00:06:23.660 -We will dive a little bit into how Zensicle relates to Material later. - -00:06:23.660 --> 00:06:29.540 -And what Zensicle is being used for primarily is for documentation. - -00:06:29.920 --> 00:06:37.360 -So it builds on the Doccess code philosophy, which means that you treat your documentation exactly like your source code. 
- -00:06:37.440 --> 00:06:38.860 -So you primarily write documentation. - -00:06:39.080 --> 00:06:43.380 -You don't want to fight front-end development problems. - -00:06:43.720 --> 00:06:46.820 -You just want to keep the content, like get the content out. - -00:06:46.820 --> 00:06:59.200 -And with this Doccess code, what the cool thing about it is you can use the same tools and processes and workflows like you use for code, like versioning and PRs to make changes. - -00:06:59.820 --> 00:07:07.620 -And the adoption is growing really fast, actually, among companies in recent years as they're moving away from proprietary tools to open-source solutions. - -00:07:07.620 --> 00:07:15.360 -So Zensicle is for you or a static site generator in general is for you if you just want to get your writing out. - -00:07:15.780 --> 00:07:19.340 -And, of course, you can also customize it and make it pretty as you want. - -00:07:19.640 --> 00:07:24.240 -But you don't necessarily need to know HTML, CSS, and JavaScript. - -00:07:24.560 --> 00:07:26.520 -And that's quite practical. - -00:07:26.520 --> 00:07:32.100 -And you talked about writing, and you even have your metaphor with strange attractors. - -00:07:33.940 --> 00:07:42.980 -I personally find if I'm just in a clean space where it's really just about the ideas, I don't have to worry about the design. - -00:07:43.260 --> 00:07:47.620 -It makes it so much easier to just focus on the actual writing. - -00:07:47.860 --> 00:07:49.180 -You're in a markdown editor. - -00:07:49.700 --> 00:07:54.560 -My favorite is Type Hora, but you can use whatever variety that you want, right? - -00:07:54.560 --> 00:07:56.400 -And you're just there. - -00:07:56.700 --> 00:07:59.160 -You're not worried even hardly about the formatting of the markdown. - -00:07:59.260 --> 00:07:59.860 -You're just writing. - -00:08:00.160 --> 00:08:04.440 -And I find that very good creative space, I guess. 
-
-00:08:06.080 --> 00:08:07.540
-Yeah, that's the beauty of markdown.
-
-00:08:07.900 --> 00:08:11.320
-So you can just write, as you mentioned.
-
-00:08:11.600 --> 00:08:16.140
-And how you, in the end, use it, you can still decide that afterwards.
-
-00:08:16.260 --> 00:08:19.020
-So if you want to build a website, if you want to create a PDF of it,
-
-00:08:19.560 --> 00:08:21.940
-if you just want to use it for internal note-taking or so.
-
-00:08:21.940 --> 00:08:26.760
-And this is the big benefit of markdown.
-
-00:08:26.860 --> 00:08:34.640
-It takes away a lot of the headache of having to remember a lot of markup in order to get your ideas out of the door.
-
-00:08:35.700 --> 00:08:38.520
-Can you actually put markup in it if you need to?
-
-00:08:39.000 --> 00:08:46.340
-For example, maybe you need a particular image, two of them side by side that are links,
-
-00:08:46.340 --> 00:08:48.780
-and you want them to open in a new tab if somebody clicks them.
-
-00:08:49.220 --> 00:08:53.620
-Can you set it into basically an unsafe mode and let it do embedded markup?
-
-00:08:54.620 --> 00:08:55.800
-Yeah, that's a great question.
-
-00:08:56.320 --> 00:08:57.380
-So, yes, it's possible.
-
-00:08:57.520 --> 00:09:00.040
-You can just use HTML within markdown.
-
-00:09:00.260 --> 00:09:04.460
-We currently depend on Python-Markdown, which we inherited from Material for MkDocs.
-
-00:09:04.460 --> 00:09:10.480
-We are gradually moving towards CommonMark, which, so just as a context,
-
-00:09:10.680 --> 00:09:14.860
-Python-Markdown has some oddities when you use HTML within markdown.
-
-00:09:14.960 --> 00:09:19.600
-For instance, it won't replace relative URLs correctly.
-
-00:09:19.720 --> 00:09:21.200
-This is like an annoying thing.
-
-00:09:21.200 --> 00:09:28.520
-But once we move to CommonMark, we will also have predefined components that you can use
-
-00:09:28.520 --> 00:09:33.680
-because you can't express everything, like more complex things in plain markdown.
-
-00:09:33.860 --> 00:09:38.180
-So there are only things like you can make text bold, you can have lists, tables, et cetera.
-
-00:09:38.240 --> 00:09:45.600
-But if it's more complex, as you mentioned, aligning two images or having an image with a caption or so,
-
-00:09:45.960 --> 00:09:47.200
-you need basically HTML.
-
-00:09:47.200 --> 00:09:47.720
-HTML.
-
-00:09:48.280 --> 00:09:52.000
-And this is possible already, but we will make it much easier in the future.
-
-00:09:52.220 --> 00:09:54.020
-The frontend world already knows this.
-
-00:09:54.300 --> 00:09:55.200
-So they use MDX.
-
-00:09:55.260 --> 00:09:59.480
-They've been using MDX for quite a while, which is a dialect on top of markdown,
-
-00:09:59.860 --> 00:10:04.380
-which adds more liberty with components and so on.
-
-00:10:04.380 --> 00:10:07.140
-So you can create reusable components that you can use.
-
-00:10:08.400 --> 00:10:08.520
-Yeah.
-
-00:10:08.780 --> 00:10:09.980
-But, yeah.
-
-00:10:10.200 --> 00:10:11.540
-So it's possible.
-
-00:10:12.360 --> 00:10:15.500
-Our users already do it.
-
-00:10:15.500 --> 00:10:20.040
-We also have some examples on the documentation, and we will make it much more powerful in the future.
-
-00:10:20.980 --> 00:10:21.180
-Yeah.
-
-00:10:21.380 --> 00:10:21.860
-Very nice.
-
-00:10:22.240 --> 00:10:28.220
-I do think regular markdown is just missing a few things.
-
-00:10:28.320 --> 00:10:29.720
-I love the simplicity of it.
-
-00:10:30.200 --> 00:10:32.800
-And hat tip, John Gruber, for creating it.
-
-00:10:32.800 --> 00:10:40.360
-But it's just like, I just need to maybe put a class here or just do a little, if I could just control this a little bit more,
-
-00:10:40.580 --> 00:10:43.560
-then you could basically escape HTML.
-
-00:10:43.900 --> 00:10:49.260
-With obviously being careful to not just recreate HTML with square brackets instead of angle brackets, right?
-
-00:10:49.260 --> 00:10:52.420
-Yeah, there's been a lot of work on Python-Markdown.
-
-00:10:52.500 --> 00:10:57.860
-So in Python-Markdown, there are some extensions that allow you to add classes at least to block elements.
-
-00:10:58.280 --> 00:11:03.180
-So on markdown, you need to distinguish between inline and block elements.
-
-00:11:03.320 --> 00:11:04.080
-Oh, no, it also works.
-
-00:11:04.080 --> 00:11:04.240
-Sorry.
-
-00:11:04.300 --> 00:11:06.480
-It also works on inline elements like links and so on.
-
-00:11:06.980 --> 00:11:08.300
-But this is special syntax.
-
-00:11:08.300 --> 00:11:12.280
-So Python-Markdown is a dialect that is not standardized like CommonMark.
-
-00:11:12.420 --> 00:11:16.240
-In CommonMark, this is not easily possible to add specific classes.
-
-00:11:16.580 --> 00:11:21.200
-But with CommonMark, as I mentioned, you have MDX, which is a de facto standard.
-
-00:11:21.320 --> 00:11:23.080
-I don't know if they've standardized it already.
-
-00:11:23.840 --> 00:11:25.320
-That allows for much, much more.
-
-00:11:26.320 --> 00:11:26.800
-Nice.
-
-00:11:28.320 --> 00:11:31.400
-So what is Zensical for?
-
-00:11:31.520 --> 00:11:34.980
-Is this a documentation generating tool?
-
-00:11:34.980 --> 00:11:39.860
-Is it just an open-ended static site generator?
-
-00:11:41.400 --> 00:11:47.340
-What is possible and what is your goal or your target with this project?
-
-00:11:49.500 --> 00:11:53.260
-Yeah, so as I mentioned right now, we're focusing on documentation.
-
-00:11:53.660 --> 00:11:56.200
-So because this is the thing we're coming from.
-
-00:11:56.700 --> 00:11:59.240
-But we're building Zensical for much, much more.
-
-00:11:59.240 --> 00:12:05.980
-So our stretch goal is to have a fully-fledged knowledge management and documentation solution.
-
-00:12:06.840 --> 00:12:11.120
-There are already a lot of companies that use it internally for knowledge management.
-
-00:12:12.080 --> 00:12:16.480
-Basically, as an alternative to a SaaS-based solution like Confluence and Notion.
-
-00:12:16.900 --> 00:12:19.360
-We are aware that for this, we need WYSIWYG.
-
-00:12:19.480 --> 00:12:20.760
-So what you see is what you get.
-
-00:12:20.880 --> 00:12:23.240
-A visual editor that is also usable by non-technical people.
-
-00:12:23.240 --> 00:12:30.580
-And if you scroll, if you check out our roadmap and scroll down all the way, you will see it as a stretch goal.
-
-00:12:31.220 --> 00:12:34.580
-Which is basically something we're working towards.
-
-00:12:35.160 --> 00:12:41.040
-Because this would actually allow so many more people within organizations to use it.
-
-00:12:41.040 --> 00:12:52.880
-And in general, Zensical, with Zensical, we focus on three key areas that make us different from other static site generators.
-
-00:12:52.880 --> 00:12:55.440
-Which is, well, a modern design.
-
-00:12:55.540 --> 00:12:57.580
-So, of course, some also have a modern design.
-
-00:12:57.740 --> 00:13:04.740
-But within the Python ecosystem, some options might look a little bit dated.
-
-00:13:04.740 --> 00:13:08.880
-So we try to be a little bit more on the edge, actually.
-
-00:13:09.580 --> 00:13:12.280
-And it should be flexible and it should be fast.
-
-00:13:12.340 --> 00:13:13.340
-So those three things.
-
-00:13:13.420 --> 00:13:17.880
-Because the design, actually, is the thing that people notice first.
-
-00:13:18.520 --> 00:13:22.860
-So what we offer is a design that is customizable, brandable.
-
-00:13:23.040 --> 00:13:26.740
-You have tons of options with which you can change how navigation is laid out.
-
-00:13:28.200 --> 00:13:30.460
-You can also change colors, fonts, etc.
-
-00:13:30.460 --> 00:13:36.300
-And we have a lot of components that make it ready for technical writing.
-
-00:13:36.400 --> 00:13:38.520
-As you mentioned, you just want to start writing.
-
-00:13:38.860 --> 00:13:41.560
-So we have stuff like admonitions, tabs.
-
-00:13:42.160 --> 00:13:48.020
-And one very specific feature that we have is code annotations that we inherited from Material for MkDocs.
-
-00:13:48.080 --> 00:13:50.080
-Which is quite unique among static site generators.
-
-00:13:50.080 --> 00:13:57.460
-Which allows you to put a little bubble onto any line of code.
-
-00:13:57.960 --> 00:13:59.420
-You have to visit our documentation.
-
-00:13:59.420 --> 00:14:03.680
-This is our, you're currently browsing our, the other site.
-
-00:14:04.000 --> 00:14:04.660
-All right, all right.
-
-00:14:04.660 --> 00:14:05.000
-Hold on.
-
-00:14:05.140 --> 00:14:05.600
-I got it.
-
-00:14:05.640 --> 00:14:05.980
-Keep going.
-
-00:14:06.040 --> 00:14:06.780
-I'll get to stay.
-
-00:14:07.220 --> 00:14:07.680
-Right, right.
-
-00:14:07.720 --> 00:14:08.120
-No worries.
-
-00:14:08.640 --> 00:14:08.820
-Yeah.
-
-00:14:09.120 --> 00:14:11.140
-And there you have to search for code annotations.
-
-00:14:11.800 --> 00:14:19.160
-Yeah, so code annotations, which allow you to create a bubble in any line of code.
-
-00:14:19.360 --> 00:14:22.060
-And if you click that bubble, there opens a tooltip.
-
-00:14:22.140 --> 00:14:24.180
-And within this tooltip, you can use any rich content.
-
-00:14:24.180 --> 00:14:28.360
-So you can have lists, any formatted markdown tables, diagrams.
-
-00:14:29.620 --> 00:14:34.160
-Basically anything you can use anyway within markdown.
-
-00:14:34.680 --> 00:14:36.500
-And this is a very popular feature in Material.
-
-00:14:36.880 --> 00:14:38.840
-And so, of course, we brought it over.
-
-00:14:39.340 --> 00:14:41.080
-So users can still use it.
-
-00:14:41.080 --> 00:14:44.940
-So the second thing I talked about is it should be flexible.
-
-00:14:45.140 --> 00:14:47.600
-So what makes Zensical different is we have a modular architecture.
-
-00:14:48.020 --> 00:14:50.060
-Or say we're working towards a modular architecture.
-
-00:14:50.240 --> 00:14:51.760
-We're still a little, we're still in alpha.
-
-00:14:51.760 --> 00:14:55.540
-So we're close to finishing the module system.
-
-00:14:56.620 --> 00:15:01.620
-And in Zensical, it's modules all the way down, which means all core functionality is implemented as modules,
-
-00:15:01.620 --> 00:15:09.680
-which is different from other solutions where the plugin system sometimes is more or less an afterthought.
-
-00:15:09.860 --> 00:15:14.420
-So there's a plugin system added with specific hooks, extension points where you can hook into.
-
-00:15:14.420 --> 00:15:23.680
-And this might seem sufficient at first, but in the end, so for us, for instance, MkDocs in the end was a little bit limiting.
-
-00:15:24.260 --> 00:15:28.320
-And this allows you to basically swap, extend, replace all modules.
-
-00:15:28.480 --> 00:15:29.480
-You can use our modules.
-
-00:15:29.700 --> 00:15:31.660
-You can write your own, pull in third-party modules.
-
-00:15:31.980 --> 00:15:33.660
-And as I mentioned, Rust.
-
-00:15:33.860 --> 00:15:34.620
-So don't worry.
-
-00:15:34.820 --> 00:15:35.820
-You don't need to learn Rust.
-
-00:15:36.120 --> 00:15:42.100
-You will also be able to write modules in Python because we are super happy users of PyO3, which is an absolutely amazing library.
-
-00:15:42.100 --> 00:15:48.280
-And PyO3 has really become a super important foundation of Python these days.
-
-00:15:48.400 --> 00:15:52.760
-It's almost like the C bindings for CPython.
-
-00:15:53.500 --> 00:15:53.740
-Exactly.
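The "modules all the way down" idea described here, where core functionality and user extensions live in one registry so any piece can be swapped or replaced, can be sketched roughly like this. All names are invented for illustration; this is not Zensical's actual API.

```python
# Hypothetical sketch of a fully modular design: core behavior is itself
# a registered module, so replacing it uses the same mechanism as adding
# a third-party or user-written module.
from typing import Callable

# A module here is just a function from page text to rendered output.
Registry = dict[str, Callable[[str], str]]

def default_registry() -> Registry:
    # The built-in renderer is registered like any other module...
    return {"render": lambda text: f"<p>{text}</p>"}

def register(registry: Registry, name: str, module: Callable[[str], str]) -> None:
    # ...so a user module swaps in by registering under the same name,
    # with no special-cased plugin hooks.
    registry[name] = module

registry = default_registry()
built_in = registry["render"]("hello")           # core module output
register(registry, "render", lambda t: f"<section>{t}</section>")
replaced = registry["render"]("hello")           # swapped-in module output
```

The design point is that "core" and "extension" go through the same door, unlike systems where plugins are limited to a fixed set of hooks.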
-
-00:15:54.200 --> 00:15:55.400
-So, yeah.
-
-00:15:55.560 --> 00:15:58.760
-So with PyO3, it allows us to have a Rust runtime.
-
-00:15:59.280 --> 00:16:07.000
-So all of the orchestration and how, in which order, so in which order things are run, threading, caching, parallelization, et cetera,
-
-00:16:07.000 --> 00:16:08.140
-all is happening in Rust.
-
-00:16:08.140 --> 00:16:13.980
-And we will provide Python bindings so that you still can use Python to write modules.
-
-00:16:14.380 --> 00:16:16.000
-And they're still running fast.
-
-00:16:16.620 --> 00:16:16.740
-Yeah.
-
-00:16:16.920 --> 00:16:19.200
-Which brings me to the last point where we're different.
-
-00:16:19.460 --> 00:16:21.500
-We have a very heavy focus on performance.
-
-00:16:21.800 --> 00:16:29.520
-So our goal is to let you start with one page because, of course, all documentation sites or projects start small.
-
-00:16:29.820 --> 00:16:32.860
-And let you scale that to something like 100,000 pages.
-
-00:16:32.860 --> 00:16:36.160
-How we do it is through differential builds.
-
-00:16:36.360 --> 00:16:39.740
-We have created our own runtime, which is called ZRX.
-
-00:16:40.340 --> 00:16:43.720
-And differential builds mean that we are only rebuilding what changed.
-
-00:16:43.800 --> 00:16:49.740
-So, for instance, if you only change the page title, only that page and all instances where the page title is used are being rebuilt.
-
-00:16:49.940 --> 00:16:53.060
-And this means that changes are visible in milliseconds and not minutes.
-
-00:16:53.060 --> 00:16:53.460
-Yeah.
-
-00:16:54.260 --> 00:16:54.660
-Yeah.
-
-00:16:55.940 --> 00:16:56.860
-That's super cool.
-
-00:16:57.860 --> 00:17:01.540
-And so I'm presuming the build system itself is Rust-based, right?
-
-00:17:02.220 --> 00:17:02.720
-Yeah, exactly.
-
-00:17:02.840 --> 00:17:03.940
-It's 100% Rust, yeah.
-
-00:17:04.440 --> 00:17:05.000
-Yeah, yeah.
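The differential-build idea described here can be modeled minimally: fingerprint each page's source and rebuild only the pages whose fingerprint changed. This toy sketch tracks only per-page changes; the real system, as described above, also tracks cross-page dependencies such as where a page title is referenced. Nothing here reflects the actual runtime's implementation.

```python
# Toy model of differential builds: hash each page source, compare against
# a cache from the previous build, and rebuild only what changed.
import hashlib

def fingerprint(source: str) -> str:
    """Stable content hash of one page's source."""
    return hashlib.sha256(source.encode()).hexdigest()

def diff_build(sources: dict[str, str], cache: dict[str, str]) -> list[str]:
    """Return the pages that need rebuilding, updating the cache in place."""
    dirty = []
    for page, source in sources.items():
        digest = fingerprint(source)
        if cache.get(page) != digest:  # new page or changed content
            dirty.append(page)
            cache[page] = digest
    return dirty

cache: dict[str, str] = {}
# First build: everything is new, so everything is rebuilt.
first = diff_build({"index.md": "# Home", "about.md": "# About"}, cache)
# Second build: only index.md changed, so only it is rebuilt.
second = diff_build({"index.md": "# Homepage", "about.md": "# About"}, cache)
```

That skip-the-unchanged behavior is what turns full-site rebuild times into millisecond incremental updates.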
-
-00:17:06.220 --> 00:17:10.600
-Coming from a Python background, what was that experience like building that?
-
-00:17:10.600 --> 00:17:20.440
-Yeah, so that's kind of a tricky question because I'm not really coming from a long history with Python.
-
-00:17:20.440 --> 00:17:22.700
-So I don't have a long Python background.
-
-00:17:23.500 --> 00:17:25.480
-I wrote mainly in TypeScript.
-
-00:17:26.240 --> 00:17:30.240
-And I only started writing Python in 2021.
-
-00:17:31.020 --> 00:17:36.300
-So this is actually the history, how Material started and how all of this unfolded.
-
-00:17:36.300 --> 00:17:40.300
-But I've written in several languages.
-
-00:17:41.240 --> 00:17:44.840
-So I also have written in C, Erlang, Ruby, Python, TypeScript.
-
-00:17:45.660 --> 00:17:47.180
-Rust was still extremely hard to learn.
-
-00:17:47.580 --> 00:17:51.040
-So I basically banged my head against the keyboard for a month.
-
-00:17:51.140 --> 00:17:54.500
-I wasn't making any progress at all because, yeah, you know, fighting with the borrow checker.
-
-00:17:54.920 --> 00:18:02.120
-So and once you get past that and then, of course, lifetimes and higher-ranked trait bounds and some other features,
-
-00:18:02.120 --> 00:18:11.340
-I'm now some kind of like 3,000 or 4,000 hours in, something like that, it gets really good.
-
-00:18:11.340 --> 00:18:20.860
-So I think Rust is seriously one of the best languages ever made because it allows you to express ideas with extreme clarity.
-
-00:18:21.860 --> 00:18:28.760
-And this is due to the very good type system, of course, and you get bare metal performance.
-
-00:18:28.760 --> 00:18:36.700
-And so I find it kind of insane having a language like Rust because it's so easy to write once you're used to it.
-
-00:18:36.700 --> 00:18:42.020
-You will be very productive and still have bare metal performance.
-
-00:18:42.440 --> 00:18:43.280
-It's completely insane.
-
-00:18:44.100 --> 00:18:44.780
-Yeah, that's wild.
-
-00:18:45.060 --> 00:18:50.820
-But it's got a little bit of a learning curve compared to like Python or TypeScript or something like that.
-
-00:18:51.840 --> 00:18:52.140
-Yeah.
-
-00:18:52.140 --> 00:18:57.160
-So I had, I think, 18 years of experience with many languages.
-
-00:18:58.140 --> 00:19:03.660
-As I mentioned, I also did a lot of C and I still found it very hard to learn.
-
-00:19:04.460 --> 00:19:04.660
-Yeah.
-
-00:19:04.660 --> 00:19:07.580
-But it's worth it.
-
-00:19:07.960 --> 00:19:08.820
-It's worth it.
-
-00:19:08.880 --> 00:19:14.860
-And my recommendation probably would be to learn it on something that you really care about,
-
-00:19:14.860 --> 00:19:23.720
-and that you want to build, because otherwise you will probably lose the drive since you're running against those walls.
-
-00:19:24.300 --> 00:19:28.360
-Maybe for you or for somebody else, it's much easier to learn.
-
-00:19:28.660 --> 00:19:33.300
-So maybe I'm just a bad example in that I needed so long.
-
-00:19:33.400 --> 00:19:33.660
-I don't know.
-
-00:19:33.860 --> 00:19:38.940
-But because after that month, it wasn't that I was completely up to speed.
-
-00:19:38.940 --> 00:19:45.920
-So it was just I was making very, very tiny progress, at least progress, because for a month I wasn't making progress at all.
-
-00:19:47.020 --> 00:19:47.580
-Yeah.
-
-00:19:47.680 --> 00:19:47.840
-Wow.
-
-00:19:49.780 --> 00:19:57.000
-The next show that I'm doing after this one, which actually is in real clock time, wall time,
-
-00:19:57.140 --> 00:20:02.240
-it's happening in like two hours or less from now is with Samuel Colvin from Pydantic.
-
-00:20:03.380 --> 00:20:03.500
-Yeah.
-
-00:20:03.580 --> 00:20:07.020
-Talking about Monty, a Python runtime.
-
-00:20:07.020 --> 00:20:12.640
-He and his team are rewriting in Rust, specifically targeting AI.
-
-00:20:12.900 --> 00:20:15.040
-So the Rust theme will continue.
-
-00:20:15.320 --> 00:20:21.400
-It's definitely a very – it caught me a little bit off guard, like how much people love it.
-
-00:20:21.480 --> 00:20:29.220
-But it's also – it makes perfect sense that we want this nice modern language for writing lower level things,
-
-00:20:29.580 --> 00:20:31.560
-even if it plugs into Python, right?
-
-00:20:31.560 --> 00:20:31.960
-Yeah.
-
-00:20:32.160 --> 00:20:32.560
-Yeah.
-
-00:20:32.640 --> 00:20:39.360
-So the fun thing is I also talked to Samuel a long time ago, and he was the one recommending to me to write it in Rust.
-
-00:20:40.100 --> 00:20:40.420
-Okay.
-
-00:20:40.980 --> 00:20:41.760
-So it's his fault.
-
-00:20:42.180 --> 00:20:45.900
-It's one of the reasons I – yeah, definitely I looked into it.
-
-00:20:46.920 --> 00:20:47.180
-Nice.
-
-00:20:47.240 --> 00:20:47.420
-Okay.
-
-00:20:47.560 --> 00:20:51.780
-And it made a lot of sense also during the time, the progress we're making and so on,
-
-00:20:51.780 --> 00:20:55.280
-and the walls we're hitting, to reconsider learning Rust.
-
-00:20:56.400 --> 00:20:57.200
-Best investment.
-
-00:20:57.880 --> 00:20:58.220
-Yeah.
-
-00:20:58.400 --> 00:20:58.800
-Amazing.
-
-00:20:59.160 --> 00:20:59.520
-Amazing.
-
-00:21:00.140 --> 00:21:04.700
-So I want to dig into your component structure and some of those things.
-
-00:21:04.800 --> 00:21:08.720
-But maybe before we do, let's talk about the origins a little bit.
-
-00:21:08.720 --> 00:21:14.340
-But so let's talk about how you went from Material for MkDocs.
-
-00:21:15.740 --> 00:21:17.100
-Why even change?
-
-00:21:17.160 --> 00:21:18.800
-Why not just more Material?
-
-00:21:20.660 --> 00:21:21.100
-Yeah.
-
-00:21:21.200 --> 00:21:25.140
-So this is a great question, and this is a little bit of a story.
-
-00:21:25.240 --> 00:21:26.980
-So there are several stories in there, actually.
-
-00:21:27.220 --> 00:21:27.360
-Yeah.
-
-00:21:27.360 --> 00:21:29.500
-So it's 10 years.
-
-00:21:29.640 --> 00:21:35.200
-I'll try to make it as compact as possible while keeping the most important things.
-
-00:21:35.480 --> 00:21:39.740
-So to those who don't know, Material for MkDocs is a very popular documentation framework.
-
-00:21:39.860 --> 00:21:41.420
-It's used by tens of thousands of projects.
-
-00:21:42.040 --> 00:21:45.080
-There are prominent users like AWS, Microsoft, OpenAI.
-
-00:21:46.320 --> 00:21:51.340
-Also, large open source projects use it, like, for instance, FastAPI, uv, Knative.
-
-00:21:51.620 --> 00:21:57.080
-And it's built on top of MkDocs, as the name says, which became one of the most popular static site generators.
-
-00:21:57.360 --> 00:21:59.720
-And it also eventually became my job.
-
-00:22:00.100 --> 00:22:02.220
-So I could make it my job.
-
-00:22:02.280 --> 00:22:05.740
-I could work in open source and earn a living somehow.
-
-00:22:06.040 --> 00:22:07.900
-I'm getting to how that worked.
-
-00:22:09.320 --> 00:22:12.700
-But at some point, we needed a new foundation.
-
-00:22:13.120 --> 00:22:17.180
-We've kind of outgrown MkDocs because it was not evolving at the pace that we needed.
-
-00:22:17.380 --> 00:22:18.920
-So we began exploring alternatives.
-
-00:22:19.560 --> 00:22:21.000
-And, yeah.
-
-00:22:21.100 --> 00:22:23.580
-So there's a lot of lessons learned in Material.
-
-00:22:23.580 --> 00:22:27.320
-So let me shortly maybe talk about how it started.
-
-00:22:27.860 --> 00:22:32.180
-Because it started as a side project in 2015, like many things start.
-
-00:22:32.420 --> 00:22:38.740
-Because I wanted to release actually a C library, a zero-copy protocol buffers library I wrote called Protobluff.
-
-00:22:39.340 --> 00:22:42.480
-But then I realized that it needed more than a readme.
-
-00:22:42.660 --> 00:22:48.620
-So I looked at the existing static site generators, which were Hugo, Jekyll, Sphinx, MkDocs, something like that.
-
-00:22:49.240 --> 00:22:50.720
-And they all looked a little bit dated.
-
-00:22:51.340 --> 00:22:52.240
-I'm not a designer.
-
-00:22:52.440 --> 00:22:54.180
-But I wanted something more modern.
-
-00:22:54.320 --> 00:22:59.680
-And Google was pushing Material Design quite hard for app development at the time.
-
-00:23:00.060 --> 00:23:02.460
-And I've also seen it being used in the web.
-
-00:23:02.580 --> 00:23:04.340
-So I thought, well, maybe combine this.
-
-00:23:05.300 --> 00:23:06.980
-I quickly settled on MkDocs.
-
-00:23:07.040 --> 00:23:07.700
-It was easy to use.
-
-00:23:07.760 --> 00:23:08.360
-Simple templating.
-
-00:23:09.560 --> 00:23:10.920
-Enough for a side project, basically.
-
-00:23:11.120 --> 00:23:12.020
-So it was a side project.
-
-00:23:12.540 --> 00:23:13.440
-Did what most devs do.
-
-00:23:13.440 --> 00:23:14.360
-Check the license.
-
-00:23:14.360 --> 00:23:17.100
-But didn't do any further due diligence.
-
-00:23:17.560 --> 00:23:21.820
-So even put MkDocs in the name to show the connection, which is common for themes.
-
-00:23:22.080 --> 00:23:26.300
-And that actually turned out to be one of the biggest decisions I made in my career.
-
-00:23:26.520 --> 00:23:30.620
-Since I was basing my complete work on something I don't control.
-
-00:23:31.320 --> 00:23:35.520
-And it shaped the next 10 years of all of the work I was doing.
-
-00:23:35.600 --> 00:23:38.660
-And it's actually the reason why Zensical exists today.
-
-00:23:39.720 --> 00:23:39.980
-I see.
-
-00:23:39.980 --> 00:23:47.140
-So after I started developing it, I, like nine months later, released the first version.
-
-00:23:47.280 --> 00:23:48.040
-And it found good users.
-
-00:23:48.220 --> 00:23:49.280
-A lot of feature requests.
-
-00:23:50.140 --> 00:23:52.140
-And, you know, it was a side project.
-
-00:23:52.260 --> 00:23:54.380
-So I was doing client work at the time.
-
-00:23:54.600 --> 00:24:00.240
-As I mentioned, I've been like a consultant and developer, freelancer for 20 years.
-
-00:24:01.480 --> 00:24:04.080
-And I only had Sundays to work on it.
-
-00:24:04.520 --> 00:24:06.900
-Which at first was sufficient.
-
-00:24:07.180 --> 00:24:09.700
-But the more popular it got, the more maintenance there came.
-
-00:24:09.700 --> 00:24:12.920
-So it kind of crept into my mornings and evenings.
-
-00:24:12.920 --> 00:24:19.300
-And I was doing triage, like answering questions and trying to fix bugs before I went to the client.
-
-00:24:19.300 --> 00:24:22.460
-And it was getting harder and harder to justify in front of my partner, actually.
-
-00:24:22.460 --> 00:24:25.100
-Because I was doing it in my spare time.
-
-00:24:25.660 --> 00:24:30.880
-And so I did what eventually happens to all projects that started as side projects.
-
-00:24:30.880 --> 00:24:36.220
-Where you don't have the full time to work on it, yeah.
-
-00:24:36.300 --> 00:24:39.960
-So what basically happens is you start turning down feature requests.
-
-00:24:40.320 --> 00:24:42.600
-And many open source projects don't cross this line.
-
-00:24:42.720 --> 00:24:43.940
-And for me, it was a first.
-
-00:24:44.520 --> 00:24:46.440
-So, yeah.
-
-00:24:46.520 --> 00:24:50.940
-And also additionally, so I mentioned before that I started writing Python in 2021.
-
-00:24:51.320 --> 00:24:52.940
-At the time, I was focusing.
-
-00:24:53.800 --> 00:24:55.740
-So I only had Sundays to work on it.
-
-00:24:55.780 --> 00:24:56.480
-I didn't know Python.
-
-00:24:56.780 --> 00:25:00.080
-So I said that, okay, I will focus on the templating stuff.
-
-00:25:00.200 --> 00:25:02.640
-I will do the HTML, CSS, JavaScript, all of this, make it beautiful.
-
-00:25:03.000 --> 00:25:07.300
-And try to solve as much, as many problems as possible in the front end.
-
-00:25:07.740 --> 00:25:09.760
-But I won't start learning Python.
-
-00:25:09.840 --> 00:25:13.420
-Because it wasn't a language that I was using at that time.
-
-00:25:13.420 --> 00:25:16.320
-And I couldn't make up the time for it.
-
-00:25:16.420 --> 00:25:18.120
-So that's where I drew the line.
-
-00:25:19.780 --> 00:25:22.340
-It's probably going to be a fad, that Python thing anyway.
-
-00:25:23.620 --> 00:25:24.780
-I don't think so.
-
-00:25:25.820 --> 00:25:31.200
-Well, at the time, in 2015, it wasn't clear that it was going to be as popular as it was.
-
-00:25:31.640 --> 00:25:33.000
-As it is now, right?
-
-00:25:33.060 --> 00:25:36.060
-It's really, it started to become popular then.
-
-00:25:36.640 --> 00:25:38.700
-But it's really taken over the world.
-
-00:25:39.400 --> 00:25:39.840
-Absolutely.
-
-00:25:40.680 --> 00:25:41.400
-For a lot of reasons.
-
-00:25:41.400 --> 00:25:43.240
-Of course, yeah.
-
-00:25:44.040 --> 00:25:49.540
-I think one of the main reasons is because it's very popular in the ML community.
-
-00:25:49.720 --> 00:25:53.020
-And all of the LLM AI work that's happening and so on made it extremely popular.
-
-00:25:54.240 --> 00:26:00.660
-And I also think that Rust is doing a very good job on keeping it that way.
-
-00:26:00.960 --> 00:26:06.320
-Because finally, you have a very easy way to offload work to native code.
-
-00:26:06.320 --> 00:26:11.240
-Which is much easier than fiddling with C and C++ and void pointers and whatever.
-
-00:26:11.400 --> 00:26:14.760
-So as I mentioned, PyO3 is just an absolutely amazing library.
-
-00:26:14.880 --> 00:26:16.720
-It's so easy to write Rust code.
-
-00:26:17.520 --> 00:26:18.520
-Yeah, I think you're right.
-
-00:26:18.660 --> 00:26:22.640
-I think Rust has really provided an important escape hatch for it.
-
-00:26:22.680 --> 00:26:23.440
-I wrote it this way.
-
-00:26:23.500 --> 00:26:24.260
-It's not fast enough.
-
-00:26:24.440 --> 00:26:27.580
-Like, well, this part, we're going to make it as fast as it can be, basically.
-
-00:26:28.280 --> 00:26:28.460
-Yeah.
-
-00:26:29.460 --> 00:26:29.820
-Yeah.
-
-00:26:30.100 --> 00:26:30.940
-So...
-
-00:26:30.940 --> 00:26:32.600
-Sorry, I interrupted you.
-
-00:26:32.640 --> 00:26:32.900
-Keep going.
-
-00:26:32.980 --> 00:26:33.440
-No worries.
-
-00:26:33.540 --> 00:26:33.820
-No worries.
-
-00:26:34.080 --> 00:26:34.640
-Yeah, no, no.
-
-00:26:35.220 --> 00:26:38.960
-Yeah, so as I mentioned, I tried to keep it basically afloat for the first four years.
-
-00:26:40.080 --> 00:26:42.560
-And at the time, I didn't see the potential at all.
-
-00:26:42.680 --> 00:26:45.460
-It was just a theme, not a kind of product or so.
-
-00:26:45.860 --> 00:26:48.520
-But yet I felt responsible and kept on maintaining it.
-
-00:26:48.520 --> 00:26:52.220
-And my developer friends didn't understand why I was doing that.
-
-00:26:52.620 --> 00:26:53.020
-So...
-
-00:26:53.020 --> 00:26:57.660
-But for me, it was like, you know, it was kind of cool because I had a growing project.
-
-00:26:57.840 --> 00:26:58.680
-I had no immediate plans.
-
-00:26:58.800 --> 00:26:59.120
-I don't know.
-
-00:26:59.420 --> 00:27:02.540
-Let's see where I can take it.
-
-00:27:03.340 --> 00:27:04.900
-And yeah, so...
-
-00:27:04.900 --> 00:27:09.000
-And with this steady and slow growth over the years, then companies and organizations started using it.
-
-00:27:09.000 --> 00:27:17.200
-So they were basing their public-facing documentation on me, like the guy that maybe works on this project on a Sunday.
-
-00:27:18.160 --> 00:27:23.420
-And yet I felt responsible enough to try to fix the bugs reported as quickly as possible.
-
-00:27:24.260 --> 00:27:24.400
-Yeah.
-
-00:27:25.000 --> 00:27:28.140
-And yeah, then in 2020 actually came the turning point.
-
-00:27:28.220 --> 00:27:31.460
-So when I was working on version five of it, I shared my progress publicly as I did before.
-
-00:27:31.520 --> 00:27:33.040
-And somebody mentioned a donate button.
-
-00:27:33.040 --> 00:27:42.600
-So I think the wording was something like, so that I can order pizza to survive the long Sunday coding sessions.
-
-00:27:44.040 --> 00:27:50.760
-But I heard from another developer who did this on his project, successful project for five years, a donate button.
-
-00:27:51.040 --> 00:27:52.260
-And he made $90.
-
-00:27:52.620 --> 00:27:56.360
-So I immediately said, that's not going to work.
-
-00:27:56.480 --> 00:27:59.760
-But I said, let's try an Amazon wish list.
-
-00:27:59.760 --> 00:28:08.400
-You know, I just put some stuff on there and maybe if somebody thinks my work is useful, then he can order me, like, make me a present, something, send me a present.
-
-00:28:09.240 --> 00:28:13.120
-So, yeah, and I basically received everything on that wish list.
-
-00:28:13.420 --> 00:28:14.420
-It was completely insane.
-
-00:28:14.540 --> 00:28:16.760
-So there were two consecutive days that felt like Christmas.
-
-00:28:17.000 --> 00:28:17.940
-I even put like...
-
-00:28:17.940 --> 00:28:20.760
-So I put some, you know, books and...
-
-00:28:21.360 --> 00:28:23.320
-But then also a single malt.
-
-00:28:23.520 --> 00:28:25.720
-I love Scottish single malt.
-
-00:28:26.720 --> 00:28:28.720
-It was a whiskey that cost $120.
-
-00:28:28.720 --> 00:28:30.940
-And I received that as well.
-
-00:28:31.500 --> 00:28:33.560
-So it was like, what's happening?
-
-00:28:34.380 --> 00:28:37.320
-And that led me to start thinking actually about demographics.
-
-00:28:38.020 --> 00:28:43.100
-So that I needed to better understand the audience of Material for MkDocs.
-
-00:28:43.420 --> 00:28:44.580
-And I did a poll.
-
-00:28:44.880 --> 00:28:46.740
-And the results were absolutely eye-opening.
-
-00:28:46.980 --> 00:28:52.220
-I mentioned before, only 7% of users are front-end developers.
-
-00:28:52.480 --> 00:28:53.060
-Which means...
-
-00:28:53.060 --> 00:28:54.640
-And Material is a front-end heavy project.
-
-00:28:54.640 --> 00:28:59.340
-So I kind of had an edge there in the Python space.
-
-00:28:59.920 --> 00:29:01.940
-Because, yeah, you know, it's based on Python.
-
-00:29:02.080 --> 00:29:08.320
-So front-end developers that write in JavaScript, they rather go for something like Docusaurus or React-based or whatever.
-
-00:29:09.000 --> 00:29:11.100
-And technical writers were quite happy with the project.
-
-00:29:11.420 --> 00:29:13.640
-I didn't even know technical writers existed.
-
-00:29:13.640 --> 00:29:17.560
-So I had no clue that this job, that this is a job.
-
-00:29:17.860 --> 00:29:21.380
-Because I thought at the time, and it's in hindsight completely naive, of course.
-
-00:29:21.540 --> 00:29:24.820
-I thought that as a developer, you need to write the documentation, you know.
-
-00:29:25.300 --> 00:29:30.660
-So I learned about that and accidentally built a product for technical writers.
-
-00:29:31.240 --> 00:29:36.960
-And by the way, when I say product, I mean something that is not necessarily something you pay for.
-
-00:29:37.020 --> 00:29:38.600
-But something that doesn't feel engineered.
-
-00:29:38.600 --> 00:29:44.360
-So something that is like polished and designed and that you actually want to use.
-
-00:29:45.920 --> 00:29:46.340
-And, yeah.
-
-00:29:46.580 --> 00:29:51.280
-So I had a product that has like product market fit.
-
-00:29:51.600 --> 00:29:54.040
-But at the time, I didn't earn any money off it.
-
-00:29:54.440 --> 00:29:56.600
-So at the same time, I read about sponsorware.
-
-00:29:57.340 --> 00:30:00.380
-And this, like, I'm not sure if you heard of it.
-
-00:30:00.420 --> 00:30:03.680
-But it's like a new model of monetization for open source.
-
-00:30:03.740 --> 00:30:04.900
-At the time, it was quite new.
-
-00:30:04.900 --> 00:30:07.560
-So that you can get paid for your work.
- -00:30:07.740 --> 00:30:14.760 -So you can, so some developers, for instance, they sell course material or access to gated content or code or nothing at all. - -00:30:14.920 --> 00:30:22.380 -So if you have a popular project, you can just try to raise sponsorships from, and some companies are very generous when it comes to open source. - -00:30:23.000 --> 00:30:28.820 -And what we did with Material was we gave away early access to the latest features to the sponsors. - -00:30:28.820 --> 00:30:31.620 -And each feature was tied to a funding goal. - -00:30:31.720 --> 00:30:35.060 -And when that funding goal was met, it became free for everyone. - -00:30:35.380 --> 00:30:40.860 -So it was like kind of a funded feature development in multiple stages. - -00:30:41.400 --> 00:30:43.420 -And that's what I thought of it. - -00:30:44.320 --> 00:30:44.880 -Sorry? - -00:30:45.100 --> 00:30:45.180 -Yeah. - -00:30:45.760 --> 00:30:46.780 -That's super clever. - -00:30:47.000 --> 00:30:52.480 -I really love the idea of providing something for the sponsors. - -00:30:52.480 --> 00:30:58.600 -But still not turning it into, well, here's a paid version of our product and here's the open source version. - -00:30:58.820 --> 00:31:07.140 -But there's always this tension of how do you reward the people who support you without undermining the open source project? - -00:31:07.340 --> 00:31:08.460 -And that's a clever angle. - -00:31:08.460 --> 00:31:12.020 -Yeah, so that's extremely challenging. - -00:31:12.780 --> 00:31:15.960 -So as I'm telling this, so this is what I came up with. - -00:31:16.180 --> 00:31:18.720 -And I thought maybe it could work, something like that. - -00:31:18.760 --> 00:31:22.660 -And again, my developer friends, they said, well, never work. - -00:31:22.780 --> 00:31:24.340 -Nobody will pay for open source. - -00:31:24.440 --> 00:31:25.020 -You're insane. - -00:31:25.660 --> 00:31:27.200 -Spoiler alert, it did work. 
- 
-00:31:27.360 --> 00:31:31.380
-And in the end, we made 200K a year of it and could build a team and everything.
- 
-00:31:31.520 --> 00:31:34.140
-So I know in Silicon Valley terms, this is probably minimum wage.
- 
-00:31:34.140 --> 00:31:39.760
-But in Europe, it's quite an amount with which you can work very well.
- 
-00:31:40.700 --> 00:31:44.900
-And yeah, so I started this program in 2020 and it grew steadily.
- 
-00:31:45.340 --> 00:31:49.900
-And it finally allowed me to work on features outside of the Sunday.
- 
-00:31:50.100 --> 00:31:55.000
-So invest more hours into it and finally learn Python in 2021.
- 
-00:31:55.000 --> 00:32:05.060
-Wrote my first plugin and started hacking the MKDocs features that, well, that got turned down, that we upstreamed.
- 
-00:32:05.140 --> 00:32:08.920
-But where the maintainer said, ah, it's maybe not a good fit or we don't have the time for it.
- 
-00:32:09.520 --> 00:32:12.140
-And yeah, in total, I wrote 12 MKDocs plugins.
- 
-00:32:12.900 --> 00:32:18.060
-So it started as a theme, but it turned into a popular, sorry, into a powerful docs framework in the end.
- 
-00:32:18.060 --> 00:32:23.340
-And this worked quite well for several years until it didn't anymore.
- 
-00:32:23.740 --> 00:32:27.720
-And that's the reason why Zensical then came into being.
- 
-00:32:28.640 --> 00:32:35.960
-So the way it didn't work is that, like, just where you want to take it started to diverge from MKDocs
- 
-00:32:35.960 --> 00:32:40.720
-or you couldn't get your changes upstreamed or committed back?
- 
-00:32:40.720 --> 00:32:48.060
-So the thing was that MKDocs was not evolving as we needed it.
- 
-00:32:49.000 --> 00:32:52.600
-So historically, MKDocs had a sequence of single maintainers.
- 
-00:32:53.000 --> 00:32:59.240
-And as far as I know, all of them worked on it in their spare time because they had regular jobs.
- 
-00:32:59.940 --> 00:33:02.800
-And Material was evolving quickly because, you know, we had funding.
- 
-00:33:03.180 --> 00:33:05.700
-We could invest much more time in it.
- 
-00:33:05.700 --> 00:33:11.000
-Much more than, of course, an open source project that is only maintained in the spare time.
- 
-00:33:11.760 --> 00:33:12.960
-And so it was changing too slowly.
- 
-00:33:13.100 --> 00:33:19.780
-So we started a lot of discussions on necessary API changes because for many users, Material for MKDocs was MKDocs.
- 
-00:33:20.040 --> 00:33:27.660
-So we were kind of like the storefront where most of the issues and, like, bug reports and feature requests came in
- 
-00:33:27.660 --> 00:33:34.620
-because many people are using Material for MKDocs and with this MKDocs, basically.
- 
-00:33:35.700 --> 00:33:39.960
-And the main challenges that we faced were performance and plugin orchestration.
- 
-00:33:40.060 --> 00:33:41.280
-I mentioned I wrote 12 plugins.
- 
-00:33:41.940 --> 00:33:45.580
-And it's very hard to make them cooperate.
- 
-00:33:46.320 --> 00:33:52.940
-And if you look at any popular MKDocs plugins issue tracker, you will find issues that go something like,
- 
-00:33:53.340 --> 00:33:55.140
-well, this plugin is incompatible with this plugin.
- 
-00:33:55.700 --> 00:34:00.220
-Well, if I change the order of the plugins in the configuration, this and this happens.
- 
-00:34:00.220 --> 00:34:08.020
-So, and both of those problems were brought to us again and again by the users with which we talked.
- 
-00:34:08.360 --> 00:34:10.540
-And so, you know, it was coming up a lot.
- 
-00:34:11.260 --> 00:34:14.780
-Then suddenly after nine years, the original maintainer returned to MKDocs.
- 
-00:34:14.900 --> 00:34:18.340
-And we were super optimistic because the project was, like, maintained again.
- 
-00:34:18.420 --> 00:34:20.420
-He also started a sponsorship program.
- 
-00:34:20.580 --> 00:34:23.640
-We upstreamed some of our funding immediately and supported his work.
- 
-00:34:23.640 --> 00:34:28.240
-So, before, MKDocs had no way to sponsor them.
- 
-00:34:28.940 --> 00:34:33.640
-And the moment this went live, we immediately supported it.
- 
-00:34:34.200 --> 00:34:36.740
-And some PRs were finally merged and issues were closed.
- 
-00:34:37.000 --> 00:34:45.580
-But, yeah, then the work went silent and it started working basically in the quiet.
- 
-00:34:45.580 --> 00:34:48.600
-And three months later, we were invited to a video call.
- 
-00:34:49.380 --> 00:34:56.880
-So, we as maintainers from, so I as a maintainer for Material for MKDocs and some other key ecosystem maintainers.
- 
-00:34:57.240 --> 00:35:05.680
-And we learned that MKDocs, that the plans for MKDocs 2.0 were completely different from what existed.
- 
-00:35:05.680 --> 00:35:15.880
-So, what currently exists, MKDocs 1.x, which primarily means no plugin API and customization via templating alone.
- 
-00:35:16.320 --> 00:35:22.460
-So, we already knew this is not enough because that's what we've done the first four years where, as I mentioned, I was only doing the templating.
- 
-00:35:23.200 --> 00:35:26.140
-And some things you can't just do with templates.
- 
-00:35:26.140 --> 00:35:36.340
-For instance, having a tag support where you need to pull in different tags from different pages and then render them on another page or so.
- 
-00:35:36.400 --> 00:35:40.100
-So, you need synchronization efforts and you can't do this with templating.
- 
-00:35:40.820 --> 00:35:42.260
-By the way, all of this information is public.
- 
-00:35:42.540 --> 00:35:44.940
-So, you can read it on the MKDocs issue tracker.
- 
-00:35:45.360 --> 00:35:47.860
-So, yeah, I'm not telling anything secret or so.
- 
-00:35:48.540 --> 00:35:51.920
-Yeah, so it's a completely different direction than the one that we worked on.
- 
-00:35:51.920 --> 00:35:56.940
-And we raised objections in the call, but, yeah, still they were dismissed.
- 
-00:35:57.440 --> 00:36:02.900
-So, MKDocs 2.0, as it looks right now, is incompatible with Material for MKDocs.
- 
-00:36:03.220 --> 00:36:08.600
-300 plugins in the ecosystem will become useless and tens of thousands of projects will be affected for us.
- 
-00:36:08.760 --> 00:36:13.980
-So, we had absolutely no choice than to start building something.
- 
-00:36:13.980 --> 00:36:22.380
-So, to start make something of this, because at the time, we had already 50,000 projects, 50,000 public projects, depending on us.
- 
-00:36:22.900 --> 00:36:28.300
-We were talking to enterprise users and we knew that this number is much, much higher.
- 
-00:36:28.440 --> 00:36:33.100
-So, for instance, one of our professional users, they already also sponsored Material.
- 
-00:36:33.660 --> 00:36:37.760
-They have two and a half thousand projects internally.
- 
-00:36:37.760 --> 00:36:40.800
-So, only one company.
- 
-00:36:41.160 --> 00:36:52.760
-And they have a dedicated team of individuals that maintain their customizations on top of Material for MKDocs for all of the teams inside the company.
- 
-00:36:52.860 --> 00:36:53.620
-It's a very big company.
- 
-00:36:53.980 --> 00:36:58.320
-So, that's what you could infer from the...
- 
-00:36:58.320 --> 00:37:01.180
-I could believe it.
- 
-00:37:01.540 --> 00:37:03.040
-I couldn't believe it at all.
- 
-00:37:03.180 --> 00:37:04.340
-So, absolutely insane.
- 
-00:37:04.940 --> 00:37:07.320
-Yeah, so, as I mentioned, we had no choice.
- 
-00:37:07.320 --> 00:37:15.320
-So, what we did was we immediately went back to the drawing board with the learnings from the almost 10 years that passed since I started Material.
- 
-00:37:16.060 --> 00:37:20.440
-We built a lot of prototypes in TypeScript and Python, iterated on them.
- 
-00:37:20.540 --> 00:37:28.880
-We did a lot of conceptual work and realized within weeks what could actually be done with a radically different architecture.
- 
-00:37:28.880 --> 00:37:33.600
-Because writing 12 plugins, I know the ins and outs of MKDocs.
- 
-00:37:33.600 --> 00:37:41.720
-So, I had to do a lot of hacks, for instance, to make the blog plugin of Material work with the way navigation works in MKDocs.
- 
-00:37:41.720 --> 00:37:48.840
-And the number one complaint, as I mentioned, was MKDocs is slow and it doesn't scale.
- 
-00:37:49.000 --> 00:37:51.860
-So, like fixing a typo, you're doing a full rebuild.
- 
-00:37:52.040 --> 00:37:53.280
-And this can take minutes.
- 
-00:37:53.440 --> 00:37:57.200
-So, our design work centered exactly around this problem.
- 
-00:37:58.080 --> 00:38:03.940
-And after a short while, so, we knew exactly what MKDocs should look like.
- 
-00:38:03.940 --> 00:38:06.660
-And we didn't want to let our users down.
- 
-00:38:06.940 --> 00:38:09.200
-And so, in essence, we had two options.
- 
-00:38:09.880 --> 00:38:11.680
-We know what it should look like.
- 
-00:38:11.860 --> 00:38:14.580
-We could fork it or we could start from scratch.
- 
-00:38:15.300 --> 00:38:21.580
-And forking is not really possible because of the way how Python dependencies work.
- 
-00:38:21.680 --> 00:38:24.660
-So, all of the plugins have a dependency on MKDocs.
- 
-00:38:24.660 --> 00:38:28.860
-And this means that we would also need to fork all of the...
- 
-00:38:28.860 --> 00:38:35.160
-So, without doing black magic with imports, which might not be the best idea.
- 
-00:38:36.560 --> 00:38:41.880
-So, we would also need to fork all plugins or all plugins would need to switch to the fork.
- 
-00:38:41.980 --> 00:38:45.500
-So, this would be like moving an entire city at once.
- 
-00:38:45.800 --> 00:38:47.500
-And it's frankly impossible.
- 
-00:38:47.500 --> 00:38:56.900
-So, and if we would fork it, we wouldn't be able to realize our learnings that we gained in the groundwork that we did.
- 
-00:38:56.960 --> 00:38:58.280
-So, we had to start from scratch, actually.
- 
-00:38:58.340 --> 00:38:58.460
-Right.
- -00:38:58.540 --> 00:39:04.620 -Plus, you'd have to convince the entire community to at least create a parallel package. - -00:39:05.100 --> 00:39:12.500 -Because when you pip install that other plugin, it's going to say, hey, PyPI, I need MKDocs. - -00:39:13.520 --> 00:39:13.700 -Yeah. - -00:39:13.700 --> 00:39:18.380 -And now you'd need the forked version, you know, whatever that's going to be called, right? - -00:39:18.500 --> 00:39:22.040 -So, yeah, it would be a big battle, wouldn't it? - -00:39:22.600 --> 00:39:23.700 -Just technically with... - -00:39:24.580 --> 00:39:29.080 -Or you'd have to move the community, which is a very challenging thing to do. - -00:39:30.180 --> 00:39:30.580 -Yeah. - -00:39:30.740 --> 00:39:36.740 -And so, for us, the most sensible thing was to just, you know, we just start from scratch. - -00:39:37.080 --> 00:39:38.980 -We make it as compatible as possible. - -00:39:38.980 --> 00:39:45.620 -It became quite clear very quickly that we need to optimize for compatibility. - -00:39:45.960 --> 00:39:57.620 -Because if you create something that is not compatible and that forces users to migrate documentation manually and to do a lot of work to get over to something else, you won't get a lot of adoption. - -00:39:58.360 --> 00:40:02.160 -So, all you got to do is think about that 2,500 project team. - -00:40:02.440 --> 00:40:02.780 -Like, okay. - -00:40:03.420 --> 00:40:03.840 -Exactly. - -00:40:03.980 --> 00:40:05.600 -How do I keep them working with this, right? - -00:40:06.140 --> 00:40:06.460 -Yes. - -00:40:06.660 --> 00:40:06.860 -Yes. - -00:40:06.860 --> 00:40:06.940 -Yes. - -00:40:07.560 --> 00:40:07.780 -Yeah. - -00:40:07.840 --> 00:40:13.300 -So, what we then did is we had an idea how it should look. - -00:40:14.120 --> 00:40:17.200 -Then we started with Rust because it was recommended to us. - -00:40:17.300 --> 00:40:19.080 -So, it was very hard at first. 
- 
-00:40:19.820 --> 00:40:25.120
-And in total, it took us 16 months to build all of this.
- 
-00:40:25.340 --> 00:40:26.840
-But it was not only writing code.
- 
-00:40:26.960 --> 00:40:29.860
-It was also exactly knowing where we want to go.
- 
-00:40:29.860 --> 00:40:32.620
-Because, you know, we're starting fresh.
- 
-00:40:32.620 --> 00:40:40.420
-So, we better be sure that we are going into a direction where we actually want to go for the next 10 to 20 to 30 years.
- 
-00:40:41.080 --> 00:40:41.480
-Depends.
- 
-00:40:41.780 --> 00:40:41.720
-Yeah.
- 
-00:40:41.720 --> 00:40:44.760
-We are really in this for the long game.
- 
-00:40:45.460 --> 00:40:50.060
-So, the 10 years that I've been doing this, I see that this is only the start.
- 
-00:40:50.640 --> 00:40:52.700
-And we wrote a lot of things from scratch.
- 
-00:40:52.700 --> 00:40:56.580
-So, the runtime, as I mentioned, it's like the heart of Zensical.
- 
-00:40:57.160 --> 00:41:00.160
-It already has something like 15,000 lines of code.
- 
-00:41:00.720 --> 00:41:05.100
-A tiny HTTP middleware framework for file serving because we don't want to.
- 
-00:41:05.160 --> 00:41:13.780
-So, we also want to make the file server extensible and don't want to force users into async Rust and also don't have a dependency on Tokio.
- 
-00:41:13.780 --> 00:41:22.040
-And also like a monorepo management tool for Rust and JavaScript that we also open sourced, which I'm not sure if you've worked with monorepos.
- 
-00:41:22.280 --> 00:41:26.840
-But in JavaScript, for instance, there's Lerna and it has 800 dependencies.
- 
-00:41:27.100 --> 00:41:30.520
-So, when you install it, what you pull down is just insane.
- 
-00:41:31.320 --> 00:41:39.340
-So, we worked a lot on the processes as well that we can make releases very easy and that we have a good way of working, basically.
- 
-00:41:39.460 --> 00:41:42.380
-And we're very careful about our choice of dependencies.
- -00:41:42.380 --> 00:41:46.600 -So, if it's not something that – let me put it another way. - -00:41:46.620 --> 00:41:55.480 -If it's something that you can write quite quickly, actually, and we rather own in order to make changes ourselves, we rather write it from scratch. - -00:41:56.360 --> 00:41:58.400 -I think that's a very healthy philosophy. - -00:41:59.240 --> 00:42:07.660 -And also, I think this agentic AI world that we're in these days, if you just need one or two functions and you used to think, - -00:42:07.660 --> 00:42:12.880 -well, maybe I'll lean on this, in your case, a crate or maybe a PyPI package or something. - -00:42:13.940 --> 00:42:18.660 -But if it's just one or two functions, maybe you really can just write it yourself without much effort. - -00:42:18.940 --> 00:42:21.960 -And it just – it saves you so much trouble, you know. - -00:42:21.960 --> 00:42:27.380 -So, I started using pip-audit for a lot of my projects. - -00:42:28.680 --> 00:42:40.500 -And I would say for my bigger projects, every two weeks, I get at least one CVE vulnerability notification for something I'm using. - -00:42:40.700 --> 00:42:41.960 -I'm like – - -00:42:41.960 --> 00:42:43.400 -But here's the thing. - -00:42:43.580 --> 00:42:50.160 -It's in a situation of that – probably a piece of code or functionality of that package that I don't even use or care about. - -00:42:50.580 --> 00:42:52.120 -So, it doesn't really apply to me. - -00:42:52.440 --> 00:42:56.500 -But then I've got all these, like, here's an issue – like, a latent issue that is in my code. - -00:42:56.880 --> 00:42:57.240 -Yeah. - -00:42:57.320 --> 00:42:59.120 -I'm going to have to figure out and deal with. - -00:42:59.360 --> 00:43:06.280 -But it's because I've taken in so much as part of this package where if I had just written the one or two functions, then it'd be fine. - -00:43:06.400 --> 00:43:06.920 -You know what I mean? - -00:43:06.920 --> 00:43:07.400 -Yeah. 
- -00:43:07.880 --> 00:43:08.400 -Absolutely. - -00:43:08.400 --> 00:43:16.440 -I think there's – I think things are swinging back a little bit from, like, let's just pull in everything because it's going to help us to, like, well, maybe not everything. - -00:43:17.120 --> 00:43:17.360 -Yeah. - -00:43:17.820 --> 00:43:26.400 -And also, you're not – you can't just change things easily and you depend on other APIs. - -00:43:26.400 --> 00:43:34.280 -So, for instance, one of the reasons why we choose to build a lot of things from scratch is that we want to control the public APIs. - -00:43:34.280 --> 00:43:43.100 -So, the worst thing for us would probably just be to export a third-party API that we're using as part of our public interface because it's Rust. - -00:43:43.320 --> 00:43:48.320 -So, it would mean that if this public API would change, the entire ecosystem would break. - -00:43:48.480 --> 00:43:58.820 -So, we're very careful what APIs we expose and rather wrap it in order to be safe so we can replace things. - -00:43:58.820 --> 00:43:59.000 -I see. - -00:43:59.340 --> 00:44:00.360 -Keep things replaceable. - -00:44:00.360 --> 00:44:13.160 -So, maybe you have the philosophy of it might be okay to use this crate, but we don't exchange its types as the public as part of our API or something along those lines. - -00:44:13.640 --> 00:44:14.640 -Yeah, we don't expose it. - -00:44:15.100 --> 00:44:24.620 -So, we – in some instance, the wrappers that I've wrote are identical to the types that we use from another crate. - -00:44:24.620 --> 00:44:31.040 -But by using our own types or just wrapping them, because in Rust, the nice benefit is you have zero-cost abstraction. - -00:44:31.260 --> 00:44:33.200 -So, all the code is monomorphized in line. - -00:44:33.280 --> 00:44:36.320 -So, you don't pay for wrapping code. - -00:44:36.820 --> 00:44:38.560 -That's the absolute crazy thing. 
- 
-00:44:38.920 --> 00:44:46.180
-So, you can finally create a really clean architecture without runtime penalties if you do it right.
- 
-00:44:46.700 --> 00:44:47.300
-Oh, that's wild.
- 
-00:44:47.400 --> 00:44:47.560
-Yeah.
- 
-00:44:47.720 --> 00:44:47.940
-Yeah.
- 
-00:44:47.940 --> 00:44:48.080
-Yeah.
- 
-00:44:48.080 --> 00:44:48.580
-Very interesting.
- 
-00:44:48.580 --> 00:44:57.000
-So, you can see I have this huge list of topics we've basically just barely cracked the surface of.
- 
-00:44:57.440 --> 00:45:00.080
-But I'd like to go back to these components.
- 
-00:45:02.740 --> 00:45:05.000
-Wrong search there.
- 
-00:45:06.380 --> 00:45:08.080
-You have components.
- 
-00:45:08.860 --> 00:45:09.900
-That was in the other part, wasn't it?
- 
-00:45:09.900 --> 00:45:13.300
-Let's just talk down – talk through some of these things here.
- 
-00:45:13.300 --> 00:45:18.040
-So, you've got, like, admonitions, buttons, code blocks.
- 
-00:45:18.140 --> 00:45:21.620
-Like, let's talk through some of the building blocks, I guess, that you think are interesting here.
- 
-00:45:23.040 --> 00:45:23.280
-Yeah.
- 
-00:45:23.360 --> 00:45:31.720
-So, I think most of the – so, if you're not new to technical writing, most of the stuff shouldn't be quite new.
- 
-00:45:31.820 --> 00:45:34.260
-So, like, admonitions, code blocks, stuff like that.
- 
-00:45:34.260 --> 00:45:35.760
-You've probably seen our data tables.
- 
-00:45:35.760 --> 00:45:41.360
-Diagrams are just Mermaid diagrams, as they are – as you can use them on GitHub.
- 
-00:45:42.400 --> 00:45:52.540
-One of the – so, like, the flagship features in Material, and now Zensical, as I mentioned, like, code annotations, which is a part of code blocks.
- 
-00:45:53.860 --> 00:45:57.480
-Otherwise, we also have an icon, an emoji integration.
- 
-00:45:57.480 --> 00:46:04.680
-So, you can use one of – I think we have something like over 10,000 icons now with a quite simple syntax.
- 
-00:46:05.120 --> 00:46:06.180
-That's not standard Markdown.
- 
-00:46:06.320 --> 00:46:06.840
-That's the problem.
- 
-00:46:06.940 --> 00:46:08.400
-So, that's, like, a Python Markdown extension.
- 
-00:46:09.240 --> 00:46:16.200
-And we're working on moving this over to CommonMark and finding a way to migrate this over.
- 
-00:46:17.140 --> 00:46:27.280
-Because, you know, right now, it's – Zensical uses Python Markdown for compatibility with Material for MKDocs, which means that for Markdown rendering, we need to go through Python.
- 
-00:46:28.140 --> 00:46:35.600
-And this is a temporary limitation that we have because I mentioned we are focusing really hard on compatibility.
- 
-00:46:37.060 --> 00:46:44.740
-And all of those components will also, of course, be available within our CommonMark solution that we're working on that we will ship later this year.
- 
-00:46:45.980 --> 00:46:46.380
-Yeah.
- 
-00:46:46.980 --> 00:46:51.480
-But, yeah, right now, of course, you can use them as they're mentioned on our documentation.
- 
-00:46:51.620 --> 00:46:55.260
-And we will, of course, provide automated tooling to get them over to CommonMark.
- 
-00:46:55.260 --> 00:46:55.700
-Yeah.
- 
-00:46:57.120 --> 00:46:57.560
-Yeah.
- 
-00:46:57.680 --> 00:47:11.160
-I guess it's interesting that you've got to not just consider the API and the syntax and stuff, but maybe even the same parsing engine to have this strong compatibility, right?
- 
-00:47:11.880 --> 00:47:12.040
-Yeah.
- 
-00:47:12.040 --> 00:47:14.040
-We can even read the mkdocs.yml configuration.
- 
-00:47:14.360 --> 00:47:17.260
-So, you can build an MKDocs project with Zensical as it stands.
- 
-00:47:17.840 --> 00:47:23.040
-The thing that we currently don't support in its entirety is the plugins from the ecosystem.
- 
-00:47:23.040 --> 00:47:26.440
-We already support some plugins.
- 
-00:47:26.940 --> 00:47:28.640
-For instance, the mkdocstrings plugin.
- 
-00:47:29.120 --> 00:47:36.220
-The author is also part of the Zensical team now with mkdocstrings being the second biggest project in the MKDocs space.
- 
-00:47:36.220 --> 00:47:38.220
-So, we're very happy to have Tim on board.
- 
-00:47:39.980 --> 00:47:41.540
-And several other plugins.
- 
-00:47:41.800 --> 00:47:44.920
-But, as I mentioned, so Zensical uses modules.
- 
-00:47:45.260 --> 00:47:54.280
-So, what we will do in the end is we will still always be able to read MKDocs configuration and map the plugin configurations to equivalent Zensical modules.
- 
-00:47:54.280 --> 00:48:01.720
-So, the logic will be completely rewritten, but you will be able to migrate your project with a command.
- 
-00:48:02.820 --> 00:48:04.160
-That's our goal.
- 
-00:48:04.240 --> 00:48:11.680
-Because, you know, there has so much work been going on into projects built with Material and MKDocs.
- 
-00:48:11.760 --> 00:48:16.200
-So, we need to make it easy for users and organizations to switch.
- 
-00:48:16.640 --> 00:48:22.300
-And this is the main part we're working on in 2026.
- 
-00:48:22.300 --> 00:48:25.360
-I think this is critical, right?
- 
-00:48:26.020 --> 00:48:26.420
-Yeah.
- 
-00:48:26.880 --> 00:48:35.920
-Your absolute best users, you know, like that big company, but many others, of course, they're not going to rewrite everything.
- 
-00:48:36.000 --> 00:48:36.720
-Well, maybe they will.
- 
-00:48:36.840 --> 00:48:38.700
-But many of them won't rewrite everything.
- 
-00:48:38.840 --> 00:48:42.700
-They'll just use an old version and grin and bear it as long as they have to.
- 
-00:48:42.800 --> 00:48:43.320
-You know what I mean?
- 
-00:48:43.360 --> 00:48:47.480
-Like this idea of doing it from scratch.
- 
-00:48:47.480 --> 00:48:54.800
-But if you provide a path for them that's very easy, then all of a sudden they get this way better experience, right?
- 
-00:48:54.860 --> 00:48:59.060
-I can only imagine, you know, the build speed helping out the bigger projects the most.
- 
-00:49:00.180 --> 00:49:00.380
-Yeah.
- 
-00:49:00.620 --> 00:49:03.840
-And the compatibility part is one of the hardest engineering parts, actually.
- 
-00:49:04.260 --> 00:49:09.560
-So, that you have to think about that, you know, because we don't want to paint ourselves into a corner.
- 
-00:49:09.560 --> 00:49:22.640
-So, we need to think about where do we want to go, but how can we go there faster right now without making sacrifices in a way that we can't, in the end, replace things.
- 
-00:49:22.880 --> 00:49:25.140
-And we have a pretty elaborate plan how to do all of this.
- 
-00:49:25.880 --> 00:49:26.380
-And, yeah.
- 
-00:49:26.780 --> 00:49:29.660
-So, we're working very hard on it to make it.
- 
-00:49:30.040 --> 00:49:32.200
-So, right now, you can just use Material, of course.
- 
-00:49:32.340 --> 00:49:33.400
-You can keep using it.
- 
-00:49:33.400 --> 00:49:38.480
-Or if your site already builds in Zensical, you will have better speed and the modern design and the better search.
- 
-00:49:38.680 --> 00:49:43.900
-So, the search has been completely rewritten from Material to Zensical.
- 
-00:49:44.140 --> 00:49:46.060
-It's also, it's currently integrated.
- 
-00:49:46.240 --> 00:49:47.560
-It's integrated with Zensical.
- 
-00:49:48.020 --> 00:49:51.960
-And we will open source it as a dedicated open source project.
- 
-00:49:52.880 --> 00:49:54.020
-It's called Disco.
- 
-00:49:54.620 --> 00:49:57.040
-So, you will also be able to use the search in other projects.
- 
-00:49:57.040 --> 00:50:03.140
-And just as a number, to get a feel for it, it's 20 times faster than the search in Material for MKDocs.
- 
-00:50:03.400 --> 00:50:03.920
-Wow.
- 
-00:50:04.020 --> 00:50:05.520
-So, it's a ground-up rewrite.
- 
-00:50:06.140 --> 00:50:10.360
-And we actually started working on the search before we started working on Zensical.
- 
-00:50:10.940 --> 00:50:15.680
-Yeah, I noticed how nice the search was when I was playing with it.
- 
-00:50:16.900 --> 00:50:17.600
-We're in.
- 
-00:50:19.760 --> 00:50:23.800
-So, is Zensical.org itself built in Zensical?
- 
-00:50:24.560 --> 00:50:25.100
-Yeah, of course.
- 
-00:50:25.500 --> 00:50:28.880
-And it's actually built with an mkdocs.yml because we're dogfooding.
- 
-00:50:28.880 --> 00:50:36.060
-So, you can also build it with MKDocs, with Material for MKDocs.
- 
-00:50:36.200 --> 00:50:37.560
-The project layout is exactly the same.
- 
-00:50:38.260 --> 00:50:38.400
-Yeah.
- 
-00:50:39.080 --> 00:50:47.840
-You know, I find that there's just a bunch of static sites that seem to have, I don't know what's going on with them, but their search is really bad.
- 
-00:50:47.840 --> 00:50:57.040
-And, you know, either they've just integrated some kind of Google thing where it says site colon and they use your URL and then the search, which is a real bad experience.
- 
-00:50:57.220 --> 00:51:02.440
-Or you go search and it sits there and it spins and it spins and then eventually it pulls up.
- 
-00:51:02.440 --> 00:51:10.540
-So, it looks like you are pre-computing these types of things or something with your search engine or you've got some cool data structure to make that fast, right?
- 
-00:51:11.520 --> 00:51:16.720
-Well, it's not one cool data structure that would be great because then everybody could just use it.
- 
-00:51:17.040 --> 00:51:17.840
-But, no.
- 
-00:51:18.280 --> 00:51:19.380
-A series of algorithms.
- 
-00:51:19.700 --> 00:51:24.380
-Several months of work went into the search.
- 
-00:51:24.800 --> 00:51:25.200
-Of course.
- 
-00:51:25.200 --> 00:51:30.020
-So, it's a project of its own, as I mentioned.
- 
-00:51:30.160 --> 00:51:31.640
-It's also completely modular.
- 
-00:51:32.220 --> 00:51:47.240
-And the reason why most of the search engines that are out there, that are open source, so like the libraries that you can use, not services you have to pay for, that they don't provide results that are really relevant, is that they use BM25,
- 
-00:51:47.240 --> 00:51:54.940
-which is like the standard bag of words ranking algorithm for information retrieval.
- 
-00:51:55.200 --> 00:51:57.780
-And this doesn't nicely pair with autocomplete.
- 
-00:51:57.920 --> 00:52:01.220
-So, what you get is you start typing and you get a lot of dancing results.
- 
-00:52:02.220 --> 00:52:12.980
-And also, if you add further documents to your index, the balancing will be off because the relevance is computed based on the occurrence of a word in the entire corpus.
- 
-00:52:13.220 --> 00:52:16.360
-So, you add a new document, those weights change again.
- 
-00:52:16.360 --> 00:52:21.940
-So, the search that we have, we, of course, as a baseline also have a BM25 implementation.
- 
-00:52:22.400 --> 00:52:28.200
-But the implementation you're seeing is a tie-breaking implementation, which provides much, much better accuracy.
- 
-00:52:28.940 --> 00:52:30.060
-And you can configure it.
- 
-00:52:30.060 --> 00:52:39.720
-So, tie-breaking means, okay, we first look into the title of the document and see if we have matches, then how many matches, then where they are.
- 
-00:52:39.840 --> 00:52:45.040
-Then we look into the path and then in the body of the document and so on.
- 
-00:52:45.240 --> 00:52:46.320
-All of this is configurable.
- 
-00:52:46.320 --> 00:52:54.280
-And this is also why we believe that Disco alone will also be a very interesting project for other, for instance, static site generators to integrate.
- 
-00:52:55.300 --> 00:52:57.360
-And you asked about, like, pre-computing.
- 
-00:52:57.520 --> 00:53:01.960
-So, no, this is a search from the documents.
- 
-00:53:02.060 --> 00:53:09.060
-We build a search index, which is a stripped-down version of the HTML that is rendered when you load the page.
- 
-00:53:09.060 --> 00:53:12.040
-It's one JSON that we ship to the client.
- 
-00:53:12.460 --> 00:53:15.480
-And for most pages, actually, this JSON is below one megabyte.
- 
-00:53:15.580 --> 00:53:16.780
-You can gzip it.
- 
-00:53:17.000 --> 00:53:17.960
-So, compress it.
- 
-00:53:18.460 --> 00:53:19.840
-Then it's something like 200K.
- 
-00:53:20.040 --> 00:53:24.400
-And you have extremely fast search on the client with no cost.
- 
-00:53:25.480 --> 00:53:33.520
-And so, we believe that for 90, 95, maybe 99% of documentation sites or sites in general,
- 
-00:53:33.520 --> 00:53:41.640
-this client-side search is basically the way to go because it's fast and it doesn't require you to pay for anything.
- 
-00:53:41.760 --> 00:53:48.480
-And there are several SaaS-based services that can be extremely expensive when you do the math.
- 
-00:53:49.380 --> 00:53:57.260
-So, yeah, you only need to use a server, basically, when the index becomes too big to ship to the client.
- 
-00:53:57.860 --> 00:53:59.000
-And we're also working on that, by the way.
- 
-00:53:59.600 --> 00:54:00.660
-Okay, that's really cool.
- 
-00:54:00.660 --> 00:54:05.580
-You could shard the index or something like that, right, I suppose?
- 
-00:54:05.800 --> 00:54:10.300
-Like, you could say, we're going to have 26 index bits.
- 
-00:54:10.460 --> 00:54:14.920
-And only if the word starts with an A do you pull that piece down or something.
- 
-00:54:15.240 --> 00:54:17.660
-But, yeah, a lot of cool aspects.
- 
-00:54:18.480 --> 00:54:18.580
-Yeah.
- 
-00:54:19.360 --> 00:54:20.540
-It's not that simple.
- 
-00:54:21.060 --> 00:54:25.040
-But there are also some other interesting solutions.
- 
-00:54:25.200 --> 00:54:26.940
-Like, Pagefind is a pretty interesting library.
- 
-00:54:27.040 --> 00:54:28.960
-It does a completely different approach.
- -00:54:28.960 --> 00:54:36.340 -But it's not as snappy as the search that we ship to the client. - -00:54:36.840 --> 00:54:37.020 -Yeah. - -00:54:37.260 --> 00:54:40.040 -I use PageFind for my personal website, which is a static site. - -00:54:40.600 --> 00:54:40.820 -Yeah. - -00:54:41.340 --> 00:54:42.620 -It's also a great, great solution. - -00:54:42.840 --> 00:54:47.580 -But some things you won't be able to implement in PageFind properly. - -00:54:48.440 --> 00:54:48.840 -Sure. - -00:54:48.840 --> 00:54:52.180 -So, it's, you know, it's with software, it's trade-offs all the way. - -00:54:52.880 --> 00:54:56.340 -Well, I'm already thinking, like, I better pay attention to disco when it comes out. - -00:54:56.600 --> 00:54:59.500 -So, maybe adopt it for some stuff. - -00:55:00.500 --> 00:55:00.980 -Beautiful. - -00:55:01.180 --> 00:55:01.360 -Okay. - -00:55:01.360 --> 00:55:01.400 -Okay. - -00:55:01.800 --> 00:55:08.240 -We got a couple interesting questions sort of following up from the component side of things. - -00:55:09.280 --> 00:55:13.300 -Jamstack says, do you foresee community-led templates or themes for Zensicle? - -00:55:14.380 --> 00:55:19.640 -I know you have, like, two themes that I see something along those lines, a couple of themes that you can choose now. - -00:55:19.640 --> 00:55:22.580 -But what is the theme story, I guess? - -00:55:22.920 --> 00:55:24.200 -I want to ask you more broadly. - -00:55:25.940 --> 00:55:26.340 -Yeah. - -00:55:26.460 --> 00:55:27.120 -So, absolutely. - -00:55:27.480 --> 00:55:30.240 -So, right now, we have only this one theme. - -00:55:30.580 --> 00:55:38.520 -We have this variant setting where you can choose, like, the classic variant, which is when you move over from material for MKDocs. - -00:55:38.600 --> 00:55:39.800 -It looks exactly the same. 
- -00:55:39.800 --> 00:55:47.960 -This is also why we need it to keep the HTML as it is, also with the modern design that we provided, and the modern variant, which is the standard for Zensicle. - -00:55:47.960 --> 00:56:02.300 -Once we move to the component system, we will make it possible to, one, use components within Markdown, and, two, also create a template engine that is based on components. - -00:56:02.300 --> 00:56:13.900 -This will allow us much, much faster rendering, because, for instance, if you render the header for a site, it's a lot of HTML, because, you know, there's the search box in it and some other stuff. - -00:56:14.100 --> 00:56:15.260 -But only the title changes. - -00:56:15.400 --> 00:56:18.740 -So, we will also make the rendering differential as part of the build. - -00:56:18.840 --> 00:56:19.780 -That's the plan. - -00:56:20.240 --> 00:56:24.460 -And with this, we will also make it open to theme developers, of course. - -00:56:24.460 --> 00:56:34.100 -So, there will be the, like, packaging, for instance, compilation of ZaaS styles or TypeScript or so will be part of Zensicle. - -00:56:34.380 --> 00:56:40.200 -So, you don't need to pre-compile the theme like we need to do for, like, the last 10 years for material. - -00:56:41.120 --> 00:56:42.980 -So, it will have a proper asset pipeline. - -00:56:43.100 --> 00:56:45.500 -It will have a proper process to install themes. - -00:56:45.800 --> 00:56:46.600 -All of this is planned. - -00:56:46.600 --> 00:56:49.140 -But right now, we focus on feature parity. - -00:56:49.140 --> 00:56:54.380 -So, in order to make it possible for more users to migrate right now. - -00:56:55.180 --> 00:57:07.020 -That's really interesting that you would deliver the theme as, basically, its original source, not its rendered, you know, compiled or transpiled version, right? - -00:57:07.420 --> 00:57:11.400 -To keep it, I guess, a part of the Zensicle build step, right? - -00:57:11.400 --> 00:57:13.320 -Yes, exactly. 
- -00:57:13.460 --> 00:57:20.260 -Because we had a lot of requests for something like, hey, can we change the media queries a little bit? - -00:57:20.300 --> 00:57:24.720 -Because the sidebar disappears too early for my taste. - -00:57:25.300 --> 00:57:27.040 -And this is not... - -00:57:27.040 --> 00:57:33.020 -So, for this, you have to go through the compilation step again and, basically, fork the theme and recompile it. - -00:57:33.020 --> 00:57:37.020 -We want to make this configurable so that you can use... - -00:57:37.980 --> 00:57:44.360 -Yeah, so, you know, configure the theme and build it and it just works. - -00:57:44.700 --> 00:57:46.520 -So, this, like, you know, it just works. - -00:57:46.640 --> 00:57:48.460 -That's, like, the thing we're working towards. - -00:57:48.900 --> 00:57:50.100 -Make it as simple as possible. - -00:57:50.900 --> 00:57:51.040 -Yeah. - -00:57:51.540 --> 00:57:52.380 -Yeah, very cool. - -00:57:53.380 --> 00:57:54.020 -Let's maybe... - -00:57:54.560 --> 00:57:55.760 -I'm getting short on time here. - -00:57:55.780 --> 00:58:00.460 -Maybe wrap up our chat talking about two things. - -00:58:00.460 --> 00:58:01.680 -The future. - -00:58:02.900 --> 00:58:03.760 -Where are you going? - -00:58:03.860 --> 00:58:10.740 -You talked about compatibility being a big part of things going forward in 2026. - -00:58:10.960 --> 00:58:13.840 -But also sustainability, right? - -00:58:14.240 --> 00:58:23.860 -You had all these great supporters for material for MKDocs, which you must have just been absolutely thrilled to realize how successful that was, right? - -00:58:23.960 --> 00:58:28.820 -I mean, going from the wall, put up a wish list, and then, actually, people love this. - -00:58:28.820 --> 00:58:30.680 -I can put all my energy into it. - -00:58:30.760 --> 00:58:32.500 -I mean, I know how great of a feeling that is, right? - -00:58:33.360 --> 00:58:34.360 -That's completely insane. - -00:58:34.720 --> 00:58:35.260 -And I would... 
- -00:58:35.260 --> 00:58:35.280 -Yeah. - -00:58:35.460 --> 00:58:40.040 -When I started it, I would never believe that this would be my job at some point. - -00:58:40.660 --> 00:58:42.060 -Yeah, I feel the same way about the podcast. - -00:58:42.540 --> 00:58:43.460 -And it's just... - -00:58:43.460 --> 00:58:44.320 -I'm so grateful for it. - -00:58:44.340 --> 00:58:44.800 -It's amazing. - -00:58:45.580 --> 00:58:45.700 -Yeah. - -00:58:45.700 --> 00:58:46.020 -I can imagine. - -00:58:46.740 --> 00:58:47.000 -Yeah. - -00:58:47.000 --> 00:58:53.120 -But then with this transition to Zensicle, how does that change? - -00:58:53.220 --> 00:58:54.120 -Does that change anything? - -00:58:54.500 --> 00:58:55.660 -Or what's the story? - -00:58:56.500 --> 00:58:57.080 -Yeah, so... - -00:58:57.080 --> 00:58:58.640 -How do you bring that support over to Zensicle? - -00:58:58.640 --> 00:59:03.440 -Well, as we don't have a lot of time, I try to explain it as compact as possible. - -00:59:03.640 --> 00:59:09.440 -So, we are saying goodbye to this pay for features, pay for extra features. - -00:59:09.800 --> 00:59:15.400 -So, in material, you needed to be a sponsor in order to get the latest features earlier. - -00:59:15.660 --> 00:59:18.000 -What we will do is everything is open source from the start. - -00:59:18.000 --> 00:59:20.360 -So, for users, it's completely free. - -00:59:21.100 --> 00:59:26.880 -And we are shifting our model from the sponsorships to something we call Zensicle Spark. - -00:59:27.300 --> 00:59:32.240 -Because what we discovered, talking a lot to our professional users, is that the more we - -00:59:32.240 --> 00:59:36.740 -know about the problem space, and the better we understand the problem space, and the more - -00:59:36.740 --> 00:59:39.900 -we can collaborate with them, the more we can... - -00:59:39.900 --> 00:59:42.200 -The better degrees of freedom we can provide. 
- -00:59:42.200 --> 00:59:45.600 -So, we don't intend to just chip feature, feature, feature. - -00:59:45.600 --> 00:59:52.080 -But we intend to create degrees of freedom, so that you can adapt Zensicle to the processes - -00:59:52.080 --> 00:59:56.220 -within your organization, how they work, to the workflows, etc., which are all different, - -00:59:56.360 --> 00:59:58.480 -which is all very diverse, basically. - -00:59:59.040 --> 01:00:04.060 -So, Spark is a space where you, as a company, can basically get a seat. - -01:00:04.720 --> 01:00:08.720 -And together with us, Shape Zensicle is part of high-level discussions, where we explore - -01:00:08.720 --> 01:00:10.000 -the problem space. - -01:00:10.460 --> 01:00:11.300 -We create proposals. - -01:00:11.300 --> 01:00:13.940 -So, on the website, you will have clicked on the Spark section. - -01:00:13.940 --> 01:00:15.620 -There's this Zaps in progress. - -01:00:15.780 --> 01:00:18.660 -We call them Zaps, Zensicle Advancement Proposals. - -01:00:18.920 --> 01:00:19.620 -It's on the left side. - -01:00:20.320 --> 01:00:26.020 -We write very elaborate, detailed proposals on specific topics that we intend to work on. - -01:00:26.640 --> 01:00:32.780 -And then, with the feedback that we get, iterate on them and create an authoring... - -01:00:32.780 --> 01:00:37.860 -Like, the ideal authoring experience that caters to the most cases possible. - -01:00:37.860 --> 01:00:42.560 -Because we want to build Zensicle, as I mentioned, for the very long term. - -01:00:42.900 --> 01:00:46.740 -And not just a solution that is opinionated, but that is as unopinionated as possible. - -01:00:47.560 --> 01:00:53.380 -And the third thing that you get, besides the opportunity to discuss high-level discussions - -01:00:53.380 --> 01:00:57.840 -with us and create the proposals with us, is, of course, professional support. - -01:00:57.840 --> 01:01:01.220 -So, this is also, we've been asking, we've been asked for quite a lot by companies. 
- -01:01:02.080 --> 01:01:07.940 -So, in Spark, you, yeah, you can basically get our time. - -01:01:08.080 --> 01:01:11.240 -You can, we will, you can get direct access to the team. - -01:01:11.240 --> 01:01:17.920 -And also, we have, like, those open video calls where we share our progress and where you can get a window of support. - -01:01:18.060 --> 01:01:21.900 -And we talk about any problem that is keeping you up at night, basically. - -01:01:22.280 --> 01:01:27.240 -And stuff like migrations or how do you do this and this in Zensicle. - -01:01:27.740 --> 01:01:30.040 -And, yeah, it's been a blast. - -01:01:30.360 --> 01:01:34.700 -So, we're really happy that the organizations are enrolling into this new model. - -01:01:34.700 --> 01:01:41.880 -And we think it could also be a model that might translate quite well to other projects because you get a huge competitive advantage. - -01:01:42.040 --> 01:01:43.500 -You know exactly what to build. - -01:01:44.820 --> 01:01:47.800 -Yeah, you're on, you're talking to the actual users. - -01:01:48.160 --> 01:01:51.140 -They're saying, this is the thing that really is hard for us. - -01:01:51.220 --> 01:01:54.240 -Or you just get, maybe they don't say it, but you see it, right? - -01:01:55.160 --> 01:01:56.100 -Exactly, yes, yes. - -01:01:56.220 --> 01:01:58.320 -And talking to the users is the best thing you can do. - -01:01:58.320 --> 01:02:05.900 -So, what we learned from those, from the many times we talked to them is always something like, wow, we never would have come up with this. - -01:02:07.280 --> 01:02:08.280 -Yeah, incredible. - -01:02:09.060 --> 01:02:16.180 -Well, congratulations on the success for Material for MKDocs and then this new project. - -01:02:16.360 --> 01:02:18.140 -I'm very excited to see it coming along. - -01:02:18.380 --> 01:02:20.380 -And it looks like it's going to be great. - -01:02:21.420 --> 01:02:23.300 -Maybe a final call to action for people. 
- -01:02:23.680 --> 01:02:26.120 -Like, can they go ahead and start using Zensicle? - -01:02:26.120 --> 01:02:27.760 -If they're interested, what do they do? - -01:02:28.000 --> 01:02:28.340 -So on. - -01:02:30.760 --> 01:02:31.280 -Yeah, of course. - -01:02:31.660 --> 01:02:35.560 -So, you can, so, we mentioned Material for MKDocs a lot. - -01:02:35.640 --> 01:02:39.360 -And this is because we are coming from this direction. - -01:02:39.640 --> 01:02:44.420 -So, it means if you have a Material for MKDocs project, you should definitely try out Zensicle and see if you can build your project. - -01:02:44.540 --> 01:02:48.380 -But if you haven't used it, you can also just jumpstart a new project. - -01:02:48.640 --> 01:02:50.580 -It has a lot of built-in functionality already. - -01:02:50.580 --> 01:02:59.320 -You get, like, all of these components that we talked about, free search that you don't have to host, a very modern static site that is great on mobile. - -01:02:59.640 --> 01:03:01.000 -So, just give it a try. - -01:03:01.600 --> 01:03:02.980 -And we have a newsletter. - -01:03:03.340 --> 01:03:06.600 -So, where we, once a month, share the latest updates. - -01:03:07.000 --> 01:03:10.160 -And that might also be worth checking out. - -01:03:10.160 --> 01:03:15.980 -But, yeah, and otherwise, we'd be happy to see you, to get any feedback. - -01:03:16.540 --> 01:03:21.420 -By the way, we also have a public Discord, a community Discord, which is growing very well. - -01:03:21.720 --> 01:03:24.620 -So, if you have any problems or so, then you will get help there. - -01:03:25.280 --> 01:03:25.400 -Yeah. - -01:03:26.260 --> 01:03:34.620 -Would be great to see as many users as possible, of course, and shape the future of Zensicle together with all of you. - -01:03:34.620 --> 01:03:35.020 -Yeah. - -01:03:35.460 --> 01:03:35.660 -Yeah. - -01:03:36.600 --> 01:03:36.960 -Fantastic. - -01:03:37.580 --> 01:03:38.620 -Martin, thanks for coming on the show. 
- -01:03:39.400 --> 01:03:40.220 -Congrats on the project. - -01:03:41.300 --> 01:03:42.280 -Thanks for the invitation. - -01:03:42.600 --> 01:03:45.300 -And happy any time to come back. - -01:03:46.020 --> 01:03:46.320 -Yeah. - -01:03:46.420 --> 01:03:46.960 -Sounds good. - -01:03:46.960 --> 01:03:47.020 -Yeah. - -01:03:47.020 --> 01:03:47.000 -Sounds good. diff --git a/youtube_transcripts/542-zensical.vtt b/youtube_transcripts/542-zensical.vtt new file mode 100644 index 0000000..f16552c --- /dev/null +++ b/youtube_transcripts/542-zensical.vtt @@ -0,0 +1,3263 @@ +WEBVTT + +00:00:00.380 --> 00:00:02.760 +Martin, welcome to Talk Python To Me. + +00:00:02.840 --> 00:00:03.500 +Great to have you here. + +00:00:03.840 --> 00:00:04.940 +Thanks for having me. + +00:00:05.700 --> 00:00:08.860 +I'm excited to talk about static sites + +00:00:09.300 --> 00:00:14.060 +and the next big platform for building them here + +00:00:14.920 --> 00:00:16.000 +in Python and beyond. + +00:00:16.660 --> 00:00:19.500 +So really excited to talk about Zensical. + +00:00:19.900 --> 00:00:20.660 +Am I saying that right? + +00:00:20.920 --> 00:00:22.100 +Yeah, pretty much. + +00:00:22.480 --> 00:00:22.820 +Zensical. + +00:00:23.200 --> 00:00:24.280 +Zensical, OK. + +00:00:25.200 --> 00:00:25.340 +Great. + +00:00:26.020 --> 00:00:26.140 +Yeah. + +00:00:27.540 --> 00:00:32.080 +I know MKDocs, the material for MKDocs, has been really, really popular. + +00:00:32.980 --> 00:00:37.240 +And you all have made a big splash announcing this new project. + +00:00:38.060 --> 00:00:40.060 +So I'm really looking forward to diving into it. + +00:00:40.480 --> 00:00:44.400 +Before we do, though, let's just get a little bit of background on you. + +00:00:44.520 --> 00:00:44.980 +Who is Martin? + +00:00:46.000 --> 00:00:46.180 +Of course. + +00:00:46.440 --> 00:00:47.960 +So hi, my name is Martin Donath. + +00:00:49.000 --> 00:00:51.160 +Most people probably know me as Squidfunk. 
+
+00:00:52.100 --> 00:00:56.200
+I've been an independent developer and consultant for the last 20 years now.
+
+00:00:57.100 --> 00:01:01.800
+And I mostly write in TypeScript, Python, and lately a lot of Rust.
+
+00:01:02.140 --> 00:01:04.300
+So I've become a huge fan of Rust, actually.
+
+00:01:05.660 --> 00:01:06.800
+I'm kind of a free spirit.
+
+00:01:07.420 --> 00:01:12.240
+So I love doing my own thing and building products from front to back, basically.
+
+00:01:12.700 --> 00:01:14.620
+So doing the front end as well as the back end.
+
+00:01:15.300 --> 00:01:18.800
+And for the past 15 years, I contributed a lot to open source.
+
+00:01:19.960 --> 00:01:23.940
+As already mentioned, my most popular project so far is Material for MKDocs.
+
+00:01:24.960 --> 00:01:30.620
+And it's, well, millions of people basically look at sites
+
+00:01:30.810 --> 00:01:31.900
+that are built with it every day.
+
+00:01:32.560 --> 00:01:34.340
+Yeah, well, and Zensical, my latest project,
+
+00:01:34.700 --> 00:01:36.520
+will hopefully go far beyond that.
+
+00:01:36.620 --> 00:01:37.780
+So we're working very hard on it.
+
+00:01:38.000 --> 00:01:39.060
+And this is why I'm here today.
+
+00:01:39.480 --> 00:01:41.040
+So excited to talk about it.
+
+00:01:41.570 --> 00:01:43.280
+Yeah, I am as well.
+
+00:01:43.900 --> 00:01:48.620
+And let's just start by admiring your website a little bit.
+
+00:01:49.290 --> 00:01:50.360
+Thanks.
+
+00:01:50.550 --> 00:01:53.859
+Brian and I spoke about this over on our Python
+
+00:01:53.880 --> 00:02:02.500
+Bytes podcast. And we kind of just got distracted just staring at the website. It's this beautiful
+
+00:02:02.810 --> 00:02:08.539
+flow of, I don't know, colors. It looks a little bit like a black hole worm, a white wormhole
+
+00:02:08.740 --> 00:02:13.240
+sort of experience. I don't know. What was the inspiration there for this cool design?
+
+00:02:14.240 --> 00:02:19.100
+Yeah, this is actually a strange attractor. So this is something from physics. I'm not very,
+
+00:02:19.160 --> 00:02:24.280
+very proficient in physics, but those strange attractors, I have had a fascination for them
+
+00:02:24.440 --> 00:02:31.120
+for a very long time. And they follow very simple rules. So it's just three equations that define
+
+00:02:31.460 --> 00:02:38.560
+how their points move in three-dimensional space. And yeah, but still, with those simple rules,
+
+00:02:39.240 --> 00:02:46.339
+a very complex shape can emerge. And this, for us, actually symbolizes the process of evolving ideas
+
+00:02:46.380 --> 00:02:52.680
+through writing. So if you have slightly different conditions from the start, it's still
+
+00:02:53.680 --> 00:02:58.680
+orbiting around the same shape, but it might look a little bit different. And there's actually, I can
+
+00:02:58.760 --> 00:03:03.780
+share this now, there's actually a little Easter egg. Nobody has found it so far. So if you hover
+
+00:03:04.160 --> 00:03:12.260
+over the home page on zensical.org with the mouse in the left bottom corner, you can
+
+00:03:12.120 --> 00:03:18.400
+actually change the coefficients of the animation. And if you do this, so you can click on them,
+
+00:03:18.600 --> 00:03:24.320
+and then you can use your cursor. I'm changing, I'm changing beta. We're running beta 0.22 right now.
+
+00:03:24.880 --> 00:03:31.860
+Oh, it really does change it. Yeah. Oh my goodness. Yeah, so it takes a little, it takes a
+
+00:03:31.860 --> 00:03:37.379
+little time, but if you change the coefficients in a specific way, it might be completely
+
+00:03:37.480 --> 00:03:41.980
+chaotic and become unstable. So this is what I really find fascinating about those strange
+
+00:03:42.120 --> 00:03:47.740
+attractors. And it's also the inspiration for the logo. So we're building on this, on this image a
+
+00:03:47.820 --> 00:03:54.260
+lot. Okay, I thought it was just a cool design. I didn't realize it had all this meaning and
+
+00:03:54.380 --> 00:04:00.300
+actual math and physics behind it. That's super cool. Yeah, I love chaos theory and all
+
+00:04:00.280 --> 00:04:02.580
+of these fractal type of ideas here.
+
+00:04:02.730 --> 00:04:04.440
+And yeah, it's super neat.
+
+00:04:05.660 --> 00:04:08.500
+OK, so what is Zensical?
+
+00:04:09.080 --> 00:04:10.040
+Why did you build it?
+
+00:04:10.040 --> 00:04:11.280
+Why not just more Material?
+
+00:04:12.739 --> 00:04:14.900
+So there are a lot of questions in there, actually.
+
+00:04:15.300 --> 00:04:18.739
+Maybe let me just start by shortly speaking about what it is.
+
+00:04:18.920 --> 00:04:21.440
+So in very simple terms, it's a tool
+
+00:04:21.549 --> 00:04:24.320
+to build beautiful websites from a folder of text files.
+
+00:04:25.100 --> 00:04:28.340
+So you just write in Markdown and can generate a static site.
+
+00:04:29.260 --> 00:04:30.320
+You don't need a database for it.
+
+00:04:30.480 --> 00:04:33.420
+So to those who don't know what a static site is,
+
+00:04:33.480 --> 00:04:34.920
+you don't need a database or server.
+
+00:04:35.400 --> 00:04:38.740
+It's just static HTML, which means you just pip install
+
+00:04:38.860 --> 00:04:40.760
+Zensical and you're ready to go within a few minutes.
+
+00:04:41.560 --> 00:04:43.580
+And it's fully open source, MIT licensed.
+
+00:04:45.440 --> 00:04:48.500
+And to maybe explain a little bit more about static sites,
+
+00:04:48.660 --> 00:04:51.120
+so the big benefit of it, you can host it for free
+
+00:04:51.200 --> 00:04:54.400
+in many places, for instance, on GitHub Pages or Cloudflare,
+
+00:04:54.540 --> 00:04:56.759
+and they're secure and fast by default
+
+00:04:56.780 --> 00:04:58.780
+because there's only static file serving involved.
+
+00:04:59.360 --> 00:05:02.000
+And Zensical, so we try to make it pretty with a modern design,
+
+00:05:02.180 --> 00:05:04.780
+many built-in features, and fun, according
+
+00:05:04.780 --> 00:05:06.640
+to the feedback of our users, which
+
+00:05:06.640 --> 00:05:08.220
+is kind of unusual for writing documentation.
+
+00:05:08.560 --> 00:05:09.820
+So yeah.
+
+00:05:10.880 --> 00:05:12.480
+Yeah, very cool.
+
+00:05:12.680 --> 00:05:20.200
+And if anyone's tried to manually create a static site,
+
+00:05:23.720 --> 00:05:26.180
+it quickly becomes a challenge if you're just writing.
+
+00:05:26.760 --> 00:05:29.140
+I say, hey, it's only five HTML pages.
+
+00:05:29.200 --> 00:05:31.440
+I can just write the HTML, you know what I mean?
+
+00:05:32.939 --> 00:05:35.260
+But well, what if you want to have common navigation?
+
+00:05:35.960 --> 00:05:37.660
+Or you want to change the look and feel?
+
+00:05:38.560 --> 00:05:41.680
+Oh, well, now I've got to go edit that in five places, right?
+
+00:05:41.800 --> 00:05:46.980
+And so if even just beyond, basically beyond one page,
+
+00:05:47.340 --> 00:05:51.160
+having something that generates the static site is--
+
+00:05:51.960 --> 00:05:53.280
+it's super valuable, right?
+
+00:05:53.420 --> 00:05:56.080
+because it'll generate the wrapper navigation,
+
+00:05:56.700 --> 00:05:59.640
+the common CSS, the footer,
+
+00:06:00.040 --> 00:06:01.560
+all those kinds of things, right?
+
+00:06:02.960 --> 00:06:04.500
+Yeah, so it depends on what you want to do.
+
+00:06:04.840 --> 00:06:06.760
+So, of course, if you have a small site,
+
+00:06:07.440 --> 00:06:08.440
+like a personal website or so,
+
+00:06:08.440 --> 00:06:11.300
+you can just write basic HTML if you're proficient in it.
+
+00:06:11.900 --> 00:06:14.760
+For instance, the users of Material,
+
+00:06:16.000 --> 00:06:18.320
+only 7% of them are front-end developers.
+
+00:06:20.380 --> 00:06:22.819
+We will dive a little bit into how Zensical relates
+
+00:06:22.860 --> 00:06:31.720
+to Material later. And what Zensical is being used for primarily is documentation. So it builds on the
+
+00:06:31.780 --> 00:06:37.040
+docs-as-code philosophy, which means that you treat your documentation exactly like your source
+
+00:06:37.160 --> 00:06:42.700
+code. So you primarily write documentation. You just, you don't want to fight front-end development
+
+00:06:43.080 --> 00:06:49.220
+problems. You just want to, like, get the content out. And with this docs-as-code, what
+
+00:06:49.000 --> 00:06:54.300
+the cool thing about it is, you can use the same tools and processes and workflows like you use for code,
+
+00:06:55.340 --> 00:07:02.500
+like versioning and PRs to make changes. And the adoption is growing really fast, actually, among
+
+00:07:02.840 --> 00:07:07.220
+companies in recent years, as they're moving away from proprietary tools to open source
+
+00:07:07.240 --> 00:07:13.400
+solutions. So Zensical is for you, or a static site generator in general is for you, if you just
+
+00:07:13.520 --> 00:07:17.960
+want to get your writing out. And of course, you can also customize it and
+
+00:07:17.800 --> 00:07:24.200
+make it as pretty as you want, but you don't necessarily need to know HTML, CSS, and JavaScript,
+
+00:07:24.440 --> 00:07:26.460
+and that's quite practical.
+
+00:07:27.360 --> 00:07:32.020
+And you talked about writing, and you even have your metaphor with strange attractors.
+
+00:07:34.220 --> 00:07:40.960
+I personally find if I'm just in a clean space where it's really just about the ideas,
+
+00:07:41.140 --> 00:07:42.880
+I don't have to worry about the design.
+
+00:07:43.580 --> 00:07:47.520
+It makes it so much easier to just focus on the actual writing.
+
+00:07:47.740 --> 00:07:49.100
+You're in a Markdown editor.
+
+00:07:49.680 --> 00:07:54.540
+My favorite is Typora, but you can use whatever variety that you want, right?
+
+00:07:55.320 --> 00:07:56.360
+And you're just there.
+
+00:07:56.880 --> 00:07:59.140
+You're hardly even worried about the formatting of the Markdown.
+
+00:07:59.280 --> 00:07:59.800
+You're just writing.
+
+00:08:00.030 --> 00:08:04.400
+And I find that a good creative space, I guess.
+
+00:08:06.400 --> 00:08:07.480
+Yeah, that's the beauty of Markdown.
+
+00:08:07.900 --> 00:08:11.180
+So you can just write, as you mentioned.
+
+00:08:11.550 --> 00:08:16.060
+And how you, in the end, use it, you can still decide that afterwards.
+
+00:08:16.160 --> 00:08:21.900
+So if you want to build a website, if you want to create a PDF of it, if you just want to use it for internal note taking or so.
+
+00:08:22.900 --> 00:08:34.599
+And this is the big benefit of Markdown, as it takes away a lot of the headache of having to remember a lot of markup in order to get your ideas out of the door.
+
+00:08:35.599 --> 00:08:38.479
+Can you actually put markup in it if you need to?
+
+00:08:38.919 --> 00:08:48.760
+You know, for example, maybe you need a particular image, two of them side by side that are links and you want them to open in a new tab if somebody clicks them.
+
+00:08:49.480 --> 00:08:53.580
+Can you set it into basically an unsafe mode and let it do embedded markup?
+
+00:08:54.740 --> 00:08:55.720
+Yeah, that's a great question.
+
+00:08:56.400 --> 00:08:57.320
+So, yes, it's possible.
+
+00:08:57.460 --> 00:09:00.000
+You can just use HTML within Markdown.
+
+00:09:00.280 --> 00:09:04.420
+We currently depend on Python Markdown, which we inherited from Material for MKDocs.
+
+00:09:05.180 --> 00:09:08.120
+We are gradually moving towards CommonMark,
+
+00:09:08.450 --> 00:09:10.400
+which, so just as a context,
+
+00:09:10.700 --> 00:09:12.360
+Python Markdown has some oddities
+
+00:09:12.610 --> 00:09:14.820
+when you use HTML within Markdown.
+
+00:09:14.850 --> 00:09:19.560
+For instance, it won't replace relative URLs correctly.
+
+00:09:19.590 --> 00:09:21.080
+This is like an annoying thing.
+
+00:09:22.160 --> 00:09:23.400
+But once we move to CommonMark,
+
+00:09:23.830 --> 00:09:27.520
+we will also have, like, predefined components
+
+00:09:27.740 --> 00:09:29.940
+that you can use, because you can't express everything,
+
+00:09:30.800 --> 00:09:33.640
+like more complex things, in plain Markdown.
+
+00:09:33.860 --> 00:09:36.760
+So there are only things like you can make text bold,
+
+00:09:36.850 --> 00:09:38.160
+you can have lists, tables, et cetera.
+
+00:09:38.190 --> 00:09:41.540
+But if it's more complex, as you mentioned,
+
+00:09:41.940 --> 00:09:45.560
+aligning two images or having an image with a caption or so,
+
+00:09:46.000 --> 00:09:47.820
+you need basically HTML.
+
+00:09:48.280 --> 00:09:49.600
+And this is possible already,
+
+00:09:50.220 --> 00:09:51.900
+but we will make it much easier in the future.
+
+00:09:52.220 --> 00:09:53.960
+The front-end world already knows this.
+
+00:09:54.160 --> 00:09:55.120
+So they use MDX.
+
+00:09:55.280 --> 00:09:56.720
+They've been using MDX for quite a while,
+
+00:09:57.220 --> 00:09:59.459
+which is a dialect on top of Markdown
+
+00:09:59.760 --> 00:10:04.320
+that adds more liberty with components and so on.
+
+00:10:04.320 --> 00:10:07.080
+So you can create reusable components that you can use.
+
+00:10:08.280 --> 00:10:09.920
+Yeah, but yeah.
+
+00:10:10.280 --> 00:10:11.420
+So it's possible.
+
+00:10:13.660 --> 00:10:15.480
+Our users already also do it.
+
+00:10:15.720 --> 00:10:18.180
+We also have some examples on the documentation
+
+00:10:18.420 --> 00:10:19.980
+and we will make it much more powerful in the future.
+
+00:10:20.960 --> 00:10:21.820
+Yeah, very nice.
+
+00:10:22.240 --> 00:10:28.160
+I do think regular Markdown is just missing a few things.
+
+00:10:28.340 --> 00:10:29.660
+I love the simplicity of it.
+
+00:10:29.990 --> 00:10:32.720
+And hat tip, John Gruber, for creating it.
+
+00:10:32.900 --> 00:10:37.080
+But it's just like, I just need to maybe put a class here
+
+00:10:37.110 --> 00:10:38.020
+or just do a little--
+
+00:10:38.160 --> 00:10:40.280
+if I could just control this a little bit more,
+
+00:10:40.510 --> 00:10:43.540
+then you could basically escape HTML,
+
+00:10:44.279 --> 00:10:47.120
+with obviously being careful to not just recreate HTML
+
+00:10:47.310 --> 00:10:49.140
+with square brackets instead of angle brackets, right?
+
+00:10:50.440 --> 00:10:52.400
+Yeah, there's been a lot of work on Python Markdown.
+
+00:10:52.500 --> 00:10:54.120
+So in Python Markdown, there are some extensions that
+
+00:10:54.300 --> 00:10:57.560
+allow you to add classes at least to block elements.
+
+00:10:58.280 --> 00:11:02.880
+So in Markdown, you need to distinguish between inline and block
+
+00:11:03.000 --> 00:11:03.120
+elements.
+
+00:11:03.200 --> 00:11:04.240
+Oh, no, it also works-- sorry.
+
+00:11:04.260 --> 00:11:06.440
+It also works on inline elements, like links and so on.
+
+00:11:07.240 --> 00:11:08.180
+But this is special syntax.
+
+00:11:08.460 --> 00:11:11.260
+So Python Markdown is a dialect that is not standardized,
+
+00:11:11.620 --> 00:11:12.260
+unlike CommonMark.
+
+00:11:12.380 --> 00:11:14.380
+In CommonMark, this is not easily possible
+
+00:11:14.900 --> 00:11:16.140
+to add specific classes.
+
+00:11:16.420 --> 00:11:18.440
+But with CommonMark, as I mentioned,
+
+00:11:18.640 --> 00:11:21.160
+you have MDX, which is a de facto standard.
+
+00:11:21.260 --> 00:11:23.040
+I don't know if they've standardized it already.
+
+00:11:23.920 --> 00:11:25.320
+That allows for much, much more.
+
+00:11:26.279 --> 00:11:26.680
+Nice.
+
+00:11:28.340 --> 00:11:31.280
+So what is Zensical for?
+
+00:11:31.440 --> 00:11:34.940
+Is this a documentation generating tool?
+
+00:11:35.220 --> 00:11:39.820
+Is it just an open-ended static site generator?
+
+00:11:42.480 --> 00:11:47.240
+What is possible and what is your goal or your target with this project?
+
+00:11:49.840 --> 00:11:53.020
+Yeah, so as I mentioned, right now we're focusing on documentation.
+
+00:11:53.440 --> 00:11:56.060
+So because this is the thing we're coming from.
+
+00:11:56.640 --> 00:11:59.220
+But we're building Zensical for much, much more.
+
+00:11:59.300 --> 00:12:05.840
+So our stretch goal is to have a fully-fledged knowledge management and documentation solution.
+
+00:12:07.160 --> 00:12:11.000
+There are already a lot of companies that use it internally for knowledge management.
+
+00:12:12.240 --> 00:12:16.440
+Basically, as an alternative to SaaS-based solutions like Confluence and Notion,
+
+00:12:17.080 --> 00:12:19.320
+we are aware that for this we need WYSIWYG.
+
+00:12:19.420 --> 00:12:20.740
+So what you see is what you get.
+
+00:12:20.820 --> 00:12:23.700
+A visual editor that is also usable by non-technicals.
+
+00:12:24.380 --> 00:12:34.260
+And if you scroll, if you check out our roadmap and scroll down all the way, you will see it as a stretch goal, which is basically something we're working towards.
+
+00:12:35.500 --> 00:12:41.000
+Because this would actually allow so many more people within organizations to use it.
+
+00:12:41.900 --> 00:12:51.440
+And in general, Zensical, with Zensical, we focus on three key areas that make us different
+
+00:12:51.620 --> 00:12:55.260
+from other static site generators, which is, well, a modern design.
+
+00:12:55.440 --> 00:13:03.480
+So, of course, some also have a modern design, but within the Python ecosystem, some options
+
+00:13:03.740 --> 00:13:08.420
+might look a little bit dated, so we try to be a little bit more on the edge,
+
+00:13:08.700 --> 00:13:08.820
+actually.
+
+00:13:09.600 --> 00:13:12.240
+And it should be flexible and it should be fast.
+
+00:13:12.240 --> 00:13:13.260
+So those three things.
+
+00:13:13.500 --> 00:13:17.660
+Because the design actually is the thing that people notice first.
+
+00:13:18.760 --> 00:13:22.800
+So what we offer is a design that is customizable, brandable.
+
+00:13:22.960 --> 00:13:26.720
+You have like tons of options with which you can change how navigation is laid out.
+
+00:13:27.260 --> 00:13:30.320
+How you can also change like colors, fonts, etc.
+
+00:13:31.940 --> 00:13:36.240
+And we have a lot of components that make it ready for technical writing.
+
+00:13:36.380 --> 00:13:38.440
+As you mentioned, you just want to start writing.
+
+00:13:38.880 --> 00:13:41.500
+So we have stuff like admonitions, tabs,
+
+00:13:42.119 --> 00:13:45.840
+and one very specific feature that we have is code annotations
+
+00:13:46.160 --> 00:13:47.960
+that we inherited from Material for MKDocs,
+
+00:13:48.080 --> 00:13:50.000
+which is quite unique among static site generators,
+
+00:13:50.300 --> 00:13:57.400
+which allows you to put a little bubble onto any line of code.
+
+00:13:58.320 --> 00:13:59.260
+You have to visit our documentation.
+
+00:13:59.720 --> 00:14:03.620
+This is our, you're currently browsing the other site.
+
+00:14:04.260 --> 00:14:04.940
+All right, all right, hold on.
+
+00:14:05.080 --> 00:14:05.420
+I got it.
+
+00:14:05.620 --> 00:14:06.120
+Keep going, Michael.
+
+00:14:06.320 --> 00:14:06.680
+I'll get to it.
+
+00:14:07.240 --> 00:14:08.080
+Right, right, no worries.
+
+00:14:08.660 --> 00:14:11.060
+Yeah, and there you have to search for code annotations.
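[Editor's note: in Material for MKDocs' documented syntax, a code annotation is a numbered marker like `(1)!` inside a code comment, with the tooltip content written as a list item after the block and the `content.code.annotate` theme feature enabled. A simplified sketch:]

````markdown
``` python
def deploy():  # (1)!
    ...
```

1.  This tooltip can hold any rich Markdown: lists, tables, even diagrams.
````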
+
+00:14:12.040 --> 00:14:13.000
+Yeah, so code annotations,
+
+00:14:13.740 --> 00:14:19.120
+which allow you to create a bubble in any line of code.
+
+00:14:19.320 --> 00:14:22.040
+And if you click that bubble, a tooltip opens.
+
+00:14:22.040 --> 00:14:24.100
+And within this tooltip, you can use any rich content.
+
+00:14:24.200 --> 00:14:28.140
+So you can have lists, any form of markdown, tables, diagrams,
+
+00:14:29.820 --> 00:14:34.120
+basically anything you can use anyway within markdown.
+
+00:14:34.660 --> 00:14:36.440
+And this is a very popular feature in Material.
+
+00:14:36.920 --> 00:14:38.720
+And so, of course, we brought it over.
+
+00:14:39.520 --> 00:14:41.000
+So users can still use it.
+
+00:14:42.040 --> 00:14:44.820
+The second thing I talked about is it should be flexible.
+
+00:14:45.060 --> 00:14:47.540
+So what makes Zensical different is we have a modular architecture,
+
+00:14:47.920 --> 00:14:50.020
+or say we're working towards a modular architecture.
+
+00:14:50.180 --> 00:14:51.700
+We're still in alpha.
+
+00:14:52.880 --> 00:14:55.420
+So we're close to finishing the module system.
+
+00:14:56.700 --> 00:14:58.380
+And in Zensical, it's modules all the way down,
+
+00:14:58.480 --> 00:15:01.500
+which means all core functionality is implemented as modules,
+
+00:15:01.940 --> 00:15:06.019
+which is different from other solutions where the plugin system
+
+00:15:07.080 --> 00:15:09.640
+sometimes is more or less an afterthought.
+
+00:15:09.790 --> 00:15:11.000
+So there's a plugin system added
+
+00:15:11.130 --> 00:15:13.120
+with specific hooks, extension points
+
+00:15:13.550 --> 00:15:14.300
+where you can hook into.
+
+00:15:15.400 --> 00:15:18.420
+And this might seem sufficient at first,
+
+00:15:18.620 --> 00:15:21.060
+but in the end, so for us, for instance,
+
+00:15:21.900 --> 00:15:23.600
+MKDocs in the end was a little bit limiting.
+
+00:15:24.220 --> 00:15:26.900
+And this allows you to basically swap,
+
+00:15:27.050 --> 00:15:28.120
+extend, replace all modules.
+
+00:15:28.370 --> 00:15:29.500
+You can use our modules.
+
+00:15:29.610 --> 00:15:30.600
+You can write your own,
+
+00:15:30.700 --> 00:15:31.600
+pull in third-party modules.
+
+00:15:31.960 --> 00:15:33.600
+And as I mentioned, Rust.
+
+00:15:33.940 --> 00:15:34.500
+So don't worry.
+
+00:15:34.790 --> 00:15:35.620
+You don't need to learn Rust.
+
+00:15:35.820 --> 00:15:37.880
+You will also be able to write modules in Python
+
+00:15:38.360 --> 00:15:40.060
+because we are super happy users of PyO3,
+
+00:15:40.440 --> 00:15:42.020
+which is an absolutely amazing library.
+
+00:15:43.640 --> 00:15:46.980
+And PyO3 has really become a super important foundation
+
+00:15:47.160 --> 00:15:48.220
+of Python these days.
+
+00:15:48.380 --> 00:15:52.740
+It's almost like the C bindings for CPython.
+
+00:15:53.460 --> 00:15:53.700
+Exactly.
+
+00:15:54.200 --> 00:15:56.420
+So, yeah, so with PyO3,
+
+00:15:56.560 --> 00:15:58.680
+it allows us to have a Rust runtime.
+
+00:15:59.160 --> 00:16:03.700
+So all of the orchestration and how,
+
+00:16:03.820 --> 00:16:05.600
+in which order things are run,
+
+00:16:05.700 --> 00:16:06.940
+threading, caching, parallelization, etc.,
+
+00:16:07.110 --> 00:16:08.060
+all is happening in Rust,
+
+00:16:08.320 --> 00:16:11.000
+and we will provide Python bindings
+
+00:16:11.070 --> 00:16:13.820
+so that you still can use Python to write modules
+
+00:16:14.340 --> 00:16:15.940
+and they're still running fast.
+
+00:16:16.580 --> 00:16:18.240
+Yeah, which brings me to the last point
+
+00:16:18.620 --> 00:16:19.140
+where we're different.
+
+00:16:19.580 --> 00:16:21.460
+We have a very heavy focus on performance.
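[Editor's note: to make the "modules all the way down" idea concrete, here is a toy sketch in Python. It illustrates the architecture being described, not Zensical's actual API; all names below are made up.]

```python
from typing import Protocol


class Module(Protocol):
    """One build step: it takes the site state and returns the new state."""
    def run(self, state: dict) -> dict: ...


class MarkdownModule:
    """Toy 'rendering' step: wrap each page body in a <p> tag."""
    def run(self, state: dict) -> dict:
        state["pages"] = {name: f"<p>{src}</p>" for name, src in state["pages"].items()}
        return state


class Pipeline:
    """Everything, even core functionality, is just a module in this list,
    so any step can be swapped out, replaced, or extended."""
    def __init__(self, modules: list):
        self.modules = modules

    def build(self, state: dict) -> dict:
        for module in self.modules:
            state = module.run(state)
        return state


site = Pipeline([MarkdownModule()]).build({"pages": {"index.md": "Hello"}})
print(site["pages"]["index.md"])  # <p>Hello</p>
```

The contrast with a hook-based plugin system is that here there is no privileged "core": the rendering step sits in the same list a user's module would.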
+ +00:16:21.780 --> 00:16:24.920 +So our goal is to let you start with one page + +00:16:25.040 --> 00:16:27.500 +because of course all documentation sites + +00:16:27.800 --> 00:16:29.300 +or projects start small + +00:16:30.140 --> 00:16:32.780 +and let you scale that to something like 100,000 pages. + +00:16:34.200 --> 00:16:36.120 +How we do it is through differential builds. + +00:16:36.520 --> 00:16:39.660 +We have created our own runtime, which is called ZRX. + +00:16:40.360 --> 00:16:43.680 +And differential builds mean that we are only rebuilding what changed. + +00:16:43.700 --> 00:16:49.700 +So, for instance, if you only change the page title, only that page and all instances where the page title is used are being rebuilt. + +00:16:49.960 --> 00:16:52.960 +And this means that changes are visible in milliseconds and not minutes. + +00:16:54.120 --> 00:16:54.460 +Yeah. + +00:16:56.160 --> 00:16:56.780 +That's super cool. + +00:16:58.180 --> 00:17:01.500 +And so I'm presuming the build system itself is Rust-based, right? + +00:17:02.140 --> 00:17:02.740 +Yeah, exactly. 
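[Editor's note: the differential-build idea can be sketched in a few lines of Python. This toy version only fingerprints page sources and reports what changed; the real ZRX runtime also tracks cross-page dependencies, like the page-title example above, which this sketch leaves out.]

```python
import hashlib


def content_hash(text: str) -> str:
    """Fingerprint a page's source so changes can be detected."""
    return hashlib.sha256(text.encode()).hexdigest()


def differential_build(pages: dict, cache: dict) -> list:
    """Return the pages whose source changed since the last build,
    updating the cache of fingerprints as we go."""
    dirty = []
    for name, source in pages.items():
        digest = content_hash(source)
        if cache.get(name) != digest:
            dirty.append(name)    # this page needs a rebuild
            cache[name] = digest  # remember its fingerprint for next time
    return dirty


cache = {}
pages = {"index.md": "# Home", "about.md": "# About"}
print(differential_build(pages, cache))  # ['index.md', 'about.md']

pages["about.md"] = "# About us"         # edit one page
print(differential_build(pages, cache))  # ['about.md']
```

On the second run only the edited page is reported, which is why changes show up in milliseconds rather than rebuilding the whole site.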
+
+00:17:02.880 --> 00:17:09.680
+100% Rust. Yeah. Yeah, yeah. And coming from a Python background, what was that experience like,
+
+00:17:09.959 --> 00:17:17.860
+building that? Yeah, so that's kind of a tricky question, because I'm not really
+
+00:17:18.120 --> 00:17:24.240
+coming from a long history of Python. So I don't have a long Python background. I wrote
+
+00:17:24.500 --> 00:17:32.019
+mainly in TypeScript, and I only started writing Python in 2021. So this is actually
+
+00:17:32.080 --> 00:17:39.680
+the history of how Material started and how all of this unfolded. But I've written in several,
+
+00:17:40.000 --> 00:17:46.200
+several languages. So I've also written in C, Erlang, Ruby, Python, TypeScript. Rust was still
+
+00:17:46.460 --> 00:17:51.440
+extremely hard to learn. So I basically banged my head against the keyboard for a month, wasn't making
+
+00:17:51.620 --> 00:17:56.139
+any progress at all, because, yeah, you know, fighting with the borrow checker. And once,
+
+00:17:56.160 --> 00:18:09.520
+Once you get past that, and then, of course, lifetimes and higher-ranked trait bounds and some other features, I'm now some kind of 3,000 or 4,000 hours in, something like that.
+
+00:18:10.540 --> 00:18:11.260
+It gets really good.
+
+00:18:11.640 --> 00:18:20.820
+So I think Rust is seriously one of the best languages ever made because it allows you to express ideas with extreme clarity.
+
+00:18:21.800 --> 00:18:26.800
+And this is due to the very good type system, of course.
+
+00:18:27.260 --> 00:18:28.680
+And you get bare metal performance.
+
+00:18:28.980 --> 00:18:32.940
+So I find it kind of insane having a language like Rust
+
+00:18:33.220 --> 00:18:36.640
+because it's so easy to write once you're used to it.
+
+00:18:37.060 --> 00:18:41.840
+You will be very productive and still have bare metal performance.
+
+00:18:42.360 --> 00:18:43.100
+It's completely insane.
+
+00:18:43.840 --> 00:18:44.740
+Yeah, that's wild.
+
+00:18:45.280 --> 00:18:46.800
+But it's got a little bit of a learning curve
+
+00:18:47.660 --> 00:18:50.800
+compared to like Python or TypeScript or something like that.
+
+00:18:51.600 --> 00:18:59.440
+Yeah, so I had, I think, 18 years of experience with many languages. As I mentioned, I
+
+00:18:59.740 --> 00:19:07.000
+also did a lot of C, and I still found it very hard to learn. Yeah, but
+
+00:19:07.040 --> 00:19:13.240
+it's worth it. It's worth it. And my recommendation probably would be to learn it on something
+
+00:19:13.480 --> 00:19:19.419
+that you really care about, that you want to build, because otherwise you will probably lose the
+
+00:19:20.840 --> 00:19:22.260
+drive since you're running
+
+00:19:22.680 --> 00:19:23.660
+against those walls.
+
+00:19:24.480 --> 00:19:26.120
+Maybe for you or
+
+00:19:26.320 --> 00:19:28.260
+for somebody else, it's much easier to learn.
+
+00:19:28.500 --> 00:19:29.900
+So maybe it's just
+
+00:19:29.980 --> 00:19:31.560
+I'm a bad example that I
+
+00:19:32.360 --> 00:19:33.620
+needed so long. I don't know.
+
+00:19:34.840 --> 00:19:36.340
+Because after that month,
+
+00:19:36.800 --> 00:19:38.180
+it wasn't that I was completely
+
+00:19:38.340 --> 00:19:39.880
+up to speed. So it was just
+
+00:19:39.920 --> 00:19:42.220
+I was making very, very tiny progress.
+
+00:19:42.660 --> 00:19:44.500
+At least progress. Because for a month
+
+00:19:44.500 --> 00:19:45.840
+I wasn't making progress at all.
+
+00:19:47.300 --> 00:19:47.800
+Yeah, wow.
+ +00:19:47.860 --> 00:19:52.200 +So the next show that I'm doing after this one, + +00:19:52.460 --> 00:19:56.920 +which actually is in real clock time, wall time, + +00:19:57.440 --> 00:19:59.940 +it's happening in like two hours or less from now, + +00:20:00.120 --> 00:20:04.680 +is with Samuel Colvin from Pydantic talking about Monty, + +00:20:05.600 --> 00:20:06.940 +a Python runtime. + +00:20:07.620 --> 00:20:10.020 +He and his team are rewriting in Rust, + +00:20:11.300 --> 00:20:12.580 +specifically targeting AI. + +00:20:12.880 --> 00:20:14.980 +So the Rust theme will continue. + +00:20:15.340 --> 00:20:21.340 +It's definitely a very, it caught me a little bit off guard, like how much people love it. + +00:20:21.460 --> 00:20:31.500 +But it's also, you know, it makes perfect sense that we want this nice modern language for writing lower level things, even if it plugs into Python, right? + +00:20:32.440 --> 00:20:32.540 +Yeah. + +00:20:32.570 --> 00:20:39.300 +So the fun thing is I also talked to Samuel a long time ago and he was the one recommending to me to write it in Rust. + +00:20:39.900 --> 00:20:43.100 +So it's one of the reasons. + +00:20:43.240 --> 00:20:45.840 +Yeah, definitely I looked into it. + +00:20:46.840 --> 00:20:50.500 +And it made a lot of sense also during the time, + +00:20:50.540 --> 00:20:51.760 +the progress we're making and so on + +00:20:51.760 --> 00:20:55.200 +and the walls we're hitting that to reconsider learning Rust. + +00:20:56.360 --> 00:20:57.180 +Best investment. + +00:20:57.900 --> 00:20:58.700 +Yeah, amazing. + +00:20:59.300 --> 00:20:59.460 +Amazing. + +00:21:00.400 --> 00:21:03.700 +So I want to dig into your component structure + +00:21:03.780 --> 00:21:04.620 +and some of those things. + +00:21:04.840 --> 00:21:08.680 +But maybe before we do, let's talk about the origins a little bit. + +00:21:10.680 --> 00:21:14.280 +So let's talk about how you went from material for MKDocs. 
+
+00:21:15.880 --> 00:21:17.060
+Why even change?
+
+00:21:17.070 --> 00:21:18.760
+Why not just more Material?
+
+00:21:20.960 --> 00:21:23.140
+Yeah, so this is a great question.
+
+00:21:23.290 --> 00:21:25.100
+And this is a little bit of a story.
+
+00:21:25.130 --> 00:21:26.840
+So there are several stories in there, actually.
+
+00:21:28.500 --> 00:21:29.440
+So it's 10 years.
+
+00:21:29.590 --> 00:21:35.140
+I'll try to make it as compact as possible while keeping the most important things.
+
+00:21:35.650 --> 00:21:39.640
+So to those who don't know, Material for MKDocs is a very popular documentation framework.
+
+00:21:39.880 --> 00:21:41.220
+It's used by tens of thousands of projects.
+
+00:21:42.060 --> 00:21:45.040
+There are prominent users like AWS, Microsoft, OpenAI.
+
+00:21:47.059 --> 00:21:48.860
+Also, large open source projects use it,
+
+00:21:48.980 --> 00:21:51.300
+like, for instance, FastAPI, uv, Knative.
+
+00:21:51.840 --> 00:21:53.900
+And it's built on top of MKDocs, as the name says,
+
+00:21:54.220 --> 00:21:56.980
+which became one of the most popular static site generators.
+
+00:21:57.660 --> 00:21:59.660
+And it also eventually became my job.
+
+00:22:00.420 --> 00:22:02.140
+So I could make it my job.
+
+00:22:02.180 --> 00:22:05.580
+I could work in open source and earn a living somehow.
+
+00:22:06.380 --> 00:22:07.880
+I'm getting to how that worked.
+
+00:22:09.420 --> 00:22:12.600
+But at some point, we needed a new foundation.
+
+00:22:13.400 --> 00:22:17.100
+We've kind of outgrown MKDocs because it was not evolving at the pace that we needed.
+
+00:22:17.460 --> 00:22:18.840
+So we began exploring alternatives.
+
+00:22:19.960 --> 00:22:23.580
+And yeah, so there's a lot of lessons learned in Material.
+
+00:22:23.660 --> 00:22:27.240
+So let me maybe briefly talk about how it started.
+
+00:22:27.960 --> 00:22:32.160
+Because it started as a side project in 2015, like many things start,
+
+00:22:32.640 --> 00:22:35.480
+because I wanted to release actually a C library,
+
+00:22:36.160 --> 00:22:38.720
+a zero-copy protocol buffers library I wrote called Protobluff.
+
+00:22:39.680 --> 00:22:44.240
+But then I realized that it needed more than a readme, so I looked at the existing static
+
+00:22:44.400 --> 00:22:49.960
+site generators, which were Hugo, Jekyll, Sphinx, MKDocs, something like that, and they all looked
+
+00:22:50.050 --> 00:22:54.940
+a little bit dated. I'm not a designer, but I wanted something more modern, and Google was pushing
+
+00:22:55.140 --> 00:23:02.100
+Material Design quite hard for app development at the time, and I've also seen it being used on
+
+00:23:02.100 --> 00:23:07.620
+the web, so I thought, well, maybe combine this. I quickly settled on MKDocs: easy to use,
+
+00:23:07.640 --> 00:23:12.880
+simple templating, enough for a side project, basically. So it was a side project. I did what
+
+00:23:12.980 --> 00:23:18.600
+most devs do: checked the license, but didn't do any further due diligence. I even put MKDocs in
+
+00:23:18.660 --> 00:23:24.540
+the name to show the connection, which is common for themes. And that actually turned out to be one
+
+00:23:24.600 --> 00:23:29.280
+of the biggest decisions I made in my career, since I was basing my complete work on
+
+00:23:29.680 --> 00:23:35.880
+something I don't control. And it shaped the next 10 years of all of the work I was doing and is
+
+00:23:35.900 --> 00:23:38.480
+actually the reason why Zensical exists today.
+
+00:23:39.580 --> 00:23:39.960
+I see.
+
+00:23:40.660 --> 00:23:43.420
+So after I started developing it,
+
+00:23:44.180 --> 00:23:46.100
+I, like nine months later,
+
+00:23:46.400 --> 00:23:47.100
+released the first version,
+
+00:23:47.240 --> 00:23:47.920
+and it found users,
+
+00:23:48.240 --> 00:23:49.200
+and a lot of feature requests.
+
+00:23:49.980 --> 00:23:52.100
+And, you know, it was a side project.
+
+00:23:52.140 --> 00:23:54.260
+So I was doing client work at the time.
+
+00:23:54.720 --> 00:23:55.100
+As I mentioned,
+
+00:23:55.200 --> 00:23:57.820
+I've been like a consultant and developer,
+
+00:23:59.000 --> 00:24:00.140
+freelancer for 20 years.
+
+00:24:01.560 --> 00:24:04.020
+And I only had Sundays to work on it.
+
+00:24:04.340 --> 00:24:09.600
+Which at first was sufficient, but the more popular it got, the more maintenance came.
+
+00:24:09.880 --> 00:24:12.720
+So, it kind of crept into my mornings and evenings.
+
+00:24:13.090 --> 00:24:19.220
+And I was doing triage, like answering questions and trying to fix bugs before I went to the client.
+
+00:24:19.560 --> 00:24:24.980
+And it was getting harder and harder to justify in front of my partner, actually, because I was doing it in my spare time.
+
+00:24:25.600 --> 00:24:29.780
+And so I did what eventually all projects
+
+00:24:29.830 --> 00:24:30.740
+that start as side projects,
+
+00:24:31.020 --> 00:24:34.300
+where you don't have the full time to work on them,
+
+00:24:34.880 --> 00:24:36.140
+end up doing, yeah.
+
+00:24:36.270 --> 00:24:37.760
+So what basically happens is
+
+00:24:38.350 --> 00:24:39.760
+you start turning down feature requests,
+
+00:24:40.240 --> 00:24:42.520
+and many open source projects don't cross this line.
+
+00:24:42.650 --> 00:24:43.880
+And for me, it was a first.
+
+00:24:45.299 --> 00:24:47.480
+So yeah, and also additionally,
+
+00:24:47.800 --> 00:24:50.800
+so I mentioned before that I started writing Python in 2021.
+ +00:24:51.230 --> 00:24:52.760 +At the time I was focusing, + +00:24:53.780 --> 00:24:55.720 +So I only had Sundays to work on it. + +00:24:55.720 --> 00:24:56.420 +I didn't know Python. + +00:24:56.820 --> 00:24:59.920 +So I said that, okay, I will focus on the templating stuff. + +00:25:00.180 --> 00:25:02.040 +I will do the HTML, CSS, JavaScript, all of this, + +00:25:02.180 --> 00:25:04.080 +make it beautiful and try to solve as much, + +00:25:05.120 --> 00:25:07.160 +as many problems as possible in the front end. + +00:25:07.980 --> 00:25:09.740 +But I won't start learning Python + +00:25:09.980 --> 00:25:13.380 +because it wasn't a language that I was using at that time + +00:25:13.620 --> 00:25:16.260 +and I couldn't make up the time for it. + +00:25:16.420 --> 00:25:18.080 +So that's where I drew the line. + +00:25:19.760 --> 00:25:22.220 +It's probably going to be a fad, that Python thing anyway. + +00:25:23.580 --> 00:25:25.359 +I don't think so, but... + +00:25:25.960 --> 00:25:32.900 +Well, at the time, in 2015, it wasn't clear that it was going to be as popular as it is now, right? + +00:25:32.930 --> 00:25:41.280 +It started to become popular then, but it's really taken over the world for a lot of reasons. + +00:25:42.160 --> 00:25:49.359 +Of course, I think one of the main reasons is because it's very popular in the ML community + +00:25:49.380 --> 00:25:51.740 +and all of the LLM AI work that's happening + +00:25:51.860 --> 00:25:52.980 +and so on made it extremely popular. + +00:25:53.160 --> 00:25:59.300 +So, and I also think that Rust is doing a very good job + +00:25:59.540 --> 00:26:00.580 +on keeping it that way, + +00:26:01.180 --> 00:26:03.340 +because finally you have a very easy way + +00:26:03.340 --> 00:26:06.280 +to offload work to native code, + +00:26:06.540 --> 00:26:09.840 +which is much easier than fiddling with C and C++ + +00:26:10.100 --> 00:26:11.120 +and void pointers and whatever. 
+
+00:26:11.480 --> 00:26:14.700
+So as I mentioned, PyO3 is just an absolutely amazing library.
+
+00:26:14.820 --> 00:26:16.660
+It's so easy to write Rust code.
+
+00:26:17.240 --> 00:26:18.480
+- Yeah, I think you're right.
+
+00:26:18.620 --> 00:26:22.400
+I think Rust has really provided an important escape hatch for,
+
+00:26:22.650 --> 00:26:24.240
+I wrote it this way, it's not fast enough.
+
+00:26:24.350 --> 00:26:27.420
+Like, well, this part, we're going to make it as fast as it can be, basically.
+
+00:26:28.100 --> 00:26:28.420
+Yeah.
+
+00:26:29.600 --> 00:26:29.760
+Yeah.
+
+00:26:30.120 --> 00:26:30.300
+So.
+
+00:26:31.680 --> 00:26:31.820
+All right.
+
+00:26:31.860 --> 00:26:32.580
+Sorry, I interrupted you.
+
+00:26:32.590 --> 00:26:32.920
+Keep going.
+
+00:26:32.980 --> 00:26:33.360
+Oh, no worries.
+
+00:26:33.470 --> 00:26:33.740
+No worries.
+
+00:26:34.160 --> 00:26:34.260
+Yeah.
+
+00:26:34.310 --> 00:26:34.540
+No, no.
+
+00:26:35.260 --> 00:26:35.360
+Yeah.
+
+00:26:35.460 --> 00:26:38.760
+So as I mentioned, I tried to keep it basically afloat for the first four years.
+
+00:26:40.220 --> 00:26:42.520
+And at the time, I didn't see the potential at all.
+
+00:26:42.700 --> 00:26:45.400
+It was just a theme, not a kind of product or so.
+
+00:26:45.880 --> 00:26:48.480
+But yet I felt responsible and kept on maintaining it,
+
+00:26:48.600 --> 00:26:52.120
+and my developer friends didn't understand why I was doing that.
+
+00:26:53.660 --> 00:26:57.560
+But for me, it was kind of cool because I had a growing project.
+
+00:26:57.760 --> 00:26:58.640
+I had no immediate plans.
+
+00:26:58.700 --> 00:26:59.060
+I don't know.
+
+00:26:59.680 --> 00:27:02.520
+Let's see where I can take it.
+
+00:27:03.160 --> 00:27:06.940
+And with this steady and slow growth over the years,
+
+00:27:07.140 --> 00:27:08.920
+then companies and organizations started using it.
+
+00:27:08.940 --> 00:27:13.400
+So they were basing their public-facing documentation on me,
+
+00:27:13.720 --> 00:27:17.160
+like the guy that maybe works on this project on a Sunday.
+
+00:27:18.040 --> 00:27:23.320
+And yet I felt responsible enough to try to fix the bugs reported as quickly as possible.
+
+00:27:24.300 --> 00:27:24.380
+Yeah.
+
+00:27:25.080 --> 00:27:28.120
+And yeah, then in 2020 actually came the turning point.
+
+00:27:28.120 --> 00:27:31.320
+So when I was working on version five of it, I shared my progress publicly as I did before.
+
+00:27:31.400 --> 00:27:32.900
+And somebody mentioned a donate button.
+
+00:27:34.400 --> 00:27:42.340
+So I think the wording was something like, so that I can order pizza to survive the long Sunday coding sessions.
+
+00:27:44.060 --> 00:27:47.600
+But I heard from another developer who did this on his project,
+
+00:27:47.940 --> 00:27:52.400
+a successful project for five years, a donate button, and he made $90.
+
+00:27:52.660 --> 00:27:56.280
+So I immediately said that's not going to work.
+
+00:27:56.460 --> 00:27:59.580
+But I said, let's try an Amazon wish list.
+
+00:28:00.140 --> 00:28:01.640
+You know, I just put some stuff on there.
+
+00:28:01.960 --> 00:28:04.820
+And maybe if somebody thinks my work is useful,
+
+00:28:05.040 --> 00:28:08.340
+then they can make me a present, send me a present.
+
+00:28:09.740 --> 00:28:12.980
+So yeah, and I basically received everything on that wish list.
+
+00:28:13.420 --> 00:28:17.180
+It was completely insane. So there were two consecutive days that felt like Christmas. I even
+
+00:28:17.340 --> 00:28:24.780
+put, like, so I put some, you know, books, but then also a single malt. I love Scottish,
+
+00:28:25.020 --> 00:28:31.960
+Scottish single malt. It was a whiskey that cost 120, and I received that as well. So it
+
+00:28:31.960 --> 00:28:37.139
+was like, what's happening? And that led me to actually start thinking about demographics,
+
+00:28:38.420 --> 00:28:40.760
+so that I needed to better understand
+
+00:28:40.990 --> 00:28:43.060
+the audience of Material for MKDocs.
+
+00:28:43.660 --> 00:28:44.540
+And I did a poll,
+
+00:28:44.750 --> 00:28:46.740
+and the results were absolutely eye-opening.
+
+00:28:46.750 --> 00:28:47.560
+I mentioned before,
+
+00:28:48.150 --> 00:28:52.060
+only 7% of users are front-end developers,
+
+00:28:52.290 --> 00:28:52.800
+which means,
+
+00:28:52.980 --> 00:28:54.540
+and Material is a front-end heavy project.
+
+00:28:55.620 --> 00:28:57.520
+So I kind of had an edge there
+
+00:28:57.590 --> 00:28:59.020
+in the Python space
+
+00:29:00.200 --> 00:29:00.660
+because, yeah,
+
+00:29:00.840 --> 00:29:01.880
+it's based on Python,
+
+00:29:02.060 --> 00:29:02.760
+so front-end developers
+
+00:29:03.060 --> 00:29:04.260
+that write in JavaScript,
+
+00:29:04.940 --> 00:29:06.499
+they rather go for something
+
+00:29:06.520 --> 00:29:08.280
+like Docusaurus or React-based or whatever.
+
+00:29:08.960 --> 00:29:11.020
+And technical writers were quite happy with the project.
+
+00:29:11.360 --> 00:29:13.600
+I didn't even know technical writers existed.
+
+00:29:13.810 --> 00:29:16.240
+So I had no clue that this job,
+
+00:29:16.460 --> 00:29:17.420
+that this is a job,
+
+00:29:17.820 --> 00:29:19.040
+because I thought at the time,
+
+00:29:19.050 --> 00:29:21.340
+and it's in hindsight completely naive, of course,
+
+00:29:21.500 --> 00:29:23.340
+I thought that as a developer,
+
+00:29:23.350 --> 00:29:24.800
+you need to write the documentation, you know?
+
+00:29:25.320 --> 00:29:27.240
+And so I learned about that
+
+00:29:27.710 --> 00:29:30.520
+and accidentally built a product for technical writers.
+
+00:29:31.040 --> 00:29:32.960
+And by the way, when I say product,
+
+00:29:33.110 --> 00:29:35.879
+I mean something that is not necessarily
+
+00:29:35.900 --> 00:29:39.880
+something you pay for, but something that doesn't feel engineered. So something that is, like, polished
+
+00:29:40.630 --> 00:29:49.880
+and designed and that you actually want to use. And yeah, so I had a product, a product
+
+00:29:50.020 --> 00:29:55.400
+that has, like, product-market fit, but at the time I didn't earn any money off it. At the same
+
+00:29:55.400 --> 00:30:01.779
+time, I read about sponsorware. I'm not sure if you've heard of it, but it's like a new
+
+00:30:01.800 --> 00:30:03.540
+model of monetization for open source.
+
+00:30:03.670 --> 00:30:04.820
+At the time, it was quite new,
+
+00:30:05.820 --> 00:30:07.500
+so that you can get paid for your work.
+
+00:30:07.680 --> 00:30:09.560
+So some developers, for instance,
+
+00:30:09.650 --> 00:30:11.460
+they sell course material or
+
+00:30:11.990 --> 00:30:13.560
+access to gated content or code
+
+00:30:13.720 --> 00:30:15.580
+or nothing at all. So if you have a
+
+00:30:15.740 --> 00:30:17.560
+popular project, you can just try to
+
+00:30:18.200 --> 00:30:19.280
+raise sponsorships;
+
+00:30:19.540 --> 00:30:21.580
+some companies are very generous when it
+
+00:30:21.680 --> 00:30:23.560
+comes to open source.
And what we did
+
+00:30:23.660 --> 00:30:25.199
+with Material was we gave away
+
+00:30:25.800 --> 00:30:27.720
+early access to the latest features
+
+00:30:28.260 --> 00:30:29.680
+to the sponsors. And each
+
+00:30:29.980 --> 00:30:31.760
+feature was tied to a funding goal. When
+
+00:30:31.780 --> 00:30:38.060
+that funding goal was met, it became free for everyone. So it was like kind of a funded
+
+00:30:38.440 --> 00:30:46.220
+feature development in multiple stages. And that's what I thought of. Sorry. Yeah, that's
+
+00:30:46.280 --> 00:30:54.500
+super clever. I really love the idea of providing something for the sponsors, but still not turning
+
+00:30:54.680 --> 00:31:00.320
+into, well, here's a paid version of our product and here's the open source version. But there's always
+
+00:31:00.340 --> 00:31:06.280
+this tension of how do you reward the people who support you without undermining the
+
+00:31:06.330 --> 00:31:12.820
+open source project, and that's a clever angle. Yeah, so that's extremely challenging. So
+
+00:31:13.400 --> 00:31:17.940
+as I'm telling this, so this is what I came up with, and I thought maybe it could work,
+
+00:31:18.140 --> 00:31:23.720
+something like that. And again, my developer friends, they said it will never work, nobody will pay for
+
+00:31:23.830 --> 00:31:29.559
+open source, you're insane. Spoiler alert: it did work, and in the end we made 200K a year off it
+
+00:31:29.720 --> 00:31:31.320
+and could build a team and everything.
+
+00:31:31.430 --> 00:31:32.640
+So I know in Silicon Valley terms,
+
+00:31:32.730 --> 00:31:34.080
+this is probably minimum wage,
+
+00:31:34.320 --> 00:31:38.060
+but in Europe, it's quite an amount
+
+00:31:38.070 --> 00:31:39.600
+with which you can work very well.
+
+00:31:40.480 --> 00:31:42.880
+And yeah, so I started this program in 2020
+
+00:31:43.150 --> 00:31:44.860
+and it grew steadily
+
+00:31:45.640 --> 00:31:47.820
+and it finally allowed me to work on features
+
+00:31:49.150 --> 00:31:49.880
+outside of the Sunday.
+
+00:31:50.030 --> 00:31:51.740
+So invest more hours into it
+
+00:31:52.420 --> 00:31:54.820
+and finally learn Python in 2021,
+
+00:31:55.110 --> 00:31:56.000
+wrote my first plugin,
+
+00:31:56.020 --> 00:32:03.100
+and started hacking on the MKDocs features that, well, got turned down when we
+
+00:32:04.560 --> 00:32:08.660
+tried to upstream them, where the maintainer said, ah, it's maybe not a good fit, or we don't have the time for
+
+00:32:08.780 --> 00:32:14.940
+it. And yeah, in total I wrote 12 MKDocs plugins. So it started as a theme, but it turned into a popular,
+
+00:32:15.240 --> 00:32:20.540
+sorry, into a powerful docs framework in the end. And this worked quite well for several years,
+
+00:32:21.400 --> 00:32:23.160
+until it didn't anymore.
+
+00:32:23.680 --> 00:32:27.520
+And that's the reason why Zensical then came into being.
+
+00:32:28.980 --> 00:32:30.680
+So the way it didn't work is that,
+
+00:32:31.340 --> 00:32:33.880
+like, just where you want to take it
+
+00:32:33.940 --> 00:32:35.940
+started to diverge from MKDocs,
+
+00:32:36.120 --> 00:32:39.540
+or you couldn't get your changes upstreamed
+
+00:32:39.680 --> 00:32:40.620
+or committed back?
+
+00:32:42.860 --> 00:32:46.240
+So the thing was that MKDocs was not evolving
+
+00:32:46.980 --> 00:32:47.900
+as we needed it.
+
+00:32:48.960 --> 00:32:52.540
+So historically, MKDocs had a sequence of single maintainers.
+
+00:32:52.960 --> 00:32:59.100
+And as far as I know, all of them worked on it in their spare time because they had regular jobs.
+
+00:32:59.820 --> 00:33:02.780
+And Material was evolving quickly because we had funding.
+
+00:33:04.400 --> 00:33:05.640
+We could invest much more time in it.
+
+00:33:05.640 --> 00:33:10.900
+We could move much faster, of course, than an open source project that is only maintained in spare time.
+
+00:33:11.580 --> 00:33:12.920
+And so it was changing too slowly.
+
+00:33:13.080 --> 00:33:19.700
+So we started a lot of discussions on necessary API changes, because for many users, Material for MKDocs was MKDocs.
+
+00:33:20.100 --> 00:33:27.580
+So we were kind of like the storefront where most of the issues and, like, bug reports and feature requests came in.
+
+00:33:28.140 --> 00:33:34.600
+Because many people are using Material for MKDocs, and with it MKDocs, basically.
+
+00:33:35.860 --> 00:33:39.920
+And the main challenges that we faced were performance and plugin orchestration.
+
+00:33:40.020 --> 00:33:45.540
+I mentioned I wrote 12 plugins, and it's very hard to make them cooperate.
+
+00:33:46.100 --> 00:33:55.060
+And if you look at any popular MKDocs plugin's issue tracker, you will find issues that go something like, well, this plugin is incompatible with this plugin.
+
+00:33:55.780 --> 00:34:00.060
+Well, if I change the order of the plugins in the configuration, this and this happens.
+
+00:34:00.760 --> 00:34:07.960
+And both of those problems were brought to us again and again by the users with which we talked.
+

00:34:08.120 --> 00:34:13.280
And so, you know, it was coming up a lot. Then suddenly, after nine years, the original

00:34:13.600 --> 00:34:17.960
maintainer returned to MkDocs, and we were super optimistic because the project was, like, maintained

00:34:18.159 --> 00:34:22.080
again. He also started a sponsorship program. We upstreamed some of our funding immediately

00:34:22.700 --> 00:34:30.679
and supported his work. So before, MkDocs had no way to sponsor them, and the moment this, uh,

00:34:31.300 --> 00:34:36.980
this went live, we immediately supported it. And some PRs were finally merged and issues were closed. But

00:34:37.659 --> 00:34:45.500
yeah, then, um, the works went silent, and he started working basically, uh, in the quiet.

00:34:45.840 --> 00:34:52.679
And three months later, we were invited to a video call. Um, so, um, we as maintainers, so I as a

00:34:52.760 --> 00:34:58.100
maintainer of Material for MkDocs and, um, some other key ecosystem maintainers. And we learned

00:34:58.100 --> 00:35:06.400
that MkDocs, um, that the plans for MkDocs 2.0 were completely different, uh, from what existed at the time.

00:35:06.280 --> 00:35:15.680
So what currently exists, MkDocs 1.x, which primarily means no plugin API and customization via templating alone.

00:35:16.340 --> 00:35:22.380
So we already knew this is not enough, because that's what we've done the first four years where, as I mentioned, I was only doing the templating.

00:35:23.140 --> 00:35:26.080
And some things you can't just do with templates.

00:35:26.660 --> 00:35:36.300
For instance, having tag support, where you need to pull in different tags from different pages and then render them on another page or so.

00:35:36.320 --> 00:35:40.020
So you need synchronization efforts, and you can't do this with templating.

00:35:40.620 --> 00:35:42.160
By the way, all of this information is public.
+

00:35:42.520 --> 00:35:44.840
So you can read it on the MkDocs issue tracker.

00:35:45.200 --> 00:35:47.820
So, yeah, I'm not telling anything secret or so.

00:35:48.520 --> 00:35:48.620
Yeah.

00:35:48.860 --> 00:35:51.860
So it's a completely different direction than the one that we worked on.

00:35:52.060 --> 00:35:53.760
And we raised objections in the call.

00:35:54.300 --> 00:35:56.740
But yeah, still, they were dismissed.

00:35:57.240 --> 00:36:00.080
So MkDocs 2.0, as it looks right now,

00:36:00.680 --> 00:36:02.840
is incompatible with Material for MkDocs.

00:36:03.240 --> 00:36:05.760
300 plugins in the ecosystem will become useless,

00:36:06.240 --> 00:36:07.880
and tens of thousands of projects will be affected.

00:36:08.010 --> 00:36:10.840
And for us, so we had absolutely no choice

00:36:11.720 --> 00:36:13.840
but to start building something.

00:36:13.990 --> 00:36:16.160
So, to make something of this,

00:36:16.260 --> 00:36:19.560
because at the time we had already 50,000 projects,

00:36:20.360 --> 00:36:22.340
50,000 public projects, depending on us.

00:36:22.980 --> 00:36:28.220
We're talking to enterprise users, and we knew that this number is much, much higher.

00:36:28.300 --> 00:36:33.040
So, for instance, one of our professional users, they also already sponsored Material.

00:36:33.860 --> 00:36:37.640
They have two and a half thousand projects internally.

00:36:38.480 --> 00:36:40.760
So, only one company.

00:36:41.000 --> 00:36:52.740
And they have a dedicated team of individuals that maintain their customizations on top of Material for MkDocs for all of the teams inside the company.

00:36:52.880 --> 00:36:53.560
It's a very big company.

00:36:54.200 --> 00:37:01.180
So that's what you could infer from the... I could believe it.

00:37:01.540 --> 00:37:03.000
I couldn't believe it at all.

00:37:03.160 --> 00:37:04.180
So absolutely insane.
+

00:37:04.940 --> 00:37:05.000
Yeah.

00:37:05.180 --> 00:37:07.260
So as I mentioned, we had no choice.

00:37:07.440 --> 00:37:15.280
So what we did was we immediately went back to the drawing board with the learnings from the almost 10 years that passed since I started Material.

00:37:16.320 --> 00:37:20.400
We built a lot of prototypes in TypeScript and Python, iterated on them.

00:37:20.420 --> 00:37:21.840
We did a lot of conceptual work

00:37:23.319 --> 00:37:25.440
and realized within weeks

00:37:25.860 --> 00:37:27.260
what could actually be done

00:37:27.340 --> 00:37:28.860
with a radically different architecture.

00:37:28.960 --> 00:37:30.480
Because writing 12 plugins,

00:37:30.800 --> 00:37:33.540
I know the ins and outs of MkDocs.

00:37:33.700 --> 00:37:34.420
I know the...

00:37:34.700 --> 00:37:37.120
So I had to do a lot of hacks, for instance,

00:37:37.160 --> 00:37:39.320
to make the blog plugin of Material work

00:37:39.640 --> 00:37:41.740
with the way navigation works in MkDocs.

00:37:42.760 --> 00:37:45.540
And the number one complaint, as I mentioned,

00:37:45.920 --> 00:37:48.740
was MkDocs is slow and it doesn't scale.

00:37:49.040 --> 00:37:51.800
So, like, fixing a typo, you're doing a full rebuild,

00:37:52.080 --> 00:37:53.140
and this can take minutes.

00:37:53.390 --> 00:37:57.100
So our design work centered exactly around this problem.

00:37:58.680 --> 00:38:00.860
And after a short while,

00:38:01.140 --> 00:38:03.840
we knew exactly what MkDocs should look like,

00:38:04.880 --> 00:38:06.580
and we didn't want to let our users down.

00:38:06.730 --> 00:38:09.100
And so, in essence, we had two options.

00:38:10.140 --> 00:38:11.680
We know what it should look like.

00:38:12.050 --> 00:38:14.380
We could fork it, or we could start from scratch.
+

00:38:15.640 --> 00:38:18.140
And forking is not really possible

00:38:19.100 --> 00:38:21.560
because of the way Python dependencies work.

00:38:21.720 --> 00:38:24.640
So all of the plugins have a dependency on MkDocs.

00:38:25.480 --> 00:38:28.740
And this means that we would also need to fork all of...

00:38:28.800 --> 00:38:31.680
So without doing black magic with imports,

00:38:32.320 --> 00:38:35.100
which might not be the best idea.

00:38:36.940 --> 00:38:38.860
So we would also need to fork all plugins,

00:38:39.400 --> 00:38:41.800
or all plugins would need to switch to the fork.

00:38:41.900 --> 00:38:45.400
So this would be like moving an entire city at once,

00:38:46.140 --> 00:38:47.340
and it's frankly impossible.

00:38:48.260 --> 00:38:51.420
And if we forked it,

00:38:51.560 --> 00:38:53.960
we wouldn't be able to realize the learnings

00:38:54.220 --> 00:38:56.840
that we gained in the groundwork that we did.

00:38:56.840 --> 00:38:58.220
So we had to start from scratch, actually.

00:38:58.250 --> 00:39:01.500
Right, plus you'd have to convince the entire community

00:39:01.690 --> 00:39:04.380
to at least create a parallel package.

00:39:05.340 --> 00:39:08.420
Because when you pip install that other plugin,

00:39:09.440 --> 00:39:13.000
it's going to say, hey, PyPI, I need mkdocs.

00:39:13.560 --> 00:39:15.720
And now you'd need the forked version,

00:39:16.560 --> 00:39:18.240
whatever that's going to be called, right?

00:39:19.040 --> 00:39:21.960
So, yeah, it would be a big battle, wouldn't it?

00:39:22.740 --> 00:39:23.740
Just technically with...

00:39:24.780 --> 00:39:26.640
Or you'd have to move the community,

00:39:26.730 --> 00:39:29.020
which is a very challenging thing to do.
+ +00:39:30.380 --> 00:39:33.320 +Yeah, and so for us, + +00:39:33.440 --> 00:39:35.400 +the most sensible thing was to just, you know, + +00:39:35.400 --> 00:39:36.520 +we just start from scratch. + +00:39:37.280 --> 00:39:38.860 +We make it as compatible as possible. + +00:39:39.860 --> 00:39:43.520 +It became quite clear very quickly + +00:39:43.720 --> 00:39:45.620 +that we need to optimize for compatibility + +00:39:45.640 --> 00:39:48.600 +because if you create something that is not compatible + +00:39:48.740 --> 00:39:52.100 +and that forces users to migrate documentation manually + +00:39:52.500 --> 00:39:55.760 +and to do a lot of work to get over to something else, + +00:39:56.400 --> 00:39:57.540 +you won't get a lot of adoption. + +00:39:58.240 --> 00:40:02.040 +Yeah, all you got to do is think about that 2,500 project team. + +00:40:02.400 --> 00:40:05.560 +Like, okay, how do I keep them working with this, right? + +00:40:06.140 --> 00:40:06.720 +Yes, yes. + +00:40:07.580 --> 00:40:13.220 +Yeah, so what we then did is we had an idea how it should look. + +00:40:14.340 --> 00:40:17.120 +Then we started with Rust because it was recommended to us. + +00:40:17.340 --> 00:40:18.920 +So it was very hard at first. + +00:40:20.640 --> 00:40:25.000 +And in total, it took us 16 months to build all of this. + +00:40:25.520 --> 00:40:26.820 +But it was not only writing code. + +00:40:26.900 --> 00:40:29.720 +It was also exactly knowing where we want to go. + +00:40:31.080 --> 00:40:32.480 +Because, you know, we're starting fresh. + +00:40:33.340 --> 00:40:36.160 +So we better be sure that we are going into a direction + +00:40:37.020 --> 00:40:40.320 +where we actually want to go for the next 10 to 20 to 30 years. + +00:40:41.000 --> 00:40:41.440 +Depends. + +00:40:42.100 --> 00:40:44.620 +We are really in for this for the long game. 
+

00:40:45.680 --> 00:40:47.400
So the 10 years that I've been doing this,

00:40:48.000 --> 00:40:50.000
I see that this is only the start.

00:40:50.660 --> 00:40:52.460
And we wrote a lot of things from scratch.

00:40:52.920 --> 00:40:55.040
So the runtime, as I mentioned,

00:40:55.160 --> 00:40:56.520
that's like the heart of Zensical.

00:40:57.620 --> 00:41:00.140
It already has something like 15,000 lines of code.

00:41:01.040 --> 00:41:03.800
A tiny HTTP middleware framework for file serving,

00:41:03.970 --> 00:41:05.060
because we don't want to,

00:41:05.180 --> 00:41:07.220
so we also want to make the file server extensible

00:41:07.540 --> 00:41:11.220
and don't want to force users into async Rust,

00:41:11.240 --> 00:41:13.720
and also don't have a dependency on Tokio.

00:41:15.020 --> 00:41:17.360
And also, like, a monorepo management tool

00:41:17.390 --> 00:41:19.460
for Rust and JavaScript that we also open sourced,

00:41:19.620 --> 00:41:22.000
which, I'm not sure if you've worked with monorepos,

00:41:22.300 --> 00:41:24.120
but in JavaScript, for instance,

00:41:24.680 --> 00:41:26.780
there's Lerna, and it has 800 dependencies.

00:41:26.970 --> 00:41:27.760
So when you install it,

00:41:28.720 --> 00:41:30.440
what you pull down is just insane.

00:41:31.520 --> 00:41:33.860
So we worked a lot on the processes as well,

00:41:34.140 --> 00:41:36.080
so that we can make releases very easy

00:41:36.420 --> 00:41:39.200
and that we have a good way of working, basically.

00:41:39.380 --> 00:41:42.320
And we're very careful about our choice of dependencies.

00:41:42.700 --> 00:41:44.580
So if it's not something that,

00:41:44.730 --> 00:41:46.560
so let me put it another way.
+

00:41:46.560 --> 00:41:49.160
If it's something that you can write quite quickly, actually,

00:41:49.440 --> 00:41:52.920
and we'd rather own it in order to make changes ourselves,

00:41:53.330 --> 00:41:55.340
we'd rather write it from scratch.

00:41:56.220 --> 00:41:56.320
Yeah.

00:41:56.320 --> 00:41:58.400
I think that's a very healthy philosophy.

00:41:59.180 --> 00:42:03.720
And also, I think in this agentic AI world that we're in these days,

00:42:04.330 --> 00:42:07.580
if you just need one or two functions and you used to think,

00:42:07.740 --> 00:42:09.240
well, maybe I'll lean on this,

00:42:09.320 --> 00:42:12.780
in your case a crate, or maybe a PyPI package or something.

00:42:14.150 --> 00:42:15.320
But if it's just one or two functions,

00:42:15.600 --> 00:42:18.500
maybe you really can just write it yourself without much effort.

00:42:18.800 --> 00:42:21.540
And it just saves you so much trouble.

00:42:22.760 --> 00:42:27.280
I started using pip-audit for a lot of my projects.

00:42:29.000 --> 00:42:31.720
And I would say for my bigger projects,

00:42:31.840 --> 00:42:38.980
every two weeks I get at least one CVE vulnerability notification

00:42:39.480 --> 00:42:40.380
for something I'm using.

00:42:40.660 --> 00:42:43.280
I'm like, but here's the thing.

00:42:43.560 --> 00:42:45.800
It's in a situation where it's

00:42:46.420 --> 00:42:48.340
probably a piece of code or functionality of that package

00:42:48.600 --> 00:42:50.080
that I don't even use or care about.
+

00:42:50.750 --> 00:42:52.000
So it doesn't really apply to me,

00:42:52.410 --> 00:42:53.620
but then I've got all these, like,

00:42:53.780 --> 00:42:56.420
here's a latent issue that is in my code

00:42:57.270 --> 00:42:59.060
that I'm going to have to figure out and deal with.

00:42:59.380 --> 00:43:02.380
But it's because I've taken in so much as part of this package,

00:43:02.720 --> 00:43:04.360
where if I had just written the one or two functions,

00:43:04.740 --> 00:43:06.200
then it'd be fine.

00:43:06.440 --> 00:43:06.900
You know what I mean?

00:43:07.980 --> 00:43:08.340
Absolutely.

00:43:09.540 --> 00:43:11.500
I think things are swinging back a little bit from, like,

00:43:11.660 --> 00:43:14.880
let's just pull in everything because it's going to help us, to, like,

00:43:15.040 --> 00:43:16.340
well, maybe not everything.

00:43:17.160 --> 00:43:17.340
Yeah.

00:43:17.950 --> 00:43:23.640
And also, you can't just change things easily,

00:43:23.860 --> 00:43:26.300
and you depend on other APIs.

00:43:26.580 --> 00:43:34.180
So, for instance, one of the reasons why we chose to build a lot of things from scratch is that we want to control the public API.

00:43:34.540 --> 00:43:42.920
So, the worst thing for us would probably just be to export a third-party API that we're using as part of our public interface, because it's Rust.

00:43:43.240 --> 00:43:48.280
So, it would mean that if this public API changed, the entire ecosystem would break.

00:43:48.440 --> 00:43:53.320
So we're very careful about what APIs we expose,

00:43:54.240 --> 00:43:57.560
and rather wrap it in order to be safe,

00:43:57.590 --> 00:44:00.340
so we can replace things, keep things replaceable.
+

00:44:02.600 --> 00:44:03.940
So maybe you have the philosophy of,

00:44:04.000 --> 00:44:05.860
it might be okay to use this crate,

00:44:06.300 --> 00:44:09.660
but we don't exchange its types as the public,

00:44:10.400 --> 00:44:13.080
as part of our API, or something along those lines.

00:44:13.530 --> 00:44:14.580
Yeah, we don't expose it.

00:44:14.920 --> 00:44:24.600
So we, in some instances, the wrappers that I've written are identical to the types that we use from another crate.

00:44:24.740 --> 00:44:30.940
But by using our own types or just wrapping them, because in Rust, the nice benefit is you have zero-cost abstractions.

00:44:31.280 --> 00:44:33.200
So all the code is monomorphized and inlined.

00:44:33.200 --> 00:44:36.280
So you don't pay for wrapping code.

00:44:36.880 --> 00:44:38.520
That's the absolutely crazy thing.

00:44:38.580 --> 00:44:43.240
So you can finally create a really clean architecture

00:44:43.580 --> 00:44:46.160
without runtime penalties if you do it right.

00:44:46.460 --> 00:44:47.240
Oh, that's wild.

00:44:47.300 --> 00:44:48.040
Yeah, yeah, yeah.

00:44:48.040 --> 00:44:48.540
Very interesting.

00:44:50.180 --> 00:44:52.060
So you can see I have this huge list of topics.

00:44:52.380 --> 00:44:56.920
We're basically just barely cracking the surface.

00:44:57.320 --> 00:44:59.880
But I'd like to go back to this component.

00:45:03.619 --> 00:45:04.940
Wrong search there.
+

00:45:06.240 --> 00:45:07.840
You have a component

00:45:08.790 --> 00:45:09.660
that was in the other part.

00:45:09.800 --> 00:45:12.000
Let's just talk down and talk through some of these,

00:45:12.640 --> 00:45:14.000
these things here. So you've got, like,

00:45:15.760 --> 00:45:16.200
admonitions,

00:45:16.830 --> 00:45:17.940
buttons, code blocks.

00:45:18.140 --> 00:45:20.180
Let's talk through some of the building blocks, I guess, that

00:45:20.320 --> 00:45:21.560
you think are interesting here.

00:45:22.800 --> 00:45:24.040
Yeah, so I think

00:45:24.260 --> 00:45:24.780
most of the,

00:45:25.400 --> 00:45:28.220
if you're not new to technical writing,

00:45:28.960 --> 00:45:30.100
most of the stuff shouldn't be

00:45:31.320 --> 00:45:32.100
quite new. So, like,

00:45:32.260 --> 00:45:34.240
admonitions, code blocks, stuff like that

00:45:34.300 --> 00:45:40.160
you've probably seen. Or data tables. Uh, diagrams are just Mermaid diagrams, uh, as they are, as

00:45:40.160 --> 00:45:46.460
you can use them on GitHub. Uh, one of the, so, like, the flagship features in Material,

00:45:47.120 --> 00:45:53.100
and now Zensical, um, as I mentioned, like code annotations, which is a part of code blocks. Um,

00:45:54.040 --> 00:46:00.460
otherwise we also have an icon and emoji integration. So, um, you can use one of, I think

00:46:00.340 --> 00:46:06.140
we have something like over 10,000 icons now, with a quite simple syntax. That's not standard Markdown.

00:46:06.380 --> 00:46:11.440
That's the problem. So that's like a Python-Markdown extension. And we're working on moving this over

00:46:11.620 --> 00:46:20.020
to CommonMark and finding a way to migrate this over.
Because, you know, right now Zensical

00:46:20.480 --> 00:46:25.360
uses Python-Markdown for compatibility with Material for MkDocs, which means that for Markdown

00:46:25.380 --> 00:46:27.220
rendering, we need to go through Python.

00:46:28.180 --> 00:46:30.220
And this is a temporary limitation that we have,

00:46:30.920 --> 00:46:34.900
because, as I mentioned, we are focusing really hard

00:46:34.940 --> 00:46:35.500
on compatibility.

00:46:37.240 --> 00:46:39.360
And all of those components will also, of course,

00:46:39.380 --> 00:46:41.520
be available within our CommonMark solution

00:46:41.680 --> 00:46:44.680
that we're working on, that we will ship later this year.

00:46:46.220 --> 00:46:46.380
Yeah.

00:46:47.440 --> 00:46:49.400
But right now, of course, you can use them

00:46:49.760 --> 00:46:51.440
as they're mentioned on our documentation.

00:46:51.860 --> 00:46:53.660
And we will, of course, provide automated tooling

00:46:53.880 --> 00:46:55.240
to get them over to CommonMark.

00:46:56.859 --> 00:47:04.980
Yeah, I guess it's interesting that you've got to not just consider the API and the syntax

00:47:05.170 --> 00:47:12.100
and stuff, but maybe even the same parsing engine, to have this strong compatibility, right? Yeah, we

00:47:12.100 --> 00:47:17.120
can even read mkdocs.yml configuration, so you can build an MkDocs project with Zensical as it stands.

00:47:17.800 --> 00:47:23.640
The thing that we currently don't support in its entirety is the plugins from the ecosystem. We,

00:47:24.680 --> 00:47:26.300
we already support some plugins.

00:47:26.870 --> 00:47:28.580
For instance, the mkdocstrings plugin.

00:47:29.010 --> 00:47:31.740
The author is also part of the Zensical team now,

00:47:31.980 --> 00:47:34.380
with mkdocstrings being the second biggest project

00:47:34.550 --> 00:47:36.160
in the MkDocs space.
+

00:47:36.250 --> 00:47:38.140
So we're very happy to have Tim on board.

00:47:40.060 --> 00:47:41.420
And several other plugins.

00:47:41.780 --> 00:47:44.920
But as I mentioned, so Zensical uses modules.

00:47:45.170 --> 00:47:46.780
So what we will do in the end is,

00:47:47.100 --> 00:47:50.440
we will still always be able to read MkDocs configuration

00:47:50.780 --> 00:47:52.359
and map the plugin configurations

00:47:52.380 --> 00:47:54.120
to equivalent Zensical modules.

00:47:55.140 --> 00:47:58.180
So the logic will be completely rewritten,

00:47:58.430 --> 00:48:00.520
but you will be able to migrate your project

00:48:00.840 --> 00:48:01.720
with a command.

00:48:02.880 --> 00:48:04.140
That's our goal.

00:48:04.140 --> 00:48:08.400
Because so much work has gone

00:48:08.490 --> 00:48:11.660
into projects built with Material and MkDocs.

00:48:11.760 --> 00:48:14.700
So we need to make it easy for users

00:48:14.960 --> 00:48:16.140
and organizations to switch.

00:48:16.500 --> 00:48:22.120
And this is the main part we're working on in 2026.
+

00:48:22.600 --> 00:48:30.760
I think this is, it's critical, right? If you, yeah, your absolute best users, you know, like that

00:48:30.800 --> 00:48:36.700
big company, but many others, of course, they're not going to rewrite everything. Well, maybe they will,

00:48:36.820 --> 00:48:41.820
but many of them won't rewrite everything. They'll just use an old version and grin and bear it as

00:48:41.820 --> 00:48:48.760
long as they have to. You know what I mean? Like, this idea of doing it from scratch, but if

00:48:48.700 --> 00:48:54.220
you provide a path for them that's very easy, then all of a sudden they get this way better experience,

00:48:54.640 --> 00:48:58.940
right? I can only imagine, you know, the build speed helping out the bigger projects the most.

00:48:59.960 --> 00:49:03.760
Yeah, and the compatibility part is one of the hardest engineering parts, actually.

00:49:04.240 --> 00:49:07.980
So you have to think about that, you know, because we don't want to fade ourselves,

00:49:08.420 --> 00:49:14.440
paint ourselves into a corner. So we need to think about, where do we want to go, but how can we

00:49:14.460 --> 00:49:22.560
go there faster right now without making sacrifices, in a way that we can't in the end replace things?

00:49:22.820 --> 00:49:28.400
And we have a pretty elaborate plan how to do all of this. And yeah, so we're working very hard on it

00:49:28.720 --> 00:49:33.320
to make it so. Right now, you can just use Material. Of course, you can keep using it.

00:49:33.820 --> 00:49:38.020
Or, if your site is already built in Zensical, you will have better speed and the modern design and the

00:49:37.920 --> 00:49:44.160
better search. So the search has been completely rewritten from Material to, um, to Zensical. It's

00:49:44.290 --> 00:49:50.520
also, it's currently integrated, it's integrated with Zensical, and we will open source it as a

00:49:50.640 --> 00:49:56.360
dedicated open source project.
It's called Disco, so you will also be able to use the search

00:49:56.360 --> 00:50:01.559
in other projects. And, um, just as a number, to get a feel for it, it's 20 times faster than

00:50:01.580 --> 00:50:03.060
the search in Material for MkDocs.

00:50:03.360 --> 00:50:03.820
Wow.

00:50:03.880 --> 00:50:05.580
So it's a ground-up rewrite.

00:50:06.420 --> 00:50:08.700
And we actually started working on the search

00:50:08.820 --> 00:50:10.320
before we started working on Zensical.

00:50:10.980 --> 00:50:13.580
Yeah, I noticed how nice the search was

00:50:14.400 --> 00:50:15.400
when I was playing with it.

00:50:17.500 --> 00:50:18.620
We're in--

00:50:19.599 --> 00:50:23.760
So is zensical.org itself built in Zensical?

00:50:24.460 --> 00:50:25.080
Yeah, of course.

00:50:25.880 --> 00:50:27.820
And it's actually built with an mkdocs.yml,

00:50:27.940 --> 00:50:28.840
because we're dogfooding.

00:50:28.940 --> 00:50:32.840
So you can also build it with MkDocs,

00:50:34.340 --> 00:50:36.060
with Material for MkDocs.

00:50:36.140 --> 00:50:37.460
The project layout is exactly the same.

00:50:38.120 --> 00:50:38.380
- Yeah.

00:50:39.020 --> 00:50:43.520
You know, I find that there's just a bunch of static sites

00:50:43.860 --> 00:50:46.080
that seem to have, I don't know what's going on with them,

00:50:46.120 --> 00:50:47.780
but their search is really bad.

00:50:48.540 --> 00:50:50.560
You know, either they've just integrated

00:50:50.760 --> 00:50:53.600
some kind of Google thing where it says site colon

00:50:53.760 --> 00:50:55.520
and they use your URL and then the search,

00:50:55.700 --> 00:50:56.860
which is a real bad experience,

00:50:57.540 --> 00:51:02.360
or you go search and it sits there and it spins and it spins, and then eventually it pulls up.
+

00:51:03.380 --> 00:51:07.900
So it looks like you are pre-computing these types of things or something with your search engine,

00:51:07.970 --> 00:51:10.500
or you've got some cool data structure to make that fast, right?

00:51:11.480 --> 00:51:14.060
Well, it's not one cool data structure.

00:51:14.170 --> 00:51:16.620
That would be great, because then everybody could just use it.

00:51:16.860 --> 00:51:17.740
But no.

00:51:17.980 --> 00:51:19.160
A series of algorithms.

00:51:19.680 --> 00:51:24.320
Several months of work went into the search.

00:51:24.780 --> 00:51:25.180
Of course.

00:51:26.840 --> 00:51:29.940
So it's a project of its own, as I mentioned.

00:51:30.160 --> 00:51:31.600
It's also completely modular.

00:51:32.110 --> 00:51:37.120
And the reason why most of the search engines that are out there, that are open source,

00:51:37.360 --> 00:51:41.260
so like the libraries that you can use, not services you have to pay for,

00:51:41.880 --> 00:51:45.280
that they don't provide results that are really relevant for us,

00:51:45.800 --> 00:51:52.300
is that they use BM25, which is like the standard bag-of-words ranking algorithm

00:51:52.680 --> 00:51:54.880
for information retrieval.
+

00:51:55.320 --> 00:52:00.000
And this doesn't nicely pair with autocomplete. So what you get is, you start typing and you get a

00:52:00.080 --> 00:52:05.940
lot of dancing results. And also, if you add further documents to your index,

00:52:06.400 --> 00:52:10.800
the balancing will be off, because the relevance is computed based on the occurrence

00:52:11.020 --> 00:52:16.260
of a word in the entire corpus. So you add a new document, and those weights change again.

00:52:16.870 --> 00:52:21.939
The search that we have, we of course, as a baseline, also have a BM25 implementation,

00:52:22.480 --> 00:52:26.540
but the implementation you're seeing is a tie-breaking implementation, which provides

00:52:26.880 --> 00:52:32.680
much, much better accuracy, and you can configure it. So tie-breaking means, okay, we first look into

00:52:32.940 --> 00:52:39.280
the title of the document and see if we have matches, then how many matches, then where

00:52:39.290 --> 00:52:44.820
they are. Then we look into the navigation, in the path, and then in the body of the document, and so

00:52:44.920 --> 00:52:50.000
on. All of this is configurable, and this is also why we believe that Disco alone will also be a

00:52:49.940 --> 00:52:51.980
very interesting project for other,

00:52:52.440 --> 00:52:54.180
for instance, static site generators to integrate.

00:52:55.500 --> 00:52:57.340
And you asked about, like, pre-computing.

00:52:57.520 --> 00:53:01.820
So no, this is a search from the documents.

00:53:01.960 --> 00:53:02.860
We build a search index,

00:53:03.280 --> 00:53:05.040
which is a stripped-down version of the HTML

00:53:05.640 --> 00:53:09.020
that is rendered when you load the page.

00:53:10.440 --> 00:53:11.920
It's one JSON that we ship to the client.

00:53:12.340 --> 00:53:15.480
And for most pages, actually, this JSON is below one megabyte.
+

00:53:15.480 --> 00:53:17.920
You can gzip it, so compress it,

00:53:18.680 --> 00:53:24.460
then it's something like 200K, and you have extremely fast search on the client with no cost.

00:53:25.260 --> 00:53:33.400
And so we believe that for 90, 95, maybe 99% of documentation sites, or sites in general,

00:53:33.550 --> 00:53:41.560
this client-side search is basically the way to go, because it's fast and it doesn't require you to pay for anything.

00:53:41.590 --> 00:53:48.460
And there are several SaaS-based services that can be extremely expensive when you do the math.

00:53:49.280 --> 00:53:50.900
So, yeah,

00:53:51.580 --> 00:53:53.220
you only need to use a server,

00:53:53.400 --> 00:53:54.140
basically, when you,

00:53:55.040 --> 00:53:57.140
when the index becomes too big to ship to the client.

00:53:57.720 --> 00:53:58.960
And we're also working on that, by the way.

00:53:59.320 --> 00:54:00.620
Okay, that's really cool.

00:54:02.060 --> 00:54:03.320
You could shard

00:54:03.380 --> 00:54:04.980
the index or something like that, right?

00:54:04.980 --> 00:54:06.520
I suppose, like, you could say

00:54:07.280 --> 00:54:08.820
we're going to have 26

00:54:09.660 --> 00:54:11.080
index bits, and only if

00:54:11.080 --> 00:54:12.540
the word starts with an A

00:54:13.000 --> 00:54:14.840
do you pull that piece down or something.

00:54:15.120 --> 00:54:15.480
But yeah,

00:54:16.760 --> 00:54:17.719
a lot of cool aspects.

00:54:17.740 --> 00:54:24.340
Yeah, it's not that simple. Uh, but there are also some other interesting

00:54:24.800 --> 00:54:28.760
solutions. Like, Pagefind is a pretty interesting library. It does a completely different approach,

00:54:29.560 --> 00:54:38.000
but, um, it's not as snappy as, um, the search that we ship to the client. Yeah, I use Pagefind

00:54:38.000 --> 00:54:43.819
for my personal website, which is a static site. Yeah, it's also a
great, great solution, but, um,

00:54:43.840 --> 00:54:47.480
some things you won't be able to implement in Pagefind properly.

00:54:48.260 --> 00:54:52.100
Because, so, you know, with software, it's trade-offs all the way.

00:54:52.860 --> 00:54:56.240
Well, I'm already thinking, like, I better pay attention to Disco when it comes out,

00:54:56.460 --> 00:54:59.440
so maybe adopt it for some stuff.

00:55:00.700 --> 00:55:00.960
Beautiful.

00:55:01.150 --> 00:55:01.260
Okay.

00:55:02.120 --> 00:55:08.100
We got a couple interesting questions sort of following up from the component side of things.

00:55:09.260 --> 00:55:13.260
Jamstack says, do you foresee community-led templates or themes for Zensical?

00:55:14.420 --> 00:55:18.460
I know you have, like, two themes that I see, something along those lines, a couple of themes

00:55:18.860 --> 00:55:24.160
that you can choose now. But what is the theme story, I guess, I want to ask you more broadly?

00:55:25.800 --> 00:55:32.600
Yeah, so absolutely. So right now we have only this one theme. We have this variant setting where you

00:55:32.760 --> 00:55:38.519
can choose, like, the classic variant, which, when you move over from Material for MkDocs,

00:55:38.560 --> 00:55:43.960
looks exactly the same. This is also why we needed to keep the HTML as it is, also with the modern

00:55:44.100 --> 00:55:51.480
design that we provided. And the modern variant, which is the standard for Zensical. Once we move

00:55:51.580 --> 00:55:58.220
to the component system, we will make it possible to, one, use components within Markdown, and two, also

00:55:59.220 --> 00:56:04.279
create a template engine that is based on components.
This will allow us much, much faster
+
+00:56:04.300 --> 00:56:11.020
+rendering because for instance if you render the header uh for a site it's a lot of HTML because
+
+00:56:11.140 --> 00:56:15.880
+you know there's the search box in it and some other stuff but only the title changes so we will
+
+00:56:15.960 --> 00:56:21.320
+also make the rendering differential as part of the build that's the plan and with this uh
+
+00:56:21.440 --> 00:56:27.240
+we will also make it open um to theme developers of course so there will be um the like packaging
+
+00:56:27.600 --> 00:56:34.260
+um for instance compilation of Sass um styles or TypeScript or so will be part of Zensical
+
+00:56:34.300 --> 00:56:40.040
+So you don't need to pre-compile the theme like we needed to do for the last 10 years for Material.
+
+00:56:41.360 --> 00:56:42.860
+So it will have a proper asset pipeline.
+
+00:56:42.990 --> 00:56:45.320
+It will have a proper process to install themes.
+
+00:56:45.860 --> 00:56:46.540
+All of this is planned.
+
+00:56:46.760 --> 00:56:54.300
+But right now we focus on feature parity in order to make it possible for more users to migrate right now.
+
+00:56:55.180 --> 00:57:02.200
+That's really interesting that you would deliver the theme as basically its original source, not its rendered.
+
+00:57:04.420 --> 00:57:10.600
+you know compiled or transpiled version right um to keep it I guess it's a part of this integral
+
+00:57:10.760 --> 00:57:18.040
+build step right yes exactly because um we had a lot of requests for something like hey can
+
+00:57:18.040 --> 00:57:24.280
+we change the media queries a little bit because the sidebar disappears too early for my
+
+00:57:24.480 --> 00:57:30.260
+taste and um for this you have to go through the compilation step again and basically
+
+00:57:30.940 --> 00:57:32.980
+fork the theme and recompile it. 
+
+00:57:33.640 --> 00:57:34.920
+We want to make this configurable
+
+00:57:35.120 --> 00:57:38.140
+so that you can use, yeah,
+
+00:57:38.400 --> 00:57:40.680
+so, you know, configure the theme
+
+00:57:41.420 --> 00:57:44.320
+and build it and it just works.
+
+00:57:44.650 --> 00:57:46.420
+So this like, you know, it just works.
+
+00:57:46.700 --> 00:57:48.220
+That's like the thing we're working towards.
+
+00:57:48.900 --> 00:57:50.020
+Make it as simple as possible.
+
+00:57:50.580 --> 00:57:52.280
+Yeah. Yeah, very cool.
+
+00:57:53.580 --> 00:57:55.700
+Let's maybe, I'm getting short on time here,
+
+00:57:55.820 --> 00:57:57.540
+maybe wrap up our chat
+
+00:57:58.180 --> 00:58:04.720
+talking about two things, the future, where you go and you talked about
+
+00:58:06.320 --> 00:58:11.640
+compatibility being a big part of things going forward in 2026, but also
+
+00:58:12.180 --> 00:58:17.960
+sustainability, right? You had all these great supporters for Material for Mk
+
+00:58:18.180 --> 00:58:22.420
+Docs, which you must have just been absolutely thrilled to realize how
+
+00:58:22.820 --> 00:58:26.820
+successful that was, right? I mean, going from the wall, put up a wish list
+
+00:58:26.840 --> 00:58:28.780
+and then actually people love this.
+
+00:58:28.810 --> 00:58:30.620
+I can put all my energy into it.
+
+00:58:30.700 --> 00:58:32.460
+I mean, I know how great of a feeling that is, right?
+
+00:58:33.500 --> 00:58:34.260
+- That's completely insane.
+
+00:58:34.580 --> 00:58:37.420
+And when I started it, I would never believe
+
+00:58:37.520 --> 00:58:40.000
+that this would be my job at some point.
+
+00:58:40.660 --> 00:58:42.040
+- Yeah, I feel the same way about the podcast
+
+00:58:42.780 --> 00:58:44.300
+and it's just, I'm so grateful for it.
+
+00:58:44.340 --> 00:58:44.700
+It's amazing.
+
+00:58:44.870 --> 00:58:46.000
+- Yeah, I can imagine. 
+ +00:58:47.979 --> 00:58:50.260 +- Yeah, but then with this transition to Zensical, + +00:58:52.160 --> 00:58:52.980 +how does that change? + +00:58:53.180 --> 00:58:55.540 +Does that change anything or what's the story? + +00:58:56.500 --> 00:58:58.600 +How do you bring that support over to Zensical? + +00:58:59.300 --> 00:59:00.440 +As we don't have a lot of time, + +00:59:00.490 --> 00:59:03.380 +I try to explain it as compact as possible. + +00:59:04.290 --> 00:59:08.420 +So we are saying goodbye to this pay for features, + +00:59:08.610 --> 00:59:09.400 +pay for extra features. + +00:59:09.600 --> 00:59:13.180 +So in Material, you needed to be a sponsor + +00:59:13.250 --> 00:59:15.340 +in order to get the latest features earlier. + +00:59:15.640 --> 00:59:17.920 +What we will do is everything is open source from the start. + +00:59:18.160 --> 00:59:20.260 +So for users, it's completely free. + +00:59:21.060 --> 00:59:24.740 +And we are shifting our model from the sponsorships + +00:59:24.760 --> 00:59:26.800 +to something we call Zensical Spark, + +00:59:27.440 --> 00:59:29.980 +because what we discovered talking a lot + +00:59:29.980 --> 00:59:32.480 +to our professional users is that the more we know + +00:59:32.760 --> 00:59:35.380 +about the problem space and the better we understand + +00:59:35.400 --> 00:59:37.940 +the problem space and the more we can collaborate with them, + +00:59:38.500 --> 00:59:41.480 +the more we can, the better degrees of freedom + +00:59:41.540 --> 00:59:42.040 +we can provide. 
+
+00:59:42.420 --> 00:59:45.520
+So we don't intend to just ship feature, feature, feature,
+
+00:59:46.020 --> 00:59:48.660
+but we intend to create degrees of freedom
+
+00:59:48.940 --> 00:59:52.040
+so that you can adapt Zensical to the processes
+
+00:59:52.320 --> 00:59:54.020
+within your organization, how they work,
+
+00:59:54.160 --> 01:00:00.260
+the workflows etc. which are all different which is all very diverse basically so Spark is a space
+
+01:00:00.420 --> 01:00:06.480
+where you as a company can basically get a seat um and together with us shape Zensical as
+
+01:00:06.480 --> 01:00:11.700
+part of high level discussions where we explore the problem space we create proposals so
+
+01:00:11.840 --> 01:00:15.940
+on the website you have clicked on the Spark section um there's this ZAPs in progress we call
+
+01:00:16.040 --> 01:00:22.779
+them ZAPs Zensical Advancement Proposals it's on the left side um we write um very elaborate
+
+01:00:22.800 --> 01:00:28.620
+detailed proposals on specific topics that we intend to work on and then um with the feedback
+
+01:00:28.800 --> 01:00:34.560
+that we get iterate on them and um create like the ideal authoring experience
+
+01:00:34.960 --> 01:00:41.140
+that um caters to the most cases possible um because we want to build Zensical and as I
+
+01:00:41.200 --> 01:00:45.420
+mentioned like for the very long term and not as just a solution that is opinionated but that is
+
+01:00:45.460 --> 01:00:50.860
+as unopinionated as possible and the third thing that you get besides like those uh the
+
+01:00:50.840 --> 01:00:56.940
+opportunity to have like high-level discussions with us and create the proposals with us is of
+
+01:00:56.960 --> 01:01:00.720
+course professional support so this is also something we've been asked for quite a lot by
+
+01:01:01.460 --> 01:01:08.760
+companies um so in Spark you um yeah you can basically um get um our time you can we will
+
+01:01:08.940 --> 01:01:15.040
+you can get direct access to the team and uh also we have like those open video calls uh where we
+
+01:01:15.220 --> 01:01:20.040
+share our progress and where you can get a window of support and we talk about any problem that is
+
+01:01:20.220 --> 01:01:26.800
+keeping you up at night basically and stuff like migrations or how do you do this and this in
+
+01:01:26.940 --> 01:01:32.380
+Zensical and yeah it's been a blast so we're really happy that the organizations are
+
+01:01:32.500 --> 01:01:38.980
+enrolling into this new model and we think it could also be a model that might translate quite
+
+01:01:39.030 --> 01:01:43.260
+well to other projects because you get a huge competitive advantage you know exactly what to
+
+01:01:43.220 --> 01:01:43.380
+build.
+
+01:01:44.720 --> 01:01:47.520
+Yeah, you're talking to the actual users.
+
+01:01:48.460 --> 01:01:51.060
+They're saying, this is the thing that really is hard for us.
+
+01:01:51.400 --> 01:01:54.180
+Or you just get, maybe they don't say it, but you see it, right?
+
+01:01:54.960 --> 01:01:55.820
+Exactly, yes.
+
+01:01:56.280 --> 01:01:58.280
+And talking to the users is the best thing you can do.
+
+01:01:58.400 --> 01:02:02.880
+So what we learned from those, from the many times we talked to them, is always something
+
+01:02:03.020 --> 01:02:05.780
+like, wow, we never would have come up with this.
+
+01:02:07.400 --> 01:02:08.200
+Yeah, incredible.
+
+01:02:11.320 --> 01:02:14.520
+So congratulations on the success for Material for MkDocs
+
+01:02:14.640 --> 01:02:16.100
+and then this new project.
+
+01:02:16.360 --> 01:02:17.880
+I'm very excited to see it coming along.
+
+01:02:18.260 --> 01:02:20.340
+And it looks like it's going to be great.
+
+01:02:21.580 --> 01:02:23.180
+Maybe a final call to action for people.
+
+01:02:23.580 --> 01:02:26.080
+Like, can they go ahead and start using Zensical? 
+
+01:02:26.410 --> 01:02:27.700
+If they're interested, what do they do?
+
+01:02:27.920 --> 01:02:28.320
+So on.
+
+01:02:30.800 --> 01:02:31.200
+Yeah, of course.
+
+01:02:31.440 --> 01:02:35.520
+So we mentioned Material for MkDocs a lot.
+
+01:02:35.530 --> 01:02:39.360
+And this is because we are coming from this direction.
+
+01:02:39.600 --> 01:02:44.360
+So it means if you have a Material for MkDocs project, you should definitely try out Zensical and see if you can build your project.
+
+01:02:44.580 --> 01:02:48.280
+But if you haven't used it, you can also just jumpstart a new project.
+
+01:02:48.640 --> 01:02:50.460
+It has a lot of built-in functionality already.
+
+01:02:51.020 --> 01:02:59.220
+You get like all of these components that we talked about, free search that you don't have to host, a very modern static site that is great on mobile.
+
+01:02:59.680 --> 01:03:00.960
+So just give it a try.
+
+01:03:01.760 --> 01:03:06.340
+And we have a newsletter where we once a month share the latest updates.
+
+01:03:07.100 --> 01:03:10.020
+And that might also be worth checking out.
+
+01:03:10.740 --> 01:03:10.820
+Yeah.
+
+01:03:11.010 --> 01:03:15.820
+And otherwise, we'd be happy to see you, to get any feedback.
+
+01:03:16.460 --> 01:03:21.360
+By the way, we also have a public Discord, a community Discord, which is growing very well.
+
+01:03:21.690 --> 01:03:24.560
+So if you have any problems or so, then you will get help there.
+
+01:03:25.300 --> 01:03:25.380
+Yeah.
+
+01:03:26.440 --> 01:03:34.560
+Would be great to see as many users as possible, of course, and shape the future of Zensical together with all of you.
+
+01:03:35.460 --> 01:03:35.620
+Yeah.
+
+01:03:36.540 --> 01:03:36.920
+Fantastic. 
+ +01:03:37.560 --> 01:03:38.560 +Martin, thanks for coming on the show + +01:03:39.460 --> 01:03:40.080 +congrats on the project + +01:03:41.180 --> 01:03:42.760 +thanks for the invitation and + +01:03:43.600 --> 01:03:45.240 +happy anytime to come back + +01:03:45.800 --> 01:03:46.940 +yeah, sounds good + +01:03:47.020 --> 01:03:47.460 +Yeah. + diff --git a/youtube_transcripts/543-langchain-deep-agents.vtt b/youtube_transcripts/543-langchain-deep-agents.vtt new file mode 100644 index 0000000..fef1939 --- /dev/null +++ b/youtube_transcripts/543-langchain-deep-agents.vtt @@ -0,0 +1,2942 @@ +WEBVTT + +00:00:01.520 --> 00:00:04.760 +Sydney, welcome back to Talk Python To Me. Awesome to have you here. + +00:00:05.390 --> 00:00:07.080 +Thanks. Yeah, super excited to be back. + +00:00:07.980 --> 00:00:15.240 +I am super excited to have you here. We're going to be talking about almost the topic du jour, + +00:00:15.830 --> 00:00:22.280 +the AI, but not in the way that people might think. Not using AI to build code, although what we're + +00:00:22.360 --> 00:00:29.400 +talking about could be used for that, and so on. But actually, how do you build your own AI tools? + +00:00:29.580 --> 00:00:35.860 +How do you build your own cloud code equivalent if you wanted to kind of have a lot more control over that? + +00:00:36.800 --> 00:00:38.380 +So I'm really excited about it. + +00:00:38.380 --> 00:00:42.460 +I think it's pretty eye-opening and great tools. + +00:00:42.620 --> 00:00:46.280 +Last time we were on, I think we talked about LangGraph. + +00:00:46.620 --> 00:00:47.060 +Was that right? + +00:00:47.600 --> 00:00:48.940 +Yeah, yeah, I think so. + +00:00:50.180 --> 00:00:56.320 +And now, carrying on with more Lang things from LangChain, we're going to talk about deep agents. + +00:00:56.640 --> 00:00:58.920 +So super cool topic. 
+
+00:00:59.630 --> 00:01:05.660
+I think people who feel like this is mysterious or, you know, you, you sign up for some frontier
+
+00:01:05.950 --> 00:01:07.060
+model and it does all the magic.
+
+00:01:07.310 --> 00:01:12.600
+Well, we're going to dig into how you might, you know, how that magic works and how you
+
+00:01:12.670 --> 00:01:15.880
+might build your own as well with some really cool tools here.
+
+00:01:16.280 --> 00:01:20.100
+Now it has been a little while since you've been on, I think you've been on three times,
+
+00:01:20.240 --> 00:01:23.000
+which is amazing, but here's number four.
+
+00:01:24.000 --> 00:01:28.660
+There's a ton of new people listening to the show or coming into the Python space in general.
+
+00:01:28.740 --> 00:01:36.880
+I mean, it's amazing to me that 50% of the people doing Python are new to it professionally the last two years.
+
+00:01:38.620 --> 00:01:39.300
+I guess it makes sense.
+
+00:01:39.500 --> 00:01:43.640
+Anyway, quick introduction about who you are for everyone who doesn't know already.
+
+00:01:44.320 --> 00:01:44.900
+Yeah, sure thing.
+
+00:01:46.440 --> 00:01:51.880
+Well, very excited to get to share all of our new Deep Agents stuff with folks.
+
+00:01:52.180 --> 00:01:52.700
+My name is Sydney.
+
+00:01:53.080 --> 00:01:57.860
+I currently work at LangChain, which might sound familiar.
+
+00:01:57.920 --> 00:02:02.240
+It started as an open source package helping folks use AI.
+
+00:02:02.460 --> 00:02:10.300
+Basically, as soon as LLMs started to blow up, LangChain emerged as a toolkit for building with LLMs in Python.
+
+00:02:11.680 --> 00:02:14.620
+And then it has since evolved into a company.
+
+00:02:14.900 --> 00:02:20.520
+So we offer observability and evals products for agents.
+
+00:02:20.860 --> 00:02:32.060
+We basically are building a platform for folks to build agents, but we still are kind of built on our open source core, which is that LangChain project and now Deep Agents. 
+ +00:02:32.280 --> 00:02:34.720 +And then I've also spoken with you about LangGraph. + +00:02:35.640 --> 00:02:39.140 +So we'll kind of talk about how all of those open source projects are related today. + +00:02:40.460 --> 00:02:48.060 +And then I guess I'll also note I've chatted with you before about other open source projects like Pydantic and Pydantic AI is where I worked previously. + +00:02:48.320 --> 00:02:51.780 +So very excited to kind of be in the open source AI space. + +00:02:52.360 --> 00:02:52.600 +Yeah. + +00:02:53.750 --> 00:02:57.160 +It's been quite the roller coaster, I think, + +00:02:57.240 --> 00:02:58.680 +you've probably been on the last couple of years. + +00:02:59.200 --> 00:03:02.260 +We talked about the young coders' blueprint for success, right? + +00:03:02.310 --> 00:03:03.420 +As you were graduating college. + +00:03:03.890 --> 00:03:08.660 +And now you spent a good stint with Pydantic.dev, which is awesome. + +00:03:08.870 --> 00:03:13.540 +And that's a big center of the open source Python world. + +00:03:13.630 --> 00:03:14.860 +And so is LangChain. + +00:03:15.080 --> 00:03:18.180 +So very exciting. + +00:03:18.420 --> 00:03:24.020 +I'm sure. Yeah, yeah. I think if I could redo the Young Coders Blueprint success now, + +00:03:24.060 --> 00:03:26.760 +it would probably look pretty different than it did when we chatted. + +00:03:27.360 --> 00:03:29.960 +I was wondering about that as well. Maybe we'll get to that later. + +00:03:30.540 --> 00:03:40.200 +Yeah. Maybe we will. So let's start by setting the stage with, let's start here. So I want to + +00:03:40.240 --> 00:03:47.720 +talk about this idea of deep agents, obviously the name of the product or the library that we're + +00:03:47.680 --> 00:03:54.040 +going to be talking about, but more high level for the moment, you know, as opposed to shallow agents. 
+
+00:03:54.600 --> 00:04:01.840
+So give us a contrast, I guess, if you will, between what is a shallow agent, as you all refer to it,
+
+00:04:02.440 --> 00:04:08.860
+and then why the term deep agents? Yeah, great question. So I think a shallow agent is sort of
+
+00:04:09.340 --> 00:04:16.539
+what the agents of maybe a year or two ago looked like. So agents are
+
+00:04:16.560 --> 00:04:23.860
+basically a model calling tools in a loop in response to some prompt. And so a shallow agent
+
+00:04:23.970 --> 00:04:30.400
+maybe does like a couple of tool calls to help an end user achieve a goal. So maybe you need help
+
+00:04:30.600 --> 00:04:37.380
+with a flight booking and your agent has powers to, you know, call flight and hotel booking tools.
+
+00:04:38.080 --> 00:04:42.660
+So that's like a relatively simple task. It's pretty easy to like judge whether or not that
+
+00:04:42.600 --> 00:04:51.240
+was successful. But deep agents have access to much more context and are able to perform
+
+00:04:52.700 --> 00:04:58.940
+much more complex tasks with kind of longer horizons. And so we're generally seeing a trend
+
+00:04:59.220 --> 00:05:04.380
+towards, you know, folks always pushing the boundaries of like how complex of tasks can
+
+00:05:04.680 --> 00:05:10.300
+agents solve? And then also like, you know, how long can they run for in a sustainable way?
+
+00:05:11.720 --> 00:05:14.420
+I think deep agents, I think they're where it's at.
+
+00:05:14.580 --> 00:05:22.280
+You know, one of the, I feel like there's this sort of split in what people feel like is possible with AI.
+
+00:05:23.600 --> 00:05:25.940
+And a lot of it comes down to this, I believe.
+
+00:05:27.800 --> 00:05:34.960
+I go to ChatGPT or Gemini or somewhere, you know, like ChatGPT.com.
+
+00:05:34.960 --> 00:05:38.900
+And I type into the text box, create me a function to do this.
+
+00:05:39.040 --> 00:05:42.140
+or I want you to solve this problem. 
+
+00:05:42.760 --> 00:05:46.120
+And all it has to work with is the text that you've typed into the text box.
+
+00:05:46.680 --> 00:05:46.780
+Yep.
+
+00:05:47.040 --> 00:05:47.120
+Right?
+
+00:05:47.610 --> 00:05:49.680
+And it's got very little to go on.
+
+00:05:49.760 --> 00:05:52.280
+I mean, depending on how much you give it as a prompt, I guess.
+
+00:05:52.460 --> 00:05:55.380
+But generally, it has very little to go on.
+
+00:05:55.810 --> 00:05:57.560
+And you get pretty good answers.
+
+00:05:57.810 --> 00:06:01.100
+I mean, to be honest, ChatGPT and things are like utter magic.
+
+00:06:01.440 --> 00:06:07.160
+But relative to Deep Agents, they don't necessarily come up with the best answers.
+
+00:06:07.560 --> 00:06:13.760
+And really, I think the essence of it is that they can't check and revalidate, right?
+
+00:06:14.020 --> 00:06:21.240
+As opposed to something like Claude Code or Codex, where it has an idea, it reads about
+
+00:06:21.240 --> 00:06:23.660
+the code, and it's, okay, well, let me try to write that.
+
+00:06:23.770 --> 00:06:29.220
+Now, let me apply some tools to see how that worked, right?
+
+00:06:29.300 --> 00:06:32.700
+Let me run Ruff against it and see if that passed, oh, Ruff, does it work?
+
+00:06:32.700 --> 00:06:33.720
+It says there's wrong code.
+
+00:06:34.030 --> 00:06:35.340
+Well, let me go back and do it again.
+
+00:06:35.780 --> 00:06:36.920
+Let me run the unit test.
+
+00:06:37.200 --> 00:06:37.980
+oh, look, they did pass.
+
+00:06:38.080 --> 00:06:39.520
+Okay, I think I'm on the right track, right?
+
+00:06:39.620 --> 00:06:43.020
+This back and forth and this kind of tool use and iteration,
+
+00:06:44.040 --> 00:06:46.360
+that is more indicative of a deep agent, would you say?
+
+00:06:47.040 --> 00:06:47.520
+Yeah, definitely.
+
+00:06:47.880 --> 00:06:52.740
+I think a deep agent has kind of much more agency
+
+00:06:53.100 --> 00:06:56.580
+than a shallow agent, if we're calling it that. 
+
+00:06:56.580 --> 00:06:56.780
+Yeah, yeah, yeah.
+
+00:06:58.120 --> 00:07:01.960
+And yeah, the more capabilities and power you give your agent,
+
+00:07:02.120 --> 00:07:04.200
+the more useful it has the potential to be.
+
+00:07:04.920 --> 00:07:07.160
+And so what we're doing in building deep agents
+
+00:07:07.180 --> 00:07:14.260
+kind of trying to build the most effective harness, trying to equip this, you know, agent builder with
+
+00:07:15.680 --> 00:07:23.100
+the best set of tools and instructions so that, yeah, it can do really challenging things. And
+
+00:07:23.180 --> 00:07:27.580
+you're kind of talking through some of the like coding agent applications that I think a lot of
+
+00:07:27.580 --> 00:07:32.900
+us are seeing kind of revolutionize our day to day workflows. Absolutely. And I'm using coding
+
+00:07:32.920 --> 00:07:36.960
+agents because I feel like that probably most significant and most strongly connects with the
+
+00:07:37.180 --> 00:07:42.740
+audience, but it doesn't have to be coding, right? It could be anything. But before we get into
+
+00:07:43.040 --> 00:07:48.200
+what that might be, I just want to circle back and say, I really think that there's, I don't know if
+
+00:07:48.200 --> 00:07:54.460
+you have a better way you individually or you as a LangChain representative, a better way to represent
+
+00:07:54.680 --> 00:08:01.120
+this because when people talk about, oh, AI makes this mistake or AI hallucinates or this or that
+
+00:08:01.140 --> 00:08:06.260
+or whatever, right? People use the same words, but they're not necessarily talking about the same
+
+00:08:06.480 --> 00:08:11.920
+thing. And then they debate whether their version of the thing that they don't really make clear
+
+00:08:12.680 --> 00:08:16.420
+is better or worse than some other thing that's not actually the same thing, right? It's kind of
+
+00:08:16.520 --> 00:08:24.040
+people are talking a bit past each other. 
Do you see a good way of this conversation being more
+
+00:08:24.280 --> 00:08:28.920
+specific, evolving, or is that where we are for a while? Yeah, that's a great question.
+
+00:08:30.200 --> 00:08:37.940
+Basically, just to clarify, is what you mean the fact that people are very concerned about like AI not being grounded in truth or like hallucinating?
+
+00:08:38.400 --> 00:08:42.640
+Yeah. Yeah. So, for example, let's say somebody says, oh, this stuff is terrible.
+
+00:08:43.180 --> 00:08:46.680
+It made up all this stuff and it gave me really shout and it was actually wrong about a fact.
+
+00:08:47.040 --> 00:08:57.540
+And what they meant is they use the free, non-logged-in version of ChatGPT with the lowest model, like instant answer versus another person who used, let's say,
+
+00:08:59.079 --> 00:09:06.900
+deep research, the top pro model, and a 500-word prompt with a couple of files to back.
+
+00:09:07.060 --> 00:09:11.020
+You know, like those people say, well, I use it and it's wrong and it's bad.
+
+00:09:11.140 --> 00:09:12.780
+And I did it and look how amazing it is.
+
+00:09:13.040 --> 00:09:14.540
+And they think they're talking about the same thing.
+
+00:09:14.540 --> 00:09:16.600
+And those are even putting agents aside.
+
+00:09:18.220 --> 00:09:20.200
+Those are really different things, right?
+
+00:09:20.240 --> 00:09:25.740
+We're debating whether we're sort of comparing those as if they're the same experience.
+
+00:09:26.400 --> 00:09:26.880
+And then judgment.
+
+00:09:27.520 --> 00:09:27.760
+Yeah.
+
+00:09:28.000 --> 00:09:33.060
+I think that's a great question. So, you know, I think there's always the baseline thing that like,
+
+00:09:33.920 --> 00:09:39.720
+you know, you should be skeptical and ask questions of your results that you get from AI tooling.
+
+00:09:40.160 --> 00:09:45.860
+That being said, the rate at which AI tooling is improving is pretty hard to believe. 
And so I think,
+
+00:09:46.360 --> 00:09:51.620
+you know, even thinking about things like citations and yeah, deep research abilities,
+
+00:09:53.400 --> 00:09:59.100
+agents and AI tools are getting pretty good at grounding things in truth and like in current
+
+00:09:59.340 --> 00:10:06.600
+truth, not just like data that they were trained on. Right. And so I generally have high confidence
+
+00:10:07.020 --> 00:10:12.720
+in the AI tools that I use with that like asterisk of like, okay, but like I do ask
+
+00:10:13.220 --> 00:10:16.120
+follow-up questions and like get them to check their work sometimes.
+
+00:10:16.480 --> 00:10:21.739
+Yeah. Yeah. Yeah. I think there's a big, there's a wide varied skill gap here and
+
+00:10:21.780 --> 00:10:23.200
+tool chain gap and so on.
+
+00:10:23.260 --> 00:10:24.700
+It's super interesting.
+
+00:10:25.360 --> 00:10:29.600
+So as a way to sort of set the stage for deep agents,
+
+00:10:31.420 --> 00:10:34.580
+would you say Claude Code is a pretty good representative of this idea?
+
+00:10:34.960 --> 00:10:36.820
+Maybe describe why if you think so.
+
+00:10:37.500 --> 00:10:38.920
+Yeah, I think so.
+
+00:10:39.620 --> 00:10:41.560
+When I think about a deep agent,
+
+00:10:41.840 --> 00:10:47.000
+I think about something that has access to an abundance of context.
+
+00:10:47.260 --> 00:10:50.040
+And so for Claude Code, that's like your file system.
+
+00:10:50.800 --> 00:10:56.400
+I think about something that's autonomous and kind of can organize complex tasks.
+
+00:10:56.810 --> 00:11:04.660
+And so that's like, you know, spinning up sub agents and keeping a to-do list handy to be able to organize all the things going on.
+
+00:11:05.700 --> 00:11:13.040
+And then I also think about, you know, an agent being really kind of optimized for the user that it's working with.
+
+00:11:13.410 --> 00:11:17.480
+And so that really ties into like memory and updating memory. 
+
+00:11:18.020 --> 00:11:20.280
+And I think Claude Code does all of those things.
+
+00:11:20.380 --> 00:11:24.340
+So I think it's a very coding-specific deep agent.
+
+00:11:24.920 --> 00:11:25.400
+Right.
+
+00:11:26.160 --> 00:11:26.520
+Very cool.
+
+00:11:29.120 --> 00:11:35.520
+So the blog post that announced deep agents at LangChain referenced this X post, which is,
+
+00:11:36.260 --> 00:11:37.320
+I think it's pretty interesting.
+
+00:11:37.540 --> 00:11:39.500
+It's certainly something that resonates with me.
+
+00:11:39.500 --> 00:11:45.000
+And it says, this person, Alex Albert, says, I'm making a list of all the non-coding things
+
+00:11:45.160 --> 00:11:46.780
+people are doing with Claude code.
+
+00:11:47.480 --> 00:11:48.880
+What are you using Claude code for?
+
+00:11:48.940 --> 00:11:52.640
+It's got in parentheses, like, silent, that's not coding.
+
+00:11:54.310 --> 00:11:59.240
+So I think that really highlights how powerful this stuff is.
+
+00:11:59.460 --> 00:12:01.640
+People who are not even coders are like, you know what?
+
+00:12:01.680 --> 00:12:03.420
+I'm willing to open up the terminal.
+
+00:12:03.510 --> 00:12:06.940
+I figured out where that is, and I made it not white on my Mac.
+
+00:12:07.500 --> 00:12:13.020
+And now I'm able to do way, way more by basically giving it access
+
+00:12:14.380 --> 00:12:15.840
+to the file system and other things.
+
+00:12:16.020 --> 00:12:23.240
+And then, you know, Claude themselves came out with Cowork, which is basically Claude code for non-coders.
+
+00:12:23.680 --> 00:12:24.900
+You know, something like that, right?
+
+00:12:24.960 --> 00:12:35.680
+If you install the desktop app, you can give it access to a part of your file system, and it can use much of the things that Claude code would do, right?
+
+00:12:36.700 --> 00:12:36.860
+Yep.
+
+00:12:38.220 --> 00:12:40.380
+Yeah, definitely like a big motivator here for us. 
+
+00:12:42.020 --> 00:12:48.860
+I think we saw how revolutionary Claude Code was just within almost weeks of release.
+
+00:12:49.060 --> 00:12:54.260
+And so I think the idea is like, well, certainly this revolution is coming to other areas.
+
+00:12:54.760 --> 00:12:56.860
+And so how can we kind of generalize that?
+
+00:12:57.700 --> 00:12:57.860
+Yeah.
+
+00:12:57.960 --> 00:13:02.380
+So I'm going to just kind of scroll through here a little bit and see what people put down.
+
+00:13:02.780 --> 00:13:06.540
+But I think, yeah, 319 replies.
+
+00:13:06.900 --> 00:13:08.740
+So I guess people are doing stuff with it.
+
+00:13:09.820 --> 00:13:15.520
+So somebody says notes plus research plus knowledge base plus Obsidian.
+
+00:13:16.519 --> 00:13:19.720
+And I think that's pretty interesting.
+
+00:13:19.880 --> 00:13:25.000
+I've heard about somebody building, I don't know if people have read the book, A Second Brain.
+
+00:13:26.680 --> 00:13:32.040
+But the idea that you drop stuff into like an inbox and then eventually you categorize it.
+
+00:13:32.040 --> 00:13:33.940
+And it means you don't have to remember so much.
+
+00:13:34.020 --> 00:13:39.480
+Somebody building, basically using Claude Code automation to build that kind of stuff.
+
+00:13:40.400 --> 00:13:41.640
+Somebody says writing a book.
+
+00:13:42.260 --> 00:13:47.180
+That's, I hope that means it's helping them write the book, not actually Claude is writing
+
+00:13:47.180 --> 00:13:49.700
+the book, but I don't know.
+
+00:13:50.880 --> 00:13:51.460
+I don't know how you feel.
+
+00:13:51.460 --> 00:13:57.060
+I feel a little creeped out if it's just like, here's a whole bunch of text created purely
+
+00:13:57.220 --> 00:13:57.600
+by AI.
+
+00:13:57.920 --> 00:13:59.840
+You know, I gave it a vague idea.
+
+00:14:00.480 --> 00:14:00.960
+Now read it. 
+
+00:14:01.480 --> 00:14:20.220
+Yeah, I definitely feel a little bit more kind of ethically conflicted about like work that I would like to consume that's like original versus like, I don't really have ethical qualms with code not being like original thoughts from someone, but definitely like writing or art. I think the lines start to get fuzzy.
+
+00:14:20.600 --> 00:14:22.040
+Yeah, I really dislike it.
+
+00:14:22.040 --> 00:14:24.480
+And it's so bad on YouTube now.
+
+00:14:25.340 --> 00:14:28.040
+You go to YouTube and you see videos and you're like,
+
+00:14:28.240 --> 00:14:35.280
+oh, this is just pictures with some AI generated thing
+
+00:14:35.420 --> 00:14:36.880
+and then text-to-speech thing.
+
+00:14:37.020 --> 00:14:39.960
+And I don't know, it feels not good.
+
+00:14:40.010 --> 00:14:42.180
+So hopefully this is not, this book is like,
+
+00:14:42.220 --> 00:14:44.120
+it's helping me write the book, not writing the book.
+
+00:14:44.620 --> 00:14:46.600
+And then, yeah, what else we got?
+
+00:14:48.960 --> 00:14:54.240
+Helped me learn hledger, including working with banks and all sorts of stuff.
+
+00:14:55.660 --> 00:14:58.180
+Person says, yeah, another person says a second brain.
+
+00:14:59.280 --> 00:15:01.420
+Browser use, calendar and scheduling.
+
+00:15:02.620 --> 00:15:05.840
+Medical diagnosis for my oncologist wife.
+
+00:15:06.560 --> 00:15:06.660
+Okay.
+
+00:15:07.600 --> 00:15:10.840
+That almost sounds like coding, but there's a lot of ideas here.
+
+00:15:10.840 --> 00:15:15.500
+And I've certainly personally, I actually have in my editor,
+
+00:15:15.780 --> 00:15:18.760
+I have a project called Claude as Chat,
+
+00:15:20.260 --> 00:15:22.340
+where I just want to talk about a bunch of documents
+
+00:15:22.700 --> 00:15:24.560
+and have it be more thorough
+
+00:15:24.880 --> 00:15:26.400
+and maybe create other documents
+
+00:15:26.510 --> 00:15:28.120
+and then reference those back and so on. 
+
+00:15:28.820 --> 00:15:31.620
+So instead of opening up some kind of chat thing,
+
+00:15:31.860 --> 00:15:33.560
+I'll open up my code editor
+
+00:15:33.940 --> 00:15:37.700
+and fire up Claude Code or something and go after it.
+
+00:15:37.900 --> 00:15:40.260
+So yeah, are you doing anything like this?
+
+00:15:41.500 --> 00:15:42.180
+That's a good question.
+
+00:15:42.700 --> 00:15:54.200
+I have been using Deep Agents, kind of our more general purpose equivalent, to help me with some like life admin things or even just like work admin things.
+
+00:15:54.540 --> 00:16:00.420
+So working in open source, we get a lot of, you know, incoming PRs and issues, etc.
+
+00:16:01.660 --> 00:16:07.860
+So we're working on using Deep Agents to kind of help us like triage and categorize there.
+
+00:16:10.480 --> 00:16:10.860
+What else?
+
+00:16:12.260 --> 00:16:19.940
+I have been experimenting with a deep agent that helps learn from my past social media posts and
+
+00:16:19.940 --> 00:16:28.020
+their performance and then help me write new ones based on like docs that I provide etc. Admittedly
+
+00:16:28.360 --> 00:16:32.940
+again I think that like crosses the fuzzy line with writing and I've kind of found that like
+
+00:16:33.110 --> 00:16:38.899
+I actually prefer to just like write quick tweets and LinkedIn posts you know originally and then
+
+00:16:39.580 --> 00:16:45.560
+maybe have like Claude help me edit if I'm like really struggling with a line. But I do think
+
+00:16:45.720 --> 00:16:49.480
+that's an interesting use case because it like definitely has gotten better at kind of learning
+
+00:16:49.620 --> 00:16:58.300
+my style. But at some, yeah. Yeah. Very interesting. So it's not just Claude Code. You'll also point
+
+00:16:58.440 --> 00:17:04.360
+out that, you know, as I mentioned as well, OpenAI's Deep Research, which is incredible,
+
+00:17:05.839 --> 00:17:06.780
+As well as Manus.
+
+00:17:06.829 --> 00:17:08.640
+I just recently learned about Manus,
+
+00:17:08.670 --> 00:17:11.060
+but I feel like this is a little bit similar.
+
+00:17:11.420 --> 00:17:13.500
+It's a little more agentic,
+
+00:17:14.319 --> 00:17:17.319
+but it still feels like just a ChatGPT chat experience.
+
+00:17:19.800 --> 00:17:19.939
+Interesting.
+
+00:17:20.089 --> 00:17:20.439
+I don't know.
+
+00:17:20.540 --> 00:17:21.400
+I haven't used Manus any.
+
+00:17:21.640 --> 00:17:22.500
+I don't know anything about it.
+
+00:17:23.820 --> 00:17:25.060
+I haven't used it a ton,
+
+00:17:25.300 --> 00:17:29.000
+but we've definitely taken some inspiration from their features.
+
+00:17:29.720 --> 00:17:30.020
+I'm sure.
+
+00:17:30.220 --> 00:17:30.280
+Cool.
+
+00:17:30.940 --> 00:17:33.820
+All I know now is that Manus is part of Meta, apparently.
+
+00:17:34.540 --> 00:17:34.660
+Yeah.
+
+00:17:34.750 --> 00:17:35.040
+I guess.
+
+00:17:35.180 --> 00:17:35.480
+Congratulations.
+
+00:17:35.620 --> 00:17:40.740
+Manus people? Yeah. That's cool. Yeah, there have been a lot of crazy acquisitions recently.
+
+00:17:41.640 --> 00:17:51.080
+Yeah, absolutely. All right. So that brings us to maybe what is the essence, the characteristics of
+
+00:17:52.630 --> 00:17:54.840
+deep agents, right? There's these different examples, but
+
+00:17:56.360 --> 00:18:02.299
+how is that different than just an LLM or you ask it a question, right? You guys have laid out
+
+00:18:02.720 --> 00:18:04.940
+with a nice little picture kind of what that means to you.
+
+00:18:05.420 --> 00:18:12.240
+Yeah. Yeah. So when we think about deep agents, we think about it as an agent harness. And so
+
+00:18:12.900 --> 00:18:20.000
+it's a tool for building agents that comes with these built in things that kind of build up the
+
+00:18:20.100 --> 00:18:27.060
+harness so that the agents are highly effective at those complex long running tasks.
And so I'll
+
+00:18:27.060 --> 00:18:29.120
+talk a little bit more about kind of what's built in here.
+
+00:18:30.640 --> 00:18:34.960
+Before we do, I think we got to do some, we got to do some nomenclature, some definitions here.
+
+00:18:35.320 --> 00:18:35.740
+Yeah, yeah.
+
+00:18:35.900 --> 00:18:40.580
+So you, you said that an agent harness.
+
+00:18:41.480 --> 00:18:42.980
+So what is an agent harness?
+
+00:18:44.240 --> 00:18:44.940
+Yeah, that's great.
+
+00:18:44.940 --> 00:18:47.660
+This is invisible to many people, but it's part of the magic, right?
+
+00:18:48.540 --> 00:18:48.720
+Yeah.
+
+00:18:49.830 --> 00:19:00.380
+So an agent harness is kind of add-ons around that core like model and tool calling loop that help to make an agent more effective
+
+00:19:00.460 --> 00:19:02.220
+with more complex tasks.
+
+00:19:03.240 --> 00:19:04.660
+So you kind of have your like basic agent
+
+00:19:04.900 --> 00:19:08.820
+that's just like you give a model prompt and some tools
+
+00:19:09.140 --> 00:19:10.900
+and it like runs in a loop
+
+00:19:10.970 --> 00:19:12.380
+and then produces a final result.
+
+00:19:13.760 --> 00:19:17.460
+Whereas a harness adds in extra support
+
+00:19:17.570 --> 00:19:18.840
+to make the agent more effective.
+
+00:19:19.580 --> 00:19:20.360
+I see.
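The basic model-and-tools loop Sydney describes can be sketched in a few lines of Python. To be clear, this is an illustrative toy, not Deep Agents or Claude Code internals: the `model` function is a hard-coded stand-in for a real LLM call, and a real harness would layer planning, file access, and sub-agents on top of this bare loop.

```python
# A minimal sketch of the core agent loop that a harness wraps.
# The "model" here is a hard-coded stand-in for a real LLM call.

def get_weather(city: str) -> str:
    """Illustrative tool: return (fake) weather for a city."""
    return f"Sunny in {city}"

TOOLS = {"get_weather": get_weather}

def model(messages: list[dict]) -> dict:
    """Stand-in for an LLM: request one tool call, then give a final answer."""
    if not any(m["role"] == "tool" for m in messages):
        return {"tool": "get_weather", "args": {"city": "Portland"}}
    return {"final": "It looks sunny in Portland today."}

def run_agent(prompt: str) -> str:
    messages = [{"role": "user", "content": prompt}]
    while True:
        action = model(messages)
        if "final" in action:  # the model produced its answer: stop looping
            return action["final"]
        # Otherwise execute the requested tool and feed the result back in.
        result = TOOLS[action["tool"]](**action["args"])
        messages.append({"role": "tool", "content": result})

print(run_agent("What's the weather in Portland?"))
```

Everything a harness adds (planning tools, a file system, sub-agents, a big system prompt) sits around this loop without changing its basic shape.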
+
+00:19:21.120 --> 00:19:23.640
+So a little bit like when people would say,
+
+00:19:25.500 --> 00:19:33.180
+you are a marketing wizard who has created seven successful whatever, and then you ask it something
+
+00:19:33.280 --> 00:19:40.540
+like that. That's the, that's a real baby version of, of maybe what a harness, maybe a little flavor of what
+
+00:19:40.540 --> 00:19:46.800
+a harness is, right? It's, here's all of the things that you're doing, here's your skills, here's what
+
+00:19:46.800 --> 00:19:52.499
+I want you to focus on, here's maybe your tool chain that you can use, right? You can call these things
+
+00:19:52.520 --> 00:19:54.860
+to do more, to learn more, something like that?
+
+00:19:55.620 --> 00:19:56.320
+Yeah, yeah.
+
+00:19:56.500 --> 00:20:00.920
+So the harness helps to provide the model
+
+00:20:01.200 --> 00:20:03.780
+with extra context and capability
+
+00:20:04.000 --> 00:20:06.160
+so that it can perform better.
+
+00:20:07.000 --> 00:20:07.960
+And we can...
+
+00:20:35.419 --> 00:20:38.220
+Thank you.
+
+00:20:54.180 --> 00:20:55.020
+I think we're getting...
+
+00:20:55.940 --> 00:20:56.080
+Thanks.
+
+00:20:57.460 --> 00:20:58.900
+Thanks, folks, for letting me know that I'm still here.
+
+00:21:00.240 --> 00:21:02.500
+Sydney is here, but I think she's having some trouble.
+
+00:21:02.610 --> 00:21:03.300
+So hang tight.
+
+00:21:03.940 --> 00:21:04.840
+Sydney, you're back.
+
+00:21:05.340 --> 00:21:07.320
+I wasn't sure if I was gone or you were gone,
+
+00:21:07.480 --> 00:21:09.220
+but it's all good.
+
+00:21:12.220 --> 00:21:13.480
+So we were talking about a harness
+
+00:21:13.740 --> 00:21:17.140
+and then the internet happened.
+
+00:21:17.600 --> 00:21:18.000
+Yes.
+
+00:21:19.020 --> 00:21:22.040
+So I think it's a little bit easier to understand
+
+00:21:22.320 --> 00:21:23.500
+kind of what a harness is
+
+00:21:23.670 --> 00:21:26.400
+if we talk about some of the components
+
+00:21:26.780 --> 00:21:27.900
+of an agent harness.
+
+00:21:28.280 --> 00:21:28.820
+Got it. Yeah. Okay.
+
+00:21:28.870 --> 00:21:30.420
+So what are the characteristics maybe?
+
+00:21:31.040 --> 00:21:32.220
+Yeah. So for the first,
+
+00:21:32.780 --> 00:21:34.879
+the first thing we think about with our agent harness
+
+00:21:34.900 --> 00:21:38.540
+is giving the agent access to a planning tool.
+
+00:21:38.840 --> 00:21:40.140
+So for Claude Code users,
+
+00:21:40.720 --> 00:21:43.320
+you're very intimately familiar with the to-do list
+
+00:21:43.580 --> 00:21:46.760
+that Claude generates and then kind of checks off
+
+00:21:46.780 --> 00:21:49.500
+as it makes its way through various tasks.
+
+00:21:50.440 --> 00:21:52.540
+And this just helps your agent to like stay organized
+
+00:21:53.360 --> 00:21:55.640
+and kind of ensure that it gets through
+
+00:21:55.820 --> 00:21:59.200
+all of the various steps in a complex problem.
+
+00:22:00.340 --> 00:22:02.799
+And even just giving the agent this planning tool
+
+00:22:02.820 --> 00:22:07.960
+can help it like have a better trajectory for those harder problems.
+
+00:22:10.980 --> 00:22:14.980
+It's really wild how much the planning helps, but it really does.
+
+00:22:15.600 --> 00:22:18.900
+You know, Claude Code and Cursor and others even have planning mode.
+
+00:22:21.499 --> 00:22:28.640
+And I think probably this harness shifts a little bit when you switch it into planning mode.
+
+00:22:29.060 --> 00:22:32.680
+It probably gets a different set of instructions that you don't even see.
+
+00:22:33.760 --> 00:22:35.680
+You're in planning mode and here's how you're going to act.
+
+00:22:35.710 --> 00:22:41.580
+And you're going to now interview the user to really try to understand what it is they want and so on, right?
+
+00:22:41.800 --> 00:22:43.460
+Like they don't tell you that.
+
+00:22:43.540 --> 00:22:44.820
+It's just a drop down planning mode.
+
+00:22:45.160 --> 00:22:47.700
+But it probably means something like that, right?
+
+00:22:48.060 --> 00:22:49.120
+Yeah, yeah, exactly.
+
+00:22:49.900 --> 00:22:55.739
+And I think we've even seen the power of planning kind of reflected at the model level
+
+00:22:55.760 --> 00:23:01.560
+where like models, you know, there was a big boom in kind of like reasoning or thinking models
+
+00:23:02.370 --> 00:23:09.500
+about a year ago. And just the idea that if a model thinks through or like reasons about tasks
+
+00:23:09.700 --> 00:23:16.340
+more before producing a final result, then it's likely to do better. Yeah, absolutely. So yeah,
+
+00:23:16.630 --> 00:23:24.579
+the planning tools, kind of part one. Another thing that's big for our harness is access to
+
+00:23:24.900 --> 00:23:32.980
+a file system. So, you know, models have limited context windows, which is just like the amount of
+
+00:23:33.200 --> 00:23:39.560
+tokens, text and other things that you can send to the model. And so being able to use a file system
+
+00:23:39.780 --> 00:23:47.480
+and kind of selectively search or read or write files is a really effective tool for kind of
+
+00:23:47.760 --> 00:23:52.260
+context management that's more organized than just like sending everything all at once to the model.
+
+00:23:52.360 --> 00:23:59.920
+Yeah, and I think that might be a little bit why my Claude as Chat fake programming project actually is useful, right?
+
+00:24:00.120 --> 00:24:02.560
+Because it has a file system, right?
+
+00:24:02.660 --> 00:24:07.660
+And here's a couple files you start with, and then when you need to, you create more files, and then you reference them.
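The file-system idea described here can be sketched as three tiny tools over an in-memory dict. The dict is a stand-in for a real directory, and the function names are made up for illustration; the point is that the agent writes long material out of its context window and selectively reads back only what it needs.

```python
# Minimal file-system tools for an agent, backed by an in-memory dict
# (a stand-in for a real directory). Hypothetical names for illustration.

FILES: dict[str, str] = {}

def write_file(path: str, content: str) -> str:
    """Save content under a path so it no longer occupies the context window."""
    FILES[path] = content
    return f"wrote {len(content)} chars to {path}"

def list_files() -> list[str]:
    """Let the agent browse what it has stored."""
    return sorted(FILES)

def read_file(path: str) -> str:
    """Pull one file back into context on demand."""
    return FILES[path]

write_file("notes/research.md", "Long research notes that would blow the context window...")
write_file("notes/todo.md", "1. summarize research\n2. draft report")
print(list_files())                # the agent browses...
print(read_file("notes/todo.md"))  # ...and loads only the file it needs
```

Handing these three functions to the model as tools is what lets it "selectively search or read or write files" instead of carrying everything in one giant prompt.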
+ +00:24:10.020 --> 00:24:16.920 +Some of those files, I don't know, I think 20,000, 30,000 words, which is a lot of context. + +00:24:17.000 --> 00:24:18.180 +It's most of the context. + +00:24:18.270 --> 00:24:21.160 +Just try to keep that in memory if it had to do it that way. + +00:24:22.300 --> 00:24:29.760 +And so letting the AI unload its mind, it's like asking you to read a textbook and remember + +00:24:30.020 --> 00:24:31.740 +everything instead of ever going back to it, right? + +00:24:32.320 --> 00:24:32.460 +Yeah. + +00:24:32.680 --> 00:24:34.100 +One shot, read a textbook, now go. + +00:24:35.340 --> 00:24:36.140 +Yeah, exactly. + +00:24:36.940 --> 00:24:41.080 +I think we're kind of starting to see this pattern emerge where it's like, well, effective + +00:24:41.920 --> 00:24:43.760 +agents are just like effective people, right? + +00:24:43.920 --> 00:24:49.979 +They think carefully and plan, and then they keep their notes and thoughts organized and + +00:24:50.000 --> 00:24:55.820 +you know, make things accessible when they need them, but don't like, you know, it makes much + +00:24:55.830 --> 00:25:03.740 +more sense to like read a textbook chapter than just like read the textbook. Yeah, yeah, absolutely. + +00:25:03.880 --> 00:25:10.640 +It's like using a highlighter almost. Okay, so planning tool, file system, and then sub agents. + +00:25:10.830 --> 00:25:17.739 +This one is less obvious to me. Tell me about the sub agents. Yeah, so in my mind, sub agents are + +00:25:18.120 --> 00:25:29.360 +largely for helping your deep agent accomplish tasks more efficiently. So if you ask your + +00:25:29.660 --> 00:25:36.840 +deep agent to go do a bunch of research on some given thing, it probably wants to pursue a couple + +00:25:37.020 --> 00:25:41.480 +different paths for that research, right? You want it to be really thorough. 
And it's more effective
+
+00:25:41.520 --> 00:25:48.600
+if you spin up sub-agents to do that in parallel than if you just had your main agent do like all
+
+00:25:48.700 --> 00:25:57.300
+of the research in sequence. And then we also, I'll give a coding example too, like if you wanted your
+
+00:25:57.820 --> 00:26:03.260
+agent to edit a bunch of files in like a similar way, it'd probably be better for it to go edit
+
+00:26:03.480 --> 00:26:08.140
+like 10 files at the same time than to do the first file, then when it finishes go to the second.
+
+00:26:09.180 --> 00:26:12.140
+And so the name of the game here really is like parallelization.
+
+00:26:12.620 --> 00:26:17.060
+And then the final like buzzword that I'll drop here is context isolation,
+
+00:26:18.620 --> 00:26:22.340
+which is that like if you have kind of a like small subtask,
+
+00:26:23.170 --> 00:26:27.920
+an agent is likely to perform better if you like just give it the context it needs rather than like
+
+00:26:28.340 --> 00:26:33.320
+all of this other history and things like that. And so that's really what motivates subagents.
+
+00:26:33.540 --> 00:26:37.100
+Awesome. Yeah, I think the parallelism is pretty straightforward. People probably think,
+
+00:26:37.120 --> 00:26:38.940
+"Oh, sub-agent, maybe you can fan out,
+
+00:26:39.160 --> 00:26:40.920
+like we're gonna go read this article
+
+00:26:41.200 --> 00:26:43.120
+and we're going to read the document you gave us.
+
+00:26:43.240 --> 00:26:45.820
+Then we'll, right, if you did those two in parallel,
+
+00:26:45.960 --> 00:26:46.340
+that's great."
+
+00:26:46.340 --> 00:26:48.440
+But I think it's the context management
+
+00:26:48.640 --> 00:26:51.080
+is also super important, right?
+
+00:26:51.200 --> 00:26:53.400
+The little sub-agent that might read the Wikipedia article
+
+00:26:55.600 --> 00:26:56.720
+doesn't consume all the,
+
+00:26:56.760 --> 00:26:58.040
+it doesn't have to know all the other stuff.
+
+00:26:58.160 --> 00:27:00.920
+All it has to do is say, given this article,
+
+00:27:02.100 --> 00:27:03.580
+get this piece of information out of it.
+
+00:27:03.700 --> 00:27:05.840
+And then it kind of resets almost back
+
+00:27:05.860 --> 00:27:07.220
+to just a sentence or two, right?
+
+00:27:07.300 --> 00:27:09.440
+So it's a good way to do that context isolation,
+
+00:27:09.540 --> 00:27:09.980
+like you say.
+
+00:27:10.540 --> 00:27:11.740
+- Yep, yep, definitely.
+
+00:27:13.920 --> 00:27:18.860
+And then the fourth one we have listed here is system prompt.
+
+00:27:19.480 --> 00:27:23.560
+Perhaps what I'll elaborate on here is the fact that
+
+00:27:23.820 --> 00:27:27.620
+we do give it a system prompt that instructs it
+
+00:27:27.740 --> 00:27:30.180
+on how to use the file system and the planning tool
+
+00:27:30.420 --> 00:27:32.900
+and the fact that it can invoke sub-agents.
+
+00:27:33.880 --> 00:27:38.540
+but we also load memory into the system prompt.
+
+00:27:39.280 --> 00:27:43.460
+And so that's something that can like persist across conversations and things
+
+00:27:43.620 --> 00:27:47.640
+like that. And so the idea here just being like prompts power agents,
+
+00:27:48.060 --> 00:27:49.180
+and we want to, you know,
+
+00:27:49.300 --> 00:27:53.340
+really optimize the kind of under the hood prompt that's powering this
+
+00:27:53.400 --> 00:27:53.560
+harness.
+
+00:27:54.520 --> 00:27:58.240
+Yeah. Two thoughts on that really quick. I think that's great.
+
+00:27:59.520 --> 00:28:00.880
+There was, I'm sure you've seen this,
+
+00:28:01.280 --> 00:28:08.980
+But there was an article that said something to the effect of 13 Markdown files just took a billion dollars off the stock market.
+
+00:28:09.860 --> 00:28:12.800
+Or no, it was some huge amount, maybe 200 billion.
+
+00:28:12.920 --> 00:28:14.340
+It was some huge number.
+
+00:28:15.220 --> 00:28:26.820
+And effectively, that was when Claude released the legal agent as a Markdown file or a couple other specialized knowledge worker things.
+
+00:28:27.120 --> 00:28:38.600
+And people just realize, wow, it can actually solve all these problems that we used to employ people for, which is really kind of, it's a whole nother kind of tough debate.
+
+00:28:38.880 --> 00:28:42.240
+But that just shows how powerful prompts are, right?
+
+00:28:42.290 --> 00:28:47.500
+If like, oh, we just gave it a different addition to its prompt.
+
+00:28:48.050 --> 00:28:51.620
+And now Wall Street is freaked out because of it, right?
+
+00:28:51.760 --> 00:28:52.140
+That's crazy.
+
+00:28:52.360 --> 00:28:53.580
+Did you see that article?
+
+00:28:54.280 --> 00:28:54.600
+Yeah.
+
+00:28:55.520 --> 00:28:55.860
+Yeah.
+
+00:28:56.440 --> 00:28:58.980
+Yeah, it's definitely wild.
+
+00:28:59.200 --> 00:29:03.820
+I mean, I think we've known for a long time that like prompt engineering and, you know,
+
+00:29:04.000 --> 00:29:07.680
+really carefully tailoring your prompt to your use case is super powerful.
+
+00:29:09.020 --> 00:29:13.700
+But I think people are like really starting to realize how much that might affect like
+
+00:29:13.820 --> 00:29:14.560
+various industries.
+
+00:29:15.460 --> 00:29:16.420
+Yeah, 100%.
+
+00:29:17.000 --> 00:29:22.460
+So I'll link to actually the Claude Code system prompt.
+
+00:29:23.420 --> 00:29:24.580
+I don't know how many words this is.
+
+00:29:24.760 --> 00:29:25.520
+It's a lot.
+
+00:29:27.740 --> 00:29:31.700
+it's a lot of words. Let me see if I can answer that question. But I'll link to the
+
+00:29:31.820 --> 00:29:42.260
+Claude Code system prompt. And people can check it out. It's 16,000 words, which is a third of a
+
+00:29:42.390 --> 00:29:51.639
+novel.
And I think that's noteworthy because if you ask a question of your AI, some of them show
+
+00:29:51.660 --> 00:29:57.880
+the context that's being used up, that counts towards it, right? That's kind of, that precedes
+
+00:29:58.000 --> 00:30:04.740
+your one sentence question. You know, if you're like, fix failing test, you know, 16,000 words
+
+00:30:04.770 --> 00:30:12.680
+that precedes fix failing test. It's crazy, right? Yeah, definitely. One of the key things that we
+
+00:30:12.830 --> 00:30:17.800
+also add in deep agents under the hood is like prompt caching. So you might think like, oh man,
+
+00:30:17.900 --> 00:30:27.720
+And like my cost is really going to rack up if this is being sent every time under the hood, but we can kind of cache those like shared prompts across invocations.
+
+00:30:28.840 --> 00:30:32.660
+So that's very helpful for very verbose system prompts.
+
+00:30:33.360 --> 00:30:38.900
+Good. You don't want to send that every time. I mean, this is the Claude Code one, but you probably have a non-trivial one as well, right?
+
+00:30:39.420 --> 00:30:39.480
+Yeah.
+
+00:30:40.260 --> 00:30:40.320
+Yeah.
+
+00:30:41.020 --> 00:30:44.000
+But this definitely speaks to the fact that like obviously prompts are important.
+
+00:30:44.080 --> 00:30:48.740
+And we're very dependent on Claude Code for productivity.
+
+00:30:49.200 --> 00:30:53.040
+And the detailed system prompt is a big part of why it's so effective.
+
+00:30:53.700 --> 00:30:54.460
+Yeah, absolutely.
+
+00:30:54.840 --> 00:30:57.120
+And why people are like, well, I feel like it's changed.
+
+00:30:57.320 --> 00:30:58.960
+It's now less friendly or whatever.
+
+00:30:59.180 --> 00:31:02.500
+Maybe that means just the model might not have even changed.
+
+00:31:02.500 --> 00:31:04.260
+It could just be the system prompt has changed.
+
+00:31:04.460 --> 00:31:06.940
+And now it's doing something slightly different.
+
+00:31:07.520 --> 00:31:07.680
+Yeah.
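Prompt caching of the kind Sydney describes is typically requested by marking the long, shared prefix of the prompt as cacheable so repeat invocations can reuse it. A minimal sketch of what that can look like, assuming Anthropic's Messages API block format; the model name and prompt text here are placeholders, and nothing is actually sent:

```python
# Sketch of a request payload that marks a long, shared system prompt as
# cacheable, assuming Anthropic's Messages API block format. This only
# builds the dict; no API call is made.

HARNESS_PROMPT = "You are a deep agent. You have a planning tool..."  # imagine ~16,000 words here

def build_request(user_message: str) -> dict:
    return {
        "model": "claude-sonnet-4-5",  # placeholder model name for illustration
        "max_tokens": 1024,
        "system": [
            {
                "type": "text",
                "text": HARNESS_PROMPT,
                # Marks everything up to this block as a reusable prefix, so
                # repeat invocations don't pay full price for the same prompt.
                "cache_control": {"type": "ephemeral"},
            }
        ],
        "messages": [{"role": "user", "content": user_message}],
    }

req = build_request("fix failing test")
print(req["system"][0]["cache_control"])
```

The huge harness prompt stays byte-identical across calls while only the short user message changes, which is exactly the shape a cache can exploit.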
+
+00:31:08.100 --> 00:31:08.220
+Yeah.
+
+00:31:08.520 --> 00:31:08.580
+Yeah.
+
+00:31:09.500 --> 00:31:09.700
+All right.
+
+00:31:09.780 --> 00:31:12.560
+Let's talk about Deep Agents,
+
+00:31:13.160 --> 00:31:18.980
+LangChain style, not just in general, the concept of it.
+
+00:31:19.919 --> 00:31:25.600
+And you have a GitHub repo over here just called Deep Agents.
+
+00:31:27.280 --> 00:31:29.520
+And I thought it might be interesting just to talk
+
+00:31:29.740 --> 00:31:32.300
+through what we got over here.
+
+00:31:32.460 --> 00:31:34.140
+So some of it we've already talked about,
+
+00:31:35.380 --> 00:31:38.680
+like the planning, the file system, the subagents.
+
+00:31:39.420 --> 00:31:47.100
+But there's also more things like tools, middleware, the whole programming model.
+
+00:31:48.880 --> 00:31:49.540
+Where do we want to start?
+
+00:31:49.540 --> 00:31:52.700
+I guess before we start, let me, how old is this project?
+
+00:31:53.380 --> 00:31:54.180
+Not super old, right?
+
+00:31:54.700 --> 00:31:55.860
+Not super old.
+
+00:31:56.380 --> 00:31:57.140
+I'll go here.
+
+00:31:57.160 --> 00:31:58.400
+I'll hit the history on the readme.
+
+00:31:58.520 --> 00:31:59.560
+That's usually the best way.
+
+00:31:59.920 --> 00:32:00.100
+Yeah.
+
+00:32:01.480 --> 00:32:02.280
+So August.
+
+00:32:02.500 --> 00:32:02.600
+Okay.
+
+00:32:02.700 --> 00:32:03.680
+So it's been around since August.
+
+00:32:05.679 --> 00:32:07.780
+That doesn't tell you when it's been public, right?
+
+00:32:07.840 --> 00:32:08.420
+That's the thing.
+
+00:32:09.060 --> 00:32:09.180
+Yeah.
+
+00:32:09.380 --> 00:32:09.900
+When was it released?
+
+00:32:10.360 --> 00:32:14.200
+I think it was made public very soon after.
+
+00:32:14.540 --> 00:32:16.280
+I think it might have started public, honestly.
+
+00:32:17.200 --> 00:32:20.220
+We're very like the first company, which is great.
+
+00:32:21.320 --> 00:32:23.040
+But yeah, so it started just this summer.
+
+00:32:23.960 --> 00:32:24.120
+Okay.
+
+00:32:24.480 --> 00:32:24.540
+Yeah.
+
+00:32:24.640 --> 00:32:27.020
+So it's already got 10,000 stars.
+
+00:32:27.240 --> 00:32:29.060
+It's pretty popular here.
+
+00:32:30.900 --> 00:32:31.100
+All right.
+
+00:32:31.120 --> 00:32:31.820
+So let's see.
+
+00:32:32.040 --> 00:32:39.340
+I guess maybe let's talk about the programming model because I think that'll help make it
+
+00:32:39.340 --> 00:32:42.160
+concrete for people? Like what is, what is the value of this? You know,
+
+00:32:42.280 --> 00:32:44.780
+maybe just talk us through like this quick start.
+
+00:32:45.760 --> 00:32:48.760
+Yeah. So as we mentioned kind of at the beginning,
+
+00:32:50.300 --> 00:32:51.400
+deep agents are,
+
+00:32:52.080 --> 00:32:56.520
+and the agents you can build with the deep agents package are very general.
+
+00:32:56.880 --> 00:33:01.420
+So Claude Code is an example of a like coding agent,
+
+00:33:02.160 --> 00:33:06.360
+but you might want to build deep agents with all sorts of specializations.
+
+00:33:06.840 --> 00:33:10.680
+And so our new open source library helps you do that.
+
+00:33:10.810 --> 00:33:12.500
+And so you can see here,
+
+00:33:12.630 --> 00:33:15.780
+we have basically a three line code snippet.
+
+00:33:16.630 --> 00:33:19.740
+You import create deep agent from the deep agents package,
+
+00:33:20.510 --> 00:33:22.020
+you call create deep agent,
+
+00:33:22.190 --> 00:33:25.200
+and you can add your own model, tools,
+
+00:33:25.940 --> 00:33:28.120
+prompt additions, kind of other configuration.
+
+00:33:29.620 --> 00:33:32.360
+And then you like have an agent
+
+00:33:32.380 --> 00:33:35.600
+that's ready to use and even deploy.
+
+00:33:37.390 --> 00:33:39.640
+So very basically easy way to get started
+
+00:33:39.920 --> 00:33:41.280
+with building effective agents.
+
+00:33:42.260 --> 00:33:42.580
+- Awesome.
+
+00:33:43.120 --> 00:33:46.100
+So you might just say agent.invoke
+
+00:33:46.300 --> 00:33:49.320
+and you say research LangGraph and write a summary.
+
+00:33:49.920 --> 00:33:50.760
+- Yep, yep.
+
+00:33:53.140 --> 00:33:56.860
+- So then what, how does it know what model to use?
+
+00:33:57.130 --> 00:34:00.580
+How does it, you know, how does it go about that?
+
+00:34:00.940 --> 00:34:05.260
+Can it use tools and to-dos, you know, planning like we've discussed?
+
+00:34:05.900 --> 00:34:06.420
+Yeah.
+
+00:34:06.600 --> 00:34:20.620
+So when you use the create deep agent function, under the hood, we add tools for planning and also for file system access and things like that.
+
+00:34:20.620 --> 00:34:25.600
+We'll have a user or a developer specify, like, what file systems they'd like to use.
+
+00:34:27.220 --> 00:34:32.520
+And then you can bring your own tools in addition to the ones that we provide under the hood.
+
+00:34:32.649 --> 00:34:41.379
+So maybe going back to my travel agent example, you could bring like, or actually I'll use like a personal assistant example.
+
+00:34:42.600 --> 00:34:51.040
+If you want to have a calendar API tool and a Gmail API tool, you could bring those along as well.
+
+00:34:51.389 --> 00:34:53.280
+So kind of more use case specific tools.
+
+00:34:53.620 --> 00:34:54.139
+I see.
+
+00:34:54.700 --> 00:34:57.580
+Maybe I'm working with Obsidian or some other Markdown thing,
+
+00:34:58.120 --> 00:34:59.260
+and you could point it and say,
+
+00:34:59.860 --> 00:35:04.400
+you're allowed to access any of my Markdown files for this project
+
+00:35:04.500 --> 00:35:05.460
+or just in general, right?
+
+00:35:06.040 --> 00:35:08.180
+That could be a tool, and you could teach it to do that.
+
+00:35:08.560 --> 00:35:08.800
+Yep.
+
+00:35:10.260 --> 00:35:14.820
+So I noticed below that you can do things like specify a little more detail.
+
+00:35:15.040 --> 00:35:18.380
+For example, you can say it can use a certain model.
+
+00:35:19.120 --> 00:35:22.160
+In this case, let's see how long that'll last.
+
+00:35:22.280 --> 00:35:23.520
+You could use GPT-4o.
+
+00:35:23.860 --> 00:35:25.260
+Aren't they taking that away again?
+
+00:35:25.560 --> 00:35:26.040
+They took it away.
+
+00:35:26.300 --> 00:35:27.660
+People are freaked out on them.
+
+00:35:28.140 --> 00:35:28.880
+They put it back.
+
+00:35:28.980 --> 00:35:31.000
+But I think it's also not long for this world.
+
+00:35:31.120 --> 00:35:31.520
+But whatever.
+
+00:35:31.560 --> 00:35:32.860
+You pick some model.
+
+00:35:33.740 --> 00:35:37.160
+And then, as you pointed out, this tools is my custom tool.
+
+00:35:37.420 --> 00:35:42.600
+It's not super obvious from this code snippet, but my custom tool is just a Python function, right?
+
+00:35:43.440 --> 00:35:43.640
+Yes.
+
+00:35:44.140 --> 00:35:44.200
+Yep.
+
+00:35:44.460 --> 00:35:44.760
+That's correct.
+
+00:35:45.080 --> 00:35:47.340
+It's pretty easy to define tools.
+
+00:35:48.500 --> 00:35:51.240
+It can be just, yeah, a very simple Python function.
+
+00:35:51.920 --> 00:35:58.440
+can use some API of your choice, like maybe the calendar API, for example.
+
+00:35:59.500 --> 00:35:59.760
+I see.
+
+00:36:00.000 --> 00:36:04.780
+So you could write pretty much any type of function.
+
+00:36:04.780 --> 00:36:08.120
+It just has to take in text and spit out text or something to that effect?
+
+00:36:08.640 --> 00:36:13.580
+Yeah, we actually support like multimodal content for tools as well.
+
+00:36:13.840 --> 00:36:15.660
+So it can produce images.
+
+00:36:16.120 --> 00:36:24.400
+could produce files of other types and can take, you know, a varying, it can take any types of
+
+00:36:24.600 --> 00:36:33.440
+arguments. So the model is populating those arguments. But, yep. Right. Okay.
How does it,
+
+00:36:34.480 --> 00:36:40.160
+this might be getting too deep in the weeds for a quick start, but how does it know what to pass
+
+00:36:40.280 --> 00:36:44.960
+to your Python function? And how does it know what to do with the return value?
+
+00:36:45.700 --> 00:36:51.000
+Yeah, that's a great question. So it all comes back to the prompt. And this is kind of a like
+
+00:36:51.260 --> 00:36:58.780
+wonderful marriage between developer docs and LLMs. So when you define a function, let's say,
+
+00:36:59.140 --> 00:37:04.820
+like, I'll use a simple example, a weather tool, a get weather tool, you can imagine the arguments
+
+00:37:05.500 --> 00:37:11.800
+might be something like, I'll say like city and state, or something like that.
+
+00:37:13.480 --> 00:37:20.940
+And then you might expect kind of structured weather data back, like, you know, current temperature, current conditions, etc.
+
+00:37:22.300 --> 00:37:26.700
+And when you define that function in your Python code, you can write a doc string.
+
+00:37:27.090 --> 00:37:31.360
+And it says this tool is used for getting the weather in a given city and state.
+
+00:37:31.950 --> 00:37:33.920
+And then you can document your args.
+
+00:37:34.010 --> 00:37:39.400
+And so city, you would say the city to get the weather for and then state, you know, this is all pretty self-explanatory.
+
+00:37:41.460 --> 00:37:48.180
+And then that information is parsed under the hood and actually passed to the model as part of its prompt.
+
+00:37:48.780 --> 00:37:49.340
+Oh, that's cool.
+
+00:37:49.680 --> 00:38:03.960
+And so what that looks like is, you know, we would parse the fact that we would parse the signature of the tool and the documentation and tell the model effectively like, hey, you have a get weather tool.
+
+00:38:04.780 --> 00:38:10.000
+If you, you know, you should call it when you want to get the weather for a given city and state.
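The signature-plus-docstring parsing Sydney describes can be sketched with the standard library alone. Frameworks typically use Pydantic for the real thing; the `inspect`-only version below is just to show the idea that a function's name, docstring, and type hints become the tool description the model sees.

```python
import inspect

def get_weather(city: str, state: str) -> str:
    """Get the current weather for a given city and state."""
    return f"72F and sunny in {city}, {state}"  # stand-in for a real API call

def tool_spec(fn) -> dict:
    """Turn a plain Python function into a model-facing tool description,
    roughly what an agent framework does under the hood (real ones often
    use Pydantic and emit full JSON schema; this is a simplified sketch)."""
    sig = inspect.signature(fn)
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn),  # the docstring becomes the tool description
        "parameters": {
            # each type hint becomes the parameter's declared type
            name: getattr(param.annotation, "__name__", "any")
            for name, param in sig.parameters.items()
        },
    }

print(tool_spec(get_weather))
```

Calling `tool_spec(get_weather)` yields the tool name, the docstring as its description, and `{'city': 'str', 'state': 'str'}` as its parameters; a framework would render that into the system prompt so the model knows when to call the tool and what arguments to pass.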
+
+00:38:10.150 --> 00:38:11.380
+We pull that out of the doc string.
+
+00:38:12.140 --> 00:38:15.860
+And then we also say when you call it, make sure to pass these args in.
+
+00:38:16.420 --> 00:38:17.100
+Right, right, right.
+
+00:38:17.740 --> 00:38:22.660
+So a lot of times I see this happening and people are like, well, let's try to specify a JSON schema.
+
+00:38:22.900 --> 00:38:29.640
+And your job is to generate data that looks like this and then maybe even validate and say, no, you did it wrong.
+
+00:38:29.720 --> 00:38:30.220
+Try it again.
+
+00:38:33.560 --> 00:38:40.940
+But this is really interesting using the native Python syntax and help, you know, doc strings, right?
+
+00:38:41.160 --> 00:38:41.500
+That's wild.
+
+00:38:41.980 --> 00:38:42.820
+Yeah, it's really nice.
+
+00:38:43.060 --> 00:38:48.100
+I think it lets developers kind of focus on just writing the code that makes sense for their use case.
+
+00:38:48.240 --> 00:38:55.140
+And then, yeah, under the hood, like we convert these schemas to LLM usable things.
+
+00:38:55.280 --> 00:39:05.500
+And this is a nice like intersection of my previous work and current work, which is like a lot of the, you know, function parsing uses tools like Pydantic to define schemas for models.
+
+00:39:05.740 --> 00:39:07.160
+So that's a cool overlap.
+
+00:39:07.760 --> 00:39:11.720
+Yeah, I know that that's an interesting aspect of what Pydantic is used for a lot.
+
+00:39:11.760 --> 00:39:13.640
+So as you were describing this, I was wondering, hmm,
+
+00:39:14.390 --> 00:39:15.740
+are you using Pydantic for this, perhaps?
+
+00:39:16.340 --> 00:39:16.800
+Yep, yep.
+
+00:39:17.659 --> 00:39:18.380
+Okay, amazing.
+
+00:39:20.300 --> 00:39:25.200
+I think this also blends well with Claude Code typewritten.
+
+00:39:25.940 --> 00:39:29.280
+I guess just Claude, you know, Claude Opus on it, whatever.
+
+00:39:29.820 --> 00:39:34.980
+The models, they're very keen to write doc strings, right?
+
+00:39:35.150 --> 00:39:39.700
+They just, even if you don't ask it to, a lot of times it's doc string, doc string, doc string.
+
+00:39:39.920 --> 00:39:41.460
+So I guess that would be really helpful, right?
+
+00:39:41.700 --> 00:39:54.600
+Yeah, definitely. I think it's nice that some of the things that we as developers weren't necessarily the best about in terms of code cleanliness or quality, we can now get some help enforcing as well.
+
+00:39:55.900 --> 00:40:00.440
+It's too much work, but it's not too much work for you, AI, because you don't get tired, so you do it.
+
+00:40:00.840 --> 00:40:01.360
+Yeah, exactly.
+
+00:40:02.980 --> 00:40:10.120
+Yeah. What about type hints? Does that play into anything that you consider?
+
+00:40:10.360 --> 00:40:15.400
+If I say it's an int versus a string, does it communicate, oh, you'd have to pass an integer here?
+
+00:40:16.240 --> 00:40:17.120
+Yeah, yeah, we do.
+
+00:40:17.270 --> 00:40:26.780
+So we generate the JSON schema both from the documentation for the parameters as well as the types associated with them.
+
+00:40:26.790 --> 00:40:29.020
+So that helps the model kind of align to.
+
+00:40:29.920 --> 00:40:30.100
+Yeah.
+
+00:40:30.920 --> 00:40:31.060
+Yeah.
+
+00:40:32.280 --> 00:40:32.900
+That's very cool.
+
+00:40:35.020 --> 00:40:38.460
+I see also that it says an MCP is supported.
+
+00:40:39.160 --> 00:40:39.500
+Yes.
+
+00:40:39.820 --> 00:40:40.880
+What does that mean?
+
+00:40:42.199 --> 00:40:45.560
+So MCP stands for Model Context Protocol.
+
+00:40:47.059 --> 00:40:51.780
+And as of now, MCP, the protocol has specs for a lot of things.
+
+00:40:51.870 --> 00:40:57.880
+But the thing that it's most popular for is kind of having a specification for what tools should look like.
+
+00:40:58.960 --> 00:41:05.340
+And so MCP clients can use tools provided on MCP servers.
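The docstring-and-type-hint parsing described a moment ago can be sketched with nothing but the standard library. This is a hypothetical illustration of the idea, not the actual deepagents/Pydantic machinery; the function, its fields, and the schema shape are all invented here:

```python
import inspect
from typing import get_type_hints

def get_weather(city: str, state: str) -> dict:
    """Get the current weather for a given city and state.

    Args:
        city: The city to get the weather for.
        state: The state the city is in.
    """
    return {"temperature": 72, "conditions": "sunny"}  # stub data

# Rough mapping from Python types to JSON schema type names.
PY_TO_JSON = {str: "string", int: "integer", float: "number", bool: "boolean"}

def tool_schema(func):
    """Derive a minimal tool description straight from the function itself."""
    hints = get_type_hints(func)
    hints.pop("return", None)
    doc = inspect.getdoc(func) or ""
    return {
        "name": func.__name__,
        # The first docstring paragraph becomes the tool description.
        "description": doc.split("\n\n")[0],
        "parameters": {name: {"type": PY_TO_JSON.get(tp, "string")}
                       for name, tp in hints.items()},
    }

schema = tool_schema(get_weather)
print(schema)
```

A description like this could then be folded into the model's prompt so it knows the tool exists, what it is for, and which arguments to pass.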
+
+00:41:06.080 --> 00:41:14.160
+And this means basically that you can use tools provided elsewhere, so not just in your own code that have a rendered interface.
+
+00:41:14.520 --> 00:41:21.240
+So that means that you can plug in an MCP server basically as a custom tool to this?
+
+00:41:22.080 --> 00:41:27.460
+Not that it itself does MCP server stuff, but it can consume MCP servers, is that correct?
+
+00:41:27.960 --> 00:41:28.360
+Yes.
+
+00:41:28.480 --> 00:41:42.260
+So you can fetch tools from MCP servers to use in your agents, which is really helpful if you want to use tools defined by others or maybe defined by others, you know, on a team adjacent to yours, things like that.
+
+00:41:42.700 --> 00:41:43.700
+Yeah, cool.
+
+00:41:43.820 --> 00:41:46.580
+I think the world sleeps a little bit on MCP servers.
+
+00:41:46.740 --> 00:41:57.020
+I think we could do a lot of neat stuff if more AI support, you know, Claude Code, Cursor, Claude, just no adjectives.
+
+00:41:58.220 --> 00:42:00.320
+those all support MCP servers.
+
+00:42:00.460 --> 00:42:02.760
+But for example, ChatGPT doesn't, right?
+
+00:42:02.900 --> 00:42:02.980
+- Yeah.
+
+00:42:03.180 --> 00:42:06.360
+- And that's probably the biggest one people use.
+
+00:42:06.360 --> 00:42:10.440
+But if you could say, I know it's got connect my calendar
+
+00:42:10.780 --> 00:42:13.660
+or three other things of all the possible data sources
+
+00:42:13.820 --> 00:42:15.820
+in the world, but you could have a lot more things
+
+00:42:15.980 --> 00:42:18.160
+if there was a little bit more support for this stuff,
+
+00:42:18.320 --> 00:42:19.420
+but it's cool that you all support it.
+
+00:42:19.900 --> 00:42:20.600
+- Yeah, definitely.
+
+00:42:21.360 --> 00:42:23.860
+I think it just helps a lot with like cross team collaboration
+
+00:42:24.200 --> 00:42:27.020
+and then also just like general community collaboration,
+
+00:42:27.340 --> 00:42:27.380
+Right?
+
+00:42:27.580 --> 00:42:29.460
+Like if there's some great idea for a tool,
+
+00:42:29.700 --> 00:42:32.060
+someone's probably implemented it somewhere.
+
+00:42:32.380 --> 00:42:34.360
+And it's nice to have that standardized interface.
+
+00:42:34.640 --> 00:42:34.720
+Yeah.
+
+00:42:35.140 --> 00:42:35.280
+Yeah.
+
+00:42:35.420 --> 00:42:37.620
+And the other thing I think is just the timeliness
+
+00:42:37.860 --> 00:42:38.980
+and the accuracy of the data.
+
+00:42:39.100 --> 00:42:41.140
+Because when you call an MCP server,
+
+00:42:41.260 --> 00:42:42.720
+you're basically just calling an API.
+
+00:42:43.320 --> 00:42:44.540
+And it can give you back the data.
+
+00:42:45.840 --> 00:42:49.220
+Whereas, you know, you ask-- if there was a weather MCP server,
+
+00:42:49.260 --> 00:42:51.840
+for example, instead of saying, what's the weather?
+
+00:42:51.940 --> 00:42:55.180
+It's like, well, my training data goes back to January 2025.
+
+00:42:55.780 --> 00:42:57.940
+So then the weather, like that is unhelpful to me.
+
+00:42:57.940 --> 00:42:59.760
+I want to know what the weather is now, right?
+
+00:43:00.280 --> 00:43:03.700
+It could ask exactly what it is.
+
+00:43:03.730 --> 00:43:06.920
+So I created an MCP server for Talk Python for people who don't know.
+
+00:43:08.319 --> 00:43:10.420
+And you can plug it into Claude and other things.
+
+00:43:10.470 --> 00:43:12.800
+You can say, what's the latest episode or what are the last five episodes?
+
+00:43:13.999 --> 00:43:19.640
+And if I published an episode 10 seconds ago, it'll show up if you ask the AI, right?
+
+00:43:19.780 --> 00:43:21.900
+I think that's one of the big benefits.
+
+00:43:22.130 --> 00:43:24.220
+That plus access to data that's like private.
+
+00:43:24.320 --> 00:43:32.280
+you know. Yeah, yeah, it's very helpful. And I think another thing to like highlight on this
+
+00:43:32.480 --> 00:43:40.260
+page is we support like using any model, which is really nice. So you don't have kind of this like
+
+00:43:40.340 --> 00:43:45.560
+vendor lock in, like the same flexibility that you get from, you know, being able to use tools
+
+00:43:45.640 --> 00:43:50.920
+from any provider. It's nice to be able to switch models based on your use case.
+
+00:43:51.740 --> 00:43:55.940
+Yeah, that's super cool. So for example, if you use Claude Code, you get to pick anything,
+
+00:43:56.130 --> 00:43:59.380
+as long as it's an Anthropic model, you can pick that one, right?
+
+00:43:59.610 --> 00:44:00.160
+Right, right.
+
+00:44:00.340 --> 00:44:06.620
+Whereas this, you could pick anything. Could I pick, so I'm running on my Mac Mini Pro,
+
+00:44:07.360 --> 00:44:12.040
+that's a little bit better at those things, I'm running the OpenAI,
+
+00:44:14.320 --> 00:44:17.420
+open weights model locally, like the 20 billion parameter one,
+
+00:44:18.460 --> 00:44:24.900
+And I got it set up so I can do basically treat it as an OpenAI API endpoint.
+
+00:44:25.080 --> 00:44:25.200
+Yeah.
+
+00:44:25.520 --> 00:44:31.060
+Could I plug that in here and then talk to my Mac mini instead of talking to a cloud frontier model?
+
+00:44:31.640 --> 00:44:32.480
+Yeah, yeah, you could.
+
+00:44:33.240 --> 00:44:40.340
+And so this has kind of been motivated by our open source philosophy and foundation.
+
+00:44:40.760 --> 00:44:43.860
+But you can use any model.
+
+00:44:44.540 --> 00:44:51.260
+We have tons of integrations in LangChain for all sorts of providers, including open source model adapters.
+
+00:44:52.200 --> 00:44:52.680
+Okay.
+
+00:44:53.200 --> 00:44:54.660
+And then it's also cool.
+
+00:44:55.500 --> 00:44:58.500
+Your subagents can use different models than your main agent.
+
+00:44:58.680 --> 00:45:02.780
+So you might want subagents to use a cheaper and faster model inherently, right?
+
+00:45:02.900 --> 00:45:04.840
+Because they should be handling kind of smaller tasks.
+
+00:45:05.140 --> 00:45:05.500
+Interesting.
+
+00:45:05.760 --> 00:45:06.160
+Yeah, yeah.
+
+00:45:07.120 --> 00:45:09.020
+I think that is a pattern that people use.
+
+00:45:09.180 --> 00:45:12.840
+they sort of plan with the higher model,
+
+00:45:13.020 --> 00:45:15.800
+maybe plan with Opus, but then you execute with Sonnet
+
+00:45:15.940 --> 00:45:18.120
+or something like that if you're in the Claude world.
+
+00:45:20.380 --> 00:45:21.940
+That, I think that can be really powerful
+
+00:45:22.460 --> 00:45:23.940
+'cause once you get everything set straight
+
+00:45:23.970 --> 00:45:26.120
+and you got the to-dos broken down and the sub-agents,
+
+00:45:26.240 --> 00:45:29.080
+you're right, it's a much smaller job to address the pieces.
+
+00:45:29.810 --> 00:45:30.480
+- Yep, yep.
+
+00:45:32.160 --> 00:45:35.440
+- Also, you got a CLI, I see.
+
+00:45:35.900 --> 00:45:37.080
+- Yeah, very exciting.
+
+00:45:37.320 --> 00:45:41.200
+So the Deep Agents CLI is kind of our coding agent.
+
+00:45:41.710 --> 00:45:44.500
+You can think of it as analogous to Claude Code.
+
+00:45:45.760 --> 00:45:50.820
+We use it internally at LangChain as opposed to Claude Code
+
+00:45:51.760 --> 00:45:53.200
+and enjoy some of those features.
+
+00:45:53.520 --> 00:45:56.860
+Like we support streaming, which is really nice.
+
+00:45:56.930 --> 00:46:00.440
+So you can actually see like, you know, word by word outputs,
+
+00:46:01.440 --> 00:46:03.440
+which is kind of nice from like an end user perspective.
+
+00:46:04.740 --> 00:46:08.940
+And then the model switching and built-in memory and things like that.
+
+00:46:09.900 --> 00:46:13.640
+So basically CLI built on top of the Deep Agents open source harness.
+
+00:46:14.800 --> 00:46:14.940
+Nice.
+
+00:46:15.870 --> 00:46:17.760
+Let me go back real quick to this different model.
+
+00:46:17.770 --> 00:46:20.980
+So here it says openai colon gpt-4o.
+
+00:46:22.540 --> 00:46:25.680
+Does it basically just know how to talk to OpenAI
+
+00:46:26.000 --> 00:46:28.840
+and then you've got to set an environment variable or something
+
+00:46:29.030 --> 00:46:30.860
+to specify your API key?
+
+00:46:31.360 --> 00:46:35.860
+Or how does it get connected behind the scenes?
+
+00:46:36.220 --> 00:46:37.960
+Yeah, that's exactly right.
+
+00:46:38.000 --> 00:46:41.000
+So you can set your API key in environment variables.
+
+00:46:41.500 --> 00:46:44.140
+You could also pass it in explicitly here if you wanted to do that.
+
+00:46:44.220 --> 00:46:47.520
+I think that's probably not the best practice.
+
+00:46:47.740 --> 00:46:53.240
+But Deep Agents is built on top of LangChain,
+
+00:46:53.440 --> 00:47:01.180
+which is kind of our tool for standardizing using different models.
+
+00:47:02.000 --> 00:47:07.900
+And so we have standard content blocks that represent different types of messages, and
+
+00:47:08.040 --> 00:47:10.320
+that's standardized across providers and models.
+
+00:47:11.760 --> 00:47:15.260
+And so we use LangChain under the hood to talk with all these different providers and
+
+00:47:15.320 --> 00:47:18.160
+then provide you, the end user, with kind of a unified experience.
+
+00:47:18.740 --> 00:47:18.920
+Nice.
+
+00:47:19.280 --> 00:47:23.580
+Yeah, so people who know LangChain or LangGraph, a lot of this is layered on...
+
+00:47:23.700 --> 00:47:25.820
+This is kind of on top of all that, right?
+
+00:47:26.260 --> 00:47:27.160
+Yeah, exactly.
+
+00:47:27.740 --> 00:47:30.700
+So it's actually built on both LangChain and LangGraph.
+
+00:47:31.280 --> 00:47:35.620
+So we think of LangGraph as like our agent runtime.
+
+00:47:36.280 --> 00:47:39.780
+This is like, you know, to get really like technical with it,
+
+00:47:40.690 --> 00:47:45.120
+the graph under the hood that's powering those like model and tool call iterations
+
+00:47:46.230 --> 00:47:51.020
+and streaming and LangGraph also powers like if you actually want to deploy your agent,
+
+00:47:51.140 --> 00:47:57.880
+it's kind of the framework that enables that with like durability and, and all of those like,
+
+00:47:57.990 --> 00:48:04.360
+you know, production grade features. Then LangChain itself is what we call an agent framework.
+
+00:48:05.120 --> 00:48:09.540
+That's different from an agent harness. Like it doesn't have all these things built in under the
+
+00:48:09.680 --> 00:48:16.419
+hood. But it just has those like agent building blocks. And then deep agents is the agent harness
+
+00:48:16.440 --> 00:48:23.200
+where we plug in all of that other logic. Got it. Okay, very cool. Yeah, it says the create deep agent
+
+00:48:23.360 --> 00:48:30.740
+returns a compiled LangGraph graph, so there you go. Right. Yep, yep. One other thing I forgot to
+
+00:48:30.850 --> 00:48:36.700
+mention, just we'll bring it up here, is one of the most important parts of our harness is
+
+00:48:37.880 --> 00:48:42.739
+summarization. So if you have a really long conversation with, like, Claude Code, you might see
+
+00:48:42.760 --> 00:48:48.600
+it say like compacting, and then it'll kind of spin for a minute or two. That's because you've
+
+00:48:48.640 --> 00:48:55.000
+actually hit, or you're close to hitting, the context limit, the context window limit for that model
+
+00:48:55.040 --> 00:49:00.920
+you're using. Yeah. And so we see with these like long-running, long-horizon tasks that effective
+
+00:49:01.180 --> 00:49:07.180
+summarization and compaction is super important. And so we basically guarantee with deep agents that
+
+00:49:07.320 --> 00:49:12.020
+you're never gonna like hit your context overflow error because under the hood we'll kind of keep
+
+00:49:11.980 --> 00:49:14.020
+track of things and summarize as we go.
+
+00:49:14.440 --> 00:49:32.860
+I love it. Okay. Yeah, that kind of brings us maybe to these lifecycle events and middleware, I think a little bit. So this is an interesting idea, because you have all these different capabilities that are, I guess I saw them as middleware.
+
+00:49:34.940 --> 00:49:40.060
+I don't know where there's some, there's a list somewhere, I'm sure, but you can plug
+
+00:49:40.220 --> 00:49:47.900
+into what happens before code is sent to the model or what happens for each step and things
+
+00:49:48.000 --> 00:49:48.200
+like that.
+
+00:49:48.360 --> 00:49:48.420
+Right.
+
+00:49:48.560 --> 00:49:50.440
+Maybe tell me more about this.
+
+00:49:51.100 --> 00:49:51.700
+Yeah, sure.
+
+00:49:52.100 --> 00:49:59.720
+I'll send, I sent you the link for, I think, we have a lot of, a lot of middleware content
+
+00:49:59.920 --> 00:50:00.340
+on our docs.
+
+00:50:01.940 --> 00:50:02.460
+There we go.
+
+00:50:03.480 --> 00:50:03.560
+Yeah.
+
+00:50:03.780 --> 00:50:03.960
+Perfect.
+
+00:50:04.320 --> 00:50:11.780
+So middleware is kind of this innovation that we shipped with LangChain 1.0 in October.
+
+00:50:13.920 --> 00:50:20.980
+And it's kind of the like intermediate step between it's what powers or enables the harness.
+
+00:50:21.380 --> 00:50:24.860
+And so we have that core model and tool calling loop.
+
+00:50:25.240 --> 00:50:33.040
+But you can imagine you might want to kind of hook into behavior around both of the model and tool kind of nodes.
+
+00:50:34.360 --> 00:50:36.940
+And I'll give some examples of what that might look like in context.
+
+00:50:37.320 --> 00:50:44.500
+So before your model runs, you might want to check if you need to summarize and do that before the model call.
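That before-the-model-call hook can be sketched in plain Python. This is a toy illustration of the middleware idea only; the hook name `before_model`, the state shape, and the compaction rule are all invented here and differ from the real LangChain middleware API:

```python
class SummarizationMiddleware:
    """Compact the conversation before each model call if it grows too long."""

    def __init__(self, max_messages=4):
        self.max_messages = max_messages

    def before_model(self, state):
        msgs = state["messages"]
        if len(msgs) > self.max_messages:
            # Replace the older messages with a stand-in summary, keeping
            # only the latest message verbatim.
            summary = f"[summary of {len(msgs) - 1} earlier messages]"
            state["messages"] = [summary, msgs[-1]]
        return state

def fake_model(state):
    # Stand-in for the real LLM call.
    return "response to: " + state["messages"][-1]

def run_turn(state, middlewares):
    # Each middleware gets a chance to adjust state before the model runs.
    for mw in middlewares:
        state = mw.before_model(state)
    return fake_model(state)

state = {"messages": [f"msg {i}" for i in range(6)]}
reply = run_turn(state, [SummarizationMiddleware()])
print(state["messages"])
print(reply)
```

The same shape generalizes to the other hooks discussed in this section: an after-model hook could pause for human approval before a sensitive tool call, and a wrapper around the tool call itself could add retries or fallbacks.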
+ +00:50:46.140 --> 00:51:00.020 +After the model runs and before you call a tool, you might want to check if that tool requires a human in the loop to approve before that kind of sensitive tool call runs. + +00:51:00.200 --> 00:51:08.960 +The classic example there is like if the model calls the send email tool, you might want to like approve that email before it's, you know, sent to your boss, for example. + +00:51:09.470 --> 00:51:10.300 +Do stock trade. + +00:51:10.870 --> 00:51:29.040 +Yeah, yeah, exactly. And then there's some less flashy, but, you know, still important things like just robustness and like fallbacks, model fallbacks or tool retries or things like that that you can support via middleware. + +00:51:30.100 --> 00:51:41.240 +Okay. Yeah, you can build your own as well for sure. And there's also some that are pre-built, right? Like you said some human in the loop summarization. + +00:51:41.880 --> 00:51:42.100 +Yep. + +00:51:44.120 --> 00:51:51.600 +Personal information detection, that's pretty interesting. The to-do list or the retry. + +00:51:52.100 --> 00:52:12.040 +Yeah. Yeah. So we kind of tried to like standardize and just observe common patterns that we saw for folks building agents and expose some common middlewares. But you can kind of build your own as well. And then Deep Agents uses middleware to power all of the things we're doing for our agents. + +00:52:13.200 --> 00:52:20.680 +Yeah, and I saw one of the presentations you all did that, you know, Claude Code and so on. + +00:52:20.680 --> 00:52:24.860 +They have these types of custom tools and middleware as well. + +00:52:24.960 --> 00:52:31.160 +So people probably are familiar with experiencing them, just not quite realizing exactly how, right? + +00:52:31.760 --> 00:52:33.260 +Yeah, yeah, I think that's true. + +00:52:33.360 --> 00:52:37.080 +And like middleware is generally just kind of a common software pattern, right? 
+
+00:52:37.180 --> 00:52:43.060
+Like you want to hook into lifecycle events and perform logic that's, you know, appropriate for your application.
+
+00:52:43.840 --> 00:52:44.880
+Yeah, 100%.
+
+00:52:46.300 --> 00:52:47.900
+All right, let's, we're getting short on time.
+
+00:52:48.060 --> 00:52:51.180
+Let's talk examples before we run out of time.
+
+00:52:51.190 --> 00:52:57.180
+And I think this will lead us into a couple more interesting elements that we maybe haven't necessarily talked about.
+
+00:52:57.230 --> 00:53:02.860
+So I'll link to this examples subfolder here on the GitHub repo.
+
+00:53:03.200 --> 00:53:07.080
+So we've got a deep research one, which I think is really cool.
+
+00:53:08.200 --> 00:53:12.640
+The content builder for writing the text to SQL agent, Ralph mode.
+
+00:53:13.080 --> 00:53:14.760
+I've yet to experience Ralph mode.
+
+00:53:14.820 --> 00:53:18.200
+I haven't done anything with that, but that's just the, I don't care.
+
+00:53:18.280 --> 00:53:18.840
+Just keep trying.
+
+00:53:19.180 --> 00:53:20.020
+If you fail, just keep trying.
+
+00:53:20.360 --> 00:53:20.460
+Right.
+
+00:53:20.680 --> 00:53:21.180
+Sort of mode.
+
+00:53:21.580 --> 00:53:21.680
+Yeah.
+
+00:53:21.760 --> 00:53:24.880
+Ralph from Simpsons, South Park.
+
+00:53:25.060 --> 00:53:25.480
+I don't know one of them.
+
+00:53:26.000 --> 00:53:26.160
+Yep.
+
+00:53:26.800 --> 00:53:26.900
+Yeah.
+
+00:53:28.320 --> 00:53:31.680
+Anyway, maybe we could talk about the deep research one first,
+
+00:53:31.840 --> 00:53:34.260
+because it's got a cool UI component.
+
+00:53:34.430 --> 00:53:41.180
+So you can run this in both as a Jupyter Notebook
+
+00:53:41.230 --> 00:53:45.180
+and play with it or a LangGraph dev UI.
+
+00:53:46.320 --> 00:53:48.460
+And then you also have some other UIs as well, right?
+
+00:53:48.460 --> 00:53:48.940
+I can't remember.
+
+00:53:50.380 --> 00:53:53.060
+I feel like there's a third UI that you all support
+
+00:53:53.130 --> 00:53:54.120
+for this kind of stuff.
+
+00:53:54.150 --> 00:53:55.380
+So maybe let's talk through this one
+
+00:53:55.480 --> 00:53:56.960
+and then tell me about it.
+
+00:53:57.619 --> 00:53:58.860
+Yeah, definitely.
+
+00:53:59.140 --> 00:54:05.940
+So the idea with deep research is that you are going to be doing a pretty long running task.
+
+00:54:06.060 --> 00:54:08.840
+Like you want your model to be really thorough.
+
+00:54:09.060 --> 00:54:16.700
+And then one of the most important tools for deep research is web search because you want to get like current and relevant information.
+
+00:54:17.480 --> 00:54:19.880
+So we use Tavily for web search.
+
+00:54:21.600 --> 00:54:25.700
+And then I can talk a little bit about our like UI as well.
+
+00:54:26.040 --> 00:54:26.240
+Yeah.
+
+00:54:26.940 --> 00:54:34.780
+So we, I guess I'll, yeah, I can chat a little bit about our UI, but generally it's hard to build agents, right?
+
+00:54:35.010 --> 00:54:37.600
+Like we talk about prompt optimizations and things.
+
+00:54:39.500 --> 00:54:44.340
+And LangChain, the company, provides a lot of tools to make it easier to build agents.
+
+00:54:44.470 --> 00:54:50.180
+And so one of them is this kind of agent viewer where you can see each of the steps in your agent.
+
+00:54:50.440 --> 00:54:56.900
+In this case, we see like the summarization middleware step and then the model and tool steps.
+
+00:54:58.480 --> 00:55:05.240
+And then, yeah, so that kind of makes it easier to like step through and understand the behavior of your model.
+
+00:55:06.340 --> 00:55:11.100
+Right. Ultimately, as we talked about, it's a bit of a LangGraph thing anyway.
+
+00:55:11.330 --> 00:55:14.460
+So it shows you how that all comes together, right?
+
+00:55:15.020 --> 00:55:39.840
+Yes. Yep. And you can see on the right, we're looking at kind of the trace of things. And so we see like the to-do middleware being called and other tool calls, etc. So we try to really make those like agent behavior primitives first class. So you can really narrow into like, what is the model doing once I invoke it?
+
+00:55:40.420 --> 00:55:44.140
+Yeah. So that's LangGraph dev for this project.
+
+00:55:44.360 --> 00:55:53.760
+You can also do the notebook, and there's a really nice, actually nice visualizations in there for, you know, what is the prompt, what is the result and so on, right?
+
+00:55:53.940 --> 00:55:58.260
+I think it comes out pretty, maybe I can open up the notebook and it's got the results cached.
+
+00:55:58.400 --> 00:56:05.560
+You know, sometimes that's, that's both a benefit and a drawback in notebooks, but right now it would be a benefit of some of the, yeah, for example,
+
+00:56:06.760 --> 00:56:11.440
+has like a really nice display of what the prompt was with formatting and so on, right?
+
+00:56:11.920 --> 00:56:12.080
+Yep.
+
+00:56:12.480 --> 00:56:12.600
+Yep.
+
+00:56:16.400 --> 00:56:18.480
+And then there's a third one, if I remember correctly,
+
+00:56:18.700 --> 00:56:25.800
+that's kind of like a web UI for like ChatGPT or Claude, just no adjectives.
+
+00:56:26.720 --> 00:56:26.900
+Yeah.
+
+00:56:27.600 --> 00:56:32.000
+So we had kind of a deep research UI that we built out as a POC around like,
+
+00:56:32.240 --> 00:56:35.160
+we just want to make this easy for folks to view.
+
+00:56:36.260 --> 00:56:45.440
+I will note we have recently rolled out a product called Agent Builder, which is like a no-code agent builder powered by deep agents.
+
+00:56:46.460 --> 00:56:49.400
+And it is, you know, somewhat inspired from this UI.
+
+00:56:49.490 --> 00:56:58.320
+It basically is like a chat interface for an agent that gives you insights into the tool calls that are happening and things like that.
+
+00:56:58.660 --> 00:57:03.540
+That's kind of our like modern version of how you would probably go about seeing this in a UI.
+
+00:57:04.300 --> 00:57:04.760
+Sure. Okay.
+
+00:57:07.920 --> 00:57:17.140
+What else? I guess a couple of other examples here. What's this text to SQL story? Yeah, so
+
+00:57:17.880 --> 00:57:25.900
+the idea here is that if an agent has some information about data structure for your,
+
+00:57:26.160 --> 00:57:33.640
+you know, for your database, etc., it is much easier as a, like, person to learn about that data if you
+
+00:57:33.380 --> 00:57:40.620
+can kind of ask just like regular questions and the agent can convert those questions into SQL
+
+00:57:40.860 --> 00:57:46.720
+queries based on the structure of the data and then, you know, run them and answer. So this is like,
+
+00:57:46.910 --> 00:57:53.720
+I think a really powerful agentic pattern to have when you just think about like data analysis and
+
+00:57:54.820 --> 00:57:59.020
+like business logic and.
+
+00:57:59.840 --> 00:58:00.220
+Yeah.
+
+00:58:00.480 --> 00:58:04.640
+If you could somehow parse out the database schema and tables
+
+00:58:05.960 --> 00:58:08.220
+and then use that as part of your system prompt,
+
+00:58:08.330 --> 00:58:11.500
+you know, when the user asks you to do a thing,
+
+00:58:11.960 --> 00:58:14.460
+it has to match one of these elements
+
+00:58:14.570 --> 00:58:15.580
+in the mid-converted SQL.
+
+00:58:15.780 --> 00:58:16.560
+That's pretty neat.
+
+00:58:17.040 --> 00:58:17.240
+Yeah.
+
+00:58:17.680 --> 00:58:17.740
+Yeah.
+
+00:58:18.920 --> 00:58:20.500
+Very cool that, you know,
+
+00:58:20.890 --> 00:58:23.079
+data analysis in general is kind of accelerated
+
+00:58:23.660 --> 00:58:25.240
+by agent support.
+
+00:58:26.060 --> 00:58:26.460
+Yeah, absolutely.
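The schema-into-system-prompt idea mentioned above can be sketched with the standard library's sqlite3 module. Everything here, the table, the sample rows, the prompt wording, and the "generated" query, is invented for illustration:

```python
import sqlite3

# Build a throwaway in-memory database standing in for the real one.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer, total) VALUES (?, ?)",
    [("Ada", 42.0), ("Grace", 17.5)],
)

# Pull the CREATE statements straight out of SQLite's catalog table.
schema = "\n".join(
    row[0]
    for row in conn.execute("SELECT sql FROM sqlite_master WHERE type = 'table'")
)

# Fold the schema into the agent's system prompt so generated SQL can
# only refer to tables and columns that actually exist.
system_prompt = "You translate plain questions into SQL. Only use these tables:\n" + schema
print(system_prompt)

# A query the agent might generate for "who spent the most?"; the harness
# would run it and hand the rows back to the model to phrase an answer.
rows = conn.execute("SELECT customer, total FROM orders ORDER BY total DESC").fetchall()
print(rows)
```

The same introspect-then-prompt step works for other databases; only the catalog query changes.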
+
+00:58:27.040 --> 00:58:28.820
+You know, one thing that like really,
+
+00:58:30.580 --> 00:58:33.740
+this reminds me of is like five years ago,
+
+00:58:33.880 --> 00:58:35.400
+it was like very normal to kind of
+
+00:58:36.360 --> 00:58:37.720
+be bashing your head against the wall,
+
+00:58:37.860 --> 00:58:39.420
+trying to figure out like how to transform
+
+00:58:39.680 --> 00:58:42.780
+your pandas DataFrame to be in some shape
+
+00:58:42.980 --> 00:58:44.940
+so that you could make some graph, right?
+
+00:58:45.040 --> 00:58:46.440
+And like, that's a lot easier now
+
+00:58:47.340 --> 00:58:47.920
+with this sort of thing.
+
+00:58:48.080 --> 00:58:48.980
+Like once you have the data,
+
+00:58:49.560 --> 00:58:54.820
+your AI tool can help you kind of shape and mold things as necessary for an
+
+00:58:54.860 --> 00:58:57.640
+app. Yeah, absolutely. I want this in a pie chart,
+
+00:58:58.270 --> 00:59:00.060
+broken down by this. Okay. Yeah.
+
+00:59:00.340 --> 00:59:03.760
+And then it's like in your file system already, which is pretty cool. Yeah.
+
+00:59:03.880 --> 00:59:08.200
+That's awesome. All right. A couple more things really quick.
+
+00:59:08.330 --> 00:59:13.320
+If I go back here, security,
+
+00:59:14.920 --> 00:59:16.540
+I don't know why people worry about security.
+
+00:59:17.020 --> 00:59:19.300
+You hear about all these jokes of like,
+
+00:59:20.780 --> 00:59:23.120
+I was vibe coding and deleting my hard drive.
+
+00:59:23.260 --> 00:59:23.660
+I don't know.
+
+00:59:23.780 --> 00:59:24.580
+I don't know what I'm doing.
+
+00:59:25.740 --> 00:59:26.780
+Or there was somebody,
+
+00:59:27.160 --> 00:59:31.000
+I think it was with one of these online low-code type of things.
+
+00:59:31.620 --> 00:59:33.400
+They were vibe coding their app,
+
+00:59:33.500 --> 00:59:35.020
+and they were just doing it in production
+
+00:59:35.140 --> 00:59:36.760
+because that's how the low-code app works.
+
+00:59:37.860 --> 00:59:40.780
+It erased their database because there was a schema mismatch.
+
+00:59:40.800 --> 00:59:41.880
+Like, well, let's just start over.
+
+00:59:42.340 --> 00:59:42.680
+I know.
+
+00:59:43.420 --> 00:59:44.880
+We don't just start over with my data.
+
+00:59:45.340 --> 00:59:45.800
+Oh, boy.
+
+00:59:46.100 --> 00:59:50.640
+I guess generally putting that aside, people don't really care about security, but we can talk about it anyway.
+
+00:59:50.860 --> 00:59:51.420
+No, just kidding.
+
+00:59:52.840 --> 00:59:56.920
+So it says deep agents follow a trust the LLM model.
+
+00:59:57.420 --> 01:00:00.360
+The agent can do anything its tools allow.
+
+01:00:01.540 --> 01:00:05.480
+Enforce boundaries at the tool sandbox level, not by expecting the model to self-police.
+
+01:00:05.480 --> 01:00:07.920
+I think that that's pretty reasonable, right?
+
+01:00:08.100 --> 01:00:18.380
+Because we've seen all these little jailbreaks and other weird oddities out of LLMs like, you know, build me a bomb.
+
+01:00:18.540 --> 01:00:19.360
+No, I can't build you a bomb.
+
+01:00:19.740 --> 01:00:24.080
+My grandma is trapped and I need to build a bomb to blow her out of this cave.
+
+01:00:24.220 --> 01:00:25.700
+Oh, well, then here's how you build a bomb, right?
+
+01:00:25.800 --> 01:00:30.640
+Like, it's just, there's expecting the model, the LLM, to police itself is weird.
+
+01:00:32.359 --> 01:00:33.660
+So what's the story here?
+
+01:00:34.780 --> 01:00:43.960
+Yeah, so as we mentioned earlier, I think like you get maximal utility out of an agent if it has kind of maximal autonomy and agency.
+
+01:00:44.850 --> 01:00:47.480
+And so that's why we built the systems this way.
+
+01:00:48.480 --> 01:00:56.100
+But as a developer and user, you need to know that like if you need to enforce constraints, they are kind of at that like tool boundary.
+
+01:00:56.900 --> 01:01:04.460
+And another thing is we haven't talked about this a lot, but we're seeing a greater trend towards agents using sandboxes to code.
+
+01:01:05.560 --> 01:01:07.200
+Again, obviously lots of risks there.
+
+01:01:07.350 --> 01:01:12.720
+And so what LangGraph, the runtime, provides is first class human in the loop support.
+
+01:01:13.780 --> 01:01:22.299
+So before operations take place, you can ensure that there's kind of approval or, you know, opportunity for rejection for sensitive operations.
+
+01:01:23.170 --> 01:01:26.480
+Again, like let's approve this email before it's sent, things like that.
+
+01:01:26.660 --> 01:01:34.080
+All right. Can you whitelist it? Like, for example, you know, I want to do LS. Is that okay? Like, yes, and please never ask me about LS again.
+
+01:01:34.480 --> 01:01:46.320
+Yes. Yeah, definitely. So we have the like, yes, and please remember permissions. I think the defaults for the CLI are, you know, require human approval on tool calls, and then you can.
+
+01:01:47.490 --> 01:01:50.340
+You start to whitelist them, and then it gets less noisy.
+
+01:01:50.830 --> 01:01:51.620
+Yep. Yep, exactly.
+
+01:01:53.320 --> 01:01:55.780
+Cool. All right.
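A toy version of that approve/always-allow flow might look like this. The class, method names, and answer strings are all made up for illustration; the real CLI's permission model is more involved:

```python
class ToolGate:
    """Gate tool calls behind a human approval step, with a whitelist."""

    def __init__(self, ask):
        self.ask = ask          # callable standing in for the human prompt
        self.whitelist = set()

    def run(self, tool_name, func, *args):
        if tool_name not in self.whitelist:
            answer = self.ask(tool_name)   # expected: "yes", "always", or "no"
            if answer == "always":
                # "Yes, and never ask me about this tool again."
                self.whitelist.add(tool_name)
            elif answer != "yes":
                raise PermissionError(f"{tool_name} was rejected")
        return func(*args)

asked = []

def fake_human(tool_name):
    # Stand-in for the interactive prompt; records what was asked.
    asked.append(tool_name)
    return "always"

gate = ToolGate(fake_human)
gate.run("ls", lambda: ["file.txt"])
gate.run("ls", lambda: ["file.txt"])   # second call skips the prompt entirely
print(asked)
```

The key point matches the "enforce boundaries at the tool level" advice above: the check lives in deterministic code around the tool call, not in the model's own judgment.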
+
+01:01:56.940 --> 01:02:04.100
+Final, final thought here: what's next? Where are things going? That's a great question.
+
+01:02:05.480 --> 01:02:12.700
+I don't have like magical insights. I think sandboxes are definitely a very promising
+
+01:02:13.470 --> 01:02:18.800
+and kind of growing pattern. Just as you mentioned, you know, initially, being able to, like,
+
+01:02:19.440 --> 01:02:26.420
+run code and execute code is super valuable in terms of like productivity. If you know that
+
+01:02:26.420 --> 01:02:31.360
+the code handed to you is like tested and functional, that's really valuable. And then
+
+01:02:31.440 --> 01:02:38.640
+I'll also say that, like, we think about agents who write code as coding agents, but actually, I think
+
+01:02:38.740 --> 01:02:46.579
+that, like, coding is just a productivity accelerator. Like, you can use code to perform data analysis or
+
+01:02:46.600 --> 01:02:51.480
+to do so many other things that need to be automated.
+
+01:02:51.890 --> 01:02:53.200
+So I think we're going to start to see
+
+01:02:53.700 --> 01:02:54.920
+more general purpose agents
+
+01:02:55.420 --> 01:02:57.560
+who just write code to help them with things.
+
+01:02:59.360 --> 01:03:00.000
+So, yeah.
+
+01:03:01.000 --> 01:03:04.560
+Yeah, I'm trying to dream up what I might build with this.
+
+01:03:08.060 --> 01:03:08.800
+There was this joke,
+
+01:03:09.560 --> 01:03:12.220
+this joke that I did on the Python Bytes podcast.
+
+01:03:12.520 --> 01:03:13.900
+It comes to mind now.
+
+01:03:14.400 --> 01:03:17.120
+There's this character putting their hands up.
+
+01:03:17.180 --> 01:03:18.620
+It's a silent side project.
+
+01:03:19.320 --> 01:03:20.920
+The new side project is talking.
+
+01:03:21.540 --> 01:03:22.960
+I feel like that's how my life is.
+
+01:03:23.120 --> 01:03:24.380
+It's just so exciting.
+ +01:03:24.420 --> 01:03:25.860 +You can build so many things with AI, + +01:03:26.820 --> 01:03:29.120 +have AI build the thing or with this, + +01:03:29.240 --> 01:03:32.400 +like imbue it with really powerful agentic capabilities. + +01:03:32.580 --> 01:03:34.940 +And there's just, I think the real challenge + +01:03:35.200 --> 01:03:38.300 +is finding time and focusing and finishing anything at all. + +01:03:38.300 --> 01:03:40.760 +'Cause it's just so exciting to try ideas out. + +01:03:41.420 --> 01:03:42.760 +- Yeah, yeah, definitely. + +01:03:43.220 --> 01:03:47.260 +It is nice that it's easier to get further with ideas + +01:03:47.390 --> 01:03:49.480 +in a very short period of time due to these tools. + +01:03:49.540 --> 01:03:49.620 +- Yeah. + +01:03:50.380 --> 01:03:50.480 +- Yeah. + +01:03:51.070 --> 01:03:53.680 +- I think it's great 'cause you can test out an idea + +01:03:53.730 --> 01:03:55.880 +and go, "Ah, that's not that great." + +01:03:55.880 --> 01:03:57.960 +Actually, this is a really good idea + +01:03:57.990 --> 01:03:59.280 +that I'm gonna keep going, you know? + +01:03:59.500 --> 01:04:00.900 +So that's really cool. + +01:04:01.360 --> 01:04:04.200 +Anything new or anything planned with deep agents + +01:04:04.460 --> 01:04:05.740 +that's coming that we don't know about + +01:04:05.790 --> 01:04:08.000 +or it's not obvious or haven't talked about yet? + +01:04:09.960 --> 01:04:13.180 +- I think we'll probably release + +01:04:13.200 --> 01:04:19.280 +roadmap soon. I mean, we're, we're really sprinting towards, I think a 1.0 eventually. + +01:04:20.800 --> 01:04:24.540 +We just kind of want to solidify those like core primitives. Like I mentioned, we have + +01:04:25.720 --> 01:04:32.120 +file systems that you can use like remote file systems as well, like, you know, an S3 backend + +01:04:32.510 --> 01:04:41.480 +or a database backed backend. 
And so, yeah, just excited to kind of keep sprinting on
+
+01:04:42.320 --> 01:04:45.640
+what the latest and greatest trends are in agent harnesses.
+
+01:04:46.400 --> 01:04:48.740
+One resource that I would point to,
+
+01:04:48.780 --> 01:04:49.920
+let me see if I can find the link,
+
+01:04:50.320 --> 01:04:57.900
+is more and more we're seeing with agent development
+
+01:04:58.240 --> 01:05:00.820
+that you're not really able to do it well
+
+01:05:01.040 --> 01:05:02.820
+if you can't look under the hood
+
+01:05:02.940 --> 01:05:05.580
+and see what your agent's doing.
+
+01:05:06.820 --> 01:05:09.920
+And so we just released this blog post
+
+01:05:09.920 --> 01:05:12.160
+on harness engineering.
+
+01:05:12.340 --> 01:05:14.780
+So basically like how we go about
+
+01:05:15.160 --> 01:05:16.840
+improving our harness systematically.
+
+01:05:17.960 --> 01:05:20.520
+And it was very dependent on looking at our traces
+
+01:05:21.100 --> 01:05:25.080
+of agent behavior and like even using LLMs
+
+01:05:25.100 --> 01:05:26.380
+to analyze those traces.
+
+01:05:28.180 --> 01:05:30.380
+And yeah, like the tale is in the trace.
+
+01:05:30.620 --> 01:05:33.600
+So I guess lesson here just being,
+
+01:05:34.500 --> 01:05:38.599
+it's really cool to use traces to self-improve
+
+01:05:39.280 --> 01:05:41.680
+our own, you know, harness and things like that.
+
+01:05:42.360 --> 01:05:46.200
+Yeah, that's wild to actually see the steps.
+
+01:05:46.560 --> 01:05:53.520
+And I guess you could probably even look at like failures and retries and how does the context vary?
+
+01:05:54.800 --> 01:05:55.740
+Yeah, yeah, exactly.
+
+01:05:56.700 --> 01:05:56.880
+Interesting.
+
+01:05:57.310 --> 01:05:57.420
+Okay.
+
+01:05:59.460 --> 01:06:01.040
+Well, very cool.
+
+01:06:01.920 --> 01:06:02.300
+Thank you, Sydney.
+
+01:06:03.540 --> 01:06:04.520
+Maybe final call to action.
+
+01:06:04.700 --> 01:06:06.260
+People want to get started with deep agents.
+
+01:06:06.390 --> 01:06:06.860
+What do they do?
+
+01:06:07.900 --> 01:06:10.260
+You do uv pip install deepagents.
+
+01:06:12.200 --> 01:06:15.180
+Yeah, but super easy to get started in just a couple lines of code.
+
+01:06:16.580 --> 01:06:18.660
+And we're an open source team,
+
+01:06:18.840 --> 01:06:23.560
+so always happy to answer questions or accepting contributions, et cetera.
+
+01:06:23.800 --> 01:06:24.280
+Awesome.
+
+01:06:24.410 --> 01:06:26.180
+Do you have a Discord channel or something like that?
+
+01:06:27.420 --> 01:06:27.740
+I don't.
+
+01:06:27.740 --> 01:06:28.960
+You got a community group?
+
+01:06:29.640 --> 01:06:31.580
+We do have a forum that we.
+
+01:06:31.600 --> 01:06:32.160
+Okay, there you go.
+
+01:06:32.380 --> 01:06:34.960
+Yeah, I would direct people to the forum generally.
+
+01:06:35.840 --> 01:06:35.980
+Sweet.
+
+01:06:36.600 --> 01:06:36.840
+All right.
+
+01:06:38.400 --> 01:06:39.080
+very interesting
+
+01:06:40.460 --> 01:06:41.360
+what a wild time
+
+01:06:42.000 --> 01:06:43.920
+what a weird and interesting time we live in
+
+01:06:44.080 --> 01:06:44.820
+but very cool
+
+01:06:45.260 --> 01:06:47.760
+yeah great to chat with you about all things
+
+01:06:47.920 --> 01:06:49.060
+deep agents thanks for having me
+
+01:06:49.440 --> 01:06:51.820
+you bet keep up the good work talk to you next time
+
diff --git a/youtube_transcripts/544-wheel-next.vtt b/youtube_transcripts/544-wheel-next.vtt
new file mode 100644
index 0000000..84d500f
--- /dev/null
+++ b/youtube_transcripts/544-wheel-next.vtt
@@ -0,0 +1,3647 @@
+WEBVTT
+
+00:00:00.719 --> 00:00:02.520
+Jonathan, Ralph, and Charlie,
+
+00:00:03.340 --> 00:00:06.500
+welcome or welcome back depending on which one of you are here in this.
+
+00:00:07.520 --> 00:00:08.280
+Welcome to the show you all.
+
+00:00:08.360 --> 00:00:09.660
+It's awesome to have you on Talk Python To Me.
+
+00:00:10.500 --> 00:00:11.320
+Thanks for having us.
+
+00:00:11.440 --> 00:00:12.020
+Thanks for having us.
+
+00:00:12.020 --> 00:00:12.920
+Thanks for having us Michael.
+
+00:00:13.799 --> 00:00:18.480
+We're going to dive in deep to Python packaging and
+
+00:00:19.120 --> 00:00:23.180
+really look at how the needs of Python packaging have evolved.
+
+00:00:25.059 --> 00:00:28.400
+What you all as well as a bunch of other people,
+
+00:00:28.500 --> 00:00:31.540
+I see very long contributor lists on these PEPs.
+
+00:00:32.439 --> 00:00:34.880
+So a lot of people involved in this project.
+
+00:00:37.500 --> 00:00:37.860
+Really great.
+
+00:00:39.100 --> 00:00:41.060
+So let's get into it.
+
+00:00:41.160 --> 00:00:45.060
+Before we do, let's just do a quick round of intros for you all.
+
+00:00:45.420 --> 00:00:48.200
+I guess go around clockwise.
+
+00:00:48.600 --> 00:00:49.280
+Jonathan, you go first.
+
+00:00:51.080 --> 00:00:55.880
+So I work at NVIDIA for, I think, the better part of eight years
+
+00:00:56.160 --> 00:00:56.540
+right now.
+
+00:00:57.320 --> 00:01:00.800
+I did all kinds of different roles, but very recently,
+
+00:01:01.120 --> 00:01:03.260
+I mean, over the last two something years,
+
+00:01:03.840 --> 00:01:07.940
+I moved into improving our CUDA and Python offering,
+
+00:01:08.240 --> 00:01:13.020
+trying to find better ways to expose GPU programming,
+
+00:01:13.340 --> 00:01:14.740
+essentially, at the Python layer.
+
+00:01:15.260 --> 00:01:19.200
+And I think for a little bit over a year,
+
+00:01:19.320 --> 00:01:20.960
+I've been working with Ralph and Charlie
+
+00:01:22.080 --> 00:01:24.620
+over multiple proposals to improve Python packaging,
+
+00:01:25.700 --> 00:01:30.940
+initiative called WheelNext and I think we'll talk a little bit more about this so excited to be on
+
+00:01:30.940 --> 00:01:36.600
+the show today yeah excited to have you you have really seen the roller coaster at NVIDIA I'm sure
+
+00:01:37.500 --> 00:01:43.020
+right it was like when you start yeah it was like gaming and probably some data science and then
+
+00:01:43.720 --> 00:01:47.900
+all all the changes and now just center of the universe so I'm sure it is exciting
+
+00:01:48.660 --> 00:01:52.360
+you know the funny thing is I I wanted to join NVIDIA since 15 years
+
+00:01:53.320 --> 00:01:56.560
+and I need a PhD to actually be able to join NVIDIA.
+
+00:01:56.820 --> 00:01:57.840
+That is so awesome.
+
+00:01:58.590 --> 00:01:59.020
+I love it.
+
+00:01:59.160 --> 00:02:01.120
+I was amazed by the CUDA technology
+
+00:02:01.270 --> 00:02:02.220
+when I was in high school
+
+00:02:02.370 --> 00:02:02.900
+and I was like,
+
+00:02:03.060 --> 00:02:05.820
+ah, this is so incredible, the concept.
+
+00:02:06.600 --> 00:02:07.880
+And I wanted to join,
+
+00:02:08.100 --> 00:02:10.100
+so I'm happy I was able to make this happen.
+
+00:02:11.680 --> 00:02:11.900
+Yeah.
+
+00:02:12.680 --> 00:02:15.260
+You know, CUDA is going to be an important part
+
+00:02:15.260 --> 00:02:15.940
+of this discussion.
+
+00:02:16.260 --> 00:02:16.940
+Not the only part,
+
+00:02:16.990 --> 00:02:18.900
+but it certainly is one of the forcing functions
+
+00:02:19.240 --> 00:02:21.140
+for the things happening here.
+
+00:02:21.920 --> 00:02:24.660
+Give people the background on CUDA.
+
+00:02:25.380 --> 00:02:25.800
+What is it?
+
+00:02:26.050 --> 00:02:26.600
+How does it work?
+
+00:02:26.780 --> 00:02:27.420
+Why is it so amazing?
+
+00:02:29.180 --> 00:02:32.880
+Well, CUDA is essentially a programming language
+
+00:02:32.990 --> 00:02:38.120
+that allows you to program on GPUs, specifically NVIDIA GPUs,
+
+00:02:38.400 --> 00:02:40.620
+and has a different programming model
+
+00:02:40.940 --> 00:02:43.500
+than what you would usually do in C.
+
+00:02:44.579 --> 00:02:48.140
+Because GPUs are fundamentally very different than CPUs,
+
+00:02:48.600 --> 00:02:50.580
+you have to program them with a different mindset.
+
+00:02:51.120 --> 00:02:54.920
+Like, for example, the biggest important thing when you start with GPU
+
+00:02:54.940 --> 00:02:59.420
+is to not think about a single thread executing the instruction,
+
+00:02:59.680 --> 00:03:05.260
+but like, how can you massively parallelize a task on like thousands of threads at the single?
+
+00:03:06.200 --> 00:03:18.820
+And it takes a different perspective and mode of thinking to how can you imagine doing a task on so many threads at the same time.
+
+00:03:19.260 --> 00:03:23.980
+We're not used to it as classic computer scientists.
+
+00:03:24.660 --> 00:03:26.680
+If anything, multi-threading is something
+
+00:03:26.790 --> 00:03:30.940
+that we tend to shy away from because there's a lot of caveats.
+
+00:03:32.030 --> 00:03:34.400
+But well, GPU programming is all about how can you
+
+00:03:34.580 --> 00:03:36.040
+have as many threads as possible.
+
+00:03:37.060 --> 00:03:37.740
+Yeah, yeah.
+
+00:03:38.260 --> 00:03:44.120
+It comes from graphics and videos where this pixel is computed
+
+00:03:44.380 --> 00:03:45.500
+independently of that pixel.
+
+00:03:45.760 --> 00:03:49.440
+And we've got 5K resolution.
+
+00:03:49.800 --> 00:03:51.060
+So let's just break that up, right?
+
+00:03:52.280 --> 00:03:53.960
+Yeah, it's exactly the idea.
+ +00:03:54.060 --> 00:04:00.460 +And now we have this reasonably new model that's called Tile Programming that abstracts it even more, + +00:04:01.960 --> 00:04:05.920 +which essentially instead of thinking about threads and blocks and grids, + +00:04:06.400 --> 00:04:12.260 +you think in terms of Tile, so kind of a mini representation that you could have in mind. + +00:04:12.960 --> 00:04:15.480 +And that thing can scale and adapt differently + +00:04:15.830 --> 00:04:16.640 +on different hardware. + +00:04:16.920 --> 00:04:17.500 +So pretty cool. + +00:04:18.030 --> 00:04:19.780 +But yeah. + +00:04:19.780 --> 00:04:20.700 +Yeah, that is amazing. + +00:04:21.340 --> 00:04:24.200 +People think that their CPU has a lot of cores. + +00:04:25.560 --> 00:04:28.060 +It's got nothing on the graphics cards. + +00:04:29.480 --> 00:04:30.340 +Well, yeah. + +00:04:30.800 --> 00:04:33.220 +It's a different type of hardware. + +00:04:34.040 --> 00:04:35.220 +Reach is a different objective. + +00:04:37.160 --> 00:04:38.220 +Yeah, well, very cool. + +00:04:38.400 --> 00:04:38.660 +Very cool. + +00:04:38.800 --> 00:04:42.240 +And what a journey if you did all that work to get there. + +00:04:42.720 --> 00:04:43.360 +Absolutely love it. + +00:04:43.820 --> 00:04:44.460 +Ralph, welcome. + +00:04:44.760 --> 00:04:44.840 +Hello. + +00:04:46.000 --> 00:04:46.180 +Yeah. + +00:04:46.580 --> 00:04:46.920 +Thanks, Michael. + +00:04:47.280 --> 00:04:47.820 +Great to be here. + +00:04:49.720 --> 00:04:52.200 +So about me, I am a physicist by training. + +00:04:53.080 --> 00:04:55.660 +I did a PhD in atomic and quantum physics. + +00:04:57.100 --> 00:04:59.340 +Worked in the semiconductor industry in a while. + +00:05:00.980 --> 00:05:03.700 +And I rolled into scientific computing due to that. + +00:05:03.700 --> 00:05:11.360 +I started using Python in 2004 and used the mailing list at that point because there was, I mean, NumPy didn't exist yet. 
+
+00:05:11.600 --> 00:05:13.180
+There was no documentation for anything.
+
+00:05:13.310 --> 00:05:14.560
+So you had to join a mailing list.
+
+00:05:15.140 --> 00:05:16.880
+That's how I rolled into open source early on.
+
+00:05:17.660 --> 00:05:22.100
+I became the release manager of NumPy and SciPy in 2010.
+
+00:05:22.390 --> 00:05:24.960
+And yeah, I've been kind of doing that ever since
+
+00:05:26.320 --> 00:05:28.600
+as a volunteer for 10 years and then it got really too much.
+
+00:05:28.880 --> 00:05:29.620
+So I made it my job.
+
+00:05:30.720 --> 00:05:33.500
+I joined Quansight, which is a small consulting company
+
+00:05:33.900 --> 00:05:36.680
+primarily around like data science,
+
+00:05:37.020 --> 00:05:38.420
+applied AI, scientific computing.
+
+00:05:39.420 --> 00:05:42.720
+And yeah, I'm now one of the two co-CEOs of Quansight.
+
+00:05:43.240 --> 00:05:43.740
+- Awesome.
+
+00:05:44.000 --> 00:05:47.240
+- I'm trying to, basically, we just converted last year
+
+00:05:47.350 --> 00:05:48.680
+to a public benefit corporation,
+
+00:05:48.970 --> 00:05:51.600
+which is very much aligned with what, you know,
+
+00:05:51.660 --> 00:05:52.880
+most of our team wants to do.
+
+00:05:53.340 --> 00:05:55.000
+Most of them are open source maintainers.
+
+00:05:55.430 --> 00:05:59.320
+And yeah, we basically do consulting to allow ourselves
+
+00:05:59.560 --> 00:06:02.040
+to make impactful open source contributions.
+
+00:06:03.720 --> 00:06:08.020
+- Quansight is doing a ton in the data science space,
+
+00:06:08.500 --> 00:06:10.100
+scientific computing space for sure.
+
+00:06:10.340 --> 00:06:13.620
+I've had multiple rounds of Quansight folks on the show
+
+00:06:13.940 --> 00:06:15.680
+and things like that and very neat.
+
+00:06:17.819 --> 00:06:20.000
+- Yeah, it's a lot of fun and rewarding.
+
+00:06:20.460 --> 00:06:21.900
+So yeah, glad to be here.
+ +00:06:22.960 --> 00:06:27.720 +- Yeah, it's an interesting transition going from a science + +00:06:27.980 --> 00:06:30.680 +or something along those lines into programming, right? + +00:06:30.800 --> 00:06:36.140 +I got into it through working in my math research and so on. + +00:06:36.260 --> 00:06:38.040 +And actually this is just more fun. + +00:06:38.480 --> 00:06:39.400 +I'm just going to do programming. + +00:06:39.940 --> 00:06:42.460 +It's not exactly physics, but it's pretty similar. + +00:06:43.440 --> 00:06:43.880 +Yeah. + +00:06:44.230 --> 00:06:49.080 +I mean, I've always liked both, but I did experimental physics, + +00:06:49.410 --> 00:06:51.820 +and there it's like, you know, you have much less control + +00:06:52.730 --> 00:06:54.100 +over what you end up producing. + +00:06:54.720 --> 00:06:57.760 +You know, building and using lasers in the lab, if one broke, + +00:06:58.020 --> 00:07:00.920 +maybe I had to send it off for repairs and wait a month, right? + +00:07:01.080 --> 00:07:02.200 +What do you do in the meantime? + +00:07:02.370 --> 00:07:02.780 +You program. + +00:07:03.740 --> 00:07:06.720 +So, you know, that's one of the nicer things about it. + +00:07:07.060 --> 00:07:08.940 +And yeah, gradually started with like, + +00:07:09.420 --> 00:07:12.060 +even before Python, there was some MATLAB + +00:07:12.180 --> 00:07:13.760 +and then you roll into open source + +00:07:13.860 --> 00:07:16.220 +and then mostly just Python a bit of C + +00:07:16.320 --> 00:07:18.020 +and you kind of like go down from there + +00:07:18.080 --> 00:07:20.140 +and then you encounter packaging + +00:07:20.440 --> 00:07:23.840 +and it's one of those things that like only 5% of people like + +00:07:23.900 --> 00:07:25.980 +and the rest see it as a chore, but yeah, + +00:07:26.520 --> 00:07:28.440 +when you like it, you just have to do more and more. 
+
+00:07:28.720 --> 00:07:31.140
+- Yeah, you're with your people now, I think on this call,
+
+00:07:31.260 --> 00:07:31.680
+that's for sure.
+
+00:07:33.100 --> 00:07:36.640
+Hey, Charlie, do we even need to give you an introduction?
+
+00:07:36.940 --> 00:07:44.240
+just say uv and then go on or no, please, please do. I'm just kidding. But the reason I say that
+
+00:07:44.340 --> 00:07:51.120
+is uv has taken the world by storm really. And congratulations. And yeah, tell people about
+
+00:07:51.140 --> 00:07:56.980
+thank you. Yeah, of course. So my name is Charlie. I'm the founder and CEO of Astral.
+
+00:07:57.980 --> 00:08:04.400
+I've been working on the company for, let's see, started the company in October, 2022. That's the
+
+00:08:04.300 --> 00:08:11.040
+easier way to do it. So I've been working on this for a few years. We mostly build open source. So
+
+00:08:11.400 --> 00:08:15.940
+we've worked on a couple of different tools that have become quite popular in Python. So we build
+
+00:08:16.020 --> 00:08:21.420
+Ruff, which is our linter and formatter, ty, which is our type checker. And then most relevant for
+
+00:08:21.440 --> 00:08:28.440
+this episode would be uv, which is our Python package and project manager. So yeah, we spend
+
+00:08:28.440 --> 00:08:34.260
+all our time thinking about how to build tools that make it easier and to work with Python and
+
+00:08:34.280 --> 00:08:36.200
+how to make Python programming more productive.
+
+00:08:37.300 --> 00:08:38.240
+A lot of that's about speed.
+
+00:08:39.010 --> 00:08:40.560
+We try to build things that are really fast,
+
+00:08:40.630 --> 00:08:42.140
+but it's also about user experience
+
+00:08:42.469 --> 00:08:44.159
+and trying to sort of like take complexity
+
+00:08:44.920 --> 00:08:46.340
+out of the critical path for users.
+
+00:08:47.540 --> 00:08:49.780
+So, you know, for example,
+
+00:08:50.280 --> 00:08:51.340
+we've definitely spent a lot of time
+
+00:08:51.560 --> 00:08:52.740
+thinking about how we can make it easier
+
+00:08:52.830 --> 00:08:54.200
+for people to install PyTorch,
+
+00:08:54.960 --> 00:08:57.060
+which is, you know, one of the examples
+
+00:08:57.320 --> 00:08:58.600
+that will come up, I'm sure, you know,
+
+00:08:58.600 --> 00:08:59.340
+over the course of the show
+
+00:08:59.520 --> 00:09:00.660
+and one of the motivating examples
+
+00:09:00.920 --> 00:09:02.180
+for the PEPs we've been working on.
+
+00:09:02.420 --> 00:09:05.000
+So yeah, that's why I'm here.
+
+00:09:05.120 --> 00:09:07.420
+We've been collaborating with Jonathan, Ralph,
+
+00:09:07.510 --> 00:09:09.180
+and honestly, like a bunch of other people too.
+
+00:09:09.280 --> 00:09:09.960
+It's been a really big effort
+
+00:09:10.090 --> 00:09:11.060
+and I'm sure we'll get into that.
+
+00:09:11.240 --> 00:09:13.720
+But it's been cool to have this long running
+
+00:09:13.790 --> 00:09:15.900
+and very like wide ranging collaboration
+
+00:09:16.190 --> 00:09:18.000
+around trying to push Python packaging forward.
+
+00:09:19.160 --> 00:09:19.280
+Yeah.
+
+00:09:21.120 --> 00:09:23.740
+Well, like I said, congrats on all the stuff with Astral.
+
+00:09:24.340 --> 00:09:27.140
+And we're going to talk a little bit about pyx, I think.
+
+00:09:27.280 --> 00:09:28.100
+Maybe see if there's any,
+
+00:09:29.100 --> 00:09:31.100
+just to check in at the end of the show
+
+00:09:31.120 --> 00:09:33.620
+after we talk about some of these things, I think, if you're up for it.
+
+00:09:34.200 --> 00:09:34.660
+Yeah, sounds good.
+
+00:09:35.700 --> 00:09:36.120
+Yeah, awesome.
+
+00:09:37.340 --> 00:09:37.700
+All right.
+
+00:09:38.180 --> 00:09:39.840
+Well, let me switch around.
+
+00:09:42.920 --> 00:09:46.420
+So one of the challenges, I mean, let's just start with what is the challenge?
+
+00:09:46.800 --> 00:09:52.160
+You all have described this as the lowest common denominator packaging problem
+
+00:09:52.680 --> 00:09:54.840
+that we've got to deal with.
+
+00:09:55.860 --> 00:10:04.260
+the idea or the problem is different CPUs have specialized instructions different graphics cards
+
+00:10:04.360 --> 00:10:10.360
+all these different compute and platforms and so on might have specific instructions but
+
+00:10:11.540 --> 00:10:15.960
+and they're optimized right like do this as vector operations instead of on registers or
+
+00:10:15.960 --> 00:10:23.240
+whatever and then but maybe some other thing that it might run on doesn't support that right I don't know
+
+00:10:23.080 --> 00:10:28.500
+WebAssembly whatever yeah yeah and so then how how do you actually end up shipping something
+
+00:10:29.090 --> 00:10:34.460
+to Python people that takes advantage of the specializations that are there
+
+00:10:35.100 --> 00:10:39.100
+when they're there but without breaking the other ones right that's kind of the core problem is that
+
+00:10:39.100 --> 00:10:48.020
+right yeah I think one one little step back before if you if you think about it like a wheel when
+
+00:10:48.040 --> 00:10:50.540
+when you take the Python package that we have everywhere,
+
+00:10:51.240 --> 00:10:54.140
+there is a few parts of the file names
+
+00:10:54.280 --> 00:10:57.700
+that essentially allows you to know what's been built for.
+
+00:10:58.280 --> 00:11:01.220
+So inside this you have, if it's a pure Python package,
+
+00:11:01.440 --> 00:11:04.100
+it's simple, you might have a minimum Python version,
+
+00:11:04.360 --> 00:11:07.040
+but in most cases it's pretty generic,
+
+00:11:07.220 --> 00:11:08.320
+so that's not an issue.
+
+00:11:08.920 --> 00:11:11.700
+When you start having compiled code inside the package,
+
+00:11:11.900 --> 00:11:12.720
+that's a different story,
+
+00:11:13.020 --> 00:11:14.120
+because now we're talking about
+
+00:11:14.440 --> 00:11:16.000
+what kind of OS it was built for.
+
+00:11:16.240 --> 00:11:19.900
+So Windows, macOS, Linux, different flavors.
+
+00:11:21.500 --> 00:11:23.940
+We're talking about the type of CPU that it was built for.
+
+00:11:24.200 --> 00:11:28.760
+So x86, ARM, PowerPC, potentially RISC-V,
+
+00:11:29.030 --> 00:11:31.260
+all these things, mobile.
+
+00:11:32.740 --> 00:11:34.280
+And then finally the Python ABI.
+
+00:11:34.540 --> 00:11:35.120
+And in most cases,
+
+00:11:35.210 --> 00:11:37.780
+it means the minimum Python ABI that you need.
+
+00:11:38.440 --> 00:11:40.580
+So, and for people,
+
+00:11:41.070 --> 00:11:43.480
+an ABI is essentially the same as an API,
+
+00:11:43.610 --> 00:11:44.560
+but for a binary.
+
+00:11:45.060 --> 00:11:49.660
+So it's important when things are stable at the ABI level
+
+00:11:49.900 --> 00:11:51.840
+because it allows you to be future compatible.
+
+00:11:52.600 --> 00:11:53.900
+What does ABI mean?
+
+00:11:54.720 --> 00:11:55.360
+Jonathan, help us out.
+
+00:11:55.360 --> 00:11:57.000
+What does ABI mean for those of us who don't know?
+
+00:11:57.060 --> 00:12:00.480
+Application binary interface, so it's the same as API,
+
+00:12:00.820 --> 00:12:02.880
+but instead of specifically for binaries.
+
+00:12:05.020 --> 00:12:09.020
+And the problem that we collectively kind of try to get
+
+00:12:09.200 --> 00:12:12.880
+to is that, well, today the compute space
+
+00:12:12.900 --> 00:12:14.620
+and the scientific computing space,
+
+00:12:15.050 --> 00:12:19.540
+which if we take the latest JetBrains Python developer survey
+
+00:12:20.240 --> 00:12:23.560
+is at least 40 to 50% of the Python developers
+
+00:12:23.730 --> 00:12:26.620
+are essentially doing data science or similar.
+
+00:12:27.760 --> 00:12:30.460
+So it's a massive percentage of the community
+
+00:12:31.500 --> 00:12:35.460
+is doing in some form scientific computing
+
+00:12:35.890 --> 00:12:39.260
+to whatever extent you may want to think about it.
+
+00:12:40.020 --> 00:12:42.800
+And while the problem is when we do these things,
+
+00:12:42.870 --> 00:12:45.400
+we try to do them fast because who
+
+00:12:45.520 --> 00:12:48.960
+likes to wait on the return of some Pandas operation
+
+00:12:49.160 --> 00:12:51.000
+on NumPy or PyTorch operation.
+
+00:12:51.860 --> 00:12:55.600
+But to go fast, you need to use all the tricks in the books
+
+00:12:56.260 --> 00:12:58.240
+that you can get to essentially--
+
+00:12:58.240 --> 00:13:01.920
+you have to optimize the binary for a specific CPU,
+
+00:13:02.620 --> 00:13:05.960
+for a specific GPU, or for a specific library
+
+00:13:06.220 --> 00:13:07.840
+that you want to use, like BLAS.
+
+00:13:08.200 --> 00:13:09.500
+BLAS is a general concept.
+
+00:13:09.780 --> 00:13:13.460
+So which BLAS implementation do you use, or MPI?
+
+00:13:15.460 --> 00:13:17.900
+And the problem is, well, we don't have the tags
+
+00:13:18.060 --> 00:13:21.380
+or markers to allow us to essentially flag
+
+00:13:21.520 --> 00:13:25.260
+this specific binary to be compatible with X, Y, and Z.
+ +00:13:26.880 --> 00:13:31.620 +- Right, so the wheel might say, this is for 3.14, + +00:13:32.440 --> 00:13:36.180 +it is for ARM CPUs and so on, + +00:13:36.480 --> 00:13:37.900 +but it's not gonna say, + +00:13:38.740 --> 00:13:44.100 +And it supports this vectorization optimization on Intel chips. + +00:13:44.440 --> 00:13:45.820 +I just said ARM, didn't I? + +00:13:46.280 --> 00:13:47.080 +Let's go with Apple Silicon. + +00:13:47.080 --> 00:13:54.120 +A very good example of ARM is that the default most people build with is actually a Raspberry Pi. + +00:13:55.460 --> 00:13:56.100 +ARM level. + +00:13:56.320 --> 00:13:56.880 +Yeah, yeah, yeah. + +00:13:57.700 --> 00:14:08.020 +And you can imagine that when you build for any type of desktop CPU, ARM, you have a little bit more complex CPUs. + +00:14:07.980 --> 00:14:10.420 +CPUs and a little bit more advanced chips. + +00:14:11.040 --> 00:14:13.720 +And it's a lot of performance that you leave on the table + +00:14:13.840 --> 00:14:16.780 +by not optimizing for a specific platform. + +00:14:16.960 --> 00:14:18.580 +So obviously, in some cases, it doesn't really matter. + +00:14:19.260 --> 00:14:21.000 +But in other cases, it does really matter. + +00:14:21.680 --> 00:14:23.580 +And we want to be able to do that. + +00:14:24.400 --> 00:14:26.180 +I'll give a very concrete example. + +00:14:26.320 --> 00:14:31.140 +Like, Intel x86-64 is kind of the most common, I think, + +00:14:31.680 --> 00:14:33.560 +CPU that most people will have at home, right? + +00:14:34.260 --> 00:14:42.420 +If you build a wheel for that, you can only use CPU features, performance CPU features that go back to about 2009. + +00:14:43.470 --> 00:14:59.000 +Any new hardware features that were introduced after 2009, things like, you know, SSE4, AVX2, you know, later versions of that, you just cannot use because the installers don't know that you put that in the wheel. 
+ +00:14:59.620 --> 00:15:04.460 +And hence, they will also install it on computers that don't have those instructions. + +00:15:04.580 --> 00:15:07.140 +Right. And then you just get like very ugly crashes. + +00:15:07.940 --> 00:15:13.620 +Hence, what we all do is we ship wheels, binaries that are only compatible with 2009. + +00:15:13.900 --> 00:15:22.200 +And the difference between the 2009 hardware features and, you know, the 2019 or 2023 one + +00:15:22.760 --> 00:15:26.520 +could be a factor of 10x, 20x in performance, depending on what you're doing. + +00:15:27.780 --> 00:15:29.060 +10x to 20x? + +00:15:30.400 --> 00:15:30.780 +Oh, yeah. + +00:15:31.060 --> 00:15:36.040 +For, you know, especially when you work with scientific data and SIMD instructions. + +00:15:36.520 --> 00:15:39.120 +Yeah, you can you can get massive performance increases. + +00:15:39.360 --> 00:15:42.400 +If you heard of vectorization, this is a huge deal. + +00:15:43.680 --> 00:15:47.940 +Yeah, I mean, I guess the way I think about it from our perspective of building like + +00:15:48.280 --> 00:15:51.760 +because because these problems, like one of the things that's very hard about solving them, + +00:15:51.960 --> 00:15:56.699 +and it has required like us to be so collaborative across the industry + +00:15:56.760 --> 00:16:02.040 +is that it touches basically every piece of the Python packaging stack. + +00:16:02.980 --> 00:16:09.080 +It impacts how you build things, how the registries work, + +00:16:09.800 --> 00:16:13.020 +what they support, how installers choose what to install. + +00:16:13.830 --> 00:16:17.420 +And so for us, it's like the superpower of Python, + +00:16:17.770 --> 00:16:23.580 +I think in some ways, sorry, I think the superpower of Python in some ways + +00:16:23.540 --> 00:16:30.320 +is like you can build and distribute all this software that's built for, you know, that uses + +00:16:30.440 --> 00:16:34.140 +native code. 
Like you can take native code and you can distribute it out to users and they can run
+
+00:16:34.160 --> 00:16:40.260
+it just like it's any other piece of Python code. And in the spec, we have these things like, okay,
+
+00:16:40.440 --> 00:16:47.359
+you can build a wheel that targets Windows or Linux or macOS, and it can target like x86 or
+
+00:16:47.380 --> 00:16:52.740
+arm or whatever else. And those are all captured in the spec. And so for us, like building uv,
+
+00:16:53.400 --> 00:16:58.780
+we know how to detect those things, how to figure out like which wheel to install based on what the
+
+00:16:58.880 --> 00:17:03.020
+user's machine is running. But there's all this other stuff that's not captured by any of those
+
+00:17:03.200 --> 00:17:08.280
+standards, like the instruction set, or even like the supported CUDA version, like all these things
+
+00:17:08.319 --> 00:17:12.439
+are not captured in that wheel file. And the installers don't know how to detect them. They
+
+00:17:12.520 --> 00:17:16.639
+don't know how to figure out like, okay, which PyTorch build should I use based on the CUDA
+
+00:17:16.660 --> 00:17:17.720
+version on the user's machine.
+
+00:17:18.000 --> 00:17:19.060
+All that stuff is lost.
+
+00:17:19.130 --> 00:17:21.000
+And that's the gap that we're trying to bridge.
+
+00:17:22.339 --> 00:17:25.459
+And part of the philosophy is also--
+
+00:17:25.459 --> 00:17:28.960
+so right now, Python packaging exposes what
+
+00:17:28.960 --> 00:17:30.040
+is called platform tags.
+
+00:17:30.390 --> 00:17:33.300
+So essentially, a sort of mini tag that
+
+00:17:33.740 --> 00:17:35.700
+comes with a specific definition that installers
+
+00:17:35.820 --> 00:17:36.540
+know how to resolve.
+
+00:17:37.160 --> 00:17:38.600
+And what we're trying to avoid is
+
+00:17:38.650 --> 00:17:43.280
+to end up creating 200 more today, and 200 more in two years,
+
+00:17:43.550 --> 00:17:46.160
+and 200 more again in four years.
+
+00:17:46.320 --> 00:17:51.600
+So we try to come up with a generic system that will allow you to essentially include
+
+00:17:51.620 --> 00:17:59.920
+the arbitrary definition that then resolvers and package managers can then understand by
+
+00:18:00.080 --> 00:18:07.100
+some sort of mechanism and resolve, but not create a sort of blessed list of things that
+
+00:18:07.100 --> 00:18:09.560
+you constantly have to update because it's a lot of maintenance.
+
+00:18:10.200 --> 00:18:12.320
+Yeah, that's how we got into the situation now, right?
+
+00:18:12.500 --> 00:18:14.380
+because there's one for the version,
+
+00:18:14.600 --> 00:18:17.060
+there's one for the architecture of the CPU,
+
+00:18:17.230 --> 00:18:19.760
+but then there's not a spot for the other stuff.
+
+00:18:19.960 --> 00:18:24.700
+So the overall idea is to say almost just a metadata section
+
+00:18:25.460 --> 00:18:28.760
+in there and things can read it or ignore it as they see fit.
+
+00:18:31.780 --> 00:18:33.300
+- Conceptually, yeah, a little bit.
+
+00:18:34.260 --> 00:18:34.600
+Yeah, yeah.
+
+00:18:34.850 --> 00:18:37.080
+I mean, it's like, I guess the question is like, okay,
+
+00:18:37.540 --> 00:18:39.840
+if we have this like huge space of things
+
+00:18:40.080 --> 00:18:42.460
+that we might possibly want to detect
+
+00:18:42.480 --> 00:18:48.100
+condition installs on. Like, okay, anytime someone publishes a wheel for Python, they should now tell
+
+00:18:48.280 --> 00:18:55.320
+us what CUDA version is it built for? If any, what CPU instruction sets does it support? Blah, blah,
+
+00:18:55.440 --> 00:18:59.140
+blah. Where would we put all that stuff? Becomes the question.
It's like, what, are we just going
+to keep expanding the platform tag and everything else? And that's the problem that we're trying to
+
+00:19:03.660 --> 00:19:09.139
+solve in a generic way. Yeah, you can end up with a file name that's 4,000 characters wide or
+
+00:19:09.120 --> 00:19:09.380
+or something.
+
+00:19:10.180 --> 00:19:12.260
+They can already get pretty long, by the way.
+
+00:19:12.350 --> 00:19:13.420
+But yeah, that's...
+
+00:19:14.180 --> 00:19:18.960
+If we have any changes, we have to work around that in uv sometimes, file name length limits.
+
+00:19:19.170 --> 00:19:24.220
+But yeah, it's actually a very famous package that used, I think, the first 200 digits of Pi
+
+00:19:24.320 --> 00:19:25.280
+as the version number.
+
+00:19:26.020 --> 00:19:27.180
+Oh, my gosh.
+
+00:19:28.980 --> 00:19:30.140
+It's a pretty good joke.
+
+00:19:30.420 --> 00:19:31.240
+I didn't know about it.
+
+00:19:31.320 --> 00:19:35.320
+There's somebody on discuss.python.org that posted the link, and I was like, but that's
+
+00:19:35.500 --> 00:19:35.740
+hilarious.
+
+00:19:36.400 --> 00:19:37.800
+That is wild.
+
+00:19:38.480 --> 00:19:41.060
+So before we dive into what you all are proposing,
+
+00:19:41.580 --> 00:19:43.980
+let's maybe talk about how just a couple of packages
+
+00:19:45.080 --> 00:19:47.300
+or libraries solve this problem now
+
+00:19:47.460 --> 00:19:48.640
+in maybe different directions.
+
+00:19:49.500 --> 00:19:52.480
+So Ralph, what about NumPy?
+
+00:19:52.800 --> 00:19:56.520
+I mean, you talked about vectorization and stuff.
+
+00:19:57.340 --> 00:19:57.660
+Yeah.
+
+00:19:58.080 --> 00:19:59.860
+That's so in line with NumPy, right?
+
+00:19:59.880 --> 00:20:00.620
+Is NumPy like--
+
+00:20:00.620 --> 00:20:00.720
+Yes.
+
+00:20:00.900 --> 00:20:02.640
+And pandas, that's the way.
+
+00:20:03.600 --> 00:20:04.080
+Yes.
+
+00:20:04.320 --> 00:20:04.700
+NumPy, yes.
+
+00:20:04.900 --> 00:20:05.280
+Pandas, no.
+
+00:20:06.700 --> 00:20:16.440
+So NumPy does contain SIMD instructions and you know because it's incredibly useful for
+
+00:20:16.700 --> 00:20:21.680
+performance. NumPy has all large arrays and basic instructions on them that have direct hardware
+
+00:20:22.200 --> 00:20:30.080
+implementations typically. But the way it's done is incredibly complex because you need to end up
+
+00:20:30.100 --> 00:20:34.080
+with a wheel that works on every type of CPU, right?
+
+00:20:34.220 --> 00:20:38.600
+We didn't, you know, I'll stay with x86, but the same happens on the other platforms, right?
+
+00:20:38.660 --> 00:20:45.060
+You know, it needs to run on a 2010 CPU and it needs to run better on a 2024 CPU.
+
+00:20:46.360 --> 00:20:53.140
+So what we do in NumPy is we have a system that basically allows you to
+
+00:20:54.520 --> 00:20:59.420
+either parameterize a source file that, you know, and then rebuild it multiple times, you know,
+
+00:21:00.020 --> 00:21:03.240
+for different particular CPU architectures.
+
+00:21:03.320 --> 00:21:08.220
+So like a Haswell family and then a Skylake family and so on.
+
+00:21:09.419 --> 00:21:12.020
+And then we basically merge that together
+
+00:21:12.140 --> 00:21:14.260
+in a single Python extension module.
+
+00:21:15.240 --> 00:21:19.840
+And then at runtime, we have our own code to detect the CPU
+
+00:21:20.600 --> 00:21:23.880
+and basically then some like dispatch shim layer
+
+00:21:24.220 --> 00:21:27.380
+that kind of fishes out the right family
+
+00:21:27.720 --> 00:21:28.800
+from the extension module.
+
+00:21:29.920 --> 00:21:31.980
+So yeah, you put up the diagram there.
+
+00:21:32.180 --> 00:21:33.580
+It's pretty complicated.
+
+00:21:33.980 --> 00:21:39.280
+And I'd say there I've been collaborating with some of the, you know,
+
+00:21:39.730 --> 00:21:41.420
+world experts on this.
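[Editor's note: the compile-several-kernels-then-dispatch-at-runtime scheme described above can be sketched in a few lines of Python. This is a hypothetical illustration only, not NumPy's actual internals: the function names, the fake feature detection, and the dispatch table are all invented for the example.]

```python
# Hypothetical sketch of NumPy-style runtime CPU dispatch (not real NumPy code):
# several implementations of the same kernel are built, CPU features are
# detected once, and calls are routed through a dispatch table.

def add_generic(a, b):
    # Baseline path: plain scalar loop, works on every CPU.
    return [x + y for x, y in zip(a, b)]

def add_avx2(a, b):
    # Stands in for a kernel compiled with AVX2 instructions; in real life
    # this would be a separately compiled object fished out of the module.
    return [x + y for x, y in zip(a, b)]

def detect_features():
    # Real code would query CPUID; here we just pretend AVX2 is available.
    return {"avx2"}

# Ordered from most to least specialized; "baseline" always matches.
_DISPATCH = [("avx2", add_avx2), ("baseline", add_generic)]

def add(a, b):
    feats = detect_features()
    for name, impl in _DISPATCH:
        if name == "baseline" or name in feats:
            return impl(a, b)

print(add([1, 2], [3, 4]))  # -> [4, 6]
```

The point of the discussion above is that wheel variants would let a project ship each branch of this table as its own wheel instead of bundling them all into one fat extension module.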
+
+00:21:41.850 --> 00:21:45.080
+We had like an in the end,
+
+00:21:45.690 --> 00:21:49.160
+this was only successful because we built a generic architecture
+
+00:21:49.940 --> 00:21:54.940
+that other experts per, you know, CPU architecture could come and contribute to.
+
+00:21:55.070 --> 00:21:59.400
+So we now have a specific team of like four people
+
+00:22:00.320 --> 00:22:01.800
+that helped maintain the architecture.
+
+00:22:01.960 --> 00:22:05.520
+But then Intel for years paid one of their engineers
+
+00:22:05.920 --> 00:22:09.860
+to optimize specifically the x86 code path.
+
+00:22:10.480 --> 00:22:13.320
+And then ARM has a NumPy maintainer
+
+00:22:13.620 --> 00:22:15.840
+who got commit rights a few years ago.
+
+00:22:16.220 --> 00:22:17.860
+And he's the final authority
+
+00:22:18.140 --> 00:22:19.720
+on all the ARM instructions that are in there.
+
+00:22:20.520 --> 00:22:25.060
+So that whole complicated thing is now shipped
+
+00:22:25.100 --> 00:22:27.060
+and it's extremely good for performance.
+
+00:22:27.380 --> 00:22:29.840
+But you can see how this is not a scalable process
+
+00:22:29.980 --> 00:22:31.220
+to do in many packages.
+
+00:22:31.220 --> 00:22:31.860
+- Yeah, yeah.
+
+00:22:32.840 --> 00:22:34.780
+- Plus, if you compile everything five times,
+
+00:22:34.900 --> 00:22:37.240
+you get a binary that's not five times bigger,
+
+00:22:37.400 --> 00:22:38.240
+but it's a lot bigger.
+
+00:22:39.159 --> 00:22:41.300
+So it's not great for users as well.
+
+00:22:42.340 --> 00:22:43.980
+- Yeah, actually the nickname for these things
+
+00:22:44.020 --> 00:22:44.760
+is fat binaries.
+
+00:22:45.340 --> 00:22:49.640
+So you have the idea for why they're called that way,
+
+00:22:49.740 --> 00:22:51.340
+because they tend to be pretty heavy to download.
+
+00:22:51.640 --> 00:22:51.940
+- Yeah, yeah.
+
+00:22:52.700 --> 00:22:53.940
+Instead of wheels, you got big wheels.
+
+00:22:57.140 --> 00:22:59.700
+So what happens if all these changes get adopted
+
+00:22:59.780 --> 00:23:02.400
+and it doesn't need to be compiled into one giant binary?
+
+00:23:03.880 --> 00:23:05.340
+Are all these maintainers still working
+
+00:23:05.440 --> 00:23:07.500
+and they just don't have to deal with trying to compile
+
+00:23:07.680 --> 00:23:08.580
+and bundle it all into one thing?
+
+00:23:08.920 --> 00:23:11.780
+They might still have to do--
+
+00:23:11.780 --> 00:23:13.380
+yes, I think essentially you're correct.
+
+00:23:13.540 --> 00:23:15.920
+You still need to write the actual code that
+
+00:23:16.120 --> 00:23:18.560
+uses the SIMD instructions.
+
+00:23:19.220 --> 00:23:21.580
+But then you can just produce a wheel that says,
+
+00:23:21.860 --> 00:23:23.980
+OK, it works on this specific CPU architecture.
+
+00:23:24.560 --> 00:23:27.100
+And just ignore this code if I'm building
+
+00:23:27.120 --> 00:23:32.140
+for that architecture, and all the, you know, detecting the CPU at runtime and the dynamic dispatch
+
+00:23:32.560 --> 00:23:33.460
+features you all don't need.
+
+00:23:34.100 --> 00:23:35.420
+Will it make the code faster?
+
+00:23:36.940 --> 00:23:37.580
+It will.
+
+00:23:38.060 --> 00:23:40.240
+Well, will you have a better cache?
+
+00:23:40.400 --> 00:23:41.740
+Will there be smaller stuff in memory?
+
+00:23:41.810 --> 00:23:42.380
+You know, that kind of stuff.
+
+00:23:42.660 --> 00:23:45.580
+I don't think it will make the NumPy code much faster.
+
+00:23:46.840 --> 00:23:52.720
+It will, you know, it will make a huge difference for all the other packages that don't have
+
+00:23:52.750 --> 00:23:54.040
+this amount of complexity today.
+
+00:23:54.340 --> 00:23:58.420
+So like SciPy, scikit-learn, Pandas, Pillow,
+
+00:23:58.900 --> 00:24:02.240
+like none of these packages actually use SIMD code.
+
+00:24:02.980 --> 00:24:05.200
+And for SciPy, it's the easiest for me to talk about
+
+00:24:05.260 --> 00:24:06.620
+'cause I'm also a SciPy maintainer.
+
+00:24:07.040 --> 00:24:09.660
+We actually have a lot of code that, you know,
+
+00:24:09.920 --> 00:24:11.980
+got vendored in from somewhere, like Fourier transforms,
+
+00:24:12.160 --> 00:24:14.140
+for example, they benefit a lot as well.
+
+00:24:14.180 --> 00:24:19.460
+We have AVX2 and ARM Neon implementations,
+
+00:24:20.020 --> 00:24:22.840
+but we just don't build them and don't ship that as wheels
+
+00:24:22.860 --> 00:24:24.080
+because we have no way of doing that.
+
+00:24:24.940 --> 00:24:27.360
+As soon as we have, you know, wheel variants,
+
+00:24:27.540 --> 00:24:30.100
+we can say, okay, let's ship two sets of wheels.
+
+00:24:30.380 --> 00:24:32.660
+I mean, that's more CI jobs to build more wheels,
+
+00:24:33.020 --> 00:24:35.760
+but you know, when it's worth it, you know,
+
+00:24:35.840 --> 00:24:37.220
+you can make that trade off, right?
+
+00:24:37.400 --> 00:24:38.320
+Like we already have the code.
+
+00:24:38.340 --> 00:24:40.460
+We just have to change a build option,
+
+00:24:40.620 --> 00:24:41.760
+produce a different wheel and ship it.
+
+00:24:43.080 --> 00:24:45.200
+- So do you just set up something like a,
+
+00:24:45.620 --> 00:24:49.080
+an #ifdef sort of thing for like,
+
+00:24:49.260 --> 00:24:51.500
+ifdef this capability,
+
+00:24:52.500 --> 00:24:54.420
+else you put in the generic code.
+
+00:24:55.440 --> 00:24:55.920
+Exactly.
+
+00:24:56.880 --> 00:24:59.540
+Yeah, the C code is basically just a bunch of if-defs.
+
+00:24:59.960 --> 00:25:02.300
+And if you only--
+
+00:25:02.600 --> 00:25:05.860
+for maintainability reasons, you only add more if-defs
+
+00:25:05.860 --> 00:25:07.320
+if it's really much faster.
+
+00:25:07.640 --> 00:25:09.820
+You're not going to do it for 10% or 20% faster.
+
+00:25:09.940 --> 00:25:14.040
+But if it's 2x faster, well, why not have an extra else branch?
+
+00:25:14.320 --> 00:25:15.120
+Yeah, absolutely.
+
+00:25:15.740 --> 00:25:17.880
+Charlie, does Rust have an #ifdef equivalent?
+
+00:25:18.920 --> 00:25:19.480
+It must, right?
+
+00:25:19.920 --> 00:25:21.480
+Yeah, you can do--
+
+00:25:21.600 --> 00:25:27.060
+it has directives like that, yeah. But you guys don't really need to worry about using this for
+
+00:25:27.160 --> 00:25:35.140
+yourself. This is more for the things that you're servicing, providing to everyone, right? Yeah. Um,
+
+00:25:36.700 --> 00:25:43.400
+yeah, this is mostly... this wouldn't have a huge impact on uv. Or, um, I mean, it could have some
+
+00:25:43.510 --> 00:25:48.140
+small impact. But I think, I think largely this is about, yeah, how can we make it easier for users to
+
+00:25:48.160 --> 00:25:52.560
+consume this stuff. And I mean, the NumPy, like this is a good example of how it affects
+
+00:25:53.520 --> 00:25:58.480
+like build and distribution, because yes, they still have to write like architecture specific
+
+00:25:58.700 --> 00:26:03.160
+code if they want to get these optimizations. But what we'll be doing with these proposals is making
+
+00:26:03.360 --> 00:26:09.380
+it much easier for them to ship separate builds that are like dedicated for each of those different
+
+00:26:09.500 --> 00:26:16.680
+variants. So like the end user, you know, will get access to it. But in this case, it's like the
+
+00:26:16.560 --> 00:26:21.260
+bottleneck is, or part of the bottleneck is, like all the complexity it puts on the maintainers and
+
+00:26:21.260 --> 00:26:27.040
+the people publishing. Yeah. How much do you think it would impact the performance to ship Python
+
+00:26:27.360 --> 00:26:36.360
+standalone with different CPU extensions? Oh, um, that is a good question, Jonathan. So we'd actually like
+
+00:26:36.390 --> 00:26:44.380
+to do... uh, um, I don't know that I have a great answer to that, I mean, like a good
+
+00:26:44.400 --> 00:26:49.560
+quantitative answer to it. I think we are very interested in doing stuff like that. We've also
+
+00:26:49.740 --> 00:26:55.820
+considered, for example, shipping a build, like we ship with a relatively old like glibc minimum,
+
+00:26:56.060 --> 00:27:03.260
+we've considered shipping a build, a variant, not in the sense of the PEP, sorry, a different build,
+
+00:27:03.700 --> 00:27:07.280
+let me just put it that way, that uses a more modern glibc version, for example.
+
+00:27:09.300 --> 00:27:13.280
+We do run into other problems with that, like our build matrix is really big, we have to split it
+
+00:27:13.260 --> 00:27:14.880
+across multiple GitHub Actions now.
+
+00:27:15.040 --> 00:27:17.660
+And so we just have a lot of builds.
+
+00:27:17.900 --> 00:27:20.820
+So we're worried about doubling the size of the build matrix,
+
+00:27:21.220 --> 00:27:23.360
+for example, but that's a separate problem.
+
+00:27:24.460 --> 00:27:26.640
+But yes, it could actually be helpful there,
+
+00:27:26.960 --> 00:27:28.200
+although we don't ship those as wheels today.
+
+00:27:28.840 --> 00:27:29.760
+Yeah, that's awesome.
+
+00:27:30.940 --> 00:27:34.180
+It's a very interesting angle to think about how much leverage--
+
+00:27:35.160 --> 00:27:37.880
+I mean, this is probably something you've thought about.
+
+00:27:38.040 --> 00:27:42.040
+But how much leverage you and your team actually
+
+00:27:42.060 --> 00:27:46.980
+have on Python performance by how you control Python Build
+
+00:27:47.140 --> 00:27:47.360
+Standalone.
+
+00:27:48.120 --> 00:27:50.620
+Maybe just tell people, what is the relevance there?
+
+00:27:51.539 --> 00:27:53.720
+What is Python Build Standalone, and how does this even
+
+00:27:53.920 --> 00:27:54.820
+apply to what we're talking about?
+
+00:27:55.620 --> 00:27:56.420
+Oh, yeah, sure.
+
+00:27:56.420 --> 00:27:57.500
+I use it every day.
+
+00:27:57.520 --> 00:27:57.920
+I love it.
+
+00:27:58.580 --> 00:28:00.100
+A lot of people use it and don't even know.
+
+00:28:00.180 --> 00:28:05.340
+I mean, it's probably the least public or direct user
+
+00:28:05.780 --> 00:28:07.160
+facing thing that we do.
+
+00:28:07.380 --> 00:28:10.540
+But we took over maintenance of a project called Python Build
+
+00:28:10.560 --> 00:28:19.560
+Standalone, probably like a year ago, maybe a little more. And that project, the basic idea is
+
+00:28:19.900 --> 00:28:24.540
+like typically when you build CPython, you know, at least like on Linux, for example,
+
+00:28:25.200 --> 00:28:29.140
+a bunch of absolute paths get embedded into the binary, which makes it hard to build like
+
+00:28:29.440 --> 00:28:34.180
+reproducible and relocatable CPythons. Like it's hard for someone to build a CPython that you can
+
+00:28:34.200 --> 00:28:38.140
+then download and run on your machine. You typically need to build it on your own machine.
+
+00:28:39.500 --> 00:28:44.260
+So what this project does is it's sort of like a fork of the CPython build system.
+
+00:28:44.520 --> 00:28:48.000
+It's like the CPython build system with a bunch of patches and other changes applied on top.
+
+00:28:48.380 --> 00:28:55.240
+And it makes it so that we can build Pythons that you can just download, unzip, and run.
+ +00:28:55.820 --> 00:29:01.640 +So when you install Python with uv, and these are also used in Bazel and a bunch of other tools, + +00:29:02.680 --> 00:29:04.180 +we don't actually build Python from source. + +00:29:04.320 --> 00:29:08.320 +We actually download, unzip, and run Python, which just makes it much easier. + +00:29:08.560 --> 00:29:10.080 +It means it's faster. + +00:29:10.400 --> 00:29:14.620 +You don't have to have the build tool chain on your machine. + +00:29:14.900 --> 00:29:17.780 +You don't run into problems around failing to build it or anything like that. + +00:29:19.280 --> 00:29:21.620 +But the other thing that's been cool about that project, at least recently, + +00:29:21.640 --> 00:29:23.080 +is we've been very focused on performance. + +00:29:24.300 --> 00:29:27.560 +So on actually just trying to make sure that we're distributing. + +00:29:28.420 --> 00:29:31.000 +Our goal is to be the fastest Python distribution, + +00:29:31.280 --> 00:29:33.000 +even without changing CPython source code, + +00:29:33.300 --> 00:29:38.280 +just changing how we build it and various things that we can tweak there. + +00:29:38.460 --> 00:29:39.940 +So we've been working on a bunch of benchmarks around. + +00:29:40.290 --> 00:29:42.120 +I do think we have the fastest Python now, + +00:29:42.170 --> 00:29:46.120 +but we haven't actually published our rigorous benchmark methodology. + +00:29:46.400 --> 00:29:49.560 +So I won't stake my reputation on that claim yet, + +00:29:49.650 --> 00:29:50.820 +but we've been very focused on it. + +00:29:50.870 --> 00:29:53.160 +And it's been a cool point of leverage because like we can just, + +00:29:53.800 --> 00:29:55.040 +yeah, if we can make Python, you know, + +00:29:55.070 --> 00:29:58.280 +if we can put out a Python distribution that's like 10 or 15% faster, + +00:29:59.000 --> 00:30:00.300 +you know, just by changing how we build it. + +00:30:01.100 --> 00:30:02.080 +Yeah, it's a big lever for impact. 
+
+00:30:03.020 --> 00:30:03.780
+Yeah, it's a huge lever.
+
+00:30:03.910 --> 00:30:07.500
+And I hadn't really thought about it being a lever until Jonathan brought it up.
+
+00:30:08.260 --> 00:30:11.660
+It's not directly impacted by this because we don't ship it.
+
+00:30:11.750 --> 00:30:13.480
+I guess for the reason that we don't ship it as a wheel.
+
+00:30:13.910 --> 00:30:15.240
+Although someday we potentially could.
+
+00:30:15.350 --> 00:30:18.640
+Right now they're just files that uv knows how to install, basically.
+
+00:30:18.670 --> 00:30:20.200
+But it's the same logic at the core.
+
+00:30:20.740 --> 00:30:23.900
+Once you start tweaking the packaging of Python packages,
+
+00:30:24.070 --> 00:30:27.820
+the next part you want to tweak is your Python install.
+
+00:30:28.740 --> 00:30:29.120
+Yeah.
+
+00:30:30.490 --> 00:30:34.800
+Well, for example, all of my stuff that runs on the servers
+
+00:30:36.660 --> 00:30:41.620
+is all in Docker, and it has a base Docker image, and one of the very first lines is,
+
+00:30:42.040 --> 00:30:47.580
+you know, install via curl plus the shell to install uv. The next line is uv
+
+00:30:48.120 --> 00:30:54.620
+venv, and that installs Python from Python Build Standalone, and then
+
+00:30:54.900 --> 00:30:59.360
+whatever you need to make an actual app out of that afterwards, right? And so how
+
+00:30:59.460 --> 00:31:04.400
+many people are doing that? It seems like a huge portion of the world has
+
+00:31:04.380 --> 00:31:08.600
+adopted uv for sort of bootstrapping Python instead of the other way.
+
+00:31:08.840 --> 00:31:10.660
+So that's why it's such a big lever, right?
+
+00:31:11.860 --> 00:31:12.880
+Yep. Yeah, exactly.
+
+00:31:14.860 --> 00:31:15.100
+All right.
+
+00:31:15.420 --> 00:31:23.680
+As a way to sort of get into the PEPs, Charlie, you mentioned variants.
+
+00:31:23.740 --> 00:31:25.160
+You're like, wait, wait, wait, not that variant.
+
+00:31:27.280 --> 00:31:30.040
+What variant are we talking about that's not that variant?
+
+00:31:31.020 --> 00:31:33.380
+What is that variant, I guess, that we're not talking about in uv
+
+00:31:34.280 --> 00:31:35.080
+or Python Build Standalone?
+
+00:31:37.000 --> 00:31:37.900
+Who wants to take that?
+
+00:31:38.160 --> 00:31:38.860
+Ralph, do you want to take that?
+
+00:31:40.340 --> 00:31:42.080
+I'm not actually sure what the question is here.
+
+00:31:42.560 --> 00:31:44.700
+I think you were targeted for the question.
+
+00:31:45.240 --> 00:31:45.640
+What variant?
+
+00:31:47.120 --> 00:31:47.700
+Yeah, yeah.
+
+00:31:48.120 --> 00:31:48.600
+No, that's fine.
+
+00:31:48.740 --> 00:31:50.820
+I mean, like, so we use,
+
+00:31:50.900 --> 00:31:53.120
+so the PEP revolves around this concept of wheel variants.
+
+00:31:53.620 --> 00:31:57.880
+And the idea is you can have,
+
+00:31:59.080 --> 00:32:00.260
+I'll keep using the word variants.
+
+00:32:00.360 --> 00:32:01.560
+You can have different variants,
+
+00:32:01.760 --> 00:32:09.220
+different builds of a wheel that are intended to be installed based on properties that are
+
+00:32:10.040 --> 00:32:19.820
+known or detected on the machine. So for example, that could be like, okay, what NVIDIA drivers do
+
+00:32:19.820 --> 00:32:24.520
+you have on your machine? What are the versions of those drivers? Because that implies things about
+
+00:32:24.610 --> 00:32:30.180
+what versions of the CUDA runtime you can use. And so when someone publishes a wheel, maybe that wheel
+
+00:32:30.920 --> 00:32:35.880
+leverages CUDA and needs to be built against CUDA and needs to be built in a way that leverages CUDA.
+
+00:32:36.300 --> 00:32:42.420
+And so they might publish different variants, effectively just slightly different versions of
+
+00:32:42.940 --> 00:32:48.560
+versions is wrong, different variants, slightly different flavors of that package that are all
+
+00:32:48.640 --> 00:32:54.000
+built against different CUDA versions. And so we would call those different variants.
+
+00:32:54.700 --> 00:32:57.020
+Okay. It's a--
+
+00:32:57.060 --> 00:32:57.580
+Please correct me.
+
+00:32:57.640 --> 00:33:02.780
+across what I understand the packaging space,
+
+00:33:02.910 --> 00:33:03.980
+even outside of Python.
+
+00:33:04.190 --> 00:33:07.260
+If you take variants in general,
+
+00:33:07.770 --> 00:33:09.300
+we try to reuse the terminology
+
+00:33:09.840 --> 00:33:13.500
+that ends up being pretty widely adopted
+
+00:33:14.040 --> 00:33:15.500
+in the packaging ecosystem,
+
+00:33:15.760 --> 00:33:17.960
+not Python packaging, the packaging at large.
+
+00:33:20.180 --> 00:33:21.840
+Variants is the name that you'll find around
+
+00:33:22.320 --> 00:33:23.540
+for this kind of concept.
+
+00:33:24.640 --> 00:33:25.580
+Related to that,
+
+00:33:26.940 --> 00:33:30.620
+especially in the Astral flavor these days,
+
+00:33:30.690 --> 00:33:32.140
+but also in many other areas.
+
+00:33:32.550 --> 00:33:34.560
+I feel like crates and Rust,
+
+00:33:34.710 --> 00:33:36.440
+what they've done with their packaging system
+
+00:33:36.550 --> 00:33:39.100
+has kind of influenced some of the things
+
+00:33:39.240 --> 00:33:40.720
+we're adopting in the Python world.
+
+00:33:42.500 --> 00:33:46.160
+Has anything from the Rust world influenced
+
+00:33:46.490 --> 00:33:47.760
+these PEPs that we're about to talk about?
+
+00:33:48.080 --> 00:33:51.060
+Well, crates are source distribution, though, mostly.
+
+00:33:52.240 --> 00:33:52.440
+Yeah.
+
+00:33:54.860 --> 00:33:58.020
+In this case, we're talking about actually binary distribution.
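[Editor's note: to make the wheel-variant idea described above concrete, here is a hypothetical sketch of the selection step. The file names, the trailing variant labels, and the property scheme are all invented for illustration and are not the actual PEP 817 syntax: an installer detects properties of the machine, such as a CUDA version, and picks the matching variant, falling back to a variant-free wheel.]

```python
# Hypothetical variant selection (invented naming scheme, not PEP 817's):
# each variant wheel carries the properties it was built for.
wheels = {
    "somepkg-1.0-cp312-cp312-linux_x86_64-cu126.whl": {"cuda": "12.6"},
    "somepkg-1.0-cp312-cp312-linux_x86_64-cu118.whl": {"cuda": "11.8"},
    "somepkg-1.0-cp312-cp312-linux_x86_64.whl": {},  # variant-free fallback
}

def pick(detected):
    # Prefer a variant whose required properties all match what was
    # detected on this machine; otherwise use the variant-free wheel.
    for name, props in wheels.items():
        if props and all(detected.get(k) == v for k, v in props.items()):
            return name
    return next(n for n, p in wheels.items() if not p)

print(pick({"cuda": "11.8"}))  # the cu118 variant
print(pick({}))                # no GPU detected: the fallback wheel
```

A real resolver would also rank multiple compatible variants by preference, but the match-or-fall-back shape is the core of what the speakers describe.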
+ +00:33:58.220 --> 00:33:58.700 +Yeah, yeah, yeah. + +00:33:58.820 --> 00:33:59.240 +So not really. + +00:33:59.450 --> 00:33:59.560 +Okay. + +00:34:00.820 --> 00:34:01.580 +But in a sense... + +00:34:01.620 --> 00:34:02.820 +That's actually interesting, right? + +00:34:03.260 --> 00:34:03.520 +Yes. + +00:34:04.820 --> 00:34:12.820 +Because a lot of the best packaging systems, whether it's Rust or Nix, they start from source, + +00:34:13.240 --> 00:34:15.500 +and they know exactly what's in the box. + +00:34:16.389 --> 00:34:19.260 +And then binaries are kind of like an optimization, right? + +00:34:19.440 --> 00:34:24.300 +It's like you have a thing that you know exactly what is the binary, and you can check, like, + +00:34:24.720 --> 00:34:26.620 +oh, I don't have to build this thing from source, + +00:34:26.780 --> 00:34:28.100 +I can grab a binary somewhere. + +00:34:28.860 --> 00:34:30.860 +Python packaging is absolutely not like that. + +00:34:31.120 --> 00:34:33.780 +Like if you build a wheel and you have an S-dist, + +00:34:34.440 --> 00:34:37.080 +I mean, you have no idea if they're the same thing. + +00:34:37.200 --> 00:34:39.280 +If you, you know, you cannot rebuild the wheel + +00:34:39.480 --> 00:34:41.580 +from the S-dist unless you use very, + +00:34:42.020 --> 00:34:45.419 +very well predefined constraints. + +00:34:45.560 --> 00:34:47.080 +- Yeah, yeah, I hadn't really thought about that either, + +00:34:47.240 --> 00:34:49.919 +but that is an interesting juxtaposition. 
+
+00:34:50.260 --> 00:34:53.500
+Like the binary stuff, that is all
+
+00:34:53.520 --> 00:34:57.800
+shipping as source, but the interpreted stuff is shipping as binary. And I think
+
+00:34:58.440 --> 00:35:03.720
+part of the reason, or maybe the main reason is, before talking about binary
+
+00:35:04.040 --> 00:35:09.400
+stuff, for Rust, well, it's all Rust that's compiled. But for Python, it's this mix,
+
+00:35:10.220 --> 00:35:14.780
+this crazy mix of all these different libraries that are not, none of them are
+
+00:35:14.820 --> 00:35:19.080
+Python, but they're all binary in the end. And so you've got to get around the
+
+00:35:19.200 --> 00:35:23.480
+fact, like, well, I don't have a Fortran and a Haskell compiler, so I can't run
+
+00:35:23.500 --> 00:35:29.760
+your project. You know, there's something quite amazing about Python in general, which is called the CFFI,
+
+00:35:29.770 --> 00:35:35.560
+so the C foreign function interface, which essentially allows you to build any sort of
+
+00:35:36.440 --> 00:35:44.060
+application you want in whatever language, as long as you're compatible with the CFFI standard,
+
+00:35:44.790 --> 00:35:52.440
+you can call it from Python. And it's incredible and amazingly useful. Um, but to come back on what
+
+00:35:53.160 --> 00:35:58.640
+Ralph was saying, a lot of the design actually for wheel variants has been inspired by a
+
+00:36:02.220 --> 00:36:10.580
+system that is called Spack that was designed for supercomputers. And we use this especially
+
+00:36:11.220 --> 00:36:16.820
+around the design of CPU variants to kind of get a lot of inspiration around a package called
+
+00:36:16.860 --> 00:36:24.640
+archspec, that is, just from my perspective, pure brilliance in terms of design. Um, just my words, but
+
+00:36:25.000 --> 00:36:31.180
+in my opinion, I really think they got the thing right. It's just beautifully
+
+00:36:31.440 --> 00:36:37.220
+designed. Everything is static in JSON files, and it's extremely easy to scale and maintain. Um,
+
+00:36:37.760 --> 00:36:44.360
+but yes, if you take all the kind of systems designed to support the most specific,
+
+00:36:45.340 --> 00:36:51.360
+like, deployment scenarios, like Spack, like Nix, or even in some cases, Cargo,
+
+00:36:51.840 --> 00:36:57.480
+well, they mostly ship sources to go around this variant problem because that allows you to control
+
+00:36:57.670 --> 00:37:02.660
+the entire build chain, essentially. And in some cases, maybe Ralph can talk about it, but Conda
+
+00:37:02.800 --> 00:37:10.280
+Forge also kind of takes an approach that is similar to Nix to kind of go around these issues a little
+
+00:37:10.220 --> 00:37:15.900
+bit. Maybe Ralph, if you want to talk about... Not quite, because Conda and Conda Forge don't do
+
+00:37:16.200 --> 00:37:22.040
+source distributions at all, right? They just take a release and they build binaries, and if
+
+00:37:22.140 --> 00:37:27.220
+there are no binaries, you can't install it. Uh, but yeah, I would say that's a good point, right? We have
+
+00:37:27.320 --> 00:37:32.860
+people that worked on all these systems. Like, uh, one of Jonathan's colleagues at NVIDIA, Mike Sarahan,
+
+00:37:33.100 --> 00:37:38.760
+used to work on Conda. I contribute to Conda Forge as well. And so we have some of the
+
+00:37:38.780 --> 00:37:44.420
+ideas that originally came from Conda, some that came from Spack, and, like, you know, the end result
+
+00:37:44.540 --> 00:37:49.380
+is nothing like, you know, not exactly like any of those systems, but it takes some of the best
+
+00:37:49.880 --> 00:37:51.940
+aspects of them to enhance Python packaging.
+
+00:37:54.660 --> 00:38:00.080
+Not reinventing the wheel. I mean, maybe, but not too much. Yeah, not too much.
+
+00:38:02.680 --> 00:38:06.040
+But it's kind of, it's cool because I think, um,
+
+00:38:08.280 --> 00:38:10.760
+I feel like a lot of this work really got kicked off.
+
+00:38:10.840 --> 00:38:12.400
+We did an in-person summit.
+
+00:38:14.580 --> 00:38:17.400
+And I honestly can't remember when that was because my mind is so blurred.
+
+00:38:17.400 --> 00:38:17.900
+March 2025.
+
+00:38:19.480 --> 00:38:19.760
+Thank you.
+
+00:38:19.820 --> 00:38:20.840
+Okay, so it was about a year ago.
+
+00:38:21.480 --> 00:38:22.960
+And there's a bunch of notes about this.
+
+00:38:23.580 --> 00:38:31.000
+And we had people from probably like, I don't know, I'd have to guess 20 different companies, maybe more, all in person for a day.
+
+00:38:31.820 --> 00:38:33.280
+Just talking about these problems.
+
+00:38:33.540 --> 00:38:36.580
+And a bunch of people presented on their own open source projects and how they intersect
+
+00:38:37.560 --> 00:38:41.340
+with, like, we had people from PyTorch, people from the JAX team, just talking about like how,
+
+00:38:41.740 --> 00:38:47.000
+what their concerns are, like what's working well for them, what's not. And so, you know,
+
+00:38:47.280 --> 00:38:51.400
+similarly to how we've, I think a lot of the design has really been influenced by
+
+00:38:52.060 --> 00:38:55.060
+like what are other designs, what's the prior art and like what's working well.
+
+00:38:56.220 --> 00:39:00.520
+You know, a lot of it was also informed by like just talking to a bunch of people across the
+
+00:39:00.540 --> 00:39:02.660
+industry and understanding what their concerns are.
+
+00:39:03.080 --> 00:39:04.900
+And so at least from my perspective,
+
+00:39:07.500 --> 00:39:09.520
+honestly, by calendar time, I have not
+
+00:39:09.540 --> 00:39:10.960
+been involved in Python that long.
+
+00:39:10.980 --> 00:39:15.540
+But it's been definitely the most cross-company,
+
+00:39:15.760 --> 00:39:17.360
+cross-project, cross-organization effort
+
+00:39:17.620 --> 00:39:19.540
+I've been involved in by a lot.
+
+00:39:21.000 --> 00:39:25.500
+We tried to replicate a model that I really
+
+00:39:25.660 --> 00:39:28.080
+like in the Python community, which was Faster CPython.
+
+00:39:29.100 --> 00:39:32.760
+We tried to philosophically create the packaging child
+
+00:39:33.140 --> 00:39:34.200
+of Faster CPython.
+
+00:39:35.740 --> 00:39:37.960
+And that's how we created WheelNext.
+
+00:39:38.480 --> 00:39:42.980
+It was all the amazing work that the Faster CPython
+
+00:39:43.220 --> 00:39:45.900
+community did on the CPython side,
+
+00:39:46.540 --> 00:39:48.860
+and kind of creating the same synergy
+
+00:39:49.140 --> 00:39:50.280
+but around Python packaging.
+
+00:39:51.000 --> 00:39:51.580
+And that's why--
+
+00:39:51.580 --> 00:39:51.760
+OK.
+
+00:39:52.100 --> 00:39:56.800
+I would almost say it's even quite a bit more diverse.
+
+00:39:57.100 --> 00:39:59.520
+At least my understanding is Faster CPython was
+
+00:39:59.640 --> 00:40:02.120
+primarily like funded and created by Microsoft,
+
+00:40:02.260 --> 00:40:04.600
+and it kind of turned into a community thing.
+
+00:40:05.420 --> 00:40:07.720
+But like all the money came from Microsoft, I think.
+
+00:40:07.860 --> 00:40:09.560
+- I think the majority of the people were working
+
+00:40:10.040 --> 00:40:11.980
+in a team inside Microsoft at least.
+
+00:40:12.100 --> 00:40:14.800
+- Yeah, and here we've got Nvidia,
+
+00:40:16.240 --> 00:40:17.780
+Meta, the PyTorch folks at Meta.
+
+00:40:18.620 --> 00:40:20.860
+We got some contributions from AMD and Intel
+
+00:40:21.120 --> 00:40:22.960
+and then Astral, Quansight.
+
+00:40:24.340 --> 00:40:29.020
+A large amount of the time that we've been able to spend at Quansight
+
+00:40:29.180 --> 00:40:31.880
+came from funding from Red Hat, who came with their own problem sets.
+
+00:40:33.660 --> 00:40:36.240
+And that's just the most prominent contributor.
+
+00:40:36.500 --> 00:40:40.240
+So there's at least 10 companies that started investing in this
+
+00:40:40.520 --> 00:40:42.120
+because it solves so many problems.
+
+00:40:42.400 --> 00:40:42.960
+Yeah, that's really encouraging as well.
+
+00:40:42.960 --> 00:40:46.560
+If you go on the menu on the left side, you'll see a section called Who We Are.
+
+00:40:47.480 --> 00:40:49.660
+Yeah, so I pulled up this project, WheelNext.
+
+00:40:50.780 --> 00:40:50.880
+And Ralph--
+
+00:40:50.880 --> 00:40:51.800
+Yeah, who are we?
+
+00:40:51.880 --> 00:40:53.580
+And you'll see all the names of the companies
+
+00:40:54.640 --> 00:40:56.540
+and the names of also the open source projects
+
+00:40:56.740 --> 00:40:59.580
+that contributed time and expertise.
+
+00:41:00.260 --> 00:41:01.340
+And kind of--
+
+00:41:01.340 --> 00:41:03.680
+AMD, Anaconda, Aprio, Astral, Google,
+
+00:41:04.000 --> 00:41:07.840
+Huawei, Intel, Laplev, Meta, NVIDIA, Preferred Networks,
+
+00:41:08.300 --> 00:41:10.960
+Probabl, Quansight, and Red Hat.
+
+00:41:11.080 --> 00:41:13.960
+That's a bit of a group working on this.
+
+00:41:14.220 --> 00:41:16.340
+And you can see just above all the different
+
+00:41:16.720 --> 00:41:19.980
+open source projects that different OSS and lead
+
+00:41:20.100 --> 00:41:22.300
+maintainers have contributed time and energy
+
+00:41:22.320 --> 00:41:25.400
+to kind of try to make this move forward.
+
+00:41:25.480 --> 00:41:27.200
+So it is quite a few people.
+
+00:41:28.160 --> 00:41:33.560
+Yeah, most notably maybe CuPy and PyTorch possibly.
+
+00:41:33.800 --> 00:41:34.680
+I mean, they're all--
+
+00:41:34.680 --> 00:41:37.100
+Maybe one company that is not too well known,
+
+00:41:37.840 --> 00:41:40.100
+and undeservedly, because they should be,
+
+00:41:40.460 --> 00:41:44.120
+which is Probabl at the bottom that you mentioned,
+
+00:41:44.260 --> 00:41:48.240
+which is essentially the support company behind scikit-learn.
+
+00:41:48.420 --> 00:41:55.920
+So if people don't know it, Probabl is essentially representing scikit-learn.
+
+00:41:58.560 --> 00:42:01.500
+Yeah, so this is wheelnext.dev.
+
+00:42:01.500 --> 00:42:06.580
+This is basically the website for the group, the working group, something like that.
+
+00:42:07.240 --> 00:42:07.440
+Yep.
+
+00:42:08.300 --> 00:42:11.880
+We try to leave our notes, our thinking, our drafts.
+
+00:42:13.500 --> 00:42:16.160
+One aspect that I really like on the work that we did
+
+00:42:16.180 --> 00:42:18.200
+is that it kind of felt like a startup.
+
+00:42:18.440 --> 00:42:21.100
+We were making a mockup and iterating very fast
+
+00:42:21.860 --> 00:42:24.420
+and getting feedback and this, I don't like this,
+
+00:42:24.480 --> 00:42:26.600
+I don't like this, I don't like this, change it.
+
+00:42:26.800 --> 00:42:29.900
+I worked really closely with two people,
+
+00:42:31.080 --> 00:42:34.860
+one from Quansight, one from Astral, Konstantin and Michał,
+
+00:42:35.310 --> 00:42:38.260
+and we did so many hours of work,
+
+00:42:39.020 --> 00:42:40.940
+so many different prototypes iterating,
+
+00:42:41.980 --> 00:42:46.580
+exposing the work to people, collecting feedback,
+
+00:42:47.220 --> 00:42:50.720
+adjusting, and repeating the cycle so many times
+
+00:42:50.960 --> 00:42:52.260
+until we finally got to something
+
+00:42:52.560 --> 00:42:54.160
+that we thought was reasonable.
+
+00:42:54.820 --> 00:42:57.000
+And that's where we started to write the PEP essentially.
+
+00:42:57.480 --> 00:42:58.820
+But that process took us a year.
+
+00:43:00.900 --> 00:43:03.620
+- All right, well, we should probably jump into the PEPs.
+
+00:43:03.740 --> 00:43:04.560
+And I'll tell you what,
+
+00:43:06.000 --> 00:43:10.540
+you all have quite the authorship attribution here.
+
+00:43:10.640 --> 00:43:12.820
+But also, I believe, correct me if I'm wrong,
+
+00:43:12.950 --> 00:43:16.240
+that this PEP is notable in that it's the longest PEP ever,
+
+00:43:16.480 --> 00:43:17.300
+something like that, right?
+
+00:43:19.619 --> 00:43:21.700
+- Yeah, I don't know if it's an achievement to be proud of.
+
+00:43:24.900 --> 00:43:27.240
+- It's the most powerful PEP ever written.
+
+00:43:27.280 --> 00:43:27.600
+- Yes.
+
+00:43:28.740 --> 00:43:29.500
+- It's a super PEP.
+
+00:43:30.159 --> 00:43:33.360
+So, so much so that, so we're talking about PEP 817,
+
+00:43:33.680 --> 00:43:35.520
+wheel variants, which is the variant thing
+
+00:43:35.800 --> 00:43:37.780
+that we actually are talking about, not the other variants,
+
+00:43:38.380 --> 00:43:39.540
+beyond platform tags.
+
+00:43:40.200 --> 00:43:43.540
+But then so much so that it actually got kicked to the curb for like, well,
+
+00:43:43.660 --> 00:43:46.420
+what is the minimal viable PEP of this PEP?
+
+00:43:46.960 --> 00:43:49.820
+And so we can take it in steps. And Jonathan,
+
+00:43:49.870 --> 00:43:53.340
+you just told me really good news about that PEP.
+
+00:43:53.580 --> 00:43:58.180
+So you spun off this other PEP, PEP 825, wheel variants package format,
+
+00:43:58.380 --> 00:44:00.700
+which is smaller, which still has a significant authorship,
+
+00:44:02.280 --> 00:44:06.000
+but this was just, it says draft, but is that true?
+
+00:44:08.820 --> 00:44:15.000
+Yes, so perhaps maybe Ralf, you want to discuss a little about what's the process of a PEP, I think that's--
+
+00:44:15.120 --> 00:44:15.920
+Yeah, so--
+
+00:44:15.940 --> 00:44:16.640
+That would be important.
+
+00:44:17.080 --> 00:44:30.400
+When you submit a PEP, you first, you know, submit it on GitHub, and then there's a group of folks called the PEP editors who basically just edit, you know, they review it for clarity, you know, language, consistency with other PEPs and so on.
+
+00:44:31.090 --> 00:44:33.860
+So they don't really look at the content of what you're proposing.
+
+00:44:35.920 --> 00:44:38.840
+So just as long as it's clear, they're happy, you merge it in.
+
+00:44:38.900 --> 00:44:40.700
+But because the first PEP was already so long,
+
+00:44:40.840 --> 00:44:43.160
+that process took like over a month already.
+
+00:44:45.000 --> 00:44:47.020
+But at that point, it's merged as draft.
+
+00:44:47.140 --> 00:44:49.880
+And then you go to the Python packaging Discourse
+
+00:44:50.500 --> 00:44:53.520
+where you say, okay, here's our PEP.
+
+00:44:54.020 --> 00:44:56.620
+Now, please, let's start the actual community review.
+
+00:44:57.280 --> 00:45:00.920
+And then basically anybody with an opinion can weigh in.
+
+00:45:01.480 --> 00:45:03.420
+And it's a forum.
+
+00:45:04.400 --> 00:45:05.900
+It's not even a threaded forum.
+
+00:45:06.040 --> 00:45:08.020
+So it's just one long thread of comments,
+
+00:45:08.180 --> 00:45:10.420
+which tends to make it like a little challenging.
+
+00:45:11.100 --> 00:45:12.840
+The more complex the topic gets,
+
+00:45:13.500 --> 00:45:16.020
+the harder it is to make sense of this conversation.
+
+00:45:16.700 --> 00:45:18.600
+- It's really hard to have a threaded
+
+00:45:20.140 --> 00:45:22.740
+multi-component conversation, it is.
+
+00:45:23.440 --> 00:45:23.920
+- Exactly.
+
+00:45:24.380 --> 00:45:27.500
+So that's one of the reasons it's now split
+
+00:45:27.580 --> 00:45:28.400
+into smaller parts.
+
+00:45:28.460 --> 00:45:30.040
+So you can at least have separate threads
+
+00:45:30.180 --> 00:45:31.300
+about different topics, right?
+
+00:45:32.440 --> 00:45:36.860
+Because especially not all of the parts of the design apply to everybody.
+
+00:45:37.180 --> 00:45:43.280
+When we're talking about installers, we want to hear primarily from the authors of uv and
+
+00:45:43.560 --> 00:45:47.260
+pip, Poetry, Hatch, PDM.
+
+00:45:47.900 --> 00:45:51.880
+But if we're talking about how do you build a wheel, well, we have to talk primarily to
+
+00:45:52.200 --> 00:45:58.320
+setuptools, scikit-build-core, meson-python, the build backends.
+
+00:45:58.920 --> 00:46:01.060
+And the index server, the same. We
+
+00:46:01.440 --> 00:46:08.780
+want to know that the PyPI maintainers are happy. So that's why, you know, organizing this review and
+
+00:46:08.900 --> 00:46:13.580
+chopping it up into component parts, it's still going to be really hard to get the right amount of
+
+00:46:13.780 --> 00:46:22.000
+feedback. But we now have, like, the first PR, you know, the first merged PEP in draft status. So it's
+
+00:46:22.000 --> 00:46:28.720
+going to only be accepted once the whole community review process is done. And probably what
+
+00:46:28.720 --> 00:46:33.320
+will happen is it's going to be provisionally accepted only, because we know there's like three
+
+00:46:33.440 --> 00:46:40.400
+more PEPs coming for the other parts. And eventually, you know, you want all four to be,
+
+00:46:41.030 --> 00:46:45.900
+you know, working and accepted. Like, you know, we now have prototypes, but, you know, we want the
+
+00:46:46.580 --> 00:46:50.920
+prototypes for the final design and have, like, you know, the tool authors say, like, yeah, this works for
+
+00:46:51.020 --> 00:46:54.920
+us, before you really go from provisional to actually 
accepted.
+
+00:46:58.840 --> 00:47:03.180
+Amazing. So yeah, this is part of what I was getting at when I said at the beginning that this
+
+00:47:03.360 --> 00:47:07.980
+touches, like, every part of the packaging stack. It's just, like, it's very hard to break
+
+00:47:08.130 --> 00:47:13.300
+it up. I mean, that's what we're trying to do in some sense, but, like, from the start it's
+
+00:47:13.300 --> 00:47:17.520
+been hard. There aren't necessarily super great cut points, because it does affect
+
+00:47:18.580 --> 00:47:23.679
+how you build packages, how you publish them, like, how they get hosted and served from the registry,
+
+00:47:23.700 --> 00:47:29.160
+how installers look at them and understand them, all of those things, like marker syntax,
+
+00:47:29.580 --> 00:47:32.240
+all of that stuff gets impacted in different ways.
+
+00:47:33.340 --> 00:47:34.400
+It's very funny.
+
+00:47:34.460 --> 00:47:41.320
+As we were prototyping this for a year, we ended up pretty much forking the entire ecosystem.
+
+00:47:42.980 --> 00:47:43.920
+It got forked.
+
+00:47:43.920 --> 00:47:45.040
+Pip got forked.
+
+00:47:45.420 --> 00:47:46.620
+Warehouse got forked.
+
+00:47:47.960 --> 00:47:49.400
+Packaging got forked.
+
+00:47:49.480 --> 00:47:53.420
+Like absolutely every package in the ecosystem had been forked,
+
+00:47:53.740 --> 00:47:55.380
+because we needed to test our implementation.
+
+00:47:56.640 --> 00:47:56.760
+And we needed to--
+
+00:47:56.760 --> 00:47:59.220
+The goal, of course, is to un-fork those things.
+
+00:47:59.560 --> 00:47:59.740
+Yes.
+
+00:48:00.120 --> 00:48:00.240
+Yeah.
+
+00:48:00.840 --> 00:48:02.120
+It's to re-merge that.
+
+00:48:02.440 --> 00:48:07.860
+But we needed to have a playground to be able to experiment
+
+00:48:08.040 --> 00:48:11.600
+and see how the concept that we were developing was functioning
+
+00:48:11.700 --> 00:48:15.680
+in pip and then in packaging, but then also in setuptools,
+
+00:48:16.520 --> 00:48:20.380
+and then in scikit-build-core, and then in meson-python.
+
+00:48:20.440 --> 00:48:23.500
+But it just keeps spreading essentially
+
+00:48:23.570 --> 00:48:26.580
+to every single corner of the packaging, installation,
+
+00:48:26.950 --> 00:48:29.200
+and distribution aspect of Python.
+
+00:48:29.730 --> 00:48:30.760
+So that was pretty funny.
+
+00:48:31.780 --> 00:48:32.300
+Yeah.
+
+00:48:32.960 --> 00:48:33.440
+What--
+
+00:48:33.540 --> 00:48:36.100
+Yeah, we have a fork in uv.
+
+00:48:36.390 --> 00:48:38.360
+Or I guess, technically, it's just a branch
+
+00:48:40.180 --> 00:48:42.860
+that Konstantin on our team, on here on the PEP,
+
+00:48:43.120 --> 00:48:45.820
+has been-- who's been super involved--
+
+00:48:45.870 --> 00:48:49.099
+oh, thanks-- who's been super involved throughout,
+
+00:48:49.120 --> 00:48:53.400
+done a ton of work on basically implementing the standard in uv. So we have like a working
+
+00:48:53.860 --> 00:48:59.560
+implementation that we've used to, yeah, you can actually install it, from, you know,
+
+00:48:59.560 --> 00:49:04.820
+we basically distribute it at a slightly different URL, so you can install it and test it. But,
+
+00:49:04.850 --> 00:49:10.200
+uh, yeah, that fork has evolved a lot, or that branch has evolved a lot, and it's
+
+00:49:10.240 --> 00:49:14.600
+been a lot of work. It's been incredibly helpful for the design process for us to understand
+
+00:49:14.840 --> 00:49:19.080
+like what's hard, what's easy. And then, I also think it's important for PEPs just to have
+
+00:49:19.100 --> 00:49:22.600
+working implementations too. 
And I mean, a lot of people agree, that's not a novel point, but
+
+00:49:23.100 --> 00:49:26.980
+that's been one of the goals too, is to show what it's like in practice and that it actually works.
+
+00:49:28.840 --> 00:49:34.720
+So people want to play around with this. An easy way might be to try to use this fork.
+
+00:49:35.280 --> 00:49:42.200
+We put a lot of work into it, so actually go ahead and try it. Because I think it's...
+
+00:49:43.760 --> 00:49:48.420
+I personally have a lot of admiration for the work done in free-threading Python,
+
+00:49:49.100 --> 00:49:56.380
+especially the PEP, and I think Sam Gross, who is the main author, managed to make a significant
+
+00:49:56.550 --> 00:50:02.380
+amount of progress as he was coming up with prototypes. It's not just my word, let
+
+00:50:02.460 --> 00:50:08.540
+me show it to you, it works. And yeah, there was so much skepticism around that idea of free-
+
+00:50:08.720 --> 00:50:15.300
+threaded Python. He had to show, not tell. But I think if we didn't do the work similarly on
+
+00:50:15.320 --> 00:50:22.460
+variant-enabled wheels, people would have told us, oh, well, resolution is too slow. It's going to
+
+00:50:22.680 --> 00:50:27.900
+slow down the installer too much. And Astral is probably one of the installers that care the most
+
+00:50:27.970 --> 00:50:34.660
+about speed. So we needed to convince not just ourselves, but also Charlie and his team to be...
+
+00:50:34.900 --> 00:50:39.100
+Yeah, and we had plenty of feedback on that front too. Well, during the design,
+
+00:50:39.150 --> 00:50:42.880
+we were like, no, this is going to be too slow, or like, this is like a better way to do it.
+
+00:50:42.900 --> 00:50:49.640
+But I like this little snippet, because this is basically, if you haven't felt this pain,
+
+00:50:49.640 --> 00:50:51.000
+it might not be meaningful to you.
+
+00:50:51.000 --> 00:50:55.300
+But if you've worked with PyTorch, this is what we want to enable.
+
+00:50:56.480 --> 00:51:01.740
+You don't have to configure a specific index URL that captures the CUDA variant or anything
+
+00:51:01.880 --> 00:51:02.160
+like that.
+
+00:51:02.240 --> 00:51:04.860
+You just say, hey, install Torch.
+
+00:51:04.910 --> 00:51:09.480
+And then in this variant-enabled build, uv would go look at Torch.
+
+00:51:10.040 --> 00:51:14.800
+It would see, OK, Torch, it has different variants
+
+00:51:14.900 --> 00:51:15.760
+for different CUDA versions.
+
+00:51:16.240 --> 00:51:18.960
+And here's how I inspect what CUDA version I should use
+
+00:51:18.960 --> 00:51:19.460
+on your machine.
+
+00:51:19.540 --> 00:51:20.740
+And then it would pick out the right version
+
+00:51:20.880 --> 00:51:22.820
+based on what's supported by the GPU that's running.
+
+00:51:23.560 --> 00:51:24.560
+That should all happen.
+
+00:51:24.780 --> 00:51:26.760
+And users shouldn't have to think about configuring it,
+
+00:51:26.960 --> 00:51:29.500
+effectively, is what we have been working towards.
+
+00:51:29.820 --> 00:51:31.120
+And in the future,
+
+00:51:31.160 --> 00:51:32.380
+the first line doesn't exist.
+
+00:51:32.620 --> 00:51:34.000
+Because right now, the first line is just
+
+00:51:34.140 --> 00:51:35.720
+here to install this variant-enabled build.
+
+00:51:35.720 --> 00:51:36.940
+Yeah, that just installs the fork.
+
+00:51:37.280 --> 00:51:37.380
+Yeah.
+
+00:51:37.840 --> 00:51:40.840
+For people listening and not watching, what they mean by those lines, there's three lines
+
+00:51:40.880 --> 00:51:41.840
+here to say how to use this.
+
+00:51:42.340 --> 00:51:42.760
+It says curl.
+
+00:51:42.760 --> 00:51:43.120
+Oh, sorry.
+
+00:51:43.360 --> 00:51:44.260
+Basically, yeah, no worries.
+
+00:51:44.480 --> 00:51:51.120
+It's the install statement for uv, which is typical, except that it overrides the
+
+00:51:51.700 --> 00:51:52.360
+download URL.
+
+00:51:52.800 --> 00:51:53.740
+The download URL.
+
+00:51:54.620 --> 00:51:57.560
+It's a different URL, which is index.astral.sh.
+
+00:51:57.580 --> 00:52:02.320
+We distribute a separate variant-enabled experimental, quote-unquote, prototype build.
+
+00:52:03.080 --> 00:52:03.240
+Right.
+
+00:52:03.360 --> 00:52:07.260
+And then you just create a virtual environment, uv venv, and then you just uv pip install
+
+00:52:08.080 --> 00:52:10.140
+like normal, but it handles this.
+
+00:52:10.160 --> 00:52:13.600
+And Charlie, we spoke, I think on the pyx episode,
+
+00:52:14.000 --> 00:52:17.580
+about just how large some of these things are,
+
+00:52:17.680 --> 00:52:20.280
+like PyTorch and others that are compiled.
+
+00:52:22.320 --> 00:52:23.740
+You can't just download everything,
+
+00:52:24.160 --> 00:52:26.360
+all the variations, into one wheel.
+
+00:52:26.720 --> 00:52:28.920
+I mean, I guess you could, but it'd be crazy, right?
+
+00:52:29.380 --> 00:52:30.780
+That's actually a big benefit, right?
+
+00:52:31.940 --> 00:52:34.940
+Right now you go to PyPI, you download the PyTorch wheel,
+
+00:52:35.020 --> 00:52:37.100
+it'll be about 900 megabytes.
+
+00:52:38.240 --> 00:52:39.240
+You can't make it small.
+
+00:52:39.830 --> 00:52:42.000
+Part of the reason it's so large is, again, these fat binaries.
+
+00:52:43.600 --> 00:52:45.580
+The NumPy ones are a few megabytes.
+
+00:52:45.860 --> 00:52:48.320
+The PyTorch ones have a bunch of CUDA inside
+
+00:52:48.830 --> 00:52:51.420
+for five or six different CUDA architectures.
+
+00:52:51.860 --> 00:52:53.640
+It bloats very, very quickly.
+
+00:52:54.390 --> 00:52:57.280
+Actually, the PyTorch team has to try incredibly hard
+
+00:52:57.800 --> 00:52:59.340
+to stay under one gigabyte.
+
+00:53:01.460 --> 00:53:03.400
+If we have variants, we can just slim it down
+
+00:53:03.560 --> 00:53:05.240
+to one CUDA architecture per wheel.
+
+00:53:05.900 --> 00:53:12.620
+So you can go down to like 200 megabytes or so, 250 maybe.
+
+00:53:12.740 --> 00:53:16.180
+But it's way better, both for index servers,
+
+00:53:16.360 --> 00:53:17.340
+it's better for users.
+
+00:53:18.540 --> 00:53:20.040
+It's going to be pretty small too.
+
+00:53:20.400 --> 00:53:22.880
+The only thing it's not better for is CI
+
+00:53:23.120 --> 00:53:25.620
+servers that have to build all these different things
+
+00:53:25.800 --> 00:53:26.800
+if you start sharding.
+
+00:53:27.460 --> 00:53:30.540
+But that's a one-time cost that at the end ends up being--
+
+00:53:31.680 --> 00:53:34.940
+it's much better to have a slight increase one time
+
+00:53:35.040 --> 00:53:38.200
+and a massive decrease at scale, essentially.
+
+00:53:38.900 --> 00:53:43.580
+Yeah, you build it once, it gets installed a million times.
+
+00:53:43.920 --> 00:53:45.300
+That's a massive difference.
+
+00:53:46.120 --> 00:53:50.900
+And it's also better for the warehouse folks like PyPI.
+
+00:53:52.920 --> 00:53:55.880
+It's easy for people to just assume pip install,
+
+00:53:56.060 --> 00:53:58.180
+uv pip install, that sort of stuff is going to work.
+
+00:53:59.180 --> 00:54:02.540
+But the cost of just the bandwidth in that infrastructure
+
+00:54:02.560 --> 00:54:04.640
+is astronomical, which is crazy.
+
+00:54:05.480 --> 00:54:09.080
+So this is going to be a major benefit for bandwidth.
+
+00:54:10.760 --> 00:54:13.220
+Yeah, and also install speed.
+
+00:54:14.280 --> 00:54:16.280
+You'll also benefit from that because you're no longer
+
+00:54:16.460 --> 00:54:19.540
+downloading as much stuff to actually install PyTorch.
+
+00:54:20.180 --> 00:54:23.880
+I mean, if you use uv, it's got some really good caching,
+
+00:54:24.200 --> 00:54:25.580
+and it's pretty quick.
+
+00:54:25.800 --> 00:54:28.240
+Oh, but it doesn't multiply your bandwidth by magic.
+
+00:54:30.700 --> 00:54:33.400
+I wish Charlie could find a solution to that.
+
+00:54:34.700 --> 00:54:35.520
+I haven't yet.
+
+00:54:35.900 --> 00:54:40.580
+But yeah, if you're downloading Torch and all the NVIDIA, all the CUDA stuff, it's, yeah.
+
+00:54:40.760 --> 00:54:41.320
+It's hefty.
+
+00:54:42.140 --> 00:54:44.900
+It's a large number of megabytes.
+
+00:54:46.660 --> 00:54:46.860
+Crazy.
+
+00:54:47.860 --> 00:54:48.480
+All right, let's see.
+
+00:54:48.620 --> 00:54:57.280
+Let's talk real quick about the pypackaging-native guide, and then I want to get an update on pyx real quick before we go.
+
+00:54:57.280 --> 00:54:58.780
+So, Ralf, this is your project, right?
+
+00:54:59.220 --> 00:54:59.800
+Tell us about this.
+
+00:54:59.940 --> 00:55:00.300
+Yes.
+
+00:55:01.080 --> 00:55:07.860
+Okay, so I've been watching discussions about some of the topics we've talked about in this
+
+00:55:08.060 --> 00:55:12.220
+episode since 2010 or so in Python packaging.
+
+00:55:12.620 --> 00:55:20.200
+And even back then, long before we had wheels, NumPy, for example, had different .exe installers
+
+00:55:20.280 --> 00:55:21.480
+that we would upload to PyPI.
+
+00:55:21.900 --> 00:55:28.920
+There would be one named _sse2, one _sse3, and users had to just pick the right .exe and
+
+00:55:28.920 --> 00:55:30.380
+install it on their Windows machine.
+
+00:55:31.280 --> 00:55:32.220
+And then we would go to--
+
+00:55:32.220 --> 00:55:33.320
+Wow, I had no idea.
+
+00:55:33.640 --> 00:55:33.760
+OK.
+
+00:55:34.340 --> 00:55:36.800
+Yes, it was not fun.
+
+00:55:37.300 --> 00:55:39.300
+And actually, this is by far the hardest thing
+
+00:55:39.360 --> 00:55:41.480
+when I became NumPy release manager,
+
+00:55:41.860 --> 00:55:44.600
+because we had to build these things on Linux under Wine,
+
+00:55:45.580 --> 00:55:47.580
+and there were no instructions, and there were really janky
+
+00:55:47.760 --> 00:55:47.900
+scripts.
+
+00:55:48.040 --> 00:55:50.300
+So it took me three months to get the first release out.
+
+00:55:52.600 --> 00:55:58.880
+But yeah, so I saw all these discussions about SSE
+
+00:55:58.900 --> 00:56:03.880
+2 and SSE3, and, like, you know, the pip authors, and, you know, most of the people who work with pure
+
+00:56:04.100 --> 00:56:10.100
+Python, like, you know, the DevOps folks, the, you know, web framework folks, they had no idea about this. And
+
+00:56:10.100 --> 00:56:15.140
+usually these conversations went in circles, because when you explain something to one person,
+
+00:56:15.260 --> 00:56:19.480
+the next person would come in, and, like, you know, these are endless mailing threads that would never go
+
+00:56:19.680 --> 00:56:27.440
+anywhere. So after, you know, seeing that for 12, 13 years or so, I, you know, finally got tired of that.
+
+00:56:27.440 --> 00:56:31.140
+I thought I'm going to write a reference site that explains the problem.
+
+00:56:31.230 --> 00:56:33.820
+I don't want to propose any solutions, but just explain the problem.
+
+00:56:33.960 --> 00:56:37.540
+So the next time someone starts a new conversation about, you know,
+
+00:56:39.000 --> 00:56:43.180
+SIMD extensions or about GPUs or, you know,
+
+00:56:43.420 --> 00:56:47.020
+about some of the issues with mixing, you know, source and binary distributions,
+
+00:56:48.200 --> 00:56:50.360
+just link to this site, like, please use that.
+
+00:56:50.460 --> 00:56:54.600
+It's the best approach at trying to summarize the problem, you know,
+
+00:56:54.940 --> 00:56:57.180
+so we have a baseline to start talking about solutions.
+
+00:56:58.010 --> 00:57:01.980
+And I think, you know, Jonathan, you know, as one of the people who saw this,
+
+00:57:02.160 --> 00:57:07.480
+I think a lot of people read this, but it was a nice basis to, you know,
+
+00:57:08.510 --> 00:57:12.100
+just point at this, like, there are your problem descriptions. And, you know,
+
+00:57:12.260 --> 00:57:16.720
+for the GPU part, like, NVIDIA folks really helped to make sure that all the
+
+00:57:17.100 --> 00:57:18.860
+explanations of the problems were correct.
+
+00:57:19.860 --> 00:57:23.240
+So when we started WheelNext, we could just start talking about, like, okay,
+
+00:57:23.500 --> 00:57:24.420
+what are the solutions here?
+
+00:57:26.500 --> 00:57:29.180
+This website is absolutely incredible.
+
+00:57:29.920 --> 00:57:30.360
+It's amazing.
+
+00:57:30.720 --> 00:57:31.360
+Yeah, it's amazing.
+
+00:57:32.200 --> 00:57:35.960
+The work that Ralf and every contributor to this website
+
+00:57:36.160 --> 00:57:40.700
+have made, this is by far the best explanation anywhere
+
+00:57:41.460 --> 00:57:44.360
+on the internet of all these packaging issues.
+
+00:57:45.040 --> 00:57:46.380
+And I really like the perspective
+
+00:57:46.600 --> 00:57:48.960
+that Ralf took, which is don't state the solution,
+
+00:57:49.280 --> 00:57:51.520
+just focus on stating the problem very clearly.
+
+00:57:52.420 --> 00:57:57.840
+And then with WheelNext, we try to take the exact flip side of the coin, which is
+
+00:57:58.420 --> 00:58:03.820
+don't focus on the problem, it's already explained, just focus on proposing one solution to some of
+
+00:58:03.840 --> 00:58:08.500
+the problems. And this is how we created WheelNext. I love it. You know, one of the big problems,
+
+00:58:09.120 --> 00:58:16.280
+challenges I guess, is if you don't fully understand the problem space, you could be debating two
+
+00:58:16.380 --> 00:58:20.600
+different things, and one person sees a really important angle, the other person doesn't even see
+
+00:58:20.500 --> 00:58:25.180
+that angle there. They have a different perspective that they're arguing for, optimizing for. And so,
+
+00:58:25.720 --> 00:58:29.560
+yeah, it's sort of a little bit like the WheelNext stuff, like, let's get everyone involved and
+
+00:58:30.020 --> 00:58:36.640
+see all the angles and then discuss it, right? Exactly. Cool. All right, well, you know the,
+
+00:58:36.760 --> 00:58:42.940
+you know the saying, a problem well stated is a problem half solved. So this is, that is an awesome--
+
+00:58:42.960 --> 00:58:44.420
+Exactly what we are trying to say.
+
+00:58:45.100 --> 00:58:45.520
+I love it.
+
+00:58:46.360 --> 00:58:46.500
+All right.
+
+00:58:46.640 --> 00:58:46.840
+Let's--
+
+00:58:47.210 --> 00:58:52.720
+I want to get a quick update on pyx, since I feel like, Charlie,
+
+00:58:53.020 --> 00:58:54.780
+you're right in the middle of this.
+
+00:58:54.870 --> 00:58:58.220
+I know pyx was looking to solve some of these problems as well.
+
+00:58:59.520 --> 00:59:01.480
+Give us the elevator pitch and just,
+
+00:59:01.930 --> 00:59:03.200
+we have a whole episode on this from,
+
+00:59:03.440 --> 00:59:03.620
+I don't know,
+
+00:59:03.650 --> 00:59:04.680
+six months ago or something,
+
+00:59:04.900 --> 00:59:05.240
+but yeah,
+
+00:59:05.290 --> 00:59:05.740
+give us the,
+
+00:59:05.980 --> 00:59:08.680
+what's the situation here, and does this change things on how you're
+
+00:59:08.690 --> 00:59:09.400
+handling things?
+
+00:59:10.080 --> 00:59:10.260
+Yeah.
+
+00:59:10.270 --> 00:59:10.400
+Yeah.
+
+00:59:10.670 --> 00:59:10.760
+Yeah.
+
+00:59:10.760 --> 00:59:10.820
+Yeah.
+
+00:59:10.830 --> 00:59:11.140
+For sure.
+
+00:59:11.260 --> 00:59:11.520
+It's really,
+
+00:59:12.180 --> 00:59:12.260
+yeah.
+
+00:59:12.360 --> 00:59:14.340
+pyx is our hosted package registry.
+
+00:59:15.800 --> 00:59:17.680
+And it's in beta right now.
+
+00:59:17.800 --> 00:59:20.820
+So we're live with a bunch of great customers.
+
+00:59:23.460 --> 00:59:25.980
+The goal of pyx is basically to enable us
+
+00:59:25.980 --> 00:59:28.840
+to solve more of the packaging problems that we see in the uv
+
+00:59:28.960 --> 00:59:33.160
+issue tracker by having our own registry that we think
+
+00:59:33.160 --> 00:59:35.100
+is well implemented and solves problems
+
+00:59:35.280 --> 00:59:37.160
+that we see that other registries don't really solve.
+
+00:59:40.120 --> 00:59:47.700
+Basically from the start, the way that we've approached these, like, problems around the GPU stuff is from, like, two perspectives.
+
+00:59:49.180 --> 00:59:54.200
+And in pyx, we're really just focused, in terms of how it overlaps with wheel variants,
+
+00:59:54.260 --> 00:59:55.760
+we're really just focused on the GPU part.
+
+00:59:56.140 --> 01:00:03.240
+But the way that we've approached it has basically been: try to push the standards forward as much as we can.
+
+01:00:03.760 --> 01:00:05.420
+And that's what we've been doing in this effort.
+
+01:00:05.840 --> 01:00:10.720
+And then simultaneously try to figure out how we can help users until the standards change.
+
+01:00:11.800 --> 01:00:17.720
+And so pyx has more been in that second camp of assuming that standards don't change,
+
+01:00:17.980 --> 01:00:22.420
+because we don't want to unilaterally start changing a bunch of things without going through the process.
+
+01:00:22.940 --> 01:00:26.420
+How can we make the world a little bit easier for people who are working with this kind of stuff?
+
+01:00:26.520 --> 01:00:35.280
+So for example, in pyx, we take a lot of packages that are PyTorch extensions or need to be built against CUDA,
+
+01:00:35.700 --> 01:00:36.620
+and we build those.
+
+01:00:37.300 --> 01:00:39.600
+We build them across a wide range of CUDA versions,
+
+01:00:39.940 --> 01:00:42.720
+PyTorch versions, Python versions, CPU architectures,
+
+01:00:43.080 --> 01:00:44.420
+and we make those available to users.
+
+01:00:44.950 --> 01:00:48.520
+So it doesn't solve the core problem of how do you build
+
+01:00:48.570 --> 01:00:50.320
+and distribute this stuff, but it does
+
+01:00:50.330 --> 01:00:52.160
+mean that if you're operating within the constraints
+
+01:00:52.400 --> 01:00:54.200
+of the current set of standards, we
+
+01:00:54.360 --> 01:00:56.740
+can make people's lives easier by making it so they don't
+
+01:00:56.780 --> 01:00:57.640
+have to build so many things.
+
+01:00:57.800 --> 01:00:59.880
+We build them well, they all work together,
+
+01:01:00.070 --> 01:01:00.840
+all that kind of stuff.
+
+01:01:01.440 --> 01:01:03.240
+So that's what we've been focused on.
+
+01:01:03.420 --> 01:01:11.640
+And I think looking forward, our goal is to support wheel variants as soon as possible
+
+01:01:12.480 --> 01:01:13.700
+and put those into the registry.
+
+01:01:13.880 --> 01:01:17.720
+So as soon as we feel like that's a feasible thing to do on the registry,
+
+01:01:17.820 --> 01:01:20.820
+we'll support it in pyx and support it for our users and our customers.
+
+01:01:21.720 --> 01:01:23.840
+But in the meantime, it's kind of been like a parallel track effort
+
+01:01:24.060 --> 01:01:26.680
+of pushing forward on all the WheelNext work and standards,
+
+01:01:26.840 --> 01:01:29.460
+and then just trying to solve immediate user problems
+
+01:01:29.640 --> 01:01:32.000
+without changing standards, like partly through the registry.
+
+01:01:33.180 --> 01:01:37.460
+Okay. Things are going good at pyx? You're making progress?
+
+01:01:37.460 --> 01:01:42.000
+Yeah. We're making progress. Yeah. Yeah. No, it's good.
+
+01:01:42.940 --> 01:01:44.640
+Customers are growing. Numbers are going up. It's good.
+
+01:01:45.570 --> 01:01:47.800
+Awesome. If people want to try pyx, what do they do?
+
+01:01:48.520 --> 01:01:50.560
+They can join the wait list here. Yeah. Yeah.
+
+01:01:51.550 --> 01:01:53.340
+This is, you know, you just, we have a, yeah,
+
+01:01:53.460 --> 01:01:57.380
+or you can go to ash.pyx, and we look at all the responses and we basically
+
+01:01:57.600 --> 01:01:59.300
+onboard people one by one.
+
+01:02:00.280 --> 01:02:05.260
+Okay. So talking about when is this stuff going to be ready, when you'll be able to adopt it. I guess
+
+01:02:05.400 --> 01:02:10.780
+maybe that's a good place to close out our conversation here: what's the timeline?
+
+01:02:11.560 --> 01:02:14.300
+What are expectations? How are things going? What's next?
+
+01:02:15.200 --> 01:02:15.620
+It's a great question.
+
+01:02:15.620 --> 01:02:18.000
+As with everything open source, it's a two-month delay.
+
+01:02:22.580 --> 01:02:24.560
+What's the party line on this question?
+
+01:02:27.880 --> 01:02:28.620
+Oh, gosh.
+
+01:02:29.100 --> 01:02:29.740
+Well, it's...
+
+01:02:29.760 --> 01:02:30.380
+It's next time.
+
+01:02:30.860 --> 01:02:31.860
+I liked...
+
+01:02:32.080 --> 01:02:33.840
+We have a joke inside...
+
+01:02:33.980 --> 01:02:36.620
+I don't know if it's widespread, but inside WheelNext,
+
+01:02:36.700 --> 01:02:39.280
+we call this the Varys Force Law.
+
+01:02:39.780 --> 01:02:40.760
+Varso Force Law.
+
+01:02:40.820 --> 01:02:42.820
+I don't remember exactly how...
+
+01:02:43.060 --> 01:02:44.480
+Which is essentially: make an estimate,
+
+01:02:45.340 --> 01:02:46.920
+multiply it by two, and change the unit.
+
+01:02:47.180 --> 01:02:49.360
+So if you think it's going to take six months,
+
+01:02:49.780 --> 01:02:50.180
+it's one year.
+
+01:02:50.180 --> 01:02:50.560
+Oh, no.
+
+01:02:50.940 --> 01:02:52.340
+Change the unit: one decade.
+
+01:02:57.260 --> 01:03:03.160
+And it's a running joke that we have that I think is really good.
+
+01:03:04.320 --> 01:03:15.680
+Realistically, I think it depends on where we are going to set the bar for starting to roll things out.
+
+01:03:15.760 --> 01:03:20.240
+So as Ralf was saying, we'll probably see some PEPs provisionally accepted.
+
+01:03:21.640 --> 01:03:25.940
+But as we get to that point, some of the stuff will be possible.
+
+01:03:27.580 --> 01:03:35.180
+For example, I expect that little by little, we can start experimenting with things without
+
+01:03:35.440 --> 01:03:40.320
+getting necessarily to the absolute final stage, but the full feature will be available
+
+01:03:40.500 --> 01:03:43.280
+at the very last stage.
+
+01:03:44.440 --> 01:03:47.060
+So, complicated question to answer.
+
+01:03:47.160 --> 01:03:49.180
+We hope that it's not going to take too many years.
+
+01:03:49.920 --> 01:03:54.680
+I'll make a connection back to pyx here, because I think, you know, there's a part that's like, okay,
+
+01:03:54.800 --> 01:03:57.160
+there's four PEPs that need to be reviewed.
+
+01:03:57.600 --> 01:04:00.100
+Probably we need to update some prototypes here and there.
+
+01:04:00.320 --> 01:04:03.340
+It's probably going to take, you know, the better part of this year.
+
+01:04:04.700 --> 01:04:06.760
+At that point, you know, you have accepted PEPs, right?
+
+01:04:06.840 --> 01:04:09.320
+But then PyPI needs to be updated.
+
+01:04:09.740 --> 01:04:11.900
+Like, you know, all the tools, like Twine,
+
+01:04:11.940 --> 01:04:12.800
+would need to be updated.
+
+01:04:13.060 --> 01:04:15.640
+Like, there's a new metadata version.
+
+01:04:15.740 --> 01:04:17.740
+So everything that consumes that needs to be updated
+
+01:04:18.980 --> 01:04:22.559
+before, you know, package authors can actually start producing
+
+01:04:22.560 --> 01:04:25.260
+these wheels and uploading them to PyPI.
+
+01:04:25.780 --> 01:04:27.720
+So that's going to not be this year.
+
+01:04:28.000 --> 01:04:34.380
+There's a very long tail of how the implementation rolls through the ecosystem, and then you have
+
+01:04:34.380 --> 01:04:39.260
+to wait until users get newer tools, and then only then can you start uploading wheels.
+
+01:04:40.060 --> 01:04:45.260
+So I'm going to poke at Charlie a bit here, because one of the advantages of having a separate
+
+01:04:45.480 --> 01:04:52.540
+registry, plus the ability to rebuild everything, is you can start using variant wheels the
+
+01:04:52.560 --> 01:05:09.900
+moment that everything is accepted. It's way sooner. Have you thought about that? That is true. Yeah, yeah, of course. Yeah. I think from our perspective, we're mostly like, do we feel like the design is done, or how much churn will there be on the design? But yeah, we're definitely in a position to start building and distributing this stuff much, much sooner.
+
+01:05:11.500 --> 01:05:16.420
+uv has a second advantage, which is I think they have a much shorter tail of users in terms of version.
+
+01:05:17.160 --> 01:05:21.700
+I think uv users end up on a much more, quote-unquote, recent version.
+

01:05:22.380 --> 01:05:25.580
If you look at pip, I think, I don't remember the statistic off the top of my head,

01:05:25.740 --> 01:05:31.340
but a still significant portion of users use a five-year-old version of pip,

01:05:31.580 --> 01:05:34.120
which, I don't even know which version of Python.

01:05:34.230 --> 01:05:35.920
It was 3.9 or something.

01:05:36.700 --> 01:05:43.740
So, uv is able to move a lot faster, but also the users are more reactive.

01:05:44.360 --> 01:05:45.700
That's a very interesting point.

01:05:45.980 --> 01:05:53.160
I mean, I think a lot of people who are very tuned into the Python space have switched to uv, started using uv.

01:05:53.940 --> 01:05:59.520
And there's probably a lot of people who don't read the newsletters, don't listen to the podcast and so on.

01:05:59.580 --> 01:06:03.780
And they know pip and they just keep on pipping, which is fine, not knocking it.

01:06:03.820 --> 01:06:20.000
But, you know, it means not only are these folks on an old pip, they might be using an older version of Python, because they don't want to shake it up. And, you know, those are going to be the long tails that are going to be hard. I guess one more thought about what's next here before we call this a show here.

01:06:23.000 --> 01:06:28.360
What is the minimal? We talked about PEP 825, the minimal PEP. What is the minimal amount of

01:06:28.580 --> 01:06:34.940
adoption? Right? So if the top five biggest data science and machine learning libraries adopt this,

01:06:35.760 --> 01:06:42.400
and the installer tools like uv and pip support it, that actually alone might be a really big

01:06:42.680 --> 01:06:48.780
benefit if all the other packages are just ignored, right?
So that's way more achievable than every

01:06:48.800 --> 01:06:54.160
single package that has native code has all these specifiers, right? What's the minimum level

01:06:54.300 --> 01:07:06.340
of adoption? I would say that, I mean, the minimum level at which you can call it a success... Yeah, five

01:07:06.340 --> 01:07:09.920
is probably not that far off. The benefits start to accumulate quickly. But I would expect,

01:07:10.980 --> 01:07:17.080
once packages like PyTorch start adopting this, especially in the deep learning space, you know,

01:07:17.000 --> 01:07:21.280
this will be adopted very widely, very quickly, because it solves so many problems.

01:07:21.550 --> 01:07:28.320
Many of the most popular packages, like vLLM, have very large development teams and very large numbers of users.

01:07:29.060 --> 01:07:32.560
If you look at their install pages, it's like, you know, it's like a puzzle book.

01:07:33.900 --> 01:07:35.520
You just don't know how to install this stuff.

01:07:35.610 --> 01:07:39.020
And they don't have wheels on PyPI and they have their own extra index servers.

01:07:39.910 --> 01:07:41.540
And it's not for lack of trying.

01:07:42.420 --> 01:07:43.320
It's not for lack of trying.

01:07:43.450 --> 01:07:46.440
Like, those teams put a lot of effort into trying to make it easier to install,

01:07:46.700 --> 01:07:49.220
but they basically all run into different kinds of roadblocks.

01:07:49.920 --> 01:07:54.020
I think five packages is what you'll get after maybe two weeks.

01:07:55.020 --> 01:07:57.460
After a month, you will get twice that amount,

01:07:57.890 --> 01:08:01.680
and probably a quadratic progression for quite a few weeks.
+

01:08:01.770 --> 01:08:05.180
But it's, especially in the scientific compute space

01:08:05.250 --> 01:08:07.340
and maybe machine learning to be more specific,

01:08:08.540 --> 01:08:12.320
well, the moment that it works, so many packages will switch,

01:08:12.700 --> 01:08:13.500
like so many.

01:08:14.220 --> 01:08:16.660
If you just take PyTorch, half of its dependencies

01:08:16.680 --> 01:08:18.440
will probably activate variant mode.

01:08:19.299 --> 01:08:21.580
And then the people that build on top of PyTorch

01:08:21.580 --> 01:08:23.100
or people who build on top of JAX.

01:08:23.600 --> 01:08:26.540
So just that, you end up with at least 50 packages

01:08:26.759 --> 01:08:28.020
in a matter of a few months.

01:08:28.620 --> 01:08:31.359
I was thinking there's probably a very small set

01:08:31.600 --> 01:08:34.799
that are feeling the most pain and are most--

01:08:34.859 --> 01:08:37.759
you could do direct outreach to just the most important

01:08:38.000 --> 01:08:40.660
projects and get that adopted and make a really big difference,

01:08:41.020 --> 01:08:42.720
even if it's not every package.

01:08:43.380 --> 01:08:46.040
But the funny part is that most of the packages that

01:08:46.060 --> 01:08:48.100
would be interested, that we would reach out

01:08:48.220 --> 01:08:49.759
to, are already part of WheelNext.

01:08:51.240 --> 01:08:56.080
Because they, in some way, find the pain really significant

01:08:56.960 --> 01:08:59.020
and are starving for a solution.

01:08:59.560 --> 01:09:00.420
They know.

01:09:00.759 --> 01:09:01.279
They already know.

01:09:02.900 --> 01:09:04.819
All right, let's call it a show, folks.

01:09:05.060 --> 01:09:07.400
Let's-- final call to action.
+

01:09:07.859 --> 01:09:09.120
People out there listening, either they're

01:09:09.420 --> 01:09:11.920
maintainers of packages or they're

01:09:12.359 --> 01:09:14.779
users of these libraries or whatever.

01:09:15.000 --> 01:09:16.279
They got their own open source project.

01:09:19.260 --> 01:09:20.299
They're seeing the light.

01:09:20.370 --> 01:09:21.240
They want to get involved.

01:09:22.020 --> 01:09:22.779
They want to try it out.

01:09:23.290 --> 01:09:23.740
What do you tell them?

01:09:25.680 --> 01:09:30.400
Well, first, it would be great if people were to come on discuss.python.org.

01:09:30.759 --> 01:09:36.640
That's where the community is trying to aggregate to discuss all these different proposals.

01:09:37.549 --> 01:09:41.620
So I think the more people get involved, the better.

01:09:43.660 --> 01:09:54.060
But also trying the different packages that we are trying to publish, that Charlie and his team have been helping us to create, a sort of end-to-end experience.

01:09:54.630 --> 01:10:07.060
I think right now we have examples on Linux, macOS, and Windows. It works on different types of hardware, different types of CPUs, different types of GPUs.
+

01:10:07.680 --> 01:10:11.100
It works pretty broadly, and we wanted to give a sort of a

01:10:12.980 --> 01:10:17.020
sample flavor of what could be a variant-enabled world.

01:10:19.400 --> 01:10:24.500
Yeah, I'd say, for the majority of listeners, they're not going to be packaging tool authors,

01:10:24.700 --> 01:10:30.780
right? So those are the ones you would expect to participate in the review, primarily. But

01:10:31.320 --> 01:10:36.220
if you're a user of any of the packages we mentioned, just try it out. You know, download

01:10:36.240 --> 01:10:42.480
the uv variant-enabled installer. And if you're a package author and, like, we haven't mentioned your

01:10:42.560 --> 01:10:48.600
package, but it will solve a problem for you, like, get in touch, because I think that's maybe the most,

01:10:49.140 --> 01:10:54.460
you know, relevant part here. There's, like, at least, you know, hundreds, maybe thousands, of packages

01:10:54.460 --> 01:11:00.060
that we think we have answers for. But if their solution or their problem statement is

01:11:00.200 --> 01:11:04.360
slightly different, I think now would be a great time to learn and make sure we cover as many use

01:11:04.320 --> 01:11:15.780
cases as possible. Yeah, I mean, I guess the only thing I'd say is, ideally, the average user won't even

01:11:15.790 --> 01:11:20.200
have to think about this, right? And hopefully they just get it through uv or

01:11:20.300 --> 01:11:25.220
through pip or whatever in the long term. That may take time, but that's our goal,

01:11:25.420 --> 01:11:31.180
certainly. Yeah, it's all behind the scenes, they don't know. But certainly, if it solves a problem,

01:11:31.280 --> 01:11:33.260
reach out and be part of it.

01:11:35.220 --> 01:11:35.400
Jonathan,

01:11:35.620 --> 01:11:37.140
Ralph, Charlie, thanks for being on the show.

01:11:37.460 --> 01:11:39.280
It's been great. Keep up the good work.
+ +01:11:39.280 --> 01:11:39.900 +Thanks for having us. + +01:11:41.700 --> 01:11:42.020 +Bye. + +01:11:42.980 --> 01:11:43.240 +Bye. + +01:11:43.260 --> 01:11:43.400 +Bye. + From af46b919d64cc40ff4089ff3fc731ed07fdd1d99 Mon Sep 17 00:00:00 2001 From: Michael Kennedy Date: Wed, 25 Mar 2026 13:52:18 -0700 Subject: [PATCH 08/16] latest transcripts (541, 542) --- ...python-in-rust-for-ai-transcript-final.txt | 1962 +++++++++++ ...python-in-rust-for-ai-transcript-final.vtt | 2944 +++++++++++++++++ ...odern-static-site-generator-transcript.txt | 1140 +++++++ ...odern-static-site-generator-transcript.vtt | 1720 ++++++++++ 4 files changed, 7766 insertions(+) create mode 100644 transcripts/541-monty-python-in-rust-for-ai-transcript-final.txt create mode 100644 transcripts/541-monty-python-in-rust-for-ai-transcript-final.vtt create mode 100644 transcripts/542-zensical-a-modern-static-site-generator-transcript.txt create mode 100644 transcripts/542-zensical-a-modern-static-site-generator-transcript.vtt diff --git a/transcripts/541-monty-python-in-rust-for-ai-transcript-final.txt b/transcripts/541-monty-python-in-rust-for-ai-transcript-final.txt new file mode 100644 index 0000000..e8c561f --- /dev/null +++ b/transcripts/541-monty-python-in-rust-for-ai-transcript-final.txt @@ -0,0 +1,1962 @@ +00:00:00 When LLMs write code to accomplish a task, that code has to actually run somewhere. + +00:00:05 And right now, the options aren't great. + +00:00:07 You can spin up a sandbox container and you're paying the full second of cold start overhead, plus the complexity of another service. + +00:00:15 Let the LLM loose on your actual machine and, well, you better keep an eye on it. + +00:00:20 On this episode, I sit down with Samuel Colvin, the creator of Pydantic, now at 10 billion downloads, to explore Monty, a Python interpreter written from scratch in Rust, purpose-built to run LLM-generated code. 
+

00:00:33 It starts in microseconds, is completely sandboxed by design, and can even serialize its entire state to a database and resume later.

00:00:42 We dig into why this deliberately limited interpreter might be exactly what the AI agent era needs.

00:00:48 This is Talk Python To Me, episode 541, recorded February 17, 2026.

00:00:54 Welcome to Talk Python To Me, the number one Python podcast for developers and data scientists.

00:01:16 This is your host, Michael Kennedy. I'm a PSF fellow who's been coding for over 25 years.

00:01:22 Let's connect on social media.

00:01:24 You'll find me and Talk Python on Mastodon, Bluesky, and X.

00:01:27 The social links are all in your show notes.

00:01:30 You can find over 10 years of past episodes at talkpython.fm.

00:01:33 And if you want to be part of the show, you can join our recording live streams.

00:01:37 That's right. We live stream the raw, uncut version of each episode on YouTube.

00:01:41 Just visit talkpython.fm/youtube to see the schedule of upcoming events.

00:01:46 Be sure to subscribe there and press the bell so you'll get notified anytime we're recording.

00:01:51 This episode is brought to you by our Agentic AI Programming for Python course.

00:01:55 Learn to work with AI that actually understands your code base and build real features.

00:02:00 Visit talkpython.fm/Agentic-AI.

00:02:05 Samuel, welcome back to Talk Python To Me.

00:02:07 Great to have you here, as always.

00:02:08 Thank you so much for having me back. Yeah, it's good to be here.

00:02:11 I saw your project and I immediately sent you a message.

00:02:14 You need to come on the show and talk about this.

00:02:16 What is going on? What is Monty?

00:02:18 Hat tip to the name. I want to hear the origin of the name.

00:02:21 You might be able to guess it.

00:02:22 I think I can guess it. I think I can guess it.

00:02:25 It's awesome to be here talking about this.
+

00:02:28 You've been on a bunch of times, but there's a bunch of new listeners or they don't listen to every show.

00:02:32 Give us your background.

00:02:33 So I'm Samuel and I'm probably best known for creating the Pydantic validation library way back in the annals of time in 2017.

00:02:42 That is kind of an infrastructural bit of Python today.

00:02:45 We just crossed 10 billion downloads in total.

00:02:48 We're at like 580 million downloads a month.

00:02:50 So that gets a lot of usage.

00:02:53 Very lucky that Sequoia Capital came along and invested in Pydantic to start a company at the beginning of 2023.

00:02:59 So now we have a kind of stable of different things we do, what we call the Pydantic stack.

00:03:04 So there's Pydantic validation.

00:03:05 We talked about Pydantic AI, which is an agent framework where Monty kind of fits in best.

00:03:11 And then there's Pydantic Logfire, the observability platform for AI and general observability, which is the commercial bit of what we do.

00:03:19 So I suppose I'm supposed to be CEOing most of the time.

00:03:23 I actually spend far too much of my time Clauding.

00:03:25 I seem to be in good company.

00:03:27 I keep seeing people on Twitter, lots of CEOs of much bigger companies, writing lots of code.

00:03:31 So apparently I'm allowed to again.

00:03:32 It is an insanely exciting time with just the agentic AI in general.

00:03:38 And Claude, you know, Claude Opus, Claude Sonnet in particular, they are so good.

00:03:43 I don't know about you.

00:03:43 I'm sure at least half the people, at least half of the people listening are like, they've got a backlog of ideas they want to try.

00:03:50 Things they've always wanted to build and not the time.

00:03:52 Or maybe it's a bit of a stretch.

00:03:54 Like, I don't really know mobile.

00:03:55 I can't really build a mobile app.

00:03:56 But if I could, I would build this.
+

00:03:58 And now you kind of can, right?

00:04:00 Yeah.

00:04:00 I mean, I think it's got scary bits of it too.

00:04:02 I mean, maybe we're experiencing, like, the bonfire of the thing.

00:04:04 We all, you know, I was speaking to Zach Hatfield-Dodds just before Christmas.

00:04:09 And he was like, we have had this weird time period when the thing I love doing happens to be incredibly financially lucrative.

00:04:15 I mean, he's at Anthropic.

00:04:16 So it's probably more financially lucrative for him than the rest of us.

00:04:19 But hey, and maybe that time is going to come to an end.

00:04:22 But I still feel very privileged to have had that time.

00:04:24 I don't know exactly what's going to go.

00:04:27 I mean, and definitely the jobs of software developers are changing.

00:04:30 And some of that is scary.

00:04:31 But as you say, it's also super exciting, projects from "go build a mobile app," which you didn't know how to do,

00:04:37 but there were others who did, through to building Monty, which I think we were relatively well placed to do as a team of people.

00:04:43 But we would never have had the resources or the time to do it if it wasn't for LLMs being especially good at tasks like that.

00:04:52 Interesting.

00:04:52 Okay.

00:04:53 I do want to dive into that later.

00:04:54 But we haven't even introduced what Monty is yet.

00:04:57 So let's hold off on that deep dive.

00:05:00 But when I saw this, I'm like, I wonder whether agentic coding sort of made this possible for a small team.

00:05:07 You know, like that was certainly one of the thoughts I had.

00:05:10 Yeah.

00:05:11 I mean, I can dive into it.

00:05:12 But yeah, I mean, I've got a bit of help now from David Hewitt, who is a far better Rust developer and knows more of the Python internals than many people.

00:05:22 Well, definitely more than me.
+

00:05:23 But for most of it, it was just me in my spare time building it, which I'll talk about in a bit, like why I think this is such an eligible project for LLM acceleration.

00:05:33 Yeah.

00:05:33 Yeah.

00:05:33 So you're playing both sides of the fence here.

00:05:36 It sounds like both maybe using a little AI, but also building for AI, which I think is quite interesting.

00:05:42 I mean, yeah, we're building Pydantic AI as a way for LLMs to power applications or be part of applications.

00:05:51 We're also using AI to build that more and more.

00:05:55 I think a lot of people's usage of Logfire is through their coding agent, as in sure, people can log into Logfire.

00:06:00 We love our tracing view, et cetera.

00:06:02 But I acknowledge there's a lot of people who are just going to point Claude Code at it and ask it to go and work out what's wrong and fix their bug.

00:06:08 So, yeah, we're in contact with what's going on in LLMs all over the place.

00:06:12 How did you facilitate that?

00:06:14 Like, how can the AI get that information?

00:06:16 We made this weird, esoteric, odd decision back when we first started Logfire not to allow users to write arbitrary SQL against their data.

00:06:25 We did that really because we thought it was too much hard work to build a query builder.

00:06:30 And like SQL seemed like the thing we would want.

00:06:32 And it seemed like a pretty esoteric, odd decision back when we started it in 2023.

00:06:36 Now it is like the most powerful, most defensible thing we have because we've spent two years learning how to build effectively an analytical database that anyone can go and query and run any query against and dealing with all of the side effects of that.

00:06:51 But everyone has an MCP server.

00:06:53 Fine.
+

00:06:53 But what's powerful about Logfire is LLMs are very, very, very good at writing SQL when they have a schema.

00:06:59 And so, you know, you ask it something that no one's ever asked it before, say, find me the five slowest endpoints by P95.

00:07:04 Now that's a reasonable one, but you can imagine some incredibly complex question that no one's ever answered before that no other kind of query builder dialect could do.

00:07:12 But because you have full SQL, you can go and write this.

00:07:15 LLM will write the SQL to give you back the answer.

00:07:17 I want the P95 worst top five there.

00:07:20 For this app, at this endpoint, for the people in Southeast Asia on Tuesday.

00:07:26 Right?

00:07:27 Something like you're like, we've run out of filters, but like SQL just keeps going.

00:07:31 And by the way, group that by hour or group that by every 15 minutes.

00:07:36 And like, you know, it gets arbitrarily more complex.

00:07:38 That just works.

00:07:39 Yeah.

00:07:40 How very interesting.

00:07:41 I just wrote an article about how I think you should work in the native query language if you're using agentic programming.

00:07:50 I saw you write it.

00:07:50 I was like, yeah, yeah, yeah.

00:07:52 Yeah.

00:07:52 And I mean, Pydantic is a perfect fit for that style.

00:07:55 It's like, if you could write your actual queries in native syntax and then transform it to a rich class, like a Pydantic model or a data class or something like that.

00:08:03 These AIs, they are so trained on SQL or MongoDB native query syntax or, you know, whatever vanilla lowest level thing.

00:08:11 They see more of that than anything because it's across all the technologies.

00:08:14 I think that's going to be a thing.

00:08:15 And it's interesting how you sort of set the stage so that was already present for you and your product, right?

00:08:21 Yeah.
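That native-query-into-a-rich-class pattern can be sketched in a few lines. This is a minimal, standard-library-only illustration: sqlite3 for the hand-written SQL and a dataclass standing in for a Pydantic model (with Pydantic you would declare a BaseModel and call `Endpoint(**row)` the same way). The `endpoint_stats` table and its columns are invented for the example, not anything from Logfire.

```python
import sqlite3
from dataclasses import dataclass


# Stand-in for a Pydantic model; with Pydantic this would be
# `class Endpoint(BaseModel): ...` and `Endpoint(**row)` works the same way.
@dataclass
class Endpoint:
    route: str
    p95_ms: float


def slowest_endpoints(conn: sqlite3.Connection, limit: int = 5) -> list[Endpoint]:
    """Hand-written SQL behind a function: full control over columns and plan."""
    conn.row_factory = sqlite3.Row
    rows = conn.execute(
        "SELECT route, p95_ms FROM endpoint_stats ORDER BY p95_ms DESC LIMIT ?",
        (limit,),
    )
    # Star-star each row dict straight into the typed class.
    return [Endpoint(**dict(r)) for r in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE endpoint_stats (route TEXT, p95_ms REAL)")
conn.executemany(
    "INSERT INTO endpoint_stats VALUES (?, ?)",
    [("/api/search", 840.0), ("/api/home", 120.0), ("/api/login", 300.0)],
)
print([e.route for e in slowest_endpoints(conn, limit=2)])  # → ['/api/search', '/api/login']
```

Callers only ever see typed Endpoint objects; the raw SQL stays encapsulated behind the function, which is the "put it behind a function and don't mess with it" point from the conversation.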
+

00:08:22 But even when we started building the Logfire platform, I remember saying, everyone was like, you know, which ORM are we going to use?

00:08:27 We're building a FastAPI app.

00:08:28 So there was some debate about how we do it.

00:08:30 And I was like, let's just write SQL.

00:08:32 And everyone, you know, it seemed like an odd thing to do because, sure, it's like six lines of SQL to do a simple, like, what would be, like, a get in the Django ORM.

00:08:39 But, I mean, I think even before LLMs, people were compelled enough because they were like, yeah, the, like, autocomplete kind of LLM will do a lot of the work for me.

00:08:46 And now I have complete control.

00:08:48 Now, I think where the majority of code is being written by AIs, having full control, full SQL is incredibly useful.

00:08:55 And you can optimize it, right?

00:08:56 You can get only the particular columns that you want.

00:08:58 You can be very careful about which indexes are being used.

00:09:01 You can copy-paste the SQL into whatever and work out the plan.

00:09:05 That's much harder when you're using an ORM.

00:09:07 So, yeah.

00:09:08 Yeah.

00:09:08 And you could just star star the dictionary that comes back right into a Pydantic class.

00:09:13 And then you put that behind a function.

00:09:15 You don't mess with it.

00:09:16 It's safe.

00:09:17 Exactly.

00:09:18 Yeah.

00:09:18 You kind of get the programmer benefits of programming against typed classes and the AI benefits of, it can just talk, like, vanilla, and the performance as well.

00:09:27 All right.

00:09:27 Don't necessarily want to go too far down that rat hole.

00:09:30 We got a different one to go down.

00:09:32 Let's talk about Python interpreters.

00:09:34 So, you built Monty, a specialized Python interpreter written in Rust.
+

00:09:40 And I just want to just do a little historical journey to show, like, for people who don't know, like, this is not the first one of these.

00:09:49 Actually, I'm happy to riff on this, but I'll let you take the lead.

00:09:52 I heard a conversation between two programmers, an exchange between those two, talking about CPython.

00:09:59 They're like, what is CPython?

00:10:01 Is it like Python that compiles to C?

00:10:04 Or, you know?

00:10:05 So, maybe just a little bit of a chat about what the heck is an interpreter?

00:10:09 Yeah, go ahead.

00:10:10 I remember being confused about that, too.

00:10:11 And, you know, in Cython, which I don't think we hear about so much anymore, but that confused me as well.

00:10:15 I remember, yeah.

00:10:16 So, it's interesting that even from as far back as CPython's origination, there was an acknowledgement that there might be other Pythons, and that Python is a language, not an implementation.

00:10:28 But, yeah.

00:10:28 Go ahead.

00:10:29 Yeah.

00:10:29 So, well, we've got the Python interpreter, and we've got Python code we write.

00:10:34 Often, we write, well, Python, the language, but when it executes, it doesn't actually execute in Python.

00:10:41 It might execute because C understands it, and a C compiled thing runs.

00:10:45 Or, in your case, Rust understands the bytecode, right?

00:10:49 So, the interpreter parses our Python into Python bytecodes, which you can get at with the dis module.

00:10:56 You can disassemble it and look at the actual bytecodes you got back.

00:10:59 And then those are sent off to, like, a giant loop that interprets them, hence the term interpreter.

00:11:04 So, we've got CPython.

00:11:06 We have the defunct IronPython for .NET, which made it all the way to 3.4.

00:11:10 We've got the defunct Jython, which made it all the way to 2.7.
+

00:11:14 And we've got the much more exciting and modern Pyodide.

00:11:18 Well, Pyodide is still CPython, so.

00:11:20 Yes.

00:11:21 But compiled for WebAssembly, which I feel, I don't know, I feel like Rust and WebAssembly have this kinship.

00:11:25 So, it's like, I don't know, it feels closer to Rust than the others.

00:11:28 I agree.

00:11:28 There's also RustPython, which is in active development.

00:11:30 I don't know what that's currently pointing at.

00:11:34 There's also GraalPy, which is another Python interpreter.

00:11:40 And the second biggest, really, is PyPy, probably the best-known one of all.

00:11:46 So, without meaning to cause offense to those that are still active, there was also Unladen Swallow, which was another attempt.

00:11:54 And there's a whole, but look, without meaning to cause offense to any of those that are still alive, there was a kind of graveyard of other Python implementations.

00:12:02 And so, I went into this knowing that it's a space where lots of people have tried to build things, put in, bluntly, a great deal more effort than we have.

00:12:09 And for the most part, I wouldn't say they failed, but they haven't got the same kind of adoption that CPython has.

00:12:16 I mean, I think...

00:12:16 Oh, 100%.

00:12:17 CPython is 99.9% of usage of Python.

00:12:21 And my take is that the reason for that is you need almost complete, perfect consistency with CPython to use something else.

00:12:33 Again, you need, like, five nines of perfection, of identical behavior, before you would go and switch in any real application.

00:12:40 I remember trying to use PyPy, and even if I could get it to run, well, it turns out its foreign function interface with, like, asyncpg was slower than CPython's, and so actually it didn't perform as well.

00:12:50 And so, the threshold to switch from CPython to something else or to choose something else was incredibly high.
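The parse-to-bytecode step mentioned a moment ago is easy to see for yourself with the standard library's dis module: disassemble a function and you get the opcodes that CPython's interpreter loop executes one by one. Exact opcode names vary between Python versions, so the comment below only sketches typical output.

```python
import dis


def add(a, b):
    return a + b


# The compile step: Python source becomes a code object full of raw bytecode.
print(add.__code__.co_code)

# dis "disassembles" that bytecode into readable instructions, e.g.
# LOAD_FAST / BINARY_OP (or BINARY_ADD) / RETURN_VALUE, depending on version.
dis.dis(add)

# The interpreter is essentially a big loop dispatching on these opcodes.
opnames = [ins.opname for ins in dis.get_instructions(add)]
print(opnames)
```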
+

00:12:57 And so, we are not trying to build another Python interpreter that you might credibly move your application across.

00:13:04 We're using Python as a syntax for a very specific thing where LLMs write code.

00:13:10 And the fact that we have a different goal is one of the reasons that we thought this was a credible project to take on.

00:13:16 This portion of Talk Python To Me is brought to you by us.

00:13:21 I want to tell you about a course I put together that I'm really proud of, Agentic AI Programming for Python Developers.

00:13:29 I know a lot of you have tried AI coding tools and come away thinking, well, this is more hassle than it's worth.

00:13:35 And honestly, all the vibe coding hype isn't helping.

00:13:39 It's a smokescreen that hides what these tools can actually do.

00:13:42 This course is about agentic engineering, applying real software engineering practices with AI that understands your entire code base, runs your tests, and builds complete features under your direction.

00:13:55 I've used these techniques to ship real production code across Talk Python, Python Bytes, and completely new projects.

00:14:02 I migrated an entire CSS framework on a production site with thousands of lines of HTML in a few hours, twice.

00:14:09 I shipped a new search feature with caching and async in under an hour.

00:14:14 I built a complete CLI tool for Talk Python from scratch, tested, documented, and published to PyPI in an afternoon.

00:14:22 Real projects, real production code, both greenfield and legacy.

00:14:27 No toy demos, no fluff.

00:14:29 I'll show you the guardrails, the planning techniques, and the workflows that turn AI into a genuine engineering partner.

00:14:35 Check it out at talkpython.fm/agentic-engineering.

00:14:39 That's talkpython.fm/agentic-engineering.

00:14:43 The link is in your podcast player's show notes.
+

00:14:45 You know, the real challenge, I think, that I saw with all of those is there are so many different use cases, and it's both a big benefit of all the Python packages and stuff,

00:14:58 but, you know, this package pulls in this compiled thing, and this other one pulls in another compiled thing, and it assumes that the GIL works exactly in this way.

00:15:08 And so there's all these implied behaviors that have to be carried across.

00:15:12 And a lot of these, I think, we're trying to say, let's put those to the side and see if we could build something neater that's more native to Java or .NET or whatever people were after, you know, with those different ones.

00:15:24 But then the compatibility just hit them in the face, right?

00:15:27 We've, like... I haven't actually counted PyPI lately, but we were almost just short of three-quarters of a million, two packages short of three-quarters of a million packages.

00:15:37 We've got to reload this page at the end of the pod.

00:15:39 I'm just going to say, yes, we're going to leave it open.

00:15:42 We're absolutely leaving that open.

00:15:44 But trying to be compatible with that many projects?

00:15:47 We're actually 5,002 short.

00:15:50 Oh, yeah, yeah, okay.

00:15:51 Sorry to be a pedant, but it comes with a gun.

00:15:54 Oh, yeah, yeah, no, you're right.

00:15:55 We're at 744, not 7.

00:15:57 Or 9, yeah.

00:15:59 There's going to be some kind of milestone reached, but it's not the one I was hoping for.

00:16:02 Anyway, the point is there's so many edge cases and so many specializations.

00:16:08 Yeah.

00:16:09 I think that's really where it hit them.

00:16:11 And, you know, maybe this is a good segue to just, you know, if not that, then what are you actually building?

00:16:17 What is this Monty?

00:16:18 So Monty tries to solve this problem where we want to allow, LLMs are very, very good at writing code.
+

00:16:25 We were talking about them writing SQL earlier.

00:16:26 They're very good at writing Python and JavaScript.

00:16:30 I think, honestly, it wouldn't really matter to the implementation whether we were implementing Python or JavaScript.

00:16:37 It just turns out, for a bunch of reasons,

00:16:38 Python is easier, and it's also, like, where we come from.

00:16:43 The simplest use case of Monty is what people call programmatic tool calling or code mode, where instead of my LLM calling tools in a loop,

00:16:54 sometimes using the return value from one tool straight into the next tool, the LLM can just go and write code and thereby be more reliable and much more performant and much lower cost.

00:17:07 So we've seen examples of like, if you, for example, connect Pydantic AI with code mode enabled to GitHub's MCP and you say, go and find the five latest pull requests.

00:17:18 And I forget what the question was, right?

00:17:21 But the point was we have to go jump through their API via MCP and calculate some value.

00:17:27 We've seen tasks go from kind of $2 down to $0.04 as a result of using code mode.

00:17:33 One of the big reasons for that is that those MCP responses are vast.

00:17:38 And so the LLM has to put loads of tokens into context to go and pull out, well, actually, this is just like the ID of the thing I need to make the next request.

00:17:45 I just added an MCP server to Talk Python a few weeks ago so people could ask questions about it and stuff.

00:17:52 And what really surprised me is the actual return type that MCP servers recommend is markdown, not structured data.

00:18:01 So you basically send a giant blob of markdown back as the response.

00:18:05 And then, like you're saying, a bunch of tokens get consumed just trying to understand the response rather than, here's a JSON document.

00:18:11 I know it's called this.

00:18:12 Boom, answer.
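To make the token saving concrete, here's a toy version of the code mode idea: in plain tool calling, the whole JSON blob below would be pasted into the model's context; in code mode, the model writes a couple of lines that surface only the values it actually needs. The response shape and field names are invented for illustration, not GitHub's real API.

```python
import json

# Hypothetical MCP-style tool response; real ones can be far larger.
raw_response = json.dumps({
    "pull_requests": [
        {"id": 101, "title": "Fix CI", "diff": "x" * 5000},
        {"id": 102, "title": "Add docs", "diff": "y" * 5000},
    ]
})

# Code mode: parse the response inside the sandbox and keep only the tiny
# result, instead of sending ~10,000 characters of diff into the context.
data = json.loads(raw_response)
latest_ids = [pr["id"] for pr in data["pull_requests"]]
print(latest_ids)  # → [101, 102]
```

Only `latest_ids` needs to reach the model; the bulky diffs never enter the context window, which is where the $2-down-to-$0.04 style savings come from.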
+

00:18:13 So I think in the case of GitHub's one, they do return JSON, which is useful for us because we can then go parse that JSON.

00:18:18 But also, if you don't need the whole of that response, you can search through it and extract a particular thing you need.

00:18:26 So the conservative threshold for what Monty can do is to allow us to implement this code mode use case.

00:18:35 And I think it works for that for the most part now.

00:18:37 We're working hard on some improvements.

00:18:39 The biggest difference of it versus all of the other Python implementations is it is completely sandboxed.

00:18:46 It is isolated from your machine.

00:18:50 So you can't open a file or read an environment variable unless you very specifically say, here are the environment variables you're passing into this context.

00:18:59 Or here are the pseudo files or indeed real files that I specifically want to expose to this runtime.

00:19:07 That means that obviously reading a file is going to be way less performant than in CPython where we can go and make some syscall to read a file.

00:19:14 We're not doing that.

00:19:15 You're calling back from the Monty runtime to the host runtime, which might be Python or might be JavaScript or Rust, to say, read me this particular file, and then it can choose what to do.

00:19:26 But that is obviously what you want in this scenario where the LLM is writing the code.

00:19:31 So that is the regard in which we are completely different from all of the other Python implementations.

00:19:38 And then there's a few other projects doing similar things, but we're different in that regard from all of the established programming languages, which would all have ways to read files.

00:19:47 Very interesting take.

00:19:47 You know, it might be worth just a quick mention.

00:19:50 There's plenty of people out there listening who have not done agentic, tool-using coding.
+
+00:19:56 So I think understanding just the flow of that is kind of important to understanding the value of this, right?
+
+00:20:02 And you did definitely touch on it, but if you go and ask Claude Code to do something, or Cursor, or whatever, it's constantly like, let me run this GitHub command.
+
+00:20:11 Let me run this Git command.
+
+00:20:12 Let me run this LS command.
+
+00:20:13 Let me run this find.
+
+00:20:15 And periodically it'll just exec Python, like little strings of Python and stuff.
+
+00:20:20 So one of your core ideas is, what if we could give it a better Python that it's encouraged to use for this kind of behavior, right?
+
+00:20:30 Let me describe it in a slightly different way.
+
+00:20:32 Okay, so we have a continuum of how much control and how much flexibility LLMs have.
+
+00:20:37 At one end of the spectrum, we have pure tool calling, where they can basically return JSON with the name of a tool that you're going to call.
+
+00:20:43 And there are agent frameworks like Pydantic AI that allow you to hook that up to functions.
+
+00:20:49 But ultimately, you're just getting JSON back and you're deciding what to do with that.
+
+00:20:53 And you may call the LLM again with some return value.
+
+00:20:55 At the full other end of the spectrum, we have complete computer use.
+
+00:20:58 Some LLM has some vision model and is moving my cursor around on screen to do everything I want.
+
+00:21:04 Typing on our keyboard.
+
+00:21:05 In the middle, we have a bunch of options.
+
+00:21:07 We have Monty, which is kind of near the tool calling end of the spectrum.
+
+00:21:12 Then we have sandboxes like Daytona and E2B and Modal.
+
+00:21:15 And then we have the kind of Claude Code or Codex style of like complete control of your terminal.
+
+00:21:20 And along that spectrum, you get more and more power in terms of like capacity of what the LLM might be able to do and more and more security concerns.
+
+00:21:29 And generally that comes with more and more of having an adult watching what it's going to go and do and controlling it and uncrashing it when it crashes, when it goes and does the wrong thing.
+
+00:21:37 And so for the most part today, when we're using something in the cloud that uses an LLM, it's doing the tool calling end of the spectrum.
+
+00:21:48 That's what the kind of LangChain, LangGraph, Pydantic AI, CrewAI, all those guys are doing.
+
+00:21:54 The LLM is doing very similar things when Claude Code basically decides to go and run ls or run rm -rf.
+
+00:22:01 It's calling a tool, like the bash command, which the Claude application running on your machine chooses to go and execute.
+
+00:22:09 The point is, for the most part, when we're building applications that are going to go and run in the cloud, we don't have a software developer who understands what's going on, sitting, watching every command.
+
+00:22:19 And so we need to be much more constrained in what we're going to allow the LLM to do.
+
+00:22:23 But we want to have a little bit more expressiveness than we do with pure tool calling.
+
+00:22:27 And at the moment, there is basically nothing in the spectrum between tool calling and go and run a sandboxing service and have access to a full sandbox.
+
+00:22:36 And that's powerful.
+
+00:22:37 You can do a bunch of things with it, but often we don't need that stuff.
+
+00:22:40 And that's where Monty is, that's the kind of sweet spot.
+
+00:22:43 Okay.
+
+00:22:43 There's interesting incentives or something that align with this undertaking as well.
+
+00:22:49 For example, if you don't give it a networking stack, it can't do bad things on the network.
+
+00:22:54 Yeah.
+
+00:22:55 Because it just doesn't exist, right?
+
+00:22:56 So it helps you, it inspires you to create a more minimal version of the standard library and so on.
+
+00:23:02 Yeah.
+
+00:23:02 And you can imagine, we will soon have some version of HTTP requests that you can make, but you will be required to go and enable that explicitly.
+
+00:23:11 And even better, because you're calling through the host, you're going to have a perfect point where you can go and read the URL and go, no, you can't make a request to localhost and go and like start snooping on what's going on here.
+
+00:23:22 You have to be making a request to an external URL or whatever else it might be.
+
+00:23:26 Or even I'm going to go and use some third party service to proxy all HTTP requests.
+
+00:23:31 So it is never an untrusted HTTP request inside my network.
+
+00:23:34 But the point is, this is the single biggest difference of Monty: every single place where the code could interact with the real world, it must call an external function.
+
+00:23:46 So call back through the host.
+
+00:23:47 And then the other regard in which it is, I think, somewhat innovative is we are not using traditional callbacks for that.
+
+00:23:54 So we're not giving the runtime a list of pointers to functions it can call on the host.
+
+00:24:00 Instead, the Monty runtime is effectively suspending and returning control to the host whenever you're doing a tool call.
+
+00:24:08 So you're basically getting a response, which is like call the function, read file, with the arguments file name and whatever else it might be.
+
+00:24:15 And that allows a few things, but in particular, it allows us, if that tool we're going to go and run, or that function we're going to go and run, is going to take two days to run, to serialize the Monty runtime, go put that in a database, and shut down our process
+
+00:24:29 and wait for the tool to come back.
+
+00:24:31 And that's something that CPython doesn't offer, understandably, but we are able to build because we built Monty from scratch.
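The control inversion described here, where the runtime suspends and hands a tool-call request back to the host instead of invoking a callback, can be sketched with a plain Python generator. This is only a toy illustration of the idea, not Monty's actual API: the "sandboxed program" yields requests, and the host decides what each one is allowed to do before resuming it.

```python
# Toy sketch of suspend-and-resume tool calling (not Monty's real API).
# The sandboxed code never calls host functions directly: it yields a
# (function_name, args) request and stays suspended until the host
# chooses to resume it with a result.

def sandboxed_program():
    # Stands in for LLM-written code running inside the runtime.
    content = yield ("read_file", {"path": "notes.txt"})
    count = len(content.split())
    yield ("report", {"words": count})

def host_run(program, files):
    """The host drives the loop and applies policy to every request."""
    gen = program()
    request = next(gen)  # run until the first suspension point
    results = []
    try:
        while True:
            name, args = request
            if name == "read_file":
                # Host policy: only explicitly exposed files are readable.
                request = gen.send(files[args["path"]])
            elif name == "report":
                results.append(args)
                request = gen.send(None)
            else:
                raise PermissionError(f"tool {name!r} not allowed")
    except StopIteration:
        return results

print(host_run(sandboxed_program, {"notes.txt": "one two three"}))
# [{'words': 3}]
```

Because the host holds the suspended state between requests, it could in principle park it somewhere and resume much later; Monty takes that further by making the whole interpreter state serializable, which Python generators are not.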
+
+00:24:37 You can serialize the entire interpreter state, go put it into a database and retrieve it later when you want to resume.
+
+00:24:43 That's pretty wild.
+
+00:24:44 So it's got this durability aspect, right?
+
+00:24:46 Yeah, which I think is in these scenarios where often the code execution part of this is going to take milliseconds, but our tools might take minutes or hours or whatever else,
+
+00:24:58 both for durability and to build an application that's both more durable and easier to maintain.
+
+00:25:05 You don't have to have that interpreter state hanging around in memory as you would with CPython.
+
+00:25:12 And all the other things like timeout and just other weird oddities, right?
+
+00:25:18 Like I was working on something on my laptop just yesterday and my wife's like, you ready to go?
+
+00:25:24 I'm like, hold on, I got to wait.
+
+00:25:26 I got to wait for this chat to complete before it's been going for five minutes.
+
+00:25:31 It's almost done.
+
+00:25:31 Just hold on.
+
+00:25:32 And then I can close my laptop and roll, you know, because it would have, who knows what it would have done to it, right?
+
+00:25:37 Yeah.
+
+00:25:37 And talking of timeouts, the other thing that we're able to do in Monty is, look, it's not perfect yet because it's early, but we basically allow you to set resource limits.
+
+00:25:45 So total execution time and memory limit in particular and recursion depth.
+
+00:25:51 And therefore you can run this Monty thing in some small image in the cloud and you can say it's got 10 megabytes and it, you know, once it's hardened, you know, it's early, we have that support now, but I'm not saying there are no ways around it.
+
+00:26:04 It can't go and kill your machine out of memory.
+
+00:26:07 Can't OOM your container.
+
+00:26:09 You're just going to get back a resources error saying too much memory consumed.
+
+00:26:14 Yeah. Very powerful.
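Monty enforces these limits natively in Rust, but the idea of a hard execution budget can be illustrated in plain Python with a trace hook: count executed lines and abort the guest function once it exceeds its allowance. This is only a toy sketch of the concept (`sys.settrace` is a debugging facility, not a security boundary).

```python
import sys

class ResourceError(Exception):
    """Raised when the guest code exceeds its execution budget."""

def run_with_step_limit(fn, max_lines):
    # Toy resource limit: count 'line' trace events and bail out once
    # the budget is spent, so a runaway loop can't hang the host.
    executed = 0

    def tracer(frame, event, arg):
        nonlocal executed
        if event == "line":
            executed += 1
            if executed > max_lines:
                raise ResourceError("execution budget exceeded")
        return tracer

    sys.settrace(tracer)
    try:
        return fn()
    finally:
        sys.settrace(None)  # always restore normal execution

def well_behaved():
    return sum(range(5))

def runaway():
    while True:
        pass

print(run_with_step_limit(well_behaved, 1000))  # 10
try:
    run_with_step_limit(runaway, 1000)
except ResourceError as e:
    print("stopped:", e)
```

A real implementation, like the one described for Monty, would meter wall-clock time, allocated memory, and recursion depth inside the interpreter itself rather than piggybacking on tracing.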
+
+00:26:15 So I see on the GitHub page here, a couple of things.
+
+00:26:17 First of all, it supports Python 3.10, 11, 12, 13, 14; presumably 3.15 will take the place of 3.10 in a year or something.
+
+00:26:25 So that is the support for the... So the Monty runtime is written entirely in Rust.
+
+00:26:31 It has no dependency on CPython or PyO3 or anything else.
+
+00:26:37 It is a pure Rust library.
+
+00:26:38 We're very lucky.
+
+00:26:39 We have the AST parser from Ruff, from the Astral team, which allows us to go from Python code to basically structured objects.
+
+00:26:49 We don't have to go and do that, like parsing the Python code ourselves.
+
+00:26:52 Right.
+
+00:26:52 Because Ruff is already written in Rust.
+
+00:26:55 Like that's, I feel like the Astral team is kind of a peer of yours for sure.
+
+00:26:59 You guys must look at each other, what you all are doing.
+
+00:27:01 Yeah. Yeah.
+
+00:27:02 And, you know, we use that a lot.
+
+00:27:03 And also we have ty built in.
+
+00:27:04 So the ty type checker from Astral is again written in Rust.
+
+00:27:09 And so it is compiled into Monty when you use it.
+
+00:27:12 And so before you run your code, you can go and run type checking at the same time.
+
+00:27:16 And again, that feedback is incredibly useful for LLMs to get them to write reasonably reliable workflows.
+
+00:27:24 But to come back to your question.
+
+00:27:27 So we have Monty itself, which is just Rust, pure Rust, no other C dependencies, just in Rust.
+
+00:27:32 And then we have, and that you can use that as a Rust library directly in your Rust application, if you so wish.
+
+00:27:38 And there are people already doing that, but we then have libraries for Python and for JavaScript, which use, in the case of Python, PyO3, which is amazing.
+
+00:27:47 In the case of JavaScript, a thing called N-API, or maybe you're supposed to pronounce it NAPI.
+
+00:27:52 I don't know.
+
+00:27:54 Which basically means we can go and have JavaScript and Python packages where you can call Monty.
+
+00:27:58 And so slightly confusingly, that Python 3.10 to Python 3.14 is referring to the Python package that you're installing.
+
+00:28:05 The actual Monty is targeting Python 3.14 syntax only.
+
+00:28:09 I see.
+
+00:28:10 But those are the different language features that you support basically for parsing, right?
+
+00:28:15 Something like that.
+
+00:28:15 Yes.
+
+00:28:16 No, no, no.
+
+00:28:16 So that's just like, we only support, so Monty itself will run as if it was 3.14 or, you know, some subset of it.
+
+00:28:22 We don't support all the syntax yet, but like 3.14 type stuff.
+
+00:28:26 But yeah, when you're installing it, when you uv add Pydantic Monty, you can do that in 3.10 through 3.14.
+
+00:28:32 And obviously, because we maintain a bunch of Rust stuff, we've worked hard to have binaries for basically every environment, Python, Linux, macOS, Windows, bunch of different architectures.
+
+00:28:43 And we have PGO builds, which no one else has.
+
+00:28:45 So that should improve performance again.
+
+00:28:47 Yeah.
+
+00:28:47 Yeah.
+
+00:28:47 PGO is... process?
+
+00:28:52 I did.
+
+00:28:53 Yeah.
+
+00:28:53 So we did this first in Pydantic itself, which obviously the core is written in Rust.
+
+00:28:58 And it was, in fact, David Hewitt on our team, who's the PyO3 maintainer, who identified this great technique.
+
+00:29:04 So basically it's part of Rust.
+
+00:29:06 You basically compile the library, and then you run as many different bits of code against it as you can, in our case, all of the unit tests.
+
+00:29:14 And then you basically recompile it with pointers as to which paths in the code, which branches are most common.
+
+00:29:21 And you can get up to like 50% performance improvement.
+
+00:29:23 But the thing is, if you're building your own library, that's a real pain.
+
+00:29:26 If you're building your own application, that's a pain.
+
+00:29:28 If you just uv add Pydantic Monty, you get that stuff for free.
+
+00:29:31 Yeah.
+
+00:29:32 Super cool.
+
+00:29:32 Yeah.
+
+00:29:33 I'm reoriented in my acronyms now, profile-guided optimization, right?
+
+00:29:38 Yes.
+
+00:29:38 So basically compilers, as Python people, we don't necessarily think about them a lot, but compilers have all sorts of optimizations.
+
+00:29:46 And I remember in the late nineties, when I was working with things like GCC and stuff, you could actually break your program by asking for too many optimizations.
+
+00:29:54 You know, you could, it had these levels.
+
+00:29:55 And if you put it on the top level, there's a chance your program like literally might not run, which is a really bizarre thing for compilers to do, but they make decisions.
+
+00:30:04 Like maybe we should inline this so we can avoid a stack jump and setting up the stack and all that.
+
+00:30:09 With PGO, it actually looks at how the code runs and uses that as input for its optimization, which is a super cool idea.
+
+00:30:18 So it's awesome.
+
+00:30:18 You're doing that.
+
+00:30:19 Yeah.
+
+00:30:19 And I honestly don't know what the difference is here.
+
+00:30:21 I think when I tried it, it was relatively minor, but in Pydantic, it makes for a big improvement.
+
+00:30:26 Yeah.
+
+00:30:26 Going back a bit, I don't know if people remember, depending on where they were in their journey, but from Pydantic 1 to 2, you got 50x performance increases.
+
+00:30:36 And yeah, the Pydantic of today is not the Pydantic of 2017, right?
+
+00:30:40 It sure is not.
+
+00:30:41 It sure is not.
+
+00:30:42 And that was, you know, that was an enormous piece of work, the rewrite, because we didn't have LLMs.
+
+00:30:46 I think it would have been a job that would have been a heck of a lot easier if we'd been able to point Opus 4.6 at Pydantic and be like, do this, but in Rust. But hey, we got it done.
+
+00:30:55 And I learned a lot along the way.
+
+00:30:56 That's a challenge that we're going to have to, I don't know how you see it, but I think as an industry and individually, each of us is going to struggle with like, how much Rust did you learn and how much experience and ideas did you get spending that year evolving
+
+00:31:10 Pydantic versus if you just got it knocked out?
+
+00:31:13 Like where's the trade-off?
+
+00:31:14 It's a big double-ended thought.
+
+00:31:14 Like I don't, I know there were those people who were like, now it's impossible to enter as a software engineer.
+
+00:31:18 I've spoken to some people, some really amazing product people who were like, I'm writing code suddenly because I have the right technical mindset.
+
+00:31:24 I just have never had the time to go and learn all this stuff.
+
+00:31:27 And now the LLM can do the rote stuff for me and I can do the innovative product stuff on top.
+
+00:31:32 So I get to build.
+
+00:31:33 So we have new people entering, but you're right.
+
+00:31:35 There are going to be big challenges, because, just as I don't have a clue about assembly, I'm not good at writing it.
+
+00:31:41 And that probably makes me a worse engineer than if I spent the first decade of my career hand-writing assembly.
+
+00:31:47 So as we add layers of abstraction, the layer of abstraction beneath becomes kind of in the shade to most of us.
+
+00:31:55 And we never look at it.
+
+00:31:56 Yeah.
+
+00:31:57 It's very interesting.
+
+00:31:58 I sort of think of this whole agentic coding thing as the change when design patterns became popular.
+
+00:32:05 Instead of talking about, here's how we're going to do the loop or here's how we're going to construct the class.
+
+00:32:08 You just think singleton, flyweight.
+
+00:32:10 And like you're building with these bigger conceptual building blocks.
+
+00:32:14 And now it's kind of like make a login page.
+
+00:32:16 Okay.
+
+00:32:16 We've got the login.
+
+00:32:17 Now what else am I building?
+
+00:32:18 Like you can think almost in components rather than like very small pieces.
+
+00:32:23 Yeah.
+
+00:32:23 I don't know what PyPI does, but like at the next level up.
+
+00:32:26 Yeah.
+
+00:32:26 Yeah.
+
+00:32:27 Kind of.
+
+00:32:27 Yeah.
+
+00:32:28 I do think there's still room for people to come into the industry.
+
+00:32:30 I think it's super exciting.
+
+00:32:31 You still just, I think it's really going to come down to like problem solving and breaking down things into the way you want them to work.
+
+00:32:37 And that's a programmer skill.
+
+00:32:39 I also think what we haven't seen yet is the things that LLMs are bad at.
+
+00:32:43 Because one, if I try to do something with an LLM and it doesn't work, that is not proof that I cannot do it with an LLM.
+
+00:32:49 It's proof it didn't work that particular time.
+
+00:32:51 Whereas if I go and try and do something with an LLM and it does work, well, hey, that's proof it can be done.
+
+00:32:56 And two, no one wants to talk about the thing that failed, right?
+
+00:32:59 So Anthropic announced we built a C compiler in two weeks by giving Opus loads of access.
+
+00:33:06 What they didn't say is we tried to build an eBay clone and it was a complete unmitigated failure.
+
+00:33:11 Cost us what would have been a hundred thousand dollars of inference.
+
+00:33:14 I'm not saying that's happened and no criticism.
+
+00:33:16 Yeah.
+
+00:33:17 We don't hear about the failures both because they're less attractive to state and because they are not clear identifiers as it were in the way that like successes are.
+
+00:33:25 And I think one of the things we will learn over the next few years is like, here are the things LLMs are really, really good at.
+
+00:33:30 And here are the things that no one succeeded with them yet.
+
+00:33:32 And that's probably meaningful.
+
+00:33:34 I don't want to go too deep in this because I want to stay focused on Monty, but I'm also a believer in Jevons paradox.
+
+00:33:39 I think that this is going to create more demand for software.
+
+00:33:43 Now that people see what is possible rather than just like, well, we're going to build exactly the same amount of software with fewer people.
+
+00:33:49 So I think there's a lot there.
+
+00:33:51 So, CodSpeed.
+
+00:33:52 So you have that on.
+
+00:33:53 This is a pretty interesting tool.
+
+00:33:56 I just recently learned about this.
+
+00:33:57 You have this as a badge on your GitHub.
+
+00:33:59 Tell us a quick bit about this.
+
+00:34:01 I'm good friends with Arthur, who was the founder.
+
+00:34:05 I'm a big fan of CodSpeed when you're building performance-critical code.
+
+00:34:09 This is a nice view, but the really powerful thing is, if you go on a pull request, you can see if you're getting performance regressions.
+
+00:34:17 And even better.
+
+00:34:19 So these are the particular benchmarks we have.
+
+00:34:22 So, yeah, maybe you go to branches, or if you go to a pull request in our GitHub.
+
+00:34:28 Oh, if I compare all these, I've compared main against main.
+
+00:34:31 That's not super interesting.
+
+00:34:32 If you go back to the PRs tab.
+
+00:34:38 And if you go, for example, to that data class one, the third one down.
+
+00:34:42 Gotcha.
+
+00:34:42 All right.
+
+00:34:43 Let's check that out.
+
+00:34:43 You'll see we have a comment from CodSpeed saying one benchmark has become more performant.
+
+00:34:47 More importantly, one had a performance regression.
+
+00:34:51 Now CodSpeed would be failing and I'd be like, I need to go fix that before I merge it.
+
+00:34:56 So, as long as we have enough benchmarks, we can't have silent regressions in performance.
+
+00:35:02 And even more powerful.
+
+00:35:03 If I go click on that particular one, if you click on the pair tuples one, perhaps.
+
+00:35:10 Yeah.
+
+00:35:10 Yeah.
+
+00:35:11 What you will see is we can now go and see the flame graph of exactly what's taking what time and where the performance changes have come from.
+
+00:35:19 This change is very minor.
+
+00:35:21 So it's not very interesting, but you can imagine if you accidentally do something slow in your code, this is Rust, but that'll work on Python as well.
+
+00:35:27 You would have this flame chart showing you where the performance has changed.
+
+00:35:30 Yeah.
+
+00:35:31 If people who are listening just go to the Monty GitHub repo, go to any pull request, pull it down.
+
+00:35:36 And there's just a comment from the CodSpeed bot.
+
+00:35:39 And it says the improvement changed from 97.7 milliseconds to 88.1 milliseconds.
+
+00:35:45 That's a 10.95% increase in performance.
+
+00:35:47 So, hey, this thing doesn't hurt performance, right?
+
+00:35:50 By adding it.
+
+00:35:51 Yeah.
+
+00:35:51 What's even cooler is under the hood.
+
+00:35:52 They're using, oh, I'm having a blank on the name, but they're not even measuring time, they're measuring CPU instructions.
+
+00:36:00 Okay.
+
+00:36:00 Yeah.
+
+00:36:01 So it can run in a noisy environment, like GitHub Actions, and you can still get pretty good accuracy on detecting performance changes.
+
+00:36:09 Valgrind.
+
+00:36:09 There we are.
+
+00:36:10 Valgrind is the underlying tool that, at the compiler level, is looking at the number of CPU instructions.
+
+00:36:15 See what this pulls up.
+
+00:36:16 Well, cool.
+
+00:36:18 I don't know what that's about, but there's a polygonal polygon.
+
+00:36:23 No, well, I don't know what this is, a cartoon, but there's also the app.
+
+00:36:27 Yeah.
+
+00:36:28 Oh, that's its logo.
+
+00:36:30 Okay.
+
+00:36:30 I got it.
+
+00:36:30 That's, like, at least its hero image or something.
+
+00:36:34 Yeah.
+
+00:36:35 Yeah.
+
+00:36:35 So, it's maybe a good segue then into performance, where the aim of Monty is not to build something faster than CPython.
+
+00:36:42 The aim, I suppose, is to build something that is not like heinously slower.
+
+00:36:47 Performance seems to vary from about five times better to five times worse.
+
+00:36:52 In most cases, I'm sure that there are edge cases we need to go and improve where it's worse than that, but like, that's what I seem to see.
+
+00:36:58 I mean, in my impression of the kind of LLM-written code that we're mostly talking about, performance is not critical.
+
+00:37:04 Execution is going to be in the matter of single digit milliseconds.
+
+00:37:08 And that's not going to matter when the LLM requests are taking seconds.
+
+00:37:11 The thing where Monty really excels.
+
+00:37:13 So if you scroll down a bit and I can talk you through the table, it's like near the bottom of the README.
+
+00:37:19 But yeah, there we are.
+
+00:37:22 So like the startup time here measured for Monty to go from basically code to a result.
+
+00:37:27 I think the code here is like one plus one: 0.06 milliseconds.
+
+00:37:34 So that's 60 microseconds.
+
+00:37:37 And actually, in a hot loop in benchmarks, we see one plus one going from code to result in Monty taking about 900 nanoseconds.
+
+00:37:45 So under a microsecond, again, that's microseconds, not milliseconds or seconds.
+
+00:37:51 When you compare that to running something in Docker, which is taking, in my example here, 195 milliseconds... Pyodide is an awesome project.
+
+00:38:01 Big fan of the team. It allows you to run Python in the browser, but it wasn't designed for this use case. Going from zero to getting a result in Pyodide is 2.8 seconds.
+
+00:38:13 Starlark is a special case, another project a bit like Monty, but a bit more limited.
+
+00:38:19 But sandboxing, I was talking earlier about that being one of the main options, like basically spin up a new container somewhere.
+
+00:38:25 There's a bunch of services that will do that.
+
+00:38:26 They're very popular at the moment. From scratch to creating a new container and getting a result
+
+00:38:30 here is taking over a second.
+
+00:38:32 So where Monty really excels is where you have a relatively small amount of Python code to call.
+
+00:38:38 And the overhead of running it is basically, in realistic terms, zero.
+
+00:38:43 It's the cold start over and over and over again.
+
+00:38:46 Because these are all one-shot commands, like the LLM asks for this thing and it shuts down when it gets the answer.
+
+00:38:51 Right.
+
+00:38:52 Yeah.
+
+00:38:52 And I'm sure that if you ask the sandbox providers, they would be like, yeah, but it's not about cold start.
+
+00:38:57 It's about reusing an existing container.
+
+00:38:59 And that is way faster.
+
+00:39:01 I agree, you know, they are impressive pieces of technology, but there are also lots of cases where I do want cold start.
+
+00:39:07 I've spoken to the big LLM providers who are interested in Monty, because if you go and ask ChatGPT effectively some arithmetic, or like how many days between these two dates, in the background they're running Python code
+
+00:39:21 to do that calculation.
+
+00:39:22 They're obviously very security conscious.
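The "how many days between these two dates" example is exactly the kind of tiny, one-shot program this pattern produces: code that runs in microseconds and then shuts down. A minimal sketch, with arbitrary made-up dates:

```python
from datetime import date

# The kind of throwaway one-shot script an LLM generates for
# "how many days between these two dates?" (dates are arbitrary).
start = date(2025, 6, 6)
end = date(2026, 2, 10)
print((end - start).days)  # 249
```

Running that in a fresh container costs roughly a second of cold start; running it in an embedded interpreter costs microseconds, which is the gap the startup-time table is making.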
+
+00:39:24 They can't just go run that Python code YOLO on whatever host.
+
+00:39:27 So they're actually often using external sandboxing services.
+
+00:39:30 And there, they're paying the second of overhead for that, where they do need a new container, but also, you know, they're paying the organizational complexity of another provider.
+
+00:39:41 They're paying the fee of running that.
+
+00:39:43 Whereas Monty would allow you to do that kind of thing right there in the process.
+
+00:39:47 That is something that's really interesting about how these LLMs are like bad at math, you know, just add up these numbers and it might not get it right.
+
+00:39:53 And so, like you said, they've started to go, okay, I'm going to write some bit of code that I know how to write really well and can verify.
+
+00:40:00 And then I'll just apply this data set to it.
+
+00:40:02 Right.
+
+00:40:03 Like you'll see it doing, you know, CSV types of things with Python and all sorts of stuff.
+
+00:40:08 And so that's a really good place where Monty could be the foundation of it.
+
+00:40:12 Right.
+
+00:40:13 Yeah, exactly.
+
+00:40:13 And, you know, the other nice thing about that is if you have the Python code and something does go wrong, you're not having to like kind of guess at what's going on inside the black box of the LLM.
+
+00:40:23 Well, I suppose you are at some level, but you have the code, which is kind of the intermediate step where you can go and verify.
+
+00:40:28 Yep.
+
+00:40:28 That code makes sense.
+
+00:40:29 I mean, not saying everyone will do that, but as a developer debugging it, or as a data scientist trying to work out whether or not it is likely to have got the right result, I have the kind of intermediate representation of the logic that I can go and review.
+
+00:40:41 And so it's that much easier to debug.
+
+00:40:43 So let's talk about some of the columns, partial language completeness.
+
+00:40:47 I'm not saying it needs to be completely complete, but, you know, what does it need?
+
+00:40:53 You know, for example, do you need really dynamic metaclass programming for your tool use?
+
+00:40:58 Probably not.
+
+00:40:58 Right.
+
+00:40:59 Right.
+
+00:40:59 So probably not.
+
+00:41:00 So at the moment... what does it need?
+
+00:41:02 Yeah.
+
+00:41:02 So the things we miss right now, I'll start with the downside.
+
+00:41:05 The things we miss right now are classes, context managers.
+
+00:41:10 So with statements, and match expressions, which are obviously relatively new.
+
+00:41:16 I think classes are by far the most complex of those.
+
+00:41:18 We will support them at some point.
+
+00:41:20 They're somewhat complex to get right.
+
+00:41:22 I have been amazed by how much LLMs just don't need classes to do most of the stuff they're doing.
+
+00:41:27 Like, so you could pass a data class into Monty and you will have some object where you can access attributes.
+
+00:41:32 And, as of later today, access methods on that data class.
+
+00:41:35 But what you can't do is like define a class or a data class in the Monty code itself.
+
+00:41:40 I'm amazed at how often that that's just not necessary.
+
+00:41:43 Context managers will mostly be nice because we can allow the LLM to write the kind of code it might want to.
+
+00:41:50 So let's say we allow open. At the moment, the open built-in is not provided at all for opening a file.
+
+00:41:55 We have basic support for pathlib via our way of allowing very controlled access to the outside world.
+
+00:42:04 But if we add open, very often LLMs want to write with open, yada, yada.
+
+00:42:08 And we want to be able to support that. Match expressions are neat.
+
+00:42:12 And I think will be more and more common in Python.
+
+00:42:13 And I think we can, you know, full support will be hard, but getting most of it there isn't.
+
+00:42:18 What we will never...
+
+00:42:19 And then the other big part of partial is we don't have the full standard library.
+
+00:42:23 So we have a very, very limited standard library today: some bits of typing, some bits of the sys module, os.environ. There's a PR up from someone to add re, regexes. Date, datetime.
+
+00:42:38 And I think we'll add JSON.
+
+00:42:40 And so those will all be supported.
+
+00:42:42 And to be clear, they will all be implemented in Rust.
+
+00:42:45 So, like, json.loads will be Rust-level performance of loading that thing.
+
+00:42:49 I mean, there's a bit of overhead to creating the Monty object, but very, very fast.
+
+00:42:54 But we're never going to go and support the whole standard library.
+
+00:42:58 It'll be on a case-by-case basis: do LLMs actually need this thing? Then we can go and add them.
+
+00:43:02 I will say, and I know we're going to talk about this at some point, but, like, it is amazing; this project is only made possible by LLMs. And not that we're ever aiming for the full standard library, but adding support for certain modules of the standard library is a heck of a lot easier when, again,
+
+00:43:17 we have a perfect record of what it's supposed to do.
+
+00:43:19 So we can go and ask the LLM to build that.
+
+00:43:22 And then, for tests, CPython has a ton of tests.
+
+00:43:26 You can extract out the bits that apply to that, maybe.
+
+00:43:29 And just, well, does it run here?
+
+00:43:31 I'll come on to, like, the three reasons why I think this is possible with LLMs.
+
+00:43:35 Let me just... the last point I was going to make is what we will never support, or I think never support, is third-party libraries.
+
+00:43:41 So you'll never be able to pip install Pydantic or FastAPI or requests inside Monty.
+
+00:43:48 And the reason for that is we would need to support the CPython ABI and basically support full CPython.
+
+00:43:55 And if you're going to do that, you're basically back to CPython.
+
+00:43:58 And so, sure, there are ways of sandboxing CPython, most of which are demonstrated here.
+
+00:44:02 That's not the aim of this project.
+
+00:44:03 However, what we can allow you to do is basically have a shim where you expose, let's say, HTTPX, get and post methods and patch and whatever you need, through to Monty.
+
+00:44:13 And we're currently working out whether or not we basically provide those shims as part of the library.
+
+00:44:21 So you don't need to go and think about that.
+
+00:44:22 You can be like, yes, give it HTTP access, or yes, give it access to DuckDB's SQL engine, or give it access to Beautiful Soup.
+
+00:44:32 And that shim comes and you don't need to go and implement it.
+
+00:44:34 So you can whitelist in, like, super critical libraries that people are like, if I had this, I could really do stuff.
+
+00:44:42 So one of the questions we have now, that we need to probably go run evals on to find out, is if we come up with a very Pythonic, type-safe example of, let's say, an HTTP library, and we give those types to the LLM,
+
+00:44:54 does it do better or worse with that than just being told you can use requests?
+
+00:44:58 And I don't know the answer.
+
+00:44:59 There are genuine arguments in both cases.
+
+00:45:02 Some people seem to be very sure one or the other is right.
+
+00:45:04 I just don't know.
+
+00:45:05 And that's the kind of thing where we need to go and run evals and work out what an LLM will find easiest.
+
+00:45:10 But yeah, we can either kind of attempt to fake the existing library's API, warts and all, or we can go
+
+00:45:18 and in many cases just say, oh, we've got this new fetch library that has a fetch method and here's its signature.
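The shim idea, exposing one vetted function instead of a whole third-party library, can be sketched in plain Python. To be clear, this is only an illustration of the interface shape: `exec()` with trimmed globals is NOT a real sandbox (closing that gap is Monty's whole point), and the `fetch` function here is a hypothetical stand-in, not any real library API.

```python
# Toy illustration of the shim idea: instead of the full requests
# library, the host exposes one vetted function to generated code.
# NOTE: exec() with restricted globals is not a real security
# boundary -- this only shows the whitelisting interface shape.

def fetch(url: str) -> str:
    # Host-side policy check before any real I/O would happen.
    if url.startswith("http://localhost") or url.startswith("http://127."):
        raise PermissionError("internal addresses are blocked")
    return f"<response from {url}>"  # stubbed-out network call

# Stands in for LLM-written code: it only sees what we hand it.
llm_code = """
body = fetch("https://example.com/api")
result = body.upper()
"""

namespace = {"__builtins__": {}, "fetch": fetch}
exec(llm_code, namespace)
print(namespace["result"])  # <RESPONSE FROM HTTPS://EXAMPLE.COM/API>
```

In Monty's model the same call would instead suspend the runtime and surface a `fetch` request to the host, giving the host a single choke point for URL policy, proxying, or logging.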
+
+00:45:24 And I suspect the LLM will do, do a pretty good job of it.
+
+00:45:26 So one of the weird new, not quite typo-squatting, but kind of typo-squatting supply chain type of issues is, at least in the earlier days of LLMs, when you would ask it to write code, sometimes it would say,
+
+00:45:39 we're going to import some library, and that library didn't exist.
+
+00:45:42 And then it imagined a bunch of code that happened after it.
+
+00:45:45 So people would go and find popular ones of those and then register malicious packages that the LLMs had hallucinated.
+
+00:45:53 Right.
+
+00:45:54 But I guess you probably kind of, you kind of got to do a similar analysis, but not for evil, where you say, like, well, if I just ask Claude or, or Codex or whatever to do a thing, what is it?
+
+00:46:06 What does it try to do?
+
+00:46:07 If you see it always asking for requests, like, maybe it's just better that we, we lie to it and say, okay, whenever it says import requests, we give it our special way to just get stuff off the internet.
+
+00:46:17 And it only really needs get and put and, like, a couple of... it doesn't need all of requests.
+
+00:46:21 It just needs very basic behaviors.
+
+00:46:23 Yeah.
+
+00:46:23 Is that the kind of stuff you're thinking?
+
+00:46:25 Yeah, exactly that.
+
+00:46:25 And that's one of the reasons we didn't start with Starlark, which is, I think, originally a Meta/Facebook project to have, like, a basically isolated Python runtime, was because Starlark has a very
+
+00:46:39 disciplined and principled approach to what it supports and what it doesn't.
+
+00:46:43 We have to be not principled.
+
+00:46:44 We have to be like, well, if the LLM wants to write this thing, we're going to go and implement the csv module, but not the tomllib module.
+
+00:46:51 Cause that's just what they need to go and use.
+
+00:46:53 And we're going to be like, our principle is give the LLM what it wants, not here's our rule.
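The "lie to it" idea can be sketched with a restricted `exec` namespace: when the generated code says `requests.get(...)`, it actually gets a tiny stand-in object that only allows the behaviors the host approves. This is purely illustrative of the concept; it is not how Monty implements its shims:

```python
# Sketch: hand LLM-generated code a fake "requests" with only a whitelisted get().
import types


def limited_get(url):
    # Host-controlled policy: only these URL prefixes are allowed.
    allowed = ("https://api.example.com/",)
    if not url.startswith(allowed):
        raise PermissionError(f"blocked: {url}")
    # Real code would perform the HTTP call host-side; we return a stub.
    return types.SimpleNamespace(status_code=200, text=f"stub body for {url}")


fake_requests = types.SimpleNamespace(get=limited_get)

# Code as an LLM might write it -- it believes it's using the real library.
llm_code = (
    "resp = requests.get('https://api.example.com/users')\n"
    "print(resp.status_code)\n"
)

ns = {"__builtins__": {"print": print}, "requests": fake_requests}
exec(llm_code, ns)
```

Anything outside the whitelist raises `PermissionError`, so the model's code runs unmodified while the host keeps full control of what actually happens.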
+
+00:46:57 So yes, exactly.
+
+00:47:00 And yeah, I mean, I think Boris, Boris, the Claude Code creator, talked about this.
+
+00:47:06 So I saw him speaking, and he was saying, like, you know, one of the reasons they gave the LLM bash early on was, like, you can tell it to use the mkdir tool to make directories, but half the time it'll just go and call mkdir -p and make the directory that way.
+
+00:47:20 And like, are we going to fight it and always return an error being like, you should do this other thing, or are we just going to make that thing work?
+
+00:47:25 And often you have to just make the thing work.
+
+00:47:27 So, so yeah, go ahead.
+
+00:47:29 Yeah.
+
+00:47:29 Is this useful outside of this AI story?
+
+00:47:34 You know, like if I'm creating something that has really high security, I want to add some, some mechanism for people to write scripting, but not a full-on programming language.
+
+00:47:44 So in other places.
+
+00:47:46 Yeah.
+
+00:47:46 We've actually thought about this internally inside Logfire already.
+
+00:47:49 Like we want to be able to give people a way of basically entering config that can do things.
+
+00:47:54 There's no easy way of doing that right now.
+
+00:47:55 Right?
+
+00:47:55 Sure, I can go and use, again, one of these sandboxing services to run that code, but all the complexity of setting up... we offer self-hosted Logfire.
+
+00:48:02 So they're not going to work, et cetera, et cetera.
+
+00:48:03 Or once Monty is a bit more mature, we can just go and use Monty to let them, like, define the expression that... it might be as simple as, like, what field do we use from your profile to display as your name?
+
+00:48:14 Right.
+
+00:48:15 And we can, we can let you put in, or an AI can write, the, like, one line of code that does that.
+
+00:48:19 And then we can call it lots of times.
+
+00:48:20 Like, it's feasible now to have, like, a few lines of Python code to define this.
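That "one line of Python as config, called lots of times" pattern can be sketched with the standard library alone: compile the user-supplied expression once, then evaluate it repeatedly against data, with builtins stripped so the expression can't reach `open`, imports, and so on. This is only a conceptual sketch, not how Logfire or Monty actually implement it (and `exec`/`eval` alone is not a real sandbox, which is exactly why something like Monty is interesting):

```python
# Sketch: a one-line user/AI-written expression used as configuration.
user_expr = "profile['nickname'] or profile['email']"  # supplied by a user or an AI

# Compile once up front...
compiled = compile(user_expr, "<config>", "eval")


def display_name(profile: dict) -> str:
    # ...then evaluate many times. Empty __builtins__ keeps the expression
    # from calling open(), __import__(), etc. (a mitigation, not a sandbox).
    return eval(compiled, {"__builtins__": {}}, {"profile": profile})


print(display_name({"nickname": "sam", "email": "s@example.com"}))  # sam
print(display_name({"nickname": "", "email": "m@example.com"}))     # m@example.com
```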
+
+00:48:24 That's generally, generally been hard until now.
+
+00:48:27 But of course, you know, the best tools are the ones where people use the tool for not what it was originally designed for.
+
+00:48:34 So someone invents the hammer and thinks it's going to be used for nails.
+
+00:48:36 And then someone else realizes that you can, like, knock out, like, mistakes in the bumper of your car with a hammer.
+
+00:48:43 Right.
+
+00:48:43 And like, of course, what's amazing about Pydantic, why I'm so proud of it, is people have gone and used it as a general-purpose tool for a bunch of things I'd never thought of.
+
+00:48:50 So my, like, dream for Monty is that people come along with things to do with it that I had never heard of.
+
+00:48:55 And like, RLM is a really good example of that.
+
+00:48:57 So recursive language models, this way in which you use, almost always, a Python REPL as a way of implementing effectively an agentic loop.
+
+00:49:05 And there were some people who have an example of doing that and, like, getting better results on the ARC-AGI 2 benchmarks by using RLM.
+
+00:49:13 I didn't even know about RLMs when I announced Monty.
+
+00:49:16 There are now at least four different libraries that are using Monty for RLM, with, with DSPy, because people are super excited about that space.
+
+00:49:25 So that's, that's agentic, but it's definitely something I hadn't thought of when I announced it.
+
+00:49:29 Yeah.
+
+00:49:30 I was even thinking just like, I have a medical device, like a CT scanner.
+
+00:49:34 I want to let people script it, but we can't break it and, like, zap somebody.
+
+00:49:38 Do you know what I mean?
+
+00:49:39 It needs to be really very, very controlled.
+
+00:49:42 This could be a really interesting thing.
+
+00:49:44 So does it compile to WebAssembly?
+
+00:49:46 Can I in-browser it?
+
+00:49:48 Yep.
+
+00:49:48 And in fact, Simon Willison, the day it came out... or Simon Willison... Claude, prompted by Simon Willison, set one up.
+
+00:49:54 So I think if you go to Simon's blog somewhere, there's actually an example of Monty running somewhere, somewhere in a browser that you can, you can go and go and try it.
+
+00:50:03 Probably an earlier version.
+
+00:50:04 Yeah, somewhere here, I think he'll have a link to, to his, his version of it.
+
+00:50:09 So as he pointed out, you can do the really crazy thing, which is you can, you can compile the Python package for... yeah, so this is, this is his example, which is, I think, like, WebAssembly running directly in the browser. But he did something even more crazy, which is he took the Python library,
+
+00:50:24 compiled that to, to Wasm, and then called that from inside Pyodide, which is, like, crazy worlds within worlds.
+
+00:50:31 Definitely not the original plan, but, but interesting.
+
+00:50:34 Yeah.
+
+00:50:34 Wow.
+
+00:50:35 Okay.
+
+00:50:35 So yes.
+
+00:50:36 And here's your example to do it, right?
+
+00:50:38 Yeah.
+
+00:50:39 Yeah.
+
+00:50:39 And I think the other, the other thing we really need to add to this table, in terms of, of latency and complexity, is calling back to the host.
+
+00:50:45 So one of the reasons a number of people have reached out to me and are excited about this is, sure, they're happy to have a sandboxing service.
+
+00:50:51 They don't even mind the second of, of start time. But like, if they want to, for example, build an agent that can go and basically run SQL against a bunch of CSV files, how do I get those CSV files into the sandbox?
+
+00:51:03 Well, that is painful and often slow, because we have to make a full network round trip back to the host to get those files.
+
+00:51:09 The, the network latency... sorry, the overhead of calling a function on the host in Monty is single-digit milliseconds or maybe even less.
+
+00:51:17 And so if you're making, if you're reading 50 different files from the, from, from the local... yeah, from within the sandbox, but effectively they're registered locally, that's super easy and performant, because it's running right there in the same process.
+
+00:51:30 Very neat.
+
+00:51:30 So a couple of questions.
+
+00:51:32 Bonita says we have agents running on AWS Strands.
+
+00:51:36 Here's the crazy thing about AWS.
+
+00:51:37 There's, like, so many services.
+
+00:51:39 I don't even know what Strands is.
+
+00:51:40 Yeah.
+
+00:51:40 But amazing.
+
+00:51:41 I think Strands is their agent framework is my, my, my guess.
+
+00:51:45 Yeah.
+
+00:51:45 Yeah.
+
+00:51:45 Will the use of Monty help us improve performance there?
+
+00:51:49 Could they use Monty?
+
+00:51:50 Yes, it should be able to.
+
+00:51:51 I'm again, again, apologies if I don't know exactly what Strands is.
+
+00:51:54 If Strands is their agent framework.
+
+00:51:56 Yes.
+
+00:51:57 In principle, Pydantic AI, our agent framework, will have support for Monty as a code execution environment later this week.
+
+00:52:05 And so you'll be able to basically, instead of running... yes, open source agents SDK.
+
+00:52:11 So I don't know whether AWS intend to add specific support for Monty, but I know our agent framework will support it later this week.
+
+00:52:18 My guess from, from what we've built in the past is others will pick up on it and also integrate it into, into their things.
+
+00:52:24 And of course, the nice thing is here, because the only real requirement is Rust.
+
+00:52:28 We already have the Python package and JavaScript package, but if you wanted to call it from, from any other language where you can call Rust, that should be possible.
+
+00:52:36 And data science, you mentioned DuckDB already.
+
+00:52:39 Sort of.
+
+00:52:39 Yeah.
+
+00:52:40 NumPy would be, would be great to have, I think, full... I mean, I think, like, this is where we need to be a bit careful about what we add.
+
+00:52:47 Like, sure.
+
+00:52:47 If there are particular bits of, of NumPy that are useful, can we go and add shims for that?
+
+00:52:51 Or can we even go and implement that in Rust?
+
+00:52:53 So you can do a, like, NumPy matrix transformation that happens effectively in Rust. But we need to work out what people want. And what we can't do, unfortunately, I'd love to be able to, but we can't do, is just be like, yep, click this button.
+
+00:53:06 And then now we have the full NumPy API available.
+
+00:53:09 That is the, you know, that's the big... I'm not going to say Achilles heel, because I'm super optimistic about Monty, but that's the, you know, the biggest challenge of Monty is, is that we don't just get to use all the libraries.
+
+00:53:18 Okay.
+
+00:53:19 Let me propose a slightly different path.
+
+00:53:21 Yep.
+
+00:53:22 Polars.
+
+00:53:23 Yep.
+
+00:53:23 Plus Narwhals.
+
+00:53:24 What's Narwhals?
+
+00:53:25 Narwhals is a, a facade API, a facade across NumPy, Polars, and a few other things that gives you... like, you can program in either, and it'll talk to one or the other.
+
+00:53:37 So basically you could use Narwhals to talk NumPy, but it translates all the calls over to Polars.
+
+00:53:43 Yeah.
+
+00:53:43 I mean, given that, you know, there's a paradigm shift happening here.
+
+00:53:47 What we, what we're not trying to do is let your existing Python code run in this runtime.
+
+00:53:52 We're trying to give it a context for LLMs to be able to write code.
+
+00:53:56 And so why not?
+
+00:53:57 I mean, Polars is written in Rust.
+
+00:53:58 And exactly.
+
+00:53:59 That's why I said that.
+
+00:54:00 Yeah.
+
+00:54:00 Go and, like, compile Polars into Monty.
+
+00:54:03 And now you have a full, like, very performant data frame library or, you know, analytical database effectively built into it.
+
+00:54:11 And you can... and we have the full Polars API available in, in Monty.
+
+00:54:16 That would be, that would be one option.
+
+00:54:19 Again, I'm going to be a bit restrictive and, you know, any color as long as it's black about what we add. Because I don't think, you know, we don't need... I don't care about your taste of whether you prefer Polars to Pandas or anything else.
+
+00:54:30 I care about what the LLMs find easy to do.
+
+00:54:33 I think the biggest point of proof of that, Samuel, is that it doesn't do Pydantic yet.
+
+00:54:39 Yeah.
+
+00:54:39 If it doesn't do Pydantic, like, okay, you, you're, you're walking the walk.
+
+00:54:44 Yeah.
+
+00:54:44 And, and I, to be clear, I don't think... yeah, am I going to vibe code a whole new Pydantic in Monty?
+
+00:54:50 I don't know whether I'm keen for that yet.
+
+00:54:53 Yeah.
+
+00:54:53 Yes, indeed.
+
+00:54:54 So how do I go about making my AI... like, let's say I'm doing Claude Code, Opus 4.6, some project.
+
+00:55:03 I'm actually not a huge fan of the terminal Claude Code.
+
+00:55:06 I feel like it takes me too far away from the code.
+
+00:55:10 I prefer to kind of have it in the editor, like the extension for, say, Cursor or VS Code, where I can sort of, like, watch the code as it's going and sort of, no, no, no, you're going the wrong way.
+
+00:55:21 Anyway, it doesn't really matter which, how you run it.
+
+00:55:22 Suppose I'm running it somehow.
+
+00:55:25 How do I tell it about Monty?
+
+00:55:27 How does it know what Monty can and can't do?
+
+00:55:29 How do I make it use Monty?
+
+00:55:31 You know what I mean?
+
+00:55:32 You wait a few weeks for us to have skills for Monty and the rest of our stack, and then you install those skills.
+
+00:55:39 It's something we need to do.
+
+00:55:40 And I think that's the number.
+
+00:55:41 We will have proper documentation for Monty as well.
+
+00:55:44 And that will, that will be an important part of it.
+
+00:55:47 That's... yeah, there's a lot to do here.
+
+00:55:49 LLMs can help with some of it, but not, not by any means do all of it.
+
+00:55:52 I mean, at the moment, read the readme and read the issues.
+
+00:55:55 And I'm, I am, like, impressed, surprised, scared by how much people are using Monty already.
+
+00:56:01 How much it's been picked up already.
+
+00:56:02 It's like, what are you doing?
+
+00:56:04 You know what I saw?
+
+00:56:04 I saw your announcement of this on X, actually, is where I saw it.
+
+00:56:08 And I believe, it's been a little while since I saw it, but it said something to the effect of, like, this is way too early, but what the heck, here we go.
+
+00:56:17 Posted the GitHub link, right?
+
+00:56:19 Something to that effect.
+
+00:56:20 And that was, what, last week?
+
+00:56:23 Here we are with 5,000 stars.
+
+00:56:26 Yeah.
+
+00:56:26 Yeah, exactly.
+
+00:56:28 And it shows how many people are, you know, are looking, are interested in this space.
+
+00:56:32 I mean, look, a lot of people would have started thinking, oh, there's going to be a new Python that's just faster
+
+00:56:37 cause it's in Rust, and it's going to do everything better in a way that, like, you might argue, you know, Ruff is, like, wholly better than what came before.
+
+00:56:44 That is not, that's not the aim for Monty.
+
+00:56:46 This is not going to supplant or replace in any way
+
+00:56:48 CPython.
+
+00:56:49 It's a, it's a completely separate thing.
+
+00:56:50 But I think there's also a lot of people who have started this because they're running, they're having a headache running stuff in a, you know, with existing options for sandboxing, and something like this is, is interesting.
+
+00:56:59 There's also, there's another project that's worth calling out from Vercel called Just Bash, which is very similar conceptually.
+
+00:57:07 It's a bash environment written entirely in TypeScript by, by their team.
+
+00:57:11 As I've said, I met them when I was in San Francisco a few weeks ago, and the plan, when I get around to finishing the JavaScript API, is that they will in fact use Monty as the way of calling Python code.
+
+00:57:22 Cause they have some way of calling Python code within this, which I think uses Pyodide at the moment.
+
+00:57:26 And it has some, some overheads and some, some challenges around security.
+
+00:57:33 But yeah, this is very similar in the sense of, like, it's basically vibe coding all of the terminal methods that you might want, and using a bunch of existing unit tests to, to check that they're correct.
+
+00:57:43 Interesting that obviously Vercel is a much, much bigger name than we are, and it hasn't got as much, like, traction early on, at least in terms of GitHub stars, the, you know, the worst of all vanity metrics.
+
+00:57:54 They've been out like two or three times as long as you have; they've got 1,000 stars.
+
+00:57:58 That is, I mean, that's noteworthy, honestly.
+
+00:58:00 Yeah.
+
+00:58:01 And there's another project like this, which has about 20 stars, which I was looking at earlier today, which is this but in Rust completely, which already has support for Monty, which I can't remember the name of right now. But maybe I should find it quickly and call it out,
+
+00:58:13 cause I feel like it deserves it, given that it's a really cool project.
+
+00:58:16 It has, as I say, about 30 stars.
+
+00:58:20 Let me very quickly... excuse me for one minute.
+
+00:58:23 It was one of the replies to my initial announcement.
+
+00:58:27 Sorry.
+
+00:58:28 I will not be very long.
+
+00:58:31 It's called Bashkit.
+
+00:58:33 I put the, put the link here.
+
+00:58:37 This already actually has optional support for using Monty as the, as a Python runtime.
+
+00:58:44 Well, if I was logged into GitHub on my streaming machine, it would have one more star, but I'll do it later.
+
+00:58:49 Fair enough.
+
+00:58:49 Fair enough.
+
+00:58:50 But, but I think what's interesting is all of these three projects, and I've heard of a few others, you know, these are only possible really, or they're only really challenges anyone would take on, with the advantage of, of an AI.
+
+00:59:00 And so, so I was mentioning this earlier.
+
+00:59:02 I think there were three reasons why these things have... why, I'll talk about Monty in particular, why it is possible now when it wasn't before, and why it is something where the, like, speedup from an LLM is even greater than in most, most coding tasks.
+
+00:59:14 One, the LLM knows in its soul, in its weights, the internal implementation, how to go about implementing a bytecode interpreter, or how to implement it.
+
+00:59:25 If I asked most even experienced Python engineers or Rust engineers, how do I write a bytecode interpreter?
+
+00:59:31 They would scratch their head and be like, yeah, I sort of know about this.
+
+00:59:33 I'll put my hand up and say, I didn't know what a bytecode interpreter was or how they worked until I and Claude built one together.
+
+00:59:39 But like, they know exactly how to do it, because they've read 15 different, well, well-trodden implementations.
+
+00:59:44 And it's got a great example.
+
+00:59:45 You can say, not just any... here's the CPython one, just help me do that.
+
+00:59:50 Whatever that does.
+
+00:59:51 And the second thing is, they know what the public interface is, again, in their soul, as in they know what, what Python should be like.
+
+00:59:57 They know the signature of the filter function without you having to go and describe it.
+
+01:00:01 Thirdly, you have an amazing set of unit tests, which is basically just, does it match CPython?
+
+01:00:05 So in our case, we basically vibe generate tests whenever we're, whenever we're adding a feature, and then we run them with CPython and Monty.
+
+01:00:14 And we confirm that they are identical output down to the byte.
+
+01:00:17 You know, the exceptions have to be identical to the, you know, to the byte.
+
+01:00:19 But in the case of Just Bash, they, they have the existing set of, like, some bash tests somewhere for, like, any shell environment that they're able to leverage.
+
+01:00:32 And I think one thing we might do at some point is basically go steal a bunch of CPython tests and run them with both.
+
+01:00:36 I haven't got there yet, but that would be an interesting way ahead.
+
+01:00:39 And then the last thing is, you don't have to bikeshed or have any human debate about what should the, what should the function, what should the error message be when you try and add an int to a string.
+
+01:00:48 There's no, there's no debate about that.
+
+01:00:50 You're just doing whatever CPython does.
+
+01:00:51 And so there's a whole, whole range of bikeshedding debates that we just don't have to go and have, because we're just, like, trying to target CPython.
+
+01:00:59 Now, of course, around the edge of that, there's a bunch of places where we do have to think about it.
+
+01:01:02 Like, how do we do these external function calling things?
+
+01:01:05 And that's, that is obviously, that is honestly much, much slower, because we don't have this, like, "the LLM already knows the answer"
+
+01:01:12 approach. But I think these are the kinds of tasks where LLMs are massively faster, or one, one set of cases where LLMs are massively faster than without.
+
+01:01:21 So I was speaking to a big public company in New York who was saying that one of their team had vibe coded a Redis clone in Rust, put it into production after 72 hours, and it was 30% faster than Redis.
+
+01:01:34 And that probably worked fine, right?
+
+01:01:36 Yeah.
+
+01:01:37 And why is that possible?
+
+01:01:37 Well, the same things are all true.
+
+01:01:39 The unit test is super easy.
+
+01:01:40 It's just, is it the same as Redis?
+
+01:01:41 There's no debate about what the API is, et cetera, et cetera.
+
+01:01:44 And so there are these tasks which historically we would have thought were super hard.
+
+01:01:48 So I think often we fall into the trap of thinking that what LLMs are good at is what humans are good at,
+
+01:01:53 and what LLMs are bad at is what humans are bad at.
+
+01:01:55 I think more and more, we're seeing there are things that LLMs are much better at than we are.
+
+01:01:59 And there are things that they are, that they're less good at.
+
+01:02:01 And we're still very early in learning what those things are, but it is not good enough just to use the, like, naive, simplistic approach of, like, what humans are good at, they're good at.
+
+01:02:10 The simplest example of that is, like, ask an LLM to generate you a B-tree implementation in C.
+
+01:02:15 And with that prompt alone, it will write you 500 lines of C that work as a B-tree implementation.
+
+01:02:20 It takes you 20 minutes to study it, to be sure.
+
+01:02:23 And it's like, you're not... I think it works this way, right?
+
+01:02:26 Yeah.
+
+01:02:26 I honestly think the little, the bits of weird math and the little, the little hallucinations and stuff have shaken a lot of people's trust in these things.
+
+01:02:34 And it's just like, well, I mean, how easy is it to add five numbers?
+
+01:02:38 Come on.
+
+01:02:39 Obviously these things are junk because they can't do that.
+
+01:02:41 And it's just like, well, maybe that's not the tool to use for that situation.
+
+01:02:44 Right.
+
+01:02:44 Yeah.
+
+01:02:45 But, but what you're using here is incredible.
+
+01:02:47 Yeah.
+
+01:02:47 But again, we have the guardrails of, you must write unit tests all the time that match the two.
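The "byte-identical to CPython" guardrail Samuel describes can be sketched as a differential test: run the same snippet under two interpreter commands and require identical stdout, including exception messages. Both commands below are plain CPython as a stand-in; in the real setup one side would be the Monty runtime (whose exact invocation isn't shown in the episode):

```python
# Sketch of differential testing against CPython: same snippet, two
# interpreters, output must match byte for byte (errors included).
import subprocess
import sys

SNIPPET = (
    "print(sorted({'b': 1, 'a': 2}.items()))\n"
    "try:\n"
    "    1 + 'x'\n"
    "except TypeError as e:\n"
    "    print(e)\n"
)


def run(cmd):
    result = subprocess.run(cmd + ["-c", SNIPPET], capture_output=True, text=True)
    return result.stdout


reference = run([sys.executable])  # CPython, the source of truth
candidate = run([sys.executable])  # stand-in; would be the other interpreter
assert reference == candidate, "candidate diverged from CPython"
print("byte-identical output:", repr(reference))
```

Note the snippet deliberately exercises an error path: the `TypeError` message is part of the contract being checked, which is what "exceptions have to be identical to the byte" means in practice.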
+
+01:02:51 I mean, well, or we have fuzzing going on.
+
+01:02:53 The fuzzing is another amazing technique.
+
+01:02:55 So we have a JSON parser called jiter, which is about the fastest JSON parser in Rust, which is built into Pydantic Core, but it's also actually independently a package on PyPI that's used an awful lot.
+
+01:03:09 You'll see it in the dependencies of OpenAI, for example.
+
+01:03:12 But jiter was where I discovered fuzzing, really.
+
+01:03:15 Well, I found out about it through the Hypothesis project of my friend Zach Hatfield-Dodds in Python, but then fuzzing in Rust, because the performance is so much better, is, is really powerful.
+
+01:03:27 So basically it's generating random strings and using them as an input to something, but then it's using very clever stochastic techniques to work out where to try more things.
+
+01:03:35 And so you can basically fuzz Monty; you can just give it arbitrary strings for hour after hour.
+
+01:03:41 And periodically it'll find something where there's an error, where, like, the memory usage is too high
+
+01:03:46 if you do the following sequence of multiplying integers together.
+
+01:03:49 I don't think it will find a, like, true read-the-file-system vulnerability, but it'll definitely find, like, odd memory usage, or it has found stack overflows and panics and things like that.
+
+01:03:59 Well, I think people are excited about it.
+
+01:04:02 It's definitely got a lot of people talking, a lot of attention, a lot of, a lot of comments in the live stream.
+
+01:04:07 So congrats.
+
+01:04:08 And yeah, keep us posted on where it goes.
+
+01:04:11 And we'll do.
+
+01:04:11 Thank you very much.
+
+01:04:12 Yeah.
+
+01:04:12 Thanks so much for having me.
+
+01:04:14 You bet.
+
+01:04:14 Bye.
+
+01:04:15 This has been another episode of Talk Python To Me.
+
+01:04:18 Thank you to our sponsors.
+
+01:04:19 Be sure to check out what they're offering.
+
+01:04:21 It really helps support the show.
+
+01:04:22 This episode is brought to you by our Agentic AI Programming for Python course.
+
+01:04:27 Learn to work with AI that actually understands your code base and build real features.
+
+01:04:32 Visit talkpython.fm/agentic-ai.
+
+01:04:36 If you or your team needs to learn Python, we have over 270 hours of beginner and advanced courses on topics ranging from complete beginners to async code, Flask, Django, HTMX, and even LLMs.
+
+01:04:49 Best of all, there's no subscription in sight.
+
+01:04:52 Browse the catalog at talkpython.fm.
+
+01:04:54 And if you're not already subscribed to the show on your favorite podcast player, what are you waiting for?
+
+01:04:59 Just search for Python in your podcast player.
+
+01:05:01 We should be right at the top.
+
+01:05:02 If you enjoy that geeky rap song, you can download the full track.
+
+01:05:06 The link is actually in your podcast player's show notes.
+
+01:05:08 This is your host, Michael Kennedy.
+
+01:05:10 Thank you so much for listening.
+
+01:05:11 I really appreciate it.
+
+01:05:13 I'll see you next time.
+
+01:05:24 I thought of me.
+
+01:05:26 Get we ready to roll.
+
+01:05:29 Upgrade the code.
+
+01:05:31 No fear of getting old.
+
+01:05:33 We tapped into that modern vibe.
+
+01:05:36 Overcame each storm.
+
+01:05:38 Talk Python To Me.
+
+01:05:40 Async is the norm.
+
diff --git a/transcripts/541-monty-python-in-rust-for-ai-transcript-final.vtt b/transcripts/541-monty-python-in-rust-for-ai-transcript-final.vtt
new file mode 100644
index 0000000..32b0f6f
--- /dev/null
+++ b/transcripts/541-monty-python-in-rust-for-ai-transcript-final.vtt
@@ -0,0 +1,2944 @@
+WEBVTT
+
+00:00:00.000 --> 00:00:04.660
+When LLMs write code to accomplish a task, that code has to actually run somewhere.
+
+00:00:05.180 --> 00:00:07.460
+And right now, the options aren't great.
+
+00:00:07.760 --> 00:00:14.600
+You can spin up a sandbox container and you're paying the full second of cold start overhead, plus the complexity of another service.
+
+00:00:15.060 --> 00:00:19.600
+Let the LLM loose on your actual machine and, well, you better keep an eye on it.
+
+00:00:20.000 --> 00:00:32.960
+On this episode, I sit down with Samuel Colvin, the creator of Pydantic, now at 10 billion downloads, to explore Monty, a Python interpreter written from scratch in Rust, purpose-built to run LLM-generated code.
+
+00:00:33.520 --> 00:00:41.660
+It starts in microseconds, is completely sandboxed by design, and can even serialize its entire state to a database and resume later.
+
+00:00:42.140 --> 00:00:47.900
+We dig into why this deliberately limited interpreter might be exactly what the AI agent era needs.
+
+00:00:48.700 --> 00:00:54.340
+This is Talk Python To Me, episode 541, recorded February 17, 2026.
+
+00:00:54.340 --> 00:01:16.380
+Welcome to Talk Python To Me, the number one Python podcast for developers and data scientists.
+
+00:01:16.380 --> 00:01:22.140
+This is your host, Michael Kennedy. I'm a PSF fellow who's been coding for over 25 years.
+
+00:01:22.680 --> 00:01:23.840
+Let's connect on social media.
+
+00:01:24.140 --> 00:01:27.320
+You'll find me and Talk Python on Mastodon, Bluesky, and X.
+
+00:01:27.500 --> 00:01:29.460
+The social links are all in your show notes.
+
+00:01:30.160 --> 00:01:33.720
+You can find over 10 years of past episodes at talkpython.fm.
+
+00:01:33.800 --> 00:01:37.220
+And if you want to be part of the show, you can join our recording live streams.
+
+00:01:37.380 --> 00:01:41.440
+That's right. We live stream the raw, uncut version of each episode on YouTube.
+
+00:01:41.440 --> 00:01:46.460
+Just visit talkpython.fm/youtube to see the schedule of upcoming events.
+
+00:01:46.640 --> 00:01:50.280
+Be sure to subscribe there and press the bell so you'll get notified anytime we're recording.
+ +00:01:51.000 --> 00:01:55.240 +This episode is brought to you by our Agentic AI Programming for Python course. + +00:01:55.740 --> 00:02:00.320 +Learn to work with AI that actually understands your code base and build real features. + +00:02:00.880 --> 00:02:04.220 +Visit talkpython.fm/Agentic-AI. + +00:02:05.140 --> 00:02:07.320 +Samuel, welcome back to Talk Python To Me. + +00:02:07.500 --> 00:02:08.980 +Great to have you here, as always. + +00:02:08.980 --> 00:02:11.720 +Thank you so much for having me back. Yeah, it's good to be here. + +00:02:11.980 --> 00:02:14.580 +I saw your project and I immediately sent you a message. + +00:02:14.920 --> 00:02:16.440 +You need to come on the show and talk about this. + +00:02:16.520 --> 00:02:18.300 +What is going on? What is Monty? + +00:02:18.920 --> 00:02:21.300 +Hat tip to the name. I want to hear the origin of the name. + +00:02:21.600 --> 00:02:22.480 +You might be able to guess it. + +00:02:22.820 --> 00:02:25.280 +I think I can guess it. I think I can guess it. + +00:02:25.480 --> 00:02:27.260 +It's awesome to be here talking about this. + +00:02:28.180 --> 00:02:32.620 +You've been on a bunch of times, but there's a bunch of new listeners or they don't listen to every show. + +00:02:32.880 --> 00:02:33.480 +Give us your background. + +00:02:33.480 --> 00:02:41.220 +So I'm Samuel and I'm probably best known as creating Pydantic validation library way back in the annals of time in 2017. + +00:02:42.540 --> 00:02:45.360 +That is kind of an infrastructural bit of Python today. + +00:02:45.480 --> 00:02:47.740 +We just crossed 10 billion downloads in total. + +00:02:48.100 --> 00:02:50.620 +We're at like 580 million downloads a month. + +00:02:50.620 --> 00:02:52.600 +So that gets a lot of usage. + +00:02:53.020 --> 00:02:59.160 +Very lucky that Sequoia Capital came along and invested in Pydantic to start a company at the beginning of 2023. 
+ +00:02:59.780 --> 00:03:03.780 +So now we have a kind of stable of different things we do, what we call the Pydantic stack. + +00:03:04.180 --> 00:03:05.660 +So there's Pydantic validation. + +00:03:05.940 --> 00:03:11.220 +We talked about Pydantic AI, which is an agent framework where Monty kind of fits in best. + +00:03:11.680 --> 00:03:19.400 +And then there's Pydantic Logfire, the observability platform for AI and general observability, which is the commercial bit of what we do. + +00:03:19.400 --> 00:03:23.780 +So I suppose I'm supposed to be being CEOing most of the time. + +00:03:23.880 --> 00:03:25.640 +I actually spend far too much of my time clauding. + +00:03:25.880 --> 00:03:27.060 +I seem to be in good company. + +00:03:27.220 --> 00:03:31.080 +I keep seeing people on Twitter, lots of CEOs of much bigger companies and writing lots of code. + +00:03:31.180 --> 00:03:32.360 +So apparently I'm allowed to again. + +00:03:32.760 --> 00:03:38.160 +It is an insanely exciting time with just the agentic AI in general. + +00:03:38.320 --> 00:03:43.060 +And Claude, you know, Claude Opus, Claude Sonnet in particular, they are so good. + +00:03:43.260 --> 00:03:43.940 +I don't know about you. + +00:03:43.940 --> 00:03:50.240 +I'm sure at least half the people, at least half of the people listening are like, they've got a backlog of ideas they want to try. + +00:03:50.380 --> 00:03:52.580 +Things they've always wanted to build and not the time. + +00:03:52.680 --> 00:03:53.980 +Or maybe it's a bit of a stretch. + +00:03:54.120 --> 00:03:55.320 +Like, I don't really know mobile. + +00:03:55.400 --> 00:03:56.580 +I can't really build a mobile app. + +00:03:56.600 --> 00:03:58.140 +But if I could, I would build this. + +00:03:58.460 --> 00:03:59.780 +And now you kind of can, right? + +00:04:00.040 --> 00:04:00.180 +Yeah. + +00:04:00.200 --> 00:04:01.940 +I mean, I think it's got scary bits of it too. 
+
+00:04:02.060 --> 00:04:04.640
+I mean, maybe we're experiencing the like bonfire of the thing.
+
+00:04:04.760 --> 00:04:09.300
+We all, you know, I was speaking to Zach Hatfield-Dodds just before Christmas.
+
+00:04:09.300 --> 00:04:15.240
+And he was like, we have had this weird time period when the thing I love doing happens to be incredibly financially lucrative.
+
+00:04:15.400 --> 00:04:16.440
+I mean, he's Anthropic.
+
+00:04:16.540 --> 00:04:19.020
+So it's probably more financially lucrative for him than the rest of us.
+
+00:04:19.200 --> 00:04:22.460
+But hey, and maybe that time is going to come to an end.
+
+00:04:22.520 --> 00:04:24.420
+But I still feel very privileged to have had that time.
+
+00:04:24.780 --> 00:04:27.000
+I don't know exactly what's going to go.
+
+00:04:27.140 --> 00:04:30.240
+I mean, and definitely the jobs of software developers are changing.
+
+00:04:30.360 --> 00:04:31.240
+And some of that is scary.
+
+00:04:31.240 --> 00:04:37.220
+But as you say, it's also super exciting, projects from go build a mobile app, which you didn't know how to do.
+
+00:04:37.300 --> 00:04:43.420
+But there were others who did through to building Monty, which I think we were relatively well placed to do it as a team of people.
+
+00:04:43.680 --> 00:04:52.020
+But we would never have had the resources or the time to do it if it wasn't for LLMs being especially good at tasks like that.
+
+00:04:52.460 --> 00:04:52.640
+Interesting.
+
+00:04:52.920 --> 00:04:53.140
+Okay.
+
+00:04:53.160 --> 00:04:54.520
+I do want to dive into that later.
+
+00:04:54.820 --> 00:04:57.200
+But we haven't even introduced what Monty is yet.
+
+00:04:57.200 --> 00:05:00.060
+So let's hold off on that deep dive.
+
+00:05:00.220 --> 00:05:07.440
+But when I saw this, I'm like, I wonder what role that agentic coding sort of made this possible for a small team.
+
+00:05:07.620 --> 00:05:10.320
+You know, like that was certainly one of the thoughts I had.
+
+00:05:10.680 --> 00:05:10.820
+Yeah.
+
+00:05:11.080 --> 00:05:12.600
+I mean, I can dive into it.
+
+00:05:12.760 --> 00:05:22.040
+But yeah, I mean, I've got a bit of help now from David Hewitt, who is a great deal better Rust developer and knows more of the Python internals than many people.
+
+00:05:22.280 --> 00:05:23.320
+Well, definitely more than me.
+
+00:05:23.320 --> 00:05:32.920
+But for the most part it was just me in my spare time building it, which I'll talk in a bit about like why I think this is such an eligible project for LLM acceleration.
+
+00:05:33.340 --> 00:05:33.480
+Yeah.
+
+00:05:33.720 --> 00:05:33.900
+Yeah.
+
+00:05:33.960 --> 00:05:35.920
+So you're playing both sides of the fence here.
+
+00:05:36.120 --> 00:05:42.080
+It sounds like both maybe using a little AI, but also building for AI, which I think is quite interesting.
+
+00:05:42.080 --> 00:05:51.880
+Yeah, I think we're I mean, yeah, we're doing we're building Pydantic AI as a way for LLMs to power applications or be part of applications.
+
+00:05:51.880 --> 00:05:55.280
+We're also using AI to build that more and more.
+
+00:05:55.440 --> 00:06:00.580
+I think a lot of people's usage of Logfire is through their coding agent, as in sure, people can log into Logfire.
+
+00:06:00.740 --> 00:06:02.160
+We love our tracing view, et cetera.
+
+00:06:02.240 --> 00:06:08.880
+But I acknowledge there's a lot of people who are just going to point Claude Code at it and ask it to go and work out what's wrong and fix their bug.
+
+00:06:08.980 --> 00:06:12.700
+So, yeah, we come into contact with what's going on in LLMs all over the place.
+
+00:06:12.700 --> 00:06:14.240
+How did you facilitate that?
+
+00:06:14.480 --> 00:06:16.420
+Like, how can the AI get that information?
+
+00:06:16.600 --> 00:06:25.600
+We made this weird, esoteric, odd decision back when we first started Logfire to allow users to write arbitrary SQL against their data.
+
+00:06:25.800 --> 00:06:30.060
+We did that really because we thought it was too much hard work to build a query builder.
+
+00:06:30.520 --> 00:06:32.360
+And like SQL seemed like the thing we would want.
+
+00:06:32.480 --> 00:06:36.480
+And it seemed like a pretty esoteric, odd decision back when we started it in 2023.
+
+00:06:36.480 --> 00:06:51.360
+Now it is like the most powerful, most defensible thing we have because we've spent two years learning how to build effectively an analytical database that anyone can go and query and run any query against and dealing with all of the side effects of that.
+
+00:06:51.620 --> 00:06:53.240
+But everyone has an MCP server.
+
+00:06:53.340 --> 00:06:53.520
+Fine.
+
+00:06:53.580 --> 00:06:58.640
+But what's powerful about Logfire is LLMs are very, very, very good at writing SQL when they have a schema.
+
+00:06:59.120 --> 00:07:04.760
+And so, you know, you ask it something that no one's ever asked it before, say, find me the five slowest endpoints by P95.
+
+00:07:04.760 --> 00:07:12.540
+Now that's a reasonable one, but you can imagine some incredibly complex question that no one's ever answered before that no other kind of query builder dialect could do.
+
+00:07:12.700 --> 00:07:14.800
+But because you have full SQL, you can go and write this.
+
+00:07:15.180 --> 00:07:16.900
+LLM will write the SQL to give you back the answer.
+
+00:07:17.240 --> 00:07:20.580
+I want the P95 worst top five there.
+
+00:07:20.900 --> 00:07:26.480
+For this app, at this endpoint, for the people in Southeast Asia on Tuesday.
+
+00:07:26.940 --> 00:07:27.100
+Right?
+
+00:07:27.200 --> 00:07:31.420
+Something like you're like, we've run out of filters, but like SQL just keeps going.
+
+00:07:31.420 --> 00:07:36.400
+And by the way, group that by hour or group that by every 15 minutes.
+
+00:07:36.520 --> 00:07:38.420
+And like, you know, it gets arbitrarily more complex.
+
+00:07:38.560 --> 00:07:39.500
+That just works.
+
+00:07:39.820 --> 00:07:40.080
+Yeah.
+
+00:07:40.220 --> 00:07:41.220
+How very interesting.
+
+00:07:41.520 --> 00:07:49.680
+I just wrote an article about how I think working in the native query language, if you're using agentic programming.
+
+00:07:50.040 --> 00:07:50.680
+I saw you write it.
+
+00:07:50.780 --> 00:07:52.440
+I was like, yeah, yeah, yeah.
+
+00:07:52.440 --> 00:07:52.520
+Yeah.
+
+00:07:52.740 --> 00:07:55.500
+And I mean, Pydantic is a perfect fit for that style.
+
+00:07:55.580 --> 00:08:03.180
+It's like, if you could write your actual queries in native syntax and then transform it to a rich class, like a Pydantic or a data class or something like that.
+
+00:08:03.380 --> 00:08:10.960
+These AIs, they are so trained on SQL or MongoDB native query syntax or, you know, whatever vanilla lowest level thing.
+
+00:08:11.040 --> 00:08:13.980
+They see more of that than anything because it's across all the technologies.
+
+00:08:14.180 --> 00:08:15.380
+I think that's going to be a thing.
+
+00:08:15.380 --> 00:08:21.680
+And it's interesting how you sort of set the stage so that was already present for you and your product, right?
+
+00:08:21.980 --> 00:08:22.120
+Yeah.
+
+00:08:22.200 --> 00:08:27.520
+But even when we started building the Logfire platform, I remember saying, everyone was like, you know, which ORM are we going to use?
+
+00:08:27.620 --> 00:08:28.660
+We're building a FastAPI.
+
+00:08:28.860 --> 00:08:30.740
+So there was some debate about how we do it.
+
+00:08:30.740 --> 00:08:32.080
+And I was like, let's just write SQL.
+
+00:08:32.420 --> 00:08:39.380
+And everyone, you know, it seemed like an odd thing to do because, sure, it's like six lines of SQL to do a simple, like, what would be a like get in Django ORM.
+ +00:08:39.380 --> 00:08:46.580 +But, I mean, I think even before LLMs, people were compelled enough because they were like, yeah, the like autocomplete kind of LLM will do a lot of the work for me. + +00:08:46.660 --> 00:08:48.520 +And now I have complete control. + +00:08:48.720 --> 00:08:55.200 +Now, I think where the majority of code is being written by AIs, having full control, full SQL is incredibly useful. + +00:08:55.200 --> 00:08:56.460 +And you can optimize it, right? + +00:08:56.480 --> 00:08:58.360 +You can only get the particular column that you want. + +00:08:58.440 --> 00:09:01.260 +You can be very careful about which indexes are being used. + +00:09:01.320 --> 00:09:05.320 +You can copy paste the SQL into whatever and work out the plan. + +00:09:05.500 --> 00:09:07.380 +That's much harder when you're using an ORM. + +00:09:07.600 --> 00:09:08.000 +So, yeah. + +00:09:08.260 --> 00:09:08.400 +Yeah. + +00:09:08.400 --> 00:09:13.200 +And you could just star star the dictionary that comes back right into a Pydantic class. + +00:09:13.460 --> 00:09:15.480 +And then you put that behind a function. + +00:09:15.600 --> 00:09:16.380 +You don't mess with it. + +00:09:16.460 --> 00:09:16.940 +It's safe. + +00:09:17.460 --> 00:09:17.760 +Exactly. + +00:09:18.160 --> 00:09:18.300 +Yeah. + +00:09:18.320 --> 00:09:27.200 +You kind of get the programmer benefits of programming against typed classes and the AI benefits of it can just talk like vanilla and the performance as well. + +00:09:27.380 --> 00:09:27.660 +All right. + +00:09:27.800 --> 00:09:30.240 +Don't necessarily want to go too far down that rat hole. + +00:09:30.340 --> 00:09:31.660 +We got a different one to go down. + +00:09:32.040 --> 00:09:34.560 +Let's talk about Python interpreters. + +00:09:34.560 --> 00:09:40.220 +So, you built Monty, a specialized Python interpreter written in Rust. 
+
+00:09:40.220 --> 00:09:48.960
+And I just want to just do a little historical journey to show, like, for people who don't know, like, this is not the first one of these.
+
+00:09:49.020 --> 00:09:52.540
+Actually, I'm happy to riff on this, but I'll let you take the lead.
+
+00:09:52.540 --> 00:09:59.780
+I heard a conversation between two programmers, an exchange between those two, talking about CPython.
+
+00:09:59.900 --> 00:10:00.940
+They're like, what is CPython?
+
+00:10:01.120 --> 00:10:03.700
+Is it like Python that compiles to C?
+
+00:10:04.020 --> 00:10:04.640
+Or, you know?
+
+00:10:05.060 --> 00:10:08.980
+So, maybe just a little bit of a chat about what the heck is an interpreter?
+
+00:10:09.680 --> 00:10:10.040
+Yeah, go ahead.
+
+00:10:10.040 --> 00:10:11.380
+I remember being confused about that, too.
+
+00:10:11.920 --> 00:10:15.420
+And, you know, in Cython, which I don't think we hear about so much anymore, but that confused me as well.
+
+00:10:15.480 --> 00:10:16.300
+I remember, yeah.
+
+00:10:16.540 --> 00:10:27.680
+So, it's interesting that even from as far back as CPython's origination, there was an acknowledgement that there might be other Pythons, and that Python is a language, not an implementation.
+
+00:10:28.020 --> 00:10:28.300
+But, yeah.
+
+00:10:28.500 --> 00:10:28.820
+Go ahead.
+
+00:10:29.080 --> 00:10:29.280
+Yeah.
+
+00:10:29.280 --> 00:10:34.020
+So, well, we've got the Python interpreter, and we've got Python code we write.
+
+00:10:34.140 --> 00:10:41.460
+Often, we write, well, Python, the language, but when it executes, it doesn't actually execute in Python.
+
+00:10:41.660 --> 00:10:45.600
+It might execute because C understands it, and a C compiled thing runs.
+
+00:10:45.700 --> 00:10:49.220
+Or, in your case, Rust understands the bytecode, right?
+
+00:10:49.220 --> 00:10:56.620
+So, the interpreter parses our Python into Python bytecodes, which you can get through with the dis module.
+
+00:10:56.720 --> 00:10:59.180
+You can disassemble it and look at the actual bytecodes you got back.
+
+00:10:59.280 --> 00:11:04.420
+And then those are sent off to, like, a giant loop that interprets them, hence the term interpreter.
+
+00:11:04.780 --> 00:11:06.120
+So, we've got CPython.
+
+00:11:06.460 --> 00:11:10.640
+We have the defunct IronPython for .NET, which made it all the way to 3.4.
+
+00:11:10.800 --> 00:11:14.600
+We've got the defunct Jython, which made it all the way to 2.7.
+
+00:11:14.800 --> 00:11:18.000
+And we've got the much more exciting and modern Pyodide.
+
+00:11:18.540 --> 00:11:20.400
+Well, Pyodide is still CPython, so.
+
+00:11:20.800 --> 00:11:21.160
+Yes.
+
+00:11:21.540 --> 00:11:25.820
+But compiled for WebAssembly, which I feel, I don't know, I feel like Rust and WebAssembly have this kinship.
+
+00:11:25.820 --> 00:11:28.380
+So, it's like, I don't know, it feels closer to Rust than the others.
+
+00:11:28.380 --> 00:11:28.620
+I agree.
+
+00:11:28.620 --> 00:11:30.780
+There's also RustPython, which is in active development.
+
+00:11:30.980 --> 00:11:33.420
+I don't know what that's currently pointing at.
+
+00:11:34.400 --> 00:11:40.220
+There's also GraalPy, which is another Python interpreter.
+
+00:11:40.860 --> 00:11:46.320
+And the second biggest, really, is PyPy, probably the best-known one of all.
+
+00:11:46.320 --> 00:11:54.960
+So, without meaning to cause offense to those that are still active, there's also Unladen Swallow, which was another attempt.
+
+00:11:54.960 --> 00:12:02.300
+And there's a whole, but look, without meaning to cause offense to any of those that are still alive, there was a kind of graveyard of other Python implementations.
+
+00:12:02.300 --> 00:12:09.660
+And so, I went into this knowing that it's a space where lots of people have tried to build things, put in, bluntly, a great deal more effort than we have.
+
+00:12:09.660 --> 00:12:16.180
+And for the most part, I wouldn't say they failed, but they haven't got the same kind of adoption that CPython has.
+
+00:12:16.300 --> 00:12:16.940
+I mean, I think...
+
+00:12:16.940 --> 00:12:17.960
+Oh, 100%.
+
+00:12:17.960 --> 00:12:21.480
+CPython is 99.9% of usage of Python.
+
+00:12:21.480 --> 00:12:32.860
+And my take is that the reason for that is you need almost complete, perfect consistency with CPython to use something else.
+
+00:12:33.100 --> 00:12:40.420
+Again, you need, like, five nines of perfection, of identical behavior before you would go and switch in any real application.
+
+00:12:40.700 --> 00:12:50.560
+I remember trying to use PyPy, and even if I could get it to run, well, it turns out its foreign function interfaces with, like, asyncpg were slower than CPython's, and so actually it didn't perform as well.
+
+00:12:50.560 --> 00:12:57.360
+And so, the threshold to switch from CPython to something else or to choose something else was incredibly high.
+
+00:12:57.640 --> 00:13:04.220
+And so, we are not trying to build another Python interpreter that you might credibly move your application across.
+
+00:13:04.500 --> 00:13:09.860
+We're using Python as a syntax for a very specific thing where LLMs write code.
+
+00:13:10.180 --> 00:13:16.520
+And the fact that we have a different goal is one of the reasons that we thought this was a credible project to take on.
+
+00:13:16.520 --> 00:13:20.940
+This portion of Talk Python To Me is brought to you by us.
+
+00:13:21.460 --> 00:13:28.700
+I want to tell you about a course I put together that I'm really proud of, Agentic AI Programming for Python Developers.
+
+00:13:29.380 --> 00:13:35.260
+I know a lot of you have tried AI coding tools and come away thinking, well, this is more hassle than it's worth.
+
+00:13:35.620 --> 00:13:38.900
+And honestly, all the vibe coding hype isn't helping.
+
+00:13:39.160 --> 00:13:42.380
+It's a smokescreen that hides what these tools can actually do.
+
+00:13:42.380 --> 00:13:54.820
+This course is about agentic engineering, applying real software engineering practices with AI that understands your entire code base, runs your tests, and builds complete features under your direction.
+
+00:13:55.140 --> 00:14:01.660
+I've used these techniques to ship real production code across Talk Python, Python Bytes, and completely new projects.
+
+00:14:02.080 --> 00:14:09.000
+I migrated an entire CSS framework on a production site with thousands of lines of HTML in a few hours, twice.
+
+00:14:09.000 --> 00:14:13.600
+I shipped a new search feature with caching and async in under an hour.
+
+00:14:14.060 --> 00:14:22.160
+I built a complete CLI tool for Talk Python from scratch, tested, documented, and published to PyPI in an afternoon.
+
+00:14:22.660 --> 00:14:26.620
+Real projects, real production code, both greenfield and legacy.
+
+00:14:27.100 --> 00:14:28.740
+No toy demos, no fluff.
+
+00:14:29.320 --> 00:14:35.460
+I'll show you the guardrails, the planning techniques, and the workflows that turn AI into a genuine engineering partner.
+
+00:14:35.460 --> 00:14:39.500
+Check it out at talkpython.fm/agentic dash engineering.
+
+00:14:39.740 --> 00:14:42.880
+That's talkpython.fm/agentic dash engineering.
+
+00:14:43.060 --> 00:14:45.200
+The link is in your podcast player's show notes.
+
+00:14:45.200 --> 00:14:58.020
+You know, the real challenge, I think, that I saw with all of those is there are so many different use cases, and it's both a big benefit of all the Python packages and stuff,
+
+00:14:58.140 --> 00:15:07.900
+but, you know, this package pulls in this compiled thing, and this other one pulls in another compiled thing, and it assumes that the GIL works exactly in this way.
+
+00:15:08.220 --> 00:15:12.380
+And so there's all these implied behaviors that have to be carried across.
+
+00:15:12.380 --> 00:15:24.180
+And a lot of these, I think, we're trying to say, let's put those to the side and see if we could build something neater that's more native to Java or .NET or whatever people were after, you know, with those different ones.
+
+00:15:24.280 --> 00:15:27.160
+But then the compatibility just hit them in the face, right?
+
+00:15:27.220 --> 00:15:37.000
+We're, like, I haven't actually counted PyPI lately, but how many are we at? Almost, just short of three-quarter million, two packages short of three-quarters of a million packages.
+
+00:15:37.480 --> 00:15:39.820
+We've got to reload this page at the end of the pod.
+
+00:15:39.820 --> 00:15:42.460
+I'm just going to say, yes, we're going to leave it open.
+
+00:15:42.560 --> 00:15:43.800
+We're absolutely leaving that open.
+
+00:15:44.480 --> 00:15:47.260
+But trying to be compatible with that many projects?
+
+00:15:47.260 --> 00:15:50.100
+We're actually 5,002 short of.
+
+00:15:50.280 --> 00:15:51.100
+Oh, yeah, yeah, okay.
+
+00:15:51.480 --> 00:15:54.260
+Sorry to be a pedant, but it comes with the name.
+
+00:15:54.260 --> 00:15:55.500
+Oh, yeah, yeah, no, you're right.
+
+00:15:55.660 --> 00:15:57.360
+We're at 744, not 7.
+
+00:15:57.840 --> 00:15:58.620
+Or 9, yeah.
+
+00:15:59.580 --> 00:16:02.400
+There's going to be some kind of milestone reached, but it's not the one I was hoping for.
+
+00:16:02.400 --> 00:16:08.380
+Anyway, the point is there's so many edge cases and so many specializations.
+
+00:16:08.900 --> 00:16:08.980
+Yeah.
+
+00:16:09.200 --> 00:16:11.360
+I think that's really where it hit them.
+
+00:16:11.740 --> 00:16:17.700
+And, you know, maybe this is a good segue to just, you know, if not that, then what are you actually building?
+
+00:16:17.780 --> 00:16:18.480
+What is this Monty?
+
+00:16:18.800 --> 00:16:25.200
+So Monty tries to solve this problem where we want to allow, LLMs are very, very good at writing code.
+
+00:16:25.260 --> 00:16:26.780
+We were talking about them writing SQL earlier.
+
+00:16:26.860 --> 00:16:30.140
+They're very good at writing Python and JavaScript.
+
+00:16:30.140 --> 00:16:37.220
+I think, honestly, it wouldn't really matter to the implementation whether we were implementing Python or JavaScript.
+
+00:16:37.480 --> 00:16:38.780
+It just turns out for a bunch of reasons.
+
+00:16:38.900 --> 00:16:42.340
+Python is easier and it's also like where we come from.
+
+00:16:43.020 --> 00:16:53.880
+The simplest use case of Monty is what people call programmatic tool calling or code mode, where instead of my LLM calling tools in a loop,
+
+00:16:54.160 --> 00:17:07.360
+sometimes using the return value from one tool straight into the next tool, the LLM can just go and write code and thereby be more reliable and much more performant and much lower cost.
+
+00:17:07.440 --> 00:17:17.740
+So we've seen examples of like, if you, for example, connect Pydantic AI with code mode enabled to GitHub's MCP and you say, go and find the five latest pull requests.
+
+00:17:18.500 --> 00:17:21.560
+And I forget what the question was, right?
+
+00:17:21.560 --> 00:17:27.960
+But the point was we have to go jump through their API via MCP and calculate some value.
+
+00:17:27.960 --> 00:17:33.380
+We've seen tasks go from kind of $2 down to $0.04 as a result of using code mode.
+
+00:17:33.760 --> 00:17:37.940
+Because one of the big reasons for that is that those MCP responses are vast.
+
+00:17:38.360 --> 00:17:45.440
+And so the LLM has to put loads of tokens into context to go and pull out, well, actually, this is just like the ID of the thing I need to make the next request.
+
+00:17:45.440 --> 00:17:52.080
+I just added an MCP server to Talk Python a few weeks ago so people could ask questions about it and stuff.
+
+00:17:52.300 --> 00:18:01.300
+And what really surprised me is the actual return type that MCP servers recommend is markdown, not structured data.
+ +00:18:01.440 --> 00:18:05.820 +So you basically send a giant blob of markdown back as the response. + +00:18:05.920 --> 00:18:11.620 +And then, like you're saying, a bunch of tokens get consumed just trying to understand the response rather than, here's a JSON document. + +00:18:11.760 --> 00:18:12.580 +I know it's called this. + +00:18:12.740 --> 00:18:13.500 +Boom, answer. + +00:18:13.500 --> 00:18:18.440 +So I think in the case of GitHub's one, they do return JSON, which is useful for us because we can then go parse that JSON. + +00:18:18.800 --> 00:18:26.300 +But also, if you don't need the whole of that response, you can search through it and extract a particular thing you need. + +00:18:26.620 --> 00:18:35.600 +So the conservative threshold for what Monty can do is allow us to implement this code mode use case. + +00:18:35.920 --> 00:18:37.820 +And I think it works for that for the most part now. + +00:18:37.980 --> 00:18:39.640 +We're working hard on some improvements. + +00:18:39.640 --> 00:18:46.620 +The biggest difference of it versus all of the other Python implementations is it is completely sandboxed. + +00:18:46.700 --> 00:18:50.120 +It is isolated from your machine. + +00:18:50.280 --> 00:18:59.400 +So you can't open a file or read an environment variable unless you very specifically say, here are the environment variables you're passing into this context. + +00:18:59.400 --> 00:19:07.520 +Or here are the pseudo files or indeed real files that I specifically want to expose to this runtime. + +00:19:07.780 --> 00:19:14.020 +That means that obviously reading a file is going to be way less performant than in CPython where we can go and make some syscall to read a file. + +00:19:14.220 --> 00:19:14.900 +We're not doing that. + +00:19:15.000 --> 00:19:26.740 +You're calling back from the Monty runtime to the host runtime, which might be Python or might be JavaScript or Rust, to say, read me this particular file, and then it can choose what to do. 
+ +00:19:26.740 --> 00:19:31.080 +But that is obviously what you want in this scenario where the LLM is writing the code. + +00:19:31.240 --> 00:19:38.060 +So that is the regard in which we are completely different from all of the other Python implementations. + +00:19:38.300 --> 00:19:46.580 +And then there's a few other projects doing similar things, but we're different in that regard from all of the established programming languages, which would all have ways to read files. + +00:19:47.000 --> 00:19:47.800 +Very interesting take. + +00:19:47.800 --> 00:19:50.460 +You know, it might be worth just a quick mention. + +00:19:50.960 --> 00:19:55.960 +There's plenty of people out there listening who have not done agentic tool using coding. + +00:19:56.420 --> 00:20:02.260 +So I think understanding just that the flow of that is kind of important to understanding the value of this, right? + +00:20:02.280 --> 00:20:11.560 +And you did definitely touch on it, but if you go and ask Claude Code to do something, or Cursor, or whatever, it's constantly like, let me run this GitHub command. + +00:20:11.600 --> 00:20:12.660 +Let me run this Git command. + +00:20:12.720 --> 00:20:13.800 +Let me run this LS command. + +00:20:13.800 --> 00:20:14.860 +Let me run this find. + +00:20:15.180 --> 00:20:20.260 +And periodically it'll just exec Python, like little strings of Python and stuff. + +00:20:20.520 --> 00:20:29.460 +So one of your core ideas is, what if we could give it a better Python that it's encouraged to use for this kind of behavior, right? + +00:20:30.020 --> 00:20:31.980 +Let me describe it in a slightly different way. + +00:20:32.200 --> 00:20:37.080 +Okay, so we have a continuum of how much control and how much flexibility LLMs have. + +00:20:37.180 --> 00:20:43.380 +At one end of the spectrum, we have pure tool calling, where they can basically return JSON with the name of a tool that you're going to call. 
+
+00:20:43.800 --> 00:20:49.440
+And there are agent frameworks like Pydantic AI that allow you to hook that up to functions.
+
+00:20:49.540 --> 00:20:52.980
+But ultimately, you're just getting JSON back and you're deciding what to do with that.
+
+00:20:53.040 --> 00:20:55.400
+And you may call the LLM again with some return value.
+
+00:20:55.560 --> 00:20:58.580
+At the full other end of the spectrum, we have complete computer use.
+
+00:20:58.800 --> 00:21:04.380
+Some LLM has some vision model and is moving my cursor around on screen to do everything I want.
+
+00:21:04.620 --> 00:21:05.380
+Type onto our keyboard.
+
+00:21:05.580 --> 00:21:07.080
+In the middle, we have a bunch of options.
+
+00:21:07.220 --> 00:21:12.080
+We have Monty, which is kind of on the near the tool calling end of the spectrum.
+
+00:21:12.080 --> 00:21:15.780
+Then we have sandboxes like Daytona and E2B and Modal.
+
+00:21:15.780 --> 00:21:20.560
+And then we have the kind of Claude Code or Codex style of like complete control of your terminal.
+
+00:21:20.780 --> 00:21:29.300
+And along that spectrum, you go more and more power in terms of like capacity of what the LLM might be able to do and more and more security concerns.
+
+00:21:29.300 --> 00:21:37.820
+And generally that comes with more and more of having an adult watching what it's going to go and do and controlling it and uncrashing it when it crashes, when it goes and does the wrong thing.
+
+00:21:37.820 --> 00:21:48.380
+And so for the most part today, when we're using something in the cloud that uses an LLM, it's doing the tool calling end of the spectrum.
+
+00:21:48.680 --> 00:21:53.500
+That's what the kind of LangChain, LangGraph, Pydantic AI, CrewAI, all those guys are doing.
+
+00:21:54.980 --> 00:22:01.600
+The LLM is doing very similar things when Claude Code basically decides to go and run ls or run rm -rf.
+
+00:22:01.880 --> 00:22:08.920
+It's calling the tool like bash command, which the Claude application running on your machine chooses to go and execute.
+
+00:22:09.100 --> 00:22:19.160
+The point is, for the most part, when we're building applications that are going to go and run in the cloud, we don't have a software developer who understands what's going on, sitting, watching every command.
+
+00:22:19.520 --> 00:22:22.960
+And so we need to be much more constrained in what we're going to allow the LLM to do.
+
+00:22:23.180 --> 00:22:27.400
+But we want to have a little bit more expressiveness than we do with pure tool calling.
+
+00:22:27.720 --> 00:22:36.800
+And at the moment, there is basically nothing in the spectrum between tool calling and go and run a sandboxing service and have access to a full sandbox.
+
+00:22:36.980 --> 00:22:37.760
+And that's powerful.
+
+00:22:37.980 --> 00:22:40.540
+You can do a bunch of things with it, but often we don't need that stuff.
+
+00:22:40.700 --> 00:22:42.740
+And that's where Monty's, that's the kind of sweet spot.
+
+00:22:43.120 --> 00:22:43.180
+Okay.
+
+00:22:43.500 --> 00:22:48.800
+There's interesting incentives or something that align with this undertaking as well.
+
+00:22:49.120 --> 00:22:54.460
+For example, if you don't give it a networking stack, it can't do bad things on the network.
+
+00:22:54.540 --> 00:22:54.900
+Yeah.
+
+00:22:55.040 --> 00:22:56.480
+Because it just doesn't exist, right?
+
+00:22:56.760 --> 00:23:01.840
+So it helps you, it inspires you to create, like, a more minimal version of the standard library and so on.
+
+00:23:02.080 --> 00:23:02.200
+Yeah.
+
+00:23:02.260 --> 00:23:11.620
+And you can imagine like we, we will soon have a, some version of HTTP request that you can make, but you will be required to go and enable that explicitly.
+
+00:23:11.840 --> 00:23:22.700
+And even better, because you're calling through the host, you're going to have a perfect point where you can go and read the URL and go, no, you can't make a request to localhost and go and like start snooping on what's going on here.
+
+00:23:22.700 --> 00:23:26.640
+You have to be making a request to an external URL or whatever else it might be.
+
+00:23:26.680 --> 00:23:31.060
+Or even I'm going to go and use some third party service to proxy all HTTP requests.
+
+00:23:31.260 --> 00:23:34.660
+So it is never an untrusted HTTP request inside my network.
+
+00:23:34.840 --> 00:23:45.900
+But the point is, this is the single biggest difference of Monty is every single place where you can, where the code could interact with the real world, it must call an external function.
+
+00:23:46.160 --> 00:23:47.540
+So call back through the host.
+
+00:23:47.820 --> 00:23:54.660
+And then the other regard in which it is, I think somewhat innovative is we are not using traditional callbacks for that.
+
+00:23:54.920 --> 00:24:00.260
+So we're not giving the runtime a list of pointers to functions it can call on the host.
+
+00:24:00.680 --> 00:24:08.140
+Instead, the Monty runtime is effectively suspending and returning control to the host whenever you're doing a tool call.
+
+00:24:08.220 --> 00:24:15.360
+So you're basically getting a response, which is like call the function, read file with the arguments file name or whatever else it might be.
+
+00:24:15.620 --> 00:24:29.680
+And that allows a few things, but in particular, it allows us if that tool we're going to go and run, or that function we're going to go and run is going to take two days to run, we can serialize the Monty runtime, go put that in a database, and shut down our process
+
+00:24:29.680 --> 00:24:31.400
+and wait for the tool to come back.
+
+00:24:31.560 --> 00:24:37.640
+And that's something that CPython doesn't offer, understandably, but we are able to build because we built Monty from scratch.
+
+00:24:37.880 --> 00:24:42.900
+You can serialize the entire interpreter state, go put it into a database and retrieve it later when you want to resume.
+
+00:24:43.240 --> 00:24:43.820
+That's pretty wild.
+
+00:24:44.160 --> 00:24:46.560
+So it's got this durability aspect, right?
+
+00:24:46.840 --> 00:24:57.860
+Yeah, which I think is in these scenarios where often the code execution part of this is going to take milliseconds, but our tools might take minutes or hours or whatever else,
+
+00:24:58.140 --> 00:25:04.840
+both for durability and to build an application that's both more durable and easier to maintain.
+
+00:25:05.340 --> 00:25:12.320
+You don't have to have that interpreter state hanging around in memory as you would with CPython.
+
+00:25:12.320 --> 00:25:18.200
+And all the other things like timeout and just other weird oddities, right?
+
+00:25:18.360 --> 00:25:24.740
+Like I was working on something on my laptop just yesterday and my wife's like, you ready to go?
+
+00:25:24.780 --> 00:25:26.280
+I'm like, hold on, I got to wait.
+
+00:25:26.540 --> 00:25:31.120
+I got to wait for this chat to complete before it's been going for five minutes.
+
+00:25:31.180 --> 00:25:31.740
+It's almost done.
+
+00:25:31.820 --> 00:25:32.080
+Just hold on.
+
+00:25:32.160 --> 00:25:36.780
+And then I can close my laptop and roll, you know, because it would have, who knows what it would have done to it, right?
+
+00:25:37.080 --> 00:25:37.200
+Yeah.
+
+00:25:37.380 --> 00:25:37.560
+Yeah.
+
+00:25:37.560 --> 00:25:45.500
+And talking of timeouts, the other thing that we're able to do in Monty is we're able to, look, it's not perfect yet because it's early, but we basically allow you to set resource limits.
+
+00:25:45.500 --> 00:25:50.620
+So total execution time and memory limit in particular and recursion depth.

00:25:51.040 --> 00:26:03.920
And therefore you can run this Monty thing in some small image in the cloud and you can say it's got 10 megabytes and it, you know, once it's hardened, you know, it's early, we have that support now, but I'm not saying there are no ways around it.

00:26:04.060 --> 00:26:06.460
It can't go and kill your machine out of memory.

00:26:07.480 --> 00:26:09.780
Can't OOM your container.

00:26:09.940 --> 00:26:13.800
You're just going to get back a resources error saying too much memory consumed.

00:26:14.140 --> 00:26:14.720
Yeah. Very powerful.

00:26:15.020 --> 00:26:17.680
So I see on the GitHub page here, a couple of things.

00:26:17.740 --> 00:26:24.680
First of all, it supports Python 3.10, 11, 12, 13, 14, presumably 15 will take the place of 10 in a year or something.

00:26:25.020 --> 00:26:31.640
So that is the support for the, so we have, so the Monty runtime is written entirely in Rust.

00:26:31.720 --> 00:26:37.040
It has no dependency on CPython or PyO3 or anything else.

00:26:37.040 --> 00:26:38.400
It is a pure Rust library.

00:26:38.400 --> 00:26:39.200
We're very lucky.

00:26:39.380 --> 00:26:49.800
We have the AST parser from Ruff, from the Astral team that we're able to, gives us, allows us to go from Python code to some basically structured objects.

00:26:49.920 --> 00:26:52.220
We don't have to go and do that, like parsing the Python code ourselves.

00:26:52.700 --> 00:26:52.780
Right.

00:26:52.840 --> 00:26:54.840
Because Ruff is already written in Rust.

00:26:55.020 --> 00:26:59.220
Like that's, I feel like the Astral team is kind of a peer of yours for sure.

00:26:59.360 --> 00:27:01.580
You guys must look at each other, what you all are doing.

00:27:01.880 --> 00:27:02.180
Yeah. Yeah.
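Monty enforces these limits (time, memory, recursion depth) natively in Rust; CPython has no direct equivalent, but the idea of an execution budget can be roughly illustrated in pure Python by counting executed lines with sys.settrace and aborting the untrusted code when the budget runs out. Nothing below is Monty's actual API; it is only a sketch of the concept.

```python
import sys

class BudgetExceeded(Exception):
    pass

def run_with_step_limit(fn, max_steps):
    """Run fn(), aborting once more than max_steps lines have executed."""
    steps = 0

    def tracer(frame, event, arg):
        nonlocal steps
        if event == "line":
            steps += 1
            if steps > max_steps:
                # Raising here propagates into the traced frame,
                # stopping the runaway code.
                raise BudgetExceeded(f"exceeded {max_steps} steps")
        return tracer

    sys.settrace(tracer)
    try:
        return fn()
    finally:
        sys.settrace(None)

def busy():
    # Stands in for untrusted code that would otherwise run far too long.
    total = 0
    for i in range(100_000):
        total += i
    return total

try:
    run_with_step_limit(busy, max_steps=50)
except BudgetExceeded as exc:
    print("aborted:", exc)  # the untrusted code stops; the host survives
```

A real sandbox would of course enforce this below the interpreter (as Monty does) rather than with a trace hook, which is slow and only advisory.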
+
+00:27:02.200 --> 00:27:03.440
+And, and, and, you know, we use that a lot.

00:27:03.440 --> 00:27:04.800
And also we have ty built in.

00:27:04.800 --> 00:27:09.820
So the ty type checker from Astral is again written in Rust.

00:27:09.940 --> 00:27:12.840
And so it is compiled into Monty when you use it.

00:27:12.940 --> 00:27:16.480
And so before you run your code, you can go and run type checking at the same time.

00:27:16.520 --> 00:27:23.960
And again, that, that feedback is incredibly useful for LLMs to get them to, to write reasonably reliable run like workflows.

00:27:24.740 --> 00:27:27.200
But, but to, to come back to your, your question.

00:27:27.380 --> 00:27:32.880
So we have Monty itself, which is just Rust, pure Rust, no other C dependencies, just in Rust.

00:27:32.880 --> 00:27:38.180
And then we have, and that you can use that as a Rust library directly in your Rust application, if you so wish.

00:27:38.200 --> 00:27:47.560
And there are people already doing that, but we then have libraries for Python and for JavaScript, which use in the case of Python, PyO3, which is amazing.

00:27:47.560 --> 00:27:52.480
In the case of JavaScript, a thing called NAPI, or maybe you're supposed to pronounce it NAPI.

00:27:52.620 --> 00:27:53.080
I don't know.

00:27:54.360 --> 00:27:58.720
Which allow, which basically means we can go and have JavaScript and Python packages where you can call Monty.

00:27:58.720 --> 00:28:04.960
And so slightly confusingly, that Python 3.10 to Python 3.14 is referring to the Python package that you're installing.

00:28:05.380 --> 00:28:09.300
The actual Monty is targeting Python 3.14 syntax only.

00:28:09.640 --> 00:28:09.820
I see.

00:28:10.040 --> 00:28:15.060
But those are the different language features that you support basically for parsing, right?

00:28:15.160 --> 00:28:15.720
Something like that.

00:28:15.960 --> 00:28:16.240
Yes.
+
+00:28:16.340 --> 00:28:16.840
+No, no, no.

00:28:16.900 --> 00:28:22.240
So that's just like, we only support, so Monty itself will run as if it was 3.14 or, you know, some subset of it.

00:28:22.280 --> 00:28:26.100
We don't support all the syntax yet, but like 3.14 type stuff.

00:28:26.180 --> 00:28:32.560
But yeah, if you're, when you're installing it, when you're uv add Pydantic Monty, you can do that in 3.10 through 3.14.

00:28:32.800 --> 00:28:42.960
And obviously, because we maintain a bunch of Rust stuff, we've worked hard to have binaries for basically every environment, Python, Linux, macOS, Windows, bunch of different architectures.

00:28:43.120 --> 00:28:45.660
And we have PGO builds, which no one else has.

00:28:45.740 --> 00:28:47.060
So that should improve performance again.

00:28:47.540 --> 00:28:47.680
Yeah.

00:28:47.740 --> 00:28:47.920
Yeah.

00:28:47.920 --> 00:28:52.420
PGO is process.

00:28:52.660 --> 00:28:52.840
I did.

00:28:53.300 --> 00:28:53.700
Yeah.

00:28:53.800 --> 00:28:58.500
So, so we did this first in, in, in Pydantic itself, which obviously the core is written in Rust.

00:28:58.500 --> 00:29:04.280
And it was in fact, David, David Hewitt on our team, who's the PyO3 maintainer, who identified this, this great technique.

00:29:04.360 --> 00:29:05.960
So basically it's, it's part of Rust.

00:29:06.280 --> 00:29:14.020
You basically compile the library, and then you run as many different bits of code against it as you can, in our case, all of the unit tests.

00:29:14.020 --> 00:29:20.940
And then you basically recompile it with pointers as to which paths in the code, which branches are most common.

00:29:21.160 --> 00:29:23.620
And you can get up to like 50% performance improvement.

00:29:23.920 --> 00:29:26.100
But the thing is, if you're building your own library, that's a real pain.
+ +00:29:26.140 --> 00:29:28.320 +If you're building your own application, that's a pain. + +00:29:28.420 --> 00:29:31.580 +If you just uv add Pydantic Monty, you get that stuff for free. + +00:29:31.960 --> 00:29:31.980 +Yeah. + +00:29:32.060 --> 00:29:32.540 +Super cool. + +00:29:32.780 --> 00:29:32.940 +Yeah. + +00:29:33.040 --> 00:29:37.760 +I'm reoriented in my acronyms now, profiler guided optimizations, right? + +00:29:38.040 --> 00:29:38.220 +Yes. + +00:29:38.220 --> 00:29:46.080 +So basically compilers as, as Python people, we don't necessarily think about them a lot, but compilers have all sorts of optimizations. + +00:29:46.220 --> 00:29:53.640 +And I remember in the late nineties, when I was working with things like GCC and stuff, you could actually break your program by asking for too many optimizations. + +00:29:54.000 --> 00:29:55.700 +You know, you could, it had these levels. + +00:29:55.700 --> 00:30:04.000 +And if you put it on the top level, there's a chance your program like literally might not run, which is a really bizarre thing for compilers to do, but they can, they make like decisions. + +00:30:04.000 --> 00:30:09.300 +Like maybe we should inline this so we can avoid a stack jump and setting up the stack and all that. + +00:30:09.300 --> 00:30:17.720 + with the PGO, it actually looks at how the code runs and uses that as input for its optimization, which is a super cool idea. + +00:30:18.060 --> 00:30:18.500 +So it's awesome. + +00:30:18.580 --> 00:30:19.020 +You're doing that. + +00:30:19.260 --> 00:30:19.340 +Yeah. + +00:30:19.400 --> 00:30:21.280 +And I honestly don't know what the difference is here. + +00:30:21.340 --> 00:30:25.840 +I think when I tried it, it was relatively minor, but in Pydantic, it's, it's a, it makes for a big improvement. + +00:30:26.120 --> 00:30:26.320 +Yeah. 
+ +00:30:26.500 --> 00:30:35.760 +going back a bit, I don't know if people remember depending on where they were in their journey, but from Pydantic one to two, I've got 50 X performance increases. + +00:30:36.140 --> 00:30:40.180 +And yeah, the Pydantic of today is not the Pydantic of 2017, right? + +00:30:40.460 --> 00:30:41.220 +It sure is not. + +00:30:41.340 --> 00:30:42.180 +It's sure it's not. + +00:30:42.380 --> 00:30:46.580 +And that was, you know, that was an enormous piece of work, the rewrite, because we didn't have LLMs. + +00:30:46.700 --> 00:30:54.960 +I think it would have been a job that would have been a heck of a lot easier if we'd been able to point Opus 4.6 at Pydantic and be like, do this, but in Rust, but Hey, we got it done. + +00:30:55.000 --> 00:30:56.080 +And I learned a lot along the way. + +00:30:56.400 --> 00:31:10.500 +That's a challenge that we're going to have to, I don't know how you see it, but I think as an industry and individually, each of us is going to struggle with like, how much rust did you learn and how, how much experience and ideas did you get spending that year evolving + +00:31:10.500 --> 00:31:12.920 +Pydantic versus if you just got it knocked out? + +00:31:13.000 --> 00:31:14.060 +Like where's the trade-off? + +00:31:14.060 --> 00:31:14.660 +It's a big double end thought. + +00:31:14.860 --> 00:31:18.420 +Like I don't, I know there were those people who were like, now it's impossible to enter as a software engineer. + +00:31:18.500 --> 00:31:24.760 +I've spoken to some people, some really amazing product people who were like, I'm writing code suddenly because I have the right technical mindset. + +00:31:24.760 --> 00:31:27.300 +I just have never had the time to go and learn all this stuff. + +00:31:27.360 --> 00:31:32.660 +And now the LLM can do the like rote for me and I can do the innovative product stuff on top. + +00:31:32.820 --> 00:31:33.560 +So I get to build. 
+ +00:31:33.740 --> 00:31:35.660 +So we have new people entering, but you're right. + +00:31:35.740 --> 00:31:41.640 +There are, there are going to be big challenges because just as I don't have a clue about assembly and I'm not good at writing it. + +00:31:41.680 --> 00:31:47.160 +And that probably makes me a worse engineer than if I spent the first decade of my career hand writing out assembly. + +00:31:47.160 --> 00:31:55.300 +So as we add layers of abstraction, the layer of abstraction beneath becomes kind of in the shade to, to most of us. + +00:31:55.320 --> 00:31:56.400 +And we never, we never look at it. + +00:31:56.900 --> 00:31:57.020 +Yeah. + +00:31:57.060 --> 00:31:58.120 +It's very interesting. + +00:31:58.440 --> 00:32:05.040 +I sort of think of this whole agentic coding thing as the change when design patterns became popular. + +00:32:05.180 --> 00:32:08.880 +Instead of talking about, here's how we're going to do the loop or here's how we're going to construct the class. + +00:32:08.920 --> 00:32:10.900 +You just think singleton flyweight. + +00:32:10.900 --> 00:32:14.320 +And like you're building with these bigger conceptual building blocks. + +00:32:14.380 --> 00:32:16.140 +And now it's kind of like make a login page. + +00:32:16.320 --> 00:32:16.500 +Okay. + +00:32:16.560 --> 00:32:17.000 +We've got the law. + +00:32:17.080 --> 00:32:18.480 +Now what, now what else am I building? + +00:32:18.520 --> 00:32:22.920 +Like you can think almost in components rather than like very small pieces. + +00:32:23.140 --> 00:32:23.300 +Yeah. + +00:32:23.400 --> 00:32:26.420 +I don't know what PyPI does, but like at the next level up. + +00:32:26.580 --> 00:32:26.760 +Yeah. + +00:32:26.980 --> 00:32:27.160 +Yeah. + +00:32:27.160 --> 00:32:27.580 +Kind of. + +00:32:27.760 --> 00:32:27.980 +Yeah. + +00:32:28.080 --> 00:32:30.500 +I do think there's still room for people to come into the industry. + +00:32:30.560 --> 00:32:31.500 +I think it's super exciting. 
+ +00:32:31.800 --> 00:32:37.720 +You still just, I think it's really going to come down to like problem solving and breaking down things into the way you want them to work. + +00:32:37.820 --> 00:32:39.020 +And that's a programmer skill. + +00:32:39.020 --> 00:32:43.080 +I also think what we haven't seen yet is the things that LLMs are bad at. + +00:32:43.240 --> 00:32:49.600 +Because one, if an LL, if I tried to do something with an LLM and it doesn't work, that is not proof that I cannot do it with an LLM. + +00:32:49.680 --> 00:32:51.440 +It's proof it didn't work that particular time. + +00:32:51.600 --> 00:32:55.760 +Whereas if I go and try and do something with an LLM and it does work, well, hey, that's proof it can be done. + +00:32:56.000 --> 00:32:59.300 +And two, no one wants to talk about this is the thing that failed, right? + +00:32:59.560 --> 00:33:06.360 +So Anthropic announced we built a C compiler in two weeks by giving Opus loads of access. + +00:33:06.360 --> 00:33:11.260 +What they didn't say is we tried to build an eBay clone and it was a complete unmitigated failure. + +00:33:11.420 --> 00:33:13.960 +Cost us what would have been a hundred thousand dollars of inference. + +00:33:14.020 --> 00:33:15.880 +I'm not saying that's happened and no criticism. + +00:33:16.080 --> 00:33:16.120 +Yeah. + +00:33:17.260 --> 00:33:25.400 +We don't hear about the failures both because they're less attractive to state and because they are not clear identifiers as it were in the way that like successes are. + +00:33:25.400 --> 00:33:30.020 +And I think one of the things we will learn over the next few years is like, here are the things LLMs are really, really good at. + +00:33:30.040 --> 00:33:32.460 +And here are the things that no one succeeded with them yet. + +00:33:32.520 --> 00:33:34.100 +And that's probably meaningful. 
+
+00:33:34.500 --> 00:33:39.360
+I don't want to go too deep in this because I want to stay focused on Monty, but I'm also a believer of Jevons paradox.

00:33:39.820 --> 00:33:43.320
I think that this is going to create more demand for software.

00:33:43.460 --> 00:33:49.100
Now that people see what is possible rather than just like, well, we're going to build exactly the same amount of software with fewer people.

00:33:49.100 --> 00:33:50.640
So I think there's a lot there.

00:33:51.080 --> 00:33:52.540
So CodSpeed.

00:33:52.940 --> 00:33:53.800
So you have, have that on.

00:33:53.840 --> 00:33:55.840
That's the, this is a pretty interesting tool.

00:33:56.120 --> 00:33:57.820
I just recently learned about this.

00:33:57.960 --> 00:33:59.440
You have this as a badge on your GitHub.

00:33:59.580 --> 00:34:01.320
Tell us a quick bit about this.

00:34:01.740 --> 00:34:04.800
I'm good friends with Arthur who, who was the founder.

00:34:05.200 --> 00:34:08.440
I'm a big fan of CodSpeed when you're building performance critical code.

00:34:09.060 --> 00:34:17.260
This is a nice view, but the real powerful thing is if you go in on a, on a pull request, you can see if you're getting performance regressions.

00:34:17.260 --> 00:34:19.600
So, and even better.

00:34:19.700 --> 00:34:22.840
So if, if you go to, so these are the particular benchmarks we have.

00:34:22.920 --> 00:34:27.620
So if, yeah, maybe you go to branches, it's a, or if you go to a pull request in, in our GitHub.

00:34:28.480 --> 00:34:31.040
Oh, if I compare all these, I've compared main against main.

00:34:31.100 --> 00:34:31.840
That's not super interesting.

00:34:32.080 --> 00:34:38.580
If you go back to our, if you go back to, to the, like go to PR that you guys have to, to PRs.

00:34:38.820 --> 00:34:42.380
And if you go, for example, to that data class one, the third one down.

00:34:42.620 --> 00:34:42.880
Gotcha.
+
+00:34:42.980 --> 00:34:43.140
+All right.

00:34:43.140 --> 00:34:43.760
Let's check that out.

00:34:43.960 --> 00:34:47.240
You'll see, we have a comment from CodSpeed saying one benchmark has got more performance.

00:34:47.260 --> 00:34:51.400
more importantly, it had a performance regression.

00:34:51.760 --> 00:34:56.780
Now Monty, now, now CodSpeed would be failing and I'd be like, I need to go fix that before I merge it.

00:34:56.780 --> 00:35:02.140
So we can't, as long as we have enough benchmarks, we can't have like silent regressions in performance.

00:35:02.140 --> 00:35:03.320
And even more powerful.

00:35:03.320 --> 00:35:10.000
If I go click on that, on that particular one, if you click on the pair tuples, or go just, perhaps.

00:35:10.240 --> 00:35:10.480
Yeah.

00:35:10.480 --> 00:35:10.760
Yeah.

00:35:11.560 --> 00:35:19.240
What you will see is we can now go and see, the flame chart, the flame graph of exactly what's taken, what time and where the performance changes have come from.

00:35:19.420 --> 00:35:21.000
This, this change is very minor.

00:35:21.000 --> 00:35:27.180
So it's not very interesting, but you can imagine if you accidentally do something slow in your code, this is Rust, but that'll work on Python as well.

00:35:27.180 --> 00:35:30.360
You would have this like flame chart showing you where the performance has changed.

00:35:30.700 --> 00:35:31.560
yeah.

00:35:31.640 --> 00:35:36.480
If people who are listening, they just go to the Monty GitHub repo, go to any pull requests, pull it down.

00:35:36.540 --> 00:35:39.560
And there's just a comment from the CodSpeed bot.

00:35:39.600 --> 00:35:45.200
And it says the improvement changed from 97.7 milliseconds to 88.1 milliseconds.

00:35:45.200 --> 00:35:47.620
That's a 10.95% increase in performance.

00:35:47.800 --> 00:35:50.300
So, Hey, this thing doesn't hurt performance, right?
+ +00:35:50.320 --> 00:35:50.920 +By adding it. + +00:35:51.000 --> 00:35:51.100 +Yeah. + +00:35:51.200 --> 00:35:52.620 +What's even cooler is under the hood. + +00:35:52.720 --> 00:36:00.240 +They're using, Oh, I'm having a blank on the name, but they're, they're not even measuring, they're measuring like CPU and CPU instructions. + +00:36:00.400 --> 00:36:00.680 +Okay. + +00:36:00.900 --> 00:36:01.080 +Yeah. + +00:36:01.920 --> 00:36:09.000 +So it can run in, in a like noisy environment, like you have actions and you can still get like pretty good accuracy on detecting performance changes. + +00:36:09.340 --> 00:36:09.780 +Valgrind. + +00:36:09.880 --> 00:36:10.280 +There we are. + +00:36:10.420 --> 00:36:15.060 +Valgrind is the underlying tool that like at the compiler level is looking at number of CPU instructions. + +00:36:15.320 --> 00:36:16.400 +See what this pulls up. + +00:36:16.580 --> 00:36:17.120 +Well, cool. + +00:36:18.000 --> 00:36:22.780 +I don't know what that's about, but there's a, a polygonal polygon. + +00:36:23.680 --> 00:36:27.060 +No, well, I don't know what this is a cartoon, but there's also the app. + +00:36:27.060 --> 00:36:27.700 +Yeah. + +00:36:28.140 --> 00:36:29.940 +The, the, the, the, Oh, that's its logo. + +00:36:30.160 --> 00:36:30.300 +Okay. + +00:36:30.360 --> 00:36:30.760 +I got it. + +00:36:30.800 --> 00:36:33.400 +That's it's like, at least it's like hero image or something. + +00:36:34.280 --> 00:36:34.940 +yeah. + +00:36:35.140 --> 00:36:35.380 +Yeah. + +00:36:35.420 --> 00:36:42.140 +So, so we, it's maybe a good segue then into performance where like the aim of Monty is not to build something faster than CPython. + +00:36:42.140 --> 00:36:46.260 +The aim, the aim I suppose is to build something that is not like heinously slower. + +00:36:47.260 --> 00:36:52.220 +we performance seems to vary from about five times better to five times worse. 
+
+00:36:52.220 --> 00:36:58.080
+In most cases, I'm sure that there are, there are edge cases we need to go and improve where it's worse than that, but like, that's what I seem to see.

00:36:58.080 --> 00:37:04.400
I mean, in my impression of the kind of LLM written code that we're mostly talking about, performance is not critical.

00:37:04.940 --> 00:37:08.400
Execution is going to be in the matter of single digit milliseconds.

00:37:08.400 --> 00:37:11.180
And that's not going to matter when the LLM requests are taking seconds.

00:37:11.320 --> 00:37:13.380
The thing where Monty really excels.

00:37:13.460 --> 00:37:19.480
So if you scroll down a bit and I can talk you through the table, it's like near the bottom of the, of the readme.

00:37:19.780 --> 00:37:22.120
but yeah, there we are.

00:37:22.120 --> 00:37:27.740
So like the startup time here measured for Monty to go from basically code to a result.

00:37:27.900 --> 00:37:33.360
I think the code here is like one plus one is, 0.06 milliseconds.

00:37:34.000 --> 00:37:37.160
So that's 60 microseconds.

00:37:37.160 --> 00:37:45.600
So, and actually in the hot, hot loop in benchmarks, we see one plus one, going from code to result in Monty taking about 900 nanoseconds.

00:37:45.600 --> 00:37:51.220
So under a microsecond, again, that's, that's microsecond, not millisecond or second.

00:37:51.580 --> 00:38:01.660
when you compare that to like running something in Docker, which is taking in, in my example here, 195 milliseconds, Pyodide, Pyodide is an awesome project.

00:38:01.740 --> 00:38:12.880
Big fan of, of the team, allowing you to run Python in the browser, but wasn't designed for this use case, running, going from zero to like getting a result in Pyodide is, 2.8 seconds.

00:38:13.380 --> 00:38:18.740
Starlark's a special case of another project, a bit like Monty, but a bit more limited.
+
+00:38:19.260 --> 00:38:25.060
+but sandboxing, I was talking earlier about that being one of the main options, like go run a, basically spin up a new container somewhere.

00:38:25.200 --> 00:38:26.580
There's a bunch of services that will do that.

00:38:26.780 --> 00:38:30.900
They're very popular at the moment from, from scratch to creating a new container and getting a result.

00:38:30.900 --> 00:38:32.180
Here it's taking over a second.

00:38:32.760 --> 00:38:38.020
So where Monty really excels is where you have relatively small amount of Python code to call.

00:38:38.020 --> 00:38:42.980
And that this, the overhead of running it is, is basically in the realistic term zero.

00:38:43.420 --> 00:38:46.320
It's, it's the cold start over and over and over again.

00:38:46.320 --> 00:38:51.940
The, cause these are all one shot commands, like the LLM asks for this thing and it shuts down when it gets the answer.

00:38:51.940 --> 00:38:52.200
Right.

00:38:52.440 --> 00:38:52.600
Yeah.

00:38:52.600 --> 00:38:57.520
And, and I'm sure that if you ask the sandbox providers, they would be like, yeah, but it's not about cold start.

00:38:57.640 --> 00:38:59.560
It's about reusing an existing container.

00:38:59.560 --> 00:39:00.820
And that is way faster.

00:39:01.060 --> 00:39:07.600
I agree that, you know, and then there, there are impressive pieces of technology, but there are also lots of cases where I do want, where I do want cold start.

00:39:07.600 --> 00:39:21.500
I've spoken to the big LLM providers who are interested in Monty, because if you go and ask, ChatGPT, like effectively some, some arithmetic or like how many days between these two dates in the background, they're running Python code.

00:39:21.500 --> 00:39:22.820
to do that calculation.

00:39:22.820 --> 00:39:24.140
They're obviously very security conscious.
+ +00:39:24.220 --> 00:39:27.340 +They can't just go run that Python code YOLO on, on whatever host. + +00:39:27.340 --> 00:39:30.780 +So they're actually using external, sandboxing services often. + +00:39:30.780 --> 00:39:41.400 +And that one, they're paying the second of overhead for that, where they do need a new container, but also that, you know, they're paying the organizational complexity of another, another provider. + +00:39:41.520 --> 00:39:43.560 +They're paying the fee of running that. + +00:39:43.820 --> 00:39:46.720 +Whereas Monty would allow you to do that kind of thing right there in the process. + +00:39:47.100 --> 00:39:53.620 +That is something that's really interesting about how these LLMs are like bad at math, you know, just add up these numbers and it might not get it right. + +00:39:53.900 --> 00:40:00.760 +And so, like you said, they've, they've started to go, okay, I'm going to write some bit of code that I know how to write really well and can verify. + +00:40:00.920 --> 00:40:02.860 +And then I'll just apply this data set to it. + +00:40:02.860 --> 00:40:03.060 +Right. + +00:40:03.060 --> 00:40:08.480 +Like you'll see it doing, you know, CSV types of things with Python and all sorts of stuff. + +00:40:08.480 --> 00:40:12.580 +And so that's a really good place where that Monty could be the foundation of it. + +00:40:12.580 --> 00:40:12.820 +Right. + +00:40:13.100 --> 00:40:13.540 +Yeah, exactly. + +00:40:13.860 --> 00:40:23.280 +And, you know, the other nice thing about that is if you have the Python code and something does go wrong, you're not having to like kind of guess at what's going on inside the black box of the LLM. + +00:40:23.560 --> 00:40:28.280 +Well, I suppose you are at some level, but you have the code, which is kind of the intermediate step where you can go and verify. + +00:40:28.480 --> 00:40:28.700 +Yep. + +00:40:28.740 --> 00:40:29.660 +That code makes sense. 
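The "how many days between these two dates" question mentioned above really does reduce to a tiny, verifiable snippet, exactly the kind of one-shot program where interpreter startup cost dominates the whole run. The dates below are arbitrary example values:

```python
# The kind of small program an LLM emits instead of doing the
# arithmetic itself; trivially checkable by the host or a human.
from datetime import date

start = date(2025, 6, 6)
end = date(2026, 2, 10)
print((end - start).days)  # prints 249
```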
+ +00:40:29.700 --> 00:40:41.500 +I mean, not saying everyone will do that, but as a developer debugging it, or as a data scientist trying to work out whether or not it is likely to have got the right result, I have the kind of intermediate representation of the logic that I can go and review. + +00:40:41.660 --> 00:40:43.460 +And so it's that much easier to, to debug. + +00:40:43.940 --> 00:40:47.200 +So let's talk about some of the columns, partial language completeness. + +00:40:47.540 --> 00:40:53.840 +I'm not saying it needs to be completely complete, but you know, like what, what does it, what does it need? + +00:40:53.900 --> 00:40:58.020 +You know, for example, do you need really dynamic metaclass programming for your tool use? + +00:40:58.260 --> 00:40:58.860 +Probably not. + +00:40:58.980 --> 00:40:59.340 +Right. + +00:40:59.460 --> 00:40:59.640 +Right. + +00:40:59.760 --> 00:41:00.180 +So probably not. + +00:41:00.500 --> 00:41:01.780 +So at the moment, the two, what does it need? + +00:41:02.040 --> 00:41:02.200 +Yeah. + +00:41:02.200 --> 00:41:05.440 +So the things we miss right now, I'll start with, with the downside. + +00:41:05.560 --> 00:41:10.720 +The things we miss right now are, classes, context managers. + +00:41:10.940 --> 00:41:15.860 +So, so with expressions, and match expressions, which are obviously relatively new. + +00:41:16.100 --> 00:41:18.600 +I think classes are by far the most complex of those. + +00:41:18.760 --> 00:41:20.620 +We will support them at some point. + +00:41:20.760 --> 00:41:22.660 +They're somewhat complex to, to get right. + +00:41:22.660 --> 00:41:26.920 +I have been amazed by how much LLMs just don't need classes to do most of the stuff they're doing. + +00:41:27.080 --> 00:41:32.040 +Like, so you could pass a data class into Monty and you will have some object where you can access attributes. + +00:41:32.040 --> 00:41:35.700 +And access as of later today methods on that, on that data class. 
+
+00:41:35.740 --> 00:41:40.120
+But what you can't do is like define a class or a data class in, in the Monty code itself.

00:41:40.120 --> 00:41:42.760
I'm amazed at how often that that's just not necessary.

00:41:43.200 --> 00:41:50.040
Context managers will mostly be nice because we can allow the LLM to write the kind of code it might want to.

00:41:50.040 --> 00:41:55.640
So let's say we allow the open, at the moment the open built in is not, it's not provided at all for opening a file.

00:41:55.760 --> 00:42:04.180
We have like the, we have basic support for pathlib via our way of, allowing use like very controlled access to the outside world.

00:42:04.260 --> 00:42:08.640
But if we do add open, very often LLMs want to write with open, yada, yada.

00:42:08.820 --> 00:42:12.160
And we want to be able to support that. Match expressions are, are, are neat.

00:42:12.220 --> 00:42:13.660
And I think will be more and more common in Python.

00:42:13.660 --> 00:42:17.960
And I think we can, you know, full support will be hard, but getting most of it there is hard.

00:42:18.120 --> 00:42:19.340
What we will never.

00:42:19.600 --> 00:42:23.980
And then, then the other big part of partial is we don't have the full standard library.

00:42:23.980 --> 00:42:38.600
So we have a very, very limited standard library today of some bits of typing, some bits of the sys module, os.environ, there's a PR up from someone to add re, regexes, date, datetime.

00:42:38.980 --> 00:42:39.840
And I think we'll add JSON.

00:42:40.200 --> 00:42:42.140
and so those will all be, be supported.

00:42:42.260 --> 00:42:44.840
And to be clear, they will all be implemented in Rust.

00:42:45.000 --> 00:42:49.560
So like json.loads will be Rust level performance of loading that thing.

00:42:49.560 --> 00:42:54.000
I mean, there's a bit of overhead to creating the Monty object, but, but very, very fast.
+ +00:42:54.380 --> 00:42:58.260 +but we're never going to go and support the whole standard library. + +00:42:58.300 --> 00:43:02.560 +It'll be on a case by case to LLMs actually need this thing, that we can go and go and add them. + +00:43:02.640 --> 00:43:17.620 +I will say, and I know we're going to talk about this at some point, but like, it is amazing what this project is only made possible by LLMs and not, not that we're ever aiming to full standard library, but adding support for certain, certain modules of the standard library is a heck of a lot easier when you can, again, + +00:43:17.700 --> 00:43:19.600 +we have a perfect record of what it's supposed to do. + +00:43:19.600 --> 00:43:22.340 +So we can go and ask the LLM to, to build that. + +00:43:22.420 --> 00:43:26.740 +and then the last test for like CPython has a ton of tests. + +00:43:26.740 --> 00:43:29.800 +You can extract out the bits that apply to that maybe. + +00:43:29.960 --> 00:43:31.780 +And just, well, does it run here? + +00:43:31.900 --> 00:43:35.760 +I'll come on to like, so I have three reasons why I think it's, this is possible with LLM. + +00:43:35.840 --> 00:43:41.660 +Let me just, the last point that's going to make is what we will never support is, or I think never support is third party libraries. + +00:43:41.740 --> 00:43:48.060 +So you'll never be able to pip install Pydantic or FastAPI or requests inside, inside Monty. + +00:43:48.060 --> 00:43:55.120 +And because the reason, the reason for that is we would need to support the CPython ABI and basically support full CPython. + +00:43:55.120 --> 00:43:57.700 +And if you're going to do that, you're basically back to CPython. + +00:43:58.040 --> 00:44:02.400 +and so sure there are ways of sandboxing CPython, most of which are demonstrated here. + +00:44:02.480 --> 00:44:03.540 +That's not the aim of this project. 
+ +00:44:03.760 --> 00:44:13.900 +However, what we can allow you to do is basically have a shim where you expose, let's say HTTPX, get and post methods and patch and whatever you need through to Monty. + +00:44:13.900 --> 00:44:21.240 +And we're, we're currently working out whether or not we basically add those, provide those shims as, as part of the library. + +00:44:21.240 --> 00:44:22.640 +So you don't need to go and think about that. + +00:44:22.700 --> 00:44:32.040 +You can be like, yes, give it HTTP access or yes, give it access to DuckDB's SQL, engine or give it access to beautiful soup. + +00:44:32.100 --> 00:44:34.880 +And that shim comes and you don't need to go and implement it. + +00:44:34.880 --> 00:44:42.240 +so you can whitelist in like super critical libraries that people are like, we, if I had this, I could really do. + +00:44:42.240 --> 00:44:53.840 +So one of the questions we have now, that we need to probably go run evals on to find out is if we come up with a very Pythonic type safe, example of let's say an HTTP library, and we give those types to the LLM, + +00:44:54.120 --> 00:44:58.240 +does it do better or worse with that than just being told you can use requests? + +00:44:58.380 --> 00:44:59.680 +And I don't know the answer. + +00:44:59.780 --> 00:45:02.040 +There are, there are genuine arguments in both cases. + +00:45:02.220 --> 00:45:04.600 +Some people seem to be very sure one or the other is right. + +00:45:04.640 --> 00:45:05.760 +I just, I just don't know. + +00:45:05.820 --> 00:45:10.260 +And that's the kind of thing where we need to go and run evals and work out what an LLM will find easiest. + +00:45:10.260 --> 00:45:18.260 +but yeah, we can either kind of attempt to fake the existing libraries, API, warts and all, or we can go. + +00:45:18.420 --> 00:45:24.100 +And in many cases just say, Oh, we've got this new fetch library that has a fetch method and here's its, signature. 
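The shim idea described here can be sketched like this: the host exposes one narrow, typed fetch function instead of the whole requests or httpx API, and that function is the single audit point for every outbound call. The names fetch and HOST_FUNCTIONS are hypothetical, invented for illustration; they are not pydantic-monty's real interface.

```python
def fetch(url: str, method: str = "GET") -> str:
    """The only HTTP capability sandboxed code ever sees."""
    if not url.startswith("https://"):
        # One place to enforce policy: no localhost snooping, no plain HTTP.
        raise PermissionError("only https:// URLs are allowed")
    # A real host would perform the request here (e.g. with httpx).
    return f"{method} {url} -> 200"

# The host hands this table to the runtime; sandboxed code calling
# fetch(...) suspends back to the host, which looks the function up here.
HOST_FUNCTIONS = {"fetch": fetch}

print(HOST_FUNCTIONS["fetch"]("https://example.com/api"))
```

Whether the LLM writes better code against a small typed signature like this, or against a faked requests API it already knows, is exactly the open eval question raised above.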
+
+00:45:24.260 --> 00:45:26.880
+And I suspect the LLM will do, do a pretty good job of it.
+
+00:45:26.880 --> 00:45:39.320
+So what are the weird new, not quite typo squatting, but kind of typo squatting supply chain type of issues has, at least in the earlier days of LLMs, they, when you would ask it to write code, sometimes it would say,
+
+00:45:39.440 --> 00:45:42.440
+we're going to import some library and that library didn't exist.
+
+00:45:42.440 --> 00:45:45.980
+And then it imagined a bunch of code series that happened after it.
+
+00:45:45.980 --> 00:45:53.720
+So people would go and find popular ones of those and then register malicious packages that the LLMs had hallucinated.
+
+00:45:53.880 --> 00:45:54.020
+Right.
+
+00:45:54.020 --> 00:46:06.640
+but I guess you probably kind of, you kind of got to do a similar analysis, but not for evil where you say like, well, if I just ask Claude or, or, Codex or whatever to do a thing, what is it?
+
+00:46:06.800 --> 00:46:07.860
+What does it try to do?
+
+00:46:07.860 --> 00:46:17.640
+If you see it always asking for a question, like maybe it's just better that we, we lie to it and say, okay, whenever it says import requests, we give it our special way to just get stuff off the internet.
+
+00:46:17.640 --> 00:46:21.440
+And it only really needs to get, put, and like a couple of, it doesn't need all of requests.
+
+00:46:21.500 --> 00:46:23.120
+It just needs very basic behaviors.
+
+00:46:23.300 --> 00:46:23.460
+Yeah.
+
+00:46:23.460 --> 00:46:24.940
+Is that the kind of stuff you're thinking?
+
+00:46:25.060 --> 00:46:25.640
+Yeah, exactly that.
+
+00:46:25.720 --> 00:46:39.400
+And that's one of the reasons we didn't start with Starlark, which is a, I think originally a Meta Facebook project to have a, like basically isolated Python runtime was because Starlark has a very
+
+00:46:39.400 --> 00:46:43.300
+disciplined and principled approach to what it supports and what it doesn't.
+
+00:46:43.520 --> 00:46:44.940
+We have to be not principled.
+
+00:46:44.940 --> 00:46:51.420
+We have to be like, well, if the LLM wants to write this thing, we're going to go and implement the CSV module, but not the tomllib module.
+
+00:46:51.420 --> 00:46:53.020
+Cause that's just what they need to go and use.
+
+00:46:53.020 --> 00:46:57.280
+And we're going to be like, our principle is give the LLM what it wants, not here's our rule.
+
+00:46:57.600 --> 00:47:00.760
+so yes, exactly.
+
+00:47:00.860 --> 00:47:06.240
+And yeah, I mean, I think Boris, Boris, the, Claude Code creator talked about this.
+
+00:47:06.240 --> 00:47:20.260
+So I saw him speaking, he was saying like, you know, one of the reasons they gave the LLM bash early on was like, you can tell it to use the mkdir tool to make directories, but half the time it'll just go and call mkdir -p and make the directory that way.
+
+00:47:20.260 --> 00:47:25.260
+And like, are we going to fight it and always return an error being like, you should do this other thing, or are we just going to make that thing work?
+
+00:47:25.260 --> 00:47:27.100
+And often you have to just make the thing work.
+
+00:47:27.280 --> 00:47:29.020
+so, so yeah, go ahead.
+
+00:47:29.020 --> 00:47:29.440
+Yeah.
+
+00:47:29.580 --> 00:47:34.860
+Is this useful outside of this AI story?
+
+00:47:34.860 --> 00:47:44.580
+You know, like if I'm creating something that has really high security, I want to add some, some mechanism for people to write scripting, but not full on programming language.
+
+00:47:44.820 --> 00:47:46.080
+So in other places.
+
+00:47:46.400 --> 00:47:46.500
+Yeah.
+
+00:47:46.520 --> 00:47:49.120
+We've actually thought about this internally inside Logfire already.
+
+00:47:49.120 --> 00:47:54.040
+Like we want to be able to give people a way of basically entering config, config that can do things.
+
+00:47:54.140 --> 00:47:55.640
+There's no easy way of doing that right now.
+
+00:47:55.640 --> 00:47:55.800
+Right.
+
+00:47:55.840 --> 00:48:02.220
+Sure, I can go and use, again, one of these sandboxing services to run that code, all the complexity of setting up, we offer self-hosted Logfire.
+
+00:48:02.300 --> 00:48:03.800
+So they're not going to work, et cetera, et cetera.
+
+00:48:03.800 --> 00:48:14.620
+Or once Monty is a bit more mature, we can just go and use Monty to let them like define the expression that, that it might be as simple as like, what field do we use from your profile to display as your name?
+
+00:48:14.820 --> 00:48:15.000
+Right.
+
+00:48:15.040 --> 00:48:19.180
+And we can, we can let you type it in or an AI can write the, like one line of code that does that.
+
+00:48:19.200 --> 00:48:20.460
+And then we can call it lots of times.
+
+00:48:20.460 --> 00:48:24.700
+They're like, it's feasible now to have the, like a few lines of Python code to define this.
+
+00:48:24.700 --> 00:48:27.740
+That's generally, generally been hard until now.
+
+00:48:27.980 --> 00:48:33.660
+but of course, you know, the best tools are the ones where you, people use the tool for not what it was originally designed for.
+
+00:48:34.020 --> 00:48:36.640
+So someone invents the hammer and thinks it's going to be used for nails.
+
+00:48:36.640 --> 00:48:43.000
+And then someone else realizes that you can like change the, like knockout, like mistakes in your bumper of your car with a hammer.
+
+00:48:43.000 --> 00:48:43.320
+Right.
+
+00:48:43.380 --> 00:48:50.380
+And like, of course, what's amazing about Pydantic, why I'm so proud of it is people have gone and used it as a general purpose tool for a bunch of things I'd never thought of.
+
+00:48:50.380 --> 00:48:55.180
+So my like dream for Monty is that people come along with things to do with it that I had never heard of.
+
+00:48:55.180 --> 00:48:57.600
+And like, RLM is a really good example of that.
+
+00:48:57.660 --> 00:49:05.740
+So recursive language models, this way in which you use almost always a Python REPL as a way of implementing effectively an agentic loop.
+
+00:49:05.740 --> 00:49:13.340
+And there were some people who have an example of doing that and like getting better results in the ARC-AGI 2 benchmarks by using RLM.
+
+00:49:13.620 --> 00:49:15.800
+I didn't even know about RLMs when I announced Monty.
+
+00:49:16.000 --> 00:49:24.980
+There are now at least four different libraries that are using Monty for RLM with, with DSPy, because people are super excited about that space.
+
+00:49:25.120 --> 00:49:29.640
+So that's, that's agentic, but it's definitely something I hadn't thought of when I announced it.
+
+00:49:29.920 --> 00:49:30.020
+Yeah.
+
+00:49:30.100 --> 00:49:34.160
+I was even thinking just like, I have a medical device, like a CT scanner.
+
+00:49:34.260 --> 00:49:38.180
+I want to let people script it, but we can't break it and like zap somebody.
+
+00:49:38.460 --> 00:49:39.000
+Do you know what I mean?
+
+00:49:39.040 --> 00:49:41.800
+It needs to be really very, very controlled.
+
+00:49:42.240 --> 00:49:44.480
+this could be a really interesting, thing.
+
+00:49:44.700 --> 00:49:46.440
+So does it compile to WebAssembly?
+
+00:49:46.600 --> 00:49:47.980
+Can I in browser it?
+
+00:49:48.200 --> 00:49:48.320
+Yep.
+
+00:49:48.520 --> 00:49:54.840
+And in fact, Simon Willison, the day it came out or Simon Willison, Claude prompted by Simon Willison set one up.
+
+00:49:54.900 --> 00:50:03.020
+So I think if you go to Simon's blog somewhere, there's actually an example of Monty running somewhere, somewhere in a browser that you can, you can go and go and try it.
+
+00:50:03.060 --> 00:50:04.060
+Probably an earlier version.
+
+00:50:04.540 --> 00:50:09.360
+yeah, somewhere here, I think he'll have a link to, to his, his version of it.
+ +00:50:09.460 --> 00:50:23.920 +so as he pointed out that you can do the really crazy thing, which is you can, you can compile the Python package for, yeah, so this is, this is his example, which is, I think like, WebAssembly running directly in the browser, but he did something even more crazy, which is he took the Python library, + +00:50:24.060 --> 00:50:30.900 +compiled that to, to Wasm and then called that from inside Pyodide, which is like crazy worlds within worlds. + +00:50:31.200 --> 00:50:34.120 +definitely not the original plan, but, but interesting. + +00:50:34.540 --> 00:50:34.700 +Yeah. + +00:50:34.880 --> 00:50:35.220 +Wow. + +00:50:35.220 --> 00:50:35.720 +Okay. + +00:50:35.720 --> 00:50:36.560 +So yes. + +00:50:36.880 --> 00:50:38.600 +And here's your example to do it, right? + +00:50:38.840 --> 00:50:38.980 +Yeah. + +00:50:39.180 --> 00:50:39.340 +Yeah. + +00:50:39.560 --> 00:50:45.820 +And I think the other, the other thing we really need to add to this table, in terms of, of latency and complexity is calling back to the host. + +00:50:45.920 --> 00:50:51.700 +So one of the reasons a number of people have reached out to me and excited about this is sure that they're happy to have a sandboxing service. + +00:50:51.700 --> 00:51:03.140 +They don't even mind the second of, of start time, but like if they want to, for example, build an agent that can go and basically, run SQL against a bunch of CSV files, how do I get those CSV files into the sandbox? + +00:51:03.360 --> 00:51:09.400 +Well, that is painful and often slow because we have to make a full network round trip back to the host to get those files. + +00:51:09.400 --> 00:51:17.160 +The, the network latent, the, sorry, the overhead of calling a function on the host in Monty is a single digit milliseconds or maybe even less. 
+
+00:51:17.160 --> 00:51:29.760
+And so if you're making, if you're reading 50 different files from the, from, from the local, yeah, from within the sandbox, but effectively they're registered locally, that's super easy and performant because it's running right there in the same process.
+
+00:51:30.100 --> 00:51:30.360
+Very neat.
+
+00:51:30.480 --> 00:51:32.100
+So a couple of questions.
+
+00:51:32.500 --> 00:51:36.260
+Bonita says we have agents running on AWS Strands.
+
+00:51:36.680 --> 00:51:37.800
+Here's the crazy thing about AWS.
+
+00:51:37.980 --> 00:51:39.320
+There's like so many services.
+
+00:51:39.440 --> 00:51:40.500
+I don't even know what Strands is.
+
+00:51:40.620 --> 00:51:40.720
+Yeah.
+
+00:51:40.780 --> 00:51:41.200
+But amazing.
+
+00:51:41.360 --> 00:51:45.080
+I think Strands is their agent framework is my, my, my guess.
+
+00:51:45.360 --> 00:51:45.500
+Yeah.
+
+00:51:45.500 --> 00:51:45.640
+Yeah.
+
+00:51:45.920 --> 00:51:49.100
+Will the use of Monty help us improve performance there?
+
+00:51:49.220 --> 00:51:50.300
+Could they use Monty?
+
+00:51:50.300 --> 00:51:51.580
+Yes, it should be able to.
+
+00:51:51.960 --> 00:51:54.900
+I'm again, again, apologies if I don't know exactly what Strands is.
+
+00:51:54.980 --> 00:51:56.300
+If Strands is their agent framework.
+
+00:51:56.300 --> 00:51:57.840
+Yes.
+
+00:51:57.840 --> 00:52:05.440
+In principle, Pydantic AI, our agent framework will have support for Monty as a code execution environment later this week.
+
+00:52:05.440 --> 00:52:11.000
+And so you'll be able to basically, instead of running, yes, open source agents SDK.
+
+00:52:11.320 --> 00:52:18.560
+So I don't know whether AWS intend to add specific support for Monty, but I know our agent framework will support it later this week.
+
+00:52:18.820 --> 00:52:24.380
+My guess from, from what we've built in the past is others will pick up on it and also integrate it into, into their things.
+
+00:52:24.420 --> 00:52:28.180
+And of course, the nice thing is here because the only real requirement is Rust.
+
+00:52:28.180 --> 00:52:35.700
+We already have the Python package and JavaScript package, but if you wanted to call it from, from any other language base where you can call Rust, that should be possible.
+
+00:52:36.100 --> 00:52:39.120
+And data science, you mentioned DuckDB already.
+
+00:52:39.460 --> 00:52:39.800
+Sort of.
+
+00:52:39.800 --> 00:52:40.300
+Yeah.
+
+00:52:40.300 --> 00:52:46.940
+NumPy would be, would be great to have, I think full, I mean, I think when like, this is where we need to be a bit careful about what we add.
+
+00:52:47.060 --> 00:52:47.380
+Like, sure.
+
+00:52:47.420 --> 00:52:51.800
+If there are particular bits of, of NumPy that are useful, can we go and add shims for that?
+
+00:52:51.840 --> 00:52:53.660
+Or can we even go and implement that in Rust?
+
+00:52:53.700 --> 00:53:06.660
+So you can do a like NumPy matrix transformation that happens effectively in Rust, but we need to work out what people want and where, what we can't do, unfortunately, I'd love to be able to, but we can't do is just be like, yep, click this button.
+
+00:53:06.660 --> 00:53:09.820
+And then now we have the full NumPy API available.
+
+00:53:09.960 --> 00:53:18.580
+That is the, you know, that's the big, I'm not going to say Achilles heel because I'm super optimistic about Monty, but that's the, you know, the biggest challenge of Monty is, is that we don't just get to use all the libraries.
+
+00:53:18.920 --> 00:53:19.040
+Okay.
+
+00:53:19.100 --> 00:53:21.660
+Let me propose a slightly different path.
+
+00:53:21.900 --> 00:53:22.080
+Yep.
+
+00:53:22.360 --> 00:53:22.720
+Polars.
+
+00:53:23.040 --> 00:53:23.260
+Yep.
+
+00:53:23.420 --> 00:53:24.180
+Plus Narwhals.
+
+00:53:24.460 --> 00:53:25.120
+What's Narwhals?
+ +00:53:25.660 --> 00:53:37.480 +Narwhals is a, a facade API, a facade across NumPy, Polars, and a few other things that gives you, like you can program in either, and it'll talk to one or the other. + +00:53:37.560 --> 00:53:43.200 +So basically you could use Narwhals to talk NumPy, but it translates all the calls over to Polars. + +00:53:43.540 --> 00:53:43.720 +Yeah. + +00:53:43.960 --> 00:53:47.580 +I mean, given that, you know, there's a paradigm shift happening here. + +00:53:47.740 --> 00:53:52.620 +We, what we, what we're not trying to do is let your existing Python code run in this runtime. + +00:53:52.620 --> 00:53:55.820 +We're trying to give it a context for LLMs to be able to write code. + +00:53:56.000 --> 00:53:56.980 +And so why not? + +00:53:57.120 --> 00:53:58.320 +I mean, Polars is written in Rust. + +00:53:58.920 --> 00:53:59.360 +And exactly. + +00:53:59.520 --> 00:54:00.160 +That's why I said that. + +00:54:00.240 --> 00:54:00.300 +Yeah. + +00:54:00.440 --> 00:54:03.280 +Go and like compile Polars into Monty. + +00:54:03.280 --> 00:54:10.580 +And now you have a full, like very performant data frame library or, you know, analytical database effectively built into it. + +00:54:11.000 --> 00:54:16.120 +And you can, and we have the full Polars API available in, in Monty. + +00:54:16.200 --> 00:54:17.580 +That would be, that would be one option. + +00:54:19.040 --> 00:54:30.840 +Again, I'm going to be a bit restrictive and, you know, any color, as long as it's black about what we add, because I don't think, you know, we don't need, I don't care about your taste of whether you prefer Polars to Pandas or anything else. + +00:54:30.840 --> 00:54:32.820 +I care about what are the LLMs find easy to do. + +00:54:33.440 --> 00:54:38.900 +I think the biggest point of proof of that, Samuel, is that it doesn't do Pydantic yet. + +00:54:39.520 --> 00:54:39.680 +Yeah. 
+
+00:54:39.940 --> 00:54:44.160
+If it doesn't do Pydantic, like, okay, you, you're, you're walking the walk.
+
+00:54:44.480 --> 00:54:44.640
+Yeah.
+
+00:54:44.840 --> 00:54:50.200
+And, and I, to be clear, I don't think, yeah, am I going to vibe code a whole new Pydantic in Monty?
+
+00:54:50.260 --> 00:54:51.720
+I don't know whether I'm keen for that yet.
+
+00:54:53.100 --> 00:54:53.500
+Yeah.
+
+00:54:53.740 --> 00:54:54.280
+Yes, indeed.
+
+00:54:54.280 --> 00:55:03.060
+So how do I go about making my AI, like, let's say I'm doing Claude Code, Opus 4.6, some project.
+
+00:55:03.240 --> 00:55:06.840
+I'm actually not a huge fan of the terminal Claude Code.
+
+00:55:06.980 --> 00:55:09.920
+I feel like it takes me too far away from the code.
+
+00:55:10.180 --> 00:55:20.840
+Just, I prefer to kind of have it in the editor, like the extension for say Cursor or VS Code, where I can sort of like watch the code as it's going and sort of, no, no, no, you're going the wrong way.
+
+00:55:21.020 --> 00:55:22.940
+Anyway, it doesn't matter really which, how you run it.
+
+00:55:22.940 --> 00:55:25.260
+Suppose I'm running it somehow.
+
+00:55:25.660 --> 00:55:27.520
+How do I tell it about Monty?
+
+00:55:27.640 --> 00:55:29.820
+How does it know what Monty can and can't do?
+
+00:55:29.960 --> 00:55:31.140
+How do I make it use Monty?
+
+00:55:31.280 --> 00:55:31.680
+You know what I mean?
+
+00:55:32.040 --> 00:55:39.200
+You wait a few weeks for us to have skills for Monty and the rest of our stack, and then you install those skills.
+
+00:55:39.460 --> 00:55:40.440
+It's something we need to do.
+
+00:55:40.580 --> 00:55:41.480
+And I think that's the number.
+
+00:55:41.580 --> 00:55:44.360
+We will have proper documentation for Monty as well.
+
+00:55:44.500 --> 00:55:46.820
+And that will, that will be an important part of it.
+
+00:55:47.060 --> 00:55:48.680
+That's, yeah, there's a lot to do here.
+
+00:55:49.620 --> 00:55:52.360
+LLMs can help with some of it, but not, not by any means do all of it.
+
+00:55:52.360 --> 00:55:55.540
+I mean, at the moment, read the README and read the issues.
+
+00:55:55.620 --> 00:56:00.800
+And I'm, I am like impressed, surprised, scared by how much people are using Monty already.
+
+00:56:01.300 --> 00:56:02.680
+how much it's been picked up?
+
+00:56:02.680 --> 00:56:03.720
+It's already, what are you doing?
+
+00:56:04.080 --> 00:56:04.840
+You know what I saw?
+
+00:56:04.920 --> 00:56:08.660
+I saw your announcement of this on X actually is where I saw it.
+
+00:56:08.900 --> 00:56:17.500
+And I believe, it's been a little while since I saw it, but it said something to the effect of like, this is way too early, but what the heck, here we go.
+
+00:56:17.820 --> 00:56:19.120
+Posted the GitHub link, right?
+
+00:56:19.120 --> 00:56:20.100
+Something to that effect.
+
+00:56:20.300 --> 00:56:23.260
+And that was, what's that last week?
+
+00:56:23.560 --> 00:56:25.940
+Here we are with 5,000 stars.
+
+00:56:26.400 --> 00:56:26.560
+Yeah.
+
+00:56:26.920 --> 00:56:28.380
+yeah, exactly.
+
+00:56:28.640 --> 00:56:32.620
+And it shows how many people are, you know, are looking, are interested in this space.
+
+00:56:32.860 --> 00:56:36.400
+I mean, look, a lot of people would have started thinking, Oh, there's going to be a new Python.
+
+00:56:36.400 --> 00:56:37.180
+That's just faster.
+
+00:56:37.180 --> 00:56:44.200
+Cause it's in Rust and it's going to do everything better in a way that like, you might argue, you know, Ruff is like wholly better than what went before.
+
+00:56:44.380 --> 00:56:46.440
+That is not, that's not the aim for Monty.
+
+00:56:46.520 --> 00:56:48.520
+This is not going to supplant or replace in any way.
+
+00:56:48.560 --> 00:56:48.920
+CPython.
+
+00:56:49.060 --> 00:56:50.480
+It's a, it's a completely separate thing.
+
+00:56:50.480 --> 00:56:59.780
+But I think there's also a lot of people who have started this because they're running, they're having a headache running stuff in a, you know, with existing options for sandboxing and something like this is, is interesting.
+
+00:56:59.780 --> 00:57:06.940
+There's also, there's another project that's worth calling out from Vercel called Just Bash, which is very similar conceptually.
+
+00:57:07.100 --> 00:57:11.180
+It's a bash environment written entirely in TypeScript by, by a team.
+
+00:57:11.620 --> 00:57:22.200
+As I've said, I met them when I was in San Francisco a few weeks ago and the plan, when I get around to finishing the JavaScript API is that they will in fact use, Monty as the way of calling Python code.
+
+00:57:22.200 --> 00:57:26.980
+Cause they have some way of calling Python code within this, which I think uses Pyodide at the moment.
+
+00:57:26.980 --> 00:57:32.380
+And it has some, some overheads and some, some, challenges around, security.
+
+00:57:33.000 --> 00:57:43.500
+but yeah, this is very similar in the sense of like, it's basically vibe coding, all of the terminal methods that you might want, and using a bunch of existing unit tests to, to check that they're correct.
+
+00:57:43.880 --> 00:57:47.540
+interesting that obviously Vercel is a much, much bigger name than we are.
+
+00:57:47.540 --> 00:57:54.560
+And it hasn't got as much, like traction early on as at least in terms of GitHub stars, the, you know, the worst of all vanity metrics.
+
+00:57:54.980 --> 00:57:58.660
+they've been out like two or three times as long as you have, they've got 1,000 stars.
+
+00:57:58.660 --> 00:58:00.400
+That is, I mean, that's noteworthy, honestly.
+
+00:58:00.800 --> 00:58:00.960
+Yeah.
+ +00:58:01.200 --> 00:58:13.860 +And there's another project like this, which has about 20 stars, which I was looking at earlier today, which is this, but in rust completely, which already has support for Monty, which I can't remember the name of right now, but maybe I should find it quickly and call it out. + +00:58:13.860 --> 00:58:16.740 +Cause I feel like it deserves it given that it's a really cool project. + +00:58:16.900 --> 00:58:19.720 +It has, as I say about, 30 stars. + +00:58:20.040 --> 00:58:23.740 +let me very quickly, excuse me for one minute. + +00:58:23.740 --> 00:58:26.640 +It was one of the replies to my initial announcement. + +00:58:27.240 --> 00:58:28.420 +sorry. + +00:58:28.840 --> 00:58:30.980 +I will not be very long. + +00:58:31.260 --> 00:58:33.080 +it's called bash, bash kit. + +00:58:33.520 --> 00:58:36.380 +I put the, put the link here. + +00:58:37.000 --> 00:58:43.760 + this already actually has optional support for using Monty as the, as a Python, runtime. + +00:58:44.200 --> 00:58:48.460 +well, if I was logged into GitHub on my streaming machine, I would have one more star, but I'll do it later. + +00:58:49.240 --> 00:58:49.880 +Fair enough. + +00:58:49.960 --> 00:58:50.300 +Fair enough. + +00:58:50.300 --> 00:59:00.140 +But, but I think what's interesting is all of these three projects and I've heard of a few others, you know, these are only possible really, or they're only really challenges anyone would take on with the advantage of, of an AI. + +00:59:00.380 --> 00:59:02.040 +And so, so I was mentioning this earlier. + +00:59:02.080 --> 00:59:14.380 +I think there were three reasons why these things have, why I'll talk about Monty in particular, why it is possible now when it wasn't before and why it is something where the like speed up from an LLM is even greater than in most, most coding tasks. 
+ +00:59:14.780 --> 00:59:25.480 +One, the LLM, knows in its soul, in its weights, the internal implementation, how to go about implementing a bytecode interpreter or how to implement it. + +00:59:25.600 --> 00:59:30.720 +If I asked most even experienced Python engineers or Rust engineers, how do I write a bytecode interpreter? + +00:59:31.100 --> 00:59:33.540 +They would scratch their head and be like, yeah, I sort of know about this. + +00:59:33.600 --> 00:59:39.300 +I'll put my head up and say, I didn't know what a bytecode interpreter was or how they worked until I and Claude built one together. + +00:59:39.300 --> 00:59:44.200 +But like, they know exactly how to do it because they've read 15 different, well, well-trodden implementations. + +00:59:44.460 --> 00:59:45.600 +And it's got a great example. + +00:59:45.760 --> 00:59:50.000 +You can say, not just any, here's the Python, CPython one, just help me do that. + +00:59:50.120 --> 00:59:50.740 +Whatever that does. + +00:59:51.080 --> 00:59:57.160 +And the second thing is they know what the public interface is again, in their soul, as in they know what, what Python should be like. + +00:59:57.200 --> 01:00:01.460 +They know the signature of the filter function without you having to go and describe it. + +01:00:01.920 --> 01:00:05.940 +Thirdly, you have an amazing set of unit tests, which is basically just, does it match CPython? + +01:00:05.940 --> 01:00:14.080 +So in our case, we basically vibe generate tests whenever we're, whenever we're adding a feature and then we run them with CPython and Monty. + +01:00:14.200 --> 01:00:16.800 +And we confirm that they are identical output down to the byte. + +01:00:17.100 --> 01:00:19.780 +You know, the exceptions have to be identical to the, you know, to the byte. + +01:00:19.780 --> 01:00:32.160 +But in the case of just bash, they, they have the existing set of like some bash tests somewhere for like any shared environment that they're able to leverage. 
+
+01:00:32.260 --> 01:00:36.600
+And I think one thing we might do at some point is basically go steal a bunch of CPython tests and run them with both.
+
+01:00:36.680 --> 01:00:38.820
+I haven't got there yet, but that would be an interesting way ahead.
+
+01:00:39.040 --> 01:00:48.700
+And then the last thing is you don't have to bike shed or have any human debate about what should the, what should the function, what should the error message be when you try and add an int to a string?
+
+01:00:48.900 --> 01:00:50.080
+There's no, there's no debate about that.
+
+01:00:50.180 --> 01:00:51.580
+You're just doing whatever CPython does.
+
+01:00:51.660 --> 01:00:59.560
+And so there's a whole, whole range of bike shedding debates that we just don't have to go and have because we're just like trying to target CPython.
+
+01:00:59.660 --> 01:01:02.740
+Now, of course, around the edge of that, there's a bunch of places where we do have to think about it.
+
+01:01:02.760 --> 01:01:05.600
+Like how do we do these external function calling things?
+
+01:01:05.600 --> 01:01:12.780
+And that's, that is obviously, that is honestly much, much slower because we don't have this, like the LLM knows already the answer.
+
+01:01:12.780 --> 01:01:21.440
+approach, but I think these are the kinds of tasks where LLMs are massively faster or one, one set of cases where LLMs are massively faster than without.
+
+01:01:21.540 --> 01:01:34.560
+So I was speaking to a big public company in New York who was saying that one of their team had vibe coded a Redis clone in Rust, put it into production after 72 hours and it was 30% faster than Redis.
+
+01:01:34.760 --> 01:01:36.420
+And it probably worked fine, right?
+
+01:01:36.720 --> 01:01:36.880
+Yeah.
+
+01:01:37.100 --> 01:01:37.820
+And why is that possible?
+
+01:01:37.920 --> 01:01:39.000
+Well, the same things are all true.
+
+01:01:39.260 --> 01:01:40.360
+The unit test is super easy.
+
+01:01:40.360 --> 01:01:41.780
+It's just, is it the same as Redis?
+
+01:01:41.780 --> 01:01:43.920
+There's no debate about what the API is, et cetera, et cetera.
+
+01:01:44.020 --> 01:01:48.320
+And so there are these tasks, which historically we would have thought was super hard.
+
+01:01:48.660 --> 01:01:53.300
+So I think often we fall into the trap of thinking that what LLMs are good at is what humans are good at.
+
+01:01:53.340 --> 01:01:55.180
+And what LLMs are bad at is what humans are bad at.
+
+01:01:55.400 --> 01:01:59.220
+I think more and more, we're seeing there are things that LLMs are much better at than we are.
+
+01:01:59.240 --> 01:02:01.060
+And there are things that they are, that they're less good at.
+
+01:02:01.080 --> 01:02:10.160
+And we're still very early in learning what those things are, but it is not good enough just to be, just to use the like naive, simplistic approach of like what humans are good at, they're good at.
+
+01:02:10.360 --> 01:02:15.200
+The simplest example of that is like, ask an LLM to generate you a B-tree implementation in C.
+
+01:02:15.460 --> 01:02:20.140
+And with that prompt alone, it will write you 500 lines of C that work as a B-tree implementation.
+
+01:02:20.560 --> 01:02:23.280
+It takes you 20 minutes to study it, to be sure.
+
+01:02:23.380 --> 01:02:25.900
+And it's like, you're not, I think it works this way, right?
+
+01:02:26.120 --> 01:02:26.280
+Yeah.
+
+01:02:26.280 --> 01:02:34.880
+I honestly think the little, the bits of weird math and a little, the little hallucinations and stuff have shaken a lot of people's trust in these things.
+
+01:02:34.880 --> 01:02:38.540
+And it's just like, well, I'm, I mean, how easy is it to add five numbers?
+
+01:02:38.640 --> 01:02:39.180
+Come on.
+
+01:02:39.420 --> 01:02:41.360
+Obviously these things are junk because they can't do that.
+
+01:02:41.380 --> 01:02:44.320
+And it's just like, well, maybe that's not the tool to use for that situation.
+
+01:02:44.320 --> 01:02:44.640
+Right.
+
+01:02:44.840 --> 01:02:45.000
+Yeah.
+
+01:02:45.060 --> 01:02:46.700
+But, but what you're using here is incredible.
+
+01:02:47.000 --> 01:02:47.100
+Yeah.
+
+01:02:47.100 --> 01:02:51.700
+But again, we have the guardrails of you must write unit tests all the time that match the two.
+
+01:02:51.800 --> 01:02:53.520
+I mean, well, or we have fuzzing going on.
+
+01:02:53.600 --> 01:02:55.000
+The fuzzing is another amazing technique.
+
+01:02:55.180 --> 01:03:09.480
+So we use, so we have a JSON parser called Jiter, which is about the fastest JSON parser in Rust that we also is built into, Pydantic core, but it's also actually independently a package in, in PyPI that's used an awful lot.
+
+01:03:09.540 --> 01:03:12.360
+You'll see it in the dependencies of OpenAI, for example.
+
+01:03:12.800 --> 01:03:15.720
+but Jiter was where we, I discovered about fuzzing really.
+
+01:03:15.820 --> 01:03:27.000
+No, I found out about it through, the Hypothesis project of, my friend, Zac Hatfield-Dodds in Python, but then fuzzing in Rust because the performance is so much better is, is, is really powerful.
+
+01:03:27.000 --> 01:03:35.200
+So basically it's generating random strings and using them as an input to something, but then it's using very clever stochastic techniques to work out where to try more things.
+
+01:03:35.200 --> 01:03:41.320
+And so you can basically fuzz, Monty, you can just give it arbitrary strings for hour after hour.
+
+01:03:41.600 --> 01:03:46.020
+And periodically it'll find something where there's an error where like the memory usage is too high.
+
+01:03:46.020 --> 01:03:49.540
+If you do the following sequence of multiplying integers together.
+
+01:03:49.540 --> 01:03:59.740
+I don't think it will find a like true read the file system vulnerability, but it'll definitely find like odd memory uses or it has found, stack overflows and panics and things like that.
+
+01:03:59.740 --> 01:04:01.940
+Well, I think people are excited about it.
+
+01:04:02.200 --> 01:04:07.340
+It's definitely got a lot of people talking, a lot of attention, a lot of, a lot of comments in the live stream.
+
+01:04:07.500 --> 01:04:08.400
+So congrats.
+
+01:04:08.580 --> 01:04:10.820
+And yeah, keep us posted on where it goes.
+
+01:04:11.160 --> 01:04:11.640
+And we'll do.
+
+01:04:11.920 --> 01:04:12.440
+Thank you very much.
+
+01:04:12.800 --> 01:04:12.920
+Yeah.
+
+01:04:12.920 --> 01:04:14.120
+Thanks so much for having me.
+
+01:04:14.140 --> 01:04:14.380
+You bet.
+
+01:04:14.580 --> 01:04:14.720
+Bye.
+
+01:04:15.940 --> 01:04:18.320
+This has been another episode of Talk Python To Me.
+
+01:04:18.440 --> 01:04:19.440
+Thank you to our sponsors.
+
+01:04:19.600 --> 01:04:20.900
+Be sure to check out what they're offering.
+
+01:04:21.020 --> 01:04:22.440
+It really helps support the show.
+
+01:04:22.860 --> 01:04:27.200
+This episode is brought to you by our agentic AI programming for Python course.
+
+01:04:27.200 --> 01:04:32.280
+Learn to work with AI that actually understands your code base and build real features.
+
+01:04:32.760 --> 01:04:36.300
+Visit talkpython.fm/agentic dash AI.
+
+01:04:36.600 --> 01:04:49.060
+If you or your team needs to learn Python, we have over 270 hours of beginner and advanced courses on topics ranging from complete beginners to async code, Flask, Django, HTMX, and even LLMs.
+
+01:04:49.300 --> 01:04:51.720
+Best of all, there's no subscription in sight.
+
+01:04:52.160 --> 01:04:53.900
+Browse the catalog at talkpython.fm.
+
+01:04:54.560 --> 01:04:59.240
+And if you're not already subscribed to the show on your favorite podcast player, what are you waiting for?
+
+01:04:59.840 --> 01:05:01.720
+Just search for Python in your podcast player.
+
+01:05:01.820 --> 01:05:02.680
+We should be right at the top.
+
+01:05:02.820 --> 01:05:06.000
+If you enjoy that geeky rap song, you can download the full track.
+
+01:05:06.100 --> 01:05:08.000
+The link is actually in your podcast player's show notes.
+
+01:05:08.000 --> 01:05:10.120
+This is your host, Michael Kennedy.
+
+01:05:10.320 --> 01:05:11.620
+Thank you so much for listening.
+
+01:05:11.800 --> 01:05:12.600
+I really appreciate it.
+
+01:05:13.000 --> 01:05:13.760
+I'll see you next time.
+
+01:05:24.000 --> 01:05:25.200
+I thought of me.
+
+01:05:26.260 --> 01:05:27.780
+Get we ready to roll.
+
+01:05:29.280 --> 01:05:30.580
+Upgrade the code.
+
+01:05:31.280 --> 01:05:33.000
+No fear of getting old.
+
+01:05:33.000 --> 01:05:36.620
+We tapped into that modern vibe.
+
+01:05:36.620 --> 01:05:37.980
+Overcame each storm.
+
+01:05:38.700 --> 01:05:39.980
+Talk Python To Me.
+
+01:05:40.100 --> 01:05:41.400
+Async is the norm.
diff --git a/transcripts/542-zensical-a-modern-static-site-generator-transcript.txt b/transcripts/542-zensical-a-modern-static-site-generator-transcript.txt
new file mode 100644
index 0000000..dee4aa5
--- /dev/null
+++ b/transcripts/542-zensical-a-modern-static-site-generator-transcript.txt
@@ -0,0 +1,1140 @@
+00:00:00 If you've built documentation in the Python ecosystem, chances are you've used Martin Donath's work.
+
+00:00:05 His Material for MKDocs powers docs for FastAPI, uv, AWS, OpenAI, and tens of thousands of other projects.
+
+00:00:13 When MKDocs 2.0 took a direction that would break Material and 300 ecosystem plugins, Martin went back to the drawing board.
+
+00:00:20 The result is Zensical, a new static site generator with a Rust core, differential builds in milliseconds instead of minutes, and a migration path designed to bring the whole community along.
+
+00:00:31 This is Talk Python To Me, episode 542, recorded February 17th, 2026.
+
+00:00:37 Welcome to Talk Python To Me, the number one Python podcast for developers and data scientists.
+
+00:00:59 This is your host, Michael Kennedy.
+
+00:01:01 I'm a PSF fellow who's been coding for over 25 years.
+
+00:01:05 Let's connect on social media.
+
+00:01:06 You'll find me and Talk Python on Mastodon, BlueSky, and X.
+
+00:01:10 The social links are all in your show notes.
+
+00:01:12 You can find over 10 years of past episodes at talkpython.fm.
+
+00:01:16 And if you want to be part of the show, you can join our recording live streams.
+
+00:01:19 That's right, we live stream the raw, uncut version of each episode on YouTube.
+
+00:01:24 Just visit talkpython.fm/youtube to see the schedule of upcoming events.
+
+00:01:29 Be sure to subscribe there and press the bell so you'll get notified anytime we're recording.
+
+00:01:33 This episode is brought to you by Sentry.
+
+00:01:34 You know Sentry for the error monitoring, but they now have logs too.
+
+00:01:39 And with Sentry, your logs become way more usable, interleaving into your error reports to enhance debugging and understanding.
+
+00:01:46 Get started today at talkpython.fm/sentry.
+
+00:01:50 Martin, welcome to Talk Python To Me.
+
+00:01:52 Great to have you here.
+
+00:01:53 Thanks for having me.
+
+00:01:53 I'm excited to talk about static sites and the next big platform for building them here in Python and beyond.
+
+00:02:03 So really excited to talk about Zensical.
+
+00:02:06 Am I saying that right?
+
+00:02:08 Yeah, pretty much.
+
+00:02:09 Zensical.
+
+00:02:09 Zensical.
+
+00:02:10 Okay.
+
+00:02:11 Yeah.
+
+00:02:11 Great.
+
+00:02:11 Yeah.
+
+00:02:12 I know MKDocs, the Material for MKDocs, has been really, really popular.
+
+00:02:17 And you all have made a big splash announcing this new project.
+
+00:02:22 So I'm really looking forward to diving into it.
+
+00:02:24 Before we do, though, let's just get a little bit of background on you.
+
+00:02:28 Who is Martin?
+
+00:02:28 Of course.
+
+00:02:29 So hi, my name is Martin Donath.
+
+00:02:31 Most people probably know me as Squidfunk.
+
+00:02:34 I've been an independent developer and consultant for the last 20 years now.
+
+00:02:39 And I mostly write in TypeScript, Python, and lately a lot of Rust.
+
+00:02:44 So I've become a huge fan of Rust, actually.
+
+00:02:48 I'm kind of a free spirit.
+
+00:02:50 So I love doing my own thing and building products from front to back, basically.
+
+00:02:55 So doing the front end as well as the back end.
+
+00:02:57 And for the past 15 years, I contributed a lot to open source.
+
+00:03:01 As already mentioned, my most popular project so far is Material for MKDocs.
+
+00:03:06 And, well, millions of people basically look at sites that are built with it every day.
+
+00:03:14 Yeah.
+
+00:03:14 Well, and Zensical, my latest project, will hopefully go far beyond that.
+
+00:03:18 So we're working very hard on it.
+
+00:03:19 And this is why I'm here today.
+
+00:03:20 So excited to talk about it.
+
+00:03:22 Yeah, I am as well.
+
+00:03:24 Well, and let's just start by admiring your website a little bit.
+
+00:03:30 Thanks.
+
+00:03:31 Brian and I spoke about this over on our Python Bytes podcast.
+
+00:03:35 And we kind of just got distracted just staring at the website.
+
+00:03:39 It's this beautiful flow of, I don't know, colors.
+
+00:03:44 It looks a little bit like a black hole worm, a white wormhole sort of experience.
+
+00:03:48 I don't know.
+
+00:03:48 What was the inspiration there with this cool design?
+
+00:03:51 Yeah, this is actually a strange attractor.
+
+00:03:53 So this is something from physics.
+
+00:03:55 I'm not very, very proficient in physics, but those strange attractors, I had a fascination for them for a very long time.
+
+00:04:03 And they follow very simple rules.
+
+00:04:06 So it's just three equations that define how their points move in three-dimensional space.
+
+00:04:12 And yeah, but still, with those simple rules, a very complex shape can emerge.
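The "three equations moving points in three-dimensional space" that Martin describes is exactly the usual recipe for drawing strange attractors. As a hedged illustration, here is the classic Lorenz system traced with a naive Euler step; the actual zensical.org animation uses its own attractor equations and coefficients, which this snippet does not reproduce.

```python
# Classic Lorenz strange attractor, integrated with a simple Euler step.
# Used only to illustrate "three simple equations, complex shape";
# the zensical.org homepage uses its own attractor and coefficients.
def lorenz(steps=10_000, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    x, y, z = 1.0, 1.0, 1.0
    points = []
    for _ in range(steps):
        dx = sigma * (y - x)          # the three coupled equations
        dy = x * (rho - z) - y
        dz = x * y - beta * z
        x, y, z = x + dx * dt, y + dy * dt, z + dz * dt
        points.append((x, y, z))
    return points

pts = lorenz()
# The orbit stays bounded but never exactly repeats; nudging rho or
# beta (as the homepage easter egg does) changes the shape's character.
print(len(pts))
```

Plotting the returned points in 3D (for example with matplotlib) shows the familiar butterfly-shaped orbit; small changes to the coefficients can tip it from a stable orbit into chaos, which is the behavior described for the homepage animation.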
+
+00:04:18 And this, for us, actually symbolizes the process of evolving ideas through writing.
+
+00:04:24 So if you have slightly different conditions from the start, it's still orbiting around the same shape, but it might look a little bit different.
+
+00:04:33 And there's actually, I can share this now, there's actually a little Easter egg.
+
+00:04:37 Nobody has found it so far.
+
+00:04:38 So if you hover over the homepage on zensical.org with the mouse in the left bottom corner, you can actually change the coefficients of the animation.
+
+00:04:52 And if you do this, you can click on them and then you can use your cursor.
+
+00:04:56 I'm changing beta.
+
+00:04:58 We're running beta 0.22 right now.
+
+00:05:01 Oh, it really does change it.
+
+00:05:02 Yeah.
+
+00:05:02 Oh my goodness.
+
+00:05:03 Yeah.
+
+00:05:04 So it takes a little time, but if you change the coefficients in a specific way, it might be completely chaotic and become unstable.
+
+00:05:15 So this is what I really find fascinating about those strange attractors.
+
+00:05:18 And it's also the inspiration for the logo.
+
+00:05:20 So we're building on this image a lot.
+
+00:05:23 Okay.
+
+00:05:24 I thought it was just a cool design.
+
+00:05:25 I didn't realize it had all this meaning and actual math and physics behind it.
+
+00:05:30 That's super cool.
+
+00:05:31 I love chaos theory and all of these fractal types of ideas here.
+
+00:05:36 And yeah, it's super neat.
+
+00:05:38 Okay.
+
+00:05:39 So what is Zensical?
+
+00:05:40 Why did you build it?
+
+00:05:41 Why not just more Material?
+
+00:05:43 So there are a lot of questions in there, actually.
+
+00:05:45 Maybe let me just start by shortly speaking about what it is.
+
+00:05:49 So in very simple terms, it's a tool to build beautiful websites from a folder of text files.
+
+00:05:55 So you just write Markdown and can generate a static site.
+
+00:05:59 You don't need a database for it.
+
+00:06:00 So to those that don't know what a static site is, you don't need a database or server.
+
+00:06:05 It's just static HTML, which means you just pip install zensical and you're ready to go within a few minutes.
+
+00:06:11 And it's fully open source, MIT licensed.
+
+00:06:14 And to maybe explain a little bit more about static sites.
+
+00:06:17 So the big benefit of it, you can host it for free in many places.
+
+00:06:21 For instance, on GitHub Pages or Cloudflare.
+
+00:06:23 And they're secure and fast by default because there's only static file serving involved.
+
+00:06:28 And Zensical.
+
+00:06:28 So we try to make it pretty, with a modern design, many built-in features, and fun, according to the feedback of our users, which is kind of unusual for writing documentation.
+
+00:06:37 So, yeah.
+
+00:06:38 Yeah.
+
+00:06:39 Very cool.
+
+00:06:40 And if anyone's tried to manually create a static site, it quickly becomes a challenge if you're just writing.
+
+00:06:50 Say, hey, it's only five HTML pages.
+
+00:06:52 I can just write the HTML.
+
+00:06:54 You know what I mean?
+
+00:06:54 But, well, what if you want to have common navigation or you want to change the look and feel?
+
+00:07:00 You know, oh, well, now I've got to go edit that in five places, right?
+
+00:07:04 And so basically beyond one page, having something that generates the static site is super valuable, right?
+
+00:07:13 Because it'll generate the wrapper navigation, the common CSS, the footer, all those kinds of things, right?
+
+00:07:21 Yes.
+
+00:07:21 So it depends on what you want to do.
+
+00:07:23 So, of course, if you have a small site, like a personal website or so, you can just write basic HTML if you're proficient in it.
+
+00:07:30 For instance, of the users of Material, only 7% of them are front-end developers.
+
+00:07:38 We will dive a little bit into how Zensical relates to Material later.
+
+00:07:43 And what Zensical is being used for primarily is for documentation.
+
+00:07:47 So it builds on the docs-as-code philosophy, which means that you treat your documentation exactly like your source code.
+
+00:07:55 So you primarily write documentation.
+
+00:07:57 You don't want to fight front-end development problems.
+
+00:08:01 You just want to get the content out.
+
+00:08:04 And the cool thing about docs-as-code is you can use the same tools and processes and workflows that you use for code, like versioning and PRs, to make changes.
+
+00:08:16 And the adoption is growing really fast, actually, among companies in recent years as they're moving away from proprietary tools to open source solutions.
+
+00:08:24 So Zensical is for you, or a static site generator in general is for you, if you just want to get your writing out.
+
+00:08:32 And of course, you can also customize it and make it as pretty as you want, but you don't necessarily need to know HTML, CSS, and JavaScript.
+
+00:08:41 And that's quite difficult.
+
+00:08:43 And you talked about writing, and you even have your metaphor with strange attractors.
+
+00:08:47 I personally find if I'm just in a clean space where it's really just about the ideas, I don't have to worry about the design.
+
+00:08:56 It makes it so much easier to just focus on the actual writing.
+
+00:09:00 You know, you're in a Markdown editor.
+
+00:09:01 My favorite is Typora, but you can use whatever variety that you want, right?
+
+00:09:06 And you're just there.
+
+00:09:08 You're not worried even hardly about the formatting of the Markdown.
+
+00:09:10 You're just writing.
+
+00:09:11 And I find that a very good creative space, I guess.
+
+00:09:15 Yeah, that's the beauty of Markdown.
+
+00:09:16 So you can just write, as you mentioned.
+
+00:09:19 And how you, in the end, use it, you can still decide afterwards.
+
+00:09:24 So if you want to build a website, if you want to create a PDF of it, if you just want to use it for internal note-taking or so.
+
+00:09:30 And this is the big benefit of Markdown.
+
+00:09:34 It takes away a lot of the headache of having to remember a lot of markup in order to get your ideas out of the door.
+
+00:09:42 Can you actually put markup in it if you need to?
+
+00:09:44 You know, for example, maybe you need a particular image, two of them side by side that are links, and you want them to open in a new tab if somebody clicks them.
+
+00:09:54 Can you set it into basically an unsafe mode and let it do embedded markup?
+
+00:09:58 Yeah, that's a great question.
+
+00:10:00 So, yes, it's possible.
+
+00:10:01 You can just use HTML within Markdown.
+
+00:10:04 We currently depend on Python Markdown, which we inherited from Material for MKDocs.
+
+00:10:09 We are gradually moving towards CommonMark, because, just as context, Python Markdown has some oddities when you use HTML within Markdown.
+
+00:10:18 For instance, it won't replace relative URLs correctly.
+
+00:10:23 This is like an annoying thing.
+
+00:10:25 But once we move to CommonMark, we will also have predefined components that you can use, because you can't express everything, like more complex things, in plain Markdown.
+
+00:10:36 So there are only things like you can make text bold, you can have lists, tables, etc. But if it's more complex, as you mentioned, aligning two images or having an image with a caption or so, you basically need HTML.
+
+00:10:51 And this is possible already, but we will make it much easier in the future.
+
+00:10:54 The front-end world already knows this.
+
+00:10:56 So they use MDX.
+
+00:10:57 They've been using MDX for quite a while, which is a dialect on top of Markdown, which adds more liberty with components and so on.
+
+00:11:06 So you can create reusable components that you can use.
+
+00:11:10 Yeah.
+
+00:11:10 But, yeah.
+
+00:11:11 So it's possible.
+
+00:11:13 Our users already do it, too.
+
+00:11:16 We also have some examples in the documentation, and we will make it much more powerful in the future.
+
+00:11:21 Yeah.
+
+00:11:21 Very nice.
+
+00:11:22 I do think, you know, regular Markdown is just missing a few things.
+
+00:11:27 I love the simplicity of it.
+
+00:11:29 And, you know, hat tip to John Gruber for creating it.
+
+00:11:32 But it's just like, I just need to maybe put a class here, or if I could just control this a little bit more, then you wouldn't have to escape out to HTML.
+
+00:11:42 Obviously being careful to not just recreate HTML with square brackets instead of angle brackets, right?
+
+00:11:47 Yeah, there's been a lot of work on Python Markdown.
+
+00:11:49 So in Python Markdown, there are some extensions that allow you to add classes, at least to block elements.
+
+00:11:54 So in Markdown, you need to distinguish between inline and block elements.
+
+00:12:00 Oh, no, it also works.
+
+00:12:01 Sorry.
+
+00:12:01 It also works on inline elements, like links and so on.
+
+00:12:03 But this is special syntax.
+
+00:12:05 So Python Markdown is a dialect that is not standardized like CommonMark.
+
+00:12:09 In CommonMark, this is not easily possible, to add specific classes.
+
+00:12:13 But with CommonMark, as I mentioned, you have MDX, which is a de facto standard.
+
+00:12:18 I don't know if they've standardized it already.
+
+00:12:20 That allows for much, much more.
+
+00:12:22 So what is Zensical for?
+
+00:12:24 Is this a documentation generating tool?
+
+00:12:28 Is it just an open-ended static site generator?
+
+00:12:31 What is possible, and what is your goal or your target with this project?
+
+00:12:38 Yeah.
+
+00:12:38 So as I mentioned, right now we're focusing on documentation.
+
+00:12:42 Because this is the thing we're coming from.
+
+00:12:45 But we're building Zensical for much, much more.
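For reference, here is roughly what the two mechanisms discussed above look like side by side: raw HTML passing through untouched, and Python Markdown's attr_list extension (enabled via `markdown.markdown(text, extensions=["attr_list"])`) attaching classes and attributes with a `{: ... }` suffix. The class names below are made up for illustration.

```markdown
<!-- Raw HTML can be embedded directly in Markdown -->
<img src="left.png" alt="Left"> <img src="right.png" alt="Right">

[External link](https://example.com){: .external target="_blank" }

A paragraph that picks up a class from the marker on its last line.
{: .lead }
```

As noted in the conversation, this `{: ... }` syntax is a Python Markdown dialect feature; plain CommonMark has no standardized equivalent, which is where MDX-style components come in.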
+
+00:12:47 So our stretch goal is to have like a fully-fledged knowledge management and documentation solution.
+
+00:12:54 There are already a lot of companies that use it internally for knowledge management.
+
+00:13:00 Basically, as an alternative to a SaaS-based solution like Confluence and Notion. We are aware that for this we need WYSIWYG.
+
+00:13:08 So what you see is what you get.
+
+00:13:09 A visual editor that is also usable by non-technical people.
+
+00:13:12 And if you check out our roadmap and scroll down all the way, you will see it as a stretch goal, which is basically something we're working towards.
+
+00:13:23 Because this would actually allow so many more people within organizations to use it.
+
+00:13:30 And in general, with Zensical, we focus on three key areas that make us different from other static site generators, the first being, well, a modern design.
+
+00:13:42 So, of course, some also have a modern design.
+
+00:13:44 But within the Python ecosystem, some options might look a little bit dated or a little bit...
+
+00:13:52 So we try to be a little bit more on the edge, actually.
+
+00:13:55 And it should be flexible and it should be fast.
+
+00:13:58 So those three things.
+
+00:13:59 Because the design, actually, is the thing that people notice first.
+
+00:14:04 So what we offer is a design that is customizable, brandable.
+
+00:14:09 You have tons of options with which you can change how navigation is laid out, and you can also change colors, fonts, etc. And we have a lot of components that make it ready for technical writing.
+
+00:14:22 As you mentioned, you just want to start writing.
+
+00:14:24 So we have stuff like admonitions, tabs.
+
+00:14:28 And one very specific feature that we have is code annotations, which we inherited from Material for MKDocs, which is quite unique among static site generators, and which allows you to put a little bubble onto any line of code.
+
+00:14:43 You have to visit our documentation.
+
+00:14:44 This is our...
+
+00:14:45 You're currently browsing our...
+
+00:14:47 The other side.
+
+00:14:49 All right, all right.
+
+00:14:49 Hold on.
+
+00:14:50 I got it.
+
+00:14:50 Keep going.
+
+00:14:51 I'll get to say.
+
+00:14:52 Right, right.
+
+00:14:52 No worries.
+
+00:14:53 Yeah.
+
+00:14:54 And there you have to search for code annotations.
+
+00:14:56 Yeah.
+
+00:14:57 So code annotations, which allow you to create a bubble in any line of code.
+
+00:15:03 And if you click that bubble, a tooltip opens.
+
+00:15:06 And within this tooltip, you can use any rich content.
+
+00:15:08 So you can have lists, any formatted Markdown, tables, diagrams, basically anything you can use anyway within Markdown.
+
+00:15:18 And this is a very popular feature in Material.
+
+00:15:20 And so, of course, we brought it over.
+
+00:15:23 So users can still use it.
+
+00:15:25 So the second thing I talked about is it should be flexible.
+
+00:15:29 So what makes Zensical different is we have a modular architecture, or, say, we're working towards a modular architecture.
+
+00:15:34 We're still in alpha.
+
+00:15:36 So we're close to finishing the module system.
+
+00:15:40 And in Zensical, it's modules all the way down, which means all core functionality is implemented as modules, which is different from other solutions where the plugin system sometimes is more or less an afterthought.
+
+00:15:53 So there's a plugin system added with specific hooks, extension points where you can hook into.
+
+00:15:57 And this might seem sufficient at first, but in the end, for us, for instance, MKDocs was a little bit limiting.
+
+00:16:07 And this allows you to basically swap, extend, replace all modules.
+
+00:16:11 You can use our modules.
+
+00:16:12 You can write your own, pull in third-party modules.
+
+00:16:14 And as I mentioned, Rust.
+
+00:16:16 So don't worry.
+
+00:16:17 You don't need to learn Rust.
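To make "modules all the way down" concrete, here is a toy Python sketch of the distinction being drawn: a registry where core stages go through the same interface third parties use, so any stage can be replaced wholesale rather than merely hooked into at fixed extension points. All names here are invented for illustration; this is not Zensical's actual module API.

```python
# Toy module registry: core stages are registered exactly like
# third-party ones, so any of them can be swapped wholesale.
# All names are invented; Zensical's real module system differs.
class Pipeline:
    def __init__(self):
        self.modules = {}

    def register(self, name, fn):
        self.modules[name] = fn  # adding and replacing use the same call

    def run(self, order, value):
        for name in order:
            value = self.modules[name](value)
        return value

p = Pipeline()
p.register("parse", lambda md: md.strip())        # "core" stage
p.register("render", lambda md: f"<p>{md}</p>")   # "core" stage
print(p.run(["parse", "render"], "  hello  "))    # <p>hello</p>

# A third-party module replaces a core stage, not just a hook point:
p.register("render", lambda md: f"<article>{md}</article>")
print(p.run(["parse", "render"], "  hello  "))    # <article>hello</article>
```

The contrast with a hooks-only plugin system is that here nothing is privileged: the built-in renderer and a replacement go through the same `register` call, which is the property described as avoiding the "afterthought" plugin design.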
+
+00:16:18 You will also be able to write modules in Python, because we are super happy users of PyO3, which is an absolutely amazing library.
+
+00:16:24 And PyO3 has really become a super important foundation of Python these days.
+
+00:16:30 It's almost like the C bindings for CPython.
+
+00:16:35 Exactly.
+
+00:16:35 So, yeah.
+
+00:16:36 So PyO3 allows us to have a Rust runtime.
+
+00:16:40 So all of the orchestration, so in which order things are run, threading, caching, parallelization, etc., is all happening in Rust.
+
+00:16:48 And we will provide Python bindings so that you still can use Python to write modules.
+
+00:16:53 And they're still running fast.
+
+00:16:55 This portion of Talk Python To Me is brought to you by Sentry.
+
+00:16:59 You know Sentry for their great error monitoring.
+
+00:17:02 But let's talk about logs.
+
+00:17:03 Logs are messy.
+
+00:17:05 Trying to grep through them and line them up with traces and dashboards just to understand one issue isn't easy.
+
+00:17:11 Did you know that Sentry has logs too?
+
+00:17:14 And your logs just became way more usable.
+
+00:17:16 Sentry's logs are trace-connected and structured, so you can follow the request flow and filter by what matters.
+
+00:17:22 And because Sentry surfaces the context right where you're debugging, the trace, relevant logs, the error, and even the session replay all land in one timeline.
+
+00:17:31 No timestamp matching.
+
+00:17:33 No tool hopping.
+
+00:17:34 From front-end to mobile to back-end, whatever you're debugging, Sentry gives you the context you need so you can fix the problem and move on.
+
+00:17:40 More than 4.5 million developers use Sentry, including teams at Anthropic and Disney+.
+
+00:17:45 Get started with Sentry logs and error monitoring today at talkpython.fm/sentry.
+
+00:17:51 Be sure to use our code, talkpython26.
+
+00:17:54 The link is in your podcast player's show notes.
+
+00:17:56 Thank you to Sentry for supporting the show.
+
+00:17:59 Which brings me to the last point where we're different.
+
+00:18:02 We have a very heavy focus on performance, so our goal is to let you start with one page, because, of course, all documentation sites, all projects start small, and let you scale that to something like 100,000 pages.
+
+00:18:15 How we do it is through differential builds.
+
+00:18:18 We have created our own runtime, which is called ZRX, and differential builds mean that we are only rebuilding what changed.
+
+00:18:26 So, for instance, if you only change the page title, only that page and all instances where the page title is used are being rebuilt.
+
+00:18:32 And this means that changes are visible in milliseconds and not minutes.
+
+00:18:35 Yeah, that's super cool.
+
+00:18:37 And so I'm presuming the build system itself is Rust-based, right?
+
+00:18:41 Yeah, exactly.
+
+00:18:42 It's 100% Rust, yeah.
+
+00:18:43 Yeah, yeah.
+
+00:18:44 Coming from a Python background, what was that experience like building that?
+
+00:18:49 Yeah, so that's kind of a tricky question, because I'm not really coming from a long history.
+
+00:18:57 So I don't have a long Python background.
+
+00:19:00 I wrote mainly in TypeScript, and I only started writing Python in 2021.
+
+00:19:07 So this is actually the history of how Material started, and how all of this unfolded.
+
+00:19:14 But I've written in several languages.
+
+00:19:18 So I also have written in C, Erlang, Ruby, Python, TypeScript.
+
+00:19:22 Rust was still extremely hard to learn.
+
+00:19:24 So I basically banged my head against the keyboard for a month and wasn't making any progress at all, because, yeah, you know, fighting with the borrow checker.
+
+00:19:31 And once you get past that, and then, of course, lifetimes and higher-ranked trait bounds and some other features, I'm now somewhere like 3,000 hours, 4,000 hours in, something like that.
+
+00:19:46 It gets really good.
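The differential-build idea Martin describes (track which outputs read which inputs, then rebuild only the affected outputs when an input changes, so that editing one page title re-renders just that page and the places the title is used) can be sketched in a few lines of Python. This is a toy illustration of the concept only; Zensical's actual ZRX runtime is Rust, and all names here are invented.

```python
# Toy differential build: record output -> input dependencies, and on a
# change rebuild only outputs that read the changed input. Invented
# names for illustration; not Zensical's actual ZRX runtime.
class DiffBuilder:
    def __init__(self):
        self.deps = {}      # output -> (inputs it reads, render fn)
        self.dirty = set()  # outputs that need a rebuild
        self.built = {}     # output -> rendered result

    def register(self, output, inputs, render):
        self.deps[output] = (set(inputs), render)
        self.dirty.add(output)

    def invalidate(self, changed_input):
        for output, (inputs, _) in self.deps.items():
            if changed_input in inputs:
                self.dirty.add(output)

    def build(self, sources):
        rebuilt = sorted(self.dirty)
        for output in rebuilt:
            inputs, render = self.deps[output]
            self.built[output] = render({k: sources[k] for k in inputs})
        self.dirty.clear()
        return rebuilt

# A page title feeds both the page itself and a shared nav index.
sources = {"a.md": "Alpha", "b.md": "Beta"}
b = DiffBuilder()
b.register("a.html", ["a.md"], lambda s: f"<h1>{s['a.md']}</h1>")
b.register("b.html", ["b.md"], lambda s: f"<h1>{s['b.md']}</h1>")
b.register("nav.html", ["a.md", "b.md"],
           lambda s: " | ".join(sorted(s.values())))
b.build(sources)               # first build renders everything
sources["a.md"] = "Alpha v2"   # edit one page title...
b.invalidate("a.md")
print(b.build(sources))        # ...and only a.html and nav.html rebuild
```

The second build skips `b.html` entirely, which is where the milliseconds-instead-of-minutes behavior comes from at scale: work is proportional to what changed, not to the size of the site.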
+
+00:19:47 So I think Rust is seriously one of the best languages ever made, because it allows you to express ideas with extreme clarity.
+
+00:19:56 And this is due to the very good type system, of course, and you get bare metal performance.
+
+00:20:04 So I find it kind of insane having a language like Rust, because it's so easy to write once you're used to it.
+
+00:20:12 You will be very productive and still have bare metal performance.
+
+00:20:17 It's completely insane.
+
+00:20:18 Yeah, that's wild.
+
+00:20:19 But it's got a little bit of a learning curve compared to, like, Python or TypeScript or something like that.
+
+00:20:24 Yeah, so I had, I think, 18 years of experience with many languages.
+
+00:20:30 As I mentioned, I also did a lot of C, and I still found it very hard to learn.
+
+00:20:36 But it's worth it.
+
+00:20:38 It's worth it.
+
+00:20:39 And my recommendation probably would be to learn it on something that you really care about and want to build, because otherwise you will probably lose the drive, since you're running against those walls.
+
+00:20:55 Maybe for you, or for somebody else, it's much easier to learn.
+
+00:20:59 So maybe I'm just a bad example, that I needed so long.
+
+00:21:04 I don't know.
+
+00:21:04 Because after that month, it wasn't that I was completely up to speed.
+
+00:21:09 It was just, I was making very, very tiny progress, at least progress, because for a month I wasn't making progress at all.
+
+00:21:16 The next show that I'm doing after this one, which actually, in real clock time, wall time, is happening in like two hours or less from now, is with Samuel Colvin from Pydantic, talking about Monty,
+
+00:21:30 a Python runtime.
+
+00:21:32 He and his team are writing it in Rust, specifically targeting AI.
+
+00:21:37 So the Rust theme will continue.
+
+00:21:40 It definitely caught me a little bit off guard, like, how much people love it, but it also, you know, makes perfect sense that we want this nice modern language for writing lower-level things, even if it plugs into Python, right?
+
+00:21:54 Yeah.
+
+00:21:54 So the fun thing is, I also talked to Samuel a long time ago, and he was the one recommending to me to write it in Rust.
+
+00:22:02 It's one of the reasons I looked into it, and it made a lot of sense, also given the progress we were making and the walls we were hitting at the time, to reconsider and learn Rust.
+
+00:22:16 Best investment.
+
+00:22:17 Yeah.
+
+00:22:18 Yeah.
+
+00:22:18 Amazing.
+
+00:22:19 Amazing.
+
+00:22:19 So I want to dig into your component structure and some of those things, but maybe before we do, let's talk about the origins a little bit.
+
+00:22:28 So let's talk about how you went from Material for MKDocs.
+
+00:22:33 Why, why even change?
+
+00:22:34 Why not just more Material?
+
+00:22:36 Yeah.
+
+00:22:37 So this is a great question, and this is a little bit of a story.
+
+00:22:41 So there are several stories in there, actually, because it's 10 years.
+
+00:22:45 I'll try to make it as compact as possible while keeping the most important things.
+
+00:22:51 So to those who don't know, Material for MKDocs is a very popular documentation framework.
+
+00:22:55 It's used by tens of thousands of projects.
+
+00:22:57 There are prominent users like AWS, Microsoft, OpenAI, and also large open source projects use it.
+
+00:23:04 Like, for instance, FastAPI, uv, Knative, and it's built on top of MKDocs, as the name says, which became one of the most popular static site generators.
+
+00:23:13 And it also eventually became my job.
+
+00:23:15 So I could make it my job.
+
+00:23:17 I could work in open source and earn a living somehow.
+
+00:23:21 I'm getting there, how that worked.
+
+00:23:24 But at some point we needed a new foundation, and we'd, like, kind of outgrown MKDocs, because it was not evolving at the pace that we needed.
+
+00:23:32 So we began exploring alternatives, and yeah, there's a lot of lessons learned in Material.
+
+00:23:38 So let me maybe shortly talk about how it started, because it started as a side project in 2015.
+
+00:23:45 Like many things start, because I wanted to release a C library, actually, a zero-copy protocol buffers library I wrote called protobluff.
+
+00:23:53 But then I realized that it needed more than a README.
+
+00:23:57 So I looked at the existing static site generators, which were Hugo, Jekyll, Sphinx, MKDocs, something like that.
+
+00:24:03 And they all looked a little bit dated.
+
+00:24:05 I'm not a designer, but I wanted something more modern, and Google was pushing Material Design quite hard for app development at the time.
+
+00:24:13 And I'd also seen it being used on the web.
+
+00:24:16 So I thought, well, maybe combine this.
+
+00:24:19 I quickly settled on MKDocs; it was easy to use, simple templating, enough for a side project, basically.
+
+00:24:24 So it was a side project.
+
+00:24:26 I did what most devs do, I checked the license but didn't do any further due diligence.
+
+00:24:30 I even put MKDocs in the name to show the connection.
+
+00:24:33 So, which is common for themes.
+
+00:24:35 And that actually turned out to be one of the biggest decisions I made in my career, since I was basing my complete work on something I don't control, and it shaped the next 10 years of all of the work I was doing and is actually the reason
+
+00:24:49 why Zensical exists today.
+
+00:24:52 I see.
+
+00:24:52 So, after I started developing it, like nine months later, I released the first version, and soon got users and a lot of feature requests, and, you know, it was a side project.
+
+00:25:04 So I was doing client work at the time.
+
+00:25:06 As I mentioned, I've been, like, a consultant and developer freelancer for 20 years, and I only had Sundays to work on it.
+
+00:25:16 Which at first was sufficient, but the more popular it got, the more maintenance came.
+
+00:25:21 So it kind of crept into my mornings and evenings, and I was doing triage, like answering questions and trying to fix bugs before I went to the client, and it was getting harder and harder to justify in front of my partner, actually, because I was doing it in my spare time.
+
+00:25:36 And so I did what eventually happens to all projects that start as side projects, where you don't have the full time to work on them: you start turning down feature requests.
+
+00:25:50 And many open source projects don't cross this line, and for me it was a first.
+
+00:25:55 Yeah, and also, additionally, I mentioned before that I started writing Python in 2021.
+
+00:26:01 At the time, I only had Sundays to work on it, and I didn't know Python, so I said, okay, I will focus on the templating stuff, I will do the HTML, CSS, JavaScript, all of this, make it beautiful, and try to solve as many problems as possible
+
+00:26:16 in the front end, but I won't start learning Python, because it wasn't a language that I was using at that time, and I couldn't make the time for it. So that's where I drew the line.
+
+00:26:28 Yeah, yeah, and then I tried to...
+
+00:26:30 It's probably going to be a fad, that Python thing, anyway.
+
+00:26:32 I don't think so, but...
+
+00:26:35 Well, at the time, in 2015, it wasn't clear that it was going to be as popular as it is now, right?
+
+00:26:41 It's really...
+
+00:26:42 It started to become popular then, but it's really taken over the world and...
+
+00:26:47 Absolutely.
+
+00:26:48 For a lot of reasons, so...
+
+00:26:49 Of course, yeah, I think one of the main reasons is because it's very popular in the ML community, and all of the LLM and AI work that's happening and so on made it extremely popular, so...
+
+00:27:02 And I also think that Rust is doing a very good job of keeping it that way, because finally you have a very easy way to offload work to native code, which is much easier than fiddling with
+
+00:27:16 C and C++ and void pointers and whatever. So as I mentioned, PyO3 is just an absolutely amazing library.
+
+00:27:22 It's so easy to write Rust code.
+
+00:27:24 Yeah, I think you're right.
+
+00:27:25 I think...
+
+00:27:26 Rust has really provided an important escape hatch for...
+
+00:27:29 I wrote it this way.
+
+00:27:30 It's not fast enough.
+
+00:27:31 Like, well, this part, we're going to make it as fast as it can be, basically.
+
+00:27:34 Yeah.
+
+00:27:35 Yeah.
+
+00:27:36 So...
+
+00:27:37 Sorry, I interrupted you.
+
+00:27:38 Keep going.
+
+00:27:38 Oh, no worries.
+
+00:27:39 No worries.
+
+00:27:39 Yeah, no, no.
+
+00:27:40 Yeah, so as I mentioned, I tried to keep it basically afloat for the first four years, and at the time I didn't see the potential at all.
+
+00:27:48 It was just a theme, not a kind of product or so, but yet I felt responsible and kept on maintaining it, and my developer friends didn't understand why I was doing that.
+
+00:27:57 But for me, it was, like, you know, kind of cool, because I had a growing project.
+
+00:28:03 I had no immediate plans.
+
+00:28:04 I don't know.
+
+00:28:05 Let's see where I can take it. And with this steady and slow growth over years, companies and organizations started using it.
+
+00:28:14 So they were basing their public-facing documentation on me, like, the guy that maybe works on this project on a Sunday, and yet I felt responsible enough to try to fix the bugs reported as quickly as possible.
+
+00:28:29 Yeah, and, yeah, then in 2020 actually came the turning point.
+
+00:28:33 So when I was working on version five of it, I shared my progress publicly as I did before and somebody mentioned the donate button.
+
+00:28:38 So, and I think the wording was something like so that I can order pizza to survive the long Sunday coding sessions.
+
+00:28:48 But I heard from another developer who did this on his project, well, successful project for five years, a donate button and he made $90.
+
+00:28:57 So I immediately said that's not going to work.
+
+00:29:00 But I said, let's try an Amazon wish list.
+
+00:29:04 You know, I just put some stuff on there and maybe if somebody thinks my work is useful, then he can order me, like, make me a present and send me something.
+
+00:29:13 So, yeah, and I basically received everything on that wish list.
+
+00:29:17 It was completely insane.
+
+00:29:18 So there were two consecutive days that felt like Christmas.
+
+00:29:20 I even put like, so, I put some, you know, books and, but then also single malt.
+
+00:29:27 I love Scottish, Scottish single malt.
+
+00:29:30 It was a whiskey that cost $120 and I received that as well.
+
+00:29:34 So it was like, what's happening?
+
+00:29:37 And that led me to start thinking actually about demographics.
+
+00:29:40 So I needed to better understand the audience of material for MKDocs.
+
+00:29:45 And I did a poll and the results were absolutely eye-opening.
+
+00:29:49 I mentioned before, only 7% of users are front-end developers, and material is a front-end heavy project.
+
+00:29:57 So I kind of had an edge there in the Python space because, yeah, you know, it's based on Python.
+
+00:30:03 So front-end developers that write in JavaScript, they rather go for something like Docusaurus or React-based or whatever.
+
+00:30:10 And technical writers were quite happy with the project.
+ +00:30:13 I didn't know even technical writers existed. + +00:30:15 So I had no clue that this job, that this is a job because I thought at the time, and it's in hindsight completely naive, of course, I thought that as a developer, you need to write the documentation, you know. + +00:30:26 And so I learned about that and accidentally built a product for technical writers. + +00:30:32 And by the way, when I say product, I mean something that is not necessarily something you pay for, but something that doesn't feel engineered. + +00:30:39 So something that is like polished and designed and that you actually want to use. + +00:30:46 And yeah, so I had a product that has like product market fit and but at the time I didn't earn any money off it. + +00:30:54 So at the same time I read about SponsorWare and this, like, I'm not sure if you heard of it, but it's like a new model of monetization for open source. + +00:31:03 At the time it was quite new so that you can get paid for your work. + +00:31:07 So you can, so some developers for instance, they sell course material or access to gated content or code or nothing at all. + +00:31:14 So if you have a popular project you can just try to raise sponsorships from, and some companies are very generous when it comes to open source. + +00:31:22 And what we did with Material was we gave away early access to the latest features to the sponsors and each feature was tied to a funding goal and when that funding goal was met it became free for everyone. + +00:31:34 So it was like kind of a funded feature development in multiple stages. + +00:31:40 And that's what I thought of. + +00:31:42 Sorry? + +00:31:43 Yeah. + +00:31:43 That's super clever. + +00:31:44 I really love the idea of providing something for the sponsors but still not turning it into well, here's a paid version of our product and here's the open source version. 
+
+00:31:54 but there's always this tension of how do you reward the people who support you without undermining the open source project and that's a clever angle.
+
+00:32:04 Yeah.
+
+00:32:04 So that's extremely challenging.
+
+00:32:06 So as I'm telling this so this is what I came up with and I thought maybe it could work something like that and again my developer friends, they said it will never work, nobody will pay for open source, you're insane.
+
+00:32:19 Spoiler alert, it did work and in the end we made 200k a year off it and could build a team and everything so I know in Silicon Valley terms this is probably minimum wage but in Europe it's quite an amount with which you can work very well.
+
+00:32:34 And yeah so I started this program in 2020 and it grew steadily and it finally allowed me to work on features outside of the Sunday so invest more hours into it and finally learned Python in 2021 wrote my first plugin
+
+00:32:48 and started hacking on the MKDocs features that, well, that got turned down when we upstreamed them, where the maintainer said ah it's maybe not a good fit or we don't have the time for it and yeah in total
+
+00:33:03 I wrote 12 MKDocs plugins so it started as a theme but it turned into a popular, sorry, into a powerful docs framework in the end and this worked quite well for several years until it didn't anymore and that's the reason why Zensical
+
+00:33:17 then came into being.
+
+00:33:18 So the way it didn't work is that just where you want to take it started to diverge from MKDocs or you couldn't get your changes upstreamed or committed back?
+
+00:33:30 So the thing was that MKDocs was not evolving as we needed it to. So historically MKDocs had a sequence of single maintainers and as far as I know all of them worked on it in their spare time because they had
+
+00:33:45 regular jobs and material was evolving quickly because you know we had funding, we could invest much more time in it than, of course, an open source project that is only maintained in the spare time, and so it was changing too slowly
+
+00:34:00 so we started a lot of discussions on necessary API changes because for many users material for MKDocs was MKDocs so we were kind of like the storefront where most of the issues and like bug reports and feature requests came in
+
+00:34:14 because many people are using material for MKDocs and with this MKDocs basically and the main challenges that we faced were performance and plugin orchestration I mentioned I wrote 12 plugins and
+
+00:34:29 it's very hard to make them cooperate and if you look at any popular MKDocs plugins issue tracker you will find issues that go something like well this plugin is incompatible with this plugin well if I change the
+
+00:34:43 order of the plugins in the configuration this and this happens and both of those problems were brought to us again and again by the users with which we talked and so it was coming up a lot then suddenly
+
+00:34:58 after nine years the original maintainer returned to MKDocs and we were super optimistic because the project was maintained again he also started a sponsorship program we upstreamed some of our funding immediately and supported his work so before MKDocs had no
+
+00:35:13 way to sponsor them and the moment this went live we immediately supported it and some PRs were finally merged and issues were closed but yeah then the work went silent and he started working
+
+00:35:28 basically in the quiet and three months later we were invited to a video call, so we as maintainers, so I as a maintainer for material for MKDocs and some other key ecosystem maintainers
+
+00:35:42 and we learned that the plans for MKDocs 2.0 were completely different from what currently exists, MKDocs 1.x, which primarily means no plugin
+
+00:35:57 API and customization via templating alone so we already knew this is not enough because that's what we've done the first four years where as I mentioned I was only doing templating and some things with templates for instance
+
+00:36:11 having a tag support where you need to pull in different tags from different pages and then render them on another page or so you need synchronization efforts and you can't do this with templating by the way all of this information is public
+
+00:36:26 so you can read it on the mkdocs issue tracker so I'm not telling anything secret or so. It's a completely different direction, and then they were dismissed
+
+00:36:40 so mkdocs 2.0 as it looks right now is incompatible with material for mkdocs, 300 plugins in the ecosystem will become useless and tens of thousands of projects will be affected and for us so we had absolutely no choice but to
+
+00:36:55 start building something so to make something of this because at the time we had already 50,000 projects, 50,000 public projects, depending on us, we were talking to enterprise users and we knew that this number
+
+00:37:10 is much much higher so for instance one of our professional users they already also sponsored material they have two and a half thousand projects internally so only one company and they have a dedicated team of
+
+00:37:25 individuals that maintain their customizations on top of material for mkdocs for all of the teams inside the company it's a very big company so that's incredible what you could infer from the, I
+
+00:37:39 could believe it. I couldn't believe it at all so absolutely insane yeah so as I mentioned we had no choice so what we did was we immediately went back to the drawing board with the learnings from the almost 10 years
+
+00:37:54 that passed since I started material we built a lot of prototypes in TypeScript and Python iterated on them we did a lot of conceptual work things that could actually be done with a radically different architecture
+
+00:38:09 because writing 12 plugins I know the ins and outs of mkdocs I had to do a lot of hacks to make the blog plugin of material work with the way navigation works in mkdocs and the number
+
+00:38:24 one complaint as I mentioned was mkdocs is slow and it doesn't scale so fixing a typo you're doing a full rebuild and this can take minutes so our design work centered exactly around this problem and after
+
+00:38:39 a short while so we knew exactly what mkdocs should look like and we didn't want to let our users down and we so in essence we had two options we know what it should look like we could fork it or we could start from scratch and
+
+00:38:53 forking is not really possible because of the way Python dependencies work so all of the plugins have a dependency on mkdocs and this means that we would also need to fork all of so without doing black magic with
+
+00:39:08 imports which might not be the best idea so we would also need to fork all plugins or all plugins would need to switch to the fork so this would be like moving an entire city at once and it's frankly
+
+00:39:23 impossible so and if we would fork it we wouldn't be able to realize our learnings that we gained in the groundwork that we did so we had to start from scratch actually right plus you'd have to convince the entire community to at
+
+00:39:37 least create a parallel package because when you pip install that other plugin it's going to say hey PyPI I need mkdocs and now it
+
+00:39:52 would be a big battle, wouldn't it, just technically, or you'd have to move the community which is a very challenging thing to do yeah and so for us the most sensible thing was to just you know we just start from scratch we make it
+
+00:40:07 as compatible as possible it became quite clear very quickly that we need to optimize for compatibility because if you create something that is not compatible, where it's work
+
+00:40:22 to get over to something else, you won't get a lot of adoption all you gotta do is think about that 2500 project team like okay how do I keep them working with this right yes yes yeah so what we
+
+00:40:37 then did is we had an idea how it should look then we started with Rust because it was recommended to us so it was very hard at first and it in total of
+
+00:40:52 this but it was not only writing code it was also exactly knowing where we want to go because we're starting fresh so we better be sure that we are going into a direction where we actually want to go for the next 10 to 20 to 30 years
+
+00:41:07 depends so we are really in for this for the long game so the 10 years that I've been doing this I see that this is only the start and we wrote a lot of things from scratch so the runtime as I mentioned it's like the
+
+00:41:21 heart of Zensical it already has something like 15,000 lines of code a tiny HTTP middleware framework for file serving because we also want to make the file server extensible and don't want to force users into async
+
+00:41:36 Rust and also don't have a dependency on Tokio in JavaScript for instance there's Lerna and it has 800
+
+00:41:51 dependencies so when you install it what you pull down is just insane so we worked a lot on the processes as well that we can make releases very easy and we have a good way of working basically and we are very careful about
+
+00:42:06 our choice of dependencies so if it's something that you can write quite quickly actually, and we'd rather own it in order to make changes ourselves, we rather write it from scratch I
+
+00:42:21 think that's a very healthy philosophy and also I think in this agentic AI world that we're in these days if you just need one or two functions and you used to think well maybe I'll lean on this and in your case a crate or maybe a PyPI package
+
+00:42:36 or something but if it's just I started using pip-audit for a lot of my projects and I would say for my
+
+00:42:51 bigger projects every two weeks I get at least one CVE vulnerability notification for something I'm using but here's the thing, it's in a situation that it's probably a piece of code or functionality of that
+
+00:43:05 package that I don't even use or care about so it doesn't really apply to me but then I've got all these, here's an issue, that I'm
+
+00:43:20 going to be fine, you know what I mean I think things are swinging back a little bit from let's just pull in everything because it's going to help us to well maybe not everything yeah and also you
+
+00:43:35 can't just change things easily and you depend on other APIs so for instance one of the reasons why we chose to build a lot of things from scratch is that we want to control the public API so
+
+00:43:49 the worst thing for us would probably just be to export a third party API that we're using as part of our public interface because it's Rust, so it would mean that if this public API would change the entire ecosystem would break so we're very
+
+00:44:04 careful
+
+00:44:05 to what APIs we expose and rather wrap it in order to be safe so we can keep things replaceable so maybe you have the philosophy of it might be okay to use this crate but we don't
+
+00:44:19 expose its types as part of our public API or something along those lines we don't expose it so in some instances the wrappers that I've written are identical to the types
+
+00:44:34 that we use from using our own types or just wrapping them because in Rust the nice benefit is you have zero cost abstraction so all the code is monomorphized and inlined so you don't pay for wrapping code that's
+
+00:44:49 the absolute crazy thing so it's you can finally create a really clean architecture without runtime penalties if you do it right oh that's wild yeah yeah yeah very interesting so you can see I have this huge list of
+
+00:45:07 I'd like to go back to this component, wrong search, there you have components, that was in the other part let's just talk through some of these things here so you've got like admonitions
+
+00:45:22 buttons code blocks let's talk through some of the building blocks I guess that you think are interesting here yeah so I think most of the so if you if you're new to technical writing most of the stuff shouldn't be quite new so like
+
+00:45:36 admonitions code blocks stuff like that you've probably seen or data tables diagrams are just Mermaid diagrams as they are as you can use them on GitHub one of the so like the flagship features in material or
+
+00:45:51 and now Zensical as I mentioned like code annotations which is a part of code blocks otherwise we also have an icon and emoji integration so you can use one of I think we have something like over 10,000 icons now
+
+00:46:06 with a quite simple syntax that's not standard Markdown that's the problem so that's like a Python Markdown extension and we're working on moving this over to CommonMark and finding a way to migrate this over because
+
+00:46:20 right now Zensical uses Python Markdown for compatibility with material for mkdocs which means that for Markdown
+
+00:46:32 we use the same parsing engine to have this strong compatibility right we can even read mkdocs YML configuration so you can build an mkdocs project with Zensical as it stands the thing that we currently don't support in its entirety is the
+
+00:47:16 plugins from the ecosystem we already support some plugins for instance the mkdocstrings plugin the author is also part of the Zensical team now with mkdocstrings being the second biggest project in the mkdocs space so we are
+
+00:47:31 happy to have Tim on board and several other plugins but as I mentioned so Zensical uses modules so what we will do in the end is we will still always be able to read mkdocs configuration and map the plugin
+
+00:47:46 configurations to equivalent Zensical modules so the logic will be completely rewritten but you will be able to migrate your project with a command that's our goal because there has been so much work going
+
+00:48:01 into projects built with material and mkdocs so we need to make it easy for users and organizations to switch and this is the main part we're working on in 2026 I think this is
+
+00:48:15 it's critical right if your absolute best users you know like that big company but many others of course they're not going to rewrite everything well maybe they will but many of them won't rewrite everything they'll just use an old version and grin and bear it
+
+00:48:30 as long as they have to you know what I started working on search before we started working on Zensical yeah I noticed how nice the search was when I was playing with it we're in so is zensical.org itself built in Zensical yeah of course and it's actually built with an mkdocs YML
+
+00:50:07 because we're dogfooding so you can also build it with mkdocs with material for mkdocs the project layout is exactly the same yeah you know I find that there's just a bunch of static sites that seem to have I don't know
+
+00:50:21 what's going on with them but their search is really bad either they've just integrated some kind of Google thing where it's a site colon and they use your URL and the search which is a real bad experience or you go search and it spins and it spins and then
+
+00:50:36 eventually it pulls up so it looks like you are pre-computing these types of things or something with your search engine or you've got some cool data structure to make that fast right well it's not one cool data structure that would be great because then everybody could just use it but
+
+00:50:51 several months of work went into the search of course so it's a project of its own as I
+
+00:51:03 it's
+
+00:51:03 also completely modular and the reason why most of the search engines that are open source so like the libraries that you can use not services you have to pay for that they don't provide results that are really relevant
+
+00:51:18 is that they use BM25 which is the standard bag of words ranking algorithm for information retrieval and this doesn't nicely pair with autocomplete so what you get is you start typing
+
+00:51:34 documents the balancing will be off because the relevance is computed based on the occurrence of a word in the entire corpus so you add a new document those weights change again
+
+00:51:49 the search that we have we of course as a baseline also have a BM25 implementation but the implementation you're seeing is a tie-breaking implementation which provides much better accuracy and you can configure it so tie-breaking
+
+00:52:03 means okay we first look into the title of the document and see if we have matches then how many matches then where they are then we look into the navigation path and then in the body of the document and so on all of this is configurable
+
+00:52:18 and this is also why we believe that this alone will also be a very interesting project for other, for instance, site generators to integrate and you asked about pre-computing so no this is a search from the
+
+00:52:32 documents we build a search index which is a stripped down version of the HTML that is rendered when you load the page it's one JSON that we ship to the client and for most pages actually this JSON is below one megabyte you can
+
+00:52:47 gzip it so compress it then it's something like 200k and you have extremely fast search on the client with no cost and so we believe that for 90 95 maybe 99 percent of
+
+00:53:02 documentation sites or sites in general this client-side search is basically the way to go because it's fast and it doesn't require you to pay for anything and there are several SaaS based services that can be extremely expensive
+
+00:53:16 when you do the math so yeah you only need to use a server basically when the index becomes too big to ship to the client and we're also working on that by the way okay that's really cool you could shard the index or
+
+00:53:31 something like that right I suppose like you could say we're going to have 26 index bits and only if the word starts with an A do you
+
+00:53:48 some other interesting solutions like Pagefind is a pretty interesting library it does a completely different approach but it's not as snappy as the search that we ship to the client I use Pagefind
+
+00:54:03 for my personal website which is a static site yeah it's also a great great solution but some things you won't be able to implement in Pagefind properly because so it's with software it's trade-offs all the way so well I'm
+
+00:54:18 already thinking like I better pay attention to this when it comes out so maybe adopt it for some stuff beautiful okay we got a couple interesting questions sort of following up from the component side of things jamsack says do you
+
+00:54:32 foresee community led templates or themes for zensical I know you have like two themes that I see something along those lines a couple of themes that you can choose now but what is the theme story I guess I want to ask you more broadly yeah so
+
+00:54:47 absolutely so right now we have only this one theme we have this variant setting where you can choose like the classic variant which is when you move over from material for mkdocs it looks exactly the same this is also why we
+
+00:55:02 needed to keep the HTML as it is. Also, once we move to the component system we will make it possible to one use components
+
+00:55:17 within markdown and two also create a template engine that is based on components this will allow us much much faster rendering because for instance if you render the header for a site it's a lot of HTML because there's
+
+00:55:32 the search box in it and some other stuff but only the title changes so we will also make the rendering differential as part of the build that's the plan and with this we will also make it open to theme developers of course so there will be the
+
+00:55:46 packaging for instance compilation of Sass styles or TypeScript or so will be part of Zensical so you don't need to precompile the theme like we need to do for the last 10 years for material
+
+00:56:00 so it will have a proper asset pipeline it will have a proper process to install themes all of this is planned but right now we focus on feature parity so in order to make it possible for more users to migrate right now that's really interesting
+
+00:56:15 that you would deliver the theme as basically its original source not its rendered compiled or transpiled version right to keep it yes part of the build step right yes exactly because
+
+00:56:32 the sidebar disappears too early for my taste and this is not so for this you have to go through the compilation step again and basically fork the theme
+
+00:56:47 and recompile it we want to make this configurable so that you can use yeah so you know configure the theme and build it and it just works so this like you know it just works that's like the
+
+00:57:02 we're working towards making it as simple as possible yeah yeah very cool let's maybe get short on time here maybe wrap up our chat talking about two things the future where are you going you talked about compatibility being a big
+
+00:57:17 part of things going forward in 2026 but also sustainability right you had all these great supporters for material for mkdocs which you must have just been absolutely
+
+00:57:28 thrilled to realize how successful that was right going from I'll put up a wish list and then actually people love this I can put all my energy into it I know how great of a feeling that is right that's completely insane and when I
+
+00:57:43 started it I would never believe that this would be my job at some point really I feel the same way about the podcast and it's just I'm so grateful for it's amazing but then with this transition to Zensical how does that change anything
+
+00:57:57 or what's the story how do you bring that support over to Zensical as we don't have a lot of time I try to explain it as compact as possible so we are saying goodbye to this pay for extra features so
+
+00:58:12 in material you needed to be a sponsor in order to get the latest features earlier what we will do is everything is open source from the start so for users it's completely free and we are shifting our model from the sponsorships
+
+00:58:27 to something we call Zensical Spark because what we discovered talking a lot to our professional users is that the more we know about the problem space and the better we understand the problem space and the more we can collaborate with them the
+
+00:58:41 better degrees of freedom we can provide so we don't intend to just ship feature feature feature but we intend to create degrees of freedom so that you can adapt Zensical to the processes within your organization how
+
+00:58:56 they work to the workflows etc which are all different which is all very diverse basically so Spark is a space where you as a company can get a seat and together with us shape Zensical as part of high level discussions where we
+
+00:59:11 explore the problem space we create proposals so on the website you have clicked on the Spark section there's this ZAPs in progress we call them ZAPs, Zensical Advancement Proposals, it's on the left side we write very elaborate detailed
+
+00:59:26 proposals on specific topics that we intend to work on and then with the feedback that we get iterate on them and create
+
+00:59:39 a solution that is opinionated but that is as unopinionated as possible and the third thing that you get besides the opportunity to
+
+00:59:53 have high level discussions with us is professional support, which we've been asked for quite a lot by companies, in Spark you can basically get
+
+01:00:08 direct access to the team and also we have those open video calls where we share our progress and where you can get a window of support and we talk about any problem that is keeping you up at night
+
+01:00:23 basically and stuff like migrations or how do you do this and this in Zensical and yeah it's been a blast so we're really happy that the organizations are enrolling into this
+
+01:00:38 that might translate quite well to other projects because you get a huge competitive advantage you know exactly what to build yeah you're on you're talking to the actual users they're saying this is the thing that really is hard for us or you just get maybe they don't say it but
+
+01:00:53 you see it right exactly yes yes and talking to the users is the best
+
+01:01:08 and this new project I'm very excited to see it coming along and it looks like it's going to be great maybe a final call to action for people like can they go ahead and start using Zensical if
+
+01:01:38 it has a lot of built in functionality already you get all of these components that we talked about free search that you don't have to host a very modern static site that is great on mobile so just give it a try and we have a newsletter
+
+01:01:52 where we once a month share the latest updates and that might also be worth checking
+
+01:02:08 great to see as many users as possible and shape the future of Zensical together with all of you to our sponsors be sure to check out what they're offering it really helps support the show
+
+01:03:08 best of all there's no subscription in sight browse the catalog at talkpython.fm and if you're not already subscribed to the show on your favorite podcast player what are you waiting for just search for Python in your podcast player we should be right at the top if you enjoy that
+
+01:03:22 geeky rap song you can download the full track the link is actually in your podcast player show notes this is your host Michael Kennedy thank you so much for listening I really appreciate it I'll see you next time
+
+01:03:59 is the norm
+
diff --git a/transcripts/542-zensical-a-modern-static-site-generator-transcript.vtt b/transcripts/542-zensical-a-modern-static-site-generator-transcript.vtt
new file mode 100644
index 0000000..07eabef
--- /dev/null
+++ b/transcripts/542-zensical-a-modern-static-site-generator-transcript.vtt
@@ -0,0 +1,1720 @@
+WEBVTT
+
+00:00:00.000 --> 00:00:04.740
+If you built documentation in the Python ecosystem, chances are you've used Martin Donath's work.
+
+00:00:05.140 --> 00:00:12.940
+His material for MKDocs powers docs for FastAPI, uv, AWS, OpenAI, and tens of thousands of other projects.
+
+00:00:13.460 --> 00:00:20.340
+When MKDocs 2.0 took a direction that would break Material and 300 ecosystem plugins, Martin went back to the drawing board.
+
+00:00:20.900 --> 00:00:31.380
+The result is Zensical, a new static site generator with a Rust core, differential builds in milliseconds instead of minutes, and a migration path designed to bring the whole community along.
+
+00:00:31.760 --> 00:00:37.000
+This is Talk Python To Me, episode 542, recorded February 17th, 2026.
+
+00:00:37.000 --> 00:00:58.860
+Welcome to Talk Python To Me, the number one Python podcast for developers and data scientists.
+
+00:00:59.100 --> 00:01:00.720
+This is your host, Michael Kennedy.
+
+00:01:01.080 --> 00:01:04.700
+I'm a PSF fellow who's been coding for over 25 years.
+
+00:01:05.260 --> 00:01:06.400
+Let's connect on social media.
+
+00:01:06.400 --> 00:01:09.880
+You'll find me and Talk Python on Mastodon, BlueSky, and X.
+
+00:01:10.100 --> 00:01:12.020
+The social links are all in your show notes.
+
+00:01:12.720 --> 00:01:16.280
+You can find over 10 years of past episodes at talkpython.fm.
+
+00:01:16.360 --> 00:01:19.780
+And if you want to be part of the show, you can join our recording live streams.
+
+00:01:19.980 --> 00:01:24.000
+That's right, we live stream the raw, uncut version of each episode on YouTube.
+
+00:01:24.500 --> 00:01:29.020
+Just visit talkpython.fm/youtube to see the schedule of upcoming events.
+
+00:01:29.180 --> 00:01:32.840
+Be sure to subscribe there and press the bell so you'll get notified anytime we're recording.
+
+00:01:33.280 --> 00:01:34.880
+This episode is brought to you by Sentry.
+
+00:01:34.880 --> 00:01:38.780
+You know Sentry for the error monitoring, but they now have logs too.
+
+00:01:39.020 --> 00:01:46.060
+And with Sentry, your logs become way more usable, interleaving into your error reports to enhance debugging and understanding.
+
+00:01:46.500 --> 00:01:49.760
+Get started today at talkpython.fm/sentry.
+
+00:01:50.500 --> 00:01:52.100
+Martin, welcome to Talk Python To Me.
+
+00:01:52.160 --> 00:01:52.800
+Great to have you here.
+
+00:01:53.060 --> 00:01:53.700
+Thanks for having me.
+
+00:01:53.700 --> 00:02:03.240
+I'm excited to talk about static sites and the next big platform for building them here in Python and beyond.
+
+00:02:03.740 --> 00:02:06.360
+So really excited to talk about Zensical.
+
+00:02:06.700 --> 00:02:07.540
+Am I saying that right?
+
+00:02:08.080 --> 00:02:08.900
+Yeah, pretty much.
+
+00:02:09.120 --> 00:02:09.520
+Zensical.
+
+00:02:09.840 --> 00:02:10.440
+Zensical.
+
+00:02:10.600 --> 00:02:10.780
+Okay.
+
+00:02:11.100 --> 00:02:11.320
+Yeah.
+
+00:02:11.520 --> 00:02:11.800
+Great.
+
+00:02:11.800 --> 00:02:12.240
+Yeah.
+
+00:02:12.800 --> 00:02:17.260
+I know MKDocs, the Material for MKDocs has been really, really popular.
+
+00:02:17.760 --> 00:02:21.780
+And you all have made a big splash announcing this new project.
+
+00:02:22.360 --> 00:02:24.440
+So I'm really looking forward to diving into it.
+
+00:02:24.560 --> 00:02:28.180
+Before we do, though, let's just get a little bit of background on you.
+
+00:02:28.300 --> 00:02:28.780
+Who is Martin?
+
+00:02:28.940 --> 00:02:29.320
+Of course.
+
+00:02:29.480 --> 00:02:31.020
+So hi, my name is Martin Donath.
+
+00:02:31.020 --> 00:02:33.980
+Most people probably know me as Squidfunk.
+
+00:02:34.580 --> 00:02:39.060
+I've been an independent developer and consultant for the last 20 years now.
+
+00:02:39.720 --> 00:02:44.620
+And I mostly write in TypeScript, Python, and lately a lot of Rust.
+
+00:02:44.820 --> 00:02:47.100
+So I've become a huge fan of Rust, actually.
+
+00:02:48.040 --> 00:02:49.600
+I'm kind of a free spirit.
+
+00:02:50.140 --> 00:02:55.160
+So I love doing my own thing and building products from front to back, basically.
+
+00:02:55.500 --> 00:02:57.340
+So doing the front end as well as the back end.
+
+00:02:57.340 --> 00:03:01.220
+And for the past 15 years, I contributed a lot to open source.
+
+00:03:01.780 --> 00:03:06.140
+As already mentioned, my most popular project so far is Material for MKDocs.
+
+00:03:06.860 --> 00:03:13.460
+And, well, millions of people basically look at sites that are built with it every day.
+
+00:03:14.080 --> 00:03:14.200
+Yeah.
+
+00:03:14.280 --> 00:03:17.960
+Well, and Zensical, my latest project, will hopefully go far beyond that.
+
+00:03:18.020 --> 00:03:19.240
+So we're working very hard on it.
+
+00:03:19.440 --> 00:03:20.600
+And this is why I'm here today.
+
+00:03:20.720 --> 00:03:22.280
+So excited to talk about it.
+
+00:03:22.860 --> 00:03:24.000
+Yeah, I am as well.
+
+00:03:24.000 --> 00:03:28.980
+Well, and let's just start by admiring your website a little bit.
+
+00:03:30.180 --> 00:03:30.620
+Thanks.
+
+00:03:31.200 --> 00:03:34.740
+Brian and I spoke about this over on our Python Bytes podcast.
+
+00:03:35.420 --> 00:03:39.720
+And we kind of just got distracted just staring at the website.
+
+00:03:39.900 --> 00:03:43.860
+It's this beautiful flow of, I don't know, colors.
+
+00:03:44.000 --> 00:03:48.240
+It looks a little bit like a black hole, a white wormhole sort of experience.
+
+00:03:48.340 --> 00:03:48.660
+I don't know.
+
+00:03:48.700 --> 00:03:51.340
+What was the inspiration there with this cool design?
+
+00:03:51.340 --> 00:03:53.800
+Yeah, this is actually a strange attractor.
+
+00:03:53.980 --> 00:03:55.380
+So this is something from physics.
+
+00:03:55.680 --> 00:04:03.120
+I'm not very, very proficient in physics, but those strange attractors, I had a fascination for them for a very long time.
+
+00:04:03.800 --> 00:04:06.000
+And they follow very simple rules.
+
+00:04:06.120 --> 00:04:12.360
+So it's just three equations that define how their points move in three-dimensional space.
+
+00:04:12.360 --> 00:04:18.060
+And yeah, but still with those simple rules, a very complex shape can emerge.
+
+00:04:18.860 --> 00:04:24.260
+And this, for us, actually symbolizes the process of evolving ideas through writing.
+
+00:04:24.480 --> 00:04:33.620
+So if you have slightly different conditions from the start, it's still orbiting around the same shape, but it might look a little bit different.
+
+00:04:33.840 --> 00:04:37.520
+And there's actually, I can share this now, there's actually a little Easter egg.
+
+00:04:37.520 --> 00:04:38.880
+Nobody has found it so far.
+
+00:04:38.880 --> 00:04:51.920
+So if you hover over the homepage on zensical.org with the mouse in the left bottom corner, you can actually change the coefficients of the animation.
+
+00:04:52.340 --> 00:04:56.580
+And if you do this, you can click on them and then you can use your cursor.
+
+00:04:56.940 --> 00:04:58.160
+I'm changing beta.
+
+00:04:58.380 --> 00:05:00.700
+We're running beta 0.22 right now.
+
+00:05:01.240 --> 00:05:02.480
+Oh, it really does change it.
+
+00:05:02.560 --> 00:05:02.760
+Yeah.
+
+00:05:02.840 --> 00:05:03.480
+Oh my goodness.
+
+00:05:03.480 --> 00:05:03.960
+Yeah.
+
+00:05:04.120 --> 00:05:14.680
+So it takes a little time, but if you change the coefficients in a specific way, it might be completely chaotic and become unstable.
+
+00:05:15.140 --> 00:05:18.300
+So this is what I really find fascinating about those strange attractors.
+
+00:05:18.660 --> 00:05:20.140
+And it's also the inspiration for the logo.
+
+00:05:20.660 --> 00:05:23.080
+So we're building on this image a lot.
+
+00:05:23.940 --> 00:05:24.420
+Okay.
+
+00:05:24.560 --> 00:05:25.800
+I thought it was just a cool design.
+
+00:05:25.920 --> 00:05:30.540
+I didn't realize it had all this meaning and actual math and physics behind it.
+
+00:05:30.600 --> 00:05:31.500
+That's super cool.
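The system Martin describes, three simple equations that move a point through 3D space, with coefficients you can nudge into chaos, can be sketched in a few lines of Python. The exact equations behind the zensical.org animation aren't named in the episode, so this uses the classic Lorenz attractor as a stand-in; `sigma`, `rho`, and `beta` are the standard textbook coefficients, not values taken from the site.

```python
# A strange-attractor sketch: three coupled equations move a point through
# three-dimensional space. Tiny changes to the coefficients or the starting
# point visibly change the orbit, which is the effect Martin describes.
# (Lorenz system used as a stand-in; the site's own equations aren't public.)

def lorenz_step(p, sigma=10.0, rho=28.0, beta=8.0 / 3.0, dt=0.005):
    """Advance one point (x, y, z) by a single Euler step."""
    x, y, z = p
    dx = sigma * (y - x)
    dy = x * (rho - z) - y
    dz = x * y - beta * z
    return (x + dx * dt, y + dy * dt, z + dz * dt)

def trajectory(start, steps=4000, **coeffs):
    """Follow a point for `steps` Euler steps, collecting its path."""
    p = start
    points = [p]
    for _ in range(steps):
        p = lorenz_step(p, **coeffs)
        points.append(p)
    return points

# Two starting points differing by one millionth end up in very
# different places on the same overall shape:
a = trajectory((1.0, 1.0, 1.0))
b = trajectory((1.0, 1.0, 1.0 + 1e-6))
final_gap = sum((u - v) ** 2 for u, v in zip(a[-1], b[-1])) ** 0.5
```

Plotting `a` with any 3D plotting library shows the familiar butterfly shape; the point of the sketch is that a microscopic change in the starting conditions produces a macroscopically different orbit while still tracing the same attractor, just as Martin says about the homepage animation.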
+
+00:05:31.500 --> 00:05:36.620
+I love chaos theory and all of this, these fractal type of ideas here.
+
+00:05:36.880 --> 00:05:38.440
+And yeah, it's super neat.
+
+00:05:38.740 --> 00:05:38.940
+Okay.
+
+00:05:39.040 --> 00:05:40.480
+So what is Zensical?
+
+00:05:40.840 --> 00:05:41.700
+Why did you build it?
+
+00:05:41.740 --> 00:05:43.060
+Why not just more Material?
+
+00:05:43.460 --> 00:05:45.600
+So there are a lot of questions in there, actually.
+
+00:05:45.940 --> 00:05:49.440
+Maybe let me just start by shortly speaking about what it is.
+
+00:05:49.760 --> 00:05:55.040
+So in very simple terms, it's a tool to build beautiful websites from a folder of text files.
+
+00:05:55.320 --> 00:05:59.040
+So you just write Markdown and can generate a static site.
+
+00:05:59.320 --> 00:06:00.620
+You don't need a database for it.
+
+00:06:00.620 --> 00:06:05.260
+So to those that don't know what a static site is, you don't need a database or server.
+
+00:06:05.680 --> 00:06:11.060
+It's just static HTML, which means you just pip install zensical and you're ready to go within a few minutes.
+
+00:06:11.440 --> 00:06:13.700
+And it's fully open source, MIT licensed.
+
+00:06:14.340 --> 00:06:17.920
+And to maybe explain a little bit more about static sites.
+
+00:06:17.920 --> 00:06:21.380
+So the big benefit of it, you can host it for free in many places.
+
+00:06:21.380 --> 00:06:23.780
+For instance, on GitHub Pages or Cloudflare.
+
+00:06:23.980 --> 00:06:28.020
+And they're secure and fast by default because there's only static file serving involved.
+
+00:06:28.360 --> 00:06:28.740
+And Zensical?
+
+00:06:28.940 --> 00:06:37.140
+So we try to make it pretty, with a modern design, many built-in features, and fun, according to the feedback of our users, which is kind of unusual for writing documentation.
+
+00:06:37.140 --> 00:06:38.680
+So, yeah.
+
+00:06:38.880 --> 00:06:39.100
+Yeah.
+
+00:06:39.300 --> 00:06:39.940
+Very cool.
+
+00:06:40.120 --> 00:06:49.920
+And if anyone's tried to manually create a static site, it quickly becomes a challenge if you're just writing.
+
+00:06:50.340 --> 00:06:52.620
+Say, hey, it's only five HTML pages.
+
+00:06:52.700 --> 00:06:54.100
+I can just write the HTML.
+
+00:06:54.340 --> 00:06:54.900
+You know what I mean?
+
+00:06:54.900 --> 00:06:59.960
+But, well, what if you want to have common navigation or you want to change the look and feel?
+
+00:07:00.520 --> 00:07:04.020
+You know, oh, well, now I've got to go edit that in five places, right?
+
+00:07:04.080 --> 00:07:13.160
+And so even just beyond one page, having something that generates the static site is super valuable, right?
+
+00:07:13.200 --> 00:07:21.200
+Because it'll generate the wrapper navigation, the common CSS, the footer, all those kinds of things, right?
+
+00:07:21.520 --> 00:07:21.760
+Yes.
+
+00:07:21.820 --> 00:07:23.180
+So it depends on what you want to do.
+
+00:07:23.180 --> 00:07:29.960
+So, of course, if you have a small site, like a personal website or so, you can just write basic HTML if you're proficient in it.
+
+00:07:30.280 --> 00:07:37.060
+For instance, of the users of Material, only 7% of them are front-end developers.
+
+00:07:38.640 --> 00:07:42.240
+We will dive a little bit into how Zensical relates to Material later.
+
+00:07:43.520 --> 00:07:47.960
+And what Zensical is being used for primarily is for documentation.
+
+00:07:47.960 --> 00:07:55.280
+So it builds on the docs-as-code philosophy, which means that you treat your documentation exactly like your source code.
+
+00:07:55.460 --> 00:07:56.800
+So you primarily write documentation.
+
+00:07:57.020 --> 00:08:01.080
+You don't want to fight front-end development problems.
+
+00:08:01.460 --> 00:08:04.320
+You just want to keep the content, like get the content out.
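Michael's "edit it in five places" problem is exactly what a generator removes: the shared shell lives in one place and every page is rendered into it. Here's a deliberately tiny, stdlib-only Python sketch of that idea (not how Zensical or MKDocs are implemented, and the markdown handling is intentionally naive):

```python
# Toy static site generator: render every .md file in a folder into one
# shared HTML shell, so navigation and footer are defined exactly once.
from pathlib import Path

SHELL = """<html><body>
<nav>{nav}</nav>
<main>{body}</main>
<footer>My Project Docs</footer>
</body></html>"""

def render_markdown(text):
    """Naive converter: '# ' headings become <h1>, other lines become <p>."""
    out = []
    for line in text.splitlines():
        if line.startswith("# "):
            out.append(f"<h1>{line[2:]}</h1>")
        elif line.strip():
            out.append(f"<p>{line}</p>")
    return "\n".join(out)

def build(src: Path, dest: Path):
    pages = sorted(src.glob("*.md"))
    # One nav, generated once and injected everywhere. Change it here,
    # and every page picks it up on the next build.
    nav = " | ".join(f'<a href="{p.stem}.html">{p.stem}</a>' for p in pages)
    dest.mkdir(exist_ok=True)
    for page in pages:
        html = SHELL.format(nav=nav, body=render_markdown(page.read_text()))
        (dest / f"{page.stem}.html").write_text(html)
```

Point `build()` at a folder of markdown files and it emits one HTML file per page, each wrapped in the common navigation and footer, which is the core of what any static site generator automates for you.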
+
+00:08:04.320 --> 00:08:16.360
+And with this docs as code, the cool thing about it is you can use the same tools and processes and workflows like you use for code, like versioning and PRs to make changes.
+
+00:08:16.840 --> 00:08:24.640
+And the adoption is growing really fast, actually, among companies in recent years as they're moving away from proprietary tools to open source solutions.
+
+00:08:24.640 --> 00:08:32.320
+So Zensical is for you, or a static site generator in general is for you, if you just want to get your writing out.
+
+00:08:32.660 --> 00:08:40.940
+And of course, you can also customize it and make it pretty as you want, but you don't necessarily need to know HTML, CSS, and JavaScript.
+
+00:08:41.280 --> 00:08:42.540
+And that's quite difficult.
+
+00:08:43.300 --> 00:08:47.580
+And you talked about writing, and you even have your metaphor with strange attractors.
+
+00:08:47.580 --> 00:08:56.080
+I personally find if I'm just in a clean space where it's really just about the ideas, I don't have to worry about the design.
+
+00:08:56.240 --> 00:08:59.920
+It makes it so much easier to just focus on the actual writing.
+
+00:09:00.160 --> 00:09:01.500
+You know, you're in a Markdown editor.
+
+00:09:01.980 --> 00:09:06.420
+My favorite is Typora, but you can use whatever variety that you want, right?
+
+00:09:06.660 --> 00:09:07.740
+And you're just there.
+
+00:09:08.060 --> 00:09:10.480
+You're not worried even hardly about the formatting of the Markdown.
+
+00:09:10.600 --> 00:09:11.220
+You're just writing.
+
+00:09:11.500 --> 00:09:14.740
+And I find that a very good creative space, I guess.
+
+00:09:15.020 --> 00:09:16.140
+Yeah, that's the beauty of Markdown.
+
+00:09:16.140 --> 00:09:19.640
+So you can just write, as you mentioned.
+
+00:09:19.940 --> 00:09:24.460
+And how you, in the end, use it, you can still decide that afterwards.
+
+00:09:24.580 --> 00:09:29.960
+So if you want to build a website, if you want to create a PDF of it, if you just want to use it for internal note-taking or so.
+
+00:09:30.760 --> 00:09:34.400
+And this is the big benefit of Markdown.
+
+00:09:34.520 --> 00:09:41.780
+It takes away a lot of the headache of having to remember a lot of markup in order to get your ideas out of the door.
+
+00:09:42.120 --> 00:09:44.960
+Can you actually put markup in it if you need to?
+
+00:09:44.960 --> 00:09:53.880
+You know, for example, maybe you need a particular image, two of them side by side, that are links, and you want them to open in a new tab if somebody clicks them.
+
+00:09:54.120 --> 00:09:58.580
+Can you set it into basically an unsafe mode and let it do embedded markup?
+
+00:09:58.880 --> 00:10:00.040
+Yeah, that's a great question.
+
+00:10:00.300 --> 00:10:01.620
+So, yes, it's possible.
+
+00:10:01.760 --> 00:10:04.280
+You can just use HTML within Markdown.
+
+00:10:04.280 --> 00:10:08.700
+We currently depend on Python Markdown, which we inherited from Material for MKDocs.
+
+00:10:09.020 --> 00:10:18.660
+We are gradually moving towards CommonMark, which, so just as context, Python Markdown has some oddities when you use HTML within Markdown.
+
+00:10:18.660 --> 00:10:23.400
+For instance, it won't replace relative URLs correctly.
+
+00:10:23.540 --> 00:10:25.000
+This is like an annoying thing.
+
+00:10:25.700 --> 00:10:36.860
+But once we move to CommonMark, we will also have like predefined components that you can use, because you can't express everything, like more complex things, in plain Markdown.
+
+00:10:36.860 --> 00:10:50.700
+So there are only things like you can make text bold, you can have lists, tables, etc. But if it's more complex, as you mentioned, aligning two images or having an image with a caption or so, you basically need HTML.
+
+00:10:51.160 --> 00:10:54.660
+And this is possible already, but we will make it much easier in the future.
+
+00:10:54.880 --> 00:10:56.680
+The front end world already knows this.
+
+00:10:56.680 --> 00:10:57.840
+So they use MDX.
+
+00:10:57.920 --> 00:11:06.700
+They've been using MDX for quite a while, which is a dialect on top of Markdown, which adds more liberty with components and so on.
+
+00:11:06.700 --> 00:11:09.460
+So you can create reusable components that you can use.
+
+00:11:10.440 --> 00:11:10.540
+Yeah.
+
+00:11:10.780 --> 00:11:11.640
+But, yeah.
+
+00:11:11.840 --> 00:11:12.960
+So it's possible.
+
+00:11:13.640 --> 00:11:16.540
+Our users already do it.
+
+00:11:16.760 --> 00:11:21.080
+We also have some examples in the documentation, and we will make it much more powerful in the future.
+
+00:11:21.440 --> 00:11:21.640
+Yeah.
+
+00:11:21.860 --> 00:11:22.280
+Very nice.
+
+00:11:22.280 --> 00:11:27.660
+I do think, you know, regular Markdown is just missing a few things.
+
+00:11:27.780 --> 00:11:29.180
+I love the simplicity of it.
+
+00:11:29.460 --> 00:11:32.160
+And, you know, hat tip to John Gruber for creating it.
+
+00:11:32.300 --> 00:11:42.140
+But it's just like, I just need to maybe put a class here, or if I could just control this a little bit more, then you could basically escape to HTML.
+
+00:11:42.860 --> 00:11:47.380
+Obviously being careful to not just recreate HTML with square brackets instead of angle brackets, right?
+
+00:11:47.640 --> 00:11:49.400
+Yeah, there's been a lot of work on Python Markdown.
+
+00:11:49.400 --> 00:11:54.820
+So in Python Markdown, there are some extensions that allow you to add classes at least to block elements.
+
+00:11:54.820 --> 00:12:00.120
+So in Markdown, you need to distinguish between inline and block elements.
+
+00:12:00.280 --> 00:12:01.000
+Oh, no, it also works.
+
+00:12:01.020 --> 00:12:01.160
+Sorry.
+
+00:12:01.220 --> 00:12:03.400
+It also works on inline elements like links and so on.
+
+00:12:03.820 --> 00:12:05.200
+But this is special syntax.
+
+00:12:05.600 --> 00:12:09.160
+So Python Markdown is a dialect that is not standardized like CommonMark.
+
+00:12:09.160 --> 00:12:13.160
+In CommonMark, this is not easily possible, to add specific classes.
+
+00:12:13.360 --> 00:12:18.000
+But with CommonMark, as I mentioned, you have MDX, which is a de facto standard.
+
+00:12:18.160 --> 00:12:19.880
+I don't know if they've standardized it already.
+
+00:12:20.660 --> 00:12:22.120
+That allows for much, much more.
+
+00:12:22.560 --> 00:12:24.680
+So what is Zensical for?
+
+00:12:24.800 --> 00:12:28.140
+Is this a documentation generating tool?
+
+00:12:28.480 --> 00:12:31.900
+Is it just an open-ended static site generator?
+
+00:12:31.900 --> 00:12:38.200
+What is possible, and what is your goal or your target with this project?
+
+00:12:38.460 --> 00:12:38.600
+Yeah.
+
+00:12:38.720 --> 00:12:41.860
+So as I mentioned, right now we're focusing on documentation.
+
+00:12:42.200 --> 00:12:44.840
+Because this is the thing we're coming from.
+
+00:12:45.060 --> 00:12:47.860
+But we're building Zensical for much, much more.
+
+00:12:47.980 --> 00:12:54.580
+So our stretch goal is to have like a fully-fledged knowledge management and documentation solution.
+
+00:12:54.580 --> 00:12:59.720
+There are already a lot of companies that use it internally for knowledge management.
+
+00:13:00.620 --> 00:13:07.880
+Basically, as an alternative to a SaaS-based solution like Confluence or Notion. We are aware that for this we need WYSIWYG.
+
+00:13:08.000 --> 00:13:09.280
+So what you see is what you get.
+
+00:13:09.400 --> 00:13:12.320
+A visual editor that is also usable by non-technical users.
+
+00:13:12.820 --> 00:13:23.000
+And if you check out our roadmap and scroll down all the way, you will see it as a stretch goal, which is basically something we're working towards.
+
+00:13:23.000 --> 00:13:29.380
+Because this would actually allow so many more people within organizations to use it.
+
+00:13:30.060 --> 00:13:42.340
+And in general, with Zensical, we focus on three key areas that make us different from other static site generators, which is, well, a modern design.
+
+00:13:42.440 --> 00:13:44.480
+So, of course, some also have a modern design.
+
+00:13:44.580 --> 00:13:52.000
+But within the Python ecosystem, some options might look a little bit dated or a little bit...
+
+00:13:52.000 --> 00:13:55.400
+So we try to be a little bit more on the edge, actually.
+
+00:13:55.960 --> 00:13:58.800
+And it should be flexible and it should be fast.
+
+00:13:58.840 --> 00:13:59.860
+So those three things.
+
+00:13:59.920 --> 00:14:04.220
+Because the design, actually, is the thing that people notice first.
+
+00:14:04.880 --> 00:14:08.980
+So what we offer is a design that is customizable, brandable.
+
+00:14:09.160 --> 00:14:22.380
+You have tons of options with which you can change how navigation is laid out, and you can also change colors, fonts, etc. And we have a lot of components that make it ready for technical writing.
+
+00:14:22.480 --> 00:14:24.520
+As you mentioned, you just want to start writing.
+
+00:14:24.860 --> 00:14:27.540
+So we have stuff like admonitions, tabs.
+
+00:14:28.220 --> 00:14:42.820
+And one very specific feature that we have is code annotations, which we inherited from Material for MKDocs and which is quite unique among static site generators. It allows you to put a little bubble onto any line of code.
+
+00:14:43.260 --> 00:14:44.620
+You have to visit our documentation.
+
+00:14:44.880 --> 00:14:45.600
+This is our...
+
+00:14:45.600 --> 00:14:47.440
+You're currently browsing our...
+
+00:14:47.440 --> 00:14:48.900
+The other side.
+
+00:14:49.140 --> 00:14:49.880
+All right, all right.
+
+00:14:49.880 --> 00:14:50.220
+Hold on.
+
+00:14:50.360 --> 00:14:50.780
+I got it.
+
+00:14:50.860 --> 00:14:51.180
+Keep going.
+
+00:14:51.300 --> 00:14:52.000
+I'll get to say.
+
+00:14:52.260 --> 00:14:52.740
+Right, right.
+
+00:14:52.780 --> 00:14:53.180
+No worries.
+
+00:14:53.640 --> 00:14:53.880
+Yeah.
+
+00:14:54.180 --> 00:14:56.200
+And there you have to search for code annotations.
+
+00:14:56.900 --> 00:14:57.220
+Yeah.
+
+00:14:57.260 --> 00:15:03.540
+So code annotations, which allow you to create a bubble in any line of code.
+
+00:15:03.540 --> 00:15:06.440
+And if you click that bubble, there opens a tooltip.
+
+00:15:06.540 --> 00:15:08.560
+And within this tooltip, you can use any rich content.
+
+00:15:08.720 --> 00:15:18.200
+So you can have lists, any formatted markdown, tables, diagrams, basically anything you can use anyway within markdown.
+
+00:15:18.720 --> 00:15:20.540
+And this is a very popular feature in Material.
+
+00:15:20.920 --> 00:15:22.880
+And so, of course, we brought it over.
+
+00:15:23.360 --> 00:15:25.020
+So users can still use it.
+
+00:15:25.220 --> 00:15:28.860
+So the second thing I talked about is it should be flexible.
+
+00:15:29.060 --> 00:15:33.980
+So what makes Zensical different is we have a modular architecture, or say we're working towards a modular architecture.
+
+00:15:34.160 --> 00:15:35.660
+We're still in alpha.
+
+00:15:36.600 --> 00:15:39.460
+So we're close to finishing the module system.
+
+00:15:40.360 --> 00:15:53.080
+And in Zensical, it's modules all the way down, which means all core functionality is implemented as modules, which is different from other solutions where the plugin system sometimes is more or less an afterthought.
+
+00:15:53.260 --> 00:15:57.820
+So there's a plugin system added with specific hooks, extension points where you can hook into.
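Zensical's actual module API isn't spelled out in the episode, so here is a purely hypothetical Python sketch of the "modules all the way down" idea: core behavior and third-party behavior implement the same small interface, and the pipeline can swap, extend, or replace any of them. Every name here (`Module`, `process`, the example classes) is invented for illustration.

```python
# Hypothetical sketch of a "modules all the way down" architecture:
# core build steps and user extensions share one interface, so the
# pipeline treats them identically and any step can be replaced.
from typing import Protocol

class Module(Protocol):
    def process(self, page: dict) -> dict: ...

class RenderHeadings:
    """'Core' behavior, but implemented as an ordinary module."""
    def process(self, page):
        if page["source"].startswith("# "):
            page["title"] = page["source"][2:].splitlines()[0]
        return page

class ShoutTitles:
    """A drop-in third-party module using the exact same interface."""
    def process(self, page):
        page["title"] = page.get("title", "").upper()
        return page

def run_pipeline(page, modules):
    # The pipeline doesn't distinguish core modules from user modules.
    for module in modules:
        page = module.process(page)
    return page

page = run_pipeline({"source": "# Hello\nBody"}, [RenderHeadings(), ShoutTitles()])
```

The contrast with a hook-based plugin system is that there is no privileged core: removing `RenderHeadings` or replacing it with your own class is exactly as easy as adding `ShoutTitles`.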
+
+00:15:57.820 --> 00:16:07.080
+And this might seem sufficient at first, but in the end, so for us, for instance, MKDocs in the end was a little bit limiting.
+
+00:16:07.640 --> 00:16:10.960
+And this allows you to basically swap, extend, replace all modules.
+
+00:16:11.140 --> 00:16:12.140
+You can use our modules.
+
+00:16:12.360 --> 00:16:14.340
+You can write your own, pull in third-party modules.
+
+00:16:14.640 --> 00:16:16.080
+And as I mentioned, Rust.
+
+00:16:16.280 --> 00:16:17.020
+So don't worry.
+
+00:16:17.280 --> 00:16:18.220
+You don't need to learn Rust.
+
+00:16:18.560 --> 00:16:24.520
+You will also be able to write modules in Python because we are super happy users of PyO3, which is an absolutely amazing library.
+
+00:16:24.520 --> 00:16:30.560
+And PyO3 has really become a super important foundation of Python these days.
+
+00:16:30.700 --> 00:16:34.520
+It's almost like the C bindings for CPython.
+
+00:16:35.060 --> 00:16:35.280
+Exactly.
+
+00:16:35.700 --> 00:16:36.420
+So, yeah.
+
+00:16:36.560 --> 00:16:39.760
+So PyO3 allows us to have a Rust runtime.
+
+00:16:40.360 --> 00:16:48.080
+So all of the orchestration, in which order things are run, threading, caching, parallelization, etc. All is happening in Rust.
+
+00:16:48.080 --> 00:16:53.240
+And we will provide Python bindings so that you still can use Python to write modules.
+
+00:16:53.560 --> 00:16:55.200
+And they're still running fast.
+
+00:16:55.200 --> 00:16:59.360
+This portion of Talk Python To Me is brought to you by Sentry.
+
+00:16:59.640 --> 00:17:01.960
+You know Sentry for their great error monitoring.
+
+00:17:02.340 --> 00:17:03.360
+But let's talk about logs.
+
+00:17:03.740 --> 00:17:04.720
+Logs are messy.
+
+00:17:05.220 --> 00:17:10.840
+Trying to grep through them and line them up with traces and dashboards just to understand one issue isn't easy.
+ +00:17:11.300 --> 00:17:13.480 +Did you know that Sentry has logs too? + +00:17:14.040 --> 00:17:16.040 +And your logs just became way more usable. + +00:17:16.600 --> 00:17:22.580 +Sentry's logs are trace-connected and structured, so you can follow the request flow and filter by what matters. + +00:17:22.580 --> 00:17:31.280 +And because Sentry surfaces the context right where you're debugging, the trace, relevant logs, the error, and even the session replay all land in one timeline. + +00:17:31.680 --> 00:17:32.820 +No timestamp matching. + +00:17:33.060 --> 00:17:33.800 +No tool hopping. + +00:17:34.320 --> 00:17:40.540 +From front-end to mobile to back-end, whatever you're debugging, Sentry gives you the context you need so you can fix the problem and move on. + +00:17:40.980 --> 00:17:45.660 +More than 4.5 million developers use Sentry, including teams at Anthropic and Disney+. + +00:17:45.660 --> 00:17:51.080 +Get started with Sentry logs and error monitoring today at talkpython.fm/sentry. + +00:17:51.080 --> 00:17:53.760 +Be sure to use our code, talkpython26. + +00:17:54.400 --> 00:17:56.040 +The link is in your podcast player's show notes. + +00:17:56.340 --> 00:17:58.000 +Thank you to Sentry for supporting the show. + +00:17:59.780 --> 00:18:02.120 +Which brings me to the last point where we're different. + +00:18:02.360 --> 00:18:15.680 +We have a very heavy focus on performance, so our goal is to let you start with one page because, of course, all documentation sites, all projects start small, and let you scale that to something like 100,000 pages. + +00:18:15.680 --> 00:18:18.560 +How we do it is through differential builds. + +00:18:18.900 --> 00:18:26.140 +We have created our own runtime, which is called ZRX, and differential builds mean that we are only rebuilding what changed. + +00:18:26.220 --> 00:18:32.180 +So, for instance, if you only change the page title, only that page and all instances where the page title is used are being rebuilt. 
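The page-title example can be made concrete with a toy dependency tracker. This is only a sketch of the general differential-build idea; Zensical's ZRX runtime is written in Rust and handles far more (threading, caching, parallelization), and the class and input names below are invented for illustration.

```python
# Toy differential-build bookkeeping: record which inputs each output page
# was built from, then rebuild only the pages whose inputs actually changed.

class DifferentialBuilder:
    def __init__(self):
        # page -> set of inputs it was built from (its own source file,
        # plus derived facts like another page's title shown in the nav)
        self.deps = {}

    def record(self, page, inputs):
        self.deps[page] = set(inputs)

    def affected(self, changed_input):
        """Pages that must be rebuilt when one input changes."""
        return {p for p, ins in self.deps.items() if changed_input in ins}

b = DifferentialBuilder()
b.record("guide.html", {"guide.md"})
b.record("index.html", {"index.md", "title:guide.md"})  # nav shows guide's title
b.record("api.html", {"api.md"})

# Changing guide.md (and therefore its title) touches guide.html and
# index.html, but api.html is never rebuilt:
to_rebuild = b.affected("guide.md") | b.affected("title:guide.md")
```

With this bookkeeping, a one-title edit triggers two page renders instead of a full-site rebuild, which is why changes show up in milliseconds rather than minutes as the site grows.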
+
+00:18:32.400 --> 00:18:35.520
+And this means that changes are visible in milliseconds and not minutes.
+
+00:18:35.940 --> 00:18:37.520
+Yeah, that's super cool.
+
+00:18:37.840 --> 00:18:41.160
+And so I'm presuming the build system itself is Rust-based, right?
+
+00:18:41.160 --> 00:18:42.120
+Yeah, exactly.
+
+00:18:42.260 --> 00:18:43.380
+It's 100% Rust, yeah.
+
+00:18:43.600 --> 00:18:44.140
+Yeah, yeah.
+
+00:18:44.540 --> 00:18:48.700
+Coming from a Python background, what was that experience like building that?
+
+00:18:49.040 --> 00:18:56.920
+Yeah, so that's kind of a tricky question because I'm not really coming from a long history.
+
+00:18:57.600 --> 00:18:59.900
+So I don't have a long Python background.
+
+00:19:00.640 --> 00:19:07.300
+I wrote mainly in TypeScript, and I only started writing Python in 2021.
+
+00:19:07.720 --> 00:19:13.300
+So this is actually the history, how Material started, and how all of this unfolded.
+
+00:19:14.000 --> 00:19:17.740
+But I've written in several languages.
+
+00:19:18.040 --> 00:19:21.460
+So I also have written in C, Erlang, Ruby, Python, TypeScript.
+
+00:19:22.260 --> 00:19:23.780
+Rust was still extremely hard to learn.
+
+00:19:24.100 --> 00:19:31.120
+So I basically banged my head against the keyboard for a month, wasn't making any progress at all because, yeah, you know, fighting with the borrow checker.
+
+00:19:31.420 --> 00:19:45.800
+And once you get past that, and then, of course, lifetimes and higher-ranked trait bounds and some other features, I'm now somewhere around 3,000 hours, 4,000 hours in, something like that.
+
+00:19:46.400 --> 00:19:47.540
+It gets really good.
+
+00:19:47.840 --> 00:19:56.800
+So I think Rust is seriously one of the best languages ever made because it allows you to express ideas extremely clearly, with extreme clarity.
+
+00:19:56.800 --> 00:20:04.200
+And this is due to the very good type system, of course, and you get bare metal performance.
+ +00:20:04.440 --> 00:20:11.780 +So I find it kind of insane having a language like Rust because it's so easy to write once you're used to it. + +00:20:12.160 --> 00:20:16.940 +You will be very productive and still have bare metal performance. + +00:20:17.300 --> 00:20:18.180 +It's completely insane. + +00:20:18.520 --> 00:20:19.160 +Yeah, that's wild. + +00:20:19.360 --> 00:20:24.640 +But it's got a little bit of a learning curve compared to like Python or TypeScript or something like that. + +00:20:24.640 --> 00:20:29.500 +Yeah, so I had, I think, 18 years of experience with many languages. + +00:20:30.200 --> 00:20:35.860 +As I mentioned, I also did a lot of C and I still found it very hard to learn. + +00:20:36.260 --> 00:20:38.760 +But it's worth it. + +00:20:38.920 --> 00:20:39.800 +It's worth it. + +00:20:39.860 --> 00:20:54.480 +And my recommendation probably would be to learn it on something that you really care about so that you want to build because otherwise you will probably lose the drive since you're running against those walls. + +00:20:55.040 --> 00:20:59.120 +Maybe for you, it's, or for somebody else, it's much easier to learn. + +00:20:59.260 --> 00:21:04.060 +So maybe it's just, I'm a bad example that I needed so long. + +00:21:04.120 --> 00:21:04.400 +I don't know. + +00:21:04.520 --> 00:21:09.420 +But because after that month, it wasn't that I was completely up to speed. + +00:21:09.480 --> 00:21:16.380 +So it was just, I was making very, very tiny progress, at least progress, because for a month I wasn't making progress at all. + +00:21:16.820 --> 00:21:30.280 +The next show that I'm doing after this one, which actually is, in real, on clock time, wall time, it's happening in like two hours or less from now is with Samuel Colvin from Pydantic talking about Monty, + +00:21:30.700 --> 00:21:32.260 +a Python runtime. + +00:21:32.620 --> 00:21:37.620 +He's, he and his team are rewriting in Rust, specifically targeting AI. 
+
+00:21:37.880 --> 00:21:39.780
+So the Rust theme will continue.
+
+00:21:40.020 --> 00:21:54.000
+It's definitely, it caught me a little bit off guard, like how much people love it, but it's also, you know, it makes perfect sense that we want this nice modern language for writing lower level things, even if it plugs into Python, right?
+
+00:21:54.320 --> 00:21:54.620
+Yeah.
+
+00:21:54.680 --> 00:22:01.400
+So the fun thing is I also talked to Samuel a long time ago, and he was the one recommending to me to write it in Rust.
+
+00:22:02.160 --> 00:22:16.100
+It's one of the reasons I, yeah, definitely, I looked into it, and it made a lot of sense, also during the time, with the progress we were making and the walls we were hitting, to reconsider learning Rust.
+
+00:22:16.100 --> 00:22:17.360
+Best investment.
+
+00:22:17.820 --> 00:22:18.020
+Yeah.
+
+00:22:18.020 --> 00:22:18.400
+Yeah.
+
+00:22:18.500 --> 00:22:18.940
+Amazing.
+
+00:22:19.280 --> 00:22:19.640
+Amazing.
+
+00:22:19.940 --> 00:22:28.260
+So I want to dig into your component structure and some of those things, but maybe before we do, let's talk about the origins a little bit.
+
+00:22:28.260 --> 00:22:32.780
+So let's talk about how you went from Material for MKDocs.
+
+00:22:33.340 --> 00:22:34.720
+Why, why even change?
+
+00:22:34.800 --> 00:22:36.440
+Why not just more Material?
+
+00:22:36.940 --> 00:22:37.040
+Yeah.
+
+00:22:37.120 --> 00:22:41.060
+So this is a great question, and this is a little bit of a story.
+
+00:22:41.160 --> 00:22:45.440
+So there are several stories in there, actually, because it's, so it's 10 years.
+
+00:22:45.560 --> 00:22:51.080
+I try to make it as compact as possible while keeping the most important things.
+
+00:22:51.080 --> 00:22:55.600
+So to those who don't know, Material for MKDocs is a very popular documentation framework.
+
+00:22:55.600 --> 00:22:57.300
+It's used by tens of thousands of projects.
+
+00:22:57.580 --> 00:23:04.500
+There are prominent users like AWS, Microsoft, OpenAI, and also large open source projects use it.
+
+00:23:04.640 --> 00:23:12.660
+Like, for instance, FastAPI, uv, Knative. And it's built on top of MKDocs, as the name says, which became one of the most popular static site generators.
+
+00:23:13.300 --> 00:23:15.340
+And it also eventually became my job.
+
+00:23:15.760 --> 00:23:17.820
+So I could make it my job.
+
+00:23:17.880 --> 00:23:21.300
+I could work in open source and earn a living somehow.
+
+00:23:21.680 --> 00:23:23.480
+I'm getting there, how that worked.
+
+00:23:24.900 --> 00:23:32.680
+But at some point we needed a new foundation, and we'd kind of outgrown MKDocs because it was not evolving at the pace that we needed.
+
+00:23:32.880 --> 00:23:38.780
+So we began exploring alternatives, and yeah, so there's a lot of lessons learned in Material.
+
+00:23:38.780 --> 00:23:45.380
+So let me shortly maybe talk about how it started, because it started as a side project in 2015.
+
+00:23:45.700 --> 00:23:53.640
+Like many things start, because I wanted to release a C library, actually, a zero-copy protocol buffers library I wrote called Protobluff.
+
+00:23:53.640 --> 00:23:57.060
+But then I realized that it needed more than a README.
+
+00:23:57.260 --> 00:24:03.160
+So I looked at the existing static site generators, which were Hugo, Jekyll, Sphinx, MKDocs, something like that.
+
+00:24:03.700 --> 00:24:05.280
+And they all looked a little bit dated.
+
+00:24:05.860 --> 00:24:13.980
+I'm not a designer, but I wanted something more modern, and Google was pushing Material Design quite hard for app development at the time.
+
+00:24:13.980 --> 00:24:16.740
+So, and I've also seen it being used in the web.
+
+00:24:16.880 --> 00:24:18.680
+So I thought, well, maybe combine this.
+
+00:24:19.300 --> 00:24:24.380
+I quickly settled on MkDocs, it was easy to use, simple templating, enough for a side project, basically.
+
+00:24:24.600 --> 00:24:25.500
+So it was a side project.
+
+00:24:26.080 --> 00:24:30.340
+Did what most devs do, I checked the license but didn't do any further due diligence.
+
+00:24:30.600 --> 00:24:33.680
+So I even put MkDocs in the name to show the connection.
+
+00:24:33.680 --> 00:24:35.040
+Which is common for themes.
+
+00:24:35.300 --> 00:24:49.860
+And that actually turned out to be one of the biggest decisions I made in my career, since I was basing my complete work on something I don't control, and it shaped the next 10 years of all of the work I was doing and is actually the reason
+
+00:24:49.860 --> 00:24:51.480
+why Zensical exists today.
+
+00:24:52.240 --> 00:24:52.440
+I see.
+
+00:24:52.800 --> 00:25:04.080
+So, after I started developing it, like nine months later, I released the first version, and users sent a lot of feature requests, and, you know, it was a side project.
+
+00:25:04.220 --> 00:25:06.320
+So I was doing client work at the time.
+
+00:25:06.540 --> 00:25:15.820
+As I mentioned, I've been a consultant and developer freelancer for 20 years, and I only had Sundays to work on it.
+
+00:25:16.220 --> 00:25:21.400
+Which at first was sufficient, but the more popular it got, the more maintenance came.
+
+00:25:21.700 --> 00:25:36.580
+So it kind of crept into my mornings and evenings, and I was doing triage, like answering questions and trying to fix bugs before I went to the client, and it was getting harder and harder to justify in front of my partner, actually, because I was doing it in my spare time,
+
+00:25:36.580 --> 00:25:50.860
+and so I did what eventually happens to all projects that start as side projects, where you don't have the full time to work on them, so what basically happens is you start turning down feature requests,
+
+00:25:50.860 --> 00:25:54.840
+and many open source projects don't cross this line, and for me it was a first.
+
+00:25:55.720 --> 00:26:01.140
+Yeah, and also, additionally, I mentioned before that I started writing Python in 2021.
+
+00:26:01.540 --> 00:26:16.220
+At the time, so I only had Sundays to work on it, I didn't know Python, so I said, okay, I will focus on the templating stuff, I will do the HTML, CSS, JavaScript, all of this, make it beautiful and try to solve as many problems as possible
+
+00:26:16.220 --> 00:26:27.700
+in the front end, but I won't start learning Python, because it wasn't a language that I was using at that time and I couldn't make up the time for it, so that's where I drew the line.
+
+00:26:28.700 --> 00:26:30.040
+Yeah, yeah, and then I tried to...
+
+00:26:30.040 --> 00:26:31.860
+It's probably going to be a fad, that Python thing, anyway.
+
+00:26:32.580 --> 00:26:35.000
+I don't think so, but...
+
+00:26:35.000 --> 00:26:41.720
+Well, at the time, in 2015, it wasn't clear that it was going to be as popular as it is now, right?
+
+00:26:41.760 --> 00:26:42.640
+It's really...
+
+00:26:42.640 --> 00:26:47.760
+It started to become popular then, but it's really taken over the world and...
+
+00:26:47.760 --> 00:26:48.060
+Absolutely.
+
+00:26:48.440 --> 00:26:49.600
+For a lot of reasons, so...
+
+00:26:49.600 --> 00:27:02.360
+Of course, yeah, I think one of the main reasons is because it's very popular in the ML community, and all of the LLM and AI work that's happening and so on made it extremely popular, so...
+
+00:27:02.360 --> 00:27:16.240
+And I also think that Rust is doing a very good job of keeping it that way, because finally you have a very easy way to offload work to native code, which is much easier than fiddling with
+
+00:27:16.240 --> 00:27:22.240
+C and C++ and void pointers and whatever, so as I mentioned, PyO3 is just an absolutely amazing library.
+
+00:27:22.380 --> 00:27:24.440
+It's so easy to write Rust code.
+
+00:27:24.640 --> 00:27:25.620
+Yeah, I think you're right.
+
+00:27:25.740 --> 00:27:26.060
+I think...
+
+00:27:26.060 --> 00:27:29.720
+Rust has really provided an important escape hatch for...
+
+00:27:29.720 --> 00:27:30.520
+I wrote it this way.
+
+00:27:30.520 --> 00:27:31.340
+It's not fast enough.
+
+00:27:31.500 --> 00:27:34.500
+Like, well, this part, we're going to make it as fast as it can be, basically.
+
+00:27:34.920 --> 00:27:35.120
+Yeah.
+
+00:27:35.760 --> 00:27:36.000
+Yeah.
+
+00:27:36.300 --> 00:27:37.160
+So...
+
+00:27:37.160 --> 00:27:38.260
+Sorry, I interrupted you.
+
+00:27:38.320 --> 00:27:38.580
+Keep going.
+
+00:27:38.660 --> 00:27:39.100
+Oh, no worries.
+
+00:27:39.200 --> 00:27:39.500
+No worries.
+
+00:27:39.740 --> 00:27:40.320
+Yeah, no, no.
+
+00:27:40.880 --> 00:27:48.220
+Yeah, so as I mentioned, I tried to keep it basically afloat for the first four years, and at the time I didn't see the potential at all.
+
+00:27:48.360 --> 00:27:57.820
+It was just a theme, not a kind of product or so, but yet I felt responsible and kept on maintaining it, and my developer friends didn't understand why I was doing that.
+
+00:27:57.820 --> 00:28:03.260
+But for me, it was, like, you know, it was kind of cool, because I had a growing project.
+
+00:28:03.420 --> 00:28:04.300
+I had no immediate plans.
+
+00:28:04.380 --> 00:28:04.720
+I don't know.
+
+00:28:05.040 --> 00:28:14.520
+Let's see where I can take it, and, yeah, so, with this steady and slow growth over years, then companies and organizations started using it,
+
+00:28:14.520 --> 00:28:28.320
+so they were basing their public-facing documentation on me, like the guy that maybe works on this project on a Sunday, and yet I felt responsible enough to try to fix the bugs reported as quickly as possible.
+
+00:28:29.160 --> 00:28:32.920
+Yeah, and, yeah, then in 2020 actually came the turning point.
+
+00:28:33.020 --> 00:28:37.820
+So when I was working on version five of it, I shared my progress publicly as I did before, and somebody suggested a donate button.
+
+00:28:38.900 --> 00:28:47.280
+So, and I think the wording was something like, so that I can order pizza to survive the long Sunday coding sessions.
+
+00:28:48.600 --> 00:28:56.820
+But I heard from another developer who did this on his project, a well, successful project, for five years, a donate button, and he made $90.
+
+00:28:57.180 --> 00:29:00.700
+So I immediately said, that's not going to work.
+
+00:29:00.760 --> 00:29:03.860
+But I said, let's try an Amazon wish list.
+
+00:29:04.180 --> 00:29:12.400
+You know, I just put some stuff on there, and maybe if somebody thinks my work is useful, then they can order me something, like, make me a present, send me a present.
+
+00:29:13.300 --> 00:29:16.820
+So, yeah, and I basically received everything on that wish list.
+
+00:29:17.100 --> 00:29:18.000
+It was completely insane.
+
+00:29:18.100 --> 00:29:20.340
+So there were two consecutive days that felt like Christmas.
+
+00:29:20.600 --> 00:29:26.860
+I even put, like, so, I put some, you know, books, but then also single malt.
+
+00:29:27.060 --> 00:29:29.280
+I love Scottish single malt.
+
+00:29:30.140 --> 00:29:34.360
+It was a whiskey that cost $120, and I received that as well.
+
+00:29:34.680 --> 00:29:36.840
+So it was like, what's happening?
+
+00:29:37.260 --> 00:29:40.000
+And that led me to start thinking, actually, about demographics.
+
+00:29:40.000 --> 00:29:45.720
+So, I needed to better understand the audience of Material for MkDocs.
+
+00:29:45.860 --> 00:29:49.060
+And I did a poll, and the results were absolutely eye-opening.
+
+00:29:49.240 --> 00:29:57.020
+I mentioned before, only 7% of users are front-end developers, which means, and Material is a front-end-heavy project.
+
+00:29:57.160 --> 00:30:03.740
+So I kind of had an edge there in the Python space, because, yeah, you know, it's based on Python.
+
+00:30:03.860 --> 00:30:10.120
+So front-end developers that write in JavaScript, they rather go for something like Docusaurus, or React-based, or whatever.
+
+00:30:10.840 --> 00:30:12.920
+And technical writers were quite happy with the project.
+
+00:30:13.200 --> 00:30:15.400
+I didn't even know technical writers existed.
+
+00:30:15.640 --> 00:30:26.360
+So I had no clue that this is a job, because I thought at the time, and it's in hindsight completely naive, of course, I thought that as a developer, you need to write the documentation, you know.
+
+00:30:26.820 --> 00:30:32.160
+And so I learned about that and accidentally built a product for technical writers.
+
+00:30:32.160 --> 00:30:39.780
+And by the way, when I say product, I mean something that is not necessarily something you pay for, but something that doesn't feel engineered.
+
+00:30:39.840 --> 00:30:45.180
+So something that is, like, polished and designed and that you actually want to use.
+
+00:30:46.420 --> 00:30:54.320
+And yeah, so I had a product that had, like, product-market fit, but at the time I didn't earn any money off it.
+
+00:30:54.700 --> 00:31:03.740
+So at the same time, I read about sponsorware, and this, like, I'm not sure if you heard of it, but it's like a new model of monetization for open source.
+
+00:31:03.800 --> 00:31:06.900
+At the time it was quite new, so that you can get paid for your work.
+
+00:31:07.080 --> 00:31:14.100
+So some developers, for instance, they sell course material, or access to gated content, or code, or nothing at all.
+
+00:31:14.260 --> 00:31:21.700
+So if you have a popular project, you can just try to raise sponsorships, and some companies are very generous when it comes to open source.
+
+00:31:22.160 --> 00:31:34.160
+And what we did with Material was, we gave away early access to the latest features to the sponsors, and each feature was tied to a funding goal, and when that funding goal was met, it became free for everyone.
+
+00:31:34.160 --> 00:31:39.840
+So it was kind of a funded feature development in multiple stages.
+
+00:31:40.380 --> 00:31:41.860
+And that's what I thought of.
+
+00:31:42.600 --> 00:31:43.120
+Sorry?
+
+00:31:43.340 --> 00:31:43.420
+Yeah.
+
+00:31:43.900 --> 00:31:44.620
+That's super clever.
+
+00:31:44.840 --> 00:31:54.900
+I really love the idea of providing something for the sponsors, but still not turning it into, well, here's a paid version of our product, and here's the open source version,
+
+00:31:54.900 --> 00:32:03.280
+but there's always this tension of, how do you reward the people who support you without undermining the open source project, and that's a clever angle.
+
+00:32:04.060 --> 00:32:04.140
+Yeah.
+
+00:32:04.260 --> 00:32:06.100
+So that's extremely challenging.
+
+00:32:06.880 --> 00:32:18.800
+So, as I'm telling this, so this is what I came up with, and I thought maybe it could work, something like that, and again, my developer friends, they said, it will never work, nobody will pay for open source, you're insane.
+
+00:32:19.320 --> 00:32:33.360
+Spoiler alert, it did work, and in the end we made $200k a year off it and could build a team and everything. So I know in Silicon Valley terms this is probably minimum wage, but in Europe it's quite an amount, with which you can work very well.
+
+00:32:34.380 --> 00:32:48.880
+And yeah, so I started this program in 2020, and it grew steadily, and it finally allowed me to work on features outside of the Sunday, so invest more hours into it, and I finally learned Python in 2021, wrote my first plugin,
+
+00:32:48.880 --> 00:33:03.480
+and started hacking on the MkDocs features that, well, got turned down when we tried to upstream them, where the maintainer said, ah, it's maybe not a good fit, or we don't have the time for it. And yeah, in total,
+
+00:33:03.480 --> 00:33:17.540
+I wrote 12 MkDocs plugins. So it started as a theme, but it turned into a powerful docs framework in the end, and this worked quite well for several years, until it didn't anymore, and that's the reason why Zensical
+
+00:33:17.540 --> 00:33:18.940
+then came into being.
+
+00:33:18.940 --> 00:33:29.780
+So the way it didn't work is that, just, where you wanted to take it started to diverge from MkDocs, or you couldn't get your changes upstreamed or committed back?
+
+00:33:30.800 --> 00:33:45.620
+So the thing was that MkDocs was not evolving as we needed it. So historically, MkDocs had a sequence of single maintainers, and as far as I know, all of them worked on it in their spare time, because they had
+
+00:33:45.620 --> 00:34:00.120
+regular jobs, and Material was evolving quickly, because, you know, we had funding, we could invest much more time in it than, of course, an open source project that is only maintained in the spare time, and so it was changing too slowly,
+
+00:34:00.120 --> 00:34:14.620
+so we started a lot of discussions on necessary API changes, because for many users, Material for MkDocs was MkDocs. So we were kind of like the storefront where most of the issues and, like, bug reports and feature requests came in,
+
+00:34:14.620 --> 00:34:29.100
+because many people are using Material for MkDocs, and with this, MkDocs, basically. And the main challenges that we faced were performance and plugin orchestration. I mentioned I wrote 12 plugins, and
+
+00:34:29.100 --> 00:34:43.760
+it's very hard to make them cooperate, and if you look at any popular MkDocs plugin's issue tracker, you will find issues that go something like, well, this plugin is incompatible with this plugin, well, if I change the
+
+00:34:43.760 --> 00:34:58.500
+order of the plugins in the configuration, this and this happens. And both of those problems were brought to us again and again by the users with which we talked, and so it was coming up a lot. Then suddenly,
+
+00:34:58.500 --> 00:35:13.240
+after nine years, the original maintainer returned to MkDocs, and we were super optimistic, because the project was maintained again. He also started a sponsorship program, we upstreamed some of our funding immediately and supported his work. So before, MkDocs had no
+
+00:35:13.240 --> 00:35:28.180
+way to sponsor them, and the moment this went live, we immediately supported it, and some PRs were finally merged and issues were closed. But yeah, then the 
 work went silent, and he started working
+
+00:35:28.180 --> 00:35:42.340
+basically in the quiet, and three months later, we were invited to a video call. So we as maintainers, so I as the maintainer of Material for MkDocs and some other key ecosystem maintainers,
+
+00:35:42.340 --> 00:35:57.160
+and we learned that the plans for MkDocs 2.0 were completely different from what currently exists, MkDocs 1.x, which primarily means no plugin
+
+00:35:57.160 --> 00:36:11.680
+API, and customization via templating alone. So we already knew this is not enough, because that's what we'd done the first four years, where, as I mentioned, I was only doing templating, and some things with templates, for instance,
+
+00:36:11.680 --> 00:36:26.260
+having tag support, where you need to pull in different tags from different pages and then render them on another page or so, you need synchronization efforts, and you can't do this with templating. By the way, all of this information is public,
+
+00:36:26.260 --> 00:36:40.740
+so you can read it on the MkDocs issue tracker, so I'm not telling anything secret or so. It's a completely different direction, and our concerns were dismissed.
+
+00:36:40.740 --> 00:36:55.420
+So MkDocs 2.0, as it looks right now, is incompatible with Material for MkDocs. 300 plugins in the ecosystem will become useless, and tens of thousands of projects will be affected. And for us, so, we had absolutely no choice but to
+
+00:36:55.420 --> 00:37:10.360
+start building something, so to make something of this, because at the time we already had 50,000 projects, 50,000 public projects, depending on us. We were talking to enterprise users, and we knew that this number
+
+00:37:10.360 --> 00:37:25.140
+is much, much higher. So for instance, one of our professional users, they already also sponsored Material, they have two and a half thousand projects internally. So just one company, and they have a dedicated team of
+
+00:37:25.140 --> 00:37:39.360
+individuals that 
 maintain their customizations on top of Material for MkDocs for all of the teams inside the company. It's a very big company. So that's incredible, what you could infer from that. I
+
+00:37:39.360 --> 00:37:54.240
+couldn't believe it, I couldn't believe it at all. So, absolutely insane. Yeah, so as I mentioned, we had no choice, so what we did was, we immediately went back to the drawing board with the learnings from the almost 10 years
+
+00:37:54.240 --> 00:38:09.220
+that passed since I started Material. We built a lot of prototypes in TypeScript and Python, iterated on them. We did a lot of conceptual work, things that could actually be done with a radically different architecture,
+
+00:38:09.220 --> 00:38:24.180
+because writing 12 plugins, I know the ins and outs of MkDocs. I had to do a lot of hacks to make the blog plugin of Material work with the way navigation works in MkDocs. And the number
+
+00:38:24.180 --> 00:38:39.000
+one complaint, as I mentioned, was MkDocs is slow and it doesn't scale. So fixing a typo, you're doing a full rebuild, and this can take minutes. So our design work centered exactly around this problem, and after
+
+00:38:39.000 --> 00:38:53.540
+a short while, we knew exactly what MkDocs should look like, and we didn't want to let our users down. So in essence, we had two options: we know what it should look like, we could fork it, or we could start from scratch, and
+
+00:38:53.540 --> 00:39:08.340
+forking is not really possible because of the way Python dependencies work. So all of the plugins have a dependency on MkDocs, and this means that we would also need to fork all of, so, without doing black magic with
+
+00:39:08.340 --> 00:39:23.020
+imports, which might not be the best idea, we would also need to fork all plugins, or all plugins would need to switch to the fork. So this would be like moving an entire city at once, and it's frankly
+
+00:39:23.020 --> 00:39:37.860
+impossible. And if we would fork it, we wouldn't be able to realize our 
 learnings that we gained in the groundwork that we did, so we had to start from scratch, actually. Right, plus you'd have to convince the entire community to at
+
+00:39:37.860 --> 00:39:52.780
+least create a parallel package, because when you pip install that other plugin, it's going to say, hey PyPI, I need mkdocs, and now it
+
+00:39:52.780 --> 00:40:07.700
+would be a big battle, wouldn't it, just technically. Or you'd have to move the community, which is a very challenging thing to do. Yeah, and so for us, the most sensible thing was to just, you know, start from scratch and make it
+
+00:40:07.700 --> 00:40:22.660
+as compatible as possible. It became quite clear very quickly that we need to optimize for compatibility, because if you create something that is not compatible, and it's work
+
+00:40:22.660 --> 00:40:37.560
+to get over to something else, you won't get a lot of adoption. All you gotta do is think about that 2,500-project team, like, okay, how do I keep them working with this, right? Yes, yes. Yeah, so what we
+
+00:40:37.560 --> 00:40:52.440
+then did is, we had an idea how it should look, then we started with Rust, because it was recommended to us. So it was very hard at first, and in total,
+
+00:40:52.440 --> 00:41:07.000
+it was not only writing code, it was also exactly knowing where we want to go, because we're starting fresh, so we better be sure that we are going in a direction where we actually want to go for the next 10 to 20 to 30 years,
+
+00:41:07.000 --> 00:41:21.900
+it depends. So we are really in this for the long game. So the 10 years that I've been doing this, I see that this is only the start. And we wrote a lot of things from scratch, so the runtime, as I mentioned, it's like the
+
+00:41:21.900 --> 00:41:36.720
+heart of Zensical, it already has something like 15,000 lines of code, a tiny HTTP middleware framework for file serving, because we also want to make the file server extensible and don't want to force users into async
+
+00:41:36.720 
 --> 00:41:51.700
+Rust, and also don't have a dependency on Tokio. In JavaScript, for instance, there's Lerna, and it has 800
+
+00:41:51.700 --> 00:42:06.560
+dependencies, so when you install it, what you pull down is just insane. So we worked a lot on the processes as well, so that we can make releases very easily, and we have a good way of working, basically, and we are very careful about
+
+00:42:06.560 --> 00:42:21.400
+our choice of dependencies. So if it's something that you can write quite quickly, actually, and we'd rather own it in order to make changes ourselves, we rather write it from scratch. I
+
+00:42:21.400 --> 00:42:36.240
+think that's a very healthy philosophy, and also, I think in this agentic AI world that we're in these days, if you just need one or two functions, you used to think, well, maybe I'll lean on this, and in your case a crate, or maybe a PyPI package
+
+00:42:36.240 --> 00:42:51.120
+or something. I started using pip-audit for a lot of my projects, and I would say, for my
+
+00:42:51.120 --> 00:43:05.940
+bigger projects, every two weeks I get at least one CVE vulnerability notification for something I'm using. But here's the thing, it's probably in a piece of code or functionality of that
+
+00:43:05.940 --> 00:43:20.760
+package that I don't even use or care about, so it doesn't really apply to me, but then I've got all these, here's an issue, that I'm
+
+00:43:20.760 --> 00:43:35.320
+going to be fine with, you know what I mean? I think things are swinging back a little bit from, let's just pull in everything because it's going to help us, to, well, maybe not everything. Yeah, and also you
+
+00:43:35.320 --> 00:43:49.900
+can't just change things easily, and you depend on other APIs. So for instance, one of the reasons why we chose to build a lot of things from scratch is that we want to control the public API. So
+
+00:43:49.900 --> 00:44:04.660
+the worst thing for us would probably just be to export a third-party API that we're using as 
 part of our public interface, because it's Rust, so it would mean that if this public API would change, the entire ecosystem would break. So we're very
+
+00:44:04.660 --> 00:44:05.200
+careful
+
+00:44:05.940 --> 00:44:19.960
+to what APIs we expose, and rather wrap them in order to be safe, so we can keep things replaceable. So maybe you have the philosophy of, it might be okay to use this crate, but we don't
+
+00:44:19.960 --> 00:44:34.720
+expose its types as part of our public API, or something along those lines. We don't expose it, so, in some instances, the wrappers that I wrote are identical to the types
+
+00:44:34.720 --> 00:44:49.480
+that we wrap, whether using our own types or just wrapping them, because in Rust, the nice benefit is you have zero-cost abstractions, so all the code is monomorphized and inlined, so you don't pay for wrapping code. That's
+
+00:44:49.480 --> 00:45:02.400
+the absolutely crazy thing, so you can finally create a really clean architecture without runtime penalties, if you do it right. Oh, that's wild. Yeah, yeah, very interesting. So you can see, I have this huge list of
+
+00:45:07.960 --> 00:45:22.220
+I'd like to go back to this components section, there you have components, that was in the other part. Let's just talk through some of these things here. So you've got, like, admonitions,
+
+00:45:22.220 --> 00:45:36.840
+buttons, code blocks. Let's talk through some of the building blocks, I guess, that you think are interesting here. Yeah, so I think most of the, so if you're new to technical writing, most of the stuff shouldn't be quite new, so like
+
+00:45:36.840 --> 00:45:51.600
+admonitions, code blocks, stuff like that you've probably seen, or data tables. Diagrams are just Mermaid diagrams, as you can use them on GitHub. One of the, so like the flagship features in Material,
+
+00:45:51.600 --> 00:46:06.000
+and now Zensical, as I mentioned, like code annotations, which is a part of code blocks. Otherwise, we also have an 
 icon and emoji integration, so you can use one of, I think we have something like over 10,000 icons now,
+
+00:46:06.000 --> 00:46:20.460
+with a quite simple syntax. That's not standard Markdown, that's the problem, so that's like a Python Markdown extension, and we're working on moving this over to CommonMark and finding a way to migrate this over, because
+
+00:46:20.460 --> 00:46:28.360
+right now, Zensical uses Python Markdown for compatibility with Material for MkDocs, which means that for Markdown
+
+00:46:32.260 --> 00:47:02.260
+we
+
+00:47:02.260 --> 00:47:16.960
+the same parsing engine to have this strong compatibility, right? We can even read mkdocs.yml configuration, so you can build an MkDocs project with Zensical as it stands. The thing that we currently don't support in its entirety is the
+
+00:47:16.960 --> 00:47:31.840
+plugins from the ecosystem. We already support some plugins, for instance the mkdocstrings plugin, the author is also part of the Zensical team now, with mkdocstrings being the second biggest project in the MkDocs space, so we are
+
+00:47:31.840 --> 00:47:46.440
+happy to have Tim on board, and several other plugins. But as I mentioned, Zensical uses modules, so what we will do in the end is, we will still always be able to read MkDocs configuration and map the plugin
+
+00:47:46.440 --> 00:48:01.240
+configurations to equivalent Zensical modules. So the logic will be completely rewritten, but you will be able to migrate your project with a command. That's our goal, because so much work has been going
+
+00:48:01.240 --> 00:48:15.920
+into projects built with Material and MkDocs, so we need to make it easy for users and organizations to switch, and this is the main part we're working on in 2026. I think this is,
+
+00:48:15.920 --> 00:48:30.880
+it's critical, right? Your absolute best users, you know, like that big company, but many others, of course, they're not going to rewrite everything. Well, maybe they will, but many of them won't rewrite 
 everything, they'll just use an old version and grin and bear it
+
+00:48:30.880 --> 00:48:32.280
+as long as they have to, you know what I
+
+00:49:52.340 --> 00:50:07.200
+started working on search before we started working on Zensical. Yeah, I noticed how nice the search was when I was playing with it. So, is zensical.org itself built with Zensical? Yeah, of course, and it's actually built with an mkdocs.yml,
+
+00:50:07.200 --> 00:50:21.980
+because we're dogfooding, so you can also build it with MkDocs, with Material for MkDocs, the project layout is exactly the same. Yeah, you know, I find that there's just a bunch of static sites that seem to have, I don't know
+
+00:50:21.980 --> 00:50:36.660
+what's going on with them, but their search is really bad. Either they've just integrated some kind of Google thing, where it's a site colon and your URL in the search, which is a real bad experience, or you go search and it spins and it spins and then
+
+00:50:36.660 --> 00:50:51.100
+eventually it pulls up. So it looks like you are pre-computing these types of things or something with your search engine, or you've got some cool data structure to make that fast, right? Well, it's not one cool data structure, that would be great, because then everybody could just use it, but
+
+00:50:51.100 --> 00:51:03.660
+several months of work went into the search, of course, so it's a project of its own. As I mentioned,
+
+00:51:03.660 --> 00:51:03.940
+it's
+
+00:51:03.940 --> 00:51:18.280
+also completely modular. And the reason why most of the search engines that are open source, so like the libraries that you can use, not services you have to pay for, don't provide results that are really relevant,
+
+00:51:18.280 --> 00:51:32.580
+is that they use BM25, which is the standard bag-of-words ranking algorithm for information retrieval, and this doesn't nicely pair with autocomplete. So what you get is, you start typing
+
+00:51:34.180 --> 00:51:49.000
+documents, the balancing will be off, because the relevance is computed based on the occurrence of a word in the entire corpus, so you add a new document, those weights change again.
+
+00:51:49.000 --> 00:52:03.800
+The search that we have, we of course as a baseline also have a BM25 implementation, but the implementation you're seeing is a tie-breaking implementation, which provides much better accuracy, and you can configure it. So tie-breaking
+
+00:52:03.800 --> 00:52:18.340
+means, okay, we first look into the title of the document and see if we have matches, then how many matches, then where they are, then we look into the navigation and the path, and then into the body of the document, and so on. All of this is configurable,
+
+00:52:18.340 --> 00:52:32.980
+and this is also why we believe that this alone will also be a very interesting project for others, for instance static site generators, to integrate. And you asked about pre-computing, so no, this is a search from the
+
+00:52:32.980 --> 00:52:47.420
+documents. We build a search index, which is a stripped-down version of the HTML that is rendered when you load the page. It's one JSON that we ship to the client, and for most pages, actually, this JSON is below one megabyte. You can
+
+00:52:47.420 --> 00:53:02.080
+gzip it, so compress it, then it's something like 200K, and you have extremely fast search on the client with no cost. And so we believe that for 90, 95, maybe 99 percent of
+
+00:53:02.080 --> 00:53:16.880
+documentation sites, or 
 sites in general, this client-side search is basically the way to go, because it's fast and it doesn't require you to pay for anything, and there are several SaaS-based services that can be extremely expensive
+
+00:53:16.880 --> 00:53:31.820
+when you do the math. So yeah, you only need to use a server basically when the index becomes too big to ship to the client, and we're also working on that, by the way. Okay, that's really cool. You could shard the index or
+
+00:53:31.820 --> 00:53:40.200
+something like that, right? I suppose, like, you could say, we're going to have 26 index bits, and only if the word starts with an A do you
+
+00:53:48.400 --> 00:54:03.280
+some other interesting solutions, like Pagefind is a pretty interesting library. It does a completely different approach, but it's not as snappy as the search that we ship to the client. I use Pagefind
+
+00:54:03.280 --> 00:54:18.060
+for my personal website, which is a static site. Yeah, it's also a great, great solution, but some things you won't be able to implement in Pagefind properly, because, so, with software it's trade-offs all the way. So, well, I'm
+
+00:54:18.060 --> 00:54:32.900
+already thinking, like, I better pay attention to this when it comes out, so maybe adopt it for some stuff. Beautiful. Okay, we got a couple of interesting questions sort of following up from the component side of things. Jamsack says, do you
+
+00:54:32.900 --> 00:54:47.300
+foresee community-led templates or themes for Zensical? I know you have, like, two themes that I see, something along those lines, a couple of themes that you can choose now, but what is the theme story, I guess, I want to ask you more broadly. Yeah, so
+
+00:54:47.300 --> 00:55:02.240
+absolutely. So right now we have only this one theme. We have this variant setting, where you can choose, like, the classic variant, which, when you move over from Material for MkDocs, looks exactly the same. This is also why we
+
+00:55:02.240 --> 00:55:17.080
+needed to keep the HTML as 
 it is. Also, once we move to the component system, we will make it possible to, one, use components
+
+00:55:17.080 --> 00:55:32.000
+within Markdown, and two, also create a template engine that is based on components. This will allow us much, much faster rendering, because, for instance, if you render the header for a site, it's a lot of HTML, because there's
+
+00:55:32.000 --> 00:55:46.940
+the search box in it and some other stuff, but only the title changes. So we will also make the rendering differential as part of the build, that's the plan. And with this, we will also make it open to theme developers, of course. So there will be
+
+00:55:46.940 --> 00:56:00.640
+packaging, for instance, compilation of Sass styles or TypeScript or so will be part of Zensical, so you don't need to precompile the theme, like we needed to do for the last 10 years for Material.
+
+00:56:00.640 --> 00:56:15.220
+So it will have a proper asset pipeline, it will have a proper process to install themes. All of this is planned, but right now we focus on feature parity, in order to make it possible for more users to migrate right now. That's really interesting,
+
+00:56:15.220 --> 00:56:29.600
+that you would deliver the theme as basically its original source, not its rendered, compiled, or transpiled version, right? To keep it, yes, part of the build step, right? Yes, exactly, because
+
+00:56:32.620 --> 00:56:47.420
+the sidebar disappears too early for my taste, and this is not, so, for this, you have to go through the compilation step again and basically fork the theme
+
+00:56:47.420 --> 00:57:02.320
+and recompile it. We want to make this configurable, so that you can, yeah, so, you know, configure the theme and build it, and it just works. So this, like, you know, it just works, that's what
+
+00:57:02.620 --> 00:57:17.520
+we're working towards, making it as simple as possible. Yeah, yeah, very cool. We're maybe getting short on time here, maybe wrap up our chat talking about two things: the future, where are you going, you 
talked about compatibility being a big
+
+00:57:17.520 --> 00:57:28.660
+part of things going forward in 2026 but also sustainability right you had all these great supporters for Material for MkDocs which you must have just been absolutely
+
+00:57:28.660 --> 00:57:43.000
+thrilled to realize how successful that was right going from I'll put up a wish list and then actually people love this I can put all my energy into it I know how great of a feeling that is right that's completely insane and when I
+
+00:57:43.000 --> 00:57:57.920
+started it I would never believe that this would be my job at some point really I feel the same way about the podcast and it's just I'm so grateful for it's amazing but then with this transition to Zensical how does that change anything
+
+00:57:57.920 --> 00:58:12.860
+or what's the story how do you bring that support over to Zensical as we don't have a lot of time I'll try to explain it as compactly as possible so we are saying goodbye to this pay for extra features so
+
+00:58:12.860 --> 00:58:27.820
+in Material you needed to be a sponsor in order to get the latest features earlier what we will do is everything is open source from the start so for users it's completely free and we are shifting our model from the sponsorships
+
+00:58:27.820 --> 00:58:41.680
+to something we call Zensical Spark because what we discovered talking a lot to our professional users is that the more we know about the problem space and the better we understand the problem space and the more we can collaborate with them the
+
+00:58:41.680 --> 00:58:56.660
+better degrees of freedom we can provide so we don't intend to just ship feature feature feature but we intend to create degrees of freedom so that you can adapt Zensical to the processes within your organization how
+
+00:58:56.660 --> 00:59:11.240
+they work to the workflows etc which are all different which is all very diverse basically so Spark is a space where you as a company can get a seat and
together with us shape Zensical as part of high level discussions where we
+
+00:59:11.240 --> 00:59:26.200
+explore the problem space we create proposals so on the website you have clicked on the Spark section there's this Zaps in progress we call them Zaps Zensical Advancement Proposals it's on the left side we write very elaborate detailed
+
+00:59:26.200 --> 00:59:34.640
+proposals on specific topics that we intend to work on and then with the feedback that we get iterate on them and create
+
+00:59:39.000 --> 00:59:53.780
+a solution that is opinionated but that is as unopinionated as possible and the third thing that you get besides the opportunity to
+
+00:59:53.780 --> 01:00:08.380
+have high level discussions with us is professional support we've been asked for this quite a lot by companies in Spark you can basically get
+
+01:00:08.380 --> 01:00:23.020
+direct access to the team and also we have those open video calls where we share our progress and where you can get a window of support and we talk about any problem that is keeping you up at night
+
+01:00:23.020 --> 01:00:34.820
+basically and stuff like migrations or how do you do this and this in Zensical and yeah it's been a blast so we're really happy that the organizations are enrolling into this
+
+01:00:38.380 --> 01:00:53.360
+that might translate quite well to other projects because you get a huge competitive advantage you know exactly what to build yeah you're talking to the actual users they're saying this is the thing that really is hard for us or you just get maybe they don't say it but
+
+01:00:53.360 --> 01:00:57.220
+you see it right exactly yes yes and talking to the users is the best
+
+01:01:08.380 --> 01:01:18.980
+and this new project I'm very excited to see it coming along and it looks like it's going to be great maybe a final call to action for people like can they go ahead and start using Zensical if
+
+01:01:38.380 --> 01:01:52.800
+it has a lot of built in functionality already you get all of these components that we talked about free search that you don't have to host a very modern static site that is great on mobile so just give it a try and we have a newsletter
+
+01:01:52.800 --> 01:01:59.980
+where we once a month share the latest updates and that might also be worth checking
+
+01:02:08.380 --> 01:02:26.700
+it
+
+01:02:26.700 --> 01:02:38.480
+great to see as many users as possible and shape the future of Zensical together with all of you to our sponsors be sure to check out what they're offering it really helps support the show
+
+01:03:08.280 --> 01:03:22.960
+best of all there's no subscription in sight browse the catalog at talkpython.fm and if you're not already subscribed to the show on your favorite podcast player what are you waiting for just search for python in your podcast player we should be right at the top if you enjoy that
+
+01:03:22.960 --> 01:03:33.040
+geeky rap song you can download the full track the link is actually in your podcast player show notes this is your host Michael Kennedy thank you so much for listening I really appreciate it I'll see you next time
+
+01:03:59.860 --> 01:04:00.660
+is the norm

From e811e50eed2070a1bece172da7139d877e6d457c Mon Sep 17 00:00:00 2001
From: Michael Kennedy
Date: Fri, 10 Apr 2026 09:58:08 -0700
Subject: [PATCH 09/16] transcripts

---
 ...l-next-packaging-peps-transcript-final.txt | 2092 +++++++++++
 ...l-next-packaging-peps-transcript-final.vtt | 3139 +++++++++++++++++
 2 files changed, 5231 insertions(+)
 create mode 100644 transcripts/544-wheel-next-packaging-peps-transcript-final.txt
 create mode 100644 transcripts/544-wheel-next-packaging-peps-transcript-final.vtt

diff --git
a/transcripts/544-wheel-next-packaging-peps-transcript-final.txt b/transcripts/544-wheel-next-packaging-peps-transcript-final.txt
new file mode 100644
index 0000000..739209e
--- /dev/null
+++ b/transcripts/544-wheel-next-packaging-peps-transcript-final.txt
@@ -0,0 +1,2092 @@
+00:00:00 When you pip install a package with compiled code, the wheel you get is built for CPU features from 2009.
+
+00:00:06 Want newer optimizations like AVX2? Your installer has no way to ask for them.
+
+00:00:11 Want GPU support? You're on your own configuring special index URLs.
+
+00:00:16 The result is fat binaries, nearly gigabyte-sized wheels, and install pages that read like puzzle books.
+
+00:00:22 A coalition from NVIDIA, Astral, and Quansight has been working on WheelNext, a set of PEPs that let packages declare what hardware they need and let installers like uv pick the right build automatically.
+
+00:00:34 Just uv pip install torch and it'll work.
+
+00:00:37 I sit down with Jonathan Decker from NVIDIA, Ralf Gommers from Quansight and the NumPy and SciPy teams, and Charlie Marsh, founder of Astral and creator of uv, to dig into it all.
+
+00:00:47 This is Talk Python To Me, episode 544, recorded March 2nd, 2026.
+
+00:00:53 Talk Python To Me, yeah, we ready to roll.
+
+00:00:56 Upgrading the code, no fear of getting old.
+
+00:00:59 They sink in the air, new frameworks in sight, geeky rap on deck.
+
+00:01:03 Quark crew, it's time to unite.
+
+00:01:05 We started in Pyramid, cruising old school lanes.
+
+00:01:08 Had that stable base, yeah, sir.
+
+00:01:09 Welcome to Talk Python To Me, the number one Python podcast for developers and data scientists.
+
+00:01:14 This is your host, Michael Kennedy.
+
+00:01:16 I'm a PSF fellow who's been coding for over 25 years.
+
+00:01:20 Let's connect on social media.
+
+00:01:21 You'll find me and Talk Python on Mastodon, Bluesky, and X.
+
+00:01:25 The social links are all in your show notes.
+
+00:01:28 You can find over 10 years of past episodes at talkpython.fm.
+
+00:01:31 And if you want to be part of the show, you can join our recording live streams.
+
+00:01:35 That's right.
+
+00:01:36 We live stream the raw, uncut version of each episode on YouTube.
+
+00:01:40 Just visit talkpython.fm/youtube to see the schedule of upcoming events.
+
+00:01:44 Be sure to subscribe there and press the bell so you'll get notified anytime we're recording.
+
+00:01:48 This episode is brought to you by Sentry.
+
+00:01:51 You know Sentry for the error monitoring, but they now have logs too.
+
+00:01:55 And with Sentry, your logs become way more usable, interleaving into your error reports to enhance debugging and understanding.
+
+00:02:02 Get started today at talkpython.fm/sentry.
+
+00:02:06 And it's brought to you by Temporal, durable workflows for Python.
+
+00:02:10 Write your workflows as normal Python code and Temporal ensures they run reliably, even across crashes and restarts.
+
+00:02:17 Get started at talkpython.fm/Temporal.
+
+00:02:21 Hey, a quick announcement for everyone taking courses over at Talk Python Training.
+
+00:02:25 We just rolled out course completion certificates.
+
+00:02:28 I'm really excited about these.
+
+00:02:29 When you finish a course, you can now generate a certificate automatically.
+
+00:02:33 The best part is there's a one-click button to add it straight to LinkedIn on your profile as an official certificate.
+
+00:02:40 Potential employers, current colleagues, they'll all see it right there on your profile.
+
+00:02:43 Just head over to your account page at Talk Python Training, find a course you finished, click certificate, and the share to LinkedIn option is right there.
+
+00:02:52 Zero friction.
+
+00:02:53 And if your employer gives you credit for professional development or reimburses you for training costs, but requires some sort of proof, you can also download a full certificate as a PDF.
+
+00:03:04 Handy for that kind of thing.
+
+00:03:05 I'd love to see a wave of Talk Python certificates showing up on LinkedIn.
+
+00:03:09 Head over to Talk Python, click courses, go to your account page, and grab your certificates.
+
+00:03:15 Jonathan, Ralf, and Charlie, welcome.
+
+00:03:17 Welcome back, depending on which one of you are hearing this.
+
+00:03:21 Welcome to the show, you all.
+
+00:03:22 It's awesome to have you on Talk Python To Me.
+
+00:03:24 Thanks for having us.
+
+00:03:24 Thanks for having us.
+
+00:03:25 Thanks for having us, Michael.
+
+00:03:27 We're going to dive in deep to Python packaging and really look at how the needs of Python packaging have evolved.
+
+00:03:36 And what you all, as well as a group of a bunch of other people, I see very long contributor lists on these PEPs.
+
+00:03:42 So a lot of people involved in this project.
+
+00:03:44 Really great.
+
+00:03:45 So let's get into it.
+
+00:03:47 Before we do, let's just do a quick round of intros for you all.
+
+00:03:51 I guess go around clockwise.
+
+00:03:53 Jonathan, you can go first.
+
+00:03:54 I've worked at NVIDIA for like, I think, the better part of eight years now.
+
+00:04:00 I did all kinds of different roles.
+
+00:04:02 But very recently, I mean, over the last two something years, I moved into improving our CUDA and Python offering, trying to find better ways to expose GPU programming, essentially, at the Python layer.
+
+00:04:17 And I think for a little bit over a year, I've been working with Ralf and Charlie on multiple proposals to improve Python packaging.
+
+00:04:26 An initiative called WheelNext.
+
+00:04:28 And I think we'll talk a little bit more about this.
+
+00:04:30 So excited to be on the show today.
+
+00:04:33 Yeah.
+
+00:04:33 Excited to have you.
+
+00:04:34 You have really seen the roller coaster at NVIDIA, I'm sure.
+
+00:04:38 Right?
+
+00:04:38 Well, it's really exciting.
+
+00:04:39 Yeah, it was like gaming and probably some data science and then all the changes and now just center of the universe.
+
+00:04:47 So I'm sure it is exciting.
+
+00:04:48 You know, the funny thing is I wanted to join NVIDIA for 15 years.
+
+00:04:52 And I did a PhD to actually be able to join NVIDIA.
+
+00:04:56 That is so awesome.
+
+00:04:58 I love it.
+
+00:04:58 I was amazed by the CUDA technology when I was in high school.
+
+00:05:02 And I was like, ah, this is so incredible, the concept.
+
+00:05:06 And I wanted to join.
+
+00:05:07 So I'm happy I was able to make this happen.
+
+00:05:10 You know, CUDA is going to be an important part of this discussion.
+
+00:05:13 Not the only part, but it certainly is one of the forcing functions for the things happening here.
+
+00:05:18 Give people the background on CUDA.
+
+00:05:21 What is it?
+
+00:05:22 How does it work?
+
+00:05:22 Why is it so amazing?
+
+00:05:24 Well, CUDA is essentially a programming language that allows you to program on GPUs, specifically NVIDIA GPUs, and has a different programming model than what you would usually do in C.
+
+00:05:37 So because GPUs are fundamentally very different than CPUs, you have to program them with a different mindset.
+
+00:05:44 Like, for example, the biggest important thing when you start with GPUs is to not think about a single thread executing the instruction.
+
+00:05:52 But like, how can you massively parallelize a task on like thousands of threads at once?
+
+00:05:58 And it takes a different perspective and mode of thinking to how can you imagine doing a task on so many threads at the same time?
+
+00:06:08 We're not used to it as classic computer scientists.
+
+00:06:13 If anything, multi-threading is something that we tend to shy away from because there's a lot of caveats.
+
+00:06:20 But well, GPU programming is all about how can you have as many threads as possible.
+
+00:06:25 Yeah.
+
+00:06:26 It comes from graphics and videos where like this pixel is computed independently of that pixel.
+
+00:06:33 And we've got, you know, 5K resolution.
+
+00:06:36 So let's just break that up, right?
+
+00:06:38 Yeah.
+
+00:06:38 It's exactly the idea.
+
+00:06:40 And now we have this reasonably new model that's called tile programming that abstracts it even more, which essentially instead of thinking about threads and blocks and grids, you think in terms of tiles.
+
+00:06:52 So kind of a mini representation that you could have in mind.
+
+00:06:56 And that thing can scale and adapt differently on different hardware.
+
+00:07:00 So pretty cool.
+
+00:07:01 But yeah, that is amazing.
+
+00:07:03 People think that their CPU has a lot of cores.
+
+00:07:06 It's got nothing on the graphics cards.
+
+00:07:10 Well, yeah.
+
+00:07:11 It's a different type of hardware.
+
+00:07:14 Which is different.
+
+00:07:15 Absolutely.
+
+00:07:16 Yeah.
+
+00:07:16 Well, very cool.
+
+00:07:16 Very cool.
+
+00:07:17 And what a journey if you did all that work to get there.
+
+00:07:20 I absolutely love it.
+
+00:07:22 Ralf.
+
+00:07:22 Welcome.
+
+00:07:23 Hello.
+
+00:07:23 Yeah.
+
+00:07:24 Thanks, Michael.
+
+00:07:24 Great to be here.
+
+00:07:25 So about me.
+
+00:07:26 I am a physicist by training.
+
+00:07:29 I did a PhD in atomic and quantum physics.
+
+00:07:32 Worked in the semiconductor industry for a while.
+
+00:07:35 And I rolled into scientific computing.
+
+00:07:38 Due to that, I started using Python in 2004.
+
+00:07:41 And used the mailing list at that point.
+
+00:07:43 Because there was, I mean, NumPy didn't exist yet.
+
+00:07:46 There was no documentation for anything.
+
+00:07:48 So you had to join a mailing list.
+
+00:07:49 That's how I rolled into open source early on.
+
+00:07:51 I became the release manager of NumPy and SciPy in 2010.
+
+00:07:56 And yeah, I've been kind of doing that ever since.
+
+00:07:59 As a volunteer for 10 years.
+
+00:08:00 And then it got really too much.
+
+00:08:02 So I made it my job.
+
+00:08:03 I joined Quansight, which is a small consulting company.
+
+00:08:06 Primarily around like data science, applied AI, scientific computing.
+
+00:08:10 And yeah, I'm now one of the two co-CEOs of Quansight.
+
+00:08:14 Awesome.
+
+00:08:15 Trying to basically, we just converted last year to a public benefit corporation.
+
+00:08:19 Which is very much aligned with what, you know, most of our team wants to do.
+
+00:08:23 Most of them are open source maintainers.
+
+00:08:25 And yeah, we basically do consulting to allow ourselves to make impactful open source contributions.
+
+00:08:32 Quansight is doing a ton in the data science space.
+
+00:08:35 Scientific computing space, for sure.
+
+00:08:37 I've had multiple rounds of Quansight folks on the show and things like that.
+
+00:08:42 And very neat.
+
+00:08:43 Yeah, it's a lot of fun and rewarding.
+
+00:08:45 So yeah, glad to be here.
+
+00:08:47 It's an interesting transition going from a science or something along those lines into programming, right?
+
+00:08:53 And I got into it through working in my math research and so on.
+
+00:08:58 And actually, this is just more fun.
+
+00:09:00 I'm just going to do programming.
+
+00:09:01 It's not exactly physics, but it's pretty similar, you know?
+
+00:09:05 Yeah.
+
+00:09:05 I mean, I've always liked both.
+
+00:09:07 But I did experimental physics.
+
+00:09:10 And there, you have much less control over what you end up producing.
+
+00:09:15 You know, building and using lasers in the lab.
+
+00:09:17 If one broke, maybe I had to send it off for repairs and wait a month, right?
+
+00:09:21 And what do you do in the meantime?
+
+00:09:22 You program.
+
+00:09:23 So, you know, that's one of the nicer things about it.
+
+00:09:26 And yeah, I gradually started with like, even before Python, there was some MATLAB.
+
+00:09:31 And then, you know, you roll into open source.
+
+00:09:33 And then, you know, mostly just Python, a bit of C, and kind of like go down from there.
+
+00:09:37 And then you encounter packaging.
+
+00:09:39 And it's one of those things that like only 5% of people like and the rest see it as a chore.
+
+00:09:43 But yeah, when you like it, you just have to do more and more of it.
+
+00:09:46 Yeah.
+
+00:09:47 You're with your people now, I think, on this call.
+
+00:09:49 That's for sure.
+
+00:09:50 Hey, Charlie.
+
+00:09:51 I mean, do we even need to give you an introduction?
+
+00:09:53 We just say uv and then go on or?
+
+00:09:56 No, I'll generally.
+
+00:09:57 No, please do.
+
+00:09:59 I'm just kidding.
+
+00:10:00 But the reason I say that is uv has taken the world by storm, really.
+
+00:10:06 And congratulations.
+
+00:10:07 And yeah, tell people about yourself.
+
+00:10:08 Thank you.
+
+00:10:08 Yeah, yeah, of course.
+
+00:10:09 So my name is Charlie.
+
+00:10:11 I'm the founder and CEO of Astral.
+
+00:10:13 I've been working on the company for, let's see, started the company in October 2022.
+
+00:10:19 That's the easier way to do it.
+
+00:10:20 So I've been working on this for a few years.
+
+00:10:23 We mostly build open source.
+
+00:10:25 So we've worked on a couple of different tools that have become quite popular in Python.
+
+00:10:30 So we build Ruff, which is our linter and formatter.
+
+00:10:32 ty, which is our type checker.
+
+00:10:34 And then most relevant for this episode would be uv, which is our Python package and project manager.
+
+00:10:40 So yeah, we spend all our time thinking about how to build tools that make it easier to work with Python and how to make Python programming more productive.
+
+00:10:49 A lot of that's about speed.
+
+00:10:52 We try to build things that are really fast, but it's also about user experience and trying to sort of like take complexity out of the critical path for users.
+
+00:11:00 So, you know, for example, we've definitely spent a lot of time thinking about how we can make it easier for people to install PyTorch, which is, you know, one of the examples that will come up, I'm sure, you know, over the course of the show.
+
+00:11:12 And one of the motivating examples for the PEPs we've been working on.
+
+00:11:15 So, yeah, that's why I'm here.
+
+00:11:17 We've been collaborating with Jonathan, Ralf, and honestly, like a bunch of other people, too.
+
+00:11:21 It's been a really big effort, and I'm sure we'll get into that.
+
+00:11:23 But it's been cool to have this long running and very like wide ranging collaboration around trying to push Python packaging forward.
+
+00:11:30 Well, like I said, congrats on all the stuff with Astral.
+
+00:11:33 And we're going to talk a little bit about pyx, I think.
+
+00:11:36 Maybe see if there's any.
+
+00:11:37 Just to check in at the end of the show after we talk about some of these things, I think, if you're up for it.
+
+00:11:42 Yeah, sounds good.
+
+00:11:43 I mean, let's just start with what is the challenge.
+
+00:11:45 You all have described this as the lowest common denominator packaging problem that we've got to deal with.
+
+00:11:52 And the idea or the problem is different CPUs have specialized instructions, different graphics cards, all these different compute platforms and so on might have specific instructions.
+
+00:12:05 And they're optimized, right?
+
+00:12:06 Like do this as vector operations instead of on registers or whatever.
+
+00:12:10 But maybe some other thing that it might run on doesn't support that, right?
+
+00:12:15 I don't know, WebAssembly, whatever.
+
+00:12:16 Yeah.
+
+00:12:17 And so then how do you actually end up shipping something to Python people that takes advantage of the specializations that are there when they're there, but without breaking the other ones, right?
+
+00:12:28 That's kind of the core problem.
+
+00:12:29 Is that right?
+
+00:12:30 Yeah.
+
+00:12:30 I can take one little step back before.
+
+00:12:32 Go ahead.
+
+00:12:33 If you think about it like a wheel, when you take the Python package that we have everywhere, there are a few parts of the file name that essentially allow you to know what it's been built for.
+
+00:12:46 So inside this, you have, if it's a pure Python package, it's simple.
+
+00:12:50 You might have a minimum Python version, but in most cases, it's pretty generic.
+
+00:12:55 So that's not an issue.
+
+00:12:56 When you start having compiled code inside the package, that's a different story because now we're talking about what kind of OS it was built for.
+
+00:13:03 So Windows, macOS, Linux, different flavors.
+
+00:13:08 We're talking about the type of CPU that it was built for.
+
+00:13:11 So x86, ARM, PowerPC, potentially RISC-V, all these things.
+
+00:13:17 Mobile.
+
+00:13:18 And then finally, the Python ABI.
+
+00:13:20 And in most cases, it means the minimum Python ABI that you need.
+
+00:13:24 So, and for people, an ABI is essentially the same as an API, but for a binary.
+
+00:13:31 So it's important when things are stable at the ABI level because it allows you to be future compatible.
+
+00:13:38 What does ABI mean?
+
+00:13:39 Jonathan, help us out.
+
+00:13:40 What does ABI mean for those of us who don't know?
+
+00:13:42 Application binary interface.
+
+00:13:43 So it's the same as an API, but specifically for binaries.
+
+00:13:48 And the problem that we collectively kind of try to get to is that, well, today, the compute space and the scientific computing space,
+
+00:13:58 which if we take the latest JetBrains Python developer survey, is at least 40 to 50% of the Python developers are essentially doing data science or similar.
+
+00:14:10 So it's a massive percentage of the community is doing, in some form, scientific computing to whatever extent you may want to think about it.
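To make the filename anatomy Jonathan describes here concrete, the fields can be pulled apart with plain string handling. This is just an illustrative sketch assuming the standard wheel naming layout (name-version-pythontag-abitag-platformtag.whl) and ignoring optional build tags; the helper name is made up.

```python
# Illustrative sketch: splitting a wheel filename into the fields
# discussed above. Assumes the standard layout
#   name-version-pythontag-abitag-platformtag.whl
# and ignores optional build tags for simplicity.

def parse_wheel_name(filename: str) -> dict:
    stem = filename.removesuffix(".whl")
    parts = stem.split("-")
    # The last three components are always the compatibility tags.
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "python_tag": python_tag,      # e.g. cp312 = CPython 3.12
        "abi_tag": abi_tag,            # e.g. cp312, abi3, or none
        "platform_tag": platform_tag,  # e.g. manylinux2014_x86_64
    }

info = parse_wheel_name("numpy-2.1.0-cp312-cp312-manylinux2014_x86_64.whl")
print(info["platform_tag"])  # manylinux2014_x86_64
```

Note there is no field left over for anything like a SIMD level or a GPU requirement, which is exactly the gap being discussed.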
+
+00:14:21 And, well, the problem is when we do these things, we try to do them fast because who likes to wait on the return of some Pandas operation or NumPy or PyTorch operation.
+
+00:14:32 But to go fast, you need to use all the tricks in the book that you can get to. Essentially, you have to optimize the binary for a specific CPU,
+
+00:14:42 for a specific GPU, or for a specific library that you want to use, like BLAS.
+
+00:14:48 BLAS is a general concept, so which BLAS implementation it is, or MPI.
+
+00:14:53 And the problem is, well, we don't have the tags or markers to allow us to essentially flag this specific binary to be compatible with X, Y, and Z.
+
+00:15:03 Right, so the wheel might say, this is for 3.14, it is for ARM CPUs, and so on, but it's not going to say,
+
+00:15:14 and it supports this vectorization optimization on Intel chips, right?
+
+00:15:18 I just said ARM, didn't I?
+
+00:15:20 A very good example with ARM is that the default most people build with is actually a Raspberry Pi, ARM level.
+
+00:15:30 Yeah, yeah, yeah.
+
+00:15:31 And you can imagine that when you build for any type of desktop CPU, ARM, you have a little bit more complex CPUs and a little bit more advanced chips.
+
+00:15:42 And it's a lot of performance that you leave on the table by not optimizing for a specific platform.
+
+00:15:48 So obviously, in some cases, it doesn't really matter, but in other cases, it does really matter.
+
+00:15:53 This portion of Talk Python To Me is brought to you by Sentry.
+
+00:15:56 You know Sentry for their great error monitoring.
+
+00:15:58 But let's talk about logs.
+
+00:16:00 Logs are messy.
+
+00:16:01 Trying to grep through them and line them up with traces and dashboards just to understand one issue isn't easy.
+
+00:16:08 Did you know that Sentry has logs too?
+
+00:16:10 And your logs just became way more usable.
+
+00:16:13 Sentry's logs are trace-connected and structured, so you can follow the request flow and filter by what matters.
+
+00:16:19 And because Sentry surfaces the context right where you're debugging, the trace, relevant logs, the error, and even the session replay all land in one timeline.
+
+00:16:28 No timestamp matching, no tool hopping.
+
+00:16:31 From front end to mobile to back end, whatever you're debugging, Sentry gives you the context you need so you can fix the problem and move on.
+
+00:16:37 More than 4.5 million developers use Sentry, including teams at Anthropic and Disney+.
+
+00:16:42 Get started with Sentry logs and error monitoring today at talkpython.fm/sentry.
+
+00:16:48 Be sure to use our code, talkpython26.
+
+00:16:50 The link is in your podcast player's show notes.
+
+00:16:53 Thank you to Sentry for supporting the show.
+
+00:16:55 I'll give a very concrete example.
+
+00:16:57 Intel x86-64 is kind of the most common CPU that most people will have at home.
+
+00:17:04 If you build a wheel for that, you can only use CPU features, performance CPU features that go back to about 2009.
+
+00:17:13 Any new hardware features that were introduced after 2009, things like SSE4, AVX2, later versions of that, you just cannot use because the installers don't know that you put that in the wheel.
+
+00:17:27 And hence, they will also install it on computers that don't have those instructions, right?
+
+00:17:32 And then you just get like very ugly crashes.
+
+00:17:34 Hence, what we all do is we ship wheels, binaries that are only compatible with 2009.
+
+00:17:40 And the difference between the 2009 hardware features and, you know, the 2019 or 2023 one could be a factor of 10x, 20x in performance, depending on what you're doing.
+
+00:17:51 10x to 20x?
+
+00:17:53 Oh, yeah.
+
+00:17:53 For, you know, especially when you work with scientific data and SIMD instructions.
+
+00:17:58 Yeah, you can get massive performance increases.
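A small, hedged sketch of the kind of feature check behind what Ralf describes: on Linux the CPU's supported instruction sets show up as a flags line in /proc/cpuinfo. The helper below just parses a sample flags string so the example is self-contained; the function name and sample data are made up for illustration.

```python
# Illustrative sketch: checking cpuinfo-style feature flags for the
# instruction sets mentioned above (SSE4, AVX2). On Linux the real data
# lives in the "flags" line of /proc/cpuinfo; here we parse a sample
# string so the example runs anywhere.

def supported_features(cpuinfo_flags: str, wanted: set[str]) -> set[str]:
    """Return which of the wanted features this CPU reports."""
    flags = set(cpuinfo_flags.split())
    return wanted & flags

# Flags roughly like a ~2013-era x86-64 desktop CPU might report.
sample = "fpu vme sse sse2 ssse3 sse4_1 sse4_2 avx avx2 fma"
print(supported_features(sample, {"avx2", "avx512f"}))  # {'avx2'}
```

An installer that could do this kind of detection is precisely what lets it pick a wheel newer than the 2009 baseline only when the hardware can actually run it.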
+
+00:18:01 If you heard of vectorization, this is a huge deal.
+
+00:18:05 Yeah.
+
+00:18:05 I mean, I guess the way I think about it from our perspective of building, like, because these problems, like one of the things that's very hard about solving them, and it has required,
+
+00:18:12 like us to be so collaborative across the industry, is that it touches, like, basically every piece of the Python packaging stack.
+
+00:18:21 Like, it impacts how you build things, how the registries work, like what they support, how installers, like, choose what to install.
+
+00:18:30 And so, like, for us, it's like, you know, there's the superpower of Python, I think, in some ways.
+
+00:18:36 Sorry, I think the superpower of Python in some ways is, like, you can build and distribute all this software that's built for, you know, that uses native code.
+
+00:18:45 Like, you can take native code, and you can distribute it out to users, and they can run it just like it's any other piece of Python code.
+
+00:18:51 And in the spec, we have these things like, okay, you can build a wheel that targets Windows or Linux or macOS, and it can target, like, x86 or ARM or whatever else.
+
+00:19:03 And those are all captured in the spec.
+
+00:19:04 And so, for us, like, building uv, we know how to detect those things, how to figure out, like, which wheel to install based on what the user's machine is running.
+
+00:19:13 But there's all this other stuff that's not captured by any of those standards, like the instruction set or even, like, the supported CUDA version.
+
+00:19:20 Like, all these things are not captured in that wheel file, and installers don't know how to detect them.
+
+00:19:25 They don't know how to figure out, like, okay, which PyTorch build should I use based on the CUDA version on the user's machine?
+
+00:19:30 Like, all that stuff is lost.
+
+00:19:31 And that's kind of the gap that we're trying to bridge.
+
+00:19:34 And part of the philosophy is also, so right now, Python packaging exposes what are called platform tags.
+
+00:19:41 So, essentially, a sort of, like, mini tag that comes with a specific definition that installers know how to resolve.
+
+00:19:48 And what we're trying to avoid is to end up creating 200 more today and 200 more in two years and 200 more, again, in four years.
+
+00:19:57 So, we try to come up with a generic system that will allow you to essentially include arbitrary definitions that resolvers and package managers can then understand by some sort of mechanism.
+
+00:20:10 And resolve, but not create a sort of blessed list of things that you constantly have to update because it's a lot of maintenance.
+
+00:20:19 Yeah, that's how we got into the situation now, right?
+
+00:20:21 Because there's one for the version, there's one for the architecture of the CPU, but then there's not a spot for the other stuff.
+
+00:20:27 So, the overall idea is to say almost just a metadata section in there and things can read it or ignore it as they see fit.
+
+00:20:35 Yeah, that's exactly the concept.
+
+00:20:37 Conceptually, yeah.
+
+00:20:38 A little bit.
+
+00:20:39 Yeah, yeah.
+
+00:20:40 I mean, it's like, I guess the question is, like, okay, if we have this, like, huge space of things that we might possibly want to detect and condition installs on, like, okay, anytime someone publishes a wheel for Python, they should now tell us, like, what CUDA version is it built for?
+
+00:20:55 Or, like, if any, what, like, CPU instruction sets does it support?
+
+00:20:59 Like, blah, blah, blah.
+
+00:21:00 Like, where would we put all that stuff, right?
+
+00:21:01 Becomes the question.
+
+00:21:02 It's like, what, are we just going to keep expanding, like, the platform tag and everything else?
+
+00:21:06 And that's, like, the problem that we're trying to solve in kind of a generic way.
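To make the "arbitrary metadata that resolvers can read or ignore" idea concrete, here is a purely hypothetical sketch. This is not the actual WheelNext variant metadata format; the property names, file names, and selection function are invented solely to show the concept of variants declaring properties that get matched against what a resolver detected on the system.

```python
# Purely hypothetical sketch of the wheel-variant idea discussed above:
# each variant declares arbitrary namespaced properties, and a resolver
# matches them against what it detected. NOT the real WheelNext format.

variants = [
    {"file": "pkg-1.0-cuda.whl", "needs": {"nvidia :: cuda": "12"}},
    {"file": "pkg-1.0-avx2.whl", "needs": {"x86_64 :: level": "v3"}},
    {"file": "pkg-1.0-generic.whl", "needs": {}},  # always-compatible fallback
]

def select_variant(variants: list[dict], detected: dict) -> str:
    for v in variants:  # ordered from most to least specialized
        if all(detected.get(k) == val for k, val in v["needs"].items()):
            return v["file"]
    raise LookupError("no compatible variant")

detected = {"x86_64 :: level": "v3"}  # e.g. AVX2-class CPU, no GPU found
print(select_variant(variants, detected))  # pkg-1.0-avx2.whl
```

Because the property space is open-ended, new hardware dimensions can be added without minting a new blessed platform tag each time, which is the maintenance trap being described.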
+
+00:21:09 Yeah, you can end up with a file name that's 4,000 characters wide or something.
+
+00:21:13 They can already get pretty long, by the way.
+
+00:21:14 But, yeah, we have to work around that in uv sometimes, file name length limits.
+
+00:21:21 But, yeah.
+
+00:21:22 There's actually a very famous package that used, I think, the first 200 digits of pi as the version number.
+
+00:21:28 Oh, my gosh.
+
+00:21:31 It's a pretty good joke.
+
+00:21:32 I didn't know about it.
+
+00:21:33 There's somebody on discuss.python.org that posted the link.
+
+00:21:36 And I was like, but that's hilarious.
+
+00:21:38 That is wild.
+
+00:21:40 So, before we dive into what you all are proposing, let's maybe talk about how just a couple of packages or libraries solve this problem now in maybe different directions.
+
+00:21:50 So, Ralf, what about NumPy, right?
+
+00:21:52 I mean, you guys talked about vectorization and stuff.
+
+00:21:56 Yeah.
+
+00:21:57 That's so in line with NumPy, right?
+
+00:21:59 Is NumPy, like, and pandas, that's the way, you know?
+
+00:22:02 Yes.
+
+00:22:02 NumPy, yes.
+
+00:22:03 Pandas, no.
+
+00:22:04 So, NumPy does contain SIMD instructions and, you know, because it's incredibly useful for performance.
+
+00:22:14 You know, NumPy has all large arrays and basic instructions on them that, like, have direct hardware implementations typically.
+
+00:22:21 But the way it's done is incredibly complex because you need to end up with a wheel that works on every type of CPU, right?
+
+00:22:29 We didn't, you know, I'll stay with x86, but the same happens on the other platforms, right?
+
+00:22:33 You know, it needs to run on a 2010 CPU and it needs to run better on a 2024 CPU.
+
+00:22:39 So, what we do in NumPy is we have a system that basically allows you to either parameterize a source file that, you know, and then rebuild it multiple times, you know, for different particular CPU architectures.
+
+00:22:54 So, like, you know, like a Haswell family and then a Skylake family and so on.
+
+00:22:59 And then we basically merge that together in a single Python extension module.
+
+00:23:03 And then at runtime, we have our own code to detect the CPU and basically then some, like, dispatch shim layer that kind of fishes out the right, you know, family from the extension module.
+
+00:23:16 So, yeah, you put up the diagram there.
+
+00:23:19 It's pretty complicated.
+
+00:23:21 And I'd say there I've been collaborating with some of the, you know, world experts on this.
+
+00:23:27 In the end, this was only successful because we built a generic architecture that other experts, per CPU architecture, could come and contribute to.
+
+00:23:38 So, we now have a specific team of like four people that help maintain the architecture.
+
+00:23:44 But then like, you know, Intel for years paid one of their engineers to optimize specifically the x86 code path.
+
+00:23:52 And then ARM has a NumPy maintainer who, you know, got commit rights a few years ago.
+
+00:23:57 And he's the final authority on all the ARM instructions that are in there.
+
+00:24:01 So, that whole complicated thing is now shipped and it's extremely good for performance.
+
+00:24:06 But you can see how this is not a scalable process to do in many packages, right?
+
+00:24:10 Plus, you know, if you compile everything five times, you get a binary that's, you know, it's not five times bigger, but it's a lot bigger.
+
+00:24:17 So, it's not great for users as well.
+
+00:24:19 Yeah, actually the nickname for these things is fatbins.
+
+00:24:22 So, you get the idea for why they are called that way, because they tend to be very heavy to download.
+
+00:24:28 Yeah, yeah.
+
+00:24:29 Instead of wheels, you got big wheels.
+
+00:24:31 Yep.
+
+00:24:31 So, what happens if all these changes get adopted and it doesn't need to be compiled into one giant binary?
+
+00:24:38 Okay.
+
+00:24:38 Are all these maintainers still working?
+
+00:24:40 They just don't have to deal with trying to put it all into one thing?
+
+00:24:44 They might still have to, yes.
+
+00:24:46 I think essentially you're correct.
+
+00:24:48 You still need to write the actual code that uses the SIMD instructions.
+
+00:24:52 But then you can just produce a wheel that says like, okay, it works on this specific CPU architecture and just ignore this code if I'm building for another architecture.
+
+00:25:01 And all the, you know, detecting the CPU at runtime and the dynamic dispatch features, you don't need at all.
+
+00:25:06 Will it make the code faster?
+
+00:25:08 It will, well.
+
+00:25:09 Like, will you have better cache hits?
+
+00:25:11 Will there be smaller stuff in memory?
+
+00:25:12 You know, that kind of stuff.
+
+00:25:13 I don't think it will make the NumPy code much faster.
+
+00:25:17 It will, you know, it will make a huge difference for all the other packages that don't have this amount of complexity today.
+
+00:25:24 So like SciPy, scikit-learn, Pandas, Pillow, like none of these packages actually use SIMD code.
+
+00:25:31 And for SciPy, it's the easiest for me to talk about because I'm also a SciPy maintainer.
+
+00:25:35 We actually have a lot of code that, you know, got vendored in from somewhere, like Fourier transforms, for example.
+
+00:25:41 They benefit a lot as well.
+
+00:25:42 We have AVX2 and ARM Neon implementations, but we just don't build them and don't ship that as wheels because we have no way of doing that.
+
+00:25:51 As soon as we have, you know, wheel variants, we can say, okay, let's ship two sets of wheels.
+
+00:25:56 I mean, that's more CI jobs to build more wheels.
+
+00:25:59 But, you know, when it's worth it, you know, you can make that trade-off, right?
+
+00:26:02 Like we already have the code.
+
+00:26:03 We just have to change a build option, produce a different wheel, and ship it.
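The runtime-dispatch approach described above, detect the CPU once and then fish out the right compiled family, can be sketched in miniature. Everything here (the feature names, the implementation table, the detection stub) is illustrative only; NumPy's real machinery is in C with genuine CPUID-based detection:

```python
# Toy sketch of the runtime-dispatch idea: detect what the CPU supports,
# then hand out the most specific implementation that was compiled in.
# Feature names and kernels are stand-ins, not NumPy's real machinery.

def detect_cpu_features():
    # Stand-in for CPUID-based detection; pretend this machine tops out at AVX2.
    return {"baseline", "sse42", "avx2"}

# Candidate implementations, most specific first; "baseline" always works.
IMPLEMENTATIONS = [
    ("avx512", lambda xs: sum(xs)),    # would be an AVX-512 kernel
    ("avx2", lambda xs: sum(xs)),      # would be an AVX2 kernel
    ("baseline", lambda xs: sum(xs)),  # portable fallback
]

def dispatch():
    """Pick the best implementation the detected CPU can run."""
    features = detect_cpu_features()
    for feature, impl in IMPLEMENTATIONS:
        if feature in features:
            return feature, impl
    raise RuntimeError("no usable implementation")

chosen, vector_sum = dispatch()
print(chosen)                 # avx2
print(vector_sum([1, 2, 3]))  # 6
```

With per-variant wheels, the selection step moves from this kind of runtime shim into the installer, which picks the matching wheel before anything is ever imported.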
+
+00:26:07 So do you just set up something like a #ifdef sort of thing, like #ifdef this capability?
+
+00:26:15 Else you put in the generic code?
+
+00:26:18 Exactly.
+
+00:26:19 The, yeah, the C code is basically just a bunch of #ifdefs.
+
+00:26:22 And, you know, for maintainability reasons, you only add more #ifdefs if, you know, it's really much faster.
+
+00:26:29 Like you aren't going to do it for 10 or 20% faster, but if it's 2x faster, well, why not have an extra else branch?
+
+00:26:36 Yeah, absolutely.
+
+00:26:37 Charlie, does Rust have a #ifdef equivalent?
+
+00:26:40 It must, right?
+
+00:26:40 Yeah, you can do that.
+
+00:26:42 It has directives like that.
+
+00:26:44 Yeah, but you guys don't really need to worry about using this for yourself.
+
+00:26:47 This is more for the things that you're providing as a service to everyone, right?
+
+00:26:52 Yeah.
+
+00:26:52 Yeah, this is mostly, this wouldn't have a huge impact on uv or, I mean, it could have some small impact.
+
+00:27:00 But I think largely this is about, yeah, how can we make it easier for users to consume this stuff?
+
+00:27:04 And I mean, the NumPy, like this is a good example of how it affects like build and distribution.
+
+00:27:10 Because, yes, they still have to write like architecture-specific code if they want to get these optimizations.
+
+00:27:15 But what we'll be doing with these proposals is making it much easier for them to ship separate builds that are like dedicated for each of those different variants.
+
+00:27:24 So like the end user, you know, will get access to it.
+
+00:27:28 But in this case, it's like the bottleneck is, or part of the bottleneck is like all the complexity it puts on the maintainers and the people publishing.
+
+00:27:36 How much do you think it would impact the performance to ship Python Build Standalone with different CPU extensions?
+
+00:27:42 That is a good question, Jonathan.
+
+00:27:44 So we'd actually like to do, I don't know that I have a great answer to that.
+
+00:27:49 I mean, like a good quantitative answer to it.
+
+00:27:52 I think we are very interested in doing stuff like that.
+
+00:27:55 We've also considered, for example, shipping a build.
+
+00:27:57 Like we ship with a relatively old like glibc minimum.
+
+00:28:01 We've considered shipping a build, a variant, not in the sense of the, sorry, a different build.
+
+00:28:07 Let me just put it that way.
+
+00:28:08 That uses a more modern glibc version, for example.
+
+00:28:11 We do run into other problems with that.
+
+00:28:13 Like our build matrix is really big.
+
+00:28:15 We have to split it across multiple GitHub Actions now.
+
+00:28:18 And so like we need to, we just have like a lot of builds.
+
+00:28:20 So we'd probably, we're worried about like doubling the size of the build matrix, for example.
+
+00:28:24 But that's a separate problem.
+
+00:28:26 But yes, it could actually, it could actually be helpful there.
+
+00:28:28 Although we don't ship those as wheels today.
+
+00:28:29 Yeah, that's awesome.
+
+00:28:31 It's a very interesting angle to think about how much leverage, I mean, this probably does, this is probably something you've thought about.
+
+00:28:37 But how much leverage you and your team actually have on Python performance by how you control Python Build Standalone.
+
+00:28:44 This portion of Talk Python To Me is sponsored by Temporal.
+
+00:28:48 Ever since I had Mason Egger on the podcast for episode 515, I've been fascinated with durable workflows in Python.
+
+00:28:56 That's why I'm thrilled that Temporal has decided to become a podcast sponsor since that episode.
+
+00:29:00 If you've built background jobs or multi-step workflows, you know how messy things get with retries, timeouts, partial failures, and keeping state consistent.
+
+00:29:10 I'm sure many of you have written brutal code to keep the workflow moving and to track when you run into problems.
+
+00:29:15 But it's trickier than that.
+
+00:29:16 What if you have a long-running workflow and you need to redeploy the app or restart the server while it's running?
+
+00:29:22 This is where Temporal's open source framework is a game changer.
+
+00:29:25 You write workflows as normal Python code and Temporal ensures that they execute reliably, even across crashes, restarts, or long-running processes, while handling retries, state, and orchestration for you so you don't have to build and maintain that logic yourself.
+
+00:29:41 You may be familiar with writing asynchronous code using the async and await keywords in Python.
+
+00:29:46 Temporal brilliantly leverages the exact same programming model that you are familiar with, but uses it for durability, not just concurrency.
+
+00:29:55 Imagine writing await workflow.sleep(timedelta(days=30)).
+
+00:30:00 Yes, seriously, sleep for 30 days.
+
+00:30:02 Restart the server, deploy new versions of the app.
+
+00:30:04 That's it.
+
+00:30:05 Temporal takes care of the rest.
+
+00:30:07 Temporal is used by teams at Netflix, Snap, and NVIDIA for critical production systems.
+
+00:30:12 Get started with the open source Python SDK today.
+
+00:30:15 Learn more at talkpython.fm/Temporal.
+
+00:30:17 The link is in your podcast player's show notes.
+
+00:30:19 Thank you to Temporal for supporting the show.
+
+00:30:22 Maybe just tell people, what is the relevance there?
+
+00:30:26 Like, why?
+
+00:30:27 What is Python Build Standalone and how does this even apply to what we're talking about?
+
+00:30:30 Oh, yeah, sure.
+
+00:30:31 I use it every day.
+
+00:30:32 I love it.
+
+00:30:33 A lot of people use it and don't even know.
+
+00:30:34 I mean, it's probably the least, it's the least like public or like direct user-facing thing that we do.
+ +00:30:41 But we took over maintenance of a project called Python Build Standalone probably like a year ago, maybe a little more. + +00:30:49 And that project, the basic idea is like typically when you build CPython, you know, at least like on Linux, for example, a bunch of absolute paths get embedded into the binary, which makes it hard to build like reproducible and relocatable CPythons. + +00:31:05 Like it's hard for someone to build a CPython that you can then download and run on your machine. + +00:31:08 You typically need to build it on your own machine. + +00:31:12 So what this project does is it's sort of like a fork of the CPython build system. + +00:31:17 It's like the CPython build system with a bunch of patches and other changes applied on top. + +00:31:21 And it makes it so that we can build Pythons that you can just download, unzip and run. + +00:31:27 So when you install Python with uv, and these are also used in like Bazel and in a bunch of other tools, we don't actually like build Python from source. + +00:31:35 We actually download, unzip and run Python, which just makes it much easier. + +00:31:39 It means it's faster. + +00:31:41 You don't have to have like the build tool chain on your machine. + +00:31:45 You don't run into problems around like failing to build it or anything like that. + +00:31:48 But the other thing that's been cool about that project, at least recently, is we've been very focused on performance. + +00:31:53 So on actually just trying to make sure that we're distributing, like our goal is to be like the fastest Python distribution. + +00:32:00 Like even without changing CPython source code, just changing how we build it and various things that we can tweak there. + +00:32:07 And so we've been working on a bunch of benchmarks. + +00:32:08 I do think we have the fastest Python now, but we haven't actually published our rigorous benchmark methodology. 
+
+00:32:14 So I won't stake my reputation on that claim yet, but we've been very focused on it.
+
+00:32:19 And it's been a cool point of leverage because like we can just, yeah, if we can make Python, you know, if we can put out a Python distribution that's like 10 or 15% faster, you know, just by changing how we build it.
+
+00:32:28 Yeah, it's a big lever for impact.
+
+00:32:30 Yeah, it's a huge lever.
+
+00:32:31 And I hadn't really thought about it being a lever until Jonathan brought it up.
+
+00:32:35 But for example, it's not directly impacted by this because we don't ship it, I guess, for the reason that we don't ship it as a wheel.
+
+00:32:40 Although someday we potentially could.
+
+00:32:42 Right now it's just, they're just the files that uv knows how to install.
+
+00:32:45 But it's the same logic at the core.
+
+00:32:47 Once you start tweaking the packaging of Python packages, the next part you want to tweak is your Python install.
+
+00:32:55 Well, for example, all of my stuff that runs on the servers, it's all in Docker and it has a base Docker image.
+
+00:33:03 And one of the very first lines is, you know, use curl plus the shell to install uv.
+
+00:33:09 The next line is uv venv.
+
+00:33:12 And that, that installs Python from Python Build Standalone.
+
+00:33:16 And then whatever, you need to make an actual app out of that afterwards.
+
+00:33:19 Right.
+
+00:33:20 And so how many people are doing that?
+
+00:33:22 It seems like a huge portion of the world has adopted uv for sort of bootstrapping Python instead of the other way around.
+
+00:33:29 So that's, that's why it's such a big lever, right?
+
+00:33:31 Yep.
+
+00:33:32 Yeah, exactly.
+
+00:33:33 All right.
+
+00:33:33 As a way to sort of get into the PEPs, Charlie, you mentioned variants.
+
+00:33:39 You're like, wait, wait, wait, not that variant.
+
+00:33:42 What variant are we talking about?
+
+00:33:43 That's not that variant.
+
+00:33:45 What is that variant?
+
+00:33:46 I guess that we're not talking about in uv or Python Build Standalone.
+
+00:33:49 Who wants to take that?
+
+00:33:49 Ralf, do you want to take that?
+
+00:33:50 I'm not actually sure what the question is here.
+
+00:33:53 I think you were targeted for the question.
+
+00:33:55 Yeah, yeah, yeah.
+
+00:33:58 That's fine.
+
+00:33:59 I mean, like the, so we use, so the PEPs revolve around this concept of wheel variants.
+
+00:34:03 And the idea is you can have, I'll keep using the word variants.
+
+00:34:09 You can have different variants, different builds, you know, of a wheel that are intended to be installed based on properties that are known or detected on the machine.
+
+00:34:20 So, for example, that could be like, okay, what NVIDIA drivers do you have on your machine?
+
+00:34:29 Like, what are the versions of those drivers?
+
+00:34:30 Because that then implies things about what versions of the CUDA runtime you can use.
+
+00:34:34 And so when someone publishes a wheel, maybe that wheel, you know, leverages CUDA and needs to be built against CUDA and needs to be built, you know, in a way that leverages CUDA.
+
+00:34:43 And so they might publish different variants, effectively just different, you know, slightly different versions of, versions is wrong, different variants, slightly different flavors of that package that are all built against different, you know, different CUDA versions.
+
+00:34:58 And so we would call those different, you know, different variants.
+
+00:35:01 It's, and you can correct me on the terminology, but across what I understand of the packaging space, even outside of Python, if you type variants in general, this is it. We try to reuse the terminology that ends up being
+
+00:35:14 pretty widely adopted in the packaging ecosystem, not Python packaging, the packaging at large.
+
+00:35:21 Variants is the name that you'll find around for this kind of concept.
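The selection idea being described, an installer detecting a property of the machine and picking the best matching published build, can be sketched as a toy resolver. The variant labels, the detected property, and the best-first ordering rule below are all invented for illustration; the actual PEP 817 mechanism is more general than this:

```python
# Hypothetical sketch of variant selection: the installer detects a property
# of the machine (here, the highest CUDA version the driver supports) and
# picks the best matching published variant. Labels and rules are invented
# for illustration; this is not the real PEP 817 algorithm.

PUBLISHED_VARIANTS = [  # ordered best-first by the publisher
    {"label": "cu126", "requires_cuda": 12.6},
    {"label": "cu121", "requires_cuda": 12.1},
    {"label": "cpu", "requires_cuda": None},  # universal fallback
]

def select_variant(detected_cuda):
    """Return the first (most preferred) variant the machine can satisfy."""
    for variant in PUBLISHED_VARIANTS:
        need = variant["requires_cuda"]
        if need is None or (detected_cuda is not None and detected_cuda >= need):
            return variant
    raise RuntimeError("no compatible variant")

print(select_variant(12.4)["label"])  # cu121 (driver too old for cu126)
print(select_variant(None)["label"])  # cpu (no NVIDIA driver detected)
```

The point of the proposal is that this detect-then-pick step happens inside the installer, so the user never has to choose an index URL or a flavor by hand.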
+
+00:35:26 You know, related to that, like, especially in the Astral flavor these days, but also in many other areas, I feel like crates and Rust, what they've done with their packaging system has kind of influenced some of the things we're adopting in the Python world.
+
+00:35:41 Has anything from the Rust world influenced these PEPs that we're about to talk about?
+
+00:35:46 Well, crates are source distributions now, mostly.
+
+00:35:49 Yeah.
+
+00:35:50 Well, in this case, we're talking about actually binary distribution.
+
+00:35:54 Yeah.
+
+00:35:54 Yeah.
+
+00:35:54 So not really.
+
+00:35:55 Okay.
+
+00:35:56 But in a sense.
+
+00:35:56 That's actually interesting, right?
+
+00:35:58 Yes.
+
+00:35:58 Yeah.
+
+00:35:59 Because a lot of the best packaging systems, you know, whether it's Rust or, you know, Nix, they start from source, right?
+
+00:36:07 And they know exactly what's, you know, in the box.
+
+00:36:09 And then binaries are kind of like an optimization, right?
+
+00:36:12 It's like, you have a thing that you know exactly what is the binary and you can check like, oh, I don't have to build this thing from source.
+
+00:36:19 I can grab a binary somewhere.
+
+00:36:20 Right.
+
+00:36:20 Python packaging is absolutely not like that.
+
+00:36:23 Like if you build a wheel and you have an sdist, I mean, you have no idea if they're the same thing.
+
+00:36:28 If you, you know, you cannot rebuild the wheel from the sdist unless, you know, you use very, very well predefined constraints.
+
+00:36:36 Yeah.
+
+00:36:36 Yeah.
+
+00:36:36 I hadn't really thought about that either, but that is an interesting juxtaposition.
+
+00:36:40 Like the binary stuff that is all binary is shipping as source, but the interpreted stuff is shipping as binary.
+
+00:36:46 And I think part of the reason, or maybe the main reason is if we're talking about binary stuff for Rust, well, it's all Rust that's compiled, but for Python, it's this mix, this
+
+00:36:57 crazy mix of all these different libraries that are not, none of them are Python, but they're all binary in the end.
+
+00:37:03 And so you've got to get around the fact like, well, I don't have a Fortran and a Haskell compiler, so I can't run this project, you know?
+
+00:37:10 There's something quite amazing to Python in general, which is called the C FFI.
+
+00:37:15 So the C foreign function interface, which essentially allows you to build any sort of application you want in whatever language.
+
+00:37:23 As long as you're compatible with the C FFI standard, you can call it from Python and it's incredible and amazingly useful.
+
+00:37:33 But to come back on what Ralf was saying, a lot of the design actually for wheel variants has been inspired by a system that is called Spack that was designed for supercomputers.
+
+00:37:47 And we used this, especially around the design of CPU variants, to kind of get a lot of inspiration from a package called archspec that is just, from my perspective, pure brilliance in some of its design.
+
+00:38:02 Just my words, but in my opinion, I really think they got the thing right.
+
+00:38:08 It's just beautifully designed.
+
+00:38:10 Everything is static and JSON-ified and it's extremely easy to scale and maintain.
+
+00:38:15 But yes, if you take all the kind of systems designed to support the most specific deployment scenarios, like Spack, like Nix, or even in some cases, Cargo, well, they mostly ship sources
+
+00:38:29 to go around this variant problem because that allows you to control the entire build chain essentially.
+
+00:38:35 And in some cases, maybe Ralf can talk about it, but Conda Forge also kind of takes an approach that is similar to Nix to kind of go around these issues a little bit.
+
+00:38:44 Maybe Ralf, if you want to talk a little bit about that.
+
+00:38:47 Not quite, because Conda and Conda Forge don't do source distributions at all.
+
+00:38:52 They just take a release and they build binaries.
+
+00:38:54 And if there are no binaries, you can't install it.
+
+00:38:58 But yeah, I would say that's a good point, right?
+
+00:39:00 We have people that worked on all these systems.
+
+00:39:02 Like one of Jonathan's colleagues at NVIDIA, Mike Sarahan, used to work on Conda.
+
+00:39:07 I contribute to Conda Forge as well.
+
+00:39:09 And so we have some ideas that originally came from Conda, some that came from Spack.
+
+00:39:14 And the end result is nothing like, not exactly like any of those systems, but it takes some of the best aspects of them to enhance Python packaging.
+
+00:39:22 Not reinventing the wheel.
+
+00:39:24 I mean, maybe, but not too much.
+
+00:39:27 Yeah, not too much.
+
+00:39:29 But it's kind of, it's cool because I think, like, I feel like a lot of this work really got kicked off.
+
+00:39:35 We did an in-person summit.
+
+00:39:37 And I honestly can't remember when that was because my mind is such a blur.
+
+00:39:41 March 2025.
+
+00:39:42 Thank you.
+
+00:39:43 Okay, so it was about a year ago.
+
+00:39:44 And there's a bunch of notes about this.
+
+00:39:46 And we had people from probably like, I don't know, I'd have to guess 20 different companies, maybe more, all in person for a day, just talking about these problems.
+
+00:39:56 And a bunch of people presented on their own open source projects and how they intersect with, like, we had people from PyTorch, people from the JAX team, just talking about, like, what their concerns are, like, what's working well for them, what's not.
+
+00:40:07 And so, you know, similarly to how we've, I think a lot of the design has really been influenced by like, what are other designs?
+
+00:40:14 What's the prior art and like, what's working well?
+
+00:40:17 You know, a lot of it was also informed by like, just talking to a bunch of people across the industry and understanding like, what their concerns are.
+
+00:40:24 And so, at least from my perspective, having not, honestly, by calendar time, I have not been involved in Python that long.
+
+00:40:31 But it's been like, definitely the most like cross-company, cross-project, cross-organization effort I've been involved in by a lot.
+
+00:40:38 We tried to replicate a model that I really like in the Python community, which was Faster CPython.
+
+00:40:46 We tried to philosophically create the packaging child of Faster CPython.
+
+00:40:51 And that's how we created Wheel Next.
+
+00:40:54 It was all the amazing work that the Faster CPython community did on the CPython side, and kind of creating the same synergy, but around Python packaging.
+
+00:41:05 And that's why it was.
+
+00:41:06 I would almost say it's even, you know, quite a bit more diverse.
+
+00:41:11 At least my understanding is Faster CPython is primarily like funded and created by Microsoft, and it kind of turned into a community thing.
+
+00:41:18 But like, all the money came from Microsoft, I think.
+
+00:41:21 I think the majority of the people were working in a team inside Microsoft, at least.
+
+00:41:25 And here, we've got NVIDIA, Meta, the PyTorch folks at Meta.
+
+00:41:30 We got some contributions from AMD and Intel, and then Astral, Quansight.
+
+00:41:35 A large amount of the time that we've been able to spend at Quansight came from funding from Red Hat, who came with their own problem sets.
+
+00:41:43 And, you know, so, and that's just the most prominent contributors.
+
+00:41:47 So there's like at least 10 companies that started investing in this, because it solves so many problems.
+
+00:41:53 Yeah, that's really encouraging as well.
+
+00:41:53 On the left side, you'll see a section called Who We Are.
+
+00:41:57 Yeah, so I pulled up this project, Wheel Next.
+
+00:42:00 And, you know, Ralf, this is yours?
+
+00:42:02 Yeah, who are we?
+
+00:42:03 And the names of the companies and also the open source projects that contributed time and expertise.
+
+00:42:09 Yeah, AMD, Anaconda, Aprio, Astral, Google, Huawei, Intel, Lap, Lab, Meta, NVIDIA, Preferred Networks, Quansight, and Red Hat.
+
+00:42:18 That's a bit of a group working on this.
+
+00:42:21 And you can see just above all the different open source projects that different OSS and lead maintainers have contributed time and energy to kind of try to make this move forward.
+
+00:42:32 So it is quite a few people.
+
+00:42:35 Yeah, yeah.
+
+00:42:35 Most notably, maybe QPy and PyTorch, possibly.
+
+00:42:38 I mean, they're all...
+
+00:42:39 Maybe one company that is not too well known, undeservedly, because they should be, which is Probabl at the bottom that you mentioned, which is essentially the support company behind scikit-learn.
+
+00:42:52 So if people don't know it, Probabl is essentially representing scikit-learn.
+
+00:43:00 Yeah, so this is wheelnext.dev.
+
+00:43:03 This is basically the website for the group, the working group, something like that.
+
+00:43:08 Yep.
+
+00:43:09 We try to leave our notes, our thinking, our drafts.
+
+00:43:13 One aspect that I really like on the work that we did is that it kind of felt like a startup.
+
+00:43:18 We were making a mock-up and iterating very fast and getting feedback.
+
+00:43:23 And, I like this.
+
+00:43:24 I don't like this.
+
+00:43:25 Change it.
+
+00:43:27 I worked really closely with two people, one from Quansight, one from Astral, Konstantin and Michał.
+
+00:43:34 And we did so many hours of work.
+
+00:43:37 So many different prototypes, iterating, exposing the work to people, collecting feedback, adjusting, and repeating the cycle so many times until we finally got to something that we thought was reasonable.
+
+00:43:52 And that's where we started to write the PEPs.
+
+00:43:54 But that process took us a year.
+
+00:43:57 All right.
+
+00:43:58 Well, we should probably jump into the PEPs.
+
+00:44:00 And I'll tell you what, you all have quite the authorship attribution here.
+
+00:44:04 But also, I believe, correct me if I'm wrong, that this PEP is notable in that it's the longest PEP ever.
+
+00:44:10 Something like that, right?
+
+00:44:12 Yeah.
+
+00:44:12 I don't know if it's an achievement to be proud of.
+
+00:44:17 It's the most powerful PEP ever.
+
+00:44:19 Yes.
+
+00:44:20 No, no.
+
+00:44:20 It's a super PEP.
+
+00:44:21 So much so that we're talking about PEP 817, wheel variants, which is the variant thing that we actually are talking about, not the other variants, beyond platform tags.
+
+00:44:31 But then so much so that it actually got kicked to the curb for like, well, what is the minimal viable PEP of this PEP?
+
+00:44:38 So we can take it in steps.
+
+00:44:40 And Jonathan, you just told me really good news about that PEP. So you spun off this other PEP, PEP 825, wheel variants package format, which is smaller, which still has a significant authorship.
+
+00:44:52 But this was just, it says draft, but is that true?
+
+00:44:56 Yeah.
+
+00:44:56 Yeah.
+
+00:44:57 So PEPs, maybe Ralf, you want to discuss a little about what's the process for a PEP? I think that's important.
+
+00:45:04 Yeah.
+
+00:45:04 So when you submit a PEP, it first, you know, goes up on GitHub.
+
+00:45:08 And then there's a group of folks called the PEP editors who basically just edit, you know, they review it for clarity, you know, language, consistency with other PEPs and so on.
+
+00:45:17 So they don't really look at the content of what you're proposing.
+
+00:45:21 So it's just, as long as it's clear, they're happy, you merge it in.
+
+00:45:24 But because the first PEP was already so long, that process took like over a month already.
+
+00:45:29 But at that point, it's merged as draft.
+
+00:45:31 And then you go to the Python packaging Discourse where you say, okay, here's our PEP.
+
+00:45:38 You know, now please let's start the actual community review.
+
+00:45:41 And then basically anybody with an opinion can weigh in.
+
+00:45:44 And it's just, it's a forum.
+
+00:45:47 They're not, it's not even a threaded forum.
+
+00:45:49 So it's just one long thread of comments, which tends to make it like a little challenging.
+
+00:45:54 You know, the more complex the topic gets, the harder it is to make sense of this conversation.
+
+00:45:59 It's really hard to have a threaded multi-component conversation.
+
+00:46:04 It is.
+
+00:46:04 Exactly.
+
+00:46:05 So that's one of the reasons it's now split into smaller parts.
+
+00:46:09 So you can at least have separate threads about different topics, right?
+
+00:46:12 So, and because especially not all of the parts of the design apply to everybody.
+
+00:46:17 When we're talking about installers, we want to hear primarily from the authors of uv and pip, Poetry, Hatch, PDM.
+
+00:46:26 But if we're talking about how do you build a wheel, well, we have to talk primarily to setuptools, scikit-build-core, meson-python, the build backends.
+
+00:46:36 And, you know, the index server the same, right?
+
+00:46:39 You want to know that the PyPI maintainers are happy.
+
+00:46:42 So that's why, you know, organizing this review and chopping a complex thing up into parts, it's still going to be really hard to get the right amount of feedback.
+
+00:46:50 But we now have like the first PR, you know, the first merged PEP in draft status.
+
+00:46:56 So it's only going to be accepted once the whole community review process is done.
+
+00:47:02 And probably what will happen is it's going to be provisionally accepted only, because we know there's like three more PEPs coming for the other parts.
+
+00:47:10 And eventually, like the, you know, you want all four to be, you know, working and accepted.
+
+00:47:15 Like, you know, we now have prototypes, but, you know, we want the prototypes for the final design and have like, you know, the tool authors say like, yeah, this works for us, before you really go from provisional to actually accepted.
+
+00:47:26 Amazing.
+
+00:47:27 So this is part of what I was getting at when I said at the beginning that this touches like every part of the packaging stack.
+
+00:47:33 There's just like, it's very hard to break it up into, I mean, that's what we're trying to do in some sense.
+
+00:47:39 But like, it's from the start, it's been hard.
+
+00:47:41 It's hard to, there aren't necessarily super great cut points because it does affect how you build packages, how you publish them, like how they get hosted and served from the registry, how installers like look at them and understand them.
+
+00:47:53 All of those things, like marker syntax, all of that stuff gets impacted in different ways.
+
+00:47:59 It's very funny.
+
+00:47:59 We were prototyping this for a year.
+
+00:48:03 We ended up pretty much forking the entire ecosystem.
+
+00:48:06 pip got forked, uv got forked, warehouse got forked, packaging got forked, like absolutely every package in the ecosystem ended up being forked, because we needed to test our implementation.
+
+00:48:21 And we needed to verify.
+
+00:48:21 The goal, of course, is to unfork those things.
+
+00:48:24 Yes.
+
+00:48:24 Like over time.
+
+00:48:25 It's a re-merge path, but we needed to have a playground to be able to experiment and see how the concept that we were developing was functioning in pip.
+
+00:48:36 And then in packaging, but then also in setuptools.
+
+00:48:39 And then in scikit-build-core.
+
+00:48:42 And then in meson-python.
+
+00:48:43 And it just keeps spreading essentially to every single corner of the packaging, installation, and distribution aspect of Python.
+
+00:48:51 So that was pretty funny.
+
+00:48:53 Yeah.
+
+00:48:53 What ecosystem you got in?
+
+00:48:56 I think we have a fork in uv, or I guess technically it's just a branch, from Konstantin on our team, who's here on the PEP and who's been super involved.
+
+00:49:06 Oh, thanks.
+
+00:49:07 Who's been super involved, you know, throughout and done a ton of work on basically implementing the standard in uv.
+
+00:49:12 So we have like a working implementation that we've used to, yeah, you can actually install it from, you know, we basically distribute it at a slightly different URL.
+
+00:49:22 So you can install it and test it.
+
+00:49:24 But yeah, that's been, that fork has evolved a lot, or that branch has evolved a lot.
+
+00:49:29 And it's been a lot of work to, I mean, it's been incredibly helpful for the design process for us to understand like what's hard, what's easy.
+
+00:49:35 And then I also think it's important for PEPs just to have like working implementations too.
+
+00:49:40 And I mean, a lot of people agree, that's not a novel point, but that's been one of the goals too, is to show what it's like in practice and that it actually works.
+
+00:49:47 So people want to play around with this.
+
+00:49:48 An easy way might be to try to use this fork.
+
+00:49:52 We put a lot of work in to actually make it, so go ahead and try it.
+
+00:49:56 Because I think it's, I personally have a lot of like admiration for the work done on free-threaded Python, especially the PEP.
+
+00:50:05 And I think Sam Gross, who is the main author, managed to make a significant amount of progress as he was coming up with prototypes that said: it's not just my word.
+
+00:50:15 Let me show it to you.
+
+00:50:16 It works.
+
+00:50:18 And there was so much skepticism around that idea of free-threaded Python.
+
+00:50:22 He had to show, not tell.
+
+00:50:24 But I think if we didn't do the work similarly on variant-enabled wheels, people would have told us, oh, well, resolution is too slow.
+
+00:50:34 It's going to slow down installers too much.
+

00:50:36 And Astral is probably one of the installer authors that cares the most about speed.

00:50:41 So we needed to convince both ourselves, but also Charlie and his team, to be like, hey, it's not going to slow down anything.

00:50:47 Yeah.

00:50:48 And we had plenty of feedback on that front too.

00:50:49 Where during the design, we were like, no, this is going to be too slow.

00:50:52 Or like, this is like a better way to do it, et cetera.

00:50:54 But, but I like, I mean, I like this little snippet.

00:50:57 Cause like, this is basically like, if you haven't felt this pain, it might not be meaningful to you.

00:51:02 But if you've like worked with PyTorch, like this is kind of like, this is what we want to enable.

00:51:06 Right.

00:51:06 Is like, you don't, you don't have to like configure a specific index URL that like captures the CUDA variant or anything like that.

00:51:12 Like you just say, hey, install Torch.

00:51:15 And then in this variant-enabled build, uv would, it would go look at Torch.

00:51:19 It would see, okay, Torch, you know, it has different variants for different CUDA versions.

00:51:25 And here's how I inspect, you know, what CUDA version I should use on your machine.

00:51:29 And then it would pick out the right version based on what's supported by the GPU that's running.

00:51:32 Like that should all happen and users shouldn't have to think about configuring it, effectively, is like what we were, what we have been working towards.

00:51:39 And in the future, the first line doesn't exist because right now the first line is just here to install this variant-enabled build.

00:51:44 Yeah.

00:51:44 That just installs the fork.

00:51:46 Yeah.

00:51:46 And for people listening and not watching, what they mean by this: there are three lines here to say how to use this.

00:51:51 It says curl.

00:51:51 I'm sorry.

00:51:52 Basically.

00:51:52 Yeah, no worries.
+

00:51:53 It's the install statement for uv, which is typical, except for that it overrides the.

00:51:59 The download URL.

00:52:00 The download URL.

00:52:01 It's a different URL, which is wheelnext.astral.sh.

00:52:05 We serve, we distribute a separate variant-enabled, experimental, quote-unquote prototype build.

00:52:10 Right.

00:52:10 And then you just create a virtual environment, uv venv, and then you just uv pip install like normal, but it handles this.

00:52:16 And, you know, Charlie, we spoke, I think on the pyx episode, about just how large some of these things are, like PyTorch and others that are compiled there.

00:52:26 You can't just download everything, all the variations, into one wheel.

00:52:30 I mean, I guess you could, but it'd be crazy, right?

00:52:32 That's actually a big benefit, right?

00:52:34 Like right now you go to PyPI, you download the PyTorch wheel.

00:52:37 It'll be about 900 megabytes.

00:52:40 You could make it small.

00:52:41 You know, part of the reason it's so large is, again, these fat binaries, right?

00:52:44 Like the NumPy ones are like a few megabytes.

00:52:47 The PyTorch ones have a bunch of CUDA inside, like for five or six different CUDA architectures.

00:52:52 And, you know, it bloats very, very quickly.

00:52:54 And actually the PyTorch team has to try incredibly hard to stay under one gigabyte.

00:52:59 If we have variants, we can just slim it down to one CUDA architecture per wheel, you know, so you can go down to like, you know, 200 megabytes or so, 250 maybe.

00:53:09 But it's way better for, you know, both for index servers, it's better for users.

00:53:14 It's going to be pretty slow too.

00:53:16 The only thing it's not better for is CI servers that have to build all these different things if you start sharding.

00:53:23 But that's a one-time cost that at the end ends up being.
+

00:53:26 It's much better to have a slight increase one time and a massive decrease at scale, essentially.

00:53:33 You build it once, it gets installed a million times.

00:53:36 That's a massive difference.

00:53:38 And, you know, it's also better for the warehouse folks like PyPI.

00:53:43 And it's easy for people to just assume pip install, uv pip install, that sort of stuff is going to work.

00:53:49 But the cost of just the bandwidth in that infrastructure is astronomical, which is crazy.

00:53:55 So this is going to be a major benefit for bandwidth.

00:53:59 Yeah, and like also like install speed.

00:54:01 You'll also benefit from that because you're no longer downloading as much stuff to actually install PyTorch.

00:54:07 I mean, if you use uv, it's got some really good caching and it's pretty quick.

00:54:12 Oh, but it doesn't multiply your bandwidth by magic.

00:54:16 I wish.

00:54:18 Charlie, if you find a solution to that.

00:54:20 I haven't yet.

00:54:22 But yeah, if you're downloading Torch and all the NVIDIA, all the CUDA stuff, it's, yeah.

00:54:27 It's hefty.

00:54:27 It's a large number of megabytes.

00:54:31 Let's talk real quick about the pypackaging-native guide.

00:54:34 And then I want to get an update on pyx real quick before we go.

00:54:37 So, Ralf, this is your project, right?

00:54:39 Tell us about this.

00:54:39 I'll be sure to show it.

00:54:41 Okay, so I've been watching discussions about some of the topics we've talked about in this episode, you know, since 2010 or so in Python packaging.

00:54:51 And even back then, long before we had wheels, you know, NumPy, for example, had different .exe installers that we would upload to PyPI.

00:54:59 And, like, there would be one named underscore sse2, one underscore sse3.

00:55:04 And, like, users had to pick just the right .exe and install it on their Windows machine.

00:55:09 What?

00:55:09 Wow.
+

00:55:10 Oh, I had no idea.

00:55:11 Okay.

00:55:11 Yes, it was not fun.

00:55:14 And actually, this was by far the hardest thing when I became NumPy release manager because we had to build these things on Linux under Wine.

00:55:21 And there were no instructions and there were really janky scripts.

00:55:24 So, it took me three months to get the first release out.

00:55:27 But, yeah, so we, I saw all these discussions about, you know, this sse2 and sse3 stuff.

00:55:34 And, like, you know, the pip authors and, you know, most of the people who work with pure Python, like, you know, the DevOps folks, the, you know, web framework folks, they had no idea about this.

00:55:44 And usually, these conversations went in circles because when you explain something to one person, the next person would come in and, like, you know, these were endless mailing list threads.

00:55:52 That would never go anywhere.

00:55:53 So, after, you know, seeing that for 12, 13 years or so, I, you know, finally got tired of that.

00:55:59 And I thought, I'm going to write a reference site that explains the problem.

00:56:03 I don't want to propose any solutions, but just explain the problem.

00:56:06 So, the next time someone starts a new conversation about, you know, SIMD extensions or about GPUs or, you know, about some of the issues with mixing, you know, source and binary distributions.

00:56:18 Just link to this site.

00:56:19 Like, please use that as our best, you know, approach at trying to summarize the problem, you know.

00:56:24 So, we have a baseline to start talking about solutions.

00:56:27 And I think, you know, Jonathan, you know, is one of the people who saw this.

00:56:30 I think a lot of people read this.

00:56:32 But it was a nice basis to, you know, just point at this as, like, there are your problem descriptions.
+

00:56:38 And, you know, for the GPU part, like, NVIDIA folks really helped to make sure that all the explanations of the problems were correct.

00:56:46 So, when we started WheelNext, we could just start talking about, like, okay, what are the solutions here?

00:56:51 This website is absolutely incredible.

00:56:54 It's amazing.

00:56:55 Yeah, it's amazing.

00:56:56 Thanks to the work that Ralf and every contributor to this website have made.

00:57:01 This is by far the best explanation anywhere on the internet of all these packaging issues.

00:57:08 And I really like the perspective that Ralf has taken, which is don't state the solution.

00:57:12 Just focus on stating the problem very clearly.

00:57:14 And then with WheelNext, we try to take the exact flip side of the coin, which is don't focus on the problem.

00:57:21 It's already explained.

00:57:23 Just focus on proposing one solution to some of the problems.

00:57:26 And this is how we created WheelNext.

00:57:28 I love it.

00:57:29 You know, one of the big problems, challenges, I guess, is if you don't fully understand the problem space, you could be debating two different things.

00:57:37 And one person sees a really important angle, the other person doesn't even see that angle.

00:57:41 They have a different perspective that they're arguing for, optimizing for.

00:57:45 And so, yeah, it's sort of a little bit like the WheelNext stuff.

00:57:48 Like, let's get everyone involved and see all the angles and then discuss it, right?

00:57:52 Exactly.

00:57:52 Well, you know the saying, a problem well stated is a problem half solved.

00:57:57 So, this is exactly what we are trying to say.

00:58:02 I love it.

00:58:02 All right, I want to get a quick update on pyx since I feel like, Charlie, you're right in the middle of this.

00:58:10 I know pyx was looking to solve some of these problems as well.
+ +00:58:14 Give us the elevator pitch and just, we have a whole episode on this from, I don't know, six months ago or something. + +00:58:19 But, yeah, give us the, what's the situation here and does this change things on how you're handling it and make things easier? + +00:58:24 Yeah, yeah, yeah, for sure. + +00:58:26 So, like, yeah, pyx is our hosted package registry and it's in beta right now. + +00:58:32 So, we're live with a bunch of great customers. + +00:58:37 The goal of pyx is basically to enable us to solve, like, more of the packaging problems that we see in the uv issue tracker by having our own registry that we think is well implemented and solves problems that we see that other registries don't really solve. + +00:58:51 So, like, basically from the start, the way that we've approached the wheel, these, like, problems around the GPU stuff is from, like, two perspectives. + +00:59:00 And in pyx, we're really just focused, in terms of how it overlaps with wheel variants, we're really just focused on the GPU part. + +00:59:07 But the way that we've approached it has basically been try to push the standards forward as much as we can. + +00:59:14 And that's what we've been doing in this effort. + +00:59:15 And then simultaneously try to figure out how we can help users, like, until the standards change. + +00:59:21 And so, pyx has mostly been, has more been in that second camp of, like, assuming the standards don't change because we don't want to, we don't want to, like, unilaterally start changing a bunch of things, like, without going through the process. + +00:59:32 How can we make the world, like, a little bit easier for people who are working with this kind of stuff? + +00:59:36 So, for example, like, in pyx, we take a lot of packages that are, like, PyTorch extensions or need to be built against CUDA, and we, like, build those. 
+

00:59:46 Like, we build them across a wide range of, like, CUDA versions, PyTorch versions, Python versions, CPU architectures, and we make those available to users.

00:59:53 So, it doesn't solve the core problem of, like, how do you build and distribute this stuff?

00:59:58 But it does mean that, like, if you're operating within the constraints of, like, the current set of standards, we can make people's lives easier by making it so they don't have to build so many things.

01:00:06 Like, we build them well, they all work together, all that kind of stuff.

01:00:10 So, that's what, like, we've been focused on.

01:00:12 And I think, like, looking forward, like, our goal is to support WheelNext, like, as soon as, like, sorry, wheel variants, like, as soon as possible, and, like, put those into the registry.

01:00:21 So, as soon as we feel like that's a, you know, a feasible thing to do on the registry, we'll support it in pyx and support it for, like, our users and our customers.

01:00:29 But in the meantime, it's kind of been, like, a parallel track effort of pushing forward on all the WheelNext work and standards, and then just trying to, like, solve immediate user problems without changing standards, like, partly through the registry.

01:00:40 Things are going good at pyx?

01:00:41 You're making progress?

01:00:42 Yeah.

01:00:42 Getting closer to public launch?

01:00:43 Yeah, we're making progress.

01:00:44 Yeah, yeah.

01:00:45 No, it's good.

01:00:46 Customers are growing.

01:00:47 Numbers are going up.

01:00:48 It's good.

01:00:48 Awesome.

01:00:49 People want to try pyx?

01:00:50 What are they?

01:00:51 Are they?

01:00:51 They can join the waitlist here.

01:00:53 Yeah, yeah.

01:00:53 This is, you know, you just, we have a, yeah, or you can go to astral.sh/pyx, and we look at all the responses, and we basically onboard people one by one.
+

01:01:01 So, talking about when this stuff is going to be ready and when you'll be able to adopt it, I guess maybe that's a good place to close out our conversation here: what's the timeline?

01:01:10 What are expectations?

01:01:11 How are things going?

01:01:12 What's next?

01:01:13 It's a great question.

01:01:14 Everything's open source.

01:01:15 It's a two-month delay.

01:01:20 No.

01:01:20 What's the party line on this question?

01:01:25 Oh, gosh.

01:01:27 Well, it's, I liked, we have a joke inside, I don't know how widespread it is inside WheelNext, but we call this Barry's fourth law, Warsaw's fourth law, I don't remember exactly how, which is essentially make an estimate, multiply it by two, and change the unit.

01:01:44 So, if you think it's going to take six months, it's one year.

01:01:47 Oh, no.

01:01:48 Change the unit, one decade.

01:01:50 And it's a running joke that we have that I think is really good.

01:02:00 Realistically, I think it depends on where we are going to set the bar for starting to roll things out.

01:02:09 So, as Ralf was saying, we'll probably see some provisionally accepted.

01:02:13 But as we get to that point, some of the stuff will be possible.

01:02:18 For example, I expect that little by little, we can start experimenting with things without getting necessarily to the absolute final stage.

01:02:28 But the full feature set will only be available at the last stage.

01:02:32 So, complicated question to answer.

01:02:35 We hope that it's not going to take too many years.

01:02:38 I'll make a connection back to pyx here.

01:02:40 Because I think, you know, there's a part that's like, okay, there are four PEPs that need to be reviewed.

01:02:45 Probably we need to update some prototypes here and there.

01:02:48 It's probably going to take, you know, the better part of this year.

01:02:51 At that point, you know, you have accepted PEPs, right?
+

01:02:53 But then PyPI needs to be updated.

01:02:55 Like, you know, all the tools, like Twine, would need to be updated.

01:02:59 Like, there's a new metadata version.

01:03:01 So, everything that consumes that needs to be updated before, you know, package authors can actually start producing these wheels and upload them to PyPI.

01:03:10 So, that's going to not be this year, right?

01:03:12 There's a very long tail of, you know, how the implementation rolls through the ecosystem.

01:03:17 And then you have to wait until users get newer tools.

01:03:20 And then, only then can you start uploading wheels.

01:03:22 So, I'm going to poke at Charlie a bit here.

01:03:25 Because one of the advantages of having a separate registry is, you know, plus the ability to rebuild everything.

01:03:31 You can start using variant wheels, like, the moment that everything is accepted.

01:03:36 It's way sooner.

01:03:36 I know.

01:03:36 It's way sooner.

01:03:36 Yes.

01:03:37 Have you thought about that?

01:03:38 That is true.

01:03:39 Yeah, yeah, of course.

01:03:40 Yeah.

01:03:40 I think from our perspective, we're mostly like, do we feel like the design is done, or how much churn will there be on the design?

01:03:47 But yeah, we're definitely in a position to, like, start building and distributing this stuff much, much sooner.

01:03:51 uv has a second advantage, which is I think they have a much shorter tail of users in terms of version.

01:03:57 I think uv users end up on a much more quote-unquote recent version.

01:04:02 If you look at pip, I think, I don't remember the statistic off the top of my head, but a still significant portion of users use a five-year-old version of pip, which came with I don't even know which version of Python.

01:04:12 It was 3.9 or something.

01:04:15 So, uv is able to move a lot faster, but also the users are more reactive.

01:04:21 That's a very interesting point.
+ +01:04:23 A very interesting angle. + +01:04:24 I mean, I think a lot of people who are very tuned into the Python space have switched to uv, started using uv. + +01:04:30 And there's probably a lot of people who don't read the newsletters, don't listen to the podcast, and so on. + +01:04:35 And they know pip, and they just keep on PIPing, which is fine. + +01:04:38 I'm not knocking it. + +01:04:39 But, you know, it means not only are they using pip, they might be using an older version of Python because they don't want to shake it up. + +01:04:47 And, you know, those are going to be the long tails that are going to be hard. + +01:04:50 I guess one more thought about what's next here before we call this a show here. + +01:04:55 What is the minimal? + +01:04:56 We talked about PEP 825, the minimal PEP. + +01:04:59 What is the minimal amount of adoption, right? + +01:05:01 So if the top five biggest data science and machine learning libraries adopt this and the installer tools like uv and pip support it, that actually alone might be a really big benefit if all the other packages are just ignored, right? + +01:05:15 So that's way more achievable than every single package that has native code has all these specifiers, right? + +01:05:22 What's the minimum level of adoption? + +01:05:24 I'd say that, I mean, the minimum level at which you can call it a success, yeah, five is probably not that far off. + +01:05:30 The benefits start to accumulate quickly. + +01:05:32 But I would expect once packages like PyTorch start adopting this, especially in the deep learning space, you know, this will be adopted very widely, very quickly because it solves so many problems. + +01:05:43 Like many of the most popular packages like VLLM with very large development teams and very large numbers of users. + +01:05:50 If you look at their install pages, it's like, you know, it's like a puzzle book. 
+

01:05:54 You just don't know how to install this stuff, and they don't have wheels on PyPI, and they have their own extra index servers.

01:06:00 And it's not for lack of trying.

01:06:02 It's not for lack of trying.

01:06:03 Like those teams put a lot of effort into trying to make it easier to install, but they basically all run into different kinds of roadblocks.

01:06:09 I think five packages is what you'll get after maybe two weeks.

01:06:14 After a month, you will get twice that amount, and probably a quadratic progression for quite a few weeks.

01:06:21 But it's especially in the scientific compute space, and maybe machine learning to be more specific.

01:06:27 Well, the moment that it works, so many packages will switch.

01:06:31 Like so many.

01:06:32 If you just take PyTorch, half of its dependencies will probably activate variant mode.

01:06:36 And then the people that build on top of PyTorch, or people who build on top of JAX.

01:06:41 So just that, you end up with at least 50 packages in a matter of a few months.

01:06:47 Yeah, I'm just thinking there's probably a very small set that are feeling the most pain.

01:06:52 You could do direct outreach to just the most important projects and get that adopted and make a really big difference, even if it's not every package.

01:07:00 But the funny part is that most of the packages that would be interested, that we would reach out to, are already part of WheelNext.

01:07:08 Because they in some way find the pain pretty significant and are starving for a solution.

01:07:16 They know.

01:07:17 They already know.

01:07:18 All right, let's call it a show, folks.

01:07:20 Let's do a final call to action.

01:07:21 People out there listening, either they're maintainers of packages or they're users of these libraries or they got their own open source project.

01:07:30 They're seeing the light.

01:07:31 They want to get involved.
+

01:07:32 They want to try it out.

01:07:33 What do you tell them?

01:07:34 Well, first, it would be great if people were to come and discuss on discuss.python.org.

01:07:39 That's where the community is trying to aggregate to discuss all these different proposals.

01:07:45 So I think the more people get involved, the better.

01:07:51 But also trying the different packages that we are trying to publish, that Charlie and his team have been helping us to create, a sort of end-to-end experience.

01:08:02 I think right now we have examples on Linux, macOS, and Windows.

01:08:08 It works on different types of hardware, different types of CPUs, different types of GPUs.

01:08:13 It works pretty broadly, and we wanted to give a sort of sample flavor of what could be a variant-enabled world.

01:08:22 Yeah, I'd say, yeah, the majority of listeners, they're not going to be packaging tool authors, right?

01:08:26 So those are the ones you would expect to participate in the review primarily.

01:08:31 But I'd say if you're a user of any of the packages we mentioned, just try it out.

01:08:35 You know, download the uv variant-enabled installer.

01:08:39 And if you're a package author and we haven't mentioned your package, but it will solve a problem for you, get in touch.

01:08:45 Because I think that's maybe the most relevant part here.

01:08:49 There's at least hundreds, maybe thousands of packages that we think we have answers for.

01:08:55 But if their solution or their problem statement is slightly different, I think now would be a great time to learn and make sure we cover as many use cases as possible.

01:09:02 Yeah, I mean, I guess the only thing I'd say is ideally the average user won't even have to think about this, right?

01:09:08 And hopefully they just get it through uv or through pip or whatever in the long term.

01:09:13 But that may take time.

01:09:15 But that's our goal, certainly.
+

01:09:16 Yeah, it's all behind the scenes.

01:09:18 They don't know.

01:09:18 But certainly, if it solves a problem, reach out, be part of it.

01:09:23 Jonathan, Ralf, Charlie, thanks for being on the show.

01:09:25 It's been great.

01:09:26 Keep up being around.

01:09:26 Thanks for having us.

01:09:28 Bye.

01:09:29 Bye-bye.

01:09:29 Bye.

01:09:30 This has been another episode of Talk Python To Me.

01:09:32 Thank you to our sponsors.

01:09:34 Be sure to check out what they're offering.

01:09:35 It really helps support the show.

01:09:37 This episode is brought to you by Sentry.

01:09:39 You know Sentry for the error monitoring, but they now have logs too.

01:09:43 And with Sentry, your logs become way more usable, interleaving into your error reports to enhance debugging and understanding.

01:09:51 Get started today at talkpython.fm/sentry.

01:09:54 And it's brought to you by Temporal, durable workflows for Python.

01:09:58 Write your workflows as normal Python code and Temporal ensures they run reliably, even across crashes and restarts.

01:10:05 Get started at talkpython.fm/Temporal.

01:10:09 If you or your team needs to learn Python, we have over 270 hours of beginner and advanced courses on topics ranging from complete beginners to async code, Flask, Django, HTML, and even LLMs.

01:10:21 Best of all, there's no subscription in sight.

01:10:24 Browse the catalog at talkpython.fm.

01:10:27 And if you're not already subscribed to the show on your favorite podcast player, what are you waiting for?

01:10:32 Just search for Python in your podcast player.

01:10:34 We should be right at the top.

01:10:35 If you enjoyed that geeky rap song, you can download the full track.

01:10:38 The link is actually in your podcast player's show notes.

01:10:41 This is your host, Michael Kennedy.

01:10:43 Thank you so much for listening.

01:10:44 I really appreciate it.
+

01:10:45 I'll see you next time.

+

01:10:46 Bye.

+

01:11:16 Bye.

+
diff --git a/transcripts/544-wheel-next-packaging-peps-transcript-final.vtt b/transcripts/544-wheel-next-packaging-peps-transcript-final.vtt
new file mode 100644
index 0000000..8c9877c
--- /dev/null
+++ b/transcripts/544-wheel-next-packaging-peps-transcript-final.vtt
@@ -0,0 +1,3139 @@
+WEBVTT
+

00:00:00.000 --> 00:00:06.060
+When you pip install a package with compiled code, the wheel you get is built for CPU features from 2009.

00:00:06.660 --> 00:00:11.440
+Want newer optimizations like AVX2? Your installer has no way to ask for them.

00:00:11.740 --> 00:00:15.600
+Want GPU support? You're on your own configuring special index URLs.

00:00:16.020 --> 00:00:22.280
+The result is fat binaries, nearly gigabyte-sized wheels, and install pages that read like puzzle books.

00:00:22.640 --> 00:00:34.240
+A coalition from NVIDIA, Astral, and Quansight has been working on WheelNext, a set of PEPs that let packages declare what hardware they need and let installers like uv pick the right build automatically.

00:00:34.660 --> 00:00:37.080
+Just uv pip install Torch and it'll work.

00:00:37.440 --> 00:00:47.060
+I sit down with Jonathan Dekhtiar from NVIDIA, Ralf Gommers from Quansight and the NumPy and SciPy teams, and Charlie Marsh, founder of Astral and creator of uv, to dig into it all.

00:00:47.520 --> 00:00:52.160
+This is Talk Python To Me, episode 544, recorded March 2nd, 2026.

00:00:53.740 --> 00:00:56.460
+Talk Python To Me, yeah, we ready to roll.

00:00:56.540 --> 00:00:59.320
+Upgrading the code, no fear of getting old.

00:00:59.400 --> 00:01:03.120
+They sink in the air, new frameworks in sight, geeky rap on deck.

00:01:03.420 --> 00:01:05.120
+Quark crew, it's time to unite.

00:01:05.240 --> 00:01:08.160
+We started in Pyramid, cruising old school lanes.

00:01:08.420 --> 00:01:09.920
+Had that stable base, yeah, sir.
+ +00:01:09.920 --> 00:01:14.380 +Welcome to Talk Python To Me, the number one Python podcast for developers and data scientists. + +00:01:14.820 --> 00:01:16.240 +This is your host, Michael Kennedy. + +00:01:16.580 --> 00:01:20.220 +I'm a PSF fellow who's been coding for over 25 years. + +00:01:20.760 --> 00:01:21.920 +Let's connect on social media. + +00:01:21.920 --> 00:01:25.380 +You'll find me and Talk Python on Mastodon, Bluesky, and X. + +00:01:25.580 --> 00:01:27.540 +The social links are all in your show notes. + +00:01:28.240 --> 00:01:31.780 +You can find over 10 years of past episodes at talkpython.fm. + +00:01:31.860 --> 00:01:35.300 +And if you want to be part of the show, you can join our recording live streams. + +00:01:35.480 --> 00:01:35.980 +That's right. + +00:01:36.160 --> 00:01:39.520 +We live stream the raw, uncut version of each episode on YouTube. + +00:01:40.000 --> 00:01:44.520 +Just visit talkpython.fm/youtube to see the schedule of upcoming events. + +00:01:44.680 --> 00:01:48.340 +Be sure to subscribe there and press the bell so you'll get notified anytime we're recording. + +00:01:48.340 --> 00:01:50.920 +This episode is brought to you by Sentry. + +00:01:51.660 --> 00:01:54.820 +You know Sentry for the error monitoring, but they now have logs too. + +00:01:55.040 --> 00:02:02.100 +And with Sentry, your logs become way more usable, interleaving into your error reports to enhance debugging and understanding. + +00:02:02.420 --> 00:02:05.800 +Get started today at talkpython.fm/sentry. + +00:02:06.500 --> 00:02:10.060 +And it's brought to you by Temporal, durable workflows for Python. + +00:02:10.480 --> 00:02:17.040 +Write your workflows as normal Python code and Temporal ensures they run reliably, even across crashes and restarts. + +00:02:17.040 --> 00:02:20.340 +Get started at talkpython.fm/Temporal. + +00:02:21.000 --> 00:02:25.020 +Hey, a quick announcement for everyone taking courses over at Talk Python Training. 
+ +00:02:25.300 --> 00:02:27.940 +We just rolled out course completion certificates. + +00:02:28.260 --> 00:02:29.300 +I'm really excited about these. + +00:02:29.660 --> 00:02:32.900 +When you finish a course, you can now generate a certificate automatically. + +00:02:33.520 --> 00:02:39.900 +The best part is there's a one-click button to add it straight to LinkedIn on your profile as an official certificate. + +00:02:40.420 --> 00:02:43.780 +Potential employers, current colleagues, they'll all see it right there on your profile. + +00:02:43.780 --> 00:02:52.500 +Just head over to your account page at Talk Python Training, find a course you finished, click certificate, and the share to LinkedIn option is right there. + +00:02:52.620 --> 00:02:53.240 +Zero friction. + +00:02:53.900 --> 00:03:03.900 +And if your employer gives you credit for professional development or reimburses you for training costs, but require some sort of proof, you can also download a full certificate as a PDF. + +00:03:04.280 --> 00:03:05.180 +Handy for that kind of thing. + +00:03:05.180 --> 00:03:09.040 +I'd love to see a wave of Talk Python certificates showing up on LinkedIn. + +00:03:09.600 --> 00:03:14.320 +Head over to Talk Python, click courses, go to your account page, and grab your certificates. + +00:03:15.160 --> 00:03:17.640 +Jonathan, Ralph, and Charlie, welcome. + +00:03:17.980 --> 00:03:20.660 +Welcome back, depending on which one of you are hearing this. + +00:03:21.380 --> 00:03:22.420 +Welcome to the show, you all. + +00:03:22.440 --> 00:03:23.780 +It's awesome to have you on Talk Python and me. + +00:03:24.040 --> 00:03:24.700 +Thanks for having us. + +00:03:24.780 --> 00:03:25.380 +Thanks for having us. + +00:03:25.380 --> 00:03:26.380 +Thanks for having us, Michael. + +00:03:27.060 --> 00:03:35.460 +We're going to dive in deep to Python packaging and really look at how the needs of Python packaging have evolved. 
+

00:03:36.120 --> 00:03:41.960
+And what you all, as well as a group of a bunch of other people, I see very long contributor lists on these PEPs.

00:03:42.360 --> 00:03:44.520
+So a lot of people involved in this project.

00:03:44.880 --> 00:03:45.340
+Really great.

00:03:45.520 --> 00:03:47.540
+So let's get into it.

00:03:47.540 --> 00:03:51.520
+Before we do, let's just do a quick round of intros for you all.

00:03:51.760 --> 00:03:53.560
+I guess go around clockwise.

00:03:53.860 --> 00:03:54.720
+Jonathan, you can go first.

00:03:54.720 --> 00:03:59.540
+I've worked at NVIDIA for like, I think, the better part of eight years right now.

00:04:00.200 --> 00:04:02.340
+I did all kinds of different roles.

00:04:02.600 --> 00:04:16.500
+But very recently, I mean, over the last two-something years, I moved into improving our CUDA and Python offering, trying to find better ways to expose GPU programming, essentially, at the Python layer.

00:04:17.300 --> 00:04:26.120
+And I think for a little bit over a year, I've been working with Ralf and Charlie over multiple proposals to improve Python packaging.

00:04:26.920 --> 00:04:28.060
+An initiative called WheelNext.

00:04:28.260 --> 00:04:30.160
+And I think we'll talk a little bit more about this.

00:04:30.380 --> 00:04:32.700
+So excited to be on the show today.

00:04:33.420 --> 00:04:33.520
+Yeah.

00:04:33.600 --> 00:04:34.280
+Excited to have you.

00:04:34.400 --> 00:04:37.640
+You have really seen the roller coaster at NVIDIA, I'm sure.

00:04:38.120 --> 00:04:38.280
+Right?

00:04:38.520 --> 00:04:39.980
+Well, it's really exciting.

00:04:39.980 --> 00:04:46.820
+Yeah, it was like gaming and probably some data science and then all the changes and now just center of the universe.

00:04:47.040 --> 00:04:48.160
+So I'm sure it is exciting.

00:04:48.500 --> 00:04:52.080
+You know, the funny thing is I'd wanted to join NVIDIA for 15 years.
+
+00:04:52.720 --> 00:04:56.360
+And I did a PhD to actually be able to join NVIDIA.
+
+00:04:56.620 --> 00:04:57.740
+That is so awesome.
+
+00:04:58.040 --> 00:04:58.820
+I love it.
+
+00:04:58.900 --> 00:05:02.040
+I was amazed by the CUDA technology when I was in high school.
+
+00:05:02.140 --> 00:05:05.600
+And I was like, ah, this is so incredible, the concept.
+
+00:05:06.060 --> 00:05:07.380
+And I wanted to join.
+
+00:05:07.380 --> 00:05:09.860
+So I'm happy I was able to make this happen.
+
+00:05:10.640 --> 00:05:13.320
+You know, CUDA is going to be an important part of this discussion.
+
+00:05:13.460 --> 00:05:18.480
+Not the only part, but it certainly is one of the forcing functions for the things happening here.
+
+00:05:18.740 --> 00:05:20.960
+Give people the background on CUDA.
+
+00:05:21.400 --> 00:05:22.020
+What is it?
+
+00:05:22.280 --> 00:05:22.800
+How does it work?
+
+00:05:22.880 --> 00:05:23.640
+Why is it so amazing?
+
+00:05:24.360 --> 00:05:37.540
+Well, CUDA is essentially a programming language that allows you to program on GPUs, specifically NVIDIA GPUs, and has a different programming model than what you would usually do in C.
+
+00:05:37.540 --> 00:05:44.400
+So because GPUs are fundamentally very different than CPUs, you have to program them with a different mindset.
+
+00:05:44.900 --> 00:05:52.960
+Like, for example, the biggest important thing when you start with GPUs is to not think about a single thread executing the instruction.
+
+00:05:52.960 --> 00:05:58.580
+But like, how can you massively parallelize a task on like thousands of threads at a single time?
+
+00:05:58.580 --> 00:06:08.720
+And it takes a different perspective and mode of thinking to how can you imagine doing a task on so many threads at the same time?
+
+00:06:08.720 --> 00:06:12.860
+We're not used to it as classic computer scientists.
+
+00:06:13.420 --> 00:06:19.600
+If anything, multi-threading is something that we tend to shy away from because there's a lot of caveats.
+
+00:06:20.480 --> 00:06:24.440
+But well, GPU programming is all about how can you have as many threads as possible.
+
+00:06:25.560 --> 00:06:26.080
+Yeah.
+
+00:06:26.600 --> 00:06:33.340
+It comes from graphics and videos where like this pixel is computed independently of that pixel.
+
+00:06:33.340 --> 00:06:36.360
+And we've got, you know, 5K resolution.
+
+00:06:36.640 --> 00:06:38.020
+So let's just break that up, right?
+
+00:06:38.260 --> 00:06:38.480
+Yeah.
+
+00:06:38.620 --> 00:06:39.960
+It's exactly the idea.
+
+00:06:40.120 --> 00:06:52.100
+And now we have this reasonably new model that's called tile programming that abstracts it even more, which essentially instead of thinking about threads and blocks and grids, you think in terms of tiles.
+
+00:06:52.340 --> 00:06:56.060
+So kind of a mini representation that you could have in mind.
+
+00:06:56.500 --> 00:07:00.460
+And that thing can scale and adapt differently on different hardware.
+
+00:07:00.640 --> 00:07:01.340
+So pretty cool.
+
+00:07:01.340 --> 00:07:03.240
+But yeah, that is amazing.
+
+00:07:03.620 --> 00:07:06.480
+People think that their CPU has a lot of cores.
+
+00:07:06.980 --> 00:07:09.580
+It's got nothing on the graphics cards.
+
+00:07:10.180 --> 00:07:10.760
+Well, yeah.
+
+00:07:11.240 --> 00:07:13.840
+It's a different type of hardware.
+
+00:07:14.420 --> 00:07:15.260
+Which is different.
+
+00:07:15.540 --> 00:07:15.680
+Absolutely.
+
+00:07:16.100 --> 00:07:16.180
+Yeah.
+
+00:07:16.220 --> 00:07:16.760
+Well, very cool.
+
+00:07:16.860 --> 00:07:17.200
+Very cool.
+
+00:07:17.300 --> 00:07:20.760
+And what a journey if you did all that work to get there.
+
+00:07:20.980 --> 00:07:21.920
+I absolutely love it.
+
+00:07:22.120 --> 00:07:22.380
+Ralf.
+
+00:07:22.580 --> 00:07:22.860
+Welcome.
+
+00:07:23.080 --> 00:07:23.260
+Hello.
+
+00:07:23.600 --> 00:07:23.780
+Yeah.
+
+00:07:24.020 --> 00:07:24.500
+Thanks, Michael.
+
+00:07:24.700 --> 00:07:25.280
+Great to be here.
+
+00:07:25.940 --> 00:07:26.780
+So about me.
+
+00:07:26.780 --> 00:07:28.780
+I am a physicist by training.
+
+00:07:29.180 --> 00:07:32.120
+I did a PhD in atomic and quantum physics.
+
+00:07:32.760 --> 00:07:35.240
+Worked in the semiconductor industry for a while.
+
+00:07:35.740 --> 00:07:38.280
+And I rolled into scientific computing.
+
+00:07:38.440 --> 00:07:41.220
+Due to that, I started using Python in 2004.
+
+00:07:41.220 --> 00:07:43.640
+And used the mailing list at that point.
+
+00:07:43.840 --> 00:07:46.260
+Because there was, I mean, NumPy didn't exist yet.
+
+00:07:46.360 --> 00:07:48.020
+There was no documentation for anything.
+
+00:07:48.020 --> 00:07:49.320
+So you had to join a mailing list.
+
+00:07:49.540 --> 00:07:51.420
+That's how I rolled into open source early on.
+
+00:07:51.700 --> 00:07:56.080
+I became the release manager of NumPy and SciPy in 2010.
+
+00:07:56.080 --> 00:07:58.720
+And yeah, I've been kind of doing that ever since.
+
+00:07:59.140 --> 00:08:00.720
+As a volunteer for 10 years.
+
+00:08:00.820 --> 00:08:01.900
+And then it got really too much.
+
+00:08:02.040 --> 00:08:03.060
+So I made it my job.
+
+00:08:03.120 --> 00:08:06.240
+I joined Quansight, which is a small consulting company.
+
+00:08:06.620 --> 00:08:10.380
+Primarily around like data science, applied AI, scientific computing.
+
+00:08:10.980 --> 00:08:14.080
+And yeah, I'm now one of the two co-CEOs of Quansight.
+
+00:08:14.740 --> 00:08:15.180
+Awesome.
+
+00:08:15.400 --> 00:08:19.560
+Trying to basically, we just converted last year to a public benefit corporation.
+
+00:08:19.560 --> 00:08:23.160
+Which is very much aligned with what, you know, most of our team wants to do.
+
+00:08:23.160 --> 00:08:25.320
+Most of them are open source maintainers.
+
+00:08:25.800 --> 00:08:31.680
+And yeah, we basically do consulting to allow ourselves to make impactful open source contributions.
+
+00:08:32.160 --> 00:08:35.500
+Quansight is doing a ton in the data science space.
+
+00:08:35.940 --> 00:08:37.500
+Scientific computing space, for sure.
+
+00:08:37.680 --> 00:08:42.180
+I've had multiple rounds of Quansight folks on the show and things like that.
+
+00:08:42.380 --> 00:08:43.100
+And very neat.
+
+00:08:43.300 --> 00:08:45.340
+Yeah, it's a lot of fun and rewarding.
+
+00:08:45.760 --> 00:08:47.060
+So yeah, glad to be here.
+
+00:08:47.280 --> 00:08:53.000
+It's an interesting transition going from a science or something along those lines into programming, right?
+
+00:08:53.000 --> 00:08:58.040
+And I got into it through working in my math research and so on.
+
+00:08:58.300 --> 00:08:59.980
+And actually, this is just more fun.
+
+00:09:00.040 --> 00:09:01.300
+I'm just going to do programming.
+
+00:09:01.820 --> 00:09:04.860
+It's not exactly physics, but it's pretty similar, you know?
+
+00:09:05.120 --> 00:09:05.400
+Yeah.
+
+00:09:05.600 --> 00:09:07.220
+I mean, I've always liked both.
+
+00:09:07.400 --> 00:09:10.020
+But I did experimental physics.
+
+00:09:10.020 --> 00:09:14.740
+And there, you have much less control over what you end up producing.
+
+00:09:15.140 --> 00:09:17.560
+You know, building and using lasers in the lab.
+
+00:09:17.620 --> 00:09:21.280
+If one broke, maybe I had to send it off for repairs and wait a month, right?
+
+00:09:21.380 --> 00:09:22.560
+And what do you do in the meantime?
+
+00:09:22.660 --> 00:09:23.040
+You program.
+
+00:09:23.680 --> 00:09:26.660
+So, you know, that's one of the nicer things about it.
+
+00:09:26.940 --> 00:09:31.440
+And yeah, I gradually started with like, even before Python, there was some MATLAB.
+
+00:09:31.440 --> 00:09:33.200
+And then, you know, you roll into open source.
+
+00:09:33.340 --> 00:09:37.360
+And then, you know, mostly just Python, a bit of C, and kind of like go down from there.
+
+00:09:37.460 --> 00:09:38.760
+And then you encounter packaging.
+
+00:09:39.040 --> 00:09:43.320
+And it's one of those things that like only 5% of people like and the rest see it as a chore.
+
+00:09:43.580 --> 00:09:46.500
+But yeah, when you like it, you just have to do more and more of it.
+
+00:09:46.820 --> 00:09:46.980
+Yeah.
+
+00:09:47.040 --> 00:09:49.000
+You're with your people now, I think, on this call.
+
+00:09:49.060 --> 00:09:49.600
+That's for sure.
+
+00:09:50.520 --> 00:09:51.040
+Hey, Charlie.
+
+00:09:51.600 --> 00:09:53.860
+I mean, do we even need to give you an introduction?
+
+00:09:53.980 --> 00:09:55.840
+We just say uv and then go on or?
+
+00:09:56.140 --> 00:09:56.840
+No, I'll generally.
+
+00:09:57.920 --> 00:09:59.080
+No, please do.
+
+00:09:59.080 --> 00:09:59.720
+I'm just kidding.
+
+00:10:00.040 --> 00:10:05.600
+But the reason I say that is uv has taken the world by storm, really.
+
+00:10:06.060 --> 00:10:06.920
+And congratulations.
+
+00:10:07.320 --> 00:10:08.220
+And yeah, tell people about yourself.
+
+00:10:08.220 --> 00:10:08.520
+Thank you.
+
+00:10:08.860 --> 00:10:09.440
+Yeah, yeah, of course.
+
+00:10:09.680 --> 00:10:11.160
+So my name is Charlie.
+
+00:10:11.520 --> 00:10:13.280
+I'm the founder and CEO of Astral.
+
+00:10:13.780 --> 00:10:18.880
+I've been working on the company for, let's see, started the company in October 2022.
+
+00:10:19.200 --> 00:10:20.140
+That's the easier way to do it.
+
+00:10:20.360 --> 00:10:21.900
+So I've been working on this for a few years.
+
+00:10:23.440 --> 00:10:25.280
+We mostly build open source.
+
+00:10:25.280 --> 00:10:29.940
+So we've worked on a couple of different tools that have become quite popular in Python.
+
+00:10:30.140 --> 00:10:32.420
+So we build Ruff, which is our linter and formatter.
+
+00:10:32.800 --> 00:10:34.200
+ty, which is our type checker.
+
+00:10:34.500 --> 00:10:39.600
+And then most relevant for this episode would be uv, which is our Python package and project manager.
+
+00:10:40.560 --> 00:10:49.620
+So yeah, we spend all our time thinking about how to build tools that make it easier to work with Python and how to make Python programming more productive.
+
+00:10:49.620 --> 00:10:51.780
+A lot of that's about speed.
+
+00:10:52.180 --> 00:10:59.680
+We try to build things that are really fast, but it's also about user experience and trying to sort of like take complexity out of the critical path for users.
+
+00:11:00.780 --> 00:11:12.220
+So, you know, for example, we've definitely spent a lot of time thinking about how we can make it easier for people to install PyTorch, which is, you know, one of the examples that will come up, I'm sure, you know, over the course of the show.
+
+00:11:12.280 --> 00:11:15.020
+And one of the motivating examples for the PEPs we've been working on.
+
+00:11:15.020 --> 00:11:17.560
+So, yeah, that's why I'm here.
+
+00:11:17.660 --> 00:11:21.780
+We've been collaborating with Jonathan, Ralf, and honestly, like a bunch of other people, too.
+
+00:11:21.840 --> 00:11:23.660
+It's been a really big effort, and I'm sure we'll get into that.
+
+00:11:23.760 --> 00:11:30.420
+But it's been cool to have this long running and very like wide ranging collaboration around trying to push Python packaging forward.
+
+00:11:30.880 --> 00:11:33.300
+Well, like I said, congrats on all the stuff with Astral.
+
+00:11:33.800 --> 00:11:36.640
+And we're going to talk a little bit about pyx, I think.
+
+00:11:36.780 --> 00:11:37.700
+Maybe see if there's any.
+
+00:11:37.980 --> 00:11:42.520
+Just to check in at the end of the show after we talk about some of these things, I think, if you're up for it.
+
+00:11:42.680 --> 00:11:43.120
+Yeah, sounds good.
+
+00:11:43.120 --> 00:11:45.140
+I mean, let's just start with what is the challenge.
+
+00:11:45.340 --> 00:11:52.660
+You all have described this as the lowest common denominator packaging problem that we've got to deal with.
+
+00:11:52.820 --> 00:12:05.100
+And the idea or the problem is different CPUs have specialized instructions, different graphics cards, all these different compute platforms and so on might have specific instructions.
+
+00:12:05.360 --> 00:12:06.560
+And they're optimized, right?
+
+00:12:06.600 --> 00:12:10.140
+Like do this as vector operations instead of on registers or whatever.
+
+00:12:10.140 --> 00:12:14.940
+But maybe some other thing that it might run on doesn't support that, right?
+
+00:12:15.060 --> 00:12:16.300
+I don't know, WebAssembly, whatever.
+
+00:12:16.640 --> 00:12:16.820
+Yeah.
+
+00:12:17.740 --> 00:12:28.420
+And so then how do you actually end up shipping something to Python people that takes advantage of the specializations that are there when they're there, but without breaking the other ones, right?
+
+00:12:28.480 --> 00:12:29.460
+That's kind of the core problem.
+
+00:12:29.540 --> 00:12:29.940
+Is that right?
+
+00:12:30.220 --> 00:12:30.540
+Yeah.
+
+00:12:30.540 --> 00:12:32.900
+I can take one little step back before.
+
+00:12:32.900 --> 00:12:33.520
+Go ahead, Alan.
+
+00:12:33.520 --> 00:12:46.080
+If you think about it like a wheel, when you take the Python package that we have everywhere, there are a few parts of the file name that essentially allow you to know what it's been built for.
+
+00:12:46.460 --> 00:12:50.120
+So inside this, you have, if it's a pure Python package, it's simple.
+
+00:12:50.340 --> 00:12:55.260
+You might have a minimum Python version, but in most cases, it's pretty generic.
+
+00:12:55.260 --> 00:12:56.520
+So that's not an issue.
+
+00:12:56.660 --> 00:13:03.640
+When you start having compiled code inside the package, that's a different story because now we're talking about what kind of OS it was built for.
+
+00:13:03.920 --> 00:13:07.540
+So Windows, macOS, Linux, different flavors.
+
+00:13:08.500 --> 00:13:11.220
+We're talking about the type of CPU that it was built for.
+
+00:13:11.600 --> 00:13:16.880
+So x86, ARM, PowerPC, potentially RISC-V, all these things.
+
+00:13:17.860 --> 00:13:18.340
+Mobile.
+
+00:13:18.340 --> 00:13:20.520
+And then finally, the Python ABI.
+
+00:13:20.840 --> 00:13:24.060
+And in most cases, it means the minimum Python ABI that you need.
+
+00:13:24.880 --> 00:13:30.440
+So, for people, an ABI is essentially the same as an API, but for a binary.
+
+00:13:31.000 --> 00:13:37.360
+So it's important when things are stable at the ABI level because it allows you to be future compatible.
+
+00:13:38.400 --> 00:13:39.380
+What does ABI mean?
+
+00:13:39.740 --> 00:13:40.460
+Jonathan, help us out.
+
+00:13:40.480 --> 00:13:42.040
+What does ABI mean for those of us who don't know?
+
+00:13:42.040 --> 00:13:43.460
+Application binary interface.
+
+00:13:43.640 --> 00:13:47.860
+So it's the same as an API, but specifically for binaries.
+
+00:13:48.340 --> 00:13:58.920
+And the problem that we collectively kind of try to get to is that, well, today, the compute space and the scientific computing space,
+
+00:13:58.920 --> 00:14:10.520
+which if we take the latest JetBrains Python developer survey, is at least 40 to 50% of the Python developers are essentially doing data science or similar.
+
+00:14:10.520 --> 00:14:21.160
+So it's a massive percentage of the community is doing, in some form, scientific computing to whatever extent you may want to think about it.
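For reference, the fields Jonathan describes here are encoded directly in the wheel filename: name, version, then the Python tag, ABI tag, and platform tag. A minimal sketch of pulling those apart (real installers use the `packaging` library; this assumes a normalized name with no extra dashes):

```python
# Rough sketch: wheel filenames follow
# name-version(-build)?-pythontag-abitag-platformtag.whl
# The last three dash-separated fields are the compatibility tags
# discussed in the episode.

def wheel_tags(filename: str) -> dict:
    """Split a wheel filename into its compatibility tags."""
    stem = filename.removesuffix(".whl")
    parts = stem.split("-")
    # Normalized wheel names use underscores, so splitting on '-' is safe here.
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "python": python_tag,      # e.g. cp312 -> CPython 3.12
        "abi": abi_tag,            # e.g. cp312, abi3, or none
        "platform": platform_tag,  # e.g. manylinux_2_17_x86_64
    }

tags = wheel_tags("numpy-2.1.0-cp312-cp312-manylinux_2_17_x86_64.whl")
print(tags["platform"])  # manylinux_2_17_x86_64
```

Note there is no field left over for CPU features, CUDA versions, or BLAS implementations, which is exactly the gap discussed next.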
+
+00:14:21.160 --> 00:14:32.220
+And, well, the problem is when we do these things, we try to do them fast because who likes to wait on the return of some Pandas operation or NumPy or PyTorch operation.
+
+00:14:32.220 --> 00:14:42.620
+But to go fast, you need to use all the tricks in the book that you can get; essentially, you have to optimize the binary for a specific CPU,
+
+00:14:42.620 --> 00:14:47.860
+for a specific GPU, or for a specific library that you want to use, like BLAS.
+
+00:14:48.140 --> 00:14:53.060
+BLAS is a general concept, so which BLAS implementation it is, or MPI.
+
+00:14:53.500 --> 00:15:03.020
+And the problem is, well, we don't have the tags or markers to allow us to essentially flag this specific binary to be compatible with X, Y, and Z.
+
+00:15:03.020 --> 00:15:13.660
+Right, so the wheel might say, this is for 3.14, it is for ARM CPUs, and so on, but it's not going to say,
+
+00:15:14.000 --> 00:15:18.780
+and it supports this vectorization optimization on Intel chips, right?
+
+00:15:18.960 --> 00:15:20.140
+I just said ARM, didn't I?
+
+00:15:20.300 --> 00:15:30.280
+A very good example with ARM is that the default most people build with is actually a Raspberry Pi, ARM level.
+
+00:15:30.480 --> 00:15:31.200
+Yeah, yeah, yeah.
+
+00:15:31.200 --> 00:15:42.300
+And you can imagine that when you build for any type of desktop CPU, ARM, you have a little bit more complex CPUs and a little bit more advanced chips.
+
+00:15:42.760 --> 00:15:48.260
+And it's a lot of performance that you leave on the table by not optimizing for a specific platform.
+
+00:15:48.440 --> 00:15:52.420
+So obviously, in some cases, it doesn't really matter, but in other cases, it does really matter.
+
+00:15:53.540 --> 00:15:56.120
+This portion of Talk Python To Me is brought to you by Sentry.
+
+00:15:56.560 --> 00:15:58.700
+You know Sentry for their great error monitoring.
+
+00:15:58.700 --> 00:16:00.120
+But let's talk about logs.
+
+00:16:00.580 --> 00:16:01.480
+Logs are messy.
+
+00:16:01.980 --> 00:16:07.600
+Trying to grep through them and line them up with traces and dashboards just to understand one issue isn't easy.
+
+00:16:08.060 --> 00:16:10.240
+Did you know that Sentry has logs too?
+
+00:16:10.760 --> 00:16:12.800
+And your logs just became way more usable.
+
+00:16:13.360 --> 00:16:19.300
+Sentry's logs are trace-connected and structured, so you can follow the request flow and filter by what matters.
+
+00:16:19.300 --> 00:16:28.020
+And because Sentry surfaces the context right where you're debugging, the trace, relevant logs, the error, and even the session replay all land in one timeline.
+
+00:16:28.380 --> 00:16:30.540
+No timestamp matching, no tool hopping.
+
+00:16:31.060 --> 00:16:37.280
+From front end to mobile to back end, whatever you're debugging, Sentry gives you the context you need so you can fix the problem and move on.
+
+00:16:37.720 --> 00:16:42.400
+More than 4.5 million developers use Sentry, including teams at Anthropic and Disney+.
+
+00:16:42.400 --> 00:16:47.840
+Get started with Sentry logs and error monitoring today at talkpython.fm/sentry.
+
+00:16:48.220 --> 00:16:50.520
+Be sure to use our code, talkpython26.
+
+00:16:50.980 --> 00:16:52.780
+The link is in your podcast player's show notes.
+
+00:16:53.080 --> 00:16:54.760
+Thank you to Sentry for supporting the show.
+
+00:16:55.760 --> 00:16:57.400
+I'll give a very concrete example.
+
+00:16:57.400 --> 00:17:04.560
+Intel x86-64 is kind of the most common CPU that most people will have at home.
+
+00:17:04.760 --> 00:17:12.700
+If you build a wheel for that, you can only use CPU features, performance CPU features that go back to about 2009.
+
+00:17:13.000 --> 00:17:27.220
+Any new hardware features that were introduced after 2009, things like SSE4, AVX2, later versions of that, you just cannot use because the installers don't know that you put that in the wheel.
+ +00:17:27.560 --> 00:17:32.280 +And hence, they will also install it on computers that don't have those instructions, right? + +00:17:32.300 --> 00:17:34.640 +And then you just get like very ugly crashes. + +00:17:34.980 --> 00:17:40.020 +Hence, what we all do is we ship wheels, binaries that are only compatible with 2009. + +00:17:40.320 --> 00:17:51.220 +And the difference between the 2009 hardware features and, you know, the 2019 or 2023 one could be a factor of 10x, 20x in performance, depending on what you're doing. + +00:17:51.220 --> 00:17:52.780 +10x to 20x? + +00:17:53.040 --> 00:17:53.320 +Oh, yeah. + +00:17:53.540 --> 00:17:58.680 +For, you know, especially when you work with scientific data and SIMD instructions. + +00:17:58.960 --> 00:18:01.580 +Yeah, you can get massive performance increases. + +00:18:01.800 --> 00:18:04.780 +If you heard of vectorization, this is a huge deal. + +00:18:05.100 --> 00:18:05.300 +Yeah. + +00:18:05.620 --> 00:18:12.880 +I mean, I guess the way I think about it from our perspective of building, like, because these problems, like one of the things that's very hard about solving them, and it has required, + +00:18:12.880 --> 00:18:21.460 +like us to be so collaborative across the industry, is that it touches, like, basically every piece of the Python packaging stack. + +00:18:21.840 --> 00:18:30.920 +Like, it impacts how you build things, how the registries work, like what they support, how installers, like, choose what to install. + +00:18:30.920 --> 00:18:36.600 +And so, like, for us, it's like, you know, there's the superpower of Python, I think, in some ways. + +00:18:36.980 --> 00:18:45.460 +Sorry, I think the superpower of Python in some ways is, like, you can build and distribute all this software that's built for, you know, that uses native code. 
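The crash scenario Ralf describes comes down to installers having no way to check which instruction sets a machine actually supports. A minimal sketch of the kind of check an installer would need (assuming Linux-style /proc/cpuinfo text; the function and flag names are illustrative, not from any spec):

```python
# Sketch: decide whether a CPU supports the features a specialized build
# needs, by checking the 'flags' line of /proc/cpuinfo-style text.

def has_features(cpuinfo_text: str, required: set) -> bool:
    """Return True if every required flag appears in the 'flags' line."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            flags = set(line.split(":", 1)[1].split())
            return required <= flags
    return False

sample = "processor : 0\nflags : fpu sse sse2 sse4_2 avx avx2\n"
print(has_features(sample, {"avx2"}))     # True
print(has_features(sample, {"avx512f"})) # False
```

Without a check like this wired into the install process, shipping an AVX2-only wheel means it also lands on the 2009-era machines that will crash on those instructions, which is why everyone ships the baseline build today.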
+
+00:18:45.460 --> 00:18:51.020
+Like, you can take native code, and you can distribute it out to users, and they can run it just like it's any other piece of Python code.
+
+00:18:51.820 --> 00:19:02.600
+And in the spec, we have these things like, okay, you can build a wheel that targets Windows or Linux or macOS, and it can target, like, x86 or ARM or whatever else.
+
+00:19:03.000 --> 00:19:04.600
+And those are all captured in the spec.
+
+00:19:04.720 --> 00:19:13.160
+And so, for us, like, building uv, we know how to detect those things, how to figure out, like, which wheel to install based on what the user's machine is running.
+
+00:19:13.160 --> 00:19:20.080
+But there's all this other stuff that's not captured by any of those standards, like the instruction set or even, like, the supported CUDA version.
+
+00:19:20.600 --> 00:19:25.220
+Like, all these things are not captured in that wheel file, and installers don't know how to detect them.
+
+00:19:25.300 --> 00:19:30.540
+They don't know how to figure out, like, okay, which PyTorch build should I use based on the CUDA version on the user's machine?
+
+00:19:30.700 --> 00:19:31.880
+Like, all that stuff is lost.
+
+00:19:31.960 --> 00:19:33.840
+And that's kind of the gap that we're trying to bridge.
+
+00:19:34.520 --> 00:19:41.660
+And part of the philosophy is also, so right now, Python packaging exposed what is called platform tags.
+
+00:19:41.660 --> 00:19:48.080
+So, essentially, a sort of, like, mini tag that comes with a specific definition that installers know how to resolve.
+
+00:19:48.640 --> 00:19:57.060
+And what we're trying to avoid is to end up creating 200 more today and 200 more in two years and 200 more, again, in four years.
+
+00:19:57.160 --> 00:20:10.040
+So, we try to come up with a generic system that will allow you to essentially include arbitrary definitions that then resolvers and package managers can then understand by some sort of mechanism.
+
+00:20:10.040 --> 00:20:18.500
+And resolve, but not create a sort of blessed list of things that you constantly have to update because it's a lot of maintenance.
+
+00:20:19.040 --> 00:20:20.940
+Yeah, that's how we got into the situation now, right?
+
+00:20:21.000 --> 00:20:27.880
+Because there's one for the version, there's one for the architecture of the CPU, but then there's not a spot for the other stuff.
+
+00:20:27.980 --> 00:20:35.620
+So, the overall idea is to say almost just a metadata section in there and things can read it or ignore it as they see fit.
+
+00:20:35.620 --> 00:20:37.200
+Yeah, that's exactly the concept.
+
+00:20:37.340 --> 00:20:38.060
+Conceptually, yeah.
+
+00:20:38.180 --> 00:20:38.820
+A little bit.
+
+00:20:39.280 --> 00:20:39.860
+Yeah, yeah.
+
+00:20:40.060 --> 00:20:55.000
+I mean, it's like, I guess the question is, like, okay, if we have this, like, huge space of things that we might possibly want to detect and condition installs on, like, okay, anytime someone publishes a wheel for Python, they should now tell us, like, what CUDA version is it built for?
+
+00:20:55.000 --> 00:20:58.900
+Or, like, if any, what, like, CPU instruction sets does it support?
+
+00:20:59.160 --> 00:20:59.960
+Like, blah, blah, blah.
+
+00:21:00.160 --> 00:21:01.840
+Like, where would we put all that stuff, right?
+
+00:21:01.880 --> 00:21:02.520
+Becomes the question.
+
+00:21:02.640 --> 00:21:06.300
+It's like, what, are we just going to keep expanding, like, the platform tag and everything else?
+
+00:21:06.520 --> 00:21:09.340
+And that's, like, the problem that we're trying to solve in kind of a generic way.
+
+00:21:09.580 --> 00:21:13.020
+Yeah, you can end up with a file name that's 4,000 characters wide or something.
+
+00:21:13.140 --> 00:21:14.880
+They can already get pretty long, by the way.
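The generic system being sketched here — arbitrary key/value properties on a build, matched against whatever an installer detects on the machine — can be illustrated in a few lines. Everything below is invented for illustration (the property keys, file names, and matching rule are not the actual PEP syntax):

```python
# Hypothetical sketch of variant selection: each build carries arbitrary
# key/value properties, and an installer keeps any build whose declared
# properties are all satisfied by what it detected locally.

def compatible(variant_props: dict, detected: dict) -> bool:
    """A build is usable if every property it declares holds on this machine."""
    return all(detected.get(key) == value for key, value in variant_props.items())

builds = [
    {"file": "pkg-1.0-generic.whl", "props": {}},                     # baseline
    {"file": "pkg-1.0-avx2.whl", "props": {"x86_64_level": "v3"}},    # CPU variant
    {"file": "pkg-1.0-cu12.whl", "props": {"cuda_major": "12"}},      # GPU variant
]
detected = {"x86_64_level": "v3"}  # e.g., probed by a detection plugin

usable = [b["file"] for b in builds if compatible(b["props"], detected)]
print(usable)  # ['pkg-1.0-generic.whl', 'pkg-1.0-avx2.whl']
```

The point of the design is that new keys (a future accelerator, a BLAS choice) need no new blessed tag list — the matching mechanism stays the same.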
+
+00:21:14.880 --> 00:21:21.620
+But, yeah, we have to work around file name length limits in uv sometimes.
+
+00:21:21.860 --> 00:21:22.000
+But, yeah.
+
+00:21:22.000 --> 00:21:27.500
+There's actually a very famous package that used, I think, the first 200 digits of pi as the version number.
+
+00:21:28.600 --> 00:21:29.660
+Oh, my gosh.
+
+00:21:31.060 --> 00:21:32.380
+It's a pretty good joke.
+
+00:21:32.560 --> 00:21:33.400
+I didn't know about it.
+
+00:21:33.440 --> 00:21:36.640
+There's somebody on discuss.python.org that posted the link.
+
+00:21:36.700 --> 00:21:38.020
+And I was like, but that's hilarious.
+
+00:21:38.840 --> 00:21:40.260
+That is wild.
+
+00:21:40.260 --> 00:21:50.180
+So, before we dive into what you all are proposing, let's maybe talk about how just a couple of packages or libraries solve this problem now in maybe different directions.
+
+00:21:50.580 --> 00:21:52.820
+So, Ralf, what about NumPy, right?
+
+00:21:52.880 --> 00:21:55.900
+I mean, you guys talked about vectorization and stuff.
+
+00:21:56.520 --> 00:21:56.800
+Yeah.
+
+00:21:57.300 --> 00:21:59.060
+That's so in line with NumPy, right?
+
+00:21:59.100 --> 00:22:02.280
+Is NumPy, like, and pandas, that's the way, you know?
+
+00:22:02.540 --> 00:22:02.860
+Yes.
+
+00:22:02.980 --> 00:22:03.460
+NumPy, yes.
+
+00:22:03.560 --> 00:22:04.080
+Pandas, no.
+
+00:22:04.080 --> 00:22:14.120
+So, NumPy does contain SIMD instructions and, you know, because it's incredibly useful for performance.
+
+00:22:14.400 --> 00:22:20.240
+You know, NumPy has all large arrays and basic instructions on them that, like, have direct hardware implementations typically.
+
+00:22:21.060 --> 00:22:29.300
+But the way it's done is incredibly complex because you need to end up with a wheel that works on every type of CPU, right?
+
+00:22:29.360 --> 00:22:33.440
+You know, I'll stay with x86, but the same happens on the other platforms, right?
+
+00:22:33.440 --> 00:22:39.220
+You know, it needs to run on a 2010 CPU and it needs to run better on a 2024 CPU.
+
+00:22:39.640 --> 00:22:54.440
+So, what we do in NumPy is we have a system that basically allows you to parameterize a source file and then rebuild it multiple times for different particular CPU architectures.
+
+00:22:54.440 --> 00:22:59.060
+So, like, you know, like a Haswell family and then a Skylake family and so on.
+
+00:22:59.260 --> 00:23:03.660
+And then we basically merge that together in a single Python extension module.
+
+00:23:03.660 --> 00:23:16.420
+And then at runtime, we have our own code to detect the CPU and basically then some, like, dispatch shim layer that kind of fishes out the right, you know, family from the extension module.
+
+00:23:16.420 --> 00:23:18.980
+So, yeah, you put up the diagram there.
+
+00:23:19.180 --> 00:23:20.560
+It's pretty complicated.
+
+00:23:21.080 --> 00:23:26.640
+And I'd say there I've been collaborating with some of the, you know, world experts on this.
+
+00:23:27.140 --> 00:23:38.480
+In the end, this was only successful because we built a generic architecture that other experts per, you know, CPU architecture could come and contribute to.
+
+00:23:38.480 --> 00:23:44.600
+So, we now have a specific team of like four people that help maintain the architecture.
+
+00:23:44.900 --> 00:23:51.760
+But then like, you know, Intel for years paid one of their engineers to optimize specifically the x86 code path.
+
+00:23:52.400 --> 00:23:57.320
+And then ARM has a NumPy maintainer who, you know, got commit rights a few years ago.
+
+00:23:57.580 --> 00:24:00.840
+And he's the final authority on all the ARM instructions that are in there.
+
+00:24:01.100 --> 00:24:06.160
+So, that whole complicated thing is now shipped and it's extremely good for performance.
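The runtime-dispatch shim Ralf describes — compile several specialized builds of a kernel into one extension, then pick the best one the CPU supports at import time — can be sketched in a few lines of Python. The feature names and kernels below are invented for illustration; NumPy's actual shim lives in C:

```python
# Toy sketch of a dispatch shim: kernels ordered most specialized first,
# each with the CPU features it requires. An empty requirement set is the
# generic baseline and always matches.

def generic_sum(xs):
    # Baseline kernel: valid on any CPU.
    return sum(xs)

def avx2_sum(xs):
    # Stand-in for a kernel that would be compiled with AVX2 instructions.
    return sum(xs)

KERNELS = [({"avx2"}, avx2_sum), (set(), generic_sum)]

def dispatch(cpu_features: set):
    """Pick the first kernel whose required features the CPU provides."""
    for required, kernel in KERNELS:
        if required <= cpu_features:
            return kernel

print(dispatch({"sse2", "avx", "avx2"}).__name__)  # avx2_sum
print(dispatch({"sse2"}).__name__)                 # generic_sum
```

With wheel variants, this selection could move out of the package and into the installer: ship one wheel per kernel and let the resolver pick, instead of bundling every kernel into one fat binary.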
+
+00:24:06.160 --> 00:24:10.680
+But you can see how this is not a scalable process to do in many packages, right?
+
+00:24:10.920 --> 00:24:16.680
+Plus, you know, if you compile everything five times, you get a binary that's, you know, it's not five times bigger, but it's a lot bigger.
+
+00:24:17.020 --> 00:24:19.160
+So, it's not great for users as well.
+
+00:24:19.440 --> 00:24:21.740
+Yeah, actually, these things are nicknamed fatbins.
+
+00:24:22.000 --> 00:24:28.220
+So you get the idea for why they are called that way: they tend to be very heavy to download.
+
+00:24:28.540 --> 00:24:28.960
+Yeah, yeah.
+
+00:24:29.300 --> 00:24:30.900
+Instead of wheels, you got big wheels.
+
+00:24:31.300 --> 00:24:31.480
+Yep.
+
+00:24:31.480 --> 00:24:37.900
+So, what happens if all these changes get adopted and it doesn't need to be compiled into one giant binary?
+
+00:24:38.280 --> 00:24:38.480
+Okay.
+
+00:24:38.700 --> 00:24:40.480
+Are all these maintainers still working?
+
+00:24:40.580 --> 00:24:43.720
+They just don't have to deal with trying to boot it all into one thing?
+
+00:24:44.000 --> 00:24:46.500
+They might still have to do, yes.
+
+00:24:46.760 --> 00:24:47.980
+I think essentially you're correct.
+
+00:24:48.080 --> 00:24:52.620
+You still need to write the actual code that uses the SIMD instructions.
+
+00:24:52.620 --> 00:25:00.880
+But then you can just produce a wheel that says like, okay, it works on this specific CPU architecture and just ignore this code if I'm building for another architecture.
+
+00:25:01.260 --> 00:25:06.540
+And all the, you know, detecting the CPU at runtime and the dynamic dispatch features you all don't need.
+
+00:25:06.740 --> 00:25:07.700
+Will it make the code faster?
+
+00:25:08.120 --> 00:25:09.140
+It will, well.
+
+00:25:09.760 --> 00:25:11.280
+Like will you have a better cache hits?
+
+00:25:11.380 --> 00:25:12.600
+Will there be smaller stuff in memory?
+
+00:25:12.680 --> 00:25:13.240
+You know, that kind of stuff.
+
+00:25:13.240 --> 00:25:16.400
+I don't think it will make the NumPy code much faster.
+
+00:25:17.240 --> 00:25:24.240
+It will, you know, it will make a huge difference for all the other packages that don't have this amount of complexity today.
+
+00:25:24.360 --> 00:25:31.660
+So like SciPy, scikit-learn, Pandas, Pillow, like none of these packages actually use SIMD code.
+
+00:25:31.960 --> 00:25:35.320
+And for SciPy, it's the easiest for me to talk about because I'm also a SciPy maintainer.
+
+00:25:35.480 --> 00:25:40.900
+We actually have a lot of code that, you know, got vendored in from somewhere, like Fourier transforms, for example.
+
+00:25:41.060 --> 00:25:42.020
+They benefit a lot as well.
+
+00:25:42.020 --> 00:25:50.900
+We have AVX2 and ARM Neon implementations, but we just don't build them and don't ship that as wheels because we have no way of doing that.
+
+00:25:51.240 --> 00:25:56.340
+As soon as we have, you know, wheel variants, we can say, okay, let's ship two sets of wheels.
+
+00:25:56.600 --> 00:25:58.880
+I mean, that's more CI jobs to build more wheels.
+
+00:25:59.140 --> 00:26:02.680
+But, you know, when it's worth it, you know, you can make that trade-off, right?
+
+00:26:02.720 --> 00:26:03.740
+Like we already have the code.
+
+00:26:03.820 --> 00:26:07.020
+We just have to change a build option, produce a different wheel, and ship it.
+
+00:26:07.020 --> 00:26:15.160
+So do you just set up something like a #ifdef sort of thing, like #ifdef this capability?
+
+00:26:15.900 --> 00:26:17.640
+Else you put in the generic code?
+
+00:26:18.320 --> 00:26:18.720
+Exactly.
+
+00:26:19.180 --> 00:26:22.260
+The, yeah, the C code is basically just a bunch of #ifdefs.
+
+00:26:22.660 --> 00:26:29.800
+And, you know, for maintainability reasons, you only add more #ifdefs if, you know, it's really much faster.
+

00:26:29.800 --> 00:26:36.020
Like you're not going to do it for 10 or 20% faster, but if it's 2x faster, well, why not have an extra else branch?

00:26:36.400 --> 00:26:37.040
Yeah, absolutely.

00:26:37.380 --> 00:26:39.720
Charlie, does Rust have a #ifdef equivalent?

00:26:40.040 --> 00:26:40.600
It must, right?

00:26:40.860 --> 00:26:41.760
Yeah, you can do.

00:26:42.340 --> 00:26:43.920
It has directives like that.

00:26:44.120 --> 00:26:47.480
Yeah, but you guys don't really need to worry about using this for yourself.

00:26:47.880 --> 00:26:52.080
This is more for the things that you're providing to everyone, right?

00:26:52.400 --> 00:26:52.800
Yeah.

00:26:52.800 --> 00:27:00.000
Yeah, this is mostly, this wouldn't have a huge impact on uv or, I mean, it could have some small impact.

00:27:00.080 --> 00:27:04.960
But I think largely this is about, yeah, how can we make it easier for users to consume this stuff?

00:27:04.960 --> 00:27:09.740
And I mean, the NumPy, like this is a good example of how it affects like build and distribution.

00:27:10.080 --> 00:27:15.200
Because, yes, they still have to write like architecture-specific code if they want to get these optimizations.

00:27:15.200 --> 00:27:24.060
But what we'll be doing with these proposals is making it much easier for them to ship separate builds that are like dedicated for each of those different variants.

00:27:24.400 --> 00:27:28.420
So like the end user, you know, will get access to it.

00:27:28.860 --> 00:27:35.680
But in this case, it's like the bottleneck is, or part of the bottleneck is like all the complexity it puts on the maintainers and the people publishing.

00:27:36.480 --> 00:27:42.020
How much do you think it would impact the performance to ship Python standalone with different CPU extensions?

00:27:42.420 --> 00:27:44.040
That is a good question, Jonathan.
+

00:27:44.040 --> 00:27:49.400
So we'd actually like to do, I don't know that I have a great answer to that.

00:27:49.520 --> 00:27:52.260
I mean, like a good quantitative answer to it.

00:27:52.320 --> 00:27:55.080
I think we are very interested in doing stuff like that.

00:27:55.160 --> 00:27:57.640
We've also considered, for example, shipping a build.

00:27:57.920 --> 00:28:00.800
Like we ship with a relatively old like glibc minimum.

00:28:01.040 --> 00:28:07.200
We've considered shipping a build, a variant, not in the sense of the, sorry, a different build.

00:28:07.420 --> 00:28:08.200
Let me just put it that way.

00:28:08.560 --> 00:28:11.100
That uses a more modern glibc version, for example.

00:28:11.940 --> 00:28:13.640
We do run into other problems with that.

00:28:13.640 --> 00:28:15.440
Like our build matrix is really big.

00:28:15.500 --> 00:28:17.880
We have to split it across multiple GitHub Actions now.

00:28:18.060 --> 00:28:20.600
And so like we need to, we just have like a lot of builds.

00:28:20.720 --> 00:28:24.260
So we'd probably, we're worried about like doubling the size of the build matrix, for example.

00:28:24.460 --> 00:28:26.060
But that's a separate problem.

00:28:26.340 --> 00:28:28.220
But yes, it could actually, it could actually be helpful there.

00:28:28.280 --> 00:28:29.880
Although we don't ship those as wheels today.

00:28:29.880 --> 00:28:30.820
Yeah, that's awesome.

00:28:31.020 --> 00:28:36.960
It's a very interesting angle to think about how much leverage, I mean, this probably does, this is probably something you've thought about.

00:28:37.060 --> 00:28:44.260
But how much leverage you and your team actually have on Python performance by how you control Python Build Standalone.

00:28:44.260 --> 00:28:48.780
This portion of Talk Python To Me is sponsored by Temporal.
+

00:28:48.780 --> 00:28:55.660
Ever since I had Mason Egger on the podcast for episode 515, I've been fascinated with durable workflows in Python.

00:28:56.160 --> 00:29:00.820
That's why I'm thrilled that Temporal has decided to become a podcast sponsor since that episode.

00:29:00.820 --> 00:29:09.400
If you've built background jobs or multi-step workflows, you know how messy things get with retries, timeouts, partial failures, and keeping state consistent.

00:29:10.020 --> 00:29:15.200
I'm sure many of you have written brutal code to keep the workflow moving and to track when you run into problems.

00:29:15.600 --> 00:29:16.580
But it's trickier than that.

00:29:16.800 --> 00:29:21.760
What if you have a long-running workflow and you need to redeploy the app or restart the server while it's running?

00:29:22.320 --> 00:29:25.660
This is where Temporal's open source framework is a game changer.

00:29:25.660 --> 00:29:40.600
You write workflows as normal Python code and Temporal ensures that they execute reliably, even across crashes, restarts, or long-running processes, while handling retries, state, and orchestration for you so you don't have to build and maintain that logic yourself.

00:29:41.300 --> 00:29:46.520
You may be familiar with writing asynchronous code using the async and await keywords in Python.

00:29:46.980 --> 00:29:55.300
Temporal's brilliant programming model leverages the exact same programming model that you are familiar with, but uses it for durability, not just concurrency.

00:29:55.660 --> 00:30:00.160
Imagine writing await workflow.sleep(timedelta(days=30)).

00:30:00.500 --> 00:30:02.440
Yes, seriously, sleep for 30 days.

00:30:02.580 --> 00:30:04.500
Restart the server, deploy new versions of the app.

00:30:04.720 --> 00:30:05.140
That's it.

00:30:05.340 --> 00:30:06.500
Temporal takes care of the rest.
+ +00:30:07.040 --> 00:30:11.520 +Temporal is used by teams at Netflix, Snap, and NVIDIA for critical production systems. + +00:30:12.060 --> 00:30:14.740 +Get started with the open source Python SDK today. + +00:30:15.040 --> 00:30:17.480 +Learn more at talkpython.fm/Temporal. + +00:30:17.800 --> 00:30:19.800 +The link is in your podcast player's show notes. + +00:30:19.940 --> 00:30:22.200 +Thank you to Temporal for supporting the show. + +00:30:22.200 --> 00:30:26.400 +Maybe just tell people, what is the relevance there? + +00:30:26.660 --> 00:30:27.160 +Like, why? + +00:30:27.500 --> 00:30:30.680 +What is Python Build Standalone and how does this even apply to what we're talking about? + +00:30:30.880 --> 00:30:31.400 +Oh, yeah, sure. + +00:30:31.400 --> 00:30:32.520 +I use it every day. + +00:30:32.580 --> 00:30:32.960 +I love it. + +00:30:33.100 --> 00:30:34.480 +A lot of people use it and don't even know. + +00:30:34.560 --> 00:30:41.420 +I mean, it's probably the least, it's the least like public or like user, direct user facing thing that we do. + +00:30:41.420 --> 00:30:47.680 +But we took over maintenance of a project called Python Build Standalone probably like a year ago, maybe a little more. + +00:30:49.180 --> 00:31:04.880 +And that project, the basic idea is like typically when you build CPython, you know, at least like on Linux, for example, a bunch of absolute paths get embedded into the binary, which makes it hard to build like reproducible and relocatable CPythons. + +00:31:05.080 --> 00:31:08.740 +Like it's hard for someone to build a CPython that you can then download and run on your machine. + +00:31:08.740 --> 00:31:11.120 +You typically need to build it on your own machine. + +00:31:12.500 --> 00:31:17.200 +So what this project does is it's sort of like a fork of the CPython build system. + +00:31:17.340 --> 00:31:20.920 +It's like the CPython build system with a bunch of patches and other changes applied on top. 
+ +00:31:21.040 --> 00:31:27.560 +And it makes it so that we can build Pythons that you can just download, unzip and run. + +00:31:27.560 --> 00:31:35.720 +So when you install Python with uv, and these are also used in like Bazel and in a bunch of other tools, we don't actually like build Python from source. + +00:31:35.840 --> 00:31:39.660 +We actually download, unzip and run Python, which just makes it much easier. + +00:31:39.800 --> 00:31:41.180 +It means it's faster. + +00:31:41.460 --> 00:31:45.320 +You don't have to have like the build tool chain on your machine. + +00:31:45.540 --> 00:31:48.380 +You don't run into problems around like failing to build it or anything like that. + +00:31:48.380 --> 00:31:53.240 +But the other thing that's been cool about that project, at least recently, is we've been very focused on performance. + +00:31:53.860 --> 00:32:00.140 +So on actually just trying to make sure that we're distributing, like our goal is to be like the fastest Python distribution. + +00:32:00.140 --> 00:32:07.320 +Like even without changing CPython source code, just changing how we build it and various things that we can tweak there. + +00:32:07.420 --> 00:32:08.800 +And so we've been working on a bunch of benchmarks. + +00:32:08.980 --> 00:32:14.920 +I do think we have the fastest Python now, but we haven't actually published our rigorous benchmark methodology. + +00:32:14.920 --> 00:32:19.520 +So I won't stake my reputation on that claim yet, but we've been very focused on it. + +00:32:19.580 --> 00:32:28.520 +And it's been a cool point of leverage because like we can just, yeah, if we can make Python, you know, if we can put out a Python distribution that's like 10 or 15% faster, you know, just by changing how we build it. + +00:32:28.900 --> 00:32:30.300 +Yeah, it's a big lever for impact. + +00:32:30.700 --> 00:32:31.460 +Yeah, it's a huge lever. 
+

00:32:31.600 --> 00:32:34.940
And I hadn't really thought about it being a lever until Jonathan brought it up.

00:32:35.220 --> 00:32:40.540
But for example, it's not directly impacted by this because we don't ship it, I guess, for the reason that we don't ship it as a wheel.

00:32:40.840 --> 00:32:42.260
Although someday we potentially could.

00:32:42.260 --> 00:32:45.040
Right now it's just, they're just the files that uv knows how to install.

00:32:45.440 --> 00:32:46.940
But it's the same logic at the core.

00:32:47.600 --> 00:32:53.960
Once you start tweaking the packaging of Python packages, the next part you want to tweak is your Python install.

00:32:55.420 --> 00:33:03.540
Well, for example, all of my stuff that runs on the servers, it's all in Docker and it has a base Docker image.

00:33:03.540 --> 00:33:09.780
And one of the very first lines is, you know, install the, use curl plus the shell to install uv.

00:33:09.940 --> 00:33:12.480
The next line is uv venv.

00:33:12.840 --> 00:33:16.040
And that installs Python from Python Build Standalone.

00:33:16.220 --> 00:33:19.560
And then whatever, you need to make an actual app out of that afterwards.

00:33:19.560 --> 00:33:19.900
Right.

00:33:20.220 --> 00:33:22.040
And so how many people are doing that?

00:33:22.140 --> 00:33:29.640
It seems like a huge portion of the world has adopted uv for sort of bootstrapping Python instead of the other way.

00:33:29.700 --> 00:33:31.700
So that's, that's why it's such a big lever, right?

00:33:31.700 --> 00:33:32.100
Yep.

00:33:32.260 --> 00:33:32.700
Yeah, exactly.

00:33:33.260 --> 00:33:33.620
All right.

00:33:33.800 --> 00:33:39.840
As a way to sort of get into the PEPs, Charlie, you mentioned variants.

00:33:39.920 --> 00:33:41.360
You're like, wait, wait, wait, not that variant.

00:33:42.080 --> 00:33:43.700
What variant are we talking about?
+

00:33:43.920 --> 00:33:44.980
That's not that variant.

00:33:45.260 --> 00:33:46.240
What is that variant?

00:33:46.280 --> 00:33:48.760
I guess that we're not talking about in uv or Python Build Standalone.

00:33:49.000 --> 00:33:49.700
Who wants to take that?

00:33:49.820 --> 00:33:50.580
Ralf, do you want to take that?

00:33:50.880 --> 00:33:52.780
I'm not actually sure what the question is here.

00:33:53.020 --> 00:33:55.240
I think you were targeted for the question.

00:33:55.880 --> 00:33:58.560
Yeah, yeah, yeah.

00:33:58.760 --> 00:33:59.340
That's fine.

00:33:59.340 --> 00:34:03.760
I mean, like the, so the PEPs revolve around this concept of wheel variants.

00:34:03.760 --> 00:34:09.420
And the idea is you can have, I'll keep using the word variants.

00:34:09.540 --> 00:34:20.580
You can have different variants, different builds, you know, of a wheel that are intended to be installed based on properties that are known or detected on the machine.

00:34:20.580 --> 00:34:28.580
So, for example, that could be like, okay, what NVIDIA drivers do you have on your machine?

00:34:29.060 --> 00:34:30.560
Like, what are the versions of those drivers?

00:34:30.680 --> 00:34:34.500
Because that then implies things about what versions of the CUDA runtime you can use.

00:34:34.920 --> 00:34:43.540
And so when someone publishes a wheel, maybe that wheel, you know, leverages CUDA and needs to be built against CUDA and needs to be built, you know, in a way that leverages CUDA.

00:34:43.540 --> 00:34:58.040
And so they might publish different variants, effectively just different, you know, slightly different versions of, versions is wrong, different variants, slightly different flavors of that package that are all built against different, you know, different CUDA versions.

00:34:58.820 --> 00:35:01.420
And so we would call those different, you know, different variants.
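The selection idea being described, an installer detecting machine properties and picking the matching build, can be sketched in a few lines of Python. This is purely illustrative: the package name, the property keys, and the matching logic are invented here, and the real file naming and resolution rules are what the draft PEPs define:

```python
# Hypothetical variant-tagged builds of one package version, ordered from
# most specific to least specific.
available = [
    {"file": "mypkg-1.0-cu12.whl", "requires": {"cuda": "12"}},
    {"file": "mypkg-1.0-cu11.whl", "requires": {"cuda": "11"}},
    {"file": "mypkg-1.0-generic.whl", "requires": {}},
]

def pick_variant(builds, detected):
    # Return the first (most specific) build whose requirements the
    # detected machine properties satisfy.
    for b in builds:
        if all(detected.get(k) == v for k, v in b["requires"].items()):
            return b["file"]
    raise LookupError("no compatible build")

# Pretend detection found CUDA 12 drivers on this machine.
print(pick_variant(available, {"cuda": "12"}))  # mypkg-1.0-cu12.whl

# A machine with no GPU at all falls back to the generic build.
print(pick_variant(available, {}))  # mypkg-1.0-generic.whl
```

The key point from the conversation is that this decision moves into the installer, so publishers can ship several flavors of one release without users choosing by hand.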
+

00:35:01.960 --> 00:35:03.680
It's a, it's a, you can correct me.

00:35:03.680 --> 00:35:14.580
The terminology, across what I understand of the packaging space, even outside of Python, if you take variants in general, this is, we try to reuse the terminology that ends up being

00:35:14.580 --> 00:35:20.580
pretty widely adopted in the packaging ecosystem, not Python packaging, the packaging at large.

00:35:21.800 --> 00:35:25.980
This is the, variants is the name that you'll find around for this kind of concept.

00:35:26.460 --> 00:35:40.820
You know, related to that, like, especially in the Astral flavor these days, but also in many other areas, I feel like crates and Rust, what they've done with their packaging system has kind of influenced some of the things we're adopting in the Python world.

00:35:41.240 --> 00:35:46.360
Has anything from the Rust world influenced these PEPs that we're about to talk about?

00:35:46.620 --> 00:35:49.380
Well, crates are source distributions now, mostly.

00:35:49.940 --> 00:35:50.080
Yeah.

00:35:50.420 --> 00:35:53.740
Well, in this case, we're talking about actually binary distribution.

00:35:54.060 --> 00:35:54.220
Yeah.

00:35:54.240 --> 00:35:54.380
Yeah.

00:35:54.380 --> 00:35:54.540
Yeah.

00:35:54.540 --> 00:35:55.080
So not really.

00:35:55.240 --> 00:35:55.400
Okay.

00:35:56.000 --> 00:35:56.740
But in a sense.

00:35:56.980 --> 00:35:58.180
That's actually interesting, right?

00:35:58.480 --> 00:35:58.700
Yes.

00:35:58.700 --> 00:35:58.820
Yeah.

00:35:59.100 --> 00:36:07.040
Because a lot of the best packaging systems, you know, whether it's Rust or, you know, Nix, they start from source, right?

00:36:07.080 --> 00:36:09.300
And they know exactly what's, you know, in the box.

00:36:09.640 --> 00:36:12.540
And then binaries are kind of like an optimization, right?
+

00:36:12.560 --> 00:36:19.080
It's like, you have a thing that you know exactly what is the binary and you can check like, oh, I don't have to build this thing from source.

00:36:19.200 --> 00:36:20.600
I can grab a binary somewhere.

00:36:20.880 --> 00:36:20.980
Right.

00:36:20.980 --> 00:36:22.900
Python packaging is absolutely not like that.

00:36:23.080 --> 00:36:28.020
Like if you build a wheel and you have an sdist, I mean, you have no idea if they're the same thing.

00:36:28.140 --> 00:36:35.920
If you, you know, you cannot rebuild the wheel from the sdist unless, you know, you use very, very well predefined constraints.

00:36:36.300 --> 00:36:36.320
Yeah.

00:36:36.320 --> 00:36:36.500
Yeah.

00:36:36.500 --> 00:36:39.960
I hadn't really thought about that either, but that is an interesting juxtaposition.

00:36:40.240 --> 00:36:46.400
Like the binary stuff that is all binary is shipping as source, but the interpreted stuff is shipping as binary.

00:36:46.400 --> 00:36:57.640
And I think part of the reason, or maybe the main reason is if we're talking about binary stuff for Rust, well, it's all Rust that's compiled, but for Python, it's this mix, this

00:36:57.640 --> 00:37:03.660
crazy mix of all these different libraries that are not, none of them are Python, but they're all binary in the end.

00:37:03.720 --> 00:37:10.340
And so you've got to get around the fact like, well, I don't have a Fortran and a Haskell compiler, so I can't run this project, you know?

00:37:10.340 --> 00:37:15.300
There's something quite amazing in Python in general, which is called CFFI.

00:37:15.500 --> 00:37:23.300
So the C foreign function interface, which essentially allows you to build any sort of application you want in whatever language.

00:37:23.460 --> 00:37:33.260
As long as you're compatible with the C FFI standard, you can call it from Python and it's incredible and amazingly useful.
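The C foreign function interface idea mentioned here, calling any compiled library with a C-compatible ABI from Python, can be shown in a tiny sketch. The speaker refers to CFFI; this example uses the standard library's ctypes module, which demonstrates the same concept, loading the system C math library and calling its cos() function (the libm.so.6 fallback name assumes a typical Linux system):

```python
import ctypes
import ctypes.util

# Locate and load the C math library, then describe cos()'s C signature
# so ctypes converts arguments and the return value correctly.
libm = ctypes.CDLL(ctypes.util.find_library("m") or "libm.so.6")
libm.cos.argtypes = [ctypes.c_double]
libm.cos.restype = ctypes.c_double

print(libm.cos(0.0))  # 1.0
```

The same mechanism is what lets Fortran, Rust, or C++ code reach Python users, as long as it exposes a C-style interface.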
+

00:37:33.260 --> 00:37:46.280
But to come back on what Ralf was saying, a lot of the design actually for wheel variants has been inspired by a system that is called Spack that was designed for supercomputers.

00:37:47.400 --> 00:38:00.440
And we use this, especially around the design of CPU variants, to kind of get a lot of inspiration around a package called archspec that is just, from my perspective, pure brilliance in some of its design.

00:38:02.360 --> 00:38:07.800
Just my words, but in my opinion, I really think they got the thing right.

00:38:08.800 --> 00:38:10.360
It's just beautifully designed.

00:38:10.360 --> 00:38:15.040
Everything is static and JSON-ified and it's extremely easy to scale and maintain.

00:38:15.820 --> 00:38:29.240
But yes, if you take all the kind of systems designed to support the most specific deployment scenarios like Spack, like Nix, or even in some cases, Cargo, well, they mostly ship sources

00:38:29.240 --> 00:38:34.580
to go around this variant problem because that allows you to control the entire build chain essentially.

00:38:35.300 --> 00:38:44.600
And in some cases, maybe Ralf can talk about it, but Conda Forge also kind of takes an approach that is similar to Nix to kind of go around these issues a little bit.

00:38:44.800 --> 00:38:46.920
Maybe Ralf, if you want to talk a little bit about that.

00:38:47.280 --> 00:38:51.720
Not quite, because Conda and Conda Forge don't do source distributions at all.

00:38:52.080 --> 00:38:54.960
They just take a release and they build binaries.

00:38:54.960 --> 00:38:57.500
And if there are no binaries, you can't install it.

00:38:58.080 --> 00:39:00.220
But yeah, I would say that's a good point, right?

00:39:00.240 --> 00:39:02.380
We have people that worked on all these systems.

00:39:02.640 --> 00:39:06.980
Like one of Jonathan's colleagues at NVIDIA, Mike Sarahan, used to work on Conda.
+

00:39:07.200 --> 00:39:08.820
I contribute to Conda Forge as well.

00:39:09.000 --> 00:39:14.220
And so we have some ideas that originally came from Conda, some that came from Spack.

00:39:14.500 --> 00:39:22.660
And the end result is nothing like, not exactly like any of those systems, but it takes some of the best aspects of them to enhance Python packaging.

00:39:22.660 --> 00:39:24.040
Not reinventing the wheel.

00:39:24.340 --> 00:39:26.020
I mean, maybe, but not too much.

00:39:27.600 --> 00:39:28.580
Yeah, not too much.

00:39:29.140 --> 00:39:35.220
But it's kind of, it's cool because I think, like, I feel like a lot of this work really got kicked off.

00:39:35.340 --> 00:39:36.900
We did an in-person summit.

00:39:37.900 --> 00:39:41.000
And I honestly can't remember when that was because my mind is such a blur.

00:39:41.000 --> 00:39:41.520
March 2025.

00:39:42.660 --> 00:39:43.020
Thank you.

00:39:43.100 --> 00:39:44.140
Okay, so it was about a year ago.

00:39:44.480 --> 00:39:46.320
And there's a bunch of notes about this.

00:39:46.520 --> 00:39:56.420
And we had people from probably like, I don't know, I'd have to guess 20 different companies, maybe more, all in person for a day, just talking about these problems.

00:39:56.420 --> 00:40:07.160
And a bunch of people presented on their own open source projects and how they intersect with, like, we had people from PyTorch, people from the JAX team, just talking about like, how, what their concerns are, like, what's working well for them, what's not.

00:40:07.520 --> 00:40:14.560
And so, you know, similarly to how we've, I think a lot of the design has really been influenced by like, what are other designs?

00:40:14.560 --> 00:40:16.680
What's the prior art and like, what's working well?
+

00:40:17.640 --> 00:40:23.780
You know, a lot of it was also informed by like, just talking to a bunch of people across the industry and understanding like, what their concerns are.

00:40:24.300 --> 00:40:31.060
And so, at least from my perspective, having not, honestly, by calendar time, I have not been involved in Python that long.

00:40:31.140 --> 00:40:38.940
But it's been like, definitely the most like cross-company, cross-project, cross-organization effort I've been involved in by a lot.

00:40:38.940 --> 00:40:45.920
We tried to replicate a model that I really like in the Python community that was Faster CPython.

00:40:46.260 --> 00:40:51.240
We tried to philosophically create the packaging child of Faster CPython.

00:40:51.600 --> 00:40:54.200
But, and that's how we created WheelNext.

00:40:54.540 --> 00:41:05.740
It was all the amazing work that the Faster CPython community did on the CPython side, and kind of creating the same synergy, but around Python packaging.

00:41:05.960 --> 00:41:06.780
And that's why it was.

00:41:06.780 --> 00:41:10.920
I would almost say it's even, you know, quite a bit more diverse.

00:41:11.080 --> 00:41:18.640
At least my understanding is Faster CPython is primarily like funded and created by Microsoft, and it kind of turned into a community thing.

00:41:18.840 --> 00:41:21.140
But like, all the money came from Microsoft, I think.

00:41:21.360 --> 00:41:25.420
I think the majority of the people were working in a team inside Microsoft, at least.

00:41:25.840 --> 00:41:30.500
And here, we've got NVIDIA, Meta, the PyTorch folks at Meta.

00:41:30.820 --> 00:41:35.360
We got some contributions from AMD and Intel, and then Astral, Quansight.

00:41:35.360 --> 00:41:43.260
A large amount of the time that we've been able to spend at Quansight came from funding from Red Hat, who came with their own problem sets.
+

00:41:43.780 --> 00:41:47.140
And, you know, so, and that's just the most prominent contributors.

00:41:47.340 --> 00:41:52.620
So there's like at least 10 companies that started investing in this, because it solves so many problems.

00:41:53.000 --> 00:41:53.780
Yeah, that's really encouraging as well.

00:41:53.780 --> 00:41:56.860
On the left side, you'll see a section called Who We Are.

00:41:57.500 --> 00:41:59.540
Yeah, so I pulled up this project, WheelNext.

00:42:00.260 --> 00:42:02.200
And, you know, Ralf, this is yours?

00:42:02.200 --> 00:42:03.100
Yeah, who are we?

00:42:03.100 --> 00:42:08.600
And also the names of the open source projects that contributed time and expertise.

00:42:09.480 --> 00:42:18.820
Yeah, AMD, Anaconda, Aprio, Astral, Google, Huawei, Intel, Lap, Lab, Meta, NVIDIA, Preferred Networks, Quansight, and Red Hat.

00:42:18.940 --> 00:42:21.180
That's a bit of a group working on this.

00:42:21.180 --> 00:42:32.180
And you can see just above all the different open source projects that different OSS and lead maintainers have contributed time and energy to kind of try to make this move forward.

00:42:32.300 --> 00:42:34.040
So it is quite a few people.

00:42:35.220 --> 00:42:35.640
Yeah, yeah.

00:42:35.720 --> 00:42:38.540
Most notably, maybe CuPy and PyTorch, possibly.

00:42:38.900 --> 00:42:39.460
I mean, they're all...

00:42:39.460 --> 00:42:52.400
Maybe one company that is not too well known, undeservingly, because they should be, which is Probabl at the bottom that you mentioned, which is essentially the support company behind scikit-learn.

00:42:52.400 --> 00:42:59.460
So if people don't know it, Probabl is essentially representing scikit-learn.

00:43:00.920 --> 00:43:03.720
Yeah, so this is wheelnext.dev.

00:43:03.780 --> 00:43:08.340
This is basically the website for the group, the working group, something like that.

00:43:08.340 --> 00:43:08.660
Yep.
+

00:43:09.180 --> 00:43:13.100
We try to leave our notes, our thinking, our drafts.

00:43:13.440 --> 00:43:18.480
One aspect that I really like on the work that we did is that it kind of felt like a startup.

00:43:18.480 --> 00:43:23.200
We were making a mock-up and iterating very fast and getting feedback.

00:43:23.600 --> 00:43:24.660
And this, I like this.

00:43:24.720 --> 00:43:25.340
I don't like this.

00:43:25.400 --> 00:43:26.800
I don't like this, change it.

00:43:27.060 --> 00:43:34.500
I worked really closely with two people, one from Quansight, one from Astral, Konstantin and Michał.

00:43:34.900 --> 00:43:37.560
And we did so many hours of work.

00:43:37.560 --> 00:43:52.060
So many different prototypes, iterating, exposing the work to people, collecting feedback, adjusting, and repeating the cycle so many times until we finally got to something that we thought was reasonable.

00:43:52.580 --> 00:43:54.560
And that's when we started to write the PEPs.

00:43:54.840 --> 00:43:56.340
But that process took us a year.

00:43:57.560 --> 00:43:58.020
All right.

00:43:58.020 --> 00:44:00.320
Well, we should probably jump into the PEPs.

00:44:00.320 --> 00:44:04.900
And I'll tell you what, you all have quite the authorship attribution here.

00:44:04.960 --> 00:44:10.340
But also, I believe, correct me if I'm wrong, that this PEP is notable in that it's the longest PEP ever.

00:44:10.540 --> 00:44:11.340
Something like that, right?

00:44:12.080 --> 00:44:12.240
Yeah.

00:44:12.240 --> 00:44:14.000
I don't know if it's an achievement to be proud of.

00:44:17.340 --> 00:44:19.560
It's the most powerful PEP ever.

00:44:19.720 --> 00:44:20.120
Yes.

00:44:20.280 --> 00:44:20.600
No, no.

00:44:20.920 --> 00:44:21.780
It's a super PEP.
+

00:44:21.780 --> 00:44:31.340
So much so that we're talking about PEP 817, wheel variants, which is the variant thing that we actually are talking about, not the other variants, beyond platform tags.

00:44:31.340 --> 00:44:38.000
But then so much so that it actually got kicked to the curb for like, well, what is the minimal viable PEP of this PEP?

00:44:38.260 --> 00:44:39.760
So we can take it in steps.

00:44:40.700 --> 00:44:52.160
And Jonathan, you just told me really good news that PEP, so you spun off this other PEP, PEP 825, wheel variants package format, which is smaller, which still has a significant authorship.

00:44:52.660 --> 00:44:56.320
But that this was just, it says draft, but is that true?

00:44:56.320 --> 00:44:56.440
Yeah.

00:44:56.960 --> 00:44:57.120
Yeah.

00:44:57.260 --> 00:45:04.060
So PEPs, maybe Ralf, you want to discuss a little about what's the process for a PEP? I think that's important.

00:45:04.440 --> 00:45:04.460
Yeah.

00:45:04.460 --> 00:45:08.120
So when you submit a PEP, it first, you know, you submit it up on GitHub.

00:45:08.340 --> 00:45:17.340
And then there's a group of folks called the PEP editors who basically just edit, you know, they review it for clarity, you know, language, consistency with other PEPs and so on.

00:45:17.400 --> 00:45:20.360
So they don't really look at the content of what you're proposing.

00:45:21.180 --> 00:45:24.300
So it's just, as long as it's clear, they're happy, you merge it in.

00:45:24.300 --> 00:45:28.640
But because the first PEP was already so long, that process took like over a month already.

00:45:29.340 --> 00:45:31.700
But at that point, it's merged as draft.

00:45:31.900 --> 00:45:37.820
And then you go to the Python packaging Discourse where you say, okay, here's our PEP.

00:45:38.160 --> 00:45:40.800
You know, now please let's start the actual community review.
+

00:45:41.100 --> 00:45:44.360
And then basically anybody with an opinion can weigh in.

00:45:44.840 --> 00:45:46.580
And it's just, it's a forum.

00:45:47.340 --> 00:45:49.020
They're not, it's not even a threaded forum.

00:45:49.020 --> 00:45:53.580
So it's just one long thread of comments, which tends to make it like a little challenging.

00:45:54.000 --> 00:45:59.040
You know, the more complex the topic gets, the harder it is to make sense of this conversation.

00:45:59.480 --> 00:46:03.660
It's really hard to have a threaded multi-component conversation.

00:46:04.060 --> 00:46:04.660
It is.

00:46:04.960 --> 00:46:05.320
Exactly.

00:46:05.320 --> 00:46:09.140
So that's one of the reasons it's now split into smaller parts.

00:46:09.260 --> 00:46:12.100
So you can at least have separate threads about different topics, right?

00:46:12.240 --> 00:46:17.300
So, and because especially not all of the parts of the design apply to everybody.

00:46:17.300 --> 00:46:26.680
When we're talking about installers, we want to hear primarily from the authors of uv and pip, Poetry, Hatch, PDM.

00:46:26.680 --> 00:46:36.560
But if we're talking about how do you build a wheel, well, we have to talk primarily to setuptools, scikit-build-core, meson-python, the build backends.

00:46:36.980 --> 00:46:39.220
And, you know, the index server the same, right?

00:46:39.240 --> 00:46:41.940
You want to know that the PyPI maintainers are happy.

00:46:42.740 --> 00:46:50.440
So that's why, you know, organizing this review and chopping up such a complex PEP, it's still going to be really hard to get the right amount of feedback.

00:46:50.440 --> 00:46:55.880
But we now have like the first PEP, you know, the first one merged in draft status.

00:46:56.380 --> 00:47:01.920
So it's going to only be accepted once the whole community review process is done.
+

00:47:02.120 --> 00:47:09.720
And probably what will happen is it's going to be provisionally accepted only, because we know there's like three more PEPs coming for the other parts.

00:47:10.160 --> 00:47:15.460
And eventually, like the, you know, you want all four to be, you know, working and accepted.

00:47:15.460 --> 00:47:26.560
Like, you know, we now have prototypes, but, you know, you want the prototypes for the final design and have like, you know, the tool authors say like, yeah, this works for us, before you really go from provisional to actually accepted.

00:47:26.740 --> 00:47:26.980
Amazing.

00:47:27.220 --> 00:47:33.020
So this is part of what I get at when I said at the beginning that this touches like every part of the packaging stack.

00:47:33.120 --> 00:47:39.180
There's just like, it's very hard to break it up into, I mean, that's what we're trying to do in some sense.

00:47:39.180 --> 00:47:41.000
But like, it's from the start, it's been hard.

00:47:41.180 --> 00:47:53.240
It's hard to, there aren't necessarily super great cut points because it does affect how you build packages, how you publish them, like how they get hosted and served from the registry, how installers like look at them and understand them.

00:47:53.280 --> 00:47:58.660
All of those things, like marker syntax, all of that stuff gets impacted in different ways.

00:47:59.280 --> 00:47:59.980
It's very funny.

00:47:59.980 --> 00:48:03.280
We were prototyping this for a year.

00:48:03.400 --> 00:48:06.740
We ended up pretty much forking the entire ecosystem.

00:48:06.740 --> 00:48:20.600
pip got forked, uv got forked, warehouse got forked, packaging got forked, like absolutely every package in the ecosystem ended up being forked, because we needed to test our implementation.

00:48:21.220 --> 00:48:21.960
And we needed to verify.

00:48:21.960 --> 00:48:24.240
The goal, of course, is to unfork those things.
+
+00:48:24.320 --> 00:48:24.580
+Yes.
+
+00:48:24.580 --> 00:48:25.580
+Like over time.
+
+00:48:25.580 --> 00:48:36.020
+It's a re-merge path, but we needed to have a playground to be able to experiment and see how the concept that we were developing was functioning in pip.
+
+00:48:36.020 --> 00:48:38.920
+And then in packaging, but then also in setuptools.
+
+00:48:39.360 --> 00:48:41.640
+And then in scikit-build-core.
+
+00:48:42.040 --> 00:48:43.100
+And then in meson-python.
+
+00:48:43.260 --> 00:48:51.180
+And it just keeps spreading essentially to every single corner of the packaging, installation, and distribution aspect of Python.
+
+00:48:51.600 --> 00:48:52.760
+So that was pretty funny.
+
+00:48:53.320 --> 00:48:53.500
+Yeah.
+
+00:48:53.780 --> 00:48:56.120
+What about your ecosystem?
+
+00:48:56.120 --> 00:49:06.040
+I think you have a fork in uv, or I guess technically it's just a branch that Konstantin on our team, who's here on the PEP, has been, who's been super involved.
+
+00:49:06.240 --> 00:49:06.720
+Oh, thanks.
+
+00:49:07.020 --> 00:49:12.740
+Who's been super involved, you know, throughout and done a ton of work on basically implementing the standard in uv.
+
+00:49:12.740 --> 00:49:22.080
+So we have like a working implementation that we've used to, yeah, you can actually install it from, you know, we basically distribute it at a slightly different URL.
+
+00:49:22.360 --> 00:49:24.180
+So you can install it and test it.
+
+00:49:24.580 --> 00:49:29.720
+But yeah, that's been, that fork has evolved a lot, or that branch has evolved a lot.
+
+00:49:29.820 --> 00:49:35.700
+And it's been a lot of work to, I mean, it's been incredibly helpful for the design process for us to understand like what's hard, what's easy.
+
+00:49:35.700 --> 00:49:40.200
+And then I also think it's important for PEPs just to have like working implementations too.
+
+00:49:40.360 --> 00:49:46.900
+And I mean, a lot of people agree, that's not a novel point, but that's been one of the goals too, is to show what it's like in practice and that it actually works.
+
+00:49:47.320 --> 00:49:48.600
+So if people want to play around with this.
+
+00:49:48.720 --> 00:49:52.260
+An easy way might be to try to use this fork.
+
+00:49:52.500 --> 00:49:56.020
+We put in a lot of work to actually make it work, so go ahead and try it.
+
+00:49:56.020 --> 00:50:05.260
+Because I think it's, I personally have a lot of like admiration for the work done in free-threaded Python, especially the PEP.
+
+00:50:05.580 --> 00:50:15.300
+And I think Sam Gross, who is the main author, managed to make a significant amount of progress as he was coming up with prototypes that say, it's not just my word.
+
+00:50:15.620 --> 00:50:16.660
+Let me show it to you.
+
+00:50:16.780 --> 00:50:17.240
+It works.
+
+00:50:18.540 --> 00:50:22.440
+And there was so much skepticism around that idea of free-threaded Python.
+
+00:50:22.720 --> 00:50:24.960
+He had to show, not tell.
+
+00:50:24.960 --> 00:50:34.100
+But I think if we didn't do the work similarly on variant-enabled wheels, people would have told us, oh, well, resolution is too slow.
+
+00:50:34.260 --> 00:50:36.180
+It's going to slow down installers too much.
+
+00:50:36.820 --> 00:50:40.520
+And Astral is probably one of the installers that care the most about speed.
+
+00:50:41.320 --> 00:50:47.860
+So we needed to convince both ourselves, but also Charlie and his team, that, hey, it's not going to slow down anything.
+
+00:50:47.860 --> 00:50:48.020
+Yeah.
+
+00:50:48.120 --> 00:50:49.840
+And we had plenty of feedback on that front too.
+
+00:50:49.920 --> 00:50:52.500
+Well, during the design, we were like, no, this is going to be too slow.
+
+00:50:52.640 --> 00:50:54.800
+Or like, this is like a better way to do it, et cetera.
+ +00:50:54.960 --> 00:50:57.420 +But, but I like, I mean, I like this little snippet. + +00:50:57.420 --> 00:51:02.020 +Cause like, this is basically like, if you haven't felt this pain, it might not be meaningful to you. + +00:51:02.020 --> 00:51:06.320 +But if you've like worked with PyTorch, like this is kind of like, this is what we want to enable. + +00:51:06.320 --> 00:51:06.660 +Right. + +00:51:06.700 --> 00:51:12.880 +Is like, you don't, you don't have to like configure a specific index URL that like captures the CUDA variant or anything like that. + +00:51:12.880 --> 00:51:15.180 +Like you just say, hey, install Torch. + +00:51:15.180 --> 00:51:19.580 +And then in this variant enabled build, uv would, it would go look at Torch. + +00:51:19.820 --> 00:51:25.600 +It would see, okay, Torch, you know, it has different variants for different CUDA versions. + +00:51:25.920 --> 00:51:29.200 +And here's how I inspect, you know, what CUDA version I should use on your machine. + +00:51:29.200 --> 00:51:32.600 +And then it would pick out the right version based on what's supported by the GPU that's running. + +00:51:32.600 --> 00:51:39.100 +Like that should all happen and users shouldn't have to think about configuring it effectively is like what we were, what we have been working towards. + +00:51:39.520 --> 00:51:44.680 +And in the future, the first line doesn't exist because right now the first line is just here to install this variant enabled. + +00:51:44.760 --> 00:51:44.920 +Yeah. + +00:51:44.940 --> 00:51:46.080 +That just installs the fork. + +00:51:46.360 --> 00:51:46.500 +Yeah. + +00:51:46.700 --> 00:51:50.920 +And for people listening and not watching what they mean by this, there's three lines here to say how to use this. + +00:51:51.140 --> 00:51:51.800 +It says curl. + +00:51:51.800 --> 00:51:52.100 +I'm sorry. + +00:51:52.340 --> 00:51:52.700 +Basically. + +00:51:52.940 --> 00:51:53.320 +Yeah, no worries. 
+
+00:51:53.380 --> 00:51:59.460
+It's the install statement for uv, which is typical, except for that it overrides the.
+
+00:51:59.580 --> 00:52:00.480
+The download URL.
+
+00:52:00.900 --> 00:52:01.840
+The download URL.
+
+00:52:01.840 --> 00:52:05.420
+It's a different URL, which is wheelnext.astral.sh.
+
+00:52:05.560 --> 00:52:09.980
+We serve, we distribute a separate variant-enabled, experimental, quote-unquote prototype build.
+
+00:52:10.160 --> 00:52:10.540
+Right.
+
+00:52:10.600 --> 00:52:16.380
+And then you just create a virtual environment, uv venv, and then you just uv pip install like normal, but it handles this.
+
+00:52:16.380 --> 00:52:25.800
+And, you know, Charlie, we spoke, I think on the pyx episode, about just how large some of these things are, like PyTorch and others that are compiled there.
+
+00:52:26.060 --> 00:52:30.140
+You can't just come download everything, all the variations, into one wheel.
+
+00:52:30.140 --> 00:52:32.240
+I mean, I guess you could, but it'd be crazy, right?
+
+00:52:32.620 --> 00:52:33.980
+That's actually a big benefit, right?
+
+00:52:34.140 --> 00:52:37.700
+Like right now you go to PyPI, you download the PyTorch wheel.
+
+00:52:37.800 --> 00:52:39.880
+It'll be about 900 megabytes.
+
+00:52:40.240 --> 00:52:41.380
+You could make it small.
+
+00:52:41.620 --> 00:52:44.360
+You know, part of the reason it's so large is, again, these fat binaries, right?
+
+00:52:44.360 --> 00:52:46.920
+Like the NumPy ones are like a few megabytes.
+
+00:52:47.140 --> 00:52:52.260
+The PyTorch ones have a bunch of CUDA inside, like for five or six different CUDA architectures.
+
+00:52:52.480 --> 00:52:54.420
+And, you know, it bloats very, very quickly.
+
+00:52:54.620 --> 00:52:59.660
+And actually the PyTorch team has to try incredibly hard to stay under one gigabyte.
+
+00:52:59.660 --> 00:53:09.680
+If we have variants, we can just slim it down to one CUDA architecture per wheel, you know, so you can go down to like, you know, 200 megabytes or so, 250 maybe.
+
+00:53:09.860 --> 00:53:13.840
+But it's way better for, you know, both for index servers, it's better for users.
+
+00:53:14.580 --> 00:53:16.120
+It's going to be pretty slow too.
+
+00:53:16.120 --> 00:53:22.560
+The only thing it's not better for is CI servers that have to build all these different things if you start sharding.
+
+00:53:23.300 --> 00:53:26.220
+But that's a one-time cost that at the end ends up being.
+
+00:53:26.560 --> 00:53:33.040
+It's much better to have a slight increase one time and a massive decrease at scale, essentially.
+
+00:53:33.380 --> 00:53:36.660
+You build it once, it gets installed a million times.
+
+00:53:36.840 --> 00:53:38.420
+That's a massive difference.
+
+00:53:38.880 --> 00:53:43.300
+And, you know, it's also better for the warehouse folks like PyPI.
+
+00:53:43.300 --> 00:53:49.560
+And it's easy for people to just assume pip install, uv pip install, that sort of stuff is going to work.
+
+00:53:49.860 --> 00:53:55.160
+But the cost of just the bandwidth in that infrastructure is astronomical, which is crazy.
+
+00:53:55.460 --> 00:53:58.700
+So this is going to be a major benefit for bandwidth.
+
+00:53:59.060 --> 00:54:01.240
+Yeah, and like also like install speed.
+
+00:54:01.980 --> 00:54:07.200
+You'll also benefit from that because you're no longer downloading as much stuff to actually install PyTorch.
+
+00:54:07.520 --> 00:54:12.020
+I mean, if you use uv, it's got some really good caching and it's pretty quick.
+
+00:54:12.020 --> 00:54:14.640
+Oh, but it doesn't multiply your bandwidth by magic.
+
+00:54:16.820 --> 00:54:17.620
+I wish.
+
+00:54:18.060 --> 00:54:19.740
+Charlie, if you find a solution to that.
+
+00:54:20.840 --> 00:54:21.920
+I haven't yet.
+
+00:54:22.180 --> 00:54:26.860
+But yeah, if you're downloading Torch and all the NVIDIA, all the CUDA stuff, it's, yeah.
+
+00:54:27.040 --> 00:54:27.560
+It's hefty.
+
+00:54:27.780 --> 00:54:30.560
+It's a large number of megabytes.
+
+00:54:31.040 --> 00:54:34.260
+Let's talk real quick about the pypackaging-native guide.
+
+00:54:34.360 --> 00:54:37.380
+And then I want to get an update on pyx real quick before we go.
+
+00:54:37.460 --> 00:54:38.920
+So, Ralf, this is your project, right?
+
+00:54:39.060 --> 00:54:39.700
+Tell us about this.
+
+00:54:39.800 --> 00:54:41.020
+I'll be sure to show it.
+
+00:54:41.020 --> 00:54:51.220
+Okay, so I've been watching discussions about some of the topics we've talked about in this episode, you know, since 2010 or so in Python packaging.
+
+00:54:51.220 --> 00:54:59.840
+And even back then, long before we had wheels, you know, NumPy, for example, had different .exe installers that we would upload to PyPI.
+
+00:54:59.840 --> 00:55:04.120
+And, like, there would be one named underscore sse2, one underscore sse3.
+
+00:55:04.300 --> 00:55:08.580
+And, like, users had to pick just the right .exe and install it on their Windows machine.
+
+00:55:09.000 --> 00:55:09.140
+What?
+
+00:55:09.640 --> 00:55:10.040
+Wow.
+
+00:55:10.040 --> 00:55:10.820
+Oh, I had no idea.
+
+00:55:11.080 --> 00:55:11.240
+Okay.
+
+00:55:11.860 --> 00:55:14.100
+Yes, it was not fun.
+
+00:55:14.400 --> 00:55:21.300
+And actually, this was by far the hardest thing when I became NumPy release manager, because we had to build these things on Linux under Wine.
+
+00:55:21.300 --> 00:55:24.300
+And there were no instructions and there were really janky scripts.
+
+00:55:24.420 --> 00:55:26.680
+So, it took me three months to get the first release out.
+
+00:55:27.800 --> 00:55:34.060
+But, yeah, so we, I saw all these discussions about, you know, this was sse2 and sse3.
+
+00:55:34.060 --> 00:55:43.660
+And, like, you know, the pip authors and, you know, most of the people who work with pure Python, like, you know, the DevOps folks, the, you know, web framework folks, they had no idea about this.
+
+00:55:44.020 --> 00:55:52.120
+And usually, these conversations went in circles because when you explain something to one person, the next person would come in and, like, you know, these endless mailing list threads.
+
+00:55:52.380 --> 00:55:53.380
+They would never go anywhere.
+
+00:55:53.740 --> 00:55:59.600
+So, after, you know, seeing that for 12, 13 years or so, I, you know, finally got tired of that.
+
+00:55:59.680 --> 00:56:03.240
+And I thought, I'm going to write a reference site that explains the problem.
+
+00:56:03.240 --> 00:56:05.900
+I don't want to propose any solutions, but just explain the problem.
+
+00:56:06.000 --> 00:56:18.400
+So, the next time someone starts a new conversation about, you know, SIMD extensions or about GPUs or, you know, about some of the issues with mixing, you know, source and binary distributions.
+
+00:56:18.720 --> 00:56:19.540
+Just link to this site.
+
+00:56:19.660 --> 00:56:24.340
+Like, please use that as our best, you know, approach at trying to summarize the problem, you know.
+
+00:56:24.560 --> 00:56:26.960
+So, we have a baseline to start talking about solutions.
+
+00:56:27.360 --> 00:56:30.780
+And I think, you know, Jonathan, you know, is one of the people who saw this.
+
+00:56:30.780 --> 00:56:32.140
+I think a lot of people read this.
+
+00:56:32.140 --> 00:56:38.560
+But it was a nice basis to, you know, just point at this as, like, there are your problem descriptions.
+
+00:56:38.920 --> 00:56:45.940
+And, you know, for the GPU part, like, NVIDIA folks really helped to make sure that all the explanations of the problems were correct.
+
+00:56:46.360 --> 00:56:51.140
+So, when we started WheelNext, we could just start talking about, like, okay, what are the solutions here?
+
+00:56:51.380 --> 00:56:54.080
+This website is absolutely incredible.
+
+00:56:54.600 --> 00:56:55.360
+It's amazing.
+
+00:56:55.560 --> 00:56:56.380
+Yeah, it's amazing.
+
+00:56:56.380 --> 00:57:01.420
+Thanks to the work that Ralf and every contributor to this website have made.
+
+00:57:01.620 --> 00:57:07.980
+This is by far the best explanation anywhere on the internet to all these packaging issues.
+
+00:57:08.320 --> 00:57:12.220
+And I really like the perspective that Ralf has taken, which is don't state the solution.
+
+00:57:12.360 --> 00:57:14.720
+Just focus on stating the problem very clearly.
+
+00:57:14.720 --> 00:57:21.760
+And then with WheelNext, we tried to take the exact flip side of the coin, which is don't focus on the problem.
+
+00:57:21.860 --> 00:57:22.780
+It's already explained.
+
+00:57:23.020 --> 00:57:26.480
+Just focus on proposing one solution to some of the problems.
+
+00:57:26.780 --> 00:57:28.400
+And this is how we created WheelNext.
+
+00:57:28.560 --> 00:57:29.140
+I love it.
+
+00:57:29.380 --> 00:57:37.760
+You know, one of the big problems, challenges, I guess, is if you don't fully understand the problem space, you could be debating two different things.
+
+00:57:37.760 --> 00:57:41.440
+And one person sees a really important angle, the other person doesn't even see that angle.
+
+00:57:41.960 --> 00:57:45.040
+They have a different perspective that they're arguing for, optimizing for.
+
+00:57:45.240 --> 00:57:48.100
+And so, yeah, it's sort of a little bit like the WheelNext stuff.
+
+00:57:48.180 --> 00:57:52.060
+Like, let's get everyone involved and see all the angles and then discuss it, right?
+
+00:57:52.240 --> 00:57:52.520
+Exactly.
+
+00:57:52.960 --> 00:57:57.440
+Well, you know the saying, a problem well stated is a problem half solved.
+ +00:57:57.780 --> 00:58:01.780 +So, this is exactly what we are trying to say. + +00:58:02.020 --> 00:58:02.540 +I love it. + +00:58:02.540 --> 00:58:10.200 +All right, I want to get a quick update on pyx since I feel like, Charlie, you're right in the middle of this. + +00:58:10.380 --> 00:58:13.600 +I know pyx was looking to solve some of these problems as well. + +00:58:14.220 --> 00:58:19.560 +Give us the elevator pitch and just, we have a whole episode on this from, I don't know, six months ago or something. + +00:58:19.920 --> 00:58:24.640 +But, yeah, give us the, what's the situation here and does this change things on how you're handling it and make things easier? + +00:58:24.980 --> 00:58:26.040 +Yeah, yeah, yeah, for sure. + +00:58:26.040 --> 00:58:32.460 +So, like, yeah, pyx is our hosted package registry and it's in beta right now. + +00:58:32.560 --> 00:58:35.520 +So, we're live with a bunch of great customers. + +00:58:37.420 --> 00:58:51.320 +The goal of pyx is basically to enable us to solve, like, more of the packaging problems that we see in the uv issue tracker by having our own registry that we think is well implemented and solves problems that we see that other registries don't really solve. + +00:58:51.320 --> 00:58:59.820 +So, like, basically from the start, the way that we've approached the wheel, these, like, problems around the GPU stuff is from, like, two perspectives. + +00:59:00.820 --> 00:59:07.100 +And in pyx, we're really just focused, in terms of how it overlaps with wheel variants, we're really just focused on the GPU part. + +00:59:07.360 --> 00:59:13.940 +But the way that we've approached it has basically been try to push the standards forward as much as we can. + +00:59:14.200 --> 00:59:15.960 +And that's what we've been doing in this effort. + +00:59:15.960 --> 00:59:21.040 +And then simultaneously try to figure out how we can help users, like, until the standards change. 
+
+00:59:21.720 --> 00:59:32.580
+And so, pyx has mostly been, has more been in that second camp of, like, assuming the standards don't change because we don't want to, we don't want to, like, unilaterally start changing a bunch of things, like, without going through the process.
+
+00:59:32.740 --> 00:59:36.520
+How can we make the world, like, a little bit easier for people who are working with this kind of stuff?
+
+00:59:36.520 --> 00:59:46.020
+So, for example, like, in pyx, we take a lot of packages that are, like, PyTorch extensions or need to be built against CUDA, and we, like, build those.
+
+00:59:46.300 --> 00:59:53.660
+Like, we build them across a wide range of, like, CUDA versions, PyTorch versions, Python versions, CPU architectures, and we make those available to users.
+
+00:59:53.960 --> 00:59:58.640
+So, it doesn't solve the core problem of, like, how do you build and distribute this stuff?
+
+00:59:58.640 --> 01:00:06.520
+But it does mean that, like, if you're operating within the constraints of, like, the current set of standards, we can make people's lives easier by making it so they don't have to build so many things.
+
+01:00:06.600 --> 01:00:09.720
+Like, we build them well, they all work together, all that kind of stuff.
+
+01:00:10.140 --> 01:00:12.020
+So, that's what, like, we've been focused on.
+
+01:00:12.080 --> 01:00:21.840
+And I think, like, looking forward, like, our goal is to support WheelNext, like, as soon as, like, sorry, Wheel Variants, like, as soon as possible, and, like, put those into the registry.
+
+01:00:21.840 --> 01:00:28.940
+So, as soon as we feel like that's a, you know, a feasible thing to do on the registry, we'll support it in pyx and support it for, like, our users and our customers.
+
+01:00:29.560 --> 01:00:39.780
+But in the meantime, it's kind of been, like, a parallel track effort of pushing forward on all the WheelNext work and standards, and then just trying to, like, solve immediate user problems without changing standards, like, partly through the registry.
+
+01:00:40.160 --> 01:00:41.320
+Things are going good at pyx?
+
+01:00:41.680 --> 01:00:42.540
+You're making progress?
+
+01:00:42.680 --> 01:00:42.780
+Yeah.
+
+01:00:42.780 --> 01:00:43.840
+Getting closer to public launch?
+
+01:00:43.860 --> 01:00:44.060
+Yeah, we're making progress.
+
+01:00:44.520 --> 01:00:45.300
+Yeah, yeah.
+
+01:00:45.600 --> 01:00:46.080
+No, it's good.
+
+01:00:46.400 --> 01:00:47.120
+Customers are growing.
+
+01:00:47.220 --> 01:00:47.920
+Numbers are going up.
+
+01:00:48.000 --> 01:00:48.340
+It's good.
+
+01:00:48.580 --> 01:00:48.860
+Awesome.
+
+01:00:49.100 --> 01:00:50.360
+People want to try pyx?
+
+01:00:50.360 --> 01:00:50.800
+What do they do?
+
+01:00:51.040 --> 01:00:51.300
+Are they?
+
+01:00:51.300 --> 01:00:52.820
+They can join the waitlist here.
+
+01:00:53.120 --> 01:00:53.560
+Yeah, yeah.
+
+01:00:53.820 --> 01:01:01.380
+This is, you know, you just, we have a, yeah, or you can go to astral.sh/pyx, and we look at all the responses, and we basically onboard people one by one.
+
+01:01:01.580 --> 01:01:09.700
+So, talking about when is this stuff going to be ready, you'll be able to adopt it, I guess maybe that's a good place to close out our conversation here is, what's the timeline?
+
+01:01:10.140 --> 01:01:11.080
+What are expectations?
+
+01:01:11.740 --> 01:01:12.540
+How are things going?
+
+01:01:12.660 --> 01:01:13.160
+What's next?
+
+01:01:13.460 --> 01:01:14.240
+It's a great question.
+
+01:01:14.280 --> 01:01:14.880
+Everything's open source.
+
+01:01:15.060 --> 01:01:16.060
+It's a two-month delay.
+
+01:01:20.060 --> 01:01:20.420
+No.
+
+01:01:20.420 --> 01:01:22.840
+What's the party line on this question?
+
+01:01:25.940 --> 01:01:26.660
+Oh, gosh.
+
+01:01:27.060 --> 01:01:44.620
+Well, it's, I liked, we have a joke inside, I don't know if it's inside widespread inside WheelNext, but we call this Barry's fourth law, Warsaw's fourth law, I don't remember exactly how, which is essentially make an estimate, multiply it by two, and change the unit.
+
+01:01:44.620 --> 01:01:47.980
+So, if you think it's going to take six months, it's one year.
+
+01:01:47.980 --> 01:01:48.380
+Oh, no.
+
+01:01:48.380 --> 01:01:50.140
+Change the unit, one decade.
+
+01:01:50.140 --> 01:02:00.700
+And it's a running joke that we have that I think is really good.
+
+01:02:00.700 --> 01:02:09.220
+Realistically, I think it depends on where are we going to set the bar for starting to roll things out.
+
+01:02:09.220 --> 01:02:13.160
+So, as Ralf was saying, we'll probably see some provisionally accepted.
+
+01:02:13.800 --> 01:02:17.900
+But as we get to that point, some of the stuff will be possible.
+
+01:02:18.280 --> 01:02:28.180
+For example, I expect that little by little, we can start experimenting with things without getting necessarily to the absolute final stage.
+
+01:02:28.180 --> 01:02:32.480
+But the full feature will only be available at the last stage.
+
+01:02:32.680 --> 01:02:35.520
+So, complicated question to answer.
+
+01:02:35.660 --> 01:02:37.760
+We hope that it's not going to take too many years.
+
+01:02:38.020 --> 01:02:40.260
+I'll make a connection back to pyx here.
+
+01:02:40.380 --> 01:02:45.080
+Because I think, you know, there's a part that's like, okay, there's four PEPs that need to be reviewed.
+
+01:02:45.400 --> 01:02:47.920
+Probably we need to update some prototypes here and there.
+
+01:02:48.040 --> 01:02:50.880
+It's probably going to take, you know, the better part of this year.
+
+01:02:51.080 --> 01:02:53.440
+At that point, you know, you have accepted PEPs, right?
+
+01:02:53.440 --> 01:02:55.380
+But then PyPI needs to be updated.
+
+01:02:55.760 --> 01:02:58.840
+Like, you know, all the tools, like Twine, would need to be updated.
+
+01:02:59.160 --> 01:03:01.360
+Like, there's a new metadata version.
+
+01:03:01.560 --> 01:03:10.020
+So, everything that consumes that needs to be updated before, you know, package authors can actually start producing these wheels and upload them to PyPI.
+
+01:03:10.460 --> 01:03:12.560
+So, that's going to not be this year, right?
+
+01:03:12.600 --> 01:03:17.140
+There's a very long tail of, you know, how the implementation rolls through the ecosystem.
+
+01:03:17.140 --> 01:03:20.240
+And then you have to wait until users get newer tools.
+
+01:03:20.620 --> 01:03:22.900
+And then, only then can you start uploading wheels.
+
+01:03:22.900 --> 01:03:25.480
+So, I'm going to poke at Charlie a bit here.
+
+01:03:25.580 --> 01:03:31.240
+Because one of the advantages of having a separate registry is, you know, plus the ability to rebuild everything.
+
+01:03:31.720 --> 01:03:36.060
+You can start using variant wheels, like, the moment that everything is accepted.
+
+01:03:36.420 --> 01:03:36.960
+It's way sooner.
+
+01:03:36.960 --> 01:03:36.320
+I know.
+
+01:03:36.320 --> 01:03:36.960
+It's way sooner.
+
+01:03:36.960 --> 01:03:37.240
+Yes.
+
+01:03:37.560 --> 01:03:38.360
+Have you thought about that?
+
+01:03:38.560 --> 01:03:39.140
+That is true.
+
+01:03:39.320 --> 01:03:40.300
+Yeah, yeah, of course.
+
+01:03:40.480 --> 01:03:40.660
+Yeah.
+
+01:03:40.900 --> 01:03:46.700
+I think from our perspective, we're mostly like, do we feel like the design is done or how much churn will there be on the design?
+
+01:03:47.220 --> 01:03:51.580
+But yeah, we're definitely in a position to, like, start building and distributing this stuff much, much sooner.
+
+01:03:51.580 --> 01:03:56.940
+uv has a second advantage, which is I think they have a much shorter tail of users in terms of version.
+
+01:03:57.300 --> 01:04:01.660
+I think uv users end up on a much more quote-unquote recent version.
+
+01:04:02.000 --> 01:04:12.920
+If you look at pip, I think, I don't remember the statistic off the top of my head, but a still significant portion of users use a five-year-old version of pip, which, I don't even know which version of Python.
+
+01:04:12.920 --> 01:04:14.740
+It was 3.9 or something.
+
+01:04:15.520 --> 01:04:21.460
+So it is, uv is able to move a lot faster, but also the users are more reactive.
+
+01:04:21.840 --> 01:04:23.140
+That's a very interesting point.
+
+01:04:23.480 --> 01:04:24.200
+A very interesting angle.
+
+01:04:24.340 --> 01:04:30.260
+I mean, I think a lot of people who are very tuned into the Python space have switched to uv, started using uv.
+
+01:04:30.260 --> 01:04:35.580
+And there's probably a lot of people who don't read the newsletters, don't listen to the podcast, and so on.
+
+01:04:35.640 --> 01:04:38.800
+And they know pip, and they just keep on pip-ing, which is fine.
+
+01:04:38.960 --> 01:04:39.840
+I'm not knocking it.
+
+01:04:39.960 --> 01:04:46.800
+But, you know, it means not only are they using pip, they might be using an older version of Python because they don't want to shake it up.
+
+01:04:47.080 --> 01:04:50.080
+And, you know, those are going to be the long tails that are going to be hard.
+
+01:04:50.080 --> 01:04:54.840
+I guess one more thought about what's next here before we call this a show here.
+
+01:04:55.100 --> 01:04:56.100
+What is the minimal?
+
+01:04:56.440 --> 01:04:58.960
+We talked about PEP 825, the minimal PEP.
+
+01:04:59.040 --> 01:05:01.640
+What is the minimal amount of adoption, right?
+
+01:05:01.680 --> 01:05:15.760
+So if the top five biggest data science and machine learning libraries adopt this and the installer tools like uv and pip support it, that actually alone might be a really big benefit if all the other packages are just ignored, right?
+
+01:05:15.760 --> 01:05:21.900
+So that's way more achievable than every single package that has native code has all these specifiers, right?
+
+01:05:22.180 --> 01:05:24.520
+What's the minimum level of adoption?
+
+01:05:24.900 --> 01:05:30.720
+I'd say that, I mean, the minimum level at which you can call it a success, yeah, five is probably not that far off.
+
+01:05:30.760 --> 01:05:32.400
+The benefits start to accumulate quickly.
+
+01:05:32.540 --> 01:05:43.380
+But I would expect once packages like PyTorch start adopting this, especially in the deep learning space, you know, this will be adopted very widely, very quickly because it solves so many problems.
+
+01:05:43.380 --> 01:05:49.860
+Like many of the most popular packages like vLLM with very large development teams and very large numbers of users.
+
+01:05:50.300 --> 01:05:53.660
+If you look at their install pages, it's like, you know, it's like a puzzle book.
+
+01:05:54.020 --> 01:06:00.120
+You just don't know how to install this stuff and they don't have wheels on PyPI and they have their own extra index servers.
+
+01:06:00.640 --> 01:06:02.460
+And it's not for lack of trying.
+
+01:06:02.600 --> 01:06:03.740
+It's not for lack of trying.
+
+01:06:03.900 --> 01:06:09.580
+Like those teams like put a lot of effort into trying to make it easier to install, but they basically all run into different kinds of roadblocks.
+
+01:06:09.580 --> 01:06:14.000
+I think five packages is what you'll get after maybe two weeks.
+
+01:06:14.560 --> 01:06:21.620
+After a month, you will get twice that amount and probably a quadratic progression for quite a few weeks.
+
+01:06:21.760 --> 01:06:26.940
+But it's especially in the scientific compute space and maybe machine learning to be more specific.
+
+01:06:27.400 --> 01:06:31.100
+Well, the moment that it works, so many packages will switch.
+
+01:06:31.380 --> 01:06:32.360
+Like so many.
+
+01:06:32.640 --> 01:06:36.860
+If you just take PyTorch, half of its dependencies will probably activate variant mode.
+
+01:06:36.860 --> 01:06:41.220
+And then the people that build on top of PyTorch, or people who build on top of JAX.
+
+01:06:41.640 --> 01:06:46.140
+So just that, you end up with at least 50 packages in a matter of a few months.
+
+01:06:47.100 --> 01:06:51.720
+Yeah, I'm just thinking there's probably a very small set that are feeling the most pain.
+
+01:06:52.380 --> 01:07:00.560
+You could do direct outreach to just the most important projects and get that adopted and make a really big difference, even if it's not every package.
+
+01:07:00.560 --> 01:07:07.340
+But the funny part is that most of the packages that would be interested, that we would reach out to, are already part of WheelNext.
+
+01:07:08.280 --> 01:07:16.200
+Because they in some way find the pain pretty significant and are starving for a solution.
+
+01:07:16.580 --> 01:07:17.080
+They know.
+
+01:07:17.340 --> 01:07:17.840
+They already know.
+
+01:07:18.100 --> 01:07:19.700
+All right, let's call it a show, folks.
+
+01:07:20.000 --> 01:07:21.960
+Let's do a final call to action.
+
+01:07:21.960 --> 01:07:29.980
+People out there listening, either they're maintainers of packages or they're users of these libraries or they got their own open source project.
+
+01:07:30.400 --> 01:07:31.440
+They're seeing the light.
+
+01:07:31.540 --> 01:07:32.320
+They want to get involved.
+
+01:07:32.860 --> 01:07:33.720
+They want to try it out.
+
+01:07:33.860 --> 01:07:34.380
+What do you tell them?
+
+01:07:34.560 --> 01:07:39.440
+Well, first, it would be great if people were to come to discuss.python.org.
+
+01:07:39.560 --> 01:07:45.000
+That's where the community is trying to aggregate to discuss all these different proposals.
+
+01:07:45.000 --> 01:07:49.600
+So I think the more people get involved, the better.
+
+01:07:51.200 --> 01:08:01.680
+But also trying the different packages that we are trying to publish, that Charlie and his team have been helping us to create, a sort of end-to-end experience.
+
+01:08:02.080 --> 01:08:07.720
+I think right now we have examples on Linux, macOS, and Windows.
+
+01:08:08.060 --> 01:08:13.000
+It works on different types of hardware, different types of CPUs, different types of GPUs.
+
+01:08:13.000 --> 01:08:21.740
+It works pretty broadly, and we wanted to give a sort of sample flavor of what could be a variant-enabled world.
+
+01:08:22.100 --> 01:08:26.880
+Yeah, I'd say, yeah, for the majority of listeners, they're not going to be packaging tool authors, right?
+
+01:08:26.940 --> 01:08:31.280
+So those are the ones you would expect to participate in the review primarily.
+
+01:08:31.500 --> 01:08:35.480
+But I'd say if you're a user of any of the packages we mentioned, just try it out.
+
+01:08:35.740 --> 01:08:39.200
+You know, download the uv variant-enabled installer.
+
+01:08:39.200 --> 01:08:45.320
+And if you're a package author and we haven't mentioned your package, but it will solve a problem for you, get in touch.
+
+01:08:45.440 --> 01:08:49.540
+Because I think that's maybe the most relevant part here.
+
+01:08:49.800 --> 01:08:55.160
+There's at least hundreds, maybe thousands of packages that we think we have answers for.
+
+01:08:55.340 --> 01:09:02.700
+But if their solution or their problem statement is slightly different, I think now would be a great time to learn and make sure we cover as many use cases as possible.
+
+01:09:02.700 --> 01:09:08.700
+Yeah, I mean, I guess the only thing I'd say is ideally the average user won't even have to think about this, right?
+
+01:09:08.760 --> 01:09:13.240
+And hopefully they just get it through uv or through pip or whatever in the long term.
+
+01:09:13.540 --> 01:09:15.100
+But that may take time.
+ +01:09:15.380 --> 01:09:16.220 +But that's our goal, certainly. + +01:09:16.600 --> 01:09:17.920 +Yeah, it's all behind the scenes. + +01:09:18.260 --> 01:09:18.580 +They don't know. + +01:09:18.820 --> 01:09:22.340 +But certainly, if it solves a problem, reach out, be part of it. + +01:09:23.020 --> 01:09:25.040 +Jonathan, Ralph, Charlie, thanks for being on the show. + +01:09:25.260 --> 01:09:25.780 +It's been great. + +01:09:26.100 --> 01:09:26.540 +Keep up being around. + +01:09:26.540 --> 01:09:28.120 +Thanks for having us. + +01:09:28.400 --> 01:09:28.760 +Bye. + +01:09:29.060 --> 01:09:29.340 +Bye-bye. + +01:09:29.520 --> 01:09:29.620 +Bye. + +01:09:30.400 --> 01:09:32.820 +This has been another episode of Talk Python To Me. + +01:09:32.960 --> 01:09:33.920 +Thank you to our sponsors. + +01:09:34.120 --> 01:09:35.400 +Be sure to check out what they're offering. + +01:09:35.580 --> 01:09:36.960 +It really helps support the show. + +01:09:37.600 --> 01:09:39.400 +This episode is brought to you by Sentry. + +01:09:39.900 --> 01:09:43.320 +You know Sentry for the error monitoring, but they now have logs too. + +01:09:43.320 --> 01:09:50.580 +And with Sentry, your logs become way more usable, interleaving into your error reports to enhance debugging and understanding. + +01:09:51.080 --> 01:09:54.280 +Get started today at talkpython.fm/sentry. + +01:09:54.280 --> 01:09:58.540 +And it's brought to you by Temporal, durable workflows for Python. + +01:09:58.940 --> 01:10:05.520 +Write your workflows as normal Python code and Temporal ensures they run reliably, even across crashes and restarts. + +01:10:05.760 --> 01:10:08.780 +Get started at talkpython.fm/Temporal. + +01:10:09.380 --> 01:10:21.860 +If you or your team needs to learn Python, we have over 270 hours of beginner and advanced courses on topics ranging from complete beginners to async code, Flask, Django, HTML, and even LLMs. 
+
+01:10:21.860 --> 01:10:24.540
+Best of all, there's no subscription in sight.
+
+01:10:24.960 --> 01:10:26.720
+Browse the catalog at talkpython.fm.
+
+01:10:27.360 --> 01:10:32.040
+And if you're not already subscribed to the show on your favorite podcast player, what are you waiting for?
+
+01:10:32.640 --> 01:10:34.540
+Just search for Python in your podcast player.
+
+01:10:34.640 --> 01:10:35.480
+We should be right at the top.
+
+01:10:35.820 --> 01:10:38.800
+If you enjoyed that geeky rap song, you can download the full track.
+
+01:10:38.920 --> 01:10:40.800
+The link is actually in your podcast player's show notes.
+
+01:10:41.520 --> 01:10:42.940
+This is your host, Michael Kennedy.
+
+01:10:43.200 --> 01:10:44.440
+Thank you so much for listening.
+
+01:10:44.620 --> 01:10:45.400
+I really appreciate it.
+
+01:10:45.820 --> 01:10:46.560
+I'll see you next time.
+
+01:10:46.560 --> 01:10:47.560
+Bye.
+
+01:11:16.560 --> 01:11:17.100
+Bye.

From a358c34a617838b7032357993571199dd3666f95 Mon Sep 17 00:00:00 2001
From: Michael Kennedy
Date: Wed, 22 Apr 2026 16:25:17 -0700
Subject: [PATCH 10/16] transcripts 544

---
 ...l-next-packaging-peps-transcript-final.txt | 20 +++++++++----------
 ...l-next-packaging-peps-transcript-final.vtt | 20 +++++++++----------
 2 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/transcripts/544-wheel-next-packaging-peps-transcript-final.txt b/transcripts/544-wheel-next-packaging-peps-transcript-final.txt
index 739209e..9a330f2 100644
--- a/transcripts/544-wheel-next-packaging-peps-transcript-final.txt
+++ b/transcripts/544-wheel-next-packaging-peps-transcript-final.txt
@@ -6,11 +6,11 @@

00:00:16 The result is fat binaries, nearly gigabyte-sized wheels, and install pages that read like puzzle books.

-00:00:22 A coalition from NVIDIA, Astral, and QuantSight has been working on WheelNext, a set of peps that let packages declare what hardware they need and let installers like uv pick the right build automatically.
+00:00:22 A coalition from NVIDIA, Astral, and Quansight has been working on WheelNext, a set of peps that let packages declare what hardware they need and let installers like uv pick the right build automatically. 00:00:34 Just UVPip install Torch and it'll work. -00:00:37 I sit down with Jonathan Decker from NVIDIA, Ralph Gommers from QuantSight and the NumPy and SciPy teams, and Charlie Marsh, founder of Astral and creator of uv, to dig into it all. +00:00:37 I sit down with Jonathan Decker from NVIDIA, Ralph Gommers from Quansight and the NumPy and SciPy teams, and Charlie Marsh, founder of Astral and creator of uv, to dig into it all. 00:00:47 This is Talk Python To Me, episode 544, recorded March 2nd, 2026. @@ -278,11 +278,11 @@ 00:08:02 So I made it my job. -00:08:03 I joined QuantSight, which is a small consulting company. +00:08:03 I joined Quansight, which is a small consulting company. 00:08:06 Primarily around like data science, supplied AI, scientific computing. -00:08:10 And yeah, I'm now one of the two co-CEOs of QuantSight. +00:08:10 And yeah, I'm now one of the two co-CEOs of Quansight. 00:08:14 Awesome. @@ -294,11 +294,11 @@ 00:08:25 And yeah, we basically do consulting to allow ourselves to make impactful open source contributions. -00:08:32 QuantSight is doing a ton in the data science space. +00:08:32 Quansight is doing a ton in the data science space. 00:08:35 Scientific computing space, for sure. -00:08:37 I've had multiple rounds of QuantSight folks on the show and things like that. +00:08:37 I've had multiple rounds of Quansight folks on the show and things like that. 00:08:42 And very neat. @@ -1214,9 +1214,9 @@ 00:41:25 And here, we've got NVIDIA, Meta, the PyTorch folks at Meta. -00:41:30 We got some contributions from AMD and Intel, and then Astral, QuantSight. +00:41:30 We got some contributions from AMD and Intel, and then Astral, Quansight. 
-00:41:35 Large amount of the time that we've been able to spend at QuantSight came from funding from Red Hat, who came with their own problem sets. +00:41:35 Large amount of the time that we've been able to spend at Quansight came from funding from Red Hat, who came with their own problem sets. 00:41:43 And, you know, so, and that's just the most prominent contributors. @@ -1234,7 +1234,7 @@ 00:42:03 And the name of also the open source project that contributed time and expertise. -00:42:09 Yeah, AMD, Anaconda, Aprio, Astral, Google, Huawei, Intel, Lap, Lab, Meta, NVIDIA, Preferred Networks, QuantSight, and Red Hat. +00:42:09 Yeah, AMD, Anaconda, Aprio, Astral, Google, Huawei, Intel, Lap, Lab, Meta, NVIDIA, Preferred Networks, Quansight, and Red Hat. 00:42:18 That's a bit of a group working on this. @@ -1270,7 +1270,7 @@ 00:43:25 I don't like change it. -00:43:27 I worked really closely with two people, one from QuantSight, one from Astro, Constantine and Michel. +00:43:27 I worked really closely with two people, one from Quansight, one from Astro, Constantine and Michel. 00:43:34 And we did so many hours of work. diff --git a/transcripts/544-wheel-next-packaging-peps-transcript-final.vtt b/transcripts/544-wheel-next-packaging-peps-transcript-final.vtt index 8c9877c..a3864cc 100644 --- a/transcripts/544-wheel-next-packaging-peps-transcript-final.vtt +++ b/transcripts/544-wheel-next-packaging-peps-transcript-final.vtt @@ -13,13 +13,13 @@ Want GPU support? You're on your own configuring special index URLs. The result is fat binaries, nearly gigabyte-sized wheels, and install pages that read like puzzle books. 00:00:22.640 --> 00:00:34.240 -A coalition from NVIDIA, Astral, and QuantSight has been working on WheelNext, a set of peps that let packages declare what hardware they need and let installers like uv pick the right build automatically. 
+A coalition from NVIDIA, Astral, and Quansight has been working on WheelNext, a set of peps that let packages declare what hardware they need and let installers like uv pick the right build automatically. 00:00:34.660 --> 00:00:37.080 Just UVPip install Torch and it'll work. 00:00:37.440 --> 00:00:47.060 -I sit down with Jonathan Decker from NVIDIA, Ralph Gommers from QuantSight and the NumPy and SciPy teams, and Charlie Marsh, founder of Astral and creator of uv, to dig into it all. +I sit down with Jonathan Decker from NVIDIA, Ralph Gommers from Quansight and the NumPy and SciPy teams, and Charlie Marsh, founder of Astral and creator of uv, to dig into it all. 00:00:47.520 --> 00:00:52.160 This is Talk Python To Me, episode 544, recorded March 2nd, 2026. @@ -421,13 +421,13 @@ And then I got really too much. So I made it my job. 00:08:03.120 --> 00:08:06.240 -I joined QuantSight, which is a small consulting company. +I joined Quansight, which is a small consulting company. 00:08:06.620 --> 00:08:10.380 Primarily around like data science, supplied AI, scientific computing. 00:08:10.980 --> 00:08:14.080 -And yeah, I'm now one of the two co-CEOs of QuantSight. +And yeah, I'm now one of the two co-CEOs of Quansight. 00:08:14.740 --> 00:08:15.180 Awesome. @@ -445,13 +445,13 @@ Most of them are open source maintainers. And yeah, we basically do consulting to allow ourselves to make impactful open source contributions. 00:08:32.160 --> 00:08:35.500 -QuantSight is doing a ton in the data science space. +Quansight is doing a ton in the data science space. 00:08:35.940 --> 00:08:37.500 Scientific computing space, for sure. 00:08:37.680 --> 00:08:42.180 -I've had multiple rounds of QuantSight folks on the show and things like that. +I've had multiple rounds of Quansight folks on the show and things like that. 00:08:42.380 --> 00:08:43.100 And very neat. 
@@ -1825,10 +1825,10 @@ I think the majority of the people were working in a team inside Microsoft, at l And here, we've got NVIDIA, Meta, the PyTorch folks at Meta. 00:41:30.820 --> 00:41:35.360 -We got some contributions from AMD and Intel, and then Astral, QuantSight. +We got some contributions from AMD and Intel, and then Astral, Quansight. 00:41:35.360 --> 00:41:43.260 -Large amount of the time that we've been able to spend at QuantSight came from funding from Red Hat, who came with their own problem sets. +Large amount of the time that we've been able to spend at Quansight came from funding from Red Hat, who came with their own problem sets. 00:41:43.780 --> 00:41:47.140 And, you know, so, and that's just the most prominent contributors. @@ -1855,7 +1855,7 @@ Yeah, who are we? And the name of also the open source project that contributed time and expertise. 00:42:09.480 --> 00:42:18.820 -Yeah, AMD, Anaconda, Aprio, Astral, Google, Huawei, Intel, Lap, Lab, Meta, NVIDIA, Preferred Networks, QuantSight, and Red Hat. +Yeah, AMD, Anaconda, Aprio, Astral, Google, Huawei, Intel, Lap, Lab, Meta, NVIDIA, Preferred Networks, Quansight, and Red Hat. 00:42:18.940 --> 00:42:21.180 That's a bit of a group working on this. @@ -1909,7 +1909,7 @@ I don't like this. I don't like change it. 00:43:27.060 --> 00:43:34.500 -I worked really closely with two people, one from QuantSight, one from Astro, Constantine and Michel. +I worked really closely with two people, one from Quansight, one from Astro, Constantine and Michel. 00:43:34.900 --> 00:43:37.560 And we did so many hours of work. From f94bee50a1f8a83cd01a760ed2c5a8a1b54a7964 Mon Sep 17 00:00:00 2001 From: Michael Kennedy Date: Mon, 4 May 2026 19:24:49 -0700 Subject: [PATCH 11/16] Many transcript fixes. 
--- transcripts/061_free_software_free_people.txt | 2 +- transcripts/061_free_software_free_people.vtt | 2 +- .../143-tuning-python-web-app-performance.txt | 2 +- .../143-tuning-python-web-app-performance.vtt | 2 +- transcripts/177-flask-goes-1.0.txt | 6 +-- transcripts/177-flask-goes-1.0.vtt | 6 +-- ...secure-all-the-things-with-hubblestack.txt | 2 +- ...secure-all-the-things-with-hubblestack.vtt | 2 +- transcripts/209-python-governance.txt | 2 +- transcripts/209-python-governance.vtt | 2 +- transcripts/231-freelancing.txt | 2 +- transcripts/231-freelancing.vtt | 2 +- transcripts/312-scaling.txt | 2 +- transcripts/312-scaling.vtt | 2 +- transcripts/316-flask-2-0.txt | 4 +- transcripts/316-flask-2-0.vtt | 4 +- transcripts/346-recommended-packages.txt | 2 +- transcripts/346-recommended-packages.vtt | 2 +- transcripts/362-hypermodern-python.txt | 2 +- transcripts/362-hypermodern-python.vtt | 2 +- ...y-hello-to-pyscript-webassembly-python.txt | 4 +- transcripts/395-readme-tools.txt | 2 +- transcripts/395-readme-tools.vtt | 2 +- transcripts/402-polars.txt | 10 ++-- transcripts/402-polars.vtt | 10 ++-- ...section-of-tabular-data-and-general-ai.txt | 2 +- ...section-of-tabular-data-and-general-ai.vtt | 2 +- transcripts/421-python-at-netflix.txt | 2 +- transcripts/421-python-at-netflix.vtt | 2 +- transcripts/433-litestar.txt | 4 +- transcripts/433-litestar.vtt | 4 +- transcripts/480-narwhals.txt | 10 ++-- transcripts/480-narwhals.vtt | 10 ++-- transcripts/490-django-ninja.txt | 4 +- transcripts/490-django-ninja.vtt | 4 +- ...ython-ducks-and-snakes-living-together.txt | 6 +-- ...ython-ducks-and-snakes-living-together.vtt | 6 +-- .../495-osmnx-python-and-openstreetmap.txt | 2 +- .../495-osmnx-python-and-openstreetmap.vtt | 2 +- transcripts/500-django-simple-deploy.txt | 2 +- transcripts/500-django-simple-deploy.vtt | 2 +- transcripts/502-django-ledger.txt | 2 +- transcripts/502-django-ledger.vtt | 2 +- ...olars-tools-and-techniques-to-level-up.txt | 32 +++++------ 
...olars-tools-and-techniques-to-level-up.vtt | 32 +++++------ ...lerating-python-data-science-at-nvidia.txt | 2 +- ...lerating-python-data-science-at-nvidia.vtt | 2 +- transcripts/522-codecut-ai.txt | 2 +- transcripts/522-codecut-ai.vtt | 2 +- transcripts/523-pyrefly.txt | 2 +- transcripts/523-pyrefly.vtt | 2 +- ...velopers-should-learn-in-2025-no-names.vtt | 4 +- transcripts/525-nicegui-no-names.vtt | 4 +- transcripts/525-nicegui.txt | 4 +- transcripts/525-nicegui.vtt | 4 +- ...ata-science-with-foundation-llm-models.txt | 2 +- ...ata-science-with-foundation-llm-models.vtt | 2 +- ...8-python-apps-with-llm-building-blocks.txt | 2 +- ...8-python-apps-with-llm-building-blocks.vtt | 2 +- ...9-python-apps-with-llm-building-blocks.txt | 4 +- ...9-python-apps-with-llm-building-blocks.vtt | 4 +- transcripts/530-anywidget.txt | 2 +- transcripts/530-anywidget.vtt | 2 +- transcripts/531-talk-python-in-prod.txt | 4 +- transcripts/531-talk-python-in-prod.vtt | 4 +- .../532-python-2025-year-in-review.txt | 8 +-- .../532-python-2025-year-in-review.vtt | 8 +-- ...b-frameworks-in-prod-by-their-creators.txt | 2 +- ...b-frameworks-in-prod-by-their-creators.vtt | 2 +- ...skcache-your-secret-python-perf-weapon.txt | 2 +- ...skcache-your-secret-python-perf-weapon.vtt | 2 +- .../538-python-in-digital-humanities.txt | 6 +-- .../538-python-in-digital-humanities.vtt | 6 +-- ...hing-up-with-the-python-typing-council.txt | 4 +- ...n-monorepo-with-uv-and-prek-transcript.txt | 2 +- ...n-monorepo-with-uv-and-prek-transcript.vtt | 2 +- ...python-in-rust-for-ai-transcript-final.txt | 54 +++++++++---------- ...python-in-rust-for-ai-transcript-final.vtt | 54 +++++++++---------- youtube_transcripts/313-pydantic.vtt | 2 +- youtube_transcripts/335-gene-editing.vtt | 2 +- youtube_transcripts/345-10-tips-and-tools.txt | 2 +- ...y-hello-to-pyscript-webassembly-python.vtt | 2 +- youtube_transcripts/370-openbb.vtt | 2 +- .../376-pydantic-2-the-plan.vtt | 2 +- ...perf-specializing-adaptive-interpreter.vtt | 
2 +- youtube_transcripts/395-readme-tools.vtt | 2 +- youtube_transcripts/402-polars.vtt | 12 ++--- ...lving-10-different-simulation-problems.vtt | 2 +- youtube_transcripts/425-shiny-for-python.vtt | 4 +- youtube_transcripts/426-pyscript-update.vtt | 2 +- ...llel-python-apps-with-sub-interpreters.vtt | 6 +-- youtube_transcripts/454-dagster.vtt | 4 +- youtube_transcripts/457-security-phylum.vtt | 2 +- .../462-pandas-and-beyond-with-wes.vtt | 6 +-- youtube_transcripts/480-narwhals.vtt | 10 ++-- youtube_transcripts/488-lancedb.vtt | 2 +- ...ython-ducks-and-snakes-living-together.vtt | 6 +-- youtube_transcripts/493-quarto.vtt | 2 +- .../495-osmnx-python-and-openstreetmap.vtt | 8 +-- ...olars-tools-and-techniques-to-level-up.vtt | 26 ++++----- .../525-nicegui-youtube-named.srt | 2 +- .../525-nicegui-youtube-named.vtt | 2 +- .../526-data-sci-with-ai-youtube-names.srt | 2 +- .../526-data-sci-with-ai-youtube-names.vtt | 2 +- ...-apps-with-llm-building-blocks-youtube.srt | 4 +- ...-apps-with-llm-building-blocks-youtube.vtt | 4 +- .../529-cs-from-scratch-youtube.srt | 8 +-- .../529-cs-from-scratch-youtube.vtt | 8 +-- youtube_transcripts/530-anywidget-youtube.srt | 2 +- youtube_transcripts/530-anywidget-youtube.vtt | 2 +- .../531-talk-python-in-prod-youtube.srt | 4 +- .../531-talk-python-in-prod-youtube.vtt | 4 +- ...532-python-2025-year-in-review-youtube.vtt | 20 +++---- .../536-fly-inside-fastapi-cloud-youtube.vtt | 4 +- ...ith-the-python-typing-council-original.vtt | 16 +++--- ...dern-python-monorepo-timeline-original.vtt | 4 +- youtube_transcripts/544-wheel-next.vtt | 10 ++-- 117 files changed, 313 insertions(+), 313 deletions(-) diff --git a/transcripts/061_free_software_free_people.txt b/transcripts/061_free_software_free_people.txt index a010ffc..3e59532 100644 --- a/transcripts/061_free_software_free_people.txt +++ b/transcripts/061_free_software_free_people.txt @@ -608,7 +608,7 @@ 00:25:59 So in Syria, we started working in Syria in the summer of 2011, maybe about 
three months in July, maybe about three months after the protests started. -00:26:08 And the first thing we started doing was NMAPing the networks. +00:26:08 And the first thing we started doing was NMAPIng the networks. 00:26:11 So basically, like NMAP is a network scanner. diff --git a/transcripts/061_free_software_free_people.vtt b/transcripts/061_free_software_free_people.vtt index f9cb017..9183a2f 100644 --- a/transcripts/061_free_software_free_people.vtt +++ b/transcripts/061_free_software_free_people.vtt @@ -916,7 +916,7 @@ Yeah. So in Syria, we started working in Syria in the summer of 2011, maybe about three months in July, maybe about three months after the protests started. 00:26:08.640 --> 00:26:11.840 -And the first thing we started doing was NMAPing the networks. +And the first thing we started doing was NMAPIng the networks. 00:26:11.840 --> 00:26:14.520 So basically, like NMAP is a network scanner. diff --git a/transcripts/143-tuning-python-web-app-performance.txt b/transcripts/143-tuning-python-web-app-performance.txt index 31923bf..3bc34b3 100644 --- a/transcripts/143-tuning-python-web-app-performance.txt +++ b/transcripts/143-tuning-python-web-app-performance.txt @@ -512,7 +512,7 @@ 00:22:31 Well, and with these deployment stacks, or what do you want to call them, you know, you've got -00:22:36 Nginx, and you can tune Nginx, you've got like with year, or G unicorn, or whatever, you can performance tune that. And you've got your Python code. And so measuring them +00:22:36 Nginx, and you can tune Nginx, you've got like with year, or Gunicorn, or whatever, you can performance tune that. And you've got your Python code. And so measuring them 00:22:46 separately, I think can be a little bit challenging. 
While you're talking, it occurred to me that diff --git a/transcripts/143-tuning-python-web-app-performance.vtt b/transcripts/143-tuning-python-web-app-performance.vtt index bb0c389..365afa3 100644 --- a/transcripts/143-tuning-python-web-app-performance.vtt +++ b/transcripts/143-tuning-python-web-app-performance.vtt @@ -775,7 +775,7 @@ Well, and with these deployment stacks, or what do you want to call them, you kn Nginx, and you can tune Nginx, you've got 00:22:38.960 --> 00:22:46.040 -like with year, or G unicorn, or whatever, you can performance tune that. And you've got your Python code. And so measuring them +like with year, or Gunicorn, or whatever, you can performance tune that. And you've got your Python code. And so measuring them 00:22:46.040 --> 00:22:50.540 separately, I think can be a little bit challenging. While you're talking, it occurred to me that diff --git a/transcripts/177-flask-goes-1.0.txt b/transcripts/177-flask-goes-1.0.txt index 954633d..4b0d6dd 100644 --- a/transcripts/177-flask-goes-1.0.txt +++ b/transcripts/177-flask-goes-1.0.txt @@ -512,7 +512,7 @@ 00:13:56 So it occurs to me that it might be worthwhile to spend just a moment talking about these projects that you discussed. -00:14:02 So we just quickly ran through it, like there's Flask and there's Jinja and there's Vexoig and stuff like that. +00:14:02 So we just quickly ran through it, like there's Flask and there's Jinja and there's Werkzeug and stuff like that. 00:14:07 And could you maybe give us a rundown of like what each one of those actually is so people know what they're about? @@ -528,11 +528,11 @@ 00:14:28 It's using what's provided there and just putting a nice framework around it. -00:14:32 So Vexoig is the closest to what Flask is doing. +00:14:32 So Werkzeug is the closest to what Flask is doing. 00:14:36 It's also a WSGI library, but it's dealing with kind of a lower level than what Flask is. 
-00:14:42 So Flask is providing like an application framework and Vexoig is providing all the parts for taking a HTTP request and a WSGI request and parsing out the headers and producing some data structure, +00:14:42 So Flask is providing like an application framework and Werkzeug is providing all the parts for taking a HTTP request and a WSGI request and parsing out the headers and producing some data structure, 00:14:57 like a request that we can use and look at and turning our response into something that our server can understand. diff --git a/transcripts/177-flask-goes-1.0.vtt b/transcripts/177-flask-goes-1.0.vtt index 6440525..e5e26ee 100644 --- a/transcripts/177-flask-goes-1.0.vtt +++ b/transcripts/177-flask-goes-1.0.vtt @@ -823,7 +823,7 @@ Yeah, that's a really good point, actually. So it occurs to me that it might be worthwhile to spend just a moment talking about these projects that you discussed. 00:14:02.620 --> 00:14:07.300 -So we just quickly ran through it, like there's Flask and there's Jinja and there's Vexoig and stuff like that. +So we just quickly ran through it, like there's Flask and there's Jinja and there's Werkzeug and stuff like that. 00:14:07.300 --> 00:14:12.440 And could you maybe give us a rundown of like what each one of those actually is so people know what they're about? @@ -847,13 +847,13 @@ And Flask, honestly, is just a wrapper around all these other libraries. It's using what's provided there and just putting a nice framework around it. 00:14:32.360 --> 00:14:36.660 -So Vexoig is the closest to what Flask is doing. +So Werkzeug is the closest to what Flask is doing. 00:14:36.660 --> 00:14:42.580 It's also a WSGI library, but it's dealing with kind of a lower level than what Flask is. 
00:14:42.580 --> 00:14:57.300 -So Flask is providing like an application framework and Vexoig is providing all the parts for taking a HTTP request and a WSGI request and parsing out the headers and producing some data structure, +So Flask is providing like an application framework and Werkzeug is providing all the parts for taking a HTTP request and a WSGI request and parsing out the headers and producing some data structure, 00:14:57.300 --> 00:15:04.640 like a request that we can use and look at and turning our response into something that our server can understand. diff --git a/transcripts/187-secure-all-the-things-with-hubblestack.txt b/transcripts/187-secure-all-the-things-with-hubblestack.txt index 3c77682..3998b67 100644 --- a/transcripts/187-secure-all-the-things-with-hubblestack.txt +++ b/transcripts/187-secure-all-the-things-with-hubblestack.txt @@ -1668,7 +1668,7 @@ 00:38:54 Well, let's talk about, say, a web server, right? -00:38:56 So it's got like a micro whiskey or a G unicorn worker process. +00:38:56 So it's got like a micro whiskey or a Gunicorn worker process. 00:39:00 But that worker process is probably talking to a database. diff --git a/transcripts/187-secure-all-the-things-with-hubblestack.vtt b/transcripts/187-secure-all-the-things-with-hubblestack.vtt index 41291b0..02de2c9 100644 --- a/transcripts/187-secure-all-the-things-with-hubblestack.vtt +++ b/transcripts/187-secure-all-the-things-with-hubblestack.vtt @@ -2512,7 +2512,7 @@ Obviously, on something like, what's a good example of something that reaches ou Well, let's talk about, say, a web server, right? 00:38:56.480 --> 00:39:00.140 -So it's got like a micro whiskey or a G unicorn worker process. +So it's got like a micro whiskey or a Gunicorn worker process. 00:39:00.140 --> 00:39:02.760 But that worker process is probably talking to a database. 
diff --git a/transcripts/209-python-governance.txt b/transcripts/209-python-governance.txt index a069619..7c9d1cd 100644 --- a/transcripts/209-python-governance.txt +++ b/transcripts/209-python-governance.txt @@ -238,7 +238,7 @@ 00:09:29 the assignment operator and things like this. Exactly. There are certain things that you -00:09:33 basically couldn't do without making your function calls item potent and such in generator +00:09:33 basically couldn't do without making your function calls idempotent and such in generator 00:09:39 expressions. I personally also think it's a key point that it improves the expressiveness and diff --git a/transcripts/209-python-governance.vtt b/transcripts/209-python-governance.vtt index e752914..16b5aee 100644 --- a/transcripts/209-python-governance.vtt +++ b/transcripts/209-python-governance.vtt @@ -364,7 +364,7 @@ return some part of that, right? Those would actually be like double function ca the assignment operator and things like this. Exactly. There are certain things that you 00:09:33.880 --> 00:09:39.580 -basically couldn't do without making your function calls item potent and such in generator +basically couldn't do without making your function calls idempotent and such in generator 00:09:39.580 --> 00:09:45.580 expressions. I personally also think it's a key point that it improves the expressiveness and diff --git a/transcripts/231-freelancing.txt b/transcripts/231-freelancing.txt index 14592bb..7fd1d05 100644 --- a/transcripts/231-freelancing.txt +++ b/transcripts/231-freelancing.txt @@ -1360,7 +1360,7 @@ 00:50:23 you know, it was three months worth of work in one check. And it was like, it was like the biggest -00:50:27 check I had ever cashed, but it was, I was so hungry. You know, I thought about eating it. +00:50:27 check I had ever cached, but it was, I was so hungry. You know, I thought about eating it. 00:50:31 And so I wasn't as, as prepared from, from a financial standpoint to jump into it. 
And so, diff --git a/transcripts/231-freelancing.vtt b/transcripts/231-freelancing.vtt index f210c67..4e6513a 100644 --- a/transcripts/231-freelancing.vtt +++ b/transcripts/231-freelancing.vtt @@ -2047,7 +2047,7 @@ for many weeks waiting for this check to come. And then this check came and I wa you know, it was three months worth of work in one check. And it was like, it was like the biggest 00:50:27.180 --> 00:50:31.120 -check I had ever cashed, but it was, I was so hungry. You know, I thought about eating it. +check I had ever cached, but it was, I was so hungry. You know, I thought about eating it. 00:50:31.120 --> 00:50:38.080 And so I wasn't as, as prepared from, from a financial standpoint to jump into it. And so, diff --git a/transcripts/312-scaling.txt b/transcripts/312-scaling.txt index e23854c..9734306 100644 --- a/transcripts/312-scaling.txt +++ b/transcripts/312-scaling.txt @@ -1002,7 +1002,7 @@ 00:38:52 any of those things, the way that you host those is you go to a server or you use a platform as a -00:38:58 service, which does this for you. And you, you run it in something like micro whiskey or G unicorn or +00:38:58 service, which does this for you. And you, you run it in something like micro whiskey or Gunicorn or 00:39:02 something. And what those do immediately is they say, well, we're not really going to run it in the diff --git a/transcripts/312-scaling.vtt b/transcripts/312-scaling.vtt index 15ca262..cb853d9 100644 --- a/transcripts/312-scaling.vtt +++ b/transcripts/312-scaling.vtt @@ -1522,7 +1522,7 @@ all. And the example I'm thinking of, if I'm writing an API, if I'm writing a we any of those things, the way that you host those is you go to a server or you use a platform as a 00:38:58.180 --> 00:39:02.740 -service, which does this for you. And you, you run it in something like micro whiskey or G unicorn or +service, which does this for you. 
And you, you run it in something like micro whiskey or Gunicorn or 00:39:02.740 --> 00:39:07.060 something. And what those do immediately is they say, well, we're not really going to run it in the diff --git a/transcripts/316-flask-2-0.txt b/transcripts/316-flask-2-0.txt index 00df3a9..f8ae374 100644 --- a/transcripts/316-flask-2-0.txt +++ b/transcripts/316-flask-2-0.txt @@ -836,9 +836,9 @@ 00:32:36 or not out of the box, but it's possible to extend it to do that. -00:32:40 Yeah, super cool. All right. Another thing I saw in the release notes was that around Vexoig, +00:32:40 Yeah, super cool. All right. Another thing I saw in the release notes was that around Werkzeug, -00:32:45 there's performance improvements coming in Flask and Vexoig. Want to just give people a little +00:32:45 there's performance improvements coming in Flask and Werkzeug. Want to just give people a little 00:32:51 insight into what's coming there? diff --git a/transcripts/316-flask-2-0.vtt b/transcripts/316-flask-2-0.vtt index 7725ec1..64a7f1c 100644 --- a/transcripts/316-flask-2-0.vtt +++ b/transcripts/316-flask-2-0.vtt @@ -1267,10 +1267,10 @@ compatibility between sync and async code. And Flask's async support should supp or not out of the box, but it's possible to extend it to do that. 00:32:40.140 --> 00:32:45.180 -Yeah, super cool. All right. Another thing I saw in the release notes was that around Vexoig, +Yeah, super cool. All right. Another thing I saw in the release notes was that around Werkzeug, 00:32:45.180 --> 00:32:51.060 -there's performance improvements coming in Flask and Vexoig. Want to just give people a little +there's performance improvements coming in Flask and Werkzeug. Want to just give people a little 00:32:51.060 --> 00:32:52.100 insight into what's coming there? 
diff --git a/transcripts/346-recommended-packages.txt b/transcripts/346-recommended-packages.txt index 8258e13..283ede1 100644 --- a/transcripts/346-recommended-packages.txt +++ b/transcripts/346-recommended-packages.txt @@ -2496,7 +2496,7 @@ 01:10:13 you know, which was the notable, the favorite editor and the times and see how things change -01:10:19 over the time, have a really nice graph of how these codes start coming up and by charm as well. +01:10:19 over the time, have a really nice graph of how these codes start coming up and PyCharm as well. 01:10:24 There's a lot of ways to gather all this up and like, turn it into computer legible data and do all sorts of fun stuff. diff --git a/transcripts/346-recommended-packages.vtt b/transcripts/346-recommended-packages.vtt index 50327a6..16a8c3a 100644 --- a/transcripts/346-recommended-packages.vtt +++ b/transcripts/346-recommended-packages.vtt @@ -3760,7 +3760,7 @@ And next probably we can do, you know, the, the, how over the time, you know, if you know, which was the notable, the favorite editor and the times and see how things change 01:10:19.200 --> 01:10:24.880 -over the time, have a really nice graph of how these codes start coming up and by charm as well. +over the time, have a really nice graph of how these codes start coming up and PyCharm as well. 01:10:24.880 --> 01:10:27.340 There's a lot of ways to gather all this up and like, diff --git a/transcripts/362-hypermodern-python.txt b/transcripts/362-hypermodern-python.txt index b4cad65..f409bf2 100644 --- a/transcripts/362-hypermodern-python.txt +++ b/transcripts/362-hypermodern-python.txt @@ -1484,7 +1484,7 @@ 00:57:29 And that's basically what Sphinx click does. -00:57:32 So it takes, so you, when you build your documentation and you use things like auto doc or Sphinx click, +00:57:32 So it takes, so you, when you build your documentation and you use things like autodoc or Sphinx click, 00:57:39 you have to remember to install your own package. 
diff --git a/transcripts/362-hypermodern-python.vtt b/transcripts/362-hypermodern-python.vtt index ba854bc..b15c5ab 100644 --- a/transcripts/362-hypermodern-python.vtt +++ b/transcripts/362-hypermodern-python.vtt @@ -2233,7 +2233,7 @@ So why not just use them to generate the documentation. And that's basically what Sphinx click does. 00:57:32.160 --> 00:57:39.220 -So it takes, so you, when you build your documentation and you use things like auto doc or Sphinx click, +So it takes, so you, when you build your documentation and you use things like autodoc or Sphinx click, 00:57:39.220 --> 00:57:41.260 you have to remember to install your own package. diff --git a/transcripts/367-say-hello-to-pyscript-webassembly-python.txt b/transcripts/367-say-hello-to-pyscript-webassembly-python.txt index 77807a1..f3dbf59 100644 --- a/transcripts/367-say-hello-to-pyscript-webassembly-python.txt +++ b/transcripts/367-say-hello-to-pyscript-webassembly-python.txt @@ -1844,7 +1844,7 @@ 01:01:56 html but the rest of my app is like one directory up from that could I say -01:02:03 my path to my module is dot dot dot slash app dot pi and then maybe read +01:02:03 my path to my module is dot dot dot slash app.py and then maybe read 01:02:07 like some right source code with like a token that I put in there that I @@ -1888,7 +1888,7 @@ 01:03:42 now it's just you're like it's not our responsibility but you shouldn't be -01:03:46 allowing you know slash static slash dot slash app dot pi being served anyway +01:03:46 allowing you know slash static slash dot slash app.py being served anyway 01:03:51 right right okay cool the other thing is one of the things that Steve Dower diff --git a/transcripts/395-readme-tools.txt b/transcripts/395-readme-tools.txt index 120836f..bd203bf 100644 --- a/transcripts/395-readme-tools.txt +++ b/transcripts/395-readme-tools.txt @@ -2088,7 +2088,7 @@ 01:00:36 becomes part of the document. 
So sort of the input file and the output file are the same file so that -01:00:41 you get sort of an item potent kind of processing of the file. So you can have a readme.md. You run cog +01:00:41 you get sort of an idempotent kind of processing of the file. So you can have a readme.md. You run cog 01:00:47 on it. You still just have a readme.md, but now you've got the results of the computation in the file. diff --git a/transcripts/395-readme-tools.vtt b/transcripts/395-readme-tools.vtt index da46c7c..9989882 100644 --- a/transcripts/395-readme-tools.vtt +++ b/transcripts/395-readme-tools.vtt @@ -3226,7 +3226,7 @@ can go into the document so that you process, you run the document, the output o becomes part of the document. So sort of the input file and the output file are the same file so that 01:00:41.640 --> 01:00:47.620 -you get sort of an item potent kind of processing of the file. So you can have a readme.md. You run cog +you get sort of an idempotent kind of processing of the file. So you can have a readme.md. You run cog 01:00:47.620 --> 01:00:52.380 on it. You still just have a readme.md, but now you've got the results of the computation in the file. diff --git a/transcripts/402-polars.txt b/transcripts/402-polars.txt index 25df95d..462fffc 100644 --- a/transcripts/402-polars.txt +++ b/transcripts/402-polars.txt @@ -2400,7 +2400,7 @@ 00:55:09 Ajit says, excellent content guys. -00:55:12 It certainly helps me kickstart my journey from pandas to pollers. +00:55:12 It certainly helps me kickstart my journey from pandas to Polars. 00:55:16 Awesome. @@ -2416,19 +2416,19 @@ 00:55:23 People are interested in this project. -00:55:25 They want to start playing and learning pollers. +00:55:25 They want to start playing and learning Polars. 00:55:27 maybe try it out on some other code that is and is at the moment. 00:55:30 What do they do? -00:55:31 I'd recommend if you have a new project, just start in pollers. 
+00:55:31 I'd recommend if you have a new project, just start in Polars. 00:55:34 Because you can also rewrite some comments, but the most fun experience will just start a new -00:55:41 project in pollers. +00:55:41 project in Polars. -00:55:42 And because then you can really enjoy what pollers offers. +00:55:42 And because then you can really enjoy what Polars offers. 00:55:46 The only expression API, learn how you use it declaratively. diff --git a/transcripts/402-polars.vtt b/transcripts/402-polars.vtt index 6e02536..dd7a863 100644 --- a/transcripts/402-polars.vtt +++ b/transcripts/402-polars.vtt @@ -3856,7 +3856,7 @@ Let's wrap it up with a comment from the audience here. Ajit says, excellent content guys. 00:55:12.160 --> 00:55:16.080 -It certainly helps me kickstart my journey from pandas to pollers. +It certainly helps me kickstart my journey from pandas to Polars. 00:55:16.080 --> 00:55:16.560 Awesome. @@ -3880,7 +3880,7 @@ So Richie, let's close it out with final call action. People are interested in this project. 00:55:25.200 --> 00:55:27.520 -They want to start playing and learning pollers. +They want to start playing and learning Polars. 00:55:27.520 --> 00:55:30.640 maybe try it out on some other code that is and is at the moment. @@ -3889,16 +3889,16 @@ maybe try it out on some other code that is and is at the moment. What do they do? 00:55:31.200 --> 00:55:34.480 -I'd recommend if you have a new project, just start in pollers. +I'd recommend if you have a new project, just start in Polars. 00:55:34.480 --> 00:55:41.120 Because you can also rewrite some comments, but the most fun experience will just start a new 00:55:41.120 --> 00:55:42.160 -project in pollers. +project in Polars. 00:55:42.160 --> 00:55:46.000 -And because then you can really enjoy what pollers offers. +And because then you can really enjoy what Polars offers. 00:55:46.000 --> 00:55:49.120 The only expression API, learn how you use it declaratively. 
diff --git a/transcripts/410-intersection-of-tabular-data-and-general-ai.txt b/transcripts/410-intersection-of-tabular-data-and-general-ai.txt index c895f08..62cb609 100644 --- a/transcripts/410-intersection-of-tabular-data-and-general-ai.txt +++ b/transcripts/410-intersection-of-tabular-data-and-general-ai.txt @@ -1596,7 +1596,7 @@ 00:58:54 that sort of like inspection on the function, part. I did a little bit of middleware to get the -00:59:00 two happy together. And then all you have to do is import FastAPI and then run, you know, G unicorn +00:59:00 two happy together. And then all you have to do is import FastAPI and then run, you know, Gunicorn 00:59:06 that app. And, it's two lines and any prompts you have made become their own independent rest diff --git a/transcripts/410-intersection-of-tabular-data-and-general-ai.vtt b/transcripts/410-intersection-of-tabular-data-and-general-ai.vtt index 40efd78..0afe953 100644 --- a/transcripts/410-intersection-of-tabular-data-and-general-ai.vtt +++ b/transcripts/410-intersection-of-tabular-data-and-general-ai.vtt @@ -5662,7 +5662,7 @@ so that all the, all the magic you get from that works. The server bit is I took that sort of like inspection on the function, part. I did a little bit of middleware to get the 00:59:00.720 --> 00:59:06.780 -two happy together. And then all you have to do is import FastAPI and then run, you know, G unicorn +two happy together. And then all you have to do is import FastAPI and then run, you know, Gunicorn 00:59:06.780 --> 00:59:14.220 that app. And, it's two lines and any prompts you have made become their own independent rest diff --git a/transcripts/421-python-at-netflix.txt b/transcripts/421-python-at-netflix.txt index ea66a7e..7007b93 100644 --- a/transcripts/421-python-at-netflix.txt +++ b/transcripts/421-python-at-netflix.txt @@ -1488,7 +1488,7 @@ 00:41:16 That's part of the auto, auto remediation of it. -00:41:18 And it says it's built on G unicorn flask and flask rest plus. 
+00:41:18 And it says it's built on Gunicorn flask and flask rest plus. 00:41:24 I'm familiar with the first batch, but the flask rest plus, this is new, an extension for flask that adds diff --git a/transcripts/421-python-at-netflix.vtt b/transcripts/421-python-at-netflix.vtt index d22d588..44c5b07 100644 --- a/transcripts/421-python-at-netflix.vtt +++ b/transcripts/421-python-at-netflix.vtt @@ -2245,7 +2245,7 @@ Cool. That's part of the auto, auto remediation of it. 00:41:18.820 --> 00:41:23.940 -And it says it's built on G unicorn flask and flask rest plus. +And it says it's built on Gunicorn flask and flask rest plus. 00:41:24.340 --> 00:41:30.860 I'm familiar with the first batch, but the flask rest plus, this is new, an extension for flask that adds diff --git a/transcripts/433-litestar.txt b/transcripts/433-litestar.txt index 917d1c0..8895d22 100644 --- a/transcripts/433-litestar.txt +++ b/transcripts/433-litestar.txt @@ -898,13 +898,13 @@ 00:31:41 I think that's one optimization we did across the board for everybody. -00:31:44 Everyone uses a single uv corn worker. +00:31:44 Everyone uses a single uvicorn worker. 00:31:47 Yes. 00:31:48 So the environment is the same for all frameworks that we test. -00:31:52 It's uv corn with uv loop, the siphon dependencies, and one worker pinned to one CPU core that's shielded. +00:31:52 It's uvicorn with uv loop, the Cython dependencies, and one worker pinned to one CPU core that's shielded. 00:32:01 So it just sort of gets something comparable. diff --git a/transcripts/433-litestar.vtt b/transcripts/433-litestar.vtt index 6c262d9..9072cc7 100644 --- a/transcripts/433-litestar.vtt +++ b/transcripts/433-litestar.vtt @@ -1411,7 +1411,7 @@ The benchmarks are on uv loop. I think that's one optimization we did across the board for everybody. 00:31:44.960 --> 00:31:47.840 -Everyone uses a single uv corn worker. +Everyone uses a single uvicorn worker. 00:31:47.960 --> 00:31:48.180 Yes. @@ -1420,7 +1420,7 @@ Yes. 
So the environment is the same for all frameworks that we test. 00:31:52.180 --> 00:32:01.140 -It's uv corn with uv loop, the siphon dependencies, and one worker pinned to one CPU core that's shielded. +It's uvicorn with uv loop, the Cython dependencies, and one worker pinned to one CPU core that's shielded. 00:32:01.140 --> 00:32:04.280 So it just sort of gets something comparable. diff --git a/transcripts/480-narwhals.txt b/transcripts/480-narwhals.txt index dd8f60a..8575d01 100644 --- a/transcripts/480-narwhals.txt +++ b/transcripts/480-narwhals.txt @@ -1360,7 +1360,7 @@ 00:39:35 Yeah. -00:39:36 But if it has enough of the functions of pandas or pollers, you're like, all right, this is +00:39:36 But if it has enough of the functions of pandas or Polars, you're like, all right, this is 00:39:41 probably good. @@ -1386,11 +1386,11 @@ 00:39:56 I do like the MKDocs where you can have these different examples. -00:40:00 One thing I noticed is you've got the pollers eager evaluation and you've got the pollers +00:40:00 One thing I noticed is you've got the Polars eager evaluation and you've got the Polars 00:40:06 lazy evaluation. -00:40:08 And when you have the pollers lazy, this function decorated with the decorator, the Narwhalify +00:40:08 And when you have the Polars lazy, this function decorated with the decorator, the Narwhalify 00:40:14 decorator, it itself returns something that is lazy and you've got to call collect on, right? @@ -1418,7 +1418,7 @@ 00:40:39 Yeah. -00:40:40 So the way you do that in pollers is you create a lazy frame versus data frame, right? +00:40:40 So the way you do that in Polars is you create a lazy frame versus data frame, right? 00:40:45 But then you've got to call collect on it, kind of like awaiting it if it were async, which @@ -1438,7 +1438,7 @@ 00:41:00 So one of the things that you talk about here is the pandas index, which is one of the -00:41:06 key differences between pollers and pandas. 
+00:41:06 key differences between Polars and pandas. 00:41:08 And you've classified pandas people into two categories. diff --git a/transcripts/480-narwhals.vtt b/transcripts/480-narwhals.vtt index 656e0b0..9fac6d5 100644 --- a/transcripts/480-narwhals.vtt +++ b/transcripts/480-narwhals.vtt @@ -2056,7 +2056,7 @@ Okay. Yeah. 00:39:36.340 --> 00:39:41.820 -But if it has enough of the functions of pandas or pollers, you're like, all right, this is +But if it has enough of the functions of pandas or Polars, you're like, all right, this is 00:39:41.820 --> 00:39:42.320 probably good. @@ -2095,13 +2095,13 @@ Let's see. I do like the MKDocs where you can have these different examples. 00:40:00.520 --> 00:40:06.600 -One thing I noticed is you've got the pollers eager evaluation and you've got the pollers +One thing I noticed is you've got the Polars eager evaluation and you've got the Polars 00:40:06.600 --> 00:40:08.020 lazy evaluation. 00:40:08.020 --> 00:40:14.420 -And when you have the pollers lazy, this function decorated with the decorator, the Narwhalify +And when you have the Polars lazy, this function decorated with the decorator, the Narwhalify 00:40:14.420 --> 00:40:19.460 decorator, it itself returns something that is lazy and you've got to call collect on, right? @@ -2143,7 +2143,7 @@ Exactly. Yeah. 00:40:40.300 --> 00:40:45.240 -So the way you do that in pollers is you create a lazy frame versus data frame, right? +So the way you do that in Polars is you create a lazy frame versus data frame, right? 00:40:45.240 --> 00:40:50.480 But then you've got to call collect on it, kind of like awaiting it if it were async, which @@ -2173,7 +2173,7 @@ Exactly. So one of the things that you talk about here is the pandas index, which is one of the 00:41:06.440 --> 00:41:08.620 -key differences between pollers and pandas. +key differences between Polars and pandas. 00:41:08.620 --> 00:41:12.540 And you've classified pandas people into two categories. 
diff --git a/transcripts/490-django-ninja.txt b/transcripts/490-django-ninja.txt index bafec6f..da220f4 100644 --- a/transcripts/490-django-ninja.txt +++ b/transcripts/490-django-ninja.txt @@ -754,7 +754,7 @@ 00:24:33 And then you, you work on it. -00:24:34 It's probably down to the production app server that use like G uv a corn or hyper corn or, or whatever you run it on. +00:24:34 It's probably down to the production app server that use like Gunicorn or Hypercorn or, or whatever you run it on. 00:24:43 Right. @@ -1626,7 +1626,7 @@ 00:53:39 You don't have to use them. -00:53:40 You don't have to say G unicorn with UVicorn workers. +00:53:40 You don't have to say Gunicorn with Uvicorn workers. 00:53:43 You can just run it as its own proper Linux server and so on. diff --git a/transcripts/490-django-ninja.vtt b/transcripts/490-django-ninja.vtt index 96cf1bf..5cafd73 100644 --- a/transcripts/490-django-ninja.vtt +++ b/transcripts/490-django-ninja.vtt @@ -1135,7 +1135,7 @@ So like, yeah, basically Django starts, is GI process, do some stuff on top, you And then you, you work on it. 00:24:34.540 --> 00:24:43.020 -It's probably down to the production app server that use like G uv a corn or hyper corn or, or whatever you run it on. +It's probably down to the production app server that use like Gunicorn or Hypercorn or, or whatever you run it on. 00:24:43.020 --> 00:24:43.240 Right. @@ -2449,7 +2449,7 @@ They're now a standalone server. You don't have to use them. 00:53:40.980 --> 00:53:43.980 -You don't have to say G unicorn with UVicorn workers. +You don't have to say Gunicorn with Uvicorn workers. 00:53:43.980 --> 00:53:47.980 You can just run it as its own proper Linux server and so on. 
diff --git a/transcripts/491-duckdb-and-python-ducks-and-snakes-living-together.txt b/transcripts/491-duckdb-and-python-ducks-and-snakes-living-together.txt index c4a8c81..651b67f 100644 --- a/transcripts/491-duckdb-and-python-ducks-and-snakes-living-together.txt +++ b/transcripts/491-duckdb-and-python-ducks-and-snakes-living-together.txt @@ -360,7 +360,7 @@ 00:08:45 Doing that in memory in Python is not that great. -00:08:47 But it sounds a little bit more like vector programming in the sense of kind of like you would do with pandas or pollers, right? +00:08:47 But it sounds a little bit more like vector programming in the sense of kind of like you would do with pandas or Polars, right? 00:08:56 You wouldn't loop over, you shouldn't loop over a pandas data frame processing each item. @@ -376,7 +376,7 @@ 00:09:17 So, it's a little different than pandas. -00:09:18 It's, you know, pollers is a bit more similar. +00:09:18 It's, you know, Polars is a bit more similar. 00:09:21 Where instead, pandas will process a whole column at a time. @@ -506,7 +506,7 @@ 00:12:37 Right. -00:12:38 And row-based is SQLite and column-based would be your pandas, your pollers, +00:12:38 And row-based is SQLite and column-based would be your pandas, your Polars, 00:12:42 and most of your cloud data warehouses, your snowflakes, that type of thing. diff --git a/transcripts/491-duckdb-and-python-ducks-and-snakes-living-together.vtt b/transcripts/491-duckdb-and-python-ducks-and-snakes-living-together.vtt index 321bff4..2bb000c 100644 --- a/transcripts/491-duckdb-and-python-ducks-and-snakes-living-together.vtt +++ b/transcripts/491-duckdb-and-python-ducks-and-snakes-living-together.vtt @@ -562,7 +562,7 @@ Yeah, doing that in memory in JavaScript is not that great. Doing that in memory in Python is not that great. 00:08:47.880 --> 00:08:56.720 -But it sounds a little bit more like vector programming in the sense of kind of like you would do with pandas or pollers, right? 
+But it sounds a little bit more like vector programming in the sense of kind of like you would do with pandas or Polars, right? 00:08:56.720 --> 00:09:02.300 You wouldn't loop over, you shouldn't loop over a pandas data frame processing each item. @@ -586,7 +586,7 @@ It's vectorized execution. So, it's a little different than pandas. 00:09:18.720 --> 00:09:21.340 -It's, you know, pollers is a bit more similar. +It's, you know, Polars is a bit more similar. 00:09:21.340 --> 00:09:25.500 Where instead, pandas will process a whole column at a time. @@ -781,7 +781,7 @@ which is you want to be row-based or do you want to be column-based? Right. 00:12:38.340 --> 00:12:42.380 -And row-based is SQLite and column-based would be your pandas, your pollers, +And row-based is SQLite and column-based would be your pandas, your Polars, 00:12:42.380 --> 00:12:46.220 and most of your cloud data warehouses, your snowflakes, that type of thing. diff --git a/transcripts/495-osmnx-python-and-openstreetmap.txt b/transcripts/495-osmnx-python-and-openstreetmap.txt index 774cc8f..95744b5 100644 --- a/transcripts/495-osmnx-python-and-openstreetmap.txt +++ b/transcripts/495-osmnx-python-and-openstreetmap.txt @@ -1562,7 +1562,7 @@ 00:52:11 - Yeah, it does look interesting. -00:52:12 I believe its foundational internals are maybe just the API, But there are pollers, not pandas, +00:52:12 I believe its foundational internals are maybe just the API, But there are Polars, not pandas, 00:52:19 so maybe that's not as good of a fit. diff --git a/transcripts/495-osmnx-python-and-openstreetmap.vtt b/transcripts/495-osmnx-python-and-openstreetmap.vtt index debfb6a..9cedc91 100644 --- a/transcripts/495-osmnx-python-and-openstreetmap.vtt +++ b/transcripts/495-osmnx-python-and-openstreetmap.vtt @@ -3544,7 +3544,7 @@ or languages on people. 
I believe its foundational internals are maybe just the API, 00:52:16.960 --> 00:52:19.040 -But there are pollers, not pandas, +But there are Polars, not pandas, 00:52:19.140 --> 00:52:20.540 so maybe that's not as good of a fit. diff --git a/transcripts/500-django-simple-deploy.txt b/transcripts/500-django-simple-deploy.txt index 8cb0570..724de0e 100644 --- a/transcripts/500-django-simple-deploy.txt +++ b/transcripts/500-django-simple-deploy.txt @@ -328,7 +328,7 @@ 00:17:04 You have the CLI installed, simple deploy calls that deploy command. -00:17:09 For Heroku, when it gets to that deployment stage, in the automated mode, Jingo simple deploy runs git push Heroku main. +00:17:09 For Heroku, when it gets to that deployment stage, in the automated mode, Django simple deploy runs git push Heroku main. 00:17:17 And so it's one of the nice things about this project, and part of the reason it has taken so long to get to 1.0 is quite adaptable to different approaches. diff --git a/transcripts/500-django-simple-deploy.vtt b/transcripts/500-django-simple-deploy.vtt index a821b65..01c6710 100644 --- a/transcripts/500-django-simple-deploy.vtt +++ b/transcripts/500-django-simple-deploy.vtt @@ -556,7 +556,7 @@ So I don't even have to know what fly deploy is using behind the scenes. You have the CLI installed, simple deploy calls that deploy command. 00:17:09.680 --> 00:17:16.800 -For Heroku, when it gets to that deployment stage, in the automated mode, Jingo simple deploy runs git push Heroku main. +For Heroku, when it gets to that deployment stage, in the automated mode, Django simple deploy runs git push Heroku main. 00:17:17.089 --> 00:17:27.240 And so it's one of the nice things about this project, and part of the reason it has taken so long to get to 1.0 is quite adaptable to different approaches. 
diff --git a/transcripts/502-django-ledger.txt b/transcripts/502-django-ledger.txt index d8fa1b5..308a460 100644 --- a/transcripts/502-django-ledger.txt +++ b/transcripts/502-django-ledger.txt @@ -928,7 +928,7 @@ 00:40:17 Yeah, the CMS. -00:40:19 Yeah, probably the most popular Jingo CMS, which is cool. +00:40:19 Yeah, probably the most popular Django CMS, which is cool. 00:40:22 Do you know? diff --git a/transcripts/502-django-ledger.vtt b/transcripts/502-django-ledger.vtt index 1f90209..46663fe 100644 --- a/transcripts/502-django-ledger.vtt +++ b/transcripts/502-django-ledger.vtt @@ -1549,7 +1549,7 @@ Are you familiar with Wagtail? Yeah, the CMS. 00:40:19.400 --> 00:40:22.220 -Yeah, probably the most popular Jingo CMS, which is cool. +Yeah, probably the most popular Django CMS, which is cool. 00:40:22.520 --> 00:40:22.780 Do you know? diff --git a/transcripts/510-10-polars-tools-and-techniques-to-level-up.txt b/transcripts/510-10-polars-tools-and-techniques-to-level-up.txt index 6988e83..cd64499 100644 --- a/transcripts/510-10-polars-tools-and-techniques-to-level-up.txt +++ b/transcripts/510-10-polars-tools-and-techniques-to-level-up.txt @@ -226,19 +226,19 @@ 00:09:14 Okay. -00:09:15 Well, let's talk data science and let's talk pollers. +00:09:15 Well, let's talk data science and let's talk Polars. -00:09:18 So I guess there's a lot of people who have heard of pollers and are experts. +00:09:18 So I guess there's a lot of people who have heard of Polars and are experts. 00:09:23 I see out in the live audience already. -00:09:25 There are some folks whose things they've created for pollers we're going to talk about in this show, actually. +00:09:25 There are some folks whose things they've created for Polars we're going to talk about in this show, actually. -00:09:31 So some people need no introduction to pollers. +00:09:31 So some people need no introduction to Polars. 
00:09:34 But there are many people listening to the show who are just getting into Python, using this as one of the footholds to kind of get a feel for the space. -00:09:42 And they might go like, what is pollers? +00:09:42 And they might go like, what is Polars? 00:09:44 Sure. @@ -248,7 +248,7 @@ 00:09:47 Like, tell us about this. -00:09:48 Yeah, I was about to say, before we get to the pollers, maybe we should just start with a data frame. +00:09:48 Yeah, I was about to say, before we get to the Polars, maybe we should just start with a data frame. 00:09:52 So a data frame is kind of an in-memory spreadsheet sort of thing. @@ -318,7 +318,7 @@ 00:12:11 And I find some of the stuff in Pandas kind of falls into that bucket. -00:12:14 Whereas you just do nice little chained function calls in pollers. +00:12:14 Whereas you just do nice little chained function calls in Polars. 00:12:18 And so you're reading, you know, filter, select kinds of things. @@ -356,7 +356,7 @@ 00:14:10 And maybe take the US as an example, that might be one fiftieth of the data that you wouldn't have to multiply. -00:14:16 So with pollers, it can even optimize. +00:14:16 So with Polars, it can even optimize. 00:14:19 Well, if there's a filter, do the filter first and then the calculation. @@ -474,7 +474,7 @@ 00:18:47 Sure. -00:18:48 As you might guess from the title, it kind of teaches your pollers. +00:18:48 As you might guess from the title, it kind of teaches your Polars. 00:18:51 And you can see on the screen there, there's a bunch of different icons as well. @@ -1216,7 +1216,7 @@ 00:44:55 And the only thing I really remember is really, really don't do it unless you absolutely have to. -00:45:01 And, you know, I routinely ingest a million rows with pollers in a sub-second. +00:45:01 And, you know, I routinely ingest a million rows with Polars in a sub-second. 
00:45:07 So I've yet to need concurrency because the size of the data stuff that I deal with isn't in that petabyte range where it actually would start to matter. @@ -1450,7 +1450,7 @@ 00:54:28 column element wise, which is a long function name, but it does aggregation operations on the values -00:54:34 inside the list data, supporting the same kind of aggregation that regular pollers does. So you +00:54:34 inside the list data, supporting the same kind of aggregation that regular Polars does. So you 00:54:39 can do a mean account or some or whatever on those things that are in the column. So the library seems @@ -1546,7 +1546,7 @@ 00:57:59 And, you know, we only really covered a few of the plugins. -00:58:03 The Awesome Polars list also has links to articles, links to blog posts, things on different interfaces for pollers on other languages that you can use to interact with it. +00:58:03 The Awesome Polars list also has links to articles, links to blog posts, things on different interfaces for Polars on other languages that you can use to interact with it. 00:58:18 So even over and above the libraries we talked about there, there's a fair amount of decent content there. 
@@ -1558,9 +1558,9 @@ 00:58:27 I'll let you have the final word but I'll do a few for you check out your Django book very cool -00:58:33 even has some HTMX in it check out your pollers course which I'll link to in the show notes and +00:58:33 even has some HTMX in it check out your Polars course which I'll link to in the show notes and -00:58:38 is very germane to this conversation and with that people want to do some pollers maybe want to +00:58:38 is very germane to this conversation and with that people want to do some Polars maybe want to 00:58:45 level up a little bit with some of these things we talked about what do you tell them um you know @@ -1588,9 +1588,9 @@ 00:59:33 which is a library that will help you bridge code between working with pandas -00:59:37 and working with pollers and other data frame libraries. +00:59:37 and working with Polars and other data frame libraries. -00:59:40 So if you've got some code and you want to try out pollers on it, but it's mostly in pandas +00:59:40 So if you've got some code and you want to try out Polars on it, but it's mostly in pandas 00:59:44 or some other data frame library, like, I don't know, Dask or something, diff --git a/transcripts/510-10-polars-tools-and-techniques-to-level-up.vtt b/transcripts/510-10-polars-tools-and-techniques-to-level-up.vtt index edb0e31..4f71591 100644 --- a/transcripts/510-10-polars-tools-and-techniques-to-level-up.vtt +++ b/transcripts/510-10-polars-tools-and-techniques-to-level-up.vtt @@ -388,25 +388,25 @@ Yeah, indeed. Okay. 00:09:15.630 --> 00:09:18.420 -Well, let's talk data science and let's talk pollers. +Well, let's talk data science and let's talk Polars. 00:09:18.670 --> 00:09:23.120 -So I guess there's a lot of people who have heard of pollers and are experts. +So I guess there's a lot of people who have heard of Polars and are experts. 00:09:23.290 --> 00:09:25.780 I see out in the live audience already. 
00:09:25.960 --> 00:09:31.140 -There are some folks whose things they've created for pollers we're going to talk about in this show, actually. +There are some folks whose things they've created for Polars we're going to talk about in this show, actually. 00:09:31.460 --> 00:09:34.700 -So some people need no introduction to pollers. +So some people need no introduction to Polars. 00:09:34.730 --> 00:09:42.060 But there are many people listening to the show who are just getting into Python, using this as one of the footholds to kind of get a feel for the space. 00:09:42.270 --> 00:09:44.260 -And they might go like, what is pollers? +And they might go like, what is Polars? 00:09:44.940 --> 00:09:45.300 Sure. @@ -421,7 +421,7 @@ And what's a data frame? Like, tell us about this. 00:09:48.200 --> 00:09:51.960 -Yeah, I was about to say, before we get to the pollers, maybe we should just start with a data frame. +Yeah, I was about to say, before we get to the Polars, maybe we should just start with a data frame. 00:09:52.030 --> 00:09:55.480 So a data frame is kind of an in-memory spreadsheet sort of thing. @@ -526,7 +526,7 @@ And that pulls me away from the, I'm trying to do something. And I find some of the stuff in Pandas kind of falls into that bucket. 00:12:14.540 --> 00:12:18.560 -Whereas you just do nice little chained function calls in pollers. +Whereas you just do nice little chained function calls in Polars. 00:12:18.880 --> 00:12:21.940 And so you're reading, you know, filter, select kinds of things. @@ -583,7 +583,7 @@ If I did that in pandas, it would do the math on every single thing and then fil And maybe take the US as an example, that might be one fiftieth of the data that you wouldn't have to multiply. 00:14:16.600 --> 00:14:18.460 -So with pollers, it can even optimize. +So with Polars, it can even optimize. 00:14:19.240 --> 00:14:22.100 Well, if there's a filter, do the filter first and then the calculation. 
@@ -823,7 +823,7 @@ Maybe give people a super quick rundown on what's on the course, and then we'll Sure. 00:18:48.010 --> 00:18:50.940 -As you might guess from the title, it kind of teaches your pollers. +As you might guess from the title, it kind of teaches your Polars. 00:18:51.760 --> 00:18:55.120 And you can see on the screen there, there's a bunch of different icons as well. @@ -2074,7 +2074,7 @@ It hasn't been something, you know, one of the things with concurrency, it's act And the only thing I really remember is really, really don't do it unless you absolutely have to. 00:45:01.820 --> 00:45:07.420 -And, you know, I routinely ingest a million rows with pollers in a sub-second. +And, you know, I routinely ingest a million rows with Polars in a sub-second. 00:45:07.640 --> 00:45:15.640 So I've yet to need concurrency because the size of the data stuff that I deal with isn't in that petabyte range where it actually would start to matter. @@ -2470,7 +2470,7 @@ the column. So for example, one of the stronger uses here is something called ag column element wise, which is a long function name, but it does aggregation operations on the values 00:54:34.840 --> 00:54:39.400 -inside the list data, supporting the same kind of aggregation that regular pollers does. So you +inside the list data, supporting the same kind of aggregation that regular Polars does. So you 00:54:39.400 --> 00:54:46.360 can do a mean account or some or whatever on those things that are in the column. So the library seems @@ -2614,7 +2614,7 @@ Yeah, for sure. And, you know, we only really covered a few of the plugins. 00:58:03.200 --> 00:58:17.340 -The Awesome Polars list also has links to articles, links to blog posts, things on different interfaces for pollers on other languages that you can use to interact with it. +The Awesome Polars list also has links to articles, links to blog posts, things on different interfaces for Polars on other languages that you can use to interact with it. 
00:58:18.020 --> 00:58:23.880 So even over and above the libraries we talked about there, there's a fair amount of decent content there. @@ -2632,10 +2632,10 @@ A couple of final calls to action. I'll let you have the final word but I'll do a few for you check out your Django book very cool 00:58:33.240 --> 00:58:38.900 -even has some HTMX in it check out your pollers course which I'll link to in the show notes and +even has some HTMX in it check out your Polars course which I'll link to in the show notes and 00:58:38.940 --> 00:58:45.220 -is very germane to this conversation and with that people want to do some pollers maybe want to +is very germane to this conversation and with that people want to do some Polars maybe want to 00:58:45.720 --> 00:58:49.440 level up a little bit with some of these things we talked about what do you tell them um you know @@ -2686,13 +2686,13 @@ which is a library that will help you bridge code between working with pandas 00:59:37.520 --> 00:59:40.420 -and working with pollers and other data frame libraries. +and working with Polars and other data frame libraries. 00:59:40.700 --> 00:59:41.780 So if you've got some code 00:59:41.920 --> 00:59:43.220 -and you want to try out pollers on it, +and you want to try out Polars on it, 00:59:43.230 --> 00:59:44.620 but it's mostly in pandas diff --git a/transcripts/516-accelerating-python-data-science-at-nvidia.txt b/transcripts/516-accelerating-python-data-science-at-nvidia.txt index 96f6092..48e3f4e 100644 --- a/transcripts/516-accelerating-python-data-science-at-nvidia.txt +++ b/transcripts/516-accelerating-python-data-science-at-nvidia.txt @@ -210,7 +210,7 @@ 00:09:51 And I think just coincidentally, or just the way it works out, data science type of work, and really most significantly, the data science libraries, the way that they're built and the way they execute, line up perfectly with the way GPUs do their work. 
-00:10:07 And what I'm thinking of is pandas, pollers, all the vector type of stuff. +00:10:07 And what I'm thinking of is pandas, Polars, all the vector type of stuff. 00:10:12 So instead of saying, I'm going to loop over and do one thing at a time, you just say, here's a million rows apply this operation to all million and then either update it in place or give me a new data frame or whatever and that is perfect for like let me load that into a gpu and turn it loose in parallel on all these pieces because as a programmer a data scientist i don't i don't write the imperative bits of it right i just let it go and it's easy for things like rapids to grab that and parallelize diff --git a/transcripts/516-accelerating-python-data-science-at-nvidia.vtt b/transcripts/516-accelerating-python-data-science-at-nvidia.vtt index a67bdbb..887412b 100644 --- a/transcripts/516-accelerating-python-data-science-at-nvidia.vtt +++ b/transcripts/516-accelerating-python-data-science-at-nvidia.vtt @@ -319,7 +319,7 @@ Yeah, absolutely. And I think just coincidentally, or just the way it works out, data science type of work, and really most significantly, the data science libraries, the way that they're built and the way they execute, line up perfectly with the way GPUs do their work. 00:10:07.560 --> 00:10:12.740 -And what I'm thinking of is pandas, pollers, all the vector type of stuff. +And what I'm thinking of is pandas, Polars, all the vector type of stuff. 
00:10:12.920 --> 00:10:43.640 So instead of saying, I'm going to loop over and do one thing at a time, you just say, here's a million rows apply this operation to all million and then either update it in place or give me a new data frame or whatever and that is perfect for like let me load that into a gpu and turn it loose in parallel on all these pieces because as a programmer a data scientist i don't i don't write the imperative bits of it right i just let it go and it's easy for things like rapids to grab that and parallelize diff --git a/transcripts/522-codecut-ai.txt b/transcripts/522-codecut-ai.txt index f7c3a96..8d16789 100644 --- a/transcripts/522-codecut-ai.txt +++ b/transcripts/522-codecut-ai.txt @@ -30,7 +30,7 @@ 00:01:17 We live stream the raw, uncut version of each episode on YouTube. -00:01:22 Just visit talkpython.fm/youtube to see the schedule of upcoming events. +00:01:22 Just visit talkpython.fm/youtube to see the schedule of upcoming events. 00:01:27 And be sure to subscribe and press the bell so you'll get notified anytime we're recording. diff --git a/transcripts/522-codecut-ai.vtt b/transcripts/522-codecut-ai.vtt index e04573c..ad3d4d2 100644 --- a/transcripts/522-codecut-ai.vtt +++ b/transcripts/522-codecut-ai.vtt @@ -49,7 +49,7 @@ That's right. We live stream the raw, uncut version of each episode on YouTube. 00:01:22.020 --> 00:01:26.680 -Just visit talkpython.fm/youtube to see the schedule of upcoming events. +Just visit talkpython.fm/youtube to see the schedule of upcoming events. 00:01:27.240 --> 00:01:31.520 And be sure to subscribe and press the bell so you'll get notified anytime we're recording. diff --git a/transcripts/523-pyrefly.txt b/transcripts/523-pyrefly.txt index a5ac403..0b6fb8e 100644 --- a/transcripts/523-pyrefly.txt +++ b/transcripts/523-pyrefly.txt @@ -26,7 +26,7 @@ 00:01:14 We live stream the raw, uncut version of each episode on YouTube. -00:01:18 Just visit talkpython.fm/youtube to see the schedule of upcoming events. 
+00:01:18 Just visit talkpython.fm/youtube to see the schedule of upcoming events. 00:01:23 and be sure to subscribe and press the bell so you'll get notified anytime we're recording. diff --git a/transcripts/523-pyrefly.vtt b/transcripts/523-pyrefly.vtt index 24eb0a0..3f1b3c4 100644 --- a/transcripts/523-pyrefly.vtt +++ b/transcripts/523-pyrefly.vtt @@ -43,7 +43,7 @@ That's right. We live stream the raw, uncut version of each episode on YouTube. 00:01:18.480 --> 00:01:22.920 -Just visit talkpython.fm/youtube to see the schedule of upcoming events. +Just visit talkpython.fm/youtube to see the schedule of upcoming events. 00:01:23.420 --> 00:01:27.780 and be sure to subscribe and press the bell so you'll get notified anytime we're recording. diff --git a/transcripts/524-things-python-developers-should-learn-in-2025-no-names.vtt b/transcripts/524-things-python-developers-should-learn-in-2025-no-names.vtt index a8ed10e..9a2858f 100644 --- a/transcripts/524-things-python-developers-should-learn-in-2025-no-names.vtt +++ b/transcripts/524-things-python-developers-should-learn-in-2025-no-names.vtt @@ -31,7 +31,7 @@ This is Talk Python To Me, episode 524, recorded September 22nd, 2025. five. Welcome to Talk Python To Me, a weekly podcast on Python. This is your host, Michael 00:01:04.000 --> 00:01:33.680 -Kennedy. Follow me on Mastodon, where I'm @mkennedy, and follow the podcast using @talkpython, both accounts over at fosstodon.org, and keep up with the show and listen to over nine years of episodes at talkpython.fm. If you want to be part of our live episodes, you can find the live streams over on YouTube. Subscribe to our YouTube channel over at talkpython.fm/youtube and get notified about upcoming shows. This episode is brought to you by Sentry. Don't let those errors go unnoticed. Use Sentry like we do here at Talk Python. +Kennedy. 
Follow me on Mastodon, where I'm @mkennedy, and follow the podcast using @talkpython, both accounts over at fosstodon.org, and keep up with the show and listen to over nine years of episodes at talkpython.fm. If you want to be part of our live episodes, you can find the live streams over on YouTube. Subscribe to our YouTube channel over at talkpython.fm/youtube and get notified about upcoming shows. This episode is brought to you by Sentry. Don't let those errors go unnoticed. Use Sentry like we do here at Talk Python. 00:01:34.220 --> 00:01:37.120 Sign up at talkpython.fm/sentry. @@ -2401,7 +2401,7 @@ You can also find the iTunes feed at /itunes, the Google Play feed at /play, and We're live streaming most of our recordings these days. 01:08:30.500 --> 01:08:37.980 -If you want to be part of the show and have your comments featured on the air, be sure to subscribe to our YouTube channel at talkpython.fm/youtube. +If you want to be part of the show and have your comments featured on the air, be sure to subscribe to our YouTube channel at talkpython.fm/youtube. 01:08:38.779 --> 01:08:40.120 This is your host, Michael Kennedy. diff --git a/transcripts/525-nicegui-no-names.vtt b/transcripts/525-nicegui-no-names.vtt index 0fcb7e6..f663fff 100644 --- a/transcripts/525-nicegui-no-names.vtt +++ b/transcripts/525-nicegui-no-names.vtt @@ -28,7 +28,7 @@ Follow me on Mastodon, where I'm @mkennedy, and follow the podcast using @talkpy If you want to be part of our live episodes, you can find the live streams over on YouTube. 00:01:27.560 --> 00:01:33.420 -Subscribe to our YouTube channel over at talkpython.fm/youtube and get notified about upcoming shows. +Subscribe to our YouTube channel over at talkpython.fm/youtube and get notified about upcoming shows. 00:01:34.020 --> 00:01:37.700 This episode is sponsored by Posit Connect from the makers of Shiny. 
@@ -2458,7 +2458,7 @@ You can also find the iTunes feed at /itunes, the Google Play feed at /play, and We're live streaming most of our recordings these days. 01:17:01.100 --> 01:17:08.520 -If you want to be part of the show and have your comments featured on the air, be sure to subscribe to our YouTube channel at talkpython.fm/youtube. +If you want to be part of the show and have your comments featured on the air, be sure to subscribe to our YouTube channel at talkpython.fm/youtube. 01:17:09.520 --> 01:17:10.640 This is your host, Michael Kennedy. diff --git a/transcripts/525-nicegui.txt b/transcripts/525-nicegui.txt index b715368..2c421f4 100644 --- a/transcripts/525-nicegui.txt +++ b/transcripts/525-nicegui.txt @@ -16,7 +16,7 @@ 00:01:23 If you want to be part of our live episodes, you can find the live streams over on YouTube. -00:01:27 Subscribe to our YouTube channel over at talkpython.fm/youtube and get notified about upcoming shows. +00:01:27 Subscribe to our YouTube channel over at talkpython.fm/youtube and get notified about upcoming shows. 00:01:34 This episode is sponsored by Posit Connect from the makers of Shiny. @@ -1628,7 +1628,7 @@ 01:16:58 We're live streaming most of our recordings these days. -01:17:01 If you want to be part of the show and have your comments featured on the air, be sure to subscribe to our YouTube channel at talkpython.fm/youtube. +01:17:01 If you want to be part of the show and have your comments featured on the air, be sure to subscribe to our YouTube channel at talkpython.fm/youtube. 01:17:09 This is your host, Michael Kennedy. diff --git a/transcripts/525-nicegui.vtt b/transcripts/525-nicegui.vtt index 4bb7c3f..8d7febf 100644 --- a/transcripts/525-nicegui.vtt +++ b/transcripts/525-nicegui.vtt @@ -28,7 +28,7 @@ WEBVTT If you want to be part of our live episodes, you can find the live streams over on YouTube. 
00:01:27.560 --> 00:01:33.420 -Subscribe to our YouTube channel over at talkpython.fm/youtube and get notified about upcoming shows. +Subscribe to our YouTube channel over at talkpython.fm/youtube and get notified about upcoming shows. 00:01:34.020 --> 00:01:37.700 This episode is sponsored by Posit Connect from the makers of Shiny. @@ -2458,7 +2458,7 @@ WEBVTT We're live streaming most of our recordings these days. 01:17:01.100 --> 01:17:08.520 -If you want to be part of the show and have your comments featured on the air, be sure to subscribe to our YouTube channel at talkpython.fm/youtube. +If you want to be part of the show and have your comments featured on the air, be sure to subscribe to our YouTube channel at talkpython.fm/youtube. 01:17:09.520 --> 01:17:10.640 This is your host, Michael Kennedy. diff --git a/transcripts/526-building-data-science-with-foundation-llm-models.txt b/transcripts/526-building-data-science-with-foundation-llm-models.txt index 8f1f1aa..8c1ed8a 100644 --- a/transcripts/526-building-data-science-with-foundation-llm-models.txt +++ b/transcripts/526-building-data-science-with-foundation-llm-models.txt @@ -1456,7 +1456,7 @@ 00:45:50 one little piece of functionality. You can vendor in stuff a lot easier if it's a low stakes. -00:45:54 The other thing that I think AI assisted coding help I've seen help with is, now this is, +00:45:54 The other thing that I think AI assisted coding help I've seen help with is, now this is, 00:45:59 this will be a bit controversial, but it's going from prototype to production. And what I really diff --git a/transcripts/526-building-data-science-with-foundation-llm-models.vtt b/transcripts/526-building-data-science-with-foundation-llm-models.vtt index bd43fa3..0aea24f 100644 --- a/transcripts/526-building-data-science-with-foundation-llm-models.vtt +++ b/transcripts/526-building-data-science-with-foundation-llm-models.vtt @@ -2347,7 +2347,7 @@ WEBVTT one little piece of functionality. 
You can vendor in stuff a lot easier if it's a low stakes.

00:45:54.800 --> 00:45:59.400
-The other thing that I think AI assisted coding help I've seen help with is, now this is,
+The other thing that I think AI assisted coding help I've seen help with is, now this is,

00:45:59.420 --> 00:46:03.680
this will be a bit controversial, but it's going from prototype to production. And what I really

diff --git a/transcripts/528-python-apps-with-llm-building-blocks.txt b/transcripts/528-python-apps-with-llm-building-blocks.txt
index 9a8e9ab..ea0dcd7 100644
--- a/transcripts/528-python-apps-with-llm-building-blocks.txt
+++ b/transcripts/528-python-apps-with-llm-building-blocks.txt
@@ -1732,7 +1732,7 @@

00:48:07 Yeah.

-00:48:08 So the thing that Simon Willis's library does allow you to do, is you are able to
+00:48:08 So the thing that Simon Willison's library does allow you to do, is you are able to

00:48:13 say, well, I have a prompt over here, but the output that I'm supposed to get out, well, that has to be

diff --git a/transcripts/528-python-apps-with-llm-building-blocks.vtt b/transcripts/528-python-apps-with-llm-building-blocks.vtt
index 56894d3..e436542 100644
--- a/transcripts/528-python-apps-with-llm-building-blocks.vtt
+++ b/transcripts/528-python-apps-with-llm-building-blocks.vtt
@@ -2956,7 +2956,7 @@ WEBVTT

Yeah.
00:48:08.060 --> 00:48:13.140
-So the thing that Simon Willis's library does allow you to do, is you are able to
+So the thing that Simon Willison's library does allow you to do, is you are able to

00:48:13.160 --> 00:48:17.920
say, well, I have a prompt over here, but the output that I'm supposed to get out, well, that has to be

diff --git a/transcripts/529-python-apps-with-llm-building-blocks.txt b/transcripts/529-python-apps-with-llm-building-blocks.txt
index 3b5e817..aab799d 100644
--- a/transcripts/529-python-apps-with-llm-building-blocks.txt
+++ b/transcripts/529-python-apps-with-llm-building-blocks.txt
@@ -1594,11 +1594,11 @@

00:54:30 than they are today, but they had to be perfect, like literally had to be perfect. and so

-00:54:35 there was a certain, I think different attitude around, game development then, than there is
+00:54:35 there was a certain, I think different attitude around, game development then, than there is

00:54:41 today. And remember they were also working in assembly language. so it was easy language

-00:54:46 to work. I don't know if people like this word, but I would call it kind of hardcore, writing
+00:54:46 to work. I don't know if people like this word, but I would call it kind of hardcore, writing

00:54:51 NES games in the 1980s. It was, you know, that kind of programming it's, it's just, it was just

diff --git a/transcripts/529-python-apps-with-llm-building-blocks.vtt b/transcripts/529-python-apps-with-llm-building-blocks.vtt
index 86e6055..88de2f3 100644
--- a/transcripts/529-python-apps-with-llm-building-blocks.vtt
+++ b/transcripts/529-python-apps-with-llm-building-blocks.vtt
@@ -2650,13 +2650,13 @@ WEBVTT

than they are today, but they had to be perfect, like literally had to be perfect. 
and so

00:54:35.400 --> 00:54:41.200
-there was a certain, I think different attitude around, game development then, than there is
+there was a certain, I think different attitude around, game development then, than there is

00:54:41.340 --> 00:54:46.160
today. And remember they were also working in assembly language. so it was easy language

00:54:46.360 --> 00:54:51.340
-to work. I don't know if people like this word, but I would call it kind of hardcore, writing
+to work. I don't know if people like this word, but I would call it kind of hardcore, writing

00:54:51.600 --> 00:54:57.580
NES games in the 1980s. It was, you know, that kind of programming it's, it's just, it was just

diff --git a/transcripts/530-anywidget.txt b/transcripts/530-anywidget.txt
index 07fdd1a..49d750d 100644
--- a/transcripts/530-anywidget.txt
+++ b/transcripts/530-anywidget.txt
@@ -228,7 +228,7 @@

00:07:42 And then you've got the backend stuff where Python people live doing NumPy,

-00:07:46 pollers, AppLotlib, et cetera.
+00:07:46 Polars, Matplotlib, et cetera.

00:07:48 Do you want to riff on that challenge a little bit?

diff --git a/transcripts/530-anywidget.vtt b/transcripts/530-anywidget.vtt
index dd54695..4c94d35 100644
--- a/transcripts/530-anywidget.vtt
+++ b/transcripts/530-anywidget.vtt
@@ -403,7 +403,7 @@ WEBVTT

where Python people live doing NumPy,

00:07:46.080 --> 00:07:47.880
-pollers, AppLotlib, et cetera.
+Polars, Matplotlib, et cetera.

00:07:48.280 --> 00:07:50.560
Do you want to riff on that challenge a little bit?

diff --git a/transcripts/531-talk-python-in-prod.txt b/transcripts/531-talk-python-in-prod.txt
index e43070e..bb02901 100644
--- a/transcripts/531-talk-python-in-prod.txt
+++ b/transcripts/531-talk-python-in-prod.txt
@@ -1702,7 +1702,7 @@

00:52:12 That was my philosophy. 
-00:52:13 The, the structure of your site is, it has a lot of different pieces to it using +00:52:13 The, the structure of your site is, it has a lot of different pieces to it using 00:52:20 different technology. You spend some time talking about like static sites and using static sites for @@ -2066,7 +2066,7 @@ 01:05:12 Right. -01:05:13 I think that's the stuff that's, that's, globally applicable to a reader, which is nice. +01:05:13 I think that's the stuff that's, that's, globally applicable to a reader, which is nice. 01:05:18 so you've, it's now even a few months further on with Hetzner. diff --git a/transcripts/531-talk-python-in-prod.vtt b/transcripts/531-talk-python-in-prod.vtt index f89f290..7ce6b76 100644 --- a/transcripts/531-talk-python-in-prod.vtt +++ b/transcripts/531-talk-python-in-prod.vtt @@ -2848,7 +2848,7 @@ WEBVTT That was my philosophy. 00:52:13.460 --> 00:52:20.240 -The, the structure of your site is, it has a lot of different pieces to it using +The, the structure of your site is, it has a lot of different pieces to it using 00:52:20.260 --> 00:52:25.580 different technology. You spend some time talking about like static sites and using static sites for @@ -3430,7 +3430,7 @@ WEBVTT Right. 01:05:13.100 --> 01:05:18.220 -I think that's the stuff that's, that's, globally applicable to a reader, which is nice. +I think that's the stuff that's, that's, globally applicable to a reader, which is nice. 01:05:18.690 --> 01:05:22.320 so you've, it's now even a few months further on with Hetzner. diff --git a/transcripts/532-python-2025-year-in-review.txt b/transcripts/532-python-2025-year-in-review.txt index 82b9233..21db9dc 100644 --- a/transcripts/532-python-2025-year-in-review.txt +++ b/transcripts/532-python-2025-year-in-review.txt @@ -726,7 +726,7 @@ 00:25:20 Like inline script metadata coming in and help making that more of a thing. -00:25:24 Disclaimer, I was the PEP delegate for getting that in. 
+00:25:24 Disclaimer, I was the PEP delegate for getting that in. 00:25:27 But I just think that's been a really awesome trend And I'm hoping we can kind of leverage that a bit. @@ -1612,7 +1612,7 @@ 00:56:48 Yes, exactly. -00:56:49 It should also be preface that Barry created the PEP process. He should have started that one. +00:56:49 It should also be preface that Barry created the PEP process. He should have started that one. 00:56:55 It is that old. @@ -1636,7 +1636,7 @@ 00:57:26 Let's put it that way. -00:57:28 For the PEP process, I think for a lot of people, it's not obvious how difficult the process is. +00:57:28 For the PEP process, I think for a lot of people, it's not obvious how difficult the process is. 00:57:35 I mean, it wasn't even obvious to me. @@ -1694,7 +1694,7 @@ 00:59:42 And that cannot be an acceptable way to discuss the evolution of the language. -00:59:48 Especially since apparently now every single PEP author of any contentious or semi contentious pep. +00:59:48 Especially since apparently now every single PEP author of any contentious or semi contentious pep. 00:59:55 Although I have to say, Pep 810 had such broad support. diff --git a/transcripts/532-python-2025-year-in-review.vtt b/transcripts/532-python-2025-year-in-review.vtt index f792004..cb1d336 100644 --- a/transcripts/532-python-2025-year-in-review.vtt +++ b/transcripts/532-python-2025-year-in-review.vtt @@ -1201,7 +1201,7 @@ WEBVTT and help making that more of a thing. 00:25:24.380 --> 00:25:26.860 -Disclaimer, I was the PEP delegate for getting that in. +Disclaimer, I was the PEP delegate for getting that in. 00:25:27.140 --> 00:25:29.400 But I just think that's been a really awesome trend @@ -2638,7 +2638,7 @@ WEBVTT Yes, exactly. 00:56:49.210 --> 00:56:53.060 -It should also be preface that Barry created the PEP process. He should have started that one. +It should also be preface that Barry created the PEP process. He should have started that one. 
00:56:55.780 --> 00:56:57.280 It is that old. @@ -2674,7 +2674,7 @@ WEBVTT Let's put it that way. 00:57:28.840 --> 00:57:30.900 -For the PEP process, I think for a lot of people, +For the PEP process, I think for a lot of people, 00:57:31.160 --> 00:57:35.080 it's not obvious how difficult the process is. @@ -2782,7 +2782,7 @@ WEBVTT And that cannot be an acceptable way to discuss the evolution of the language. 00:59:48.420 --> 00:59:55.100 -Especially since apparently now every single PEP author of any contentious or semi contentious pep. +Especially since apparently now every single PEP author of any contentious or semi contentious pep. 00:59:55.420 --> 00:59:58.820 Although I have to say, Pep 810 had such broad support. diff --git a/transcripts/533-web-frameworks-in-prod-by-their-creators.txt b/transcripts/533-web-frameworks-in-prod-by-their-creators.txt index c6bdb4e..d841204 100644 --- a/transcripts/533-web-frameworks-in-prod-by-their-creators.txt +++ b/transcripts/533-web-frameworks-in-prod-by-their-creators.txt @@ -782,7 +782,7 @@ 00:21:21 Don't have to start calculating whatever. -00:21:23 Most of the stuff we run nowadays with uv corn or Django deployment +00:21:23 Most of the stuff we run nowadays with uvicorn or Django deployment 00:21:28 up until I think three months ago or so was running under the Unicorn, diff --git a/transcripts/533-web-frameworks-in-prod-by-their-creators.vtt b/transcripts/533-web-frameworks-in-prod-by-their-creators.vtt index c322bf6..79b56b9 100644 --- a/transcripts/533-web-frameworks-in-prod-by-their-creators.vtt +++ b/transcripts/533-web-frameworks-in-prod-by-their-creators.vtt @@ -1312,7 +1312,7 @@ WEBVTT Don't have to start calculating whatever. 
00:21:23.280 --> 00:21:28.580 -Most of the stuff we run nowadays with uv corn or Django deployment +Most of the stuff we run nowadays with uvicorn or Django deployment 00:21:28.900 --> 00:21:34.080 up until I think three months ago or so was running under the Unicorn, diff --git a/transcripts/534-diskcache-your-secret-python-perf-weapon.txt b/transcripts/534-diskcache-your-secret-python-perf-weapon.txt index 29b687d..b777ee1 100644 --- a/transcripts/534-diskcache-your-secret-python-perf-weapon.txt +++ b/transcripts/534-diskcache-your-secret-python-perf-weapon.txt @@ -626,7 +626,7 @@ 00:16:22 Cause one thing that's like really nice and convenient in terms of like CICD and deployments -00:16:26 and all that, oh, suppose you want to scale horizontally and there's like Docker containers +00:16:26 and all that, oh, suppose you want to scale horizontally and there's like Docker containers 00:16:31 running on the left and there's this one Postgres thing running on the right. diff --git a/transcripts/534-diskcache-your-secret-python-perf-weapon.vtt b/transcripts/534-diskcache-your-secret-python-perf-weapon.vtt index 46e3d67..a322400 100644 --- a/transcripts/534-diskcache-your-secret-python-perf-weapon.vtt +++ b/transcripts/534-diskcache-your-secret-python-perf-weapon.vtt @@ -1114,7 +1114,7 @@ Had that stable base, yeah sir Cause one thing that's like really nice and convenient in terms of like CICD and deployments 00:16:26.620 --> 00:16:31.140 -and all that, oh, suppose you want to scale horizontally and there's like Docker containers +and all that, oh, suppose you want to scale horizontally and there's like Docker containers 00:16:31.460 --> 00:16:34.780 running on the left and there's this one Postgres thing running on the right. 
diff --git a/transcripts/538-python-in-digital-humanities.txt b/transcripts/538-python-in-digital-humanities.txt
index 67af577..bf68584 100644
--- a/transcripts/538-python-in-digital-humanities.txt
+++ b/transcripts/538-python-in-digital-humanities.txt
@@ -1378,7 +1378,7 @@

00:44:13 If I could get rid of all the ads, I do not need a Yeti thing, whatever that is.

-00:44:17 the glass, not the mythical thing, but frozen flask, which does a similar thing for flask
+00:44:17 the glass, not the mythical thing, but Frozen Flask, which does a similar thing for Flask

00:44:25 apps. If you're a flask person probably would work with court. Don't know for sure, but probably.

@@ -1776,9 +1776,9 @@

00:56:52 Somebody got, oh gosh, what was the chain?

-00:56:55 This is the whole, JavaScript, the PyCon talk where got like Firefox
+00:56:55 This is the whole, JavaScript, the PyCon talk where got like Firefox

-00:57:04 compiled into, not WASM, into, ASM JS or something like that.
+00:57:04 compiled into, not WASM, into, ASM JS or something like that.

00:57:10 So it was run like Chrome was running Firefox, which was running, I think

diff --git a/transcripts/538-python-in-digital-humanities.vtt b/transcripts/538-python-in-digital-humanities.vtt
index a3be1c2..3e76270 100644
--- a/transcripts/538-python-in-digital-humanities.vtt
+++ b/transcripts/538-python-in-digital-humanities.vtt
@@ -2233,7 +2233,7 @@ WEBVTT

If I could get rid of all the ads, I do not need a Yeti thing, whatever that is.

00:44:17.200 --> 00:44:24.900
-the glass, not the mythical thing, but frozen flask, which does a similar thing for flask
+the glass, not the mythical thing, but Frozen Flask, which does a similar thing for Flask

00:44:25.300 --> 00:44:30.180
apps. If you're a flask person probably would work with court. Don't know for sure, but probably.

@@ -2908,10 +2908,10 @@ WEBVTT

Somebody got, oh gosh, what was the chain?
00:56:55.510 --> 00:57:03.180
-This is the whole, JavaScript, the PyCon talk where got like Firefox
+This is the whole, JavaScript, the PyCon talk where got like Firefox

00:57:04.280 --> 00:57:10.080
-compiled into, not WASM, into, ASM JS or something like that.
+compiled into, not WASM, into, ASM JS or something like that.

00:57:10.250 --> 00:57:14.300
So it was run like Chrome was running Firefox, which was running, I think

diff --git a/transcripts/539-catching-up-with-the-python-typing-council.txt b/transcripts/539-catching-up-with-the-python-typing-council.txt
index 25d6386..55b6f02 100644
--- a/transcripts/539-catching-up-with-the-python-typing-council.txt
+++ b/transcripts/539-catching-up-with-the-python-typing-council.txt
@@ -742,9 +742,9 @@

00:20:31 Yeah, that made it hard because often there's peps built on top of each other.

-00:20:35 So then in the extreme, you might see like one thing in one PEP and then there's
+00:20:35 So then in the extreme, you might see like one thing in one PEP and then there's

-00:20:39 another PEP that adds an aspect of it, another one that adds another aspect.
+00:20:39 another PEP that adds an aspect of it, another one that adds another aspect.

00:20:43 And overall it makes it very hard to follow.

diff --git a/transcripts/540-modern-python-monorepo-with-uv-and-prek-transcript.txt b/transcripts/540-modern-python-monorepo-with-uv-and-prek-transcript.txt
index bcfce13..8ac2958 100644
--- a/transcripts/540-modern-python-monorepo-with-uv-and-prek-transcript.txt
+++ b/transcripts/540-modern-python-monorepo-with-uv-and-prek-transcript.txt
@@ -180,7 +180,7 @@

00:06:29 where members are corporates.

-00:06:30 And people make decisions in both foundation and projects or in PMCs, so-called project management committees.
+00:06:30 And people make decisions in both foundation and projects or in PMCs, so-called project management committees.

00:06:39 And Airflow is one of the PMCs. 
diff --git a/transcripts/540-modern-python-monorepo-with-uv-and-prek-transcript.vtt b/transcripts/540-modern-python-monorepo-with-uv-and-prek-transcript.vtt
index 6c9349c..dba25f4 100644
--- a/transcripts/540-modern-python-monorepo-with-uv-and-prek-transcript.vtt
+++ b/transcripts/540-modern-python-monorepo-with-uv-and-prek-transcript.vtt
@@ -307,7 +307,7 @@ So every member is an individual, not a corporate, as opposed to like Linux Soft

where members are corporates.

00:06:30.760 --> 00:06:38.720
-And people make decisions in both foundation and projects or in PMCs, so-called project management committees.
+And people make decisions in both foundation and projects or in PMCs, so-called project management committees.

00:06:39.280 --> 00:06:41.160
And Airflow is one of the PMCs.

diff --git a/transcripts/541-monty-python-in-rust-for-ai-transcript-final.txt b/transcripts/541-monty-python-in-rust-for-ai-transcript-final.txt
index e8c561f..3d13093 100644
--- a/transcripts/541-monty-python-in-rust-for-ai-transcript-final.txt
+++ b/transcripts/541-monty-python-in-rust-for-ai-transcript-final.txt
@@ -1020,7 +1020,7 @@

00:35:02 And even more powerful.

-00:35:03 If I go click on that, on that particular one, if you kick on the pair tuples, or go just, perhaps.
+00:35:03 If I go click on that, on that particular one, if you kick on the pair tuples, or go just, perhaps.

00:35:10 Yeah.

@@ -1052,7 +1052,7 @@

00:35:51 What's even cooler is under the hood.

-00:35:52 They're using, Oh, I'm having a blank on the name, but they're, they're not even measuring, they're measuring like CPU and CPU instructions.
+00:35:52 They're using, Oh, I'm having a blank on the name, but they're, they're not even measuring, they're measuring like CPU and CPU instructions.

00:36:00 Okay.

@@ -1070,7 +1070,7 @@

00:36:16 Well, cool.

-00:36:18 I don't know what that's about, but there's a, a polygonal polygon.
+00:36:18 I don't know what that's about, but there's a, a polygonal polygon. 
00:36:23 No, well, I don't know what this is a cartoon, but there's also the app. @@ -1104,7 +1104,7 @@ 00:37:11 The thing where Monty really excels. -00:37:13 So if you scroll down a bit and I can talk you through the table, it's like near the bottom of the, of the read me. +00:37:13 So if you scroll down a bit and I can talk you through the table, it's like near the bottom of the, of the read me. 00:37:19 but yeah, there we are. @@ -1116,13 +1116,13 @@ 00:37:37 So, and actually in the hot, hot loop in benchmarks, we see one plus one, going from codes to result in Monty taking about 900 nanoseconds. -00:37:45 So under a microsecond, again, that's, that's microsecond, not millisecond or second. +00:37:45 So under a microsecond, again, that's, that's microsecond, not millisecond or second. -00:37:51 when you compare that to like running something in Docker, which is taking in, in my example here, 195 milliseconds, Pyodide, Pyodide is awesome project. +00:37:51 when you compare that to like running something in Docker, which is taking in, in my example here, 195 milliseconds, Pyodide, Pyodide is awesome project. 00:38:01 Big fan of, of the team, allowing you to run Python in the browser, but wasn't designed for this use case, running, going from zero to like getting a result in Pyodide is, 2.8 seconds. -00:38:13 Starlark's a special case of another project, a bit like Monty, but a bit more limited. +00:38:13 Starlark's a special case of another project, a bit like Monty, but a bit more limited. 00:38:19 but sandboxing, I was talking earlier about that being one of the main options, like go run a, basically spin up a new container somewhere. @@ -1176,7 +1176,7 @@ 00:40:02 Right. -00:40:03 Like you'll see it doing, you know, CSV types of things with Python and all sorts of stuff. +00:40:03 Like you'll see it doing, you know, CSV types of things with Python and all sorts of stuff. 00:40:08 And so that's a really good place where that Monty could be the foundation of it. 
@@ -1184,7 +1184,7 @@ 00:40:13 Yeah, exactly. -00:40:13 And, you know, the other nice thing about that is if you have the Python code and something does go wrong, you're not having to like kind of guess at what's going on inside the black box of the LLM. +00:40:13 And, you know, the other nice thing about that is if you have the Python code and something does go wrong, you're not having to like kind of guess at what's going on inside the black box of the LLM. 00:40:23 Well, I suppose you are at some level, but you have the code, which is kind of the intermediate step where you can go and verify. @@ -1216,9 +1216,9 @@ 00:41:02 So the things we miss right now, I'll start with, with the downside. -00:41:05 The things we miss right now are, classes, context managers. +00:41:05 The things we miss right now are, classes, context managers. -00:41:10 So, so with expressions, and match expressions, which are obviously relatively new. +00:41:10 So, so with expressions, and match expressions, which are obviously relatively new. 00:41:16 I think classes are by far the most complex of those. @@ -1254,7 +1254,7 @@ 00:42:19 And then, then the other big part of partial is we don't have the full standard library. -00:42:23 So we have a very, very limited standard library today of some bits of typing, some bits of the SIS, module, OS dot environment, as a PR up from someone to add re, regexes date, date time. +00:42:23 So we have a very, very limited standard library today of some bits of typing, some bits of the SIS, module, OS dot environment, as a PR up from someone to add re, regexes date, date time. 00:42:38 And I think we'll add Jason. @@ -1268,7 +1268,7 @@ 00:42:54 but we're never going to go and support the whole standard library. -00:42:58 It'll be on a case by case to LLMs actually need this thing, that we can go and go and add them. +00:42:58 It'll be on a case by case to LLMs actually need this thing, that we can go and go and add them. 
00:43:02 I will say, and I know we're going to talk about this at some point, but like, it is amazing what this project is only made possible by LLMs and not, not that we're ever aiming to full standard library, but adding support for certain, certain modules of the standard library is a heck of a lot easier when you can, again,

@@ -1338,7 +1338,7 @@

00:45:53 Right.

-00:45:54 but I guess you probably kind of, you kind of got to do a similar analysis, but not for evil where you say like, well, if I just ask Claude or, or, codex or whatever to do a thing, what is it?
+00:45:54 but I guess you probably kind of, you kind of got to do a similar analysis, but not for evil where you say like, well, if I just ask Claude or, or, Codex or whatever to do a thing, what is it?

00:46:06 What does it try to do?

@@ -1368,9 +1368,9 @@

00:46:57 so yes, exactly.

-00:47:00 And yeah, I mean, I think Boris, Boris, the, code code creator talked about this.
+00:47:00 And yeah, I mean, I think Boris, Boris, the, Claude Code creator talked about this.

-00:47:06 So I saw him speaking, he was saying like, you know, one of the reasons they gave the LLM bash early on was like, you can tell it to use the make dark, tool to make directories, but half the time it'll just go and call make dark --P and make the directory that way.
+00:47:06 So I saw him speaking, he was saying like, you know, one of the reasons they gave the LLM bash early on was like, you can tell it to use the mkdir tool to make directories, but half the time it'll just go and call mkdir -p and make the directory that way.

00:47:20 And like, are we going to fight it and always return an error being like, you should do this other thing, or are we just going to make that thing work?

@@ -1416,7 +1416,7 @@

00:48:34 So someone invents the hammer and I think it's going to be used for nails.
-00:48:36 And then someone else realizes that you can like change the, like knockout, like mistakes in your bumper of your car with a hammer. +00:48:36 And then someone else realizes that you can like change the, like knockout, like mistakes in your bumper of your car with a hammer. 00:48:43 Right. @@ -1446,7 +1446,7 @@ 00:49:39 It needs to be really very, very controlled. -00:49:42 this could be a really interesting, thing. +00:49:42 this could be a really interesting, thing. 00:49:44 So does it compile to WebAssembly? @@ -1454,7 +1454,7 @@ 00:49:48 Yep. -00:49:48 And in fact, Simon Willison, the day it came out or Simon Willison, Claude prompted by Simon Willison set one up. +00:49:48 And in fact, Simon Willison, the day it came out or Simon Willison, Claude prompted by Simon Willison set one up. 00:49:54 So I think if you go to Simon's blog somewhere, there's actually an example of Monty running somewhere, somewhere in a browser that you can, you can go and go and try it. @@ -1462,7 +1462,7 @@ 00:50:04 yeah, somewhere here, I think he'll have a link to, to his, his version of it. -00:50:09 so as he pointed out that you can do the really crazy thing, which is you can, you can compile the Python package for, yeah, so this is, this is his example, which is, I think like, WebAssembly running directly in the browser, but he did something even more crazy, which is he took the Python library, +00:50:09 so as he pointed out that you can do the really crazy thing, which is you can, you can compile the Python package for, yeah, so this is, this is his example, which is, I think like, WebAssembly running directly in the browser, but he did something even more crazy, which is he took the Python library, 00:50:24 compiled that to, to Wasm and then called that from inside Pyodide, which is like crazy worlds within worlds. @@ -1482,7 +1482,7 @@ 00:50:39 Yeah. 
-00:50:39 And I think the other, the other thing we really need to add to this table, in terms of, of latency and complexity is calling back to the host. +00:50:39 And I think the other, the other thing we really need to add to this table, in terms of, of latency and complexity is calling back to the host. 00:50:45 So one of the reasons a number of people have reached out to me and excited about this is sure that they're happy to have a sandboxing service. @@ -1670,7 +1670,7 @@ 00:56:04 I saw your announcement of this on X actually is where I saw it. -00:56:08 And I believe, it's been a little while since I saw it, but it said something to the effect of like, this is way too early, but what the heck, here we go. +00:56:08 And I believe, it's been a little while since I saw it, but it said something to the effect of like, this is way too early, but what the heck, here we go. 00:56:17 Posted the GitHub link, right? @@ -1712,7 +1712,7 @@ 00:57:26 And it has some, some overheads and some, some, challenges around, security. -00:57:33 but yeah, this is very similar in the sense of like, it's basically vibe coding, all of the terminal methods that you might want, and using a bunch of existing unit tests to, to check that they're correct. +00:57:33 but yeah, this is very similar in the sense of like, it's basically vibe coding, all of the terminal methods that you might want, and using a bunch of existing unit tests to, to check that they're correct. 00:57:43 interesting that obviously Vasell is a much, much bigger name than we are. @@ -1756,7 +1756,7 @@ 00:59:02 I think there were three reasons why these things have, why I'll talk about Monty in particular, why it is possible now when it wasn't before and why it is something where the like speed up from an LLM is even greater than in most, most coding tasks. -00:59:14 One, the LLM, knows in its soul, in its weights, the internal implementation, how to go about implementing a bytecode interpreter or how to implement it. 
+00:59:14 One, the LLM, knows in its soul, in its weights, the internal implementation, how to go about implementing a bytecode interpreter or how to implement it.

00:59:25 If I asked most even experienced Python engineers or Rust engineers, how do I write a bytecode interpreter?

@@ -1868,17 +1868,17 @@

01:02:53 The fuzzing is another amazing technique.

-01:02:55 So we use, so we have a Jason parser called jitter, which is about the fastest Jason parser in rust that we also is built into, Pydantic core, but it's also actually independently a package in, in PyPI that's used an awful lot.
+01:02:55 So we use, so we have a JSON parser called jiter, which is about the fastest JSON parser in Rust that we also is built into, Pydantic core, but it's also actually independently a package in, in PyPI that's used an awful lot.

01:03:09 You'll see it in the dependencies of open AI, for example.

01:03:12 but jitter was where we, I discovered about fuzzing really.

-01:03:15 No, I found out about it through, the hypothesis project project of, my friends, Zach Hatfield Dodds in Python, but then fuzzing in rust because the performance is so much better is, is, is really powerful.
+01:03:15 No, I found out about it through, the hypothesis project project of, my friends, Zac Hatfield-Dodds in Python, but then fuzzing in Rust because the performance is so much better is, is, is really powerful.

01:03:27 So basically it's generating random strings and using them as an input something, but then it's using very clever stochastic techniques to work out where to try more things.

-01:03:35 And so you can basically fuzz, Monty, you can just give it arbitrary strings for hour after hour.
+01:03:35 And so you can basically fuzz, Monty, you can just give it arbitrary strings for hour after hour.

01:03:41 And periodically it'll find something where there's an error where like the memory usage is too high.
diff --git a/transcripts/541-monty-python-in-rust-for-ai-transcript-final.vtt b/transcripts/541-monty-python-in-rust-for-ai-transcript-final.vtt index 32b0f6f..b2a1949 100644 --- a/transcripts/541-monty-python-in-rust-for-ai-transcript-final.vtt +++ b/transcripts/541-monty-python-in-rust-for-ai-transcript-final.vtt @@ -1534,7 +1534,7 @@ So we can't, as long as we have enough benchmarks, we can't have like silent reg And even more powerful. 00:35:03.320 --> 00:35:10.000 -If I go click on that, on that particular one, if you kick on the pair tuples, or go just, perhaps. +If I go click on that, on that particular one, if you kick on the pair tuples, or go just, perhaps. 00:35:10.240 --> 00:35:10.480 Yeah. @@ -1582,7 +1582,7 @@ Yeah. What's even cooler is under the hood. 00:35:52.720 --> 00:36:00.240 -They're using, Oh, I'm having a blank on the name, but they're, they're not even measuring, they're measuring like CPU and CPU instructions. +They're using, Oh, I'm having a blank on the name, but they're, they're not even measuring, they're measuring like CPU and CPU instructions. 00:36:00.400 --> 00:36:00.680 Okay. @@ -1609,7 +1609,7 @@ See what this pulls up. Well, cool. 00:36:18.000 --> 00:36:22.780 -I don't know what that's about, but there's a, a polygonal polygon. +I don't know what that's about, but there's a, a polygonal polygon. 00:36:23.680 --> 00:36:27.060 No, well, I don't know what this is a cartoon, but there's also the app. @@ -1660,7 +1660,7 @@ And that's not going to matter when you add a LLM requests are taking seconds. The thing where Monty really excels. 00:37:13.460 --> 00:37:19.480 -So if you scroll down a bit and I can talk you through the table, it's like near the bottom of the, of the read me. +So if you scroll down a bit and I can talk you through the table, it's like near the bottom of the, of the read me. 00:37:19.780 --> 00:37:22.120 but yeah, there we are. @@ -1678,16 +1678,16 @@ So that's six, microseconds. 
So, and actually in the hot, hot loop in benchmarks, we see one plus one, going from codes to result in Monty taking about 900 nanoseconds. 00:37:45.600 --> 00:37:51.220 -So under a microsecond, again, that's, that's microsecond, not millisecond or second. +So under a microsecond, again, that's, that's microsecond, not millisecond or second. 00:37:51.580 --> 00:38:01.660 -when you compare that to like running something in Docker, which is taking in, in my example here, 195 milliseconds, Pyodide, Pyodide is awesome project. +when you compare that to like running something in Docker, which is taking in, in my example here, 195 milliseconds, Pyodide, Pyodide is awesome project. 00:38:01.740 --> 00:38:12.880 Big fan of, of the team, allowing you to run Python in the browser, but wasn't designed for this use case, running, going from zero to like getting a result in Pyodide is, 2.8 seconds. 00:38:13.380 --> 00:38:18.740 -Starlark's a special case of another project, a bit like Monty, but a bit more limited. +Starlark's a special case of another project, a bit like Monty, but a bit more limited. 00:38:19.260 --> 00:38:25.060 but sandboxing, I was talking earlier about that being one of the main options, like go run a, basically spin up a new container somewhere. @@ -1768,7 +1768,7 @@ And then I'll just apply this data set to it. Right. 00:40:03.060 --> 00:40:08.480 -Like you'll see it doing, you know, CSV types of things with Python and all sorts of stuff. +Like you'll see it doing, you know, CSV types of things with Python and all sorts of stuff. 00:40:08.480 --> 00:40:12.580 And so that's a really good place where that Monty could be the foundation of it. @@ -1780,7 +1780,7 @@ Right. Yeah, exactly. 00:40:13.860 --> 00:40:23.280 -And, you know, the other nice thing about that is if you have the Python code and something does go wrong, you're not having to like kind of guess at what's going on inside the black box of the LLM. 
+And, you know, the other nice thing about that is if you have the Python code and something does go wrong, you're not having to like kind of guess at what's going on inside the black box of the LLM. 00:40:23.560 --> 00:40:28.280 Well, I suppose you are at some level, but you have the code, which is kind of the intermediate step where you can go and verify. @@ -1828,10 +1828,10 @@ Yeah. So the things we miss right now, I'll start with, with the downside. 00:41:05.560 --> 00:41:10.720 -The things we miss right now are, classes, context managers. +The things we miss right now are, classes, context managers. 00:41:10.940 --> 00:41:15.860 -So, so with expressions, and match expressions, which are obviously relatively new. +So, so with expressions, and match expressions, which are obviously relatively new. 00:41:16.100 --> 00:41:18.600 I think classes are by far the most complex of those. @@ -1885,7 +1885,7 @@ What we will never. And then, then the other big part of partial is we don't have the full standard library. 00:42:23.980 --> 00:42:38.600 -So we have a very, very limited standard library today of some bits of typing, some bits of the SIS, module, OS dot environment, as a PR up from someone to add re, regexes date, date time. +So we have a very, very limited standard library today of some bits of typing, some bits of the SIS, module, OS dot environment, as a PR up from someone to add re, regexes date, date time. 00:42:38.980 --> 00:42:39.840 And I think we'll add Jason. @@ -1906,7 +1906,7 @@ I mean, we're a bit of overhead to creating the Monty object, but, but very, ver but we're never going to go and support the whole standard library. 00:42:58.300 --> 00:43:02.560 -It'll be on a case by case to LLMs actually need this thing, that we can go and go and add them. +It'll be on a case by case to LLMs actually need this thing, that we can go and go and add them. 
00:43:02.640 --> 00:43:17.620
I will say, and I know we're going to talk about this at some point, but like, it is amazing what this project is only made possible by LLMs and not, not that we're ever aiming to full standard library, but adding support for certain, certain modules of the standard library is a heck of a lot easier when you can, again,

@@ -2011,7 +2011,7 @@ So people would go and find popular ones of those and then register malicious pa

Right.

00:45:54.020 --> 00:46:06.640
-but I guess you probably kind of, you kind of got to do a similar analysis, but not for evil where you say like, well, if I just ask Claude or, or, codex or whatever to do a thing, what is it?
+but I guess you probably kind of, you kind of got to do a similar analysis, but not for evil where you say like, well, if I just ask Claude or, or, Codex or whatever to do a thing, what is it?

00:46:06.800 --> 00:46:07.860
What does it try to do?

@@ -2056,10 +2056,10 @@ And we're going to be like, our principle is give the LLM what it wants, not her

so yes, exactly.

00:47:00.860 --> 00:47:06.240
-And yeah, I mean, I think Boris, Boris, the, code code creator talked about this.
+And yeah, I mean, I think Boris, Boris, the, Claude Code creator talked about this.

00:47:06.240 --> 00:47:20.260
-So I saw him speaking, he was saying like, you know, one of the reasons they gave the LLM bash early on was like, you can tell it to use the make dark, tool to make directories, but half the time it'll just go and call make dark --P and make the directory that way.
+So I saw him speaking, he was saying like, you know, one of the reasons they gave the LLM bash early on was like, you can tell it to use the mkdir tool to make directories, but half the time it'll just go and call mkdir -p and make the directory that way.

00:47:20.260 --> 00:47:25.260
And like, are we going to fight it and always return an error being like, you should do this other thing, or are we just going to make that thing work?
@@ -2128,7 +2128,7 @@ but of course, you know, the best tools are the ones where you, people use the t So someone invents the hammer and I think it's going to be used for nails. 00:48:36.640 --> 00:48:43.000 -And then someone else realizes that you can like change the, like knockout, like mistakes in your bumper of your car with a hammer. +And then someone else realizes that you can like change the, like knockout, like mistakes in your bumper of your car with a hammer. 00:48:43.000 --> 00:48:43.320 Right. @@ -2173,7 +2173,7 @@ Do you know what I mean? It needs to be really very, very controlled. 00:49:42.240 --> 00:49:44.480 -this could be a really interesting, thing. +this could be a really interesting, thing. 00:49:44.700 --> 00:49:46.440 So does it compile to WebAssembly? @@ -2185,7 +2185,7 @@ Can I in browser it? Yep. 00:49:48.520 --> 00:49:54.840 -And in fact, Simon Willison, the day it came out or Simon Willison, Claude prompted by Simon Willison set one up. +And in fact, Simon Willison, the day it came out or Simon Willison, Claude prompted by Simon Willison set one up. 00:49:54.900 --> 00:50:03.020 So I think if you go to Simon's blog somewhere, there's actually an example of Monty running somewhere, somewhere in a browser that you can, you can go and go and try it. @@ -2197,7 +2197,7 @@ Probably an earlier version. yeah, somewhere here, I think he'll have a link to, to his, his version of it. 
00:50:09.460 --> 00:50:23.920 -so as he pointed out that you can do the really crazy thing, which is you can, you can compile the Python package for, yeah, so this is, this is his example, which is, I think like, WebAssembly running directly in the browser, but he did something even more crazy, which is he took the Python library, +so as he pointed out that you can do the really crazy thing, which is you can, you can compile the Python package for, yeah, so this is, this is his example, which is, I think like, WebAssembly running directly in the browser, but he did something even more crazy, which is he took the Python library, 00:50:24.060 --> 00:50:30.900 compiled that to, to Wasm and then called that from inside Pyodide, which is like crazy worlds within worlds. @@ -2227,7 +2227,7 @@ Yeah. Yeah. 00:50:39.560 --> 00:50:45.820 -And I think the other, the other thing we really need to add to this table, in terms of, of latency and complexity is calling back to the host. +And I think the other, the other thing we really need to add to this table, in terms of, of latency and complexity is calling back to the host. 00:50:45.920 --> 00:50:51.700 So one of the reasons a number of people have reached out to me and excited about this is sure that they're happy to have a sandboxing service. @@ -2509,7 +2509,7 @@ You know what I saw? I saw your announcement of this on X actually is where I saw it. 00:56:08.900 --> 00:56:17.500 -And I believe, it's been a little while since I saw it, but it said something to the effect of like, this is way too early, but what the heck, here we go. +And I believe, it's been a little while since I saw it, but it said something to the effect of like, this is way too early, but what the heck, here we go. 00:56:17.820 --> 00:56:19.120 Posted the GitHub link, right? @@ -2572,7 +2572,7 @@ Cause they have some way of calling Python code within this, which I think uses And it has some, some overheads and some, some, challenges around, security. 
00:57:33.000 --> 00:57:43.500
-but yeah, this is very similar in the sense of like, it's basically vibe coding, all of the terminal methods that you might want, and using a bunch of existing unit tests to, to check that they're correct.
+but yeah, this is very similar in the sense of like, it's basically vibe coding, all of the terminal methods that you might want, and using a bunch of existing unit tests to, to check that they're correct.

00:57:43.880 --> 00:57:47.540
interesting that obviously Vasell is a much, much bigger name than we are.

@@ -2638,7 +2638,7 @@ And so, so I was mentioning this earlier.

I think there were three reasons why these things have, why I'll talk about Monty in particular, why it is possible now when it wasn't before and why it is something where the like speed up from an LLM is even greater than in most, most coding tasks.

00:59:14.780 --> 00:59:25.480
-One, the LLM, knows in its soul, in its weights, the internal implementation, how to go about implementing a bytecode interpreter or how to implement it.
+One, the LLM, knows in its soul, in its weights, the internal implementation, how to go about implementing a bytecode interpreter or how to implement it.

00:59:25.600 --> 00:59:30.720
If I asked most even experienced Python engineers or Rust engineers, how do I write a bytecode interpreter?

@@ -2806,7 +2806,7 @@ I mean, well, or we have fuzzing going on.

The fuzzing is another amazing technique.

01:02:55.180 --> 01:03:09.480
-So we use, so we have a Jason parser called jitter, which is about the fastest Jason parser in rust that we also is built into, Pydantic core, but it's also actually independently a package in, in PyPI that's used an awful lot.
+So we use, so we have a JSON parser called jiter, which is about the fastest JSON parser in Rust that we also is built into, Pydantic core, but it's also actually independently a package in, in PyPI that's used an awful lot.
01:03:09.540 --> 01:03:12.360
You'll see it in the dependencies of open AI, for example.

01:03:12.460 --> 01:03:15.820
but jitter was where we, I discovered about fuzzing really.

01:03:15.820 --> 01:03:27.000
-No, I found out about it through, the hypothesis project project of, my friends, Zach Hatfield Dodds in Python, but then fuzzing in rust because the performance is so much better is, is, is really powerful.
+No, I found out about it through, the hypothesis project project of, my friends, Zac Hatfield-Dodds in Python, but then fuzzing in Rust because the performance is so much better is, is, is really powerful.

01:03:27.000 --> 01:03:35.200
So basically it's generating random strings and using them as an input something, but then it's using very clever stochastic techniques to work out where to try more things.

01:03:35.200 --> 01:03:41.320
-And so you can basically fuzz, Monty, you can just give it arbitrary strings for hour after hour.
+And so you can basically fuzz, Monty, you can just give it arbitrary strings for hour after hour.

01:03:41.600 --> 01:03:46.020
And periodically it'll find something where there's an error where like the memory usage is too high.

diff --git a/youtube_transcripts/313-pydantic.vtt b/youtube_transcripts/313-pydantic.vtt
index 343de28..acd78ad 100644
--- a/youtube_transcripts/313-pydantic.vtt
+++ b/youtube_transcripts/313-pydantic.vtt
@@ -977,7 +977,7 @@ But the big difference between Pydantic and Marshmallow

00:10:39.600 --> 00:10:42.960

-and most of the other competitors is Pydantic uses type ints.
+and most of the other competitors is Pydantic uses type hints.

00:10:42.960 --> 00:10:46.120

diff --git a/youtube_transcripts/335-gene-editing.vtt b/youtube_transcripts/335-gene-editing.vtt
index efde45e..eb5efb7 100644
--- a/youtube_transcripts/335-gene-editing.vtt
+++ b/youtube_transcripts/335-gene-editing.vtt
@@ -4685,7 +4685,7 @@ Like, how do you run that thing?
00:59:31.940 --> 00:59:33.780

-Is it with a G unicorn?
+Is it with a Gunicorn?

00:59:33.780 --> 00:59:35.260

diff --git a/youtube_transcripts/345-10-tips-and-tools.txt b/youtube_transcripts/345-10-tips-and-tools.txt
index 590ec76..90c2698 100644
--- a/youtube_transcripts/345-10-tips-and-tools.txt
+++ b/youtube_transcripts/345-10-tips-and-tools.txt
@@ -1572,7 +1572,7 @@

 01:11:24 I'm pretty sure the folks listening know like, black is a great formatter for your code.

-01:11:31 If you're like me, and paid is a thing that you want to you want to understand and believe But from time to time you're like, but what about this weird scenario? I don't think about that anymore I just run black and then I'm done with it Value that black ads is not necessarily that it does the formatting like I could open up I charm go to the top level of Project right-click and save format everything in this directory and it would format it the way I told by charm I like it, but it solves the debate. It's like you don't have to debate Is there a space before the comma or after the comma does it go on one line?
+01:11:31 If you're like me, and PEP 8 is a thing that you want to you want to understand and believe But from time to time you're like, but what about this weird scenario? I don't think about that anymore I just run black and then I'm done with it Value that black adds is not necessarily that it does the formatting like I could open up PyCharm go to the top level of Project right-click and say format everything in this directory and it would format it the way I told PyCharm I like it, but it solves the debate. It's like you don't have to debate Is there a space before the comma or after the comma does it go on one line?

 01:12:07 Or does it go on three lines like black puts it on the lines?
diff --git a/youtube_transcripts/367-say-hello-to-pyscript-webassembly-python.vtt b/youtube_transcripts/367-say-hello-to-pyscript-webassembly-python.vtt
index 9aec69d..8c76d68 100644
--- a/youtube_transcripts/367-say-hello-to-pyscript-webassembly-python.vtt
+++ b/youtube_transcripts/367-say-hello-to-pyscript-webassembly-python.vtt
@@ -5613,7 +5613,7 @@ like, could I say my path to my module is

01:10:25.100 --> 01:10:29.100

-dot dot slash app dot pi and then maybe read like some-
+dot dot slash app.py and then maybe read like some-

01:10:29.100 --> 01:10:30.100

diff --git a/youtube_transcripts/370-openbb.vtt b/youtube_transcripts/370-openbb.vtt
index de6a72c..9df1f4f 100644
--- a/youtube_transcripts/370-openbb.vtt
+++ b/youtube_transcripts/370-openbb.vtt
@@ -1745,7 +1745,7 @@ I was blown away.

00:19:02.700 --> 00:19:06.700

-Chris Moffett did a course for TalkBython
+Chris Moffitt did a course for Talk Python

00:19:06.700 --> 00:19:10.140

diff --git a/youtube_transcripts/376-pydantic-2-the-plan.vtt b/youtube_transcripts/376-pydantic-2-the-plan.vtt
index 292853b..2b04add 100644
--- a/youtube_transcripts/376-pydantic-2-the-plan.vtt
+++ b/youtube_transcripts/376-pydantic-2-the-plan.vtt
@@ -1113,7 +1113,7 @@ which maps perfectly to document databases.

00:12:21.980 --> 00:12:24.820

-So yeah, this is actually what TalkBython
+So yeah, this is actually what Talk Python

00:12:24.820 --> 00:12:26.980

diff --git a/youtube_transcripts/381-python-perf-specializing-adaptive-interpreter.vtt b/youtube_transcripts/381-python-perf-specializing-adaptive-interpreter.vtt
index 756fc7c..f459ab3 100644
--- a/youtube_transcripts/381-python-perf-specializing-adaptive-interpreter.vtt
+++ b/youtube_transcripts/381-python-perf-specializing-adaptive-interpreter.vtt
@@ -2973,7 +2973,7 @@ Like, it should probably always be one.

00:35:45.000 --> 00:35:49.320

-You just don't express that in code unless you're using type ints, right?
+You just don't express that in code unless you're using type hints, right?

00:35:49.320 --> 00:35:50.960

diff --git a/youtube_transcripts/395-readme-tools.vtt b/youtube_transcripts/395-readme-tools.vtt
index 6fbf0bf..e889a1f 100644
--- a/youtube_transcripts/395-readme-tools.vtt
+++ b/youtube_transcripts/395-readme-tools.vtt
@@ -6189,7 +6189,7 @@ So sort of the input file and the output file

01:04:34.420 --> 01:04:37.900

-are the same file, so that you get sort of an item potent
+are the same file, so that you get sort of an idempotent

01:04:37.900 --> 01:04:39.980

diff --git a/youtube_transcripts/402-polars.vtt b/youtube_transcripts/402-polars.vtt
index 30a23b4..b733692 100644
--- a/youtube_transcripts/402-polars.vtt
+++ b/youtube_transcripts/402-polars.vtt
@@ -317,7 +317,7 @@ statically. We just cannot know probably. I don't know how far it can go.

00:06:05.240 --> 00:06:13.600

-But I, yeah, in Polaris as well, we use mypy type ints, which prevent us from having a
+But I, yeah, in Polars as well, we use mypy type hints, which prevent us from having a

00:06:13.600 --> 00:06:18.720

lot of bugs and also make the IDE experience much nicer.

00:06:18.720 --> 00:06:20.960

-Yeah, type ints are great.
+Yeah, type hints are great.

00:06:20.960 --> 00:06:25.480

@@ -389,7 +389,7 @@ I can just hack away and try interactively what happens.

00:07:37.040 --> 00:07:40.120

-And for such code, type ints don't matter.
+And for such code, type hints don't matter.

00:07:40.120 --> 00:07:45.120

But once I write more of a library or product or tool,

00:07:45.120 --> 00:07:49.320

-then type ints are really great.
+then type hints are really great.

00:07:49.320 --> 00:07:53.120

@@ -4933,7 +4933,7 @@ Ajit says, "Excellent content, guys.

01:06:53.140 --> 01:06:55.620

-from pandas to pollers.
+from pandas to Polars.

01:06:55.620 --> 01:06:56.900

@@ -4957,7 +4957,7 @@ People are interested in this project.
01:07:05.820 --> 01:07:08.460

-They want to start playing and learning pollers.
+They want to start playing and learning Polars.

01:07:08.460 --> 01:07:10.100

diff --git a/youtube_transcripts/424-solving-10-different-simulation-problems.vtt b/youtube_transcripts/424-solving-10-different-simulation-problems.vtt
index 710fe4a..c9c90db 100644
--- a/youtube_transcripts/424-solving-10-different-simulation-problems.vtt
+++ b/youtube_transcripts/424-solving-10-different-simulation-problems.vtt
@@ -525,7 +525,7 @@ whereas you put it out into the world

00:05:49.440 --> 00:05:51.740

-at Brilliant or TalkByThunder or wherever,
+at Brilliant or Talk Python or wherever,

00:05:51.740 --> 00:05:54.560

diff --git a/youtube_transcripts/425-shiny-for-python.vtt b/youtube_transcripts/425-shiny-for-python.vtt
index 3b4d8b0..45d2f3c 100644
--- a/youtube_transcripts/425-shiny-for-python.vtt
+++ b/youtube_transcripts/425-shiny-for-python.vtt
@@ -1785,11 +1785,11 @@ you can host Shiny for Python.

00:23:11.160 --> 00:23:15.040

-Just put some G unicorn, uv corn workers in front of it.
+Just put some Gunicorn, uvicorn workers in front of it.

00:23:15.040 --> 00:23:16.640

-- Except G unicorn.
+- Except Gunicorn.
00:23:16.640 --> 00:23:18.760

diff --git a/youtube_transcripts/426-pyscript-update.vtt b/youtube_transcripts/426-pyscript-update.vtt
index 81d1014..1ca7e76 100644
--- a/youtube_transcripts/426-pyscript-update.vtt
+++ b/youtube_transcripts/426-pyscript-update.vtt
@@ -4649,7 +4649,7 @@ instant

01:03:35.600 --> 01:03:41.120

-Yeah, it is instant it's amazing yeah, I'm once it's cashed it's
+Yeah, it is instant it's amazing yeah, I'm once it's cached it's

01:03:41.120 --> 01:03:43.640

diff --git a/youtube_transcripts/447-parallel-python-apps-with-sub-interpreters.vtt b/youtube_transcripts/447-parallel-python-apps-with-sub-interpreters.vtt
index b626b21..b309220 100644
--- a/youtube_transcripts/447-parallel-python-apps-with-sub-interpreters.vtt
+++ b/youtube_transcripts/447-parallel-python-apps-with-sub-interpreters.vtt
@@ -2032,7 +2032,7 @@ one gil per web server, you can only have one user per website, which is not gre

So the way that most web servers implement this is that they have a pool of workers.

00:44:38.700 --> 00:44:45.340

-G unicorn does that by spawning Python processes and then using the multi processing module.
+Gunicorn does that by spawning Python processes and then using the multiprocessing module.

00:44:45.340 --> 00:44:51.460

So it basically creates multiple Python processes, all listening to the same socket.

It also then inside that has a thread pool.

So even basically a thread pool is, is better for concurrent code.

00:45:06.260 --> 00:45:11.020

-So G unicorn normally is used in a multi worker, multi thread model.
+So Gunicorn normally is used in a multi worker, multi thread model.

00:45:11.020 --> 00:45:13.100

That's how we kind of talk about it.

@@ -2125,7 +2125,7 @@ to web requests and basically running and serving web requests with multiple gil

So that was the, that was the task.
00:47:05.180 --> 00:47:09.180 ->> In your article, you said you had started with a G unicorn and they just made too many +>> In your article, you said you had started with a Gunicorn and they just made too many 00:47:09.180 --> 00:47:15.340 assumptions about the multi processing, the web workers being truly sub processes, but diff --git a/youtube_transcripts/454-dagster.vtt b/youtube_transcripts/454-dagster.vtt index 57160fe..ecab876 100644 --- a/youtube_transcripts/454-dagster.vtt +++ b/youtube_transcripts/454-dagster.vtt @@ -2734,7 +2734,7 @@ so you don't have to rerun that. - Yeah, I mean, it depends on how you built the pipeline. 00:39:11.000 --> 00:39:12.880 -We like to build item potent pipelines +We like to build idempotent pipelines 00:39:12.880 --> 00:39:14.320 is how we sort of talk about it, @@ -2773,7 +2773,7 @@ we can just only run orders and not have to worry about rewriting the whole thing from scratch. 00:39:40.440 --> 00:39:44.520 -- Excellent, so item potent for people who maybe don't know, +- Excellent, so idempotent for people who maybe don't know, 00:39:44.520 --> 00:39:46.960 if you run it once or you perform the operation once diff --git a/youtube_transcripts/457-security-phylum.vtt b/youtube_transcripts/457-security-phylum.vtt index 82d694e..4a6f714 100644 --- a/youtube_transcripts/457-security-phylum.vtt +++ b/youtube_transcripts/457-security-phylum.vtt @@ -2488,7 +2488,7 @@ So number one, these super strict lock files are awesome when you're building an application. 00:38:01.080 --> 00:38:03.400 -I want to ship TalkBython training out. +I want to ship Talk Python training out. 00:38:03.400 --> 00:38:04.840 It's got its strict APIs. 
diff --git a/youtube_transcripts/462-pandas-and-beyond-with-wes.vtt b/youtube_transcripts/462-pandas-and-beyond-with-wes.vtt index 153dd8f..b8bf2fa 100644 --- a/youtube_transcripts/462-pandas-and-beyond-with-wes.vtt +++ b/youtube_transcripts/462-pandas-and-beyond-with-wes.vtt @@ -793,7 +793,7 @@ execute on different backends. And so around the time that that I was helping st I created this project called Ibis which is basically a portable DataFrame API that knows 00:56:03.760 --> 00:56:12.080 -how to generate SQL queries and compile to pandas and pollers and different DataFrame DataFrame +how to generate SQL queries and compile to pandas and Polars and different DataFrame DataFrame 00:56:12.080 --> 00:56:19.360 backends. And the goal is to provide a really productive DataFrame API that gives you portability @@ -808,7 +808,7 @@ data stack. So you aren't stuck with using one particular system because all of you've written is specialized to that system. You have this tool which so maybe you could work with 00:56:36.480 --> 00:56:42.880 -you know DuckDB on your laptop or pandas or pollers with Ibis on your laptop. But if you have +you know DuckDB on your laptop or pandas or Polars with Ibis on your laptop. 
But if you have 00:56:42.880 --> 00:56:47.280 if you need to run that workload someplace else maybe with you know Clickhouse or BigQuery @@ -1045,7 +1045,7 @@ query transpilation and generating generating sequel generating sequel outputs s ibis had its own kind of bad version of sequel glot kind of a query transpi like sequel 01:04:50.880 --> 01:04:58.720 -transpilation that was uh powered by i think powered by uh sequel alchemy and and some +transpilation that was uh powered by i think powered by uh SQLAlchemy and and some 01:04:58.720 --> 01:05:03.200 and a bunch of custom code and so i think they've been able to delete a lot in ibis by moving to diff --git a/youtube_transcripts/480-narwhals.vtt b/youtube_transcripts/480-narwhals.vtt index a99d369..46311cf 100644 --- a/youtube_transcripts/480-narwhals.vtt +++ b/youtube_transcripts/480-narwhals.vtt @@ -1744,7 +1744,7 @@ That's what I was thinking, yeah. Okay. 00:41:54.340 --> 00:42:01.380 -Yeah, but if it has enough of the functions of pandas or pollers, you're like, all right, this is probably good. +Yeah, but if it has enough of the functions of pandas or Polars, you're like, all right, this is probably good. 00:42:01.380 --> 00:42:01.960 All right? @@ -1777,10 +1777,10 @@ Let's see. I do like the mkdocs where you can have these different examples. 00:42:23.140 --> 00:42:32.140 -One thing I noticed is you've got the pollers eager evaluation and you've got the pollers lazy evaluation. +One thing I noticed is you've got the Polars eager evaluation and you've got the Polars lazy evaluation. 00:42:32.140 --> 00:42:40.820 -And when you have the pollers lazy, this function decorated with the decorator, the narwhalify decorator, +And when you have the Polars lazy, this function decorated with the decorator, the narwhalify decorator, 00:42:40.820 --> 00:42:47.120 it itself returns something that is lazy and you've got to call collect on, right? @@ -1822,7 +1822,7 @@ Yeah. Yeah. 
00:43:11.040 --> 00:43:16.640 -So the way you do that in pollers, you create a lazy frame versus a data frame, right? +So the way you do that in Polars, you create a lazy frame versus a data frame, right? 00:43:16.640 --> 00:43:22.760 But then you've got to call collect on it, kind of like awaiting it and fit more async, which is cool. @@ -1852,7 +1852,7 @@ Exactly. Exactly. 00:43:33.160 --> 00:43:43.820 -So one of the things that you talk about here is the pandas index, which is one of the key differences between pollers and pandas. +So one of the things that you talk about here is the pandas index, which is one of the key differences between Polars and pandas. 00:43:44.900 --> 00:43:48.860 And you've classified pandas people into two categories. diff --git a/youtube_transcripts/488-lancedb.vtt b/youtube_transcripts/488-lancedb.vtt index 0020679..6cf24ea 100644 --- a/youtube_transcripts/488-lancedb.vtt +++ b/youtube_transcripts/488-lancedb.vtt @@ -2119,7 +2119,7 @@ So, well, when you, you can say table.search and you can, pass in the vector-- t And then a dot two underscore, you know, blah, blah, blah, to, determine what format you want the results back. 00:51:29.320 --> 00:51:38.320 -So you have two pandas here in the, in the example, but you can convert it to pollers, or you can just get it back as a-- as a list. +So you have two pandas here in the, in the example, but you can convert it to Polars, or you can just get it back as a-- as a list. 00:51:38.320 --> 00:51:39.320 Awesome. diff --git a/youtube_transcripts/491-duckdb-and-python-ducks-and-snakes-living-together.vtt b/youtube_transcripts/491-duckdb-and-python-ducks-and-snakes-living-together.vtt index d40c3c1..b411692 100644 --- a/youtube_transcripts/491-duckdb-and-python-ducks-and-snakes-living-together.vtt +++ b/youtube_transcripts/491-duckdb-and-python-ducks-and-snakes-living-together.vtt @@ -523,7 +523,7 @@ Doing that in memory in Python is not that great. 
But it sounds a little bit mor like vector programming in the sense of kind of like 00:08:04.480 --> 00:08:10.640 -you would do with pandas or pollers, right? You wouldn't loop over, +you would do with pandas or Polars, right? You wouldn't loop over, 00:08:10.640 --> 00:08:16.000 you shouldn't loop over a pandas data frame processing each item. You should issue @@ -538,7 +538,7 @@ rather than for every C in call, you know, whatever, like looping over. Yes. So it is, it is exactly what you said. It's designed for that. It's vectorized 00:08:31.520 --> 00:08:36.000 -execution. So it's a little different than pandas. It's a, you know, pollers is a bit more +execution. So it's a little different than pandas. It's a, you know, Polars is a bit more 00:08:36.000 --> 00:08:36.320 similar. @@ -682,7 +682,7 @@ a pretty hard fork in the road pretty early on in your architecture, which is yo to be row-based or do you want to be column-based? And row-based is SQLite and column-based would be 00:12:13.360 --> 00:12:19.120 -your pandas, your pollers, and most of your cloud data warehouses, your snowflakes, that type of thing. +your pandas, your Polars, and most of your cloud data warehouses, your snowflakes, that type of thing. 00:12:19.120 --> 00:12:24.960 And once you go column-based, suddenly your compression is amazing because you store data diff --git a/youtube_transcripts/493-quarto.vtt b/youtube_transcripts/493-quarto.vtt index 288ea80..d7ef108 100644 --- a/youtube_transcripts/493-quarto.vtt +++ b/youtube_transcripts/493-quarto.vtt @@ -3073,7 +3073,7 @@ So that's the idea. So Posit Connect is a commercial product that we sell. 00:57:05.800 --> 00:57:15.040 -And so I think from a business model standpoint, if people are successful with Cordo, as obviously we make it very easy to publish it to everywhere. +And so I think from a business model standpoint, if people are successful with Quarto, as obviously we make it very easy to publish it to everywhere. 
00:57:15.040 --> 00:57:19.780 We're not trying to privilege, you know, or say, oh, oh, you know, it's a roach motel. diff --git a/youtube_transcripts/495-osmnx-python-and-openstreetmap.vtt b/youtube_transcripts/495-osmnx-python-and-openstreetmap.vtt index 5ae7134..24503b7 100644 --- a/youtube_transcripts/495-osmnx-python-and-openstreetmap.vtt +++ b/youtube_transcripts/495-osmnx-python-and-openstreetmap.vtt @@ -8812,7 +8812,7 @@ abstracts away from pandas 00:52:14.800 --> 00:52:15.780 -versus pollers, +versus Polars, 00:52:15.780 --> 00:52:16.700 so you're @@ -8857,10 +8857,10 @@ to do their work 00:52:24.280 --> 00:52:24.900 -in pollers, +in Polars, 00:52:24.900 --> 00:52:25.840 -talk pollers +talk Polars 00:52:25.840 --> 00:52:26.180 to your @@ -9031,7 +9031,7 @@ the API, but they're 00:52:59.400 --> 00:53:00.380 -pollers, +Polars, 00:53:00.380 --> 00:53:01.100 not pandas, diff --git a/youtube_transcripts/510--youtube10-polars-tools-and-techniques-to-level-up.vtt b/youtube_transcripts/510--youtube10-polars-tools-and-techniques-to-level-up.vtt index 7fbaa90..ecac268 100644 --- a/youtube_transcripts/510--youtube10-polars-tools-and-techniques-to-level-up.vtt +++ b/youtube_transcripts/510--youtube10-polars-tools-and-techniques-to-level-up.vtt @@ -301,13 +301,13 @@ hairs? That's C++. Exactly. 00:07:28.700 --> 00:07:48.920 -Yeah, indeed. Okay. Well, let's talk data science and let's talk pollers. So I guess there's a lot of people who have heard of pollers and are experts. I see out in the live audience already. There are some folks whose things they've created for pollers we're going to talk about in this show, actually. So +Yeah, indeed. Okay. Well, let's talk data science and let's talk Polars. So I guess there's a lot of people who have heard of Polars and are experts. I see out in the live audience already. There are some folks whose things they've created for Polars we're going to talk about in this show, actually. 
So 00:07:48.920 --> 00:07:50.100 some 00:07:50.100 --> 00:08:04.480 -people need no introduction to pollers, but there are many people listening to the show who are just getting into Python, using this as like one of the footholds to kind of get a feel for the space. And they might go like, what is pollers? Sure. You know, +people need no introduction to Polars, but there are many people listening to the show who are just getting into Python, using this as like one of the footholds to kind of get a feel for the space. And they might go like, what is Polars? Sure. You know, 00:08:05.039 --> 00:08:05.400 what's @@ -406,7 +406,7 @@ It's very, very powerful, very, very useful. It uses a lot of black magic. 00:10:17.800 --> 00:11:26.580 -And I find as a developer, as somebody with a software engineering background, sometimes code can be distracting at it and go, "How does that even compile? What is that? What are they doing?" And that pulls me away from the, "I'm trying to do something." And I find some of the stuff in Pandas kind of falls into that bucket. Whereas you just do nice little chained function calls in pollers. And so you're reading filter, select kinds of things. And it feels a lot cleaner to me. So it tends to be where I want to go. And then the other big advantage of it, which some of the other data frame libraries have, but Pandas, as far as I know, doesn't, is lazy evaluation. And this is the idea of rather than executing each operation as you call it, you can chain a bunch of things together and then call them all at the same time. This tends to give you large amounts of speed up. So for example, if you're reading in a CSV file and you want to do something only on certain rows in that CSV file, the filter can throw those rows that aren't important, and then you would only run the operation on those rows that you were interested in. 
+And I find as a developer, as somebody with a software engineering background, sometimes code can be distracting at it and go, "How does that even compile? What is that? What are they doing?" And that pulls me away from the, "I'm trying to do something." And I find some of the stuff in Pandas kind of falls into that bucket. Whereas you just do nice little chained function calls in Polars. And so you're reading filter, select kinds of things. And it feels a lot cleaner to me. So it tends to be where I want to go. And then the other big advantage of it, which some of the other data frame libraries have, but Pandas, as far as I know, doesn't, is lazy evaluation. And this is the idea of rather than executing each operation as you call it, you can chain a bunch of things together and then call them all at the same time. This tends to give you large amounts of speed up. So for example, if you're reading in a CSV file and you want to do something only on certain rows in that CSV file, the filter can throw those rows that aren't important, and then you would only run the operation on those rows that you were interested in. 00:11:26.700 --> 00:11:31.660 Of course, that tends to be an awful lot faster than going through every single row at a time. @@ -508,7 +508,7 @@ actually using this index or is it not? 00:14:26.610 --> 00:14:28.560 -But then there's something like that in pollers as well, yeah? +But then there's something like that in Polars as well, yeah? 00:14:28.740 --> 00:14:30.880 Yeah, and it calls it explain as well. @@ -535,10 +535,10 @@ Well, we are going to not talk about the exact subject of your course that you w So your course that you wrote at Talk Python is Polars for the Power Users, Transform Your Data Analysis Game. 00:15:14.460 --> 00:15:19.840 -What we're going to talk about is a bunch of different cool tools that extend and power up pollers. +What we're going to talk about is a bunch of different cool tools that extend and power up Polars. 
00:15:20.340 --> 00:15:24.080 -But if people are interested in learning pollers more from scratch, they could take your course. +But if people are interested in learning Polars more from scratch, they could take your course. 00:15:24.220 --> 00:15:25.780 So I'll be sure to link to that in the show notes. @@ -550,7 +550,7 @@ Maybe give people the super quick rundown on what's on the course, and then we'l Sure. 00:15:33.420 --> 00:15:36.600 -As you might guess from the title, it kind of teaches your pollers. +As you might guess from the title, it kind of teaches your Polars. 00:15:37.380 --> 00:15:40.760 And you can see on the screen there, there's a bunch of different icons as well. @@ -562,7 +562,7 @@ What we're trying to do is essentially take you to that next step. So if you're doing data science-y stuff in Excel and you're trying to figure out how to move to the Python world with this, then this is kind of a course to try and help you do that. 00:15:58.770 --> 00:16:00.420 -And it does it through the pollers library. +And it does it through the Polars library. 00:16:01.180 --> 00:16:03.560 So there's a whole bunch of different examples as you go along. @@ -625,13 +625,13 @@ I enjoyed it. Okay. 00:17:29.040 --> 00:17:31.080 -pollers is pretty awesome wouldn't you say yes +Polars is pretty awesome wouldn't you say yes 00:17:31.080 --> 00:17:36.280 that's a nice you know there you go how's that for a segue pretty 00:17:36.280 --> 00:18:11.940 -awesome in fact let's reverse that we're gonna talk about awesome pollers so there are so many of these awesome lists out there and you know if you are interested in something and you haven't looked yet for an awesome list for it go do that there's got to be an awesome django i know there's there's got to be an awesome flask and there's an awesome async there's an awesome Python and Ddata has gone and created awesome pollers. 
So what we're going to do is we're going to go to this curated list and pull out 10-ish number of things that we think. +awesome in fact let's reverse that we're gonna talk about awesome Polars so there are so many of these awesome lists out there and you know if you are interested in something and you haven't looked yet for an awesome list for it go do that there's got to be an awesome django i know there's there's got to be an awesome flask and there's an awesome async there's an awesome Python and Ddata has gone and created awesome Polars. So what we're going to do is we're going to go to this curated list and pull out 10-ish number of things that we think. 00:18:12.020 --> 00:18:12.200 On the @@ -1423,7 +1423,7 @@ something, you know, one of the things with concurrency, I guess it's actually w And the only thing I really remember is really, really don't do it unless you absolutely have to. 00:41:46.960 --> 00:42:01.220 -And, you know, I routinely ingest a million rows with pollers in a sub-second, so I've yet to need concurrency because the size of the data stuff that I deal with isn't in that petabyte range where it actually would start to matter. +And, you know, I routinely ingest a million rows with Polars in a sub-second, so I've yet to need concurrency because the size of the data stuff that I deal with isn't in that petabyte range where it actually would start to matter. 00:42:01.950 --> 00:42:05.060 So there are mechanisms there, but I can't speak to them very well. @@ -1495,7 +1495,7 @@ Yeah, I'm pretty sure this is somebody's playground. 
So I think they're maintain and it isn't a 00:45:12.480 --> 00:45:20.400 -github sorry it is in the github uh pollers repos not an organization not just yeah yeah and +github sorry it is in the github uh Polars repos not an organization not just yeah yeah and 00:45:20.400 --> 00:45:30.540 that was kind of what i was about to say right back to your question about sort of a trust thing well the people who write the library when they write another library to go with it you're probably okay absolutely um @@ -1768,7 +1768,7 @@ Yeah, for sure. And we only really covered a few of the plug-ins. 00:55:33.660 --> 00:55:47.940 -The awesome pollers list also has links to articles, links to blog posts, things on different interfaces for pollers on other languages that you can use to interact with it. +The awesome Polars list also has links to articles, links to blog posts, things on different interfaces for Polars on other languages that you can use to interact with it. 00:55:48.580 --> 00:55:54.460 So even over and above the libraries we talked about there, there's a fair amount of decent content there. diff --git a/youtube_transcripts/525-nicegui-youtube-named.srt b/youtube_transcripts/525-nicegui-youtube-named.srt index 2ec5f20..b65ecbb 100644 --- a/youtube_transcripts/525-nicegui-youtube-named.srt +++ b/youtube_transcripts/525-nicegui-youtube-named.srt @@ -772,7 +772,7 @@ Falko: And it turned out pretty, pretty successful. 194 00:19:45,920 --> 00:20:07,640 -Falko: I think we, we turned out, yeah, with less bugs, with a better performance and, and maybe, maybe a year later, just by was basically, put on hold and, isn't developed any, any longer, but it was a very important, tool for us. +Falko: I think we, we turned out, yeah, with less bugs, with a better performance and, and maybe, maybe a year later, just by was basically, put on hold and, isn't developed any, any longer, but it was a very important, tool for us. 
195 00:20:07,720 --> 00:20:09,920 diff --git a/youtube_transcripts/525-nicegui-youtube-named.vtt b/youtube_transcripts/525-nicegui-youtube-named.vtt index 9850528..2b2f4ee 100644 --- a/youtube_transcripts/525-nicegui-youtube-named.vtt +++ b/youtube_transcripts/525-nicegui-youtube-named.vtt @@ -580,7 +580,7 @@ WEBVTT And it turned out pretty, pretty successful. 00:19:45.920 --> 00:20:07.640 -I think we, we turned out, yeah, with less bugs, with a better performance and, and maybe, maybe a year later, just by was basically, put on hold and, isn't developed any, any longer, but it was a very important, tool for us. +I think we, we turned out, yeah, with less bugs, with a better performance and, and maybe, maybe a year later, just by was basically, put on hold and, isn't developed any, any longer, but it was a very important, tool for us. 00:20:07.720 --> 00:20:09.920 a very important foundation. diff --git a/youtube_transcripts/526-data-sci-with-ai-youtube-names.srt b/youtube_transcripts/526-data-sci-with-ai-youtube-names.srt index 4232517..e29b6ce 100644 --- a/youtube_transcripts/526-data-sci-with-ai-youtube-names.srt +++ b/youtube_transcripts/526-data-sci-with-ai-youtube-names.srt @@ -916,7 +916,7 @@ Michael Kennedy: So let's start with the first one. 230 00:13:11,340 --> 00:13:18,920 -Michael Kennedy: What tools, you know, things like pollers maybe or whatever is like jumping out at you over the last year or so that's like, wow. +Michael Kennedy: What tools, you know, things like Polars maybe or whatever is like jumping out at you over the last year or so that's like, wow. 231 00:13:19,580 --> 00:13:32,200 diff --git a/youtube_transcripts/526-data-sci-with-ai-youtube-names.vtt b/youtube_transcripts/526-data-sci-with-ai-youtube-names.vtt index 5c68c2c..7477872 100644 --- a/youtube_transcripts/526-data-sci-with-ai-youtube-names.vtt +++ b/youtube_transcripts/526-data-sci-with-ai-youtube-names.vtt @@ -688,7 +688,7 @@ WEBVTT So let's start with the first one. 
00:13:11.340 --> 00:13:18.920 -What tools, you know, things like pollers maybe or whatever is like jumping out at you over the last year or so that's like, wow. +What tools, you know, things like Polars maybe or whatever is like jumping out at you over the last year or so that's like, wow. 00:13:19.580 --> 00:13:32.200 Such a wonderful question. And I'll actually, I chatted about this on Vanishing Gradients with Akshay Agrawal, who built Marimo and develops Marimo, which I encourage everyone to check out. diff --git a/youtube_transcripts/528-python-apps-with-llm-building-blocks-youtube.srt b/youtube_transcripts/528-python-apps-with-llm-building-blocks-youtube.srt index 8a5bc05..555cfe0 100644 --- a/youtube_transcripts/528-python-apps-with-llm-building-blocks-youtube.srt +++ b/youtube_transcripts/528-python-apps-with-llm-building-blocks-youtube.srt @@ -3148,11 +3148,11 @@ maybe small enough that we could run it ourselves on our servers or on, on my Ma 788 00:59:29,140 --> 00:59:39,440 -I have the, the open AI 20 billion parameter model running as like my, my default LLM for any of my code that I write that needs to talk to an LLM. +I have the open AI 20 billion parameter model running as like my, my default LLM for any of my code that I write that needs to talk to an LLM. 789 00:59:40,340 --> 00:59:42,200 -So, if you, yeah, +So, if you, yeah, 790 00:59:42,260 --> 00:59:45,940 diff --git a/youtube_transcripts/528-python-apps-with-llm-building-blocks-youtube.vtt b/youtube_transcripts/528-python-apps-with-llm-building-blocks-youtube.vtt index 24f7d4c..556a574 100644 --- a/youtube_transcripts/528-python-apps-with-llm-building-blocks-youtube.vtt +++ b/youtube_transcripts/528-python-apps-with-llm-building-blocks-youtube.vtt @@ -2362,10 +2362,10 @@ You know, maybe we could, as you said, get away with running a smaller model. maybe small enough that we could run it ourselves on our servers or on, on my Mac mini here, M2 pro 32 gigs of Ram. 
00:59:29.140 --> 00:59:39.440 -I have the, the open AI 20 billion parameter model running as like my, my default LLM for any of my code that I write that needs to talk to an LLM. +I have the open AI 20 billion parameter model running as like my, my default LLM for any of my code that I write that needs to talk to an LLM. 00:59:40.340 --> 00:59:42.200 -So, if you, yeah, +So, if you, yeah, 00:59:42.260 --> 00:59:45.940 so we actually have a video on the Maroon channel on that one, but go check that out later. diff --git a/youtube_transcripts/529-cs-from-scratch-youtube.srt b/youtube_transcripts/529-cs-from-scratch-youtube.srt index 76400de..6dac187 100644 --- a/youtube_transcripts/529-cs-from-scratch-youtube.srt +++ b/youtube_transcripts/529-cs-from-scratch-youtube.srt @@ -1752,7 +1752,7 @@ Michael Kennedy: of the NES and you're giving it the potentially given it open s 439 00:44:44,020 --> 00:45:38,060 -David Kopec: right? Correct. Yeah. It runs several NES open source games. Like I said, I didn't put it for legal reasons, but it can run real commercial games as well for the NES. And so you're doing the whole soup to nuts, like entire system, except the audio, but it is Python and it's pure Python, except for, of course, the library we're using. And we're using, Pygame for, for displaying the window, which is written a lot of it in C. But because it's pure Python, the emulator doesn't run at full speed. So it runs on my Mac. It runs at about 15 frames per second. The real NES ran at 60 frames per second. So we leave as an exercise to the reader. Yeah, go use Cython or something like that. And I'm sure you can get this up with several different techniques. Definitely with Cython, but with several different techniques, you could get this up to 60 frames per second, but you're going to have to incorporate something that gets outside of just pure Python. +David Kopec: right? Correct. Yeah. It runs several NES open source games. 
Like I said, I didn't put it for legal reasons, but it can run real commercial games as well for the NES. And so you're doing the whole soup to nuts, like entire system, except the audio, but it is Python and it's pure Python, except for, of course, the library we're using. And we're using, Pygame for, for displaying the window, which is written a lot of it in C. But because it's pure Python, the emulator doesn't run at full speed. So it runs on my Mac. It runs at about 15 frames per second. The real NES ran at 60 frames per second. So we leave as an exercise to the reader. Yeah, go use Cython or something like that. And I'm sure you can get this up with several different techniques. Definitely with Cython, but with several different techniques, you could get this up to 60 frames per second, but you're going to have to incorporate something that gets outside of just pure Python. 440 00:45:38,500 --> 00:45:38,920 @@ -2608,7 +2608,7 @@ Michael Kennedy: book? I'm pretty positive for it. I mean, there's, there's a lo 653 01:05:34,080 --> 01:05:45,100 -David Kopec: What do you think about, and I know you talked about on the show before I heard about a while ago on the show about Mojo and about, you know, a total attempt that, you know, let's just +David Kopec: What do you think about, and I know you talked about on the show before I heard about a while ago on the show about Mojo and about, you know, a total attempt that, you know, let's just 654 01:05:46,120 --> 01:05:50,860 @@ -2644,11 +2644,11 @@ Michael Kennedy: And then, like I said, with 600,000 or whatever there are packa 662 01:07:05,540 --> 01:07:09,060 -Michael Kennedy: Like, okay, we're going to use pollers, and we're going to do this thing. +Michael Kennedy: Like, okay, we're going to use Polars, and we're going to do this thing. 663 01:07:09,350 --> 01:07:12,700 -Michael Kennedy: But when I call the pollers functions, I'm no longer running Python. 
+Michael Kennedy: But when I call the Polars functions, I'm no longer running Python. 664 01:07:12,920 --> 01:07:16,300 diff --git a/youtube_transcripts/529-cs-from-scratch-youtube.vtt b/youtube_transcripts/529-cs-from-scratch-youtube.vtt index a5a420a..67dcf9a 100644 --- a/youtube_transcripts/529-cs-from-scratch-youtube.vtt +++ b/youtube_transcripts/529-cs-from-scratch-youtube.vtt @@ -1315,7 +1315,7 @@ WEBVTT of the NES and you're giving it the potentially given it open source NES games and it can run them, 00:44:44.020 --> 00:45:38.060 -right? Correct. Yeah. It runs several NES open source games. Like I said, I didn't put it for legal reasons, but it can run real commercial games as well for the NES. And so you're doing the whole soup to nuts, like entire system, except the audio, but it is Python and it's pure Python, except for, of course, the library we're using. And we're using, Pygame for, for displaying the window, which is written a lot of it in C. But because it's pure Python, the emulator doesn't run at full speed. So it runs on my Mac. It runs at about 15 frames per second. The real NES ran at 60 frames per second. So we leave as an exercise to the reader. Yeah, go use Cython or something like that. And I'm sure you can get this up with several different techniques. Definitely with Cython, but with several different techniques, you could get this up to 60 frames per second, but you're going to have to incorporate something that gets outside of just pure Python. +right? Correct. Yeah. It runs several NES open source games. Like I said, I didn't put it for legal reasons, but it can run real commercial games as well for the NES. And so you're doing the whole soup to nuts, like entire system, except the audio, but it is Python and it's pure Python, except for, of course, the library we're using. And we're using, Pygame for, for displaying the window, which is written a lot of it in C. But because it's pure Python, the emulator doesn't run at full speed. 
So it runs on my Mac. It runs at about 15 frames per second. The real NES ran at 60 frames per second. So we leave as an exercise to the reader. Yeah, go use Cython or something like that. And I'm sure you can get this up with several different techniques. Definitely with Cython, but with several different techniques, you could get this up to 60 frames per second, but you're going to have to incorporate something that gets outside of just pure Python. 00:45:38.500 --> 00:45:38.920 Sure. @@ -1957,7 +1957,7 @@ WEBVTT book? I'm pretty positive for it. I mean, there's, there's a lot of things that could be better, but I think one of the real superpowers is it's approachable, but it's ceiling of Python. That is it's ceiling of what you can accomplish is not that low, right? You can go pretty far if you have CS skills and ideas. And then, you know, pip install, uv install, like the options of what is out there to just build and click together are incredible. 01:05:34.080 --> 01:05:45.100 -What do you think about, and I know you talked about on the show before I heard about a while ago on the show about Mojo and about, you know, a total attempt that, you know, let's just +What do you think about, and I know you talked about on the show before I heard about a while ago on the show about Mojo and about, you know, a total attempt that, you know, let's just 01:05:46.120 --> 01:05:50.860 redo it. and we'll get, let's keep the language syntax, but not the runtime. @@ -1984,10 +1984,10 @@ WEBVTT And then, like I said, with 600,000 or whatever there are packages, you know, how much of that are you willing to carve away to get a faster language? And what I think also is a really interesting aspect that people might not think about or take into account that often is you'll see a lot of these benchmarks like here's the three body solving the three body problem in Python and here's solving it in Mojo. Here's solving the three body problem in Rust and look at that huge difference. 
But what often happens in Python is you find yourself orchestrating native code anyway. 01:07:05.540 --> 01:07:09.060 -Like, okay, we're going to use pollers, and we're going to do this thing. +Like, okay, we're going to use Polars, and we're going to do this thing. 01:07:09.350 --> 01:07:12.700 -But when I call the pollers functions, I'm no longer running Python. +But when I call the Polars functions, I'm no longer running Python. 01:07:12.920 --> 01:07:16.300 I'm running like a drop of Python and a bunch of Rust. diff --git a/youtube_transcripts/530-anywidget-youtube.srt b/youtube_transcripts/530-anywidget-youtube.srt index 107303a..c54d9aa 100644 --- a/youtube_transcripts/530-anywidget-youtube.srt +++ b/youtube_transcripts/530-anywidget-youtube.srt @@ -364,7 +364,7 @@ Michael Kennedy: And then you've got the backend stuff where Python people live 92 00:05:59,760 --> 00:06:01,960 -Michael Kennedy: pollers, at Plotlib, et cetera. +Michael Kennedy: Polars, at Plotlib, et cetera. 93 00:06:02,920 --> 00:06:05,260 diff --git a/youtube_transcripts/530-anywidget-youtube.vtt b/youtube_transcripts/530-anywidget-youtube.vtt index b7234dc..675846d 100644 --- a/youtube_transcripts/530-anywidget-youtube.vtt +++ b/youtube_transcripts/530-anywidget-youtube.vtt @@ -274,7 +274,7 @@ WEBVTT And then you've got the backend stuff where Python people live doing NumPy, 00:05:59.760 --> 00:06:01.960 -pollers, at Plotlib, et cetera. +Polars, at Plotlib, et cetera. 00:06:02.920 --> 00:06:05.260 Do you want to riff on that challenge a little bit? diff --git a/youtube_transcripts/531-talk-python-in-prod-youtube.srt b/youtube_transcripts/531-talk-python-in-prod-youtube.srt index 524b02a..7d370b4 100644 --- a/youtube_transcripts/531-talk-python-in-prod-youtube.srt +++ b/youtube_transcripts/531-talk-python-in-prod-youtube.srt @@ -3608,7 +3608,7 @@ Michael Kennedy: That was my philosophy. 
903 00:53:46,340 --> 00:53:52,100 -Christopher Trudeau: The, the structure of your site is, it has a lot of different pieces to it +Christopher Trudeau: The, the structure of your site is, it has a lot of different pieces to it 904 00:53:52,460 --> 00:53:54,120 @@ -3892,7 +3892,7 @@ Christopher Trudeau: unnatural for a CMS. I can just mount it under slash blog a 974 00:57:39,080 --> 00:57:44,340 -Michael Kennedy: Yeah. Jingo is very powerful. It definitely is. And I actually talked a lot about that in the +Michael Kennedy: Yeah. Django is very powerful. It definitely is. And I actually talked a lot about that in the 975 00:57:44,820 --> 00:57:51,020 diff --git a/youtube_transcripts/531-talk-python-in-prod-youtube.vtt b/youtube_transcripts/531-talk-python-in-prod-youtube.vtt index 66132fd..9152ae0 100644 --- a/youtube_transcripts/531-talk-python-in-prod-youtube.vtt +++ b/youtube_transcripts/531-talk-python-in-prod-youtube.vtt @@ -2707,7 +2707,7 @@ WEBVTT That was my philosophy. 00:53:46.340 --> 00:53:52.100 -The, the structure of your site is, it has a lot of different pieces to it +The, the structure of your site is, it has a lot of different pieces to it 00:53:52.460 --> 00:53:54.120 using different technology. @@ -2920,7 +2920,7 @@ WEBVTT unnatural for a CMS. I can just mount it under slash blog and it'll work fine. Yeah. 00:57:39.080 --> 00:57:44.340 -Yeah. Jingo is very powerful. It definitely is. And I actually talked a lot about that in the +Yeah. Django is very powerful. It definitely is. And I actually talked a lot about that in the 00:57:44.820 --> 00:57:51.020 book, like evaluating web frameworks. 
But I would say before we, if we're going to that, diff --git a/youtube_transcripts/532-python-2025-year-in-review-youtube.vtt b/youtube_transcripts/532-python-2025-year-in-review-youtube.vtt index 73c9c76..4926618 100644 --- a/youtube_transcripts/532-python-2025-year-in-review-youtube.vtt +++ b/youtube_transcripts/532-python-2025-year-in-review-youtube.vtt @@ -1090,7 +1090,7 @@ WEBVTT and help making that more of a thing. 00:24:53.160 --> 00:24:55.080 -I just claim where I was the PEP delegate +I just claim where I was the PEP delegate 00:24:55.400 --> 00:24:56.240 for getting that in. @@ -2731,7 +2731,7 @@ WEBVTT Yes, exactly. 01:01:29.160 --> 01:01:31.400 -It should also be prefaced that Barry created the PEP process. +It should also be prefaced that Barry created the PEP process. 01:01:31.680 --> 01:01:33.000 He should have started that one. @@ -2776,7 +2776,7 @@ WEBVTT So it is different when you know each other in person, let's put it that way. 01:02:12.760 --> 01:02:20.060 -I think for the PEP process, I think for a lot of people, it's not obvious how difficult the process is. +I think for the PEP process, I think for a lot of people, it's not obvious how difficult the process is. 01:02:20.300 --> 01:02:21.760 I mean, it wasn't even obvious to me. @@ -2791,13 +2791,13 @@ WEBVTT I saw people making changes where I thought this is definitely something that should have 01:02:34.460 --> 01:02:38.660 -been discussed in a PEP and the discussion should be recorded in a PEP and all that. +been discussed in a PEP and the discussion should be recorded in a PEP and all that. 01:02:39.360 --> 01:02:44.680 -And I didn't understand why they didn't until basically until PEP 810. +And I didn't understand why they didn't until basically until PEP 810. 
01:02:44.750 --> 01:02:52.920 -So I did PEP 779, which was the, giving free threading, supported status at +So I did PEP 779, which was the, giving free threading, supported status at 01:02:52.940 --> 01:02:59.220 the start of the year. And the discussion there was, you know, sort of as expected and it's already, @@ -2815,10 +2815,10 @@ WEBVTT co-workers and one of my former co-workers, who had all had a lot of experience with Lazy Imports, 01:03:22.280 --> 01:03:25.300 -but not necessarily as much experience with the PEP process. +but not necessarily as much experience with the PEP process. 01:03:26.280 --> 01:03:30.940 -And Pablo took the front seat because he knew the PEP process and he's done +And Pablo took the front seat because he knew the PEP process and he's done 01:03:31.040 --> 01:03:34.740 like five peps in the last year or something, some ridiculous number. @@ -2860,10 +2860,10 @@ WEBVTT And, and that's, that, that cannot be an acceptable way to, discuss the evolution of the language. 01:04:56.220 --> 01:05:03.320 -Especially since apparently now every single PEP author of, of any contentious or semi-contentious pep. +Especially since apparently now every single PEP author of, of any contentious or semi-contentious pep. 01:05:03.700 --> 01:05:07.100 -Although I have to say PEP 810 had such broad support. +Although I have to say PEP 810 had such broad support. 01:05:07.260 --> 01:05:09.200 It was hard to call it contentious. diff --git a/youtube_transcripts/536-fly-inside-fastapi-cloud-youtube.vtt b/youtube_transcripts/536-fly-inside-fastapi-cloud-youtube.vtt index d4580d4..6d6f7a9 100644 --- a/youtube_transcripts/536-fly-inside-fastapi-cloud-youtube.vtt +++ b/youtube_transcripts/536-fly-inside-fastapi-cloud-youtube.vtt @@ -1972,7 +1972,7 @@ WEBVTT I think, I feel like I should maybe give a little bit of a, 00:41:36.820 --> 00:41:41.060 -I tell a little bit of the story of what's going on with, where did I put it? 
+I tell a little bit of the story of what's going on with, where did I put it? 00:41:42.560 --> 00:41:46.020 I don't think I paste it over here is what's going on with tail end right now. @@ -2992,7 +2992,7 @@ WEBVTT Yeah, exactly. 01:04:45.740 --> 01:04:52.440 -So, so I can't really move it because it has, you know, some sub domain +So, so I can't really move it because it has, you know, some sub domain 01:04:52.560 --> 01:04:53.220 of talk Python, right? diff --git a/youtube_transcripts/539-catching-up-with-the-python-typing-council-original.vtt b/youtube_transcripts/539-catching-up-with-the-python-typing-council-original.vtt index 6993e95..deaf428 100644 --- a/youtube_transcripts/539-catching-up-with-the-python-typing-council-original.vtt +++ b/youtube_transcripts/539-catching-up-with-the-python-typing-council-original.vtt @@ -313,7 +313,7 @@ WEBVTT Yeah, it postdates most of the peps. 00:07:38.370 --> 00:07:42.400 -So initially, the type system was created just through the regular PEP process, +So initially, the type system was created just through the regular PEP process, 00:07:42.610 --> 00:07:43.960 which means that something gets submitted. @@ -448,7 +448,7 @@ WEBVTT in the same way. Yeah. Maybe a smaller example that is an example of something that would have been 00:10:17.160 --> 00:10:23.040 -too small for a PEP and hard to accomplish before the typing council existed. And this +too small for a PEP and hard to accomplish before the typing council existed. And this 00:10:23.040 --> 00:10:28.860 is actually a change that I pushed through before I was on the typing council, but the typing @@ -931,7 +931,7 @@ WEBVTT Yeah, that is hard because often those peps build on top of each other. 
00:21:39.900 --> 00:21:45.320 -So then in the extreme, you might see like one thing in one PEP and then another PEP that +So then in the extreme, you might see like one thing in one PEP and then another PEP that 00:21:45.420 --> 00:21:47.640 adds an aspect of it, another one that adds another aspect. @@ -943,7 +943,7 @@ WEBVTT One of the things I did recently was rewrite the typed dict spec. 00:21:55.180 --> 00:22:01.860 -dict is a feature of the python type system that has been added to a lot from one PEP to another +dict is a feature of the python type system that has been added to a lot from one PEP to another 00:22:02.720 --> 00:22:07.280 and i ended up rewriting the whole thing to basically put all those features together in @@ -1195,10 +1195,10 @@ WEBVTT Does ty do something like that, Carl? 00:27:45.440 --> 00:27:47.360 -Yes, we also have inlay type ints. +Yes, we also have inlay type hints. 00:27:47.660 --> 00:27:48.500 -Yeah, inlay type ints. +Yeah, inlay type hints. 00:27:48.560 --> 00:27:49.000 That's what it's called. @@ -2929,7 +2929,7 @@ WEBVTT before we can approve them. 01:01:45.780 --> 01:01:48.360 -I think there's PEP 747 for type form, +I think there's PEP 747 for type form, 01:01:48.660 --> 01:01:52.080 which I think we recommended its acceptance, @@ -3358,7 +3358,7 @@ WEBVTT peps have just been community members who saw something they 01:09:31.839 --> 01:09:35.960 -wanted to improve, proposed a PEP and saw it to completion. If +wanted to improve, proposed a PEP and saw it to completion. 
If 01:09:36.060 --> 01:09:38.560 there's something you want to see in the type system, then you diff --git a/youtube_transcripts/540-modern-python-monorepo-timeline-original.vtt b/youtube_transcripts/540-modern-python-monorepo-timeline-original.vtt index 3b053fd..e08d348 100644 --- a/youtube_transcripts/540-modern-python-monorepo-timeline-original.vtt +++ b/youtube_transcripts/540-modern-python-monorepo-timeline-original.vtt @@ -241,7 +241,7 @@ WEBVTT And people make decisions in both foundation and projects 00:05:44.960 --> 00:05:48.200 -or in PMCs so-called or project management committees. +or NPMCs so-called or project management committees. 00:05:49.120 --> 00:05:50.960 And Airflow is one of the PMCs. @@ -1531,7 +1531,7 @@ WEBVTT define the dependency groups in our by projects and it's it's nice to uh how it's how it's really 00:34:05.410 --> 00:34:12.280 -nice how it works with uv so we are very happy with this particular uh dependency group PEP as +nice how it works with uv so we are very happy with this particular uh dependency group PEP as 00:34:12.280 --> 00:34:18.080 well as the uh inline scripts i think right the inline scripts are cool i you know especially diff --git a/youtube_transcripts/544-wheel-next.vtt b/youtube_transcripts/544-wheel-next.vtt index 84d500f..72f299b 100644 --- a/youtube_transcripts/544-wheel-next.vtt +++ b/youtube_transcripts/544-wheel-next.vtt @@ -2533,7 +2533,7 @@ WEBVTT done a ton of work on basically implementing the standard in uv. So we have like a working 00:48:53.860 --> 00:48:59.560 -implementation that we've used to, yeah, you can actually install it, from, you know, +implementation that we've used to, yeah, you can actually install it, from, you know, 00:48:59.560 --> 00:49:04.820 we basically distribute it to a slightly different URL so you can install it and test it. 
but, @@ -2542,10 +2542,10 @@ WEBVTT uh, yeah, that's been, that fork has evolved a lot or that branch has evolved a lot and it's 00:49:10.240 --> 00:49:14.600 -been a lot of work to, it's been incredibly helpful for the design process for us to understand +been a lot of work to, it's been incredibly helpful for the design process for us to understand 00:49:14.840 --> 00:49:19.080 -like what's hard, what's easy. And then, I also think it's important for PES just to have +like what's hard, what's easy. And then, I also think it's important for PES just to have 00:49:19.100 --> 00:49:22.600 working implementations too. And I mean, a lot of people agree that's not an awful point, but @@ -2563,7 +2563,7 @@ WEBVTT I personally have a lot of admiration for the work done in free-threading Python. 00:49:49.100 --> 00:49:56.380 -especially to the PEP and i think sam gross who is the main author managed to make significant +especially to the PEP and i think sam gross who is the main author managed to make significant 00:49:56.550 --> 00:50:02.380 amount of progress as he was coming up with prototypes that uh it's not just my word let @@ -3409,7 +3409,7 @@ WEBVTT But, you know, it, it means not only are these in pip, they might be using an older version of Python, because they don't want to, they don't want to shake it up. And, you know, that, that those are going to be the long tails that are going to be hard. I guess one more thought about what's next here before we call this a show here. 01:06:23.000 --> 01:06:28.360 -What is the minimal? We talked about a PEP 825, the minimal pep. What is the minimal amount of +What is the minimal? We talked about a PEP 825, the minimal pep. What is the minimal amount of 01:06:28.580 --> 01:06:34.940 adoption? Right? 
So if if the top five biggest data science and machine learning libraries adopt this,

From 2d38376d61aaeefdd355281bd87eef21d80e58ba Mon Sep 17 00:00:00 2001
From: Michael Kennedy
Date: Wed, 6 May 2026 13:32:01 -0700
Subject: [PATCH 12/16] transcripts

---
 ...ython-at-any-scale-with-ray-transcript.vtt | 1111 +++++++++++++++++
 1 file changed, 1111 insertions(+)
 create mode 100644 transcripts/parallel-python-at-any-scale-with-ray-transcript.vtt

diff --git a/transcripts/parallel-python-at-any-scale-with-ray-transcript.vtt b/transcripts/parallel-python-at-any-scale-with-ray-transcript.vtt
new file mode 100644
index 0000000..bbb2645 100644
--- /dev/null
+++ b/transcripts/parallel-python-at-any-scale-with-ray-transcript.vtt
@@ -0,0 +1,1111 @@
+WEBVTT
+
+00:00:00.000 --> 00:00:04.640
+When OpenAI trained GPT-3, they didn't roll their own orchestration layer.
+
+00:00:04.980 --> 00:00:15.740
+They used Ray, an open-source Python framework born out of the same Berkeley Research Lab lineage that gave us Apache Spark. And here's the twist. Ray was originally built for reinforcement
+
+00:00:15.740 --> 00:00:26.520
+learning research and then quietly faded as RL hit a wall. Until ChatGPT showed up, suddenly reinforcement learning was back. As the post-training step, that turns a raw language
+
+00:00:26.520 --> 00:00:37.740
+model into something genuinely useful. Edward Oakes and Richard Liaw, two founding engineers behind Ray and AnyScale, joined me on Talk Python to tell that story. We'll trace Ray from its
+
+00:00:37.740 --> 00:00:42.860
+RISE lab origins at UC Berkeley to powering some of the largest training runs in the world.
+
+00:00:43.340 --> 00:00:54.560
+We'll talk about what Ray actually is, a distributed execution engine for AI workloads, and how a few lines of Python become work running across hundreds of GPUs. 
We'll cover Ray Data for
+
+00:00:54.560 --> 00:01:07.000
+multimodal pipelines, the dashboard, the VS Code remote debugger, KubeRay for Kubernetes, and where Ray fits alongside Dask, multiprocessing, and AsyncIO. If you've ever stared at a single
+
+00:01:07.000 --> 00:01:11.580
+machine Python script and thought, there has to be a better way to scale this, this one's for you.
+
+00:01:11.580 --> 00:01:18.220
+It's Talk Python To Me, episode 547, recorded April 27th, 2026.
+
+00:01:18.220 --> 00:01:40.820
+Welcome to Talk Python To Me, the number one Python podcast for developers and data scientists.
+
+00:01:40.820 --> 00:01:46.600
+This is your host, Michael Kennedy. I'm a PSF fellow who's been coding for over 25 years.
+
+00:01:47.140 --> 00:02:01.680
+Let's connect on social media. You'll find me and Talk Python on Mastodon, BlueSky, and X. The social links are all in your show notes. You can find over 10 years of past episodes at talkpython.fm. And if you want to be part of the show, you can join our recording live streams.
+
+00:02:01.840 --> 00:02:12.040
+That's right, we live stream the raw uncut version of each episode on YouTube. Just visit Talk Python.fm slash YouTube to see the schedule of upcoming events. Be sure to subscribe there
+
+00:02:12.040 --> 00:02:26.740
+and press the bell so you'll get notified anytime we're recording. This episode is sponsored by Sentry's Seer. If you're tired of debugging in the dark, give Seer a try. There are plenty of AI tools that help you write code, but Sentry's Seer is built to help you fix it when it breaks.
+
+00:02:27.080 --> 00:02:39.280
+Visit talkpython.fm/sentry and use the code Talk Python26, all one word, no spaces, for $100 in Sentry credits. What if your AI agents worked like FastAPI microservices,
+
+00:02:39.780 --> 00:02:44.540
+typed, autonomous, and discovering each other at runtime? That's the world AgentField is building.
+
+00:02:45.120 --> 00:02:56.420
+Join them at talkpython.fm/AgentField. 
Edward, Richard, welcome to Talk Python To Me. Great to be here with both of you and talking about parallel computing and beyond. Thanks for having us on.
+
+00:02:56.620 --> 00:03:01.480
+Excited to be here and share some hopefully interesting information about Ray with the audience.
+
+00:03:01.480 --> 00:03:16.060
+Thanks for having us. I don't know how many people know about Ray, but it's a really cool parallel computing framework that's got this sort of big data angle and it's got an AI angle. We're going to talk about both of those and dive into the history and maybe even the future, who knows?
+
+00:03:16.260 --> 00:03:21.260
+But before we get into those, let's just start with your stories. Edward, I'll let you go first.
+
+00:03:21.780 --> 00:03:33.580
+Introduce you all, please. Yeah, my name is Edward, also go by Ed, and I've been working on Ray since I think about 2019, maybe late 2018. At that time, I was a grad student at UC Berkeley. So that's
+
+00:03:33.580 --> 00:03:45.360
+actually where Richard and I met, and that's where Ray kind of originated. So we were grad students in what was called the RISE Lab under Professor Ion Stoica. So he's also the professor that had,
+
+00:03:45.600 --> 00:03:56.060
+and the predecessor to that lab is what Spark came out of. Oh yeah, really? Wow. Yeah. So a lot of people view Ray as like kind of a successor to Spark. That's not really how we talk about it. I think
+
+00:03:56.060 --> 00:04:05.660
+it's kind of a different system solving different problems, but we did originate from the same university and sort of a similar lab. Yeah. And just kind of about me, what I'm interested in,
+
+00:04:05.900 --> 00:04:15.960
+I would say I'm not really like an AI person as much as I am like an infrastructure and like distributed computing person. 
So the reason why I was originally attracted to working on Ray and why
+
+00:04:15.960 --> 00:04:29.480
+I'm still doing it however many years later is I just really feel motivated by this idea of like providing an easier way for our users to leverage like large scale computing and sort of like building
+
+00:04:29.480 --> 00:04:40.180
+that like abstraction or like bridge layer that enables people to do it. Incredible. Richard, how about you? I'm one of the founding engineers here with Edward, and currently I'm on more of the
+
+00:04:40.180 --> 00:04:46.940
+product management side at Anyscale. And my background here is that I was actually an undergrad
+
+00:04:46.940 --> 00:05:01.080
+that was working on various like machine learning research projects. And at the time, Ray was still not like a very, it wasn't even like an early project yet. But the thing that was very exciting
+
+00:05:01.080 --> 00:05:14.760
+at Berkeley was reinforcement learning. At the time, like DeepMind was getting a lot of popularity and press for a game, like sort of innovations that they were doing for game AI. And eventually that
+
+00:05:14.760 --> 00:05:27.200
+sort of culminated in the AlphaGo moment. Tell people what that is. I'm sure some of us know, but that was kind of the first time that an AI system beat other competitors, where it wasn't just
+
+00:05:27.200 --> 00:05:39.860
+a memorization, or like a, we're going to load every possible combination of moves into the system, right? Tell us about that. I didn't follow it too closely, but at the time there were previous
+
+00:05:39.860 --> 00:05:52.560
+game AIs, like, like, you know, IBM sort of. Yeah. Stockfish, I think is what it's called. The original like chess AI. Right. And I think Go was a much more high dimensional complex game. So there
+
+00:05:52.560 --> 00:06:06.280
+was a lot. 
The first one, IBM won beat one of the grandmasters, but people were like, yeah, but it doesn't really count because it just knew all the possibilities and played it out, you know, which is, which I think that's a fair criticism. Yeah. And the other thing is it was like a very like hand + +00:06:06.280 --> 00:06:16.740 +tuned algorithm that took like years to build. So it was, it was like many people kind of using chess knowledge to like build a search algorithm that was like, you know, very specific to chess. + +00:06:16.880 --> 00:06:28.700 +AlphaGo one was, first of all, like the game was much harder than chess. Second, like it was, you know, a widely staged event. And then in terms of the learning algorithms, they did use + +00:06:28.700 --> 00:06:39.760 +reinforcement learning to train the model. And as far as I understand, like a lot of the ways they applied the machine learning techniques were not memorization or were not caching, but rather like + +00:06:39.760 --> 00:06:50.500 +having sort of like neural networks that could estimate the state and the value and the current of the current position and to be able to sort of extend and decide what the next move was given + +00:06:50.500 --> 00:07:00.940 +their internal representation of what the state was. So yeah, so that was obviously very, very impressive. And a lot of the technology that led to that moment was reinforcement learning. + +00:07:00.940 --> 00:07:14.240 +For us in Berkeley, we were interested in being able to sort of provide that sort of technology to researchers also at Berkeley that didn't have access to large engineering teams and Google's + +00:07:14.240 --> 00:07:25.280 +infrastructure and stuff like that. And so that's kind of where Ray came out of. Like it was baked out of doing reinforcement learning research and machine learning research and sort of evolved from that. + +00:07:25.280 --> 00:07:29.700 +Give people a look inside this research lab that y'all are talking about. 
It sounds super interesting. + +00:07:30.260 --> 00:07:41.440 +And I guess I have a couple of things that are wondering about. One is just, you know, what is a lab that generates like grid computing systems and, you know, large big data systems? + +00:07:41.980 --> 00:07:50.060 +How do you think about problems and then solve them? I know what a chemistry lab does, but I'm not entirely sure what this thing does to result in that coming out. And then two, + +00:07:50.060 --> 00:08:02.160 +how does it go from being something created in the lab that's really powerful or useful to either an open source product or even a product product service type product? Like what's that journey look like? + +00:08:02.320 --> 00:08:11.920 +One thing that I think is pretty unique. Well, let me take a step back for this type of like computer systems research where, you know, like grid computing or like networking or like large scale data + +00:08:11.920 --> 00:08:22.520 +processing. It can be hard to do that in an academic setting because a lot of times the like requirements and the infrastructure are like, well, they're expensive. And also like the types of problems + +00:08:22.520 --> 00:08:32.220 +that you work on, you know, like data center networking algorithms are only relevant to like the few companies that operate data centers. So it can be kind of hard to do that in an academic setting. + +00:08:32.380 --> 00:08:45.600 +Yeah. I was thinking about that when I was preparing for the show is like, I really want to try out some things with Ray and some of this computing stuff, but I just don't have the problems or the data that justify like genuinely using it, not just taking it through a sample. You know what I mean? + +00:08:45.600 --> 00:08:47.860 +I feel like academics would have a similar issue. + +00:08:48.100 --> 00:09:01.640 +The thing that was unique. 
So the lab that we were in was called the RISE Lab and the one before it was called the AMP Lab and the one after it was called the Sky Lab. And each of them kind of had a theme. So the AMP Lab was like mostly about like big data. So that was like the one that generated
+
+00:09:01.640 --> 00:09:14.560
+Spark. The RISE Lab was about like machine learning and reinforcement learning. And then the Sky Lab is about like sky computing. So like cross cloud and stuff like that. Richard and I are a little bit less familiar with that one because it was after we left. But the thing about the AMP and specifically the
+
+00:09:14.560 --> 00:09:28.640
+RISE Lab is that it was very like interdisciplinary. So the professor I mentioned that we work with, Ion, he had really intentionally set it up so that, you know, the students who are really passionate about like distributed systems and networking were working like really closely with the students who
+
+00:09:28.640 --> 00:09:37.360
+were the like machine learning and reinforcement learning experts. And then there were also folks who were really interested in security were also like working closely with both of them.
+
+00:09:37.360 --> 00:09:49.360
+And I think that kind of like cross pollination really helped yield like interesting project ideas and more kind of like realistic requirements. Because what Ray originally came from was like
+
+00:09:49.360 --> 00:09:59.520
+one of classmates and then the co-founder of AnyScale or the two of them, Robert and Philip, they were more like ML focused people. And they were trying to do reinforcement learning research,
+
+00:09:59.520 --> 00:10:09.520
+but they were trying to sort of put a square peg in a round hole by doing it on Spark. And it turned out that Spark like just really wasn't built for the requirements of reinforcement learning,
+
+00:10:09.520 --> 00:10:21.200
+which are a little bit more like dynamic in nature. 
And it was that kind of, and then they had access to, you know, professors who professors and students who were passionate about like distributed systems
+
+00:10:21.200 --> 00:10:34.720
+and data systems and stuff. So that's kind of where Ray came from was like organically, you had students who were trying to do reinforcement learning, they kind of hit this wall that the tools like didn't help them solve. So it was like, okay, let's start a new project and build the tool that we need.
+
+00:10:34.720 --> 00:10:37.360
+Yeah, makes a lot of sense. Richard, anything you want to add to that?
+
+00:10:37.360 --> 00:10:48.560
+Edward comes a little bit from the more systems side. And I was a little bit more on like the machine learning applied side. And I remember when I was in the RISE lab, there was a lot of
+
+00:10:49.120 --> 00:11:03.680
+interactions with the, like the one of the best machine learning, like the best machine learning groups in Berkeley as well. Like, like Mike Jordan, who, who is one of like the, like very, very famous AI professor had his group sort of co-located in the same space,
+
+00:11:03.680 --> 00:11:05.280
+in addition to all these systems people.
+
+00:11:05.280 --> 00:11:08.080
+You're talking about BAIR, right? Berkeley AI Research.
+
+00:11:08.080 --> 00:11:19.440
+There's BAIR. And then there's also like a subset, which is like a lot of the Mike's students were also in, in the RISE lab. And in addition to that, there was also a biannual. So every six months,
+
+00:11:19.440 --> 00:11:27.680
+we would have a industry retreat. So there'd be about 200, 250 people that show up at like a conference
+
+00:11:27.680 --> 00:11:42.000
+or like a hotel. And 70 of them would be the students that we just talked about. And 180 of them would be like top researchers or like executives from the industry. 
So we were able to
+
+00:11:42.000 --> 00:11:52.720
+sort of cross pollinate and share ideas and collaborate and get feedback from folks like Bill Dally, who was, who's like the NVIDIA's chief scientist, or, you know, like a lot of really,
+
+00:11:52.720 --> 00:11:57.040
+you know, top people at Google who were doing recommendation systems and so on and so forth.
+
+00:11:57.040 --> 00:12:08.560
+So that was like that sort of moment was, was very often reoccurring. So every six months, and then we would just have this opportunity to actually touch base with what was happening in
+
+00:12:08.560 --> 00:12:14.560
+the industry and therefore drive innovation so that we could be impactful and do impactful projects.
+
+00:12:14.560 --> 00:12:23.280
+What's the relationship between reinforcement learning and like the transformer stuff that we see powering LLMs these days? How similar or different is that?
+
+00:12:23.280 --> 00:12:35.520
+Reinforcement learning is more of a, you can think of it as like a learning paradigm, right? It's like a way how it's kind of like this framework that you would use to, to set up a problem. And then,
+
+00:12:35.520 --> 00:12:49.360
+and like, it's fundamentally about like having a agent or like a, some actor or agent that interacts with the world, gets rewards or like some feedback signal from that world, and then sort of learns
+
+00:12:49.360 --> 00:13:01.760
+from that and continually updates its like, its policy. It's more focused on solving a single problem, you might say, or like a category problems, you know? It's just this very, very generic framework,
+
+00:13:01.760 --> 00:13:13.920
+right? And it can apply to like, you can imagine like the same thing is how like a mouse would interact with a maze or like a child would interact with a toy, right? So it's just a framework. It's like a
+
+00:13:13.920 --> 00:13:26.720
+symbolic representation of this framework. 
And whereas Transformers is like a, it's like a model architecture, right? It's like a way for us to be able to ingrain a particular modeling heuristic + +00:13:26.720 --> 00:13:39.200 +that tells us that like, hey, for certain types of data, in particular sequence data, there are patterns that you can learn across the sequences, and that can improve like the quality of modeling. + +00:13:39.200 --> 00:13:52.560 +And so like the two can be worked, like can be used together, you can do reinforcement learning with a transformer, but you can also have a transformer that stands by itself as trained with supervised learning and reinforcement learning that is done without a transformer model. + +00:13:52.560 --> 00:13:52.960 +Interesting. + +00:13:52.960 --> 00:14:04.640 +That question that you asked is actually, I think, like tightly intertwined with the history of Ray, because as we mentioned in like the 2017-2018 era, Ray was kind of originally motivated by + +00:14:04.640 --> 00:14:14.400 +reinforcement learning. But that reinforcement learning had like very little to do with like transformer models or LLMs. It was things along the line of the AlphaGo project that we talked about, + +00:14:14.400 --> 00:14:25.280 +or it was also being used a lot for robotics at Berkeley. And then reinforcement learning actually, like sort of, I would say died out for a while or like got less popular, kind of like hit a wall + +00:14:25.280 --> 00:14:39.280 +and it didn't, it was like viewed as not that practical. So the original Ray library, like the most popular one in the early days is called RLLib. And that was like far and away the most successful Ray library for a long time. And then it kind of like petered out for a while. + +00:14:39.280 --> 00:14:41.840 +RL for reinforcement learning, right? Something like that? + +00:14:41.840 --> 00:14:43.280 +Yeah, that's right. Reinforcement learning. + +00:14:43.280 --> 00:14:43.520 +Okay. 
+ +00:14:43.520 --> 00:14:57.360 +And then we had this kind of ChatGPT or like LLM moment, which by the way, Ray is also like tightly intertwined with because GPT-3 and I think 4, I'm not actually sure about 4, but at least 3 was + +00:14:57.360 --> 00:15:11.280 +trained using Ray as like the compute framework by OpenAI. And the really big innovation that went from like GPT to ChatGPT was by applying reinforcement learning to the transformer models. + +00:15:11.280 --> 00:15:24.560 +So this technique is called post-training, which is like you have, you do the supervised learning that Richard was kind of talking about, or you do like what they call pre-training and you generate these like model weights that basically encode like a huge amount of information, like the whole internet. + +00:15:24.560 --> 00:15:34.800 +And then they are, but they're kind of unrefined, right? You can think of it as like a, I don't know, a child with a lot of intelligence, but not very good at communication or something. And they applied + +00:15:34.800 --> 00:15:39.760 +reinforcement learning techniques as a way to sort of tailor the model to specific use cases. + +00:15:39.760 --> 00:15:45.600 +So the first one was for this like chat application. So that's how you go from like GPT to ChatGPT. + +00:15:45.600 --> 00:15:57.360 +And then another example of that more recently is like these coding agents are also a different version of like post-trained LLMs or transformers. And we're seeing, so we originally had Ray kind + +00:15:57.360 --> 00:16:08.560 +of used for reinforcement learning, kind of dipped and it was used for like LLM things. And now we're actually seeing a huge resurgence in reinforcement learning specifically for this like post-training use case that I was talking about. + +00:16:08.560 --> 00:16:17.840 +Are you guys surprised just how far these GPT type things and clod code and so on have come given that you saw a little bit before then? 
+
+00:16:17.840 --> 00:16:30.080
+I remember like Ion would occasionally pull me aside and say like, hey, you should work on like program synthesis and program synthesis is like effectively is like a model. It's a like a,
+
+00:16:30.080 --> 00:16:44.880
+it's a, it's a, it's a machine learning problem where you try to like get models to write code. And then I don't think that was definitely not the right approach. Like that's not what ended up like not, it wasn't like the program synthesis line of work that ended up with coding agents, but like Ion was always
+
+00:16:44.880 --> 00:16:57.840
+like, hey, why don't we go work on program synthesis? I have no idea what program synthesis is. I like, I have no expertise in this thing, but he wanted to work on the problem. Well, which is funny, because like in five years, seven years later, it turns out like this is like the biggest known
+
+00:16:57.840 --> 00:17:00.560
+economically valuable sort of application of this machine learning.
+
+00:17:00.560 --> 00:17:05.680
+And solved in just a completely different way that I don't think anybody really saw coming.
+
+00:17:06.560 --> 00:17:10.640
+That was definitely an emergent thing. At least for me, I didn't expect that at all.
+
+00:17:10.640 --> 00:17:22.000
+Yeah. Well, I'm blown away by it. I honestly, I'm happy that it exists. I get to do cool stuff with it, but sure didn't see it coming. This portion of Talk Python To Me is brought to you by Sentry and
+
+00:17:22.000 --> 00:17:34.000
+Seer AI. There are plenty of AI tools that help you write code, but Sentry Seer is built to help you fix it when it breaks. The difference is context. Seer isn't just guessing based on syntax. It's
+
+00:17:34.000 --> 00:17:45.520
+analyzing your actual Sentry data, your stack traces, logs, and failure patterns. 
Because it has the full context, it can (a) spot buggy code in review and help prevent issues before they happen,
+
+00:17:45.520 --> 00:17:57.360
+and (b) identify the root cause of production errors. It can even draft a fix and hand the work off to an agent like Cursor to open a PR for you. Seer turns Sentry into a complete loop. You have your
+
+00:17:57.360 --> 00:18:07.600
+traces, errors, logs, and replays to see the problem, and now AI to help solve it. Join millions of devs at companies like Claude, Disney Plus, and even Talk Python who use Sentry to move
+
+00:18:07.600 --> 00:18:19.680
+faster. Check them out at talkpython.fm/sentry and use code talkpython26, all one word, for $100 in Sentry credits. Thank you to Sentry for supporting Talk Python.
+
+00:18:21.120 --> 00:18:26.080
+Let's switch over and talk about maybe set the foundations we're talking about, Ray, a little bit.
+
+00:18:26.080 --> 00:18:37.680
+And by that, I mean, let's talk about like different options for parallel computing and that kind of thing. So we have this sort of spectrum of compute, and it sounds to me like
+
+00:18:37.680 --> 00:18:50.000
+the history, the idea is, hey, let's move towards scaling this compute out across all the cores, across multiple machines, so that when you're doing training and reinforcement learning, things like
+
+00:18:50.000 --> 00:18:55.040
+that, you can actually take advantage of all the compute. And I'm guessing GPUs as well, right?
+
+00:18:55.040 --> 00:18:57.520
+Yeah, GPUs are definitely like bread and butter for Ray.
+
+00:18:57.520 --> 00:19:07.840
+So at the very smallest layer of parallelism, at least in Python land, we've got asyncio, which really still runs on a single thread, but it uses waiting periods like waiting on databases,
+
+00:19:07.840 --> 00:19:13.760
+waiting on API calls, and so on to interlace work without true parallelism, but still kind of. 
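The asyncio behavior described here can be sketched in a few lines. This is an illustrative example, not code from the episode: the names and delays are made up, but it shows the single-threaded event loop overlapping waits.

```python
import asyncio
import time

# One thread only: while one coroutine awaits (a stand-in for a database
# or API call), the event loop runs the others, so the waits overlap.
async def fake_io(name, delay):
    await asyncio.sleep(delay)  # yields to the event loop instead of blocking
    return name

async def main():
    start = time.perf_counter()
    # Three 0.1s "requests" finish in roughly 0.1s total, not 0.3s,
    # because the waiting periods are interlaced, not truly parallel.
    results = await asyncio.gather(
        fake_io("db", 0.1), fake_io("api", 0.1), fake_io("file", 0.1)
    )
    return results, time.perf_counter() - start

results, elapsed = asyncio.run(main())
```

If the three sleeps were blocking calls, the same work would take the sum of the delays; with asyncio it takes roughly the longest one.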
+
+00:19:13.760 --> 00:19:18.160
+We have threads, which really, until recently, didn't do anything much different.
+
+00:19:18.720 --> 00:19:29.680
+Right? It's just less control structures, right? Because we had the GIL, and then we now we've got free-threaded Python. So it's a little bit better, but you got to have the library support. We have multiprocessing and sub processes. And that's
+
+00:19:29.680 --> 00:19:42.160
+kind of what we have out of the box in Python. But then we have stuff that both of you all are familiar with, or have built things like databases like Spark, or Ray, we've also got Dask and Coiled,
+
+00:19:42.160 --> 00:19:52.720
+which is, I'm interested to hear how you all see yourself as the same or different than Dask and Coiled and so on, which itself is different than when it started, at least Coiled. So it may be like,
+
+00:19:52.720 --> 00:19:58.560
+just speak to this, this arc of trying to get more compute out of our apps.
+
+00:19:58.560 --> 00:20:08.800
+I would kind of try to organize like a framework for thinking about those. So, and this is a little bit off the cuff. So hopefully it's, you guys can follow it. But I would say there's kind of like
But if you try to do anything that kind of goes outside the bounds of that, you start to
+
+00:20:44.640 --> 00:20:56.480
+run into a little bit of trouble because it has kind of an opinionated, like high level API, and an opinionated way that like data moves throughout the system, for example. And then you have kind of on
+
+00:20:56.480 --> 00:21:07.600
+the more general purpose and you have like Ray, and I would say Dask is also more general purpose than the others. And so, so you have like specific to general purpose. And then there's also, I think,
+
+00:21:07.600 --> 00:21:20.960
+like the scale. So like asyncio is extremely useful for making many like concurrent, like IO bound requests, like HTTP requests, database queries, file operations, like anything like that. But it only
+
+00:21:20.960 --> 00:21:30.720
+works within one thread. Yeah, it feels a little bit like a scale up lever, even though you're not technically scaling up the hardware. It's like, yeah, you're still in the same box, just the box
+
+00:21:30.720 --> 00:21:42.320
+can do a little bit more. So asyncio is kind of scale up within a thread even. And then you can also have like scale up within a process. So if you have like multi threading, of course, like with free
+
+00:21:42.320 --> 00:21:56.880
+threading, you can actually get like parallelism. What most people do to scale up within a process, like historically with Python is they call into like native code, right? So you're using NumPy, you have basically like Python bindings, but in reality, almost all of the compute is happening in
+
+00:21:56.880 --> 00:22:07.120
+like a C extension library. And that's, that's also true for Torch. So those allow you to kind of like scale up to varying degrees, so like scale up within a thread within a process. And then
+
+00:22:07.120 --> 00:22:18.880
+multiprocessing also lets you scale up within like a whole host that you could use, you know, 64 cores of machine. 
And then at some point, you can't even fit on one machine anymore, you need to scale + +00:22:18.880 --> 00:22:29.760 +out even more. And that's where you need some kind of like parallel computing or like grid computing or kind of cluster framework, like Ray or Dask. It could be you need to scale up because of memory, + +00:22:29.760 --> 00:22:40.800 +or it could be a CPU, right? I think often people think just CPU, right? We just got to compute more, but it could be we've got a terabyte of stuff to try to process. Could be or it could be also for, + +00:22:40.800 --> 00:22:54.960 +because you need to use more GPUs, either for compute or for memory, like some of these large scale LLMs, you can't even fit it inside of like one single GPU. So you need to kind of like shard it across many machines. Yeah, we'll even see there's some, + +00:22:54.960 --> 00:23:07.520 +some ways to put these together, right? Like, I guess it's probably pretty straightforward, but we'll talk about the programming model and stuff. But you theoretically could use, I don't know, multiprocessing or something in your code, but then scale that across machines + +00:23:07.520 --> 00:23:10.160 +with Ray. Is that possible? You could. Does it make sense? + +00:23:10.160 --> 00:23:24.720 +Between a lot of these things, I think there's like some kind of unique parts and then some overlap. So like Ray can be used just on one machine. In that case, you know, Ray kind of manages its own processes and does like the delegation of work from like what we call your + +00:23:24.720 --> 00:23:36.000 +driver process, which is like the main Python program to the other processes, which in Ray terminology are like tasks and actors. If you're running Ray on one machine, then it looks quite similar to + +00:23:36.000 --> 00:23:47.280 +multiprocessing just with a little bit more opinionated of an API and some like integrated like observability features and stuff like that. 
But Ray definitely like is designed around the like
+
+00:23:47.280 --> 00:23:52.480
+multi-node kind of larger scale cluster use case. That's like where the value really comes in.
+
+00:23:52.480 --> 00:23:59.920
+I think you had a question about like Dask and Coiled. I think Dask and Coiled, they were more of like a
+
+00:23:59.920 --> 00:24:13.040
+comparison point for Ray, especially because like there was a Pandas on Ray project in 2018. And at that point, I think they were, yeah, it was, it did get brought up more often, but more recently, we don't
+
+00:24:13.040 --> 00:24:24.560
+hear about Coiled as often. I think in particular because we've sort of, you know, focused our, our product efforts a little bit more towards the AI side, whereas Coiled, I think is more like a
+
+00:24:24.560 --> 00:24:33.520
+scientific computing slash like general, you know, scale up Pandas, scale up NumPy sort of approach. So we diverged and we don't see each other that often.
+
+00:24:33.520 --> 00:24:47.200
+Yeah, from the last time I spoke with Matthew Rocklin, not too long ago, it looked like they were really focused on kind of creating and configuring and managing the infrastructure that allows for grid computing with
+
+00:24:47.200 --> 00:25:00.400
+data science type of stuff. A lot of like managing AWS and scaling them and, and so on. And more than the original Dask story, I think. All right. So, well, that brings us to what is Ray? I mean,
+
+00:25:00.400 --> 00:25:06.640
+we talked a little bit about it, but like, just give us the, like, what would you tell people if you met them at a conference or something?
+
+00:25:06.640 --> 00:25:07.840
+You want to take this, Richard? You want me to?
+
+00:25:07.840 --> 00:25:09.200
+Yeah. I mean, I can start.
+
+00:25:09.200 --> 00:25:12.880
+We've both given that conference talk many times, by the way, so we should be good at this.
+
+00:25:12.880 --> 00:25:14.320
+Here's a rehearsal. 
+
+00:25:14.320 --> 00:25:27.280
+So Ray is, by the way I would probably put it as like, it's a, it's a distributed execution engine for AI workloads. And in particular, it handles a lot of the orchestration aspects of the AI workloads and
+
+00:25:27.280 --> 00:25:40.720
+also has a variety of first party and third party libraries are built on top of it to help scale these AI workloads that we, we often see. So two popular, very, very popular applications of Ray today is that,
+
+00:25:41.280 --> 00:25:53.040
+is reinforcement learning and then multimodal data processing. Both of them are very, very relevant in today's AI world, but reinforcement learning libraries, a lot of the third party ones, they will use Ray for
+
+00:25:53.600 --> 00:26:07.920
+coordinating the different components that you need to do reinforcement learning with. There's like an inference engine that's involved. There's a training engine that's involved. And there's also like agents and sandboxes that are involved. So all three things, all, all these things need to be
+
+00:26:07.920 --> 00:26:18.000
+coordinated by one central orchestration system. And it's way easier to write this in Ray because Ray gives you that, that ability to control all these components as if you're writing single-threaded
+
+00:26:18.000 --> 00:26:29.120
+code. Multimodal data processing is the other big one where existing data processing libraries will focus on the ability to handle tabular data and work with Parquet, Iceberg, Delta, so on and so forth.
+
+00:26:29.120 --> 00:26:36.240
+Whereas like Ray finds its niche more in the, like the intersection between the data and the GPU.
+
+00:26:36.240 --> 00:26:48.960
+And so typically you're working with like larger unstructured data, for example, like images or embeddings. And oftentimes that requires like more complex scheduling and more complex orchestration
+
+00:26:48.960 --> 00:27:03.360
+that Ray is really good at. 
Given the origins, it certainly makes sense that you've got this focus on really nailing ML training and other types of workloads. Is it relevant to people who are just doing, I don't know, time series work or? We were going to talk about this at some point, but the,
+
+00:27:03.360 --> 00:27:14.080
+we kind of organize Ray in terms of like layers in a way. So that we call like the base, like Python API, which is quite simple. It's really just like, you know, for like people very familiar with Python,
+
+00:27:14.080 --> 00:27:25.520
+you could think of it as like multiprocessing for a cluster. So that we call kind of Ray Core, is like that base, like distributed execution engine, sort of like core primitives for scaling up,
+
+00:27:25.520 --> 00:27:36.640
+distributing work and handling failures and like just overall kind of parallelism. And then on top of it, we have like a lot of library integrations, like that's what the Ray libraries are,
+
+00:27:36.640 --> 00:27:48.720
+like Ray Train and Serve. And then some of these post-training libraries. So that core layer is like absolutely relevant for non kind of AI workloads. And we do have many, many users that use it for
+
+00:27:48.720 --> 00:27:59.760
+things like in the finance world, they use it for parallel back testing or time series analysis, like you mentioned. Yeah. And any kind of like generic, just like parallel workload that you
+
+00:27:59.760 --> 00:28:09.360
+need to scale beyond the single machine. Now I'm thinking of it in finance and real-time trading type stuff. You could be running a whole bunch of scenarios in reverse. And then there are many of the
+
+00:28:09.920 --> 00:28:20.160
+largest hedge funds do exactly that using Ray. From my understanding, we could use Ray even on one machine. And it has some capabilities to help you sort of take better advantage of all your hardware. 
+ +00:28:20.160 --> 00:28:31.040 +Like even my little streaming Mac mini has 10 CPUs and I just write regular Python code, I get like 16% or something or 10% of that. Right? Yeah. You certainly can use a Ray on one node. + +00:28:31.040 --> 00:28:43.040 +I think actually the kind of most compelling part of that is you can do it for development. So you can like, if you're working on this kind of large scale post-training thing, if it's useful to kind + +00:28:43.040 --> 00:28:53.680 +of think about what you'd have to do without Ray. So you would have like four different containers, each one would have its own like Python entry point, and you'd have to kind of like run and + +00:28:53.680 --> 00:29:07.040 +orchestrate them as like these independent services. So eventually maybe you'd like deploy them on Kubernetes or something like that. But even when testing locally, it's like, if you want to run all of them and like, make sure that kind of the integration points work well, and like quickly be + +00:29:07.040 --> 00:29:18.240 +able to like iterate and debug stuff. It's really painful if those are all kind of like loosely coupled as different processes. And especially if the way that you start them on your local machine is going to + +00:29:18.240 --> 00:29:32.400 +be very different than when you actually go to like scale it up in a cluster. Even if you just make a change, like, okay, now I got to go restart all the workers and so on. Right? I think a lot of people can relate to that pain. And with Ray, the thing that's really cool is you can, you can write kind + +00:29:32.400 --> 00:29:37.440 +of one Python script that like starts all those different processes and does the orchestration. + +00:29:37.440 --> 00:29:46.880 +You can run it just on your like local Mac or whatever local machine you have. And then once you kind of like have it working, then you can run it on a cluster and like scale it up using like + +00:29:46.880 --> 00:29:53.920 +the same code. 
Does it come with cluster management in terms of like infrastructure as code type of stuff?
+
+00:29:53.920 --> 00:30:03.680
+Will it spin up nodes and so on? Or do you have to have your cluster set up and then just it knows about it? You know what I mean? The answer is kind of both depending on your use case.
+
+00:30:03.680 --> 00:30:09.760
+So I'd categorize it as like there are maybe three or four ways that people run Ray clusters.
+
+00:30:09.760 --> 00:30:19.920
+So the first is using a tool that we call like the cluster launcher. So this is kind of like if you're an individual practitioner and you just want something like really low friction,
+
+00:30:19.920 --> 00:30:31.280
+we have a tool that will basically like spin up a Ray cluster on like AWS or GCP or Azure, or even on your own set of hardware, like you can kind of like bring your own set of machines.
+
+00:30:31.280 --> 00:30:36.160
+But that's not really like a fully managed experience. You can also run Ray on Kubernetes.
+
+00:30:36.160 --> 00:30:46.560
+So there's a community led project called KubeRay, which is a pretty tightly integrated like Kubernetes operator that makes it really easy to like run Ray clusters on Kubernetes.
+
+00:30:46.560 --> 00:31:00.400
+Or you can use like a more managed service like Anyscale, obviously where Richard and I work, we have like managed infrastructure for Ray clusters. But there are also, I think there are some other providers you can run Ray clusters to like AWS has an offering or
+
+00:31:00.400 --> 00:31:03.520
+Domino Data Labs has an offering. And I think there are a few more as well.
+
+00:31:03.520 --> 00:31:17.440
+You know, it makes a lot of sense that you guys have this sort of let us run the infrastructure side. We'll talk more about that later. 
With KubeRay though, do you just say like, as long as you have a Kubernetes cluster, you can just let it kind of create pods and scale up or down + +00:31:17.440 --> 00:31:19.040 +as demand is needed there, something like that. + +00:31:19.040 --> 00:31:24.720 +When you install KubeRay into your cluster, it will basically run the KubeRay controller as like a background pod. + +00:31:24.720 --> 00:31:36.800 +It's called like an operator in Kubernetes lingo. And then at that point, you now have these like custom resources. So you can like create a Ray cluster or a Ray job as like a custom resource. + +00:31:36.800 --> 00:31:43.840 +And then it will get spun up as a bunch of pods and they will connect to each other and get health checked. And all of that infrastructure management is done. + +00:31:43.840 --> 00:31:49.520 +KubeRay is pretty, pretty active. 2.5 thousand GitHub stars commits 17 hours ago. Nice. + +00:31:49.520 --> 00:32:04.080 +There's a huge community kind of initiative behind KubeRay and like we're involved with it too, but it really kind of is like kind of taken a life of its own. And it's really useful too, because like even on Kubernetes, everyone's environment is a little bit different. So having + +00:32:04.080 --> 00:32:12.480 +maintainers and committers from like many different companies and people who are running in like different environments makes it easier to sort of cover all the bases. + +00:32:12.480 --> 00:32:19.920 +For sure. Yeah. That diversity of use cases and stuff is always nice to create a better, better API, better library, and so on. + +00:32:22.320 --> 00:32:31.920 +This portion of Talk Python To Me is brought to you by Agentfield. What happens when you give hundreds of AI agents a shared code base and let them write code, review each other's work, + +00:32:31.920 --> 00:32:44.160 +and ship to production? Well, that's exactly what the team behind Agentfield AI built. 
And the wild part, it's not some proprietary system locked behind a paywall. It's an open source Python library.
+
+00:32:44.160 --> 00:32:57.040
+Now, where most agent frameworks have you wiring up DAGs and workflows, Agentfield lets you build AI agents the way you'd build FastAPI microservices. Think typed Python functions that become autonomous
+
+00:32:57.040 --> 00:33:08.320
+services. They discover each other at runtime, call each other like APIs, scale independently, fail independently, and recover on their own. And here's the thing. You're not just orchestrating
+
+00:33:08.320 --> 00:33:21.200
+LLM calls. You can orchestrate entire autonomous tools, spin up multiple Claude Code instances, Codex sessions, any coding harness you want, all running as live nodes on the same architecture,
+
+00:33:21.200 --> 00:33:32.400
+collaborating and verifying each other's output. That's how they build the factory. And it's completely free and open source. Check it out at talkpython.fm/agentfield. That's talkpython.fm
+
+00:33:32.400 --> 00:33:43.440
+slash agentfield. The link is in your podcast player show notes. Thank you to Agentfield for supporting the show. Let's talk through an example. You have a bunch of examples. So you have examples,
+
+00:33:43.440 --> 00:33:47.920
+and then you've got, is that also the gallery? Are these the same thing? I think those are the same.
+
+00:33:47.920 --> 00:33:52.000
+There's a ton here. This is kind of like all of them, and the others are like the highlighted ones.
+
+00:33:52.000 --> 00:34:04.640
+Some highlighted ones. Sure. Got it. So I think it would be nice to talk through the experience of doing a project in Ray, keeping in mind that it's always hard to talk about code over audio,
+
+00:34:05.360 --> 00:34:17.920
+but you know, let's maybe, maybe we could just like sort of skim over whoever wants to sort of narrate this experience of like going through one of the examples, you have an audio batch inference type of scenarios. 
Maybe we could talk.
+
+00:34:17.920 --> 00:34:20.640
+Can you scroll down so that I know where I'm going to end up?
+
+00:34:20.640 --> 00:34:34.080
+Yeah. Do some Whisper stuff, do some GPU stuff, some LLM stuff, persist a curated subset, that sort of thing. Cool. Yeah. I kind of get the sense. So Ray is basically very similar to writing a
+
+00:34:34.080 --> 00:34:45.920
+standard Python script. So like ideally the way you sort of think about things in or in the way you read the code, it should be very similar to, should be minimally intrusive and should be very familiar
+
+00:34:45.920 --> 00:34:57.680
+with how you're, how you might sort of reason about, about like, you know, serial code or like single thread code. And so like, obviously the, the, a lot of the things that we do here don't demonstrate,
+
+00:34:58.320 --> 00:35:08.960
+or like demonstrate how you might sort of set up a project by yourself. So including like standard pip installations, you can use uv if you want and then like standard imports. Right. And then moving down,
+
+00:35:08.960 --> 00:35:23.040
+we started to enter like using Ray data, which is the data processing multimodal data system that we have. It's a library on top of Ray and it provides a lot of simple abstractions to do all sorts of like
+
+00:35:23.040 --> 00:35:28.960
+big data tasks. So like here you have example, which is simply just like reading the dataset and then like subsampling it.
+
+00:35:28.960 --> 00:35:39.760
+So let me ask you a question about this. So you basically say ray.data.read_parquet and you give it an S3 link to a parquet file, presumably either signed or public. When I say that, does that
+
+00:35:39.760 --> 00:35:44.320
+load it into one machine or does that instruct all of the workers all to go and load this?
+
+00:35:44.320 --> 00:35:56.320
+It actually doesn't load anything, but if you do end up executing it, right? So it's lazy. 
So, so right now what you're doing is you're just actually just like constructing this, this program.
+
+00:35:56.320 --> 00:36:02.080
+But when you do execute it, it will execute on all the processes or like, you know, across like the entire cluster.
+
+00:36:02.080 --> 00:36:08.960
+In this scenario, it doesn't necessarily need to have one of them populate the data for all the others. They can all go straight to S3 and get it.
+
+00:36:08.960 --> 00:36:15.760
+And particularly in this example, this has, it probably points to a folder and the folder has many different files.
+
+00:36:15.760 --> 00:36:18.720
+Ah, so maybe it breaks. Yeah. Yeah. Maybe it breaks it up.
+
+00:36:18.720 --> 00:36:25.840
+We have a thing where every single line of the parquet file, every single row has some set of bytes.
+
+00:36:25.840 --> 00:36:36.400
+And what we want to do is transform those bytes into a, you know, something that's more manageable, like a numpy array. So that's kind of what we're doing here. We're loading the data
+
+00:36:36.400 --> 00:36:47.440
+with torchaudio, and then we're doing some resampling and then, and then we're sort of like a returning that back to ray data. So that this is like a single map step, with like a single function.
+
+00:36:47.440 --> 00:36:52.640
+So you write a function that does this, what you just described. It passes in an item.
+
+00:36:52.640 --> 00:37:05.600
+It's a row basically. Yeah. So I think it's like a row in the parquet file. And then you just say, go to your data that you, you know, you loaded with Ray and you say map, give it the function, not call the function, right? Just give it the pointer to the function.
+
+00:37:05.600 --> 00:37:06.080
+That's right.
+
+00:37:06.080 --> 00:37:10.320
+And it figures out like, okay, here's how we'll distribute it across the cluster.
+
+00:37:10.320 --> 00:37:16.240
+This map, this resample function will be executed on like hundreds of processes across the cluster. 
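The lazy, sharded execution being described can be mimicked in a few lines of plain Python. This is a toy single-process model of the idea, not Ray Data's API or implementation: building the pipeline only records steps, and a consuming call runs them shard by shard, where Ray would instead fan the shards out across worker processes in the cluster.

```python
# Toy model of a lazy dataset: `map` records work, `take_all` executes it.
class LazyDataset:
    def __init__(self, shards):
        self.shards = shards      # e.g. one shard per file in the S3 folder
        self.steps = []

    def map(self, fn):
        self.steps.append(fn)     # record the step; nothing runs yet
        return self

    def take_all(self):
        out = []
        # Ray would schedule each shard onto a different worker; here we
        # just loop, applying every recorded step to every row.
        for shard in self.shards:
            for row in shard:
                for fn in self.steps:
                    row = fn(row)
                out.append(row)
        return out

ds = LazyDataset([[{"x": 1}, {"x": 2}], [{"x": 3}]])
ds = ds.map(lambda row: {"x": row["x"] * 10})  # stand-in for the resample step
rows = ds.take_all()                            # only now does work happen
```

The key property matches what the guests describe: constructing the pipeline is cheap and loads nothing; execution is deferred until results are actually consumed.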
+
+00:37:16.240 --> 00:37:23.120
+And maybe it'll do something smart, like say I'm on row 1000. So it could do a skip, maybe, or something like that, potentially.
+
+00:37:23.120 --> 00:37:25.520
+All the data is already like sharded.
+
+00:37:25.520 --> 00:37:25.920
+Got it.
+
+00:37:25.920 --> 00:37:31.280
+So it will take the, whatever is available, and then it will just like run the function.
+
+00:37:31.280 --> 00:37:44.160
+That's pretty cool. And then you've got your Whisper processor. Definitely have written some Whisper processing code lately. This uses a class, not a function. And the reason for this is that,
+
+00:37:44.160 --> 00:37:47.360
+as you might have experienced, like loading Whisper might take a little bit of time.
+
+00:37:47.360 --> 00:37:47.600
+Yes.
+
+00:37:47.600 --> 00:37:58.800
+If you scroll to the right on this. Okay. So here we don't use it, but like, you can also move the Whisper model onto a GPU. And the way you would do that is you set on the bottom, and you just use like, you know, number GPUs equals one.
+
+00:37:58.800 --> 00:38:02.800
+Right here, it says device equals CPU, but yeah, but you could put GPU here, huh?
+
+00:38:02.800 --> 00:38:07.440
+You could. And also in map_batches, you would put the map like GPU, whatever.
+
+00:38:07.440 --> 00:38:07.760
+Yeah.
+
+00:38:07.760 --> 00:38:19.920
+What's happening is that as you are doing the execution, what we will do is we will spawn a bunch of these classes across on different processes on the cluster. And so they'll be
+
+00:38:19.920 --> 00:38:31.360
+able to like preload the model, and then you can send data to this class, and then it will call the double under call. And then you have this basically like operator that streaming data in and out.
+
+00:38:31.360 --> 00:38:41.200
+I have something very embarrassing to admit, which is these double underscore methods. 
I always knew they were called dunder methods, but I didn't know that it's because it's like double underscore. + +00:38:41.200 --> 00:38:49.120 +I just put that together when Richard said double under. I've been using Python for like, you know, well over a decade and I never put that together. + +00:38:49.120 --> 00:39:02.880 +You know, what's really interesting, because I have to talk about so much of the stuff that is written and yeah, I've certainly gone through stages where like, I'll get a message, Michael, not like that. They say it like this. Like really, but how are we supposed to know? There are so many + +00:39:02.880 --> 00:39:17.200 +projects. I mean, dunder doesn't necessarily fall under this, but there's a lot of open source projects that could be pronounced so differently, so many ways. And I've seen a few that will have an MP3 file or an audio file that says, this is how it's pronounced. Press play. You know what I mean? + +00:39:17.200 --> 00:39:28.080 +Yeah. I'm right there with you. Amazing. One thing I wanted to cover with that. So that num GPUs thing is like really powerful. This is kind of like one of the core like powers of Ray. So this means that + +00:39:28.080 --> 00:39:38.480 +like, you know, if you think about this pipeline, right, we had first, we're kind of like chunking up the data and reading it across a bunch of processes in the cluster. So that's like a like IO bound + +00:39:38.480 --> 00:39:48.640 +operation. And then we had some kind of like pre-processing logic where we were like transforming those audio files, which is like a CPU bound operation. And then now we're doing this like + +00:39:48.640 --> 00:39:59.680 +GPU step, which here it's like this whisper preprocessor, or it could be any kind of like ML model inference or anything that runs on a GPU. So you have these like kind of very different + +00:39:59.680 --> 00:40:10.880 +like compute profiles, like the IO bound, the CPU bound, the GPU bound. 
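The "double under call" pattern mentioned here, in isolation: `__call__` is the dunder that makes an instance behave like a function, so expensive setup can happen once in `__init__` while the per-batch work lives in `__call__`. The model name and file names below are placeholders for illustration, not from the episode's example.

```python
# A class, not a function: setup runs once, then the instance is called
# repeatedly -- the shape a framework like Ray Data expects for map steps.
class Transcriber:
    def __init__(self):
        # Stand-in for an expensive one-time model load (e.g. Whisper).
        self.model_name = "whisper-small"

    def __call__(self, batch):
        # Stand-in for inference over one batch of rows.
        return [f"{self.model_name}:{item}" for item in batch]

t = Transcriber()            # __init__: the slow part, done once
out = t(["a.wav", "b.wav"])  # calling the instance invokes __call__
```

With a plain function, the "model load" would have to happen on every call (or lean on globals); the class keeps the loaded state alive between batches.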
And Ray, like the thing that makes it so powerful is that you can express this in like one program. And then you can also like
+
+00:40:10.880 --> 00:40:25.280
+efficiently use all of those resources. Okay. So maybe I've got five GPUs, but I've got a whole bunch of cores on each machine. Would it maybe make different choices about how it scales, given the different resources, like thinking about GPUs or versus CPUs?
+
+00:40:25.280 --> 00:40:37.680
+Yeah, that's exactly right. So you would, you know, maybe you need like four CPUs per GPU to like keep the GPU busy. So Ray data will, will basically do that kind of auto scaling itself in order to like
+
+00:40:37.680 --> 00:40:42.960
+keep the GPU as busy as possible. And this Ray data, it says Ray, a raw DS.
+
+00:40:42.960 --> 00:40:57.360
+That's a data set. Yeah. Data set. Does this have any analogies or sort of similar APIs to like Dask or not Dask, to Polars, Polars or Pandas or any of these other, does it try to pretend to be one
+
+00:40:57.360 --> 00:41:10.560
+of these other things or is it just its own library? So the way you would do like a data frame library, I think would heavily index on the interactive experience. And that's not something that we
+
+00:41:10.560 --> 00:41:24.000
+focus so heavily on. In fact, like there's oftentimes where like, and also the other thing is like all those libraries, they will like Dask and Polars and Pandas and so on. Like they will focus a lot on
+
+00:41:24.000 --> 00:41:31.280
+tabular data. And I think that's like, that's important, but it's not like our strong suit.
+
+00:41:31.280 --> 00:41:43.600
+Like our, the thing I think we would want to be 10x better is, is being able to do this sort of like heterogeneous compute and being able to orchestrate like very complex pipelines very simply. Whereas,
+
+00:41:43.600 --> 00:41:50.160
+and then like come back and sort of improve and make the tabular support like just on par and usable. 
+
+00:41:50.160 --> 00:41:59.040
+I think that makes a lot of sense. It absolutely does. I guess maybe the last little bit, we have to go through this whole example, but maybe the persist story is a little bit interesting.
+
+00:41:59.040 --> 00:42:08.400
+The, if you go up one more, like the, to the tab before, I think actually, this is also very interesting where we're actually using the LLM based quality filter. Okay.
+
+00:42:08.400 --> 00:42:16.000
+We're using vLLM as part of the pipeline. So vLLM is like an optimized inference engine for LLM models.
+
+00:42:16.400 --> 00:42:28.560
+And what you can do with Ray Data is you can actually just say like, Hey, I just want to shove vLLM into one of the stages. And I want to, you can even do like more complex parallelism and you can see like, Hey, this model is like a trillion parameters.
+
+00:42:28.560 --> 00:42:39.280
+And I just want to like put it somewhere inside. And that's something that you can very easily do with Ray Data. Is this an open weights, local running model or is, is that something like an API call to
+
+00:42:39.280 --> 00:42:50.800
+this? I mean, you can do here in this example, it is an open weights model. So you would be able to self host and you can, there's also APIs to do like Anthropic calls. Yeah. That is an interesting idea to
+
+00:42:50.800 --> 00:43:01.920
+put that in the middle there. And finally, like, yeah, writing out, you can write out to any source storage of like S3, NFS, so on and so forth. It's useful for like the data transformation tasks.
+
+00:43:01.920 --> 00:43:10.560
+This again, well, it's not like you're pulling all the data to one process and then writing it's like a distributed kind of partitioned, right? To the same file or to a set of files?
+
+00:43:10.560 --> 00:43:18.880
+To a set of files. Yeah. That makes sense. That seems a lot easier to coordinate like they just have. Yeah. Otherwise you'll have problems. Yeah, exactly. A bit of a race condition or something. 
+ +00:43:19.240 --> 00:43:30.640 +Okay. This is super neat. I think this is a cool way to start writing the code, but then you've got to, you know, visualize it, right? See what's going on. So you have a dashboard, which is pretty cool. + +00:43:30.640 --> 00:43:42.960 +I'll scroll down and try to find some pictures of the dashboard. There's some, there's nice videos here as well, but it gives you, tell us about the dashboard. It gives you a lot of views into what's happening. The first thing I'd say is like, you know, the mission of Ray is sort of like make + +00:43:42.960 --> 00:43:55.680 +distributed computing easy. And I think anyone who's ever written like a multi-node, like application of any kind knows that like observability and debugging are like one of the core problems + +00:43:55.680 --> 00:43:59.680 +anytime that you're scaling out. So yeah, we invest a lot in this like observability tooling. + +00:43:59.680 --> 00:44:10.260 +So the Ray dashboard, it kind of mirrors the rest of Ray where we have sort of this like core, like parallel computing, like primitive part. So the Ray dashboard, you know, you can get like a + +00:44:10.260 --> 00:44:21.460 +cluster level view where you see like a summary of each node and like the resource consumption, like, you know, is it fully utilizing the CPUs and GPUs? What is running on that node? Like that + +00:44:21.460 --> 00:44:36.320 +kind of physical layout. But then we also have like more logical views. So what's shown on the screen now is this like task and actor breakdown. So you can see, you know, if you've submitted a thousand of a, like a read task, if you think about how that Ray data pipeline works, you're like + +00:44:36.320 --> 00:44:46.580 +submitting a bunch of tasks that are reading the data, you can see how many of those are running, how many have completed, if they failed, you can get like a summary of the stack traces. 
And then we
+
+00:44:46.580 --> 00:44:59.140
+also have some like higher level views that are specific to the Ray libraries. So you can imagine like this Ray Core layer, it's really like kind of generic. So you have like tasks and actors and
+
+00:44:59.140 --> 00:45:08.680
+nodes, but it doesn't necessarily tell you about like, you know, the high level summary of what's happening in that data pipeline that we were talking about a few minutes ago. So we also have some,
+
+00:45:08.860 --> 00:45:14.840
+some high level visualizations for like serving and training that help you understand what's happening in that.
+
+00:45:14.980 --> 00:45:27.740
+There's a bunch of different libraries that you've talked about. I don't know how much time we really have to go all into them, but you've got Ray Core, which we talked about, and then Ray Data, which we were using to read the data, but train, tune, serve, RL for reinforcement learning.
+
+00:45:28.060 --> 00:45:29.520
+And then even more libraries.
+
+00:45:29.960 --> 00:45:30.240
+Yeah.
+
+00:45:31.820 --> 00:45:33.160
+Expanded out to more libraries.
+
+00:45:33.340 --> 00:45:42.800
+One like high level comment is, I think Richard kind of mentioned this earlier, but like one of the things that we've really invested in a lot is like building this ecosystem around Ray. We want
+
+00:45:42.800 --> 00:45:53.680
+people to feel like Ray is not just a tool for like one workload. It's really something you can like build a platform around. So if you're doing any kind of like a large scale, like machine learning
+
+00:45:53.680 --> 00:46:04.100
+or AI, you know, Ray is, it's like, if you kind of build the infrastructure or like you use managed infrastructure for like the cluster setup and all that stuff. 
And then the people who are actually
+
+00:46:04.100 --> 00:46:14.460
+like writing the applications are like really empowered because they can write just like Python scripts to do all these different types of use cases from like training to tuning to RL
+
+00:46:14.460 --> 00:46:25.760
+to data processing. So yeah, we see, I think it's very common that people who are using Ray are not just using one of these libraries. They're really kind of using a slew of them or maybe even all of them.
+
+00:46:25.920 --> 00:46:40.640
+I do think it empowers people quite a bit. Like write code, kind of like, you know, but call a Ray function instead. And then guess what? It's distributed across a bunch of machines, which is a really hard problem to solve. One of the extra libraries that's cool is the multi-processing pool.
+
+00:46:40.640 --> 00:46:53.960
+I just saw that one. We expanded it. That's kind of cool because if you're already trying to do scale out through multi-processing, just to get advantage, take advantage of the local cores, you could just say, use the ray.util multiprocessing pool and then boom, off it goes. Right.
+
+00:46:54.100 --> 00:46:58.400
+I haven't looked at this in a long time. This is something that I wrote like eight years ago or something.
+
+00:46:59.000 --> 00:46:59.980
+2020 probably.
+
+00:47:00.260 --> 00:47:03.040
+It's kind of one of those, I think that would be very general purpose.
+
+00:47:03.180 --> 00:47:17.200
+It's also, I think a good like conceptual introduction to Ray because, you know, people are familiar with multi-processing and they know that they can like use it to scale out on one node. Well, then Ray is just kind of like the next step if you want to scale out across multiple nodes. 
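The drop-in replacement being discussed here (swap the import, keep the `Pool` API) can be sketched roughly like this. The `square` and `run_pool` names are made up for illustration, and the Ray import is left in a comment since it assumes Ray is installed and a cluster is reachable.

```python
def square(x):
    return x * x

def run_pool(pool_cls, n=8):
    """Map `square` over range(n) with any multiprocessing-style Pool class."""
    with pool_cls(processes=4) as pool:
        return pool.map(square, range(n))

# Single machine, standard library:
#   from multiprocessing import Pool
# Whole Ray cluster, same API (assumed setup):
#   from ray.util.multiprocessing import Pool
# results = run_pool(Pool)
```

Because the API is identical, code already written against `multiprocessing.Pool` scales from local cores to multiple nodes by changing only the import.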
+ +00:47:17.400 --> 00:47:30.880 +One thing that I thought is really cool is also you've got a debugger and a VS Code, presumably open VSX as well, extension that you can install and like look at the cluster, look at the jobs + +00:47:30.880 --> 00:47:35.380 +running. If something crashes, it'll like break and wait for a debugger to attach potentially. + +00:47:35.580 --> 00:47:36.340 +You want to talk about that? + +00:47:36.340 --> 00:47:51.180 +It's kind of like if you could use PDB, but across the cluster. So you can, you can like set a break point, like inside a remote function, that remote function might be running on like a different, a different machine. And then if like an exception is raised or like, there's + +00:47:51.180 --> 00:48:03.000 +just something happening there that like you couldn't debug locally, then you can like attach remotely to that process. And you can, you know, you can get like a backtrace and you can inspect local variables and stuff like that. + +00:48:03.000 --> 00:48:17.340 +It's very useful in the cases where maybe you did like local development and everything was working fine. And then for some reason, when you like deploy to a cluster, something is going wrong. Like maybe there's one piece of data that like is behaving in an unexpected + +00:48:17.340 --> 00:48:25.680 +way. This kind of gives you a way to directly debug that without having to write a ton of print statements and filter through them as I'm sure many people have. + +00:48:25.680 --> 00:48:36.000 +Exactly. You don't, you don't have to like print step one, step two, step 2.1, step 2.2, step 3. Like, cause you had to insert some more like to like break it down. + +00:48:36.080 --> 00:48:40.560 +The step 2.2.3.a has saved me a lot of times in my life though. 
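A rough sketch of the debugging flow described here, with assumed names: a plain function containing a conditional `breakpoint()`. Wrapped as a Ray task (shown only in comments, since it needs a cluster and the Ray debugger), that `breakpoint()` pauses the remote process so the VS Code extension can attach instead of you sprinkling print statements.

```python
def safe_sqrt(x):
    """Square root with a debug hook on unexpected input."""
    if x < 0:
        # On a cluster, hitting breakpoint() inside a Ray task pauses that
        # remote process and waits for the distributed debugger to attach.
        breakpoint()
    return x ** 0.5

# Assumed cluster usage:
#   import ray
#   remote_sqrt = ray.remote(safe_sqrt)
#   print(ray.get(remote_sqrt.remote(16.0)))
```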
+
+00:48:41.060 --> 00:48:51.140
+I mean, it's like basically a bisection algorithm to find the problem, but like the, it's like having to go to and do the line numbers and basic eventually you just need to leave a gap.
+
+00:48:51.140 --> 00:49:04.660
+But it is really nice to use in VS Code because it gives you nearly the same debugger experience as you would get just for like a regular debugger. I saw a YouTube video about this and the question that somebody said, Hey, is there a PyCharm version of this?
+
+00:49:04.980 --> 00:49:07.800
+Is there a PyCharm version of it or just, just the VS Code derivatives?
+
+00:49:08.160 --> 00:49:22.060
+I think it's only VS Code, but Hey, we're always looking for contributors. It's probably not, it's probably not that hard to extend. It's just a, as you can see from the number of libraries over there. The Ray team is quite busy. Let's talk real briefly about the ecosystem.
+
+00:49:22.220 --> 00:49:27.120
+We're getting a little short on time, but what is this ecosystem compared to like all of your tools?
+
+00:49:27.120 --> 00:49:36.680
+So integrations with say like Airflow, Apache Airflow, or even Dask, which is kind of interesting that it integrates with Dask. And so what's the story with this?
+
+00:49:36.880 --> 00:49:42.560
+I think there are two aspects to integration. Actually, I'm reminded, I need to update this page.
+
+00:49:42.560 --> 00:49:55.560
+There's like projects where you want to interoperate with Ray. So they sit side by side or like, it's like a complementary tool. Airflow is an example of that. Like Dask would be like something
+
+00:49:55.560 --> 00:50:05.920
+where you can do a lot more of your data processing on the side and then, and then Ray stuff on the other side. Flyte would be like another, so, you know, workflow or automation, you would like use
+
+00:50:05.920 --> 00:50:18.140
+that with Ray, but not like in Ray or around Ray. 
Whereas like there are other projects that are built on top of Ray. So like Modin that you just saw, Daft, these are libraries that, that leverage Ray
+
+00:50:18.140 --> 00:50:30.880
+and to, to orchestrate and scale. And there's like a separate API and Ray isn't necessarily exposed as the API to the users. So yeah, so I think that's something that is particularly like lively, especially
+
+00:50:30.880 --> 00:50:42.800
+now in the reinforcement learning and multimodal data processing space. Frankly, I'm looking through this, like a lot of these projects have sort of like gone, gone, like have sort of evolved or like,
+
+00:50:42.860 --> 00:50:52.540
+have like lost their community. And I think there's a, actually a massive Ray ecosystem that isn't represented on this, this screen here that is like actively building on top of Ray.
+
+00:50:52.660 --> 00:50:54.580
+All right. Well, just give you some homework. There you go.
+
+00:50:54.740 --> 00:51:08.760
+Yeah. Richard kind of mentioned this, but the way I think about it is like, kind of like things above Ray and things below Ray. So like above Ray is like the, like higher level libraries, like the reinforcement learning library, data processing library. And then below Ray is like
+
+00:51:08.760 --> 00:51:19.220
+integrating Ray into like the different infrastructure. So like with Airflow and the KubeRay, and basically like allowing you to run Ray on top of like any type of like hardware cluster
+
+00:51:19.220 --> 00:51:33.880
+management solution. So we really like try to view Ray as this kind of like, like if people, I don't know if I'm dating myself, but you know, in the internet model, there's like the narrow waist, right? Which is like TCP/IP. So we view Ray as kind of like the narrow waist of the like AI,
+
+00:51:33.880 --> 00:51:35.480
+like distributed computing ecosystem.
+
+00:51:35.760 --> 00:51:40.160
+One more thing. 
I think we're, we've got time to talk just a little bit about the business model.
+
+00:51:40.360 --> 00:51:53.880
+So over on Ray.io, I can see that I can go to like GitHub or go to the docs, but also you've got AnyScale, which lets you basically is the infrastructure behind running Ray, right? Is that
+
+00:51:53.880 --> 00:51:56.580
+this sort of the business side of Ray?
+
+00:51:56.720 --> 00:52:08.120
+AnyScale is a company, but also a product. So for example, like Ray is like a software library that you can run, but there is a lot of, if you're sort of deploying Ray for, like, an internal
+
+00:52:08.120 --> 00:52:18.340
+platform for a company, like there's still a lot of other bells and whistles that you'll, you'll sort of want. So for example, like being able to have a fast interactive development,
+
+00:52:18.740 --> 00:52:31.640
+being able to optimize, like the time it takes for the workloads to start up, having great observability and debuggability and being able to sort of like share resources across different teams within,
+
+00:52:31.980 --> 00:52:36.540
+within like across different Ray jobs. And, and then also being able to optimize your Ray workloads.
+
+00:52:37.220 --> 00:52:48.500
+So these are all like features and capabilities that you'd get with AnyScale. And, and yeah, and then also like, you know, support, being able to sort of deploy and manage and upstream fixes to
+
+00:52:48.500 --> 00:52:56.200
+Ray that sort of help your, your enterprise, like, your company, achieve its goals and needs for your machine learning platform. That's like a lot of stuff that we do.
+
+00:52:56.200 --> 00:53:10.380
+You know, I think this is one of the core ways that people are making open source stuff, their business, right? Like we built you a great library, but there's this whole operational side of it that you maybe either don't want to do, or you don't have a bunch of servers or whatever. 
+ +00:53:10.480 --> 00:53:12.500 +And we'll just, for a price, we'll just take care of that. Right. + +00:53:12.580 --> 00:53:27.240 +There's like a couple of ways that you can go. Like, so one thing I want to, I want to say is that having a company, like a successful company behind Ray is like critical for its health. Like, there's no way that we could have, that we could have built like as many of the + +00:53:27.240 --> 00:53:40.620 +libraries and like funded as many of like the ecosystem integrations. And like, I mean, just built something with as big of a scope as Ray, if we didn't have like a company backing it, like paying as many people to work on it as were. And yeah, I think there's like a few different + +00:53:40.620 --> 00:53:54.440 +ways that you can go about this, like kind of open source monetization thing. Like AnyScale model is, is largely this, yeah, like managed infrastructure and like the hard parts around it. You know, there's some people that also kind of go for the more like support expertise model. I think that + +00:53:54.440 --> 00:54:06.800 +could work if, you know, if you really want to like stay small, like if you have a smaller open source project, it's just a couple of people. And like, you know, you're trying to make enough money to survive and keep working on that project. Then honestly, I think that's the easier route + +00:54:06.800 --> 00:54:10.560 +than trying to build a whole managed product because it's, it's not easy. + +00:54:10.560 --> 00:54:23.500 +It's kind of a, kind of just a consulting story. This, this other side you're talking about is like, I will be your X open source project, X consultant. And guess what? I created it. So I'm, who else is going to be better? You know what I mean? + +00:54:23.580 --> 00:54:37.180 +That's very real. 
Like if I would recommend like a lot of open source people, like consider that, even if it's just the, like the start of something is like, that's the way that you really like engage with people and understand their problems and like understand where the business value is. + +00:54:37.180 --> 00:54:40.900 +A hundred percent. Let me ask you one more tech oriented question before we call it. + +00:54:41.140 --> 00:54:51.800 +What about deployment? I have 10 servers in my cluster. I changed one line in my code and I want to try it now. Now what? How hard is it to get it to update everywhere? + +00:54:51.980 --> 00:55:02.420 +So that is something that we, that I personally spent a lot of time working on. I think that Ray actually has a very good story for it. So there's like, there's kind of a tiered approach. So it sort of + +00:55:02.420 --> 00:55:16.440 +depends. Like obviously if you're changing, like if you need a different, like, like CUDA version or something, then that will require you to basically like redeploy the cluster. But that's something that happens like pretty seldom. Like, you know, maybe you do that every couple of months, + +00:55:16.640 --> 00:55:30.540 +something like that. If you're just changing, like, you know, in Ray, you have this like driver script, which is the main like orchestration code. So if you're just changing that, and that's like kind of what you're iterating on, like more frequently, then you can just change like that code + +00:55:30.540 --> 00:55:40.880 +inline. And then when you submit the job or like connect to the cluster, Ray has this thing called runtime environment, which includes basically auto packaging your local code. So what it does is it + +00:55:40.880 --> 00:55:46.380 +actually just like zips up the local files, uploads them to like a coordinator process in the cluster. 
+
+00:55:46.860 --> 00:56:00.460
+And then when you go to actually run the tasks and actors that require that code, they have like kind of a, an internal ID that points to it, and they'll pull it down. So that means that you can, like, if you're just editing your script and rerunning, it's a matter of like less than
+
+00:56:00.460 --> 00:56:04.960
+one second to update. Oh, that's nice. Yeah. Yeah. That's a huge productivity gain.
+
+00:56:05.100 --> 00:56:09.420
+Yeah. I was thinking this must, the more you scale out, the harder it's going to be as well. Right?
+
+00:56:09.420 --> 00:56:23.540
+Yeah. And if you need to wait for a hundred nodes to pull a Docker image, every time you change one line of code, you're going to have a bad time. That makes me think of one more real quick thing is, so I have a job that's running. Maybe it takes 10 minutes. I make a change three minutes after
+
+00:56:23.540 --> 00:56:28.640
+submitting it, a new version gets deployed. What's the story with versioning running workflows?
+
+00:56:28.640 --> 00:56:39.380
+That's something where, that we kind of like leave to the outside of Ray layer. So a lot of people have different ways to do that. Like if you're running on Kubernetes, like maybe you're like checking in
+
+00:56:39.380 --> 00:56:49.160
+your CRD into your like repo, or maybe you're using something like Apache Airflow. So we kind of leave that to like the orchestration layer. Like inside of AnyScale, we have a concept of like an AnyScale
+
+00:56:49.160 --> 00:56:59.120
+job, which is sort of the code artifact and like the cluster configuration and your, like, infrastructure configuration. So that's like inside of AnyScale, that's kind of like the unit
+
+00:56:59.120 --> 00:57:04.460
+of like reproducibility or versioning. And yeah, folks basically build that kind of on top of Ray.
+
+00:57:04.520 --> 00:57:13.140
+Well, very cool project, Richard and Edward. Thank you both for being here. 
How about a final call to action? People are interested. They want to get started with Ray. What do you tell them?
+
+00:57:13.220 --> 00:57:14.940
+Go to the Ray website and try it out.
+
+00:57:15.040 --> 00:57:17.000
+Check out the documentation. We've got a whole lot of examples.
+
+00:57:17.660 --> 00:57:17.920
+Awesome.
+
+00:57:18.100 --> 00:57:27.540
+Yeah. I would say any kind of machine learning workload or, or just general, like parallel Python, like just give it a spin. Amazing. Well, thanks for being here and talk to y'all later. Thank you.
+
+00:57:27.600 --> 00:57:27.940
+Thank you.
+
+00:57:29.200 --> 00:57:43.300
+This has been another episode of Talk Python To Me. Thank you to our sponsors. Be sure to check out what they're offering. It really helps support the show. This episode is sponsored by Sentry's Seer. If you're tired of debugging in the dark, give Seer a try. There are plenty of AI tools that
+
+00:57:43.300 --> 00:57:54.520
+help you write code, but Sentry's Seer is built to help you fix it when it breaks. Visit talkpython.fm/sentry and use the code talkpython26, all one word, no spaces, for $100
+
+00:57:54.520 --> 00:58:05.880
+in Sentry credits. What if your AI agents worked like FastAPI microservices, typed, autonomous, and discovering each other at runtime? That's the world Agent Field is building. Join them
+
+00:58:05.880 --> 00:58:18.500
+at talkpython.fm/agentfield. If you or your team needs to learn Python, we have over 270 hours of beginner and advanced courses on topics ranging from complete beginners to async code,
+
+00:58:18.640 --> 00:58:29.680
+Flask, Django, HTMX, and even LLMs. Best of all, there's no subscription in sight. Browse the catalog at talkpython.fm. And if you're not already subscribed to the show on your favorite
+
+00:58:29.680 --> 00:58:38.380
+podcast player, what are you waiting for? Just search for Python in your podcast player. We should be right at the top. 
If you enjoy that geeky rap song, you can download the full track. + +00:58:38.380 --> 00:58:42.520 +The link is actually in your podcast below or share notes. This is your host, Michael Kennedy. + +00:58:42.720 --> 00:58:46.160 +Thank you so much for listening. I really appreciate it. I'll see you next time. + +00:58:46.160 --> 00:58:48.160 +Bye. + +00:59:16.160 --> 00:59:18.160 +Bye. From a4d69350a99791f64e4caf2c235270cbd183087b Mon Sep 17 00:00:00 2001 From: Michael Kennedy Date: Wed, 6 May 2026 13:32:28 -0700 Subject: [PATCH 13/16] transcript fixes --- transcripts/205-beginners-panel.txt | 2 +- transcripts/205-beginners-panel.vtt | 2 +- transcripts/350-steering-council.txt | 2 +- transcripts/350-steering-council.vtt | 2 +- transcripts/537-datastar.txt | 2 +- transcripts/537-datastar.vtt | 2 +- .../parallel-python-at-any-scale-with-ray-transcript.vtt | 4 ++-- youtube_transcripts/488-lancedb.vtt | 2 +- youtube_transcripts/517-agentic-ai-youtube.vtt | 2 +- .../537-datastar-modern-web-dev-simplified-youtube.vtt | 2 +- 10 files changed, 11 insertions(+), 11 deletions(-) diff --git a/transcripts/205-beginners-panel.txt b/transcripts/205-beginners-panel.txt index d3596e0..8bd8c69 100644 --- a/transcripts/205-beginners-panel.txt +++ b/transcripts/205-beginners-panel.txt @@ -620,7 +620,7 @@ 00:22:25 I still have to take back and remind myself there could be a better way to structure my code in an object-oriented way. -00:22:32 This happens a lot when I start chaining functions together in an attempt to utilize drive principles. +00:22:32 This happens a lot when I start chaining functions together in an attempt to utilize DRY principles. 
00:22:38 Nowadays, after I find myself overloading my overloaded functions with more overloaded functions, diff --git a/transcripts/205-beginners-panel.vtt b/transcripts/205-beginners-panel.vtt index 0e2c965..34b057e 100644 --- a/transcripts/205-beginners-panel.vtt +++ b/transcripts/205-beginners-panel.vtt @@ -958,7 +958,7 @@ Due to the fact the previous work I was doing was almost exclusively surrounding I still have to take back and remind myself there could be a better way to structure my code in an object-oriented way. 00:22:32.220 --> 00:22:38.360 -This happens a lot when I start chaining functions together in an attempt to utilize drive principles. +This happens a lot when I start chaining functions together in an attempt to utilize DRY principles. 00:22:38.360 --> 00:22:45.380 Nowadays, after I find myself overloading my overloaded functions with more overloaded functions, diff --git a/transcripts/350-steering-council.txt b/transcripts/350-steering-council.txt index 4f95093..d7dafb2 100644 --- a/transcripts/350-steering-council.txt +++ b/transcripts/350-steering-council.txt @@ -2290,7 +2290,7 @@ 00:59:34 Really quickly, because we're getting short on time. -00:59:37 Alvaro out in the audience asked, was enum.stir enum delay related to breaking Pydantic? +00:59:37 Alvaro out in the audience asked, was enum.StrEnum delay related to breaking Pydantic? 00:59:44 We can give a really, really quick answer. diff --git a/transcripts/350-steering-council.vtt b/transcripts/350-steering-council.vtt index 8ac1727..4f93bc4 100644 --- a/transcripts/350-steering-council.vtt +++ b/transcripts/350-steering-council.vtt @@ -3460,7 +3460,7 @@ Yeah, absolutely. Really quickly, because we're getting short on time. 00:59:37.660 --> 00:59:44.240 -Alvaro out in the audience asked, was enum.stir enum delay related to breaking Pydantic? +Alvaro out in the audience asked, was enum.StrEnum delay related to breaking Pydantic? 
00:59:44.240 --> 00:59:45.960 We can give a really, really quick answer. diff --git a/transcripts/537-datastar.txt b/transcripts/537-datastar.txt index 5f49e54..6511dca 100644 --- a/transcripts/537-datastar.txt +++ b/transcripts/537-datastar.txt @@ -1960,7 +1960,7 @@ 00:57:58 While I'm sitting here on this open VSX registry, do you all have advice for making Datastar work well -00:58:05 with Identic AI and Claude Code, Cursor, et cetera? +00:58:05 with agentic AI and Claude Code, Cursor, et cetera? 00:58:08 There's some active research going on like in Oslo at a college diff --git a/transcripts/537-datastar.vtt b/transcripts/537-datastar.vtt index c21f5d7..6fcb669 100644 --- a/transcripts/537-datastar.vtt +++ b/transcripts/537-datastar.vtt @@ -3487,7 +3487,7 @@ WEBVTT do you all have advice for making Datastar work well 00:58:05.580 --> 00:58:08.500 -with Identic AI and Claude Code, Cursor, et cetera? +with agentic AI and Claude Code, Cursor, et cetera? 00:58:08.740 --> 00:58:10.280 There's some active research going on diff --git a/transcripts/parallel-python-at-any-scale-with-ray-transcript.vtt b/transcripts/parallel-python-at-any-scale-with-ray-transcript.vtt index bbb2645..5a73c42 100644 --- a/transcripts/parallel-python-at-any-scale-with-ray-transcript.vtt +++ b/transcripts/parallel-python-at-any-scale-with-ray-transcript.vtt @@ -37,7 +37,7 @@ This is your host, Michael Kennedy. I'm a PSF fellow who's been coding for over Let's connect on social media. You'll find me and Talk Python on Mastodon, BlueSky, and X. The social links are all in your show notes. You can find over 10 years of past episodes at talkpython.fm. And if you want to be part of the show, you can join our recording live streams. 00:02:01.840 --> 00:02:12.040 -That's right, we live stream the raw uncut version of each episode on YouTube. Just visit Talk Python.fm slash YouTube to see the schedule of upcoming events. 
Be sure to subscribe there +That's right, we live stream the raw uncut version of each episode on YouTube. Just visit talkpython.fm/youtube to see the schedule of upcoming events. Be sure to subscribe there 00:02:12.040 --> 00:02:26.740 and press the bell so you'll get notified anytime we're recording. This episode is sponsored by Sentry's Seer. If you're tired of debugging in the dark, give Seer a try. There are plenty of AI tools that help you write code, but Sentry's Seer is built to help you fix it when it breaks. @@ -628,7 +628,7 @@ standard Python script. So like ideally the way you sort of think about things i with how you're, how you might sort of reason about, about like, you know, serial code or like single thread code. And so like, obviously the, the, a lot of the things that we do here don't demonstrate, 00:34:58.320 --> 00:35:08.960 -or like demonstrate how you might sort of set up a project by yourself. So including like standard PIP installations, you can use uv if you want and then like standard imports. Right. And then moving down, +or like demonstrate how you might sort of set up a project by yourself. So including like standard pip installations, you can use uv if you want and then like standard imports. Right. And then moving down, 00:35:08.960 --> 00:35:23.040 we started to enter like using Ray data, which is the data processing multimodal data system that we have. It's a library on top of Ray and it provides a lot of simple abstractions to do all sorts of like diff --git a/youtube_transcripts/488-lancedb.vtt b/youtube_transcripts/488-lancedb.vtt index 6cf24ea..21624b6 100644 --- a/youtube_transcripts/488-lancedb.vtt +++ b/youtube_transcripts/488-lancedb.vtt @@ -421,7 +421,7 @@ Even for something like Apache Arrow, like PyArrow kind of API, it's the standar data. 00:08:40.200 --> 00:08:43.840 -But the ChatGPD still makes up APIs for that. +But the ChatGPT still makes up APIs for that. 
00:08:43.840 --> 00:08:51.940 And also, I think if you were looking at in terms of the effect on the developer community, diff --git a/youtube_transcripts/517-agentic-ai-youtube.vtt b/youtube_transcripts/517-agentic-ai-youtube.vtt index 050ea6b..83623fb 100644 --- a/youtube_transcripts/517-agentic-ai-youtube.vtt +++ b/youtube_transcripts/517-agentic-ai-youtube.vtt @@ -1354,7 +1354,7 @@ Because right now I feel like if you want reliability in like code formatting, y Like the agentic tools are just not, I haven't, maybe I'm. 00:47:57.300 --> 00:48:07.460 -I've gotten Cloud Sonnet to know that it's supposed to run Ruff format and rough check --fix whenever it finishes anything. +I've gotten Claude Sonnet to know that it's supposed to run Ruff format and rough check --fix whenever it finishes anything. 00:48:07.890 --> 00:48:15.380 And so at the end, it'll say, and now I'm supposed to do this to make sure it's tidy and the style you like according to your ruff.toml, right? diff --git a/youtube_transcripts/537-datastar-modern-web-dev-simplified-youtube.vtt b/youtube_transcripts/537-datastar-modern-web-dev-simplified-youtube.vtt index c4f7fd7..58ec9df 100644 --- a/youtube_transcripts/537-datastar-modern-web-dev-simplified-youtube.vtt +++ b/youtube_transcripts/537-datastar-modern-web-dev-simplified-youtube.vtt @@ -3397,7 +3397,7 @@ WEBVTT do you have advice for making Datastar work well 01:02:21.320 --> 01:02:24.420 -with Identic AI and Claude Code, Cursor, et cetera. +with agentic AI and Claude Code, Cursor, et cetera. 
01:02:25.620 --> 01:02:29.240 I will say that there's some active research going on From 6ac26e69d44b18bcd01f833234649b28c8028b5d Mon Sep 17 00:00:00 2001 From: Michael Kennedy Date: Wed, 6 May 2026 13:36:01 -0700 Subject: [PATCH 14/16] transcripts --- .../545-owasp-top-10-transcript-final.txt | 2290 +++++++++++ .../545-owasp-top-10-transcript-final.vtt | 3436 +++++++++++++++++ ...pps-for-python-people-transcript-final.txt | 1638 ++++++++ ...pps-for-python-people-transcript-final.vtt | 2458 ++++++++++++ ...ython-at-any-scale-with-ray-transcript.txt | 740 ++++ ...thon-at-any-scale-with-ray-transcript.vtt} | 0 6 files changed, 10562 insertions(+) create mode 100644 transcripts/545-owasp-top-10-transcript-final.txt create mode 100644 transcripts/545-owasp-top-10-transcript-final.vtt create mode 100644 transcripts/546-self-hosting-apps-for-python-people-transcript-final.txt create mode 100644 transcripts/546-self-hosting-apps-for-python-people-transcript-final.vtt create mode 100644 transcripts/547-parallel-python-at-any-scale-with-ray-transcript.txt rename transcripts/{parallel-python-at-any-scale-with-ray-transcript.vtt => 547-parallel-python-at-any-scale-with-ray-transcript.vtt} (100%) diff --git a/transcripts/545-owasp-top-10-transcript-final.txt b/transcripts/545-owasp-top-10-transcript-final.txt new file mode 100644 index 0000000..1add0e6 --- /dev/null +++ b/transcripts/545-owasp-top-10-transcript-final.txt @@ -0,0 +1,2290 @@ +00:00:00 The OWASP Top 10 just got a fresh update, and there are some big changes. + +00:00:03 Supply chain attacks, exceptional condition handling, and more. + +00:00:07 Tanya Janca is back on Talk Python to walk us through every single one of them. + +00:00:12 And we're not just talking theory here. + +00:00:14 We're going to turn Claude Code loose on a particularly crappy web project and see what it finds. + +00:00:20 Let's do this. + +00:00:21 It's Talk Python To Me, episode 545, recorded April 8th, 2026. + +00:00:28 Talk Python To Me. 
+

00:00:30 Yeah, we ready to roll.

+

00:00:31 Upgrading the code.

+

00:00:32 No fear of getting old.

+

00:00:34 Async in the air.

+

00:00:35 New frameworks in sight.

+

00:00:36 Geeky rap on deck.

+

00:00:38 Quart crew, it's time to unite.

+

00:00:39 We started in Pyramid.

+

00:00:41 Cruising old school lanes.

+

00:00:43 Had that stable base, yeah, sir.

+

00:00:44 Welcome to Talk Python To Me, the number one Python podcast for developers and data scientists.

+

00:00:49 This is your host, Michael Kennedy.

+

00:00:51 I'm a PSF fellow who's been coding for over 25 years.

+

00:00:55 Let's connect on social media.

+

00:00:56 You'll find me and Talk Python on Mastodon, Bluesky, and X.

+

00:01:00 The social links are all in your show notes.

+

00:01:02 You can find over 10 years of past episodes at talkpython.fm.

+

00:01:06 And if you want to be part of the show, you can join our recording live streams.

+

00:01:10 That's right.

+

00:01:10 We live stream the raw uncut version of each episode on YouTube.

+

00:01:14 Just visit talkpython.fm/youtube to see the schedule of upcoming events.

+

00:01:19 Be sure to subscribe there and press the bell so you'll get notified anytime we're recording.

+

00:01:23 This episode is brought to you by Temporal, durable workflows for Python.

+

00:01:27 Write your workflows as normal Python code and Temporal ensures they run reliably, even across crashes and restarts.

+

00:01:34 Get started at talkpython.fm/Temporal.

+

00:01:38 Hello, Tanya Janca.

+

00:01:39 Welcome back to Talk Python To Me.

+

00:01:40 Awesome to have you here.

+

00:01:41 Oh my gosh, Michael.

+

00:01:42 It's so nice to see you.

+

00:01:43 Yeah, it's really great to see you as well.

+

00:01:45 I remember last time I was nervous you were on the show because you're going to make me feel concerned about all my software running on the internet that now all has all these issues I just realized.

+

00:01:55 We're back for the 2025 edition.
+

00:01:57 And I know the year is 2026.

+

00:01:59 Please don't email me.

+

00:02:00 I mean, email me, but not for that reason.

+

00:02:02 But this is the 2025 OWASP top 10, which is pretty new, right?

+

00:02:07 Yeah, so we released it December 31st, 2025 so that it could stay 2025 on it.

+

00:02:13 Do you know how much branding has to change if we don't get it out this year?

+

00:02:17 Let's just go.

+

00:02:18 That's incredible.

+

00:02:19 I didn't realize it was that close to the wire.

+

00:02:20 We had released the release candidate.

+

00:02:23 So every time it's released, we release a release candidate to say, this is what we're thinking.

+

00:02:29 And then we ask the community for feedback.

+

00:02:32 And I don't know if you remember in previous versions before I joined, there was some drama where sometimes the community is like, absolutely not.

+

00:02:39 You are incorrect.

+

00:02:40 Or there's vendor influence or whatever.

+

00:02:42 And then they've had to rework it.

+

00:02:43 But this time it was the smoothest it's literally ever been since the first time where all the links were or all the GitHub issues were great.

+

00:02:53 Like, hey, you know, here's a great example of that attack.

+

00:02:57 Do you want to use it?

+

00:02:59 And we're like, yes.

+

00:02:59 Or, you know, the grammar is wrong here or the, you know, the links are wrong.

+

00:03:03 We had a couple of, well, this one should be number one because that's what my product solves.

+

00:03:08 We're like, well, we'll hear that feedback.

+

00:03:11 Exactly.

+

00:03:11 You failed to mention our product.

+

00:03:13 I'm like, oh, I do see how that happened.

+

00:03:15 Yeah.

+

00:03:16 But other than like the feedback was overall just overwhelmingly, yes, we agree, which was very validating.

+

00:03:22 Yeah.

+

00:03:23 I'm sure there was a lot of, boy, if I could get the OWASP top 10 to reference my solution.

+

00:03:28 That's some good marketing right there.
+

00:03:30 Incredible.

+

00:03:30 So before we dive into that with a bit of a Python focus, let's just hear a little bit about you, who you are.

+

00:03:39 You've started a podcast since you've been on the show.

+

00:03:42 Tell us about you.

+

00:03:43 So I'm Tanya and I was a software developer that switched into application security.

+

00:03:48 I went to the dark side, Michael.

+

00:03:50 I started speaking at conferences so I could get in free and writing and then ended up writing two books.

+

00:03:55 And now I teach secure coding and like how to use AI securely and all of those things to large companies.

+

00:04:02 And then speak at conferences.

+

00:04:04 Serve as like a habit.

+

00:04:05 I can't stop.

+

00:04:06 You don't get paid to do that usually.

+

00:04:09 And so recently I started a podcast called DevSec Station.

+

00:04:12 And it's five to 10 minute lessons on secure coding and it's free.

+

00:04:19 I used to have a podcast called We Hack Purple Podcast and I, like my company got bought and absorbed, et cetera.

+

00:04:26 And eventually the podcast was retired.

+

00:04:28 I've missed having a podcast, Michael.

+

00:04:30 I'm sure as a podcast host, you can relate.

+

00:04:31 It's really nice to be able to create a piece of art and release it.

+

00:04:35 It's a very interesting medium and you get to just reach out to people or explore ideas that are just interesting to you.

+

00:04:43 And as long as there's a through thread, you can kind of do whatever you want to.

+

00:04:47 It's great.

+

00:04:48 Yeah, I love it.

+

00:04:48 I wanted to teach some lessons and I wanted them to be just really short.

+

00:04:53 And the first season I'm exploring the idea that the supply chain is changing.

+

00:04:58 The supply chain security used to be just dependencies that people worried about.

+

00:05:02 But now I'm like, what if that attack surface is actually very different than we realize?
+ +00:05:08 And so I'm talking about how developers can protect themselves, protect the organizations, protect their build pipelines, et cetera. + +00:05:14 And so, yeah, I'm excited to see what people think of it. + +00:05:17 Yeah, I encourage people to subscribe. + +00:05:19 That's really cool. + +00:05:19 Five to 10 minutes. + +00:05:21 Is it daily or weekly? + +00:05:22 So I released one two weeks ago and I was kind of thinking of releasing one tomorrow, but I need to just get the edits. + +00:05:27 It's, I hired some students and they're learning video editing and it's been very exciting. + +00:05:33 Yeah, that's the back end of all this type of work that people don't realize is it's an hour or 10 minutes or whatever, but then there's the whole production distribution, et cetera. + +00:05:43 I'm just a control freak. + +00:05:44 That's the problem, Michael. + +00:05:46 I just want to do it myself. + +00:05:48 Yes, exactly. + +00:05:49 There's a bit of a wind noise thing. + +00:05:51 Can we do that again? + +00:05:52 No, it is tough. + +00:05:53 It is really, really tough to kind of find that balance. + +00:05:56 But yeah, people check that out. + +00:05:57 That's awesome. + +00:05:58 Yeah. + +00:05:59 And is SheHacksPurple still your domain? + +00:06:02 Yeah. + +00:06:02 So if people go to SheHacksPurple.ca, you will find my website and my services and my blog. + +00:06:08 So lately I'm blogging a lot about how we can combine behavioral economic interventions, which is like the science of why people make decisions to the software development ecosystem + +00:06:19 so that we basically set up secure defaults and other things that just nudge developers to do the secure thing and make the secure thing always the easiest path. + +00:06:30 And so not how do we manipulate them and pressure them and make them feel bad, but more how can we remove cognitive load that's not necessary? + +00:06:39 How can we make it more obvious what we hope that they'll do? 
+

00:06:42 How can we make it so like it requires effort to do the bad thing?

+

00:06:46 There's a phrase that I got, I think from Scott Guthrie was the one who spoke about it at Microsoft, but it doesn't really matter.

+

00:06:53 To help people, help developers and security folks fall into the pit of success.

+

00:06:58 Like you've got to climb out of the thing you're supposed to do and actively do it wrong.

+

00:07:02 You know what I mean?

+

00:07:03 Exactly.

+

00:07:04 Exactly.

+

00:07:04 So I'm writing a series on that based on a talk I did.

+

00:07:08 So sometimes I'll do a talk and then I'm really excited about it.

+

00:07:10 And I'm like, well, now I can nerd out as much as I want on my blog.

+

00:07:13 I want to talk about what is the OWASP top 10?

+

00:07:17 What is OWASP?

+

00:07:17 What is the OWASP top 10 and all of that?

+

00:07:20 But before we kind of get into that, I do want to set the stage just a little bit, because when people think about Python security or you pick your language, there's like every language and framework has a little security gotchas.

+

00:07:34 Like in Python, there's a, I think it's a YAML parser, but it's too, it can run arbitrary code.

+

00:07:41 So you got to do like the safe YAML parsing.

+

00:07:43 And then there's, there's pickles, which is a serialization format that can run arbitrary code.

+

00:07:47 So people say, these are the things to look out for.

+

00:07:50 I actually think those are fairly useless.

+

00:07:52 I think the real problem is things that developers do to write code and they either omit or add actions or steps that they should or shouldn't have done depending on the situation.

+

00:08:04 And really that's what OWASP focuses on, right?

+

00:08:07 Technically it is a nonprofit organization, but what I feel it is, is an international community of thousands and thousands and thousands of people who want there to be more secure software.
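Michael's pickle aside above is easy to demonstrate. Here is a minimal, deliberately harmless sketch (the Payload class name is invented for illustration): unpickling attacker-controlled bytes runs whatever callable the payload names, which is the same reason PyYAML's plain yaml.load is risky and yaml.safe_load exists.

```python
import os
import pickle

# pickle rebuilds objects by calling whatever __reduce__ names, so
# loading attacker-controlled bytes executes that callable. Here it
# is a harmless os.getcwd, but it could just as easily be
# os.system("anything").
class Payload:
    def __reduce__(self):
        return (os.getcwd, ())  # callable + args, run at load time

blob = pickle.dumps(Payload())
result = pickle.loads(blob)  # runs os.getcwd() instead of returning a Payload
print(result == os.getcwd())  # True: code ran during "deserialization"
```

The fix is the same in both cases: never unpickle untrusted input, and prefer yaml.safe_load for YAML from outside your trust boundary.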
+ +00:08:20 And so we have chapters where people meet each month. + +00:08:22 There's over 300 worldwide, Michael. + +00:08:25 It's amazing. + +00:08:27 Wow. + +00:08:27 Yeah. + +00:08:27 Almost every big city in all of Canada has one. + +00:08:30 And like, we can't agree on a lot of things in Canada, but apparently we love OWASP. + +00:08:34 And like in India, oh my gosh, they have so many. + +00:08:37 And in the United States, like people love, oh, I love OWASP. + +00:08:41 And then we have over 100 active open source projects. + +00:08:45 So we have free books, free documents, free tools, free software, like everything you can think of. + +00:08:51 We're like, oh, I'll build one for you. + +00:08:53 And then, you know, there's a Slack channel with thousands of us on there being nerds together. + +00:08:56 And then twice a year, we have one in Europe and one in North America where we have a conference and we gather and there's talks and teachings and all the things, right? + +00:09:08 And I've been a part of OWASP since 2015 was the first thing I attended. + +00:09:16 And then 2016, I was a chapter leader because I don't know how to do small things. + +00:09:19 I only know how to do really big things because I was like, oh, I'm in. + +00:09:23 And I've just been, so I'm now a lifetime distinguished member because I've volunteered for over 10 years and I'm their biggest fan. + +00:09:32 I'm totally president of the non-existent fan club. + +00:09:35 And the thing, so the weird thing, Michael, so OWASP does so many amazing things. + +00:09:40 There's so many amazing people. + +00:09:41 But the thing that we're absolutely most famous for is called the OWASP top 10. + +00:09:46 And I volunteered in Norway at the OWASP booth at a conference because I gave a talk and then I had nothing else to do. + +00:09:55 And I don't mean to sound rude, but unless it's a security talk, I'm not going. + +00:09:59 And it was a developer talk. 
+ +00:10:00 And so I went to the other two security talks and then I was like, what am I going to do? + +00:10:03 So I volunteered at the OWASP booth. + +00:10:05 And every person that walked by went top 10. + +00:10:08 That's awesome. + +00:10:09 And any of them that knew us, they only knew the top 10. + +00:10:13 They didn't know we had chapters. + +00:10:14 They didn't know we had any other things to help them. + +00:10:16 But literally person after person, top 10. + +00:10:20 That's pretty funny. + +00:10:21 Like a few weeks later, the top 10 team invited me to join. + +00:10:24 And I was like, me? + +00:10:26 And I was like, okay. + +00:10:28 I knew about the whole project because of the top 10 as well, honestly. + +00:10:32 Well, I mean, then we're doing a great job. + +00:10:34 And so we finally wrote a new one. + +00:10:36 And the team's like, okay, Tanya, you like talking the most. + +00:10:39 So you go tell them. + +00:10:41 You go tell them it's ready. + +00:10:42 Actually, if you go to the OWASP top 10, there's a GitHub link at the top. + +00:10:47 And you can see that there's, if you go there, you can actually see historically the 2003, 4, 7, 10, 2017, 2021, and the 2025. + +00:10:55 And you go and there's all the markdown files and presentations and et cetera. + +00:10:59 So you can kind of get the historical evolution as well. + +00:11:02 People file issues. + +00:11:04 And then I have to, me or, you know, we're twisting or that's what we do. + +00:11:09 We respond. + +00:11:10 Sometimes we respond and we're like, no. + +00:11:13 This is the third time you've asked. + +00:11:14 No. + +00:11:15 We talked about it twice. + +00:11:16 Yeah. + +00:11:17 We try really hard to always be open because the community, sometimes we don't have the data that supports the thing. + +00:11:23 But let me tell everyone what it is. 
+

00:11:24 So if you've been hiding under a rock, the OWASP top 10 is an awareness document of the top 10 things, according to the data we gathered and multiple community surveys of risks to web applications.

+

00:11:37 Most of them relate to all software, but technically this one is about web applications.

+

00:11:43 And you might not realize, but underneath the next steps, there's three more secret items because we couldn't decide, Michael.

+

00:11:52 And vibe coding needed to be on there.

+

00:11:54 And then there was sort of a tie for the number 10.

+

00:11:58 So we want to include the one that was the tie.

+

00:12:00 And then we felt memory safety is still so unbelievably critical.

+

00:12:05 We had to comment on that as well.

+

00:12:07 So we had to talk about those as well.

+

00:12:10 So that was good.

+

00:12:12 So it's the top 10, but these are three other things for your consideration that are very important.

+

00:12:16 I did not realize those.

+

00:12:17 How interesting.

+

00:12:18 I do think whenever you release your next version in two, three, four years, whatever, there's going to be a strong AI bent and we're going to get into the AI angle of security in this conversation, which is going to be really fun.

+

00:12:32 But yeah, I think we're just beginning this.

+

00:12:35 Like you called it out with vibe coding, but there's more to that even.

+

00:12:38 Making the top 10 is complicated because we have to do it based on data.

+

00:12:43 And the data I would like would be the postmortem from security incidents and would be, you know, the AppSec team telling us the things that are happening, not what we actually

+

00:12:54 get, which is a bunch of SAST vendors and DAST vendors telling us what their automated tools are capable of finding.

+

00:13:00 And then a bunch of boutique pen tester companies who are so generous to give us their reports and to try to normalize that for us.
+

00:13:09 And like we end up with millions and millions and millions of records and that's great.

+

00:13:13 But if a SAST tool is really good at finding X, that doesn't necessarily mean that X is the biggest problem in our industry.

+

00:13:22 And so when we put supply chain security on there and expanded it from just being libraries, like, oh, you're using outdated or vulnerable components.

+

00:13:31 Like, yeah, that's bad.

+

00:13:33 But there's also malicious components.

+

00:13:35 There's also you didn't lock down your CI and then the co-op student, like the co-op or you call them interns, put too many zeros on the Kubernetes deployment.

+

00:13:45 And then you've got a $30,000 bill you weren't expecting or like I've seen it, right?

+

00:13:50 I feel like it's hard to get data that tells the full picture.

+

00:13:54 And we're not allowed to just say, well, this is what the team thinks, right?

+

00:13:58 So that's where the surveys come in.

+

00:13:59 And we're like, this is what we see.

+

00:14:02 This is what we think.

+

00:14:02 Do you agree?

+

00:14:03 And like across the board, everyone was like, supply chain must be on that list.

+

00:14:07 You must expand it.

+

00:14:08 We agree.

+

00:14:09 I think it's the biggest issue of the year, at least of the last six months.

+

00:14:12 If we look at reports like the Verizon breach report and the CrowdStrike and the like the big reports of many, many breaches, if we look at the big ones, the nation state ones,

+

00:14:24 they're supply chain, or in my opinion, if we really look at it, we think about it, they exploited the developer.

+

00:14:31 They compromised a developer within an organization.

+

00:14:33 It gave them access to multiple parts of the supply chain.

+

00:14:37 And then they owned the entire organization.

+

00:14:40 So if you get SQL injection in one app, you got into one database and maybe you could read sensitive data.

+

00:14:46 Maybe you could delete sensitive data.
+ +00:14:49 If that database was completely unpatched in a total like terrible mess, then maybe you could take over that server. + +00:14:56 Then if your network's totally not secure and crappy, which is not exactly that common, like I'll look at, you know, network diagrams with clients and I'm like, oh, that line, is that a firewall? + +00:15:06 They're like, it's more aspirational. + +00:15:07 I've heard that many times, right? + +00:15:09 Then maybe they could pivot and get to a couple of places, but that's like, maybe, maybe, maybe, maybe, maybe you get a little bit, but you compromise a senior developer. + +00:15:18 Right. + +00:15:18 And so, yeah, I was really glad when the team agreed that we would do this and then the community supported it. + +00:15:25 So I was like, yes, win. + +00:15:26 Yeah. + +00:15:27 You compromise a developer, then you get. + +00:15:29 Especially a senior. + +00:15:30 Yeah, exactly. + +00:15:31 You get arbitrary code execution on potentially all the stuff that they send out to the world. + +00:15:36 It's, it's really bad. + +00:15:37 You know, an example of that would be the last past breach. + +00:15:40 Yeah. + +00:15:40 And the way that that all started from what I understand is one of the developers had an outdated version of Plex, the like streaming ripped video player on his home network that was open on the internet. + +00:15:53 That got taken over. + +00:15:54 Then they got into the dev machine and then they got everybody's password vaults. + +00:15:57 It's like, excuse me, because you had a movie player on your home automation network. + +00:16:03 That's crazy. + +00:16:04 There's been some recent hacks where the developers download like a plugin or something and it's malicious. + +00:16:10 And then not only is it trying to steal like secrets off their computer, but then it robs their crypto wallets. + +00:16:17 Because why don't you just kick someone when they're down like jerks? + +00:16:21 It's terrible. 
+

00:15:37 You know, an example of that would be the LastPass breach.

+

00:15:40 Yeah.

+

00:15:40 And the way that that all started from what I understand is one of the developers had an outdated version of Plex, the like streaming ripped video player on his home network that was open on the internet.

+

00:15:53 That got taken over.

+

00:15:54 Then they got into the dev machine and then they got everybody's password vaults.

+

00:15:57 It's like, excuse me, because you had a movie player on your home automation network.

+

00:16:03 That's crazy.

+

00:16:04 There's been some recent hacks where the developers download like a plugin or something and it's malicious.

+

00:16:10 And then not only is it trying to steal like secrets off their computer, but then it robs their crypto wallets.

+

00:16:17 Because why don't you just kick someone when they're down like jerks?

+

00:16:21 It's terrible.
+ +00:17:47 And so right now I'm lobbying all the public to try to get them to call their member of parliament and ask them to vote. + +00:17:54 Yes. + +00:17:54 Because what happens is my member of parliament will be like, Hey, seven one, one, five says this. + +00:18:01 We should create a secure coding law for all governmental organizations, have a standard and then, you know, assure the standard, make, make sure there's compliance. + +00:18:09 And I also like wrote the standard for them and sent it to them. + +00:18:12 Cause that's what I'm like, I'm like, you don't have to use it, but like, here it is in case you want one. + +00:18:16 I've been writing letters for years. + +00:18:17 I'm very annoying. + +00:18:18 Guess what? + +00:18:19 Members of parliament don't know what the word cyber means. + +00:18:21 Like they're very smart. + +00:18:22 I'm not trying to mock them. + +00:18:23 I'm not an expert in what they do either. + +00:18:25 Right. + +00:18:26 And so I need members. + +00:18:28 So Canadians, if you're listening, go like, look up petition E seven one, one five. + +00:18:33 You'll find me sign it and then call or write your member of parliament. + +00:18:36 So if enough of us call, if like 10 or 20, 30 people call the member of parliament or they receive emails, Michael, when the petition comes up, they're like, Oh yeah, my constituents care. + +00:18:45 Therefore I do. + +00:18:46 And they'll raise their hand and I get one chance. + +00:18:49 And I really want a lot of hands going up specifically at least half of 334 people. + +00:18:54 Good luck with that. + +00:18:55 I hope that that goes through. + +00:18:56 That's cool. + +00:18:56 We'll know by June. + +00:18:58 Yeah. + +00:18:58 That's the challenge with legislatures and government in general. + +00:19:01 So many of the people, especially elected officials, they're not elected because they're developers or security specialists or whatever. + +00:19:08 Right. 
+ +00:19:08 But the difference is that you said it's fine because you don't know what they do. + +00:19:13 You're not an expert in law or whatever. + +00:19:14 That's that is true. + +00:19:15 But they have to choose how it's going to work for us through technology, whereas you don't have to choose how law works for them as a tech. + +00:19:23 You know what I mean? + +00:19:23 Like it's they decide. + +00:19:25 So it's I'm not saying that it's their fault or anything, but it is a very tricky thing to balance. + +00:19:30 It is. + +00:19:30 And this is why I, as like a influential person or whatever, are trying to use my influence for good. + +00:19:37 And I'm trying to protect Canada. + +00:19:39 And here's the thing, Michael, is that if Canada creates a law that does this, that is huge momentum for every other country. + +00:19:46 And Canada was one of the first countries to have privacy laws. + +00:19:50 Like we really led the way in that. + +00:19:51 We really have led the way in like laws for quantum as well. + +00:19:56 And like, we're not really used to being that we're just, you know, struggling along and we could lead the way in this. + +00:20:04 Right. + +00:20:04 And then that means other countries can say, well, they have one. + +00:20:07 Like, do we really want to be behind Canada? + +00:20:09 I mean, come on. + +00:20:10 We love Canada. + +00:20:11 Canada is awesome. + +00:20:11 Yeah, we're sweet and we're wonderful. + +00:20:13 And we mean well all the time. + +00:20:16 This portion of Talk Python To Me is sponsored by Temporal. + +00:20:18 Ever since I had Mason Egger on the podcast for episode 515, I've been fascinated with durable workflows in Python. + +00:20:25 That's why I'm thrilled that Temporal has decided to become a podcast sponsor since that episode. + +00:20:30 If you've built background jobs or multi-step workflows, you know how messy things get with retries, timeouts, partial failures, and keeping state consistent. 
+

00:20:39 I'm sure many of you have written brutal code to keep the workflow moving and to track when you run into problems.

+

00:20:45 But it's trickier than that.

+

00:20:46 What if you have a long-running workflow and you need to redeploy the app or restart the server while it's running?

+

00:20:52 This is where Temporal's open source framework is a game changer.

+

00:20:56 You write workflows as normal Python code and Temporal ensures that they execute reliably, even across crashes, restarts, or long-running processes while handling retries, state, and orchestration for you so you don't have to build and maintain that logic yourself.

+

00:21:10 You may be familiar with writing asynchronous code using the async and await keywords in Python.

+

00:21:16 Temporal's brilliant programming model leverages the exact same programming model that you are familiar with but uses it for durability, not just concurrency.

+

00:21:25 Imagine writing await workflow.sleep,

+

00:21:28 timedelta(days=30).

+

00:21:30 Yes, seriously.

+

00:21:31 Sleep for 30 days.

+

00:21:32 Restart the server.

+

00:21:33 Deploy new versions of the app.

+

00:21:34 That's it.

+

00:21:35 Temporal takes care of the rest.

+

00:21:36 Temporal is used by teams at Netflix, Snap, and NVIDIA for critical production systems.

+

00:21:41 Get started with the open source Python SDK today.

+

00:21:44 Learn more at talkpython.fm/Temporal.

+

00:21:47 The link is in your podcast player's show notes.

+

00:21:50 Thank you to Temporal for supporting the show.

+

00:21:52 Let's shift just a little bit to maybe people who don't necessarily mean well, the people who might exploit, you know, broken access control or other types of things.

+

00:22:03 And to aid us here, going through the OWASP top 10, I've come up with a little example of, well, this is what some of the concrete examples might look like.

+

00:22:13 Maybe I'll put a link to this and I'll reference them.
+ +00:22:16 And I don't know how often I'll totally use this. + +00:22:18 But let's start with this one. + +00:22:20 And if I understand it correctly, you all don't like suspense. + +00:22:24 It's the worst comes first, then the second worst, and then the third worst. + +00:22:27 Yeah, we go in order of this. + +00:22:29 So this is, it causes lots and lots of damage. + +00:22:33 It's not that hard to find or exploit. + +00:22:35 And it's everywhere. + +00:22:37 It's everywhere. + +00:22:38 When I talk to people that do pen testing, they're like, yeah, I find this basically every time. + +00:22:43 Like one of my friends, Katie Paxton Fear, she does API content and pen testing and bug bounty and stuff. + +00:22:51 And she said, I have never not once found broken access control in an API, like never once. + +00:22:59 And it's really hard to get right, Michael, because every single page, every single record, every single access, we should check that the role is allowed. + +00:23:08 And that they still are that role, right? + +00:23:11 So we continue to make sure the session's accurate and then grant access. + +00:23:15 And unfortunately, we just forget to ask a bot. + +00:23:18 Or we return the entire record set, and we'll just sort it out on the front end. + +00:23:23 And the malicious actor is like, thanks for the data set. + +00:23:26 That's so sweet of you to give that to me. + +00:23:29 And we just screwed up so much, so much. + +00:23:32 I would like to point out that these are not a single vulnerability. + +00:23:35 It's not like this is the number one issue. + +00:23:38 They're like categories, right? + +00:23:39 It's like violations of these could be you didn't use least privilege, or you bypass a control check, or you didn't put access control on the delete part of the API. + +00:23:49 You know, like there's a bunch of things that fall into each one of these, right? + +00:23:52 It's a category. + +00:23:52 They're all a bucket. 
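Tanya's point above, that the role should be checked on every single page, record, and access, can be sketched framework-agnostically. None of these names (require_role, Forbidden, User) come from a real framework; they are invented to illustrate re-checking authorization on each call rather than once at login.

```python
from functools import wraps

# Hypothetical stand-ins for the objects a web framework would hand you.
class User:
    def __init__(self, username, roles):
        self.username = username
        self.roles = set(roles)

class Forbidden(Exception):
    pass

def require_role(role):
    """Authorization check: being authenticated is not enough."""
    def decorator(func):
        @wraps(func)
        def wrapper(user, *args, **kwargs):
            # Re-check the role on *every* call, not once at login.
            if role not in user.roles:
                raise Forbidden(f"{user.username} lacks role {role!r}")
            return func(user, *args, **kwargs)
        return wrapper
    return decorator

@require_role("admin")
def delete_record(user, record_id):
    return f"record {record_id} deleted by {user.username}"

print(delete_record(User("tanya", ["admin"]), 42))  # allowed
try:
    delete_record(User("guest", ["reader"]), 42)    # blocked
except Forbidden as exc:
    print("blocked:", exc)
```

The point of the decorator shape is exactly Katie's API finding: the check lives next to the endpoint, so it is obvious when the delete part of the API forgot it.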
+

00:23:55 And when we looked at the data, poor code quality was a bucket.

+

00:24:00 I'm like, no, because all of the things go into that bucket, right?

+

00:24:03 The OWASP top one.

+

00:24:04 OWASP top one.

+

00:24:05 And then also the mitigation advice is, what if you sucked less, right?

+

00:24:10 Like there's no constructive feedback to poor code quality.

+

00:24:12 It's not specific enough.

+

00:24:14 So we didn't want buckets like that.

+

00:24:16 Keep it actionable, right?

+

00:24:17 Yeah, exactly.

+

00:24:18 If it's not actionable, it's not worth raising awareness about, we felt.

+

00:24:21 And broken access control, Michael, it's everywhere.

+

00:24:24 And I wish there was like a product that you could buy that could just do this for you.

+

00:24:28 So you can buy authentication products that manage session and identity like really, really, really well, right?

+

00:24:35 And people buy the crap out of them because they work really well.

+

00:24:38 I'd like to be able to buy an access control tool that was as easy to implement as, I'm going to try not to name brands, but you know the products, right?

+

00:24:48 People pay a lot of money for Okta, you know, at like Active Directory, you know, Cognito because they work, right?

+

00:24:57 And they work well.

+

00:24:58 And the less painful they are to implement, the more likely, like they're willing to pay more for that, right?

+

00:25:04 And so if we could solve this issue, like I think that could be pretty huge.

+

00:25:08 Yeah.

+

00:25:09 I got a couple of examples here that were, they're not obvious.

+

00:25:12 Some are obvious, kind of like, here's a Django example.

+

00:25:15 We might see this later.

+

00:25:16 So you might have an admin endpoint and it has @login_required, which is a decorator in Django that will do the validation before the function even runs.

+

00:25:25 And so you look at it like, oh yeah, this is fine.
+

00:25:27 It's using authentication here, but it's not using authorization.

+

00:25:30 It's not checking that the person necessarily is an admin.

+

00:25:33 It's just that they're logged in, right?

+

00:25:35 Like that's a real simple example.

+

00:25:37 Yep.

+

00:25:38 And I see the problem.

+

00:25:39 Another one would be if you say we're going to let people read and write files.

+

00:25:43 Maybe it's like a WordPress type thing or something.

+

00:25:46 But then if they can put dot, dot in their path and break out, let me read the file ../../etc/passwd and read the passwords or whatever, or usernames.

+

00:25:57 This is the type of stuff that falls under broken access control or just no login checks at all.

+

00:26:01 I literally did this yesterday, Michael, because someone was like, hey, go get this file from there.

+

00:26:05 And then I go in the folder and it's not there.

+

00:26:07 Like the link wasn't correct.

+

00:26:08 So I just went through the web directory with that.

+

00:26:11 Like, but they wanted to send me the file.

+

00:26:14 So just to be clear, like, like they sent me and told me to go get it.

+

00:26:18 I wasn't stealing anything.

+

00:26:19 And I didn't end up eventually finding it either.

+

00:26:22 So then they had to send me another link that was correct.

+

00:26:25 But I was like, oh, I'll just like not waste their time.

+

00:26:27 I'll just go look for myself because it's that easy.

+

00:26:30 Incredible.

+

00:26:30 It's like, I'm going to use their tools, but not in a way that they necessarily expected.

+

00:26:33 Well, I mean, they could have just sent me the right link.

+

00:26:35 Exactly.

+

00:26:36 They should have just sent you the right link.

+

00:26:37 Yeah, I have some I cannot recount them here, but my dad, I've got to help take care of him and stuff now these days.

+

00:26:45 And he can't do a lot of his own paperwork and things.
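The dot-dot-slash escape described here is usually blocked by resolving the requested path and refusing anything that lands outside the directory you meant to serve. A minimal sketch, assuming Python 3.9+ for Path.is_relative_to; the BASE directory and the safe_open_path name are made up for illustration.

```python
from pathlib import Path

BASE = Path("/srv/uploads")  # hypothetical directory we intend to serve from

def safe_open_path(requested: str) -> Path:
    """Resolve a user-supplied filename and refuse anything that
    escapes the base directory (e.g. '../../etc/passwd')."""
    candidate = (BASE / requested).resolve()
    if not candidate.is_relative_to(BASE.resolve()):
        raise PermissionError(f"path traversal attempt: {requested!r}")
    return candidate

print(safe_open_path("report.txt"))        # stays under /srv/uploads
try:
    safe_open_path("../../etc/passwd")     # resolves to /etc/passwd: blocked
except PermissionError as exc:
    print(exc)
```

Resolving first matters: comparing raw strings can be fooled by `..` segments and symlinks, while comparing the fully resolved paths cannot.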
+ 
+
00:26:48 And so I've had to do some crazy stuff to get access to or to help him fill out something that, yeah, it's insane.
+
00:26:55 OK, let's go on to number two.
+
00:26:57 This is just setting the wrong configurations, not following the hardening guide, not doing patching.
+
00:27:02 And this is so easy for a malicious actor to find because there's scanners.
+
00:27:06 I joke whenever anyone's like, oh, like, you know, we don't want to do a pen test because it might break our thing.
+
00:27:12 And I'm like, well, you're actually having a penetration test done all the time.
+
00:27:15 If you're on the Internet, you just aren't receiving the report.
+
00:27:18 That is absolutely so true and so disturbing.
+
00:27:21 If you're out there listening and you have something on the Internet, API, website, whatever, and you have not just tailed the log of it and just seen slash WP slash admin slash this slash that just coming at it left and right.
+
00:27:34 You're like, what is going on?
+
00:27:36 Because it doesn't show up in your analytics.
+
00:27:37 I actually had someone find something on my website that I was surprised about because I had a user from my blog from long ago and they could see my user and then it gave my
+
00:27:49 email address and it was actually a personal email address that I, because I had a backup admin account and I used my personal email, and he is like, did you want that on there?
+
00:27:57 I'm like, no, only my mom and my dad email me there because I'm bad.
+
00:28:01 And if my parents write me, I should write back.
+
00:28:06 Right.
+
00:28:07 And so I'm like, oh, that's the personal email that I'm supposed to answer on time.
+
00:28:12 He wrote me and helped me like turn off that setting that I had no idea about despite having multiple security plugins and having run an audit.
+
00:28:20 I'd missed that.
+
00:28:20 This one could evolve for like Python people like Django with debug equals true.
+ +00:28:24 Now this is the most, probably most used misconfiguration example for Django apps out there. + +00:28:30 You're like, well, of course, Michael, I know you don't set debug true in production. + +00:28:34 You do it all the time. + +00:28:35 People do it. + +00:28:35 You're right. + +00:28:36 And then two, there's like 10 other settings that are, should be in production that are not in production in Django. + +00:28:42 Like HSTS. + +00:28:44 Yes, please. + +00:28:45 And a bunch of other, you know, do not allow me to put it, be put into an iframe and all sorts of other things, content security policies and security headers. + +00:28:53 Yes, exactly. + +00:28:54 This is what happened with Claude Code. + +00:28:56 And this is how they, they essentially allowed debug and production, like by not suppressing their map file and then also not having it as part of their git ignore, which is essentially having debug mode in prod. + +00:29:09 And that's how they lost their source code. + +00:29:11 Just to be clear, I'm not shaming the developer that did that. + +00:29:14 They probably didn't have a checklist for that person. + +00:29:17 They probably didn't have anything that scanned to tell them that those settings were incorrect. + +00:29:21 Right. + +00:29:22 They probably don't even have a policy that clarifies. + +00:29:25 They're like, oh, they should just know not to do that. + +00:29:27 Right. + +00:29:27 And then they were rushed. + +00:29:28 Probably they're in a hurry. + +00:29:30 And then. + +00:29:31 Yeah. + +00:29:31 They're shipping three or four times a day, which I appreciate, but at the same time. + +00:29:34 Yeah. + +00:29:34 And then now very bad things are happening. + +00:29:38 It is going to be the most audited code that has ever happened. + +00:29:41 Michael. + +00:29:41 I've seen so many videos parsing that stuff apart. + +00:29:45 It's wild. + +00:29:45 Yeah. 
+ 
+
00:29:46 For people who don't know, I'm sure people heard Claude Code got leaked, but basically the map file in JavaScript says, here's the minified version.
+
00:29:54 But if you want to show the full source version for helpful debugging, here it is.
+
00:29:59 And here's how you get to it.
+
00:30:00 And that apparently got shipped.
+
00:30:01 And people were just like, you know what?
+
00:30:02 Why don't we find out what those files are actually?
+
00:30:04 And it was two security misconfigurations because one of them would have stopped it from going out and the other one would have, like, not had it be in the package in the first place.
+
00:30:14 Right.
+
00:30:14 And it's number two.
+
00:30:16 And it happens even to the really, really, really, really, really high profile, you know, high security assurance requiring places.
+
00:30:23 I have a third one on this list that I think will really surprise people like for real.
+
00:30:29 So imagine this.
+
00:30:30 I've got a self-hosted app and I'm going to run it on a Docker in Docker.
+
00:30:35 You know, it could be Kubernetes.
+
00:30:36 It could be whatever.
+
00:30:37 But I'm going to run it in Docker on my server.
+
00:30:40 And it has both a web interface and a database.
+
00:30:44 And the database is running on the default port and so on.
+
00:30:47 So that's all fine.
+
00:30:48 But it's in Docker and everything's locked down.
+
00:30:51 And so what you could do is you can use this thing called UFW, uncomplicated firewall on Linux.
+
00:30:55 Turn that on and say block or don't block.
+
00:30:58 Only allow my web port.
+
00:31:00 And in your Docker compose file, you often or even just Docker statements, you see map, say Postgres, port 5432 to 5432.
+
00:31:09 Guess what?
+
00:31:09 That's actually open on the Internet, probably with the default password on that database.
+ 
+
00:31:13 Because if you look at the Docker docs and you go to the bottom, it says Uncomplicated Firewall is a frontend that ships with Debian and Ubuntu and lets you manage firewall rules.
+
00:31:21 Docker and UFW use firewall rules in ways that make them incompatible.
+
00:31:26 When you publish the container ports on Docker, traffic gets diverted before that.
+
00:31:30 So guess what?
+
00:31:30 That's open on the Internet.
+
00:31:32 Holy smokes.
+
00:31:33 It's so common to just see this port on the container map to this port on the server.
+
00:31:39 And if you're thinking that UFW on your firewall is going to save you, it's actually just open on the Internet.
+
00:31:45 Like I didn't realize that.
+
00:31:46 And so, for example, what do you do?
+
00:31:47 Well, you say localhost colon my port on the server.
+
00:31:52 So you're shipping, you're only listening locally.
+
00:31:54 You don't have, or you just don't do that.
+
00:31:55 But that's a really subtle and sneaky one that people should be aware of.
+
00:32:00 The thing is no one can memorize all of this.
+
00:32:02 And so what is the answer, Michael?
+
00:32:04 Checklists?
+
00:32:05 Checklists are good.
+
00:32:06 Scanners?
+
00:32:06 Scanners are good.
+
00:32:07 Honestly, I think the modern top tier AI agentic tools are really good.
+
00:32:13 They find a surprising amount of these things.
+
00:32:16 They find them if you ask them to find them, or they make it part of the code that they give you when you just ask for it.
+
00:32:23 Because people just say, I want the app.
+
00:32:24 They don't say, I want a secure app necessarily.
+
00:32:26 And well, it's more efficient to not worry about the security.
+
00:32:29 We'll save you some tokens.
+
00:32:30 Even if you just say, I want a secure app.
+
00:32:33 So I gave a conference talk two weeks ago at RSA called Insecure Vibes.
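The localhost-binding fix mentioned here looks like this in a hypothetical docker-compose file (service name and image tag are invented for the sketch):

```yaml
# Hypothetical compose fragment illustrating the Docker/UFW port gotcha.
services:
  db:
    image: postgres:16
    ports:
      - "5432:5432"              # risky: binds 0.0.0.0, and Docker's own
                                 # iptables rules bypass UFW, so it's public
      # - "127.0.0.1:5432:5432"  # safer: only the host itself can connect
```

If other containers need the database, an internal compose network with no published port at all is safer still.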
+ +00:32:37 In the demo that I recorded in advance that was not part of the slides when I gave my live presentation, but it's on my YouTube. + +00:32:44 I just asked Claude, I'm like, can you make a login function that's for an insulin pump? + +00:32:49 So this is a medical device that needs to be really secure. + +00:32:52 And it does it. + +00:32:53 And then after I'm like, analyze it for vulnerabilities. + +00:32:55 And multiple AIs found critical vulnerabilities in it. + +00:32:59 So I asked for it to be secure. + +00:33:00 So you can't just say, I want to be secure. + +00:33:02 You have to say, and this is what secure means. + +00:33:05 I do think you can find a lot if you use the tools in the right way. + +00:33:08 But like you said, you've got to ask. + +00:33:09 And it's a proper step. + +00:33:11 Software supply chain failures. + +00:33:13 Number three. + +00:33:14 This is the expansion. + +00:33:15 So this used to be vulnerable and outdated components, which is part of your software supply chain. + +00:33:20 But Michael, I'm sure that I've told you this before, but for people that haven't heard me, blah, blah, blah, about it. + +00:33:25 Every single thing that you use to create and maintain your software is part of your supply chain. + +00:33:31 So that includes your browser, the plugins in the browser, the sandbox you created, your CI, all the settings in the CI, where you're getting your libraries and packages from, how you're getting them. + +00:33:43 So do they maintain integrity across the wire when you get them? + +00:33:47 Could it be that you got something else? + +00:33:49 Like every single thing that you're using to maintain and create is part of the supply chain. + +00:33:56 And so we need to protect the whole thing. + +00:33:59 And like we were saying earlier, developers themselves are becoming targets of malicious actors. 
+ +00:34:04 We need to find ways to defend the developer themselves, protect them, make them safer doing their jobs, right? + +00:34:12 And help them find ways to secure the whole supply chain that's not too painful because they still need flexibility in order to be creative. + +00:34:20 So some Python things that you can do concretely here is pin your dependencies. + +00:34:24 You can use pip compile or you can use uv lock files. + +00:34:28 There's all sorts of things that are possible there. + +00:34:31 And then you can also, I think the other side that we haven't mentioned, Tanya, is like known vulnerabilities in packages. + +00:34:38 I think a lot of people, I would say over 95% of the people that install libraries from PyPI, they don't even check whether or not there's a vulnerability in that package before they install it. + +00:34:50 I would like to see. + +00:34:51 So in 2022, the first company announced this idea of reachability. + +00:34:56 So let's say you want to do math. + +00:34:58 So you install a math library. + +00:34:59 We don't actually want to do all of math, right? + +00:35:02 We probably just want to do calculus, but maybe the vulnerabilities in the statistics function, right? + +00:35:07 And so when your code calls all the calculus functions, you're like, woo, derivatives. + +00:35:14 You're not actually, there's no reachable path from your code to the vulnerability. + +00:35:19 Most of the time, that means it's not exploitable, except for if it's log4j, then you're just screwed. + +00:35:24 Just to be clear, you're just in trouble, right? + +00:35:26 But for most things, like 99.9% of the time, then you're fine if there's no reachability. + +00:35:31 And so software composition analysis tools, sometimes called supply chain security tools, when the marketing teams got a little out of hand. 
+ +00:35:39 I feel like if you do one of the 19 attack surfaces within the supply chain that you don't get to call yourself a supply chain tool, but I digress. + +00:35:47 I have strong feels. + +00:36:17 That's so overwhelming. + +00:36:18 I'm just not even going to look at it. + +00:36:20 I have a known vulnerability in one of the packages that I am shipping to production. + +00:36:25 I think it's a PDF package or something like that. + +00:36:28 I can't remember. + +00:36:29 And I scan all my builds with pip audit and it will fail the build. + +00:36:33 So I have to ignore it because it is a vulnerability when you call a path that I don't call and you're running on Windows. + +00:36:40 And I'm trying to deploy it to Docker. + +00:36:42 And I'm not calling that path. + +00:36:44 I'm like, I understand it's a problem that could be an issue under some circumstances, but it doesn't apply here. + +00:36:50 And I just need to use it. + +00:36:51 And it's a Windows problem on my Docker Linux. + +00:36:54 I don't really care right now. + +00:36:56 I mean, it's fine. + +00:36:56 Until they fix it, I'll be okay. + +00:36:58 Well, and especially if you know what the problem is and you're not going to suddenly switch to Windows, why would you do that? + +00:37:05 So the tools are maturing, but they're not perfect. + +00:37:09 And lots of them are going at different speeds. + +00:37:13 We'll just say that. + +00:37:13 So I look forward to the day where there's reachability done on all of those things. + +00:37:20 This portion of Talk Python To Me is brought to you by us. + +00:37:23 I want to tell you about a course I put together that I'm really proud of, Agentic AI Programming for Python Developers. + +00:37:31 I know a lot of you have tried AI coding tools and come away thinking, well, this is more hassle than it's worth. + +00:37:37 And honestly, all the vibe coding hype isn't helping. + +00:37:40 It's a smokescreen that hides what these tools can actually do. 
+ 
+
00:37:44 This course is about agentic engineering, applying real software engineering practices with AI that understands your entire code base, runs your tests, and builds complete features under your direction.
+
00:37:57 I've used these techniques to ship real production code across Talk Python, Python Bytes, and completely new projects.
+
00:38:04 I migrated an entire CSS framework on a production site with thousands of lines of HTML in a few hours.
+
00:38:10 Twice.
+
00:38:11 I shipped a new search feature with caching and async in under an hour.
+
00:38:15 I built a complete CLI tool for Talk Python from scratch, tested, documented, and published to PyPI in an afternoon.
+
00:38:24 Real projects, real production code, both Greenfield and legacy.
+
00:38:29 No toy demos, no fluff.
+
00:38:31 I'll show you the guardrails, the planning techniques, and the workflows that turn AI into a genuine engineering partner.
+
00:38:37 Check it out at talkpython.fm/agentic dash engineering.
+
00:38:41 That's talkpython.fm/agentic dash engineering.
+
00:38:45 The link is in your podcast player's show notes.
+
00:38:47 One of the other things I wanted to mention is like, so you said pin dependencies.
+
00:38:52 And so I teach this and then inevitably every time someone's like, well, if I pin dependencies forever, then I just have all these really old dependencies.
+
00:38:59 That's not what Michael means.
+
00:39:01 He means you do development, you update your dependencies to a version.
+
00:39:05 Like ideally you're like LTS, you're like latest, you know, stable version of whatever the thing is.
+
00:39:10 Because you're trying to keep like definitely a supported version, recent, you're not picking terrible things where, you know, it hasn't been updated in two years, or there's one maintainer and they happen to live in Russia and work for the Russian government, right?
+
00:39:23 So you're picking like decent ones.
+
00:39:25 You're updating it in dev.
+ 
+
00:39:26 You're like, okay, this is the one.
+
00:39:27 Then you pin it.
+
00:39:28 So as it goes up to different environments, you don't get a surprise update and it changes.
+
00:39:33 And then there's something different in prod than what you tested in UAT and approved with the security tools.
+
00:39:39 That's what, that's what.
+
00:39:40 And a hundred percent.
+
00:39:41 Because it gets misinterpreted.
+
00:39:43 Yeah.
+
00:39:43 And another thing to do that you can do real simple is like with some of the tools, like with uv, you can say, I have pinned dependencies, update them to the current ones
+
00:39:52 with a very important caveat that you can say that are older than a week or older than a day or something.
+
00:39:58 But because, you know, the really big example here is LiteLLM, just this, was that just this week or was that last week?
+
00:40:05 I can't keep it.
+
00:40:06 It was very recent.
+
00:40:07 Yeah.
+
00:40:07 This thing has a dependency that itself became, like you talked about, the developer got taken over, I believe, and a virus got put in and it was only out for like half an hour or something, but it took over, it's so popular.
+
00:40:20 It took over like 50,000 machines because it gets downloaded millions of times a day.
+
00:40:26 Automatically.
+
00:40:26 If you say, give me the latest, he's like, obviously waiting a week is not hardcore security, but at the same time, so many of these popular issues that people take, they only last briefly, right?
+
00:40:36 For a few moments.
+
00:40:37 And then somebody's like, oh my gosh, why is this thing using 100% CPU?
+
00:40:41 You know what I mean?
+
00:40:42 And here's the thing, Michael, is that not all of those 50,000 got the memo that this happened and they're still vulnerable in prod and they could be for a while.
+
00:40:49 Yeah.
+
00:40:49 It could be for a long time.
+
00:40:50 Yes, I agree.
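If I understand uv's options correctly, the cooldown idea described here maps to its `exclude-newer` setting, which ignores any release published after a cutoff timestamp; the date below is only a placeholder, and it's worth checking uv's documentation for the exact behavior:

```toml
# Hypothetical pyproject.toml fragment: resolve against the package index
# as it looked at this moment, skipping anything published more recently.
[tool.uv]
exclude-newer = "2026-02-03T00:00:00Z"
```

Keeping that cutoff a few days behind "now" gives exactly the "older than a week or older than a day" buffer mentioned above.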
+ 
+
00:40:51 Like update, but to, you know, one that's three days old or one week old, it's weird.
+
00:40:56 So this advice has drastically changed over the past six months.
+
00:41:00 The best practice used to be auto update to latest version, period.
+
00:41:03 That used to be the advice and that's no longer the advice.
+
00:41:06 And it's kind of heartbreaking, especially if you use npm.
+
00:41:09 npm is just like under siege.
+
00:41:11 It is.
+
00:41:12 Yeah.
+
00:41:12 PyPI is as well, but it looks over at npm and is thankful for its situation.
+
00:41:17 Number four.
+
00:41:18 Oh, I got to keep cruising here.
+
00:41:21 For cryptographic failures.
+
00:41:22 Cryptographic failures.
+
00:41:23 Yeah.
+
00:41:24 Not encrypting.
+
00:41:25 Encrypting using something really old.
+
00:41:28 You start off encrypted and then briefly you're not encrypted and then you're encrypted again.
+
00:41:32 You don't encrypt it when you're supposed to.
+
00:41:34 Also in this realm, one way hashing, not just reversible encryption, right?
+
00:41:41 It would probably fall in here.
+
00:41:42 Encrypting user passwords and storing them in the database along with the key.
+
00:41:47 Ideally, we would hash and we would salt and then hash user passwords.
+
00:41:52 That would be the best.
+
00:41:53 If you really, really, really, really are intense, you could pepper it too.
+
00:41:56 And no, I did not make that up.
+
00:41:58 That is a mathematical nerd joke, not an app sec joke.
+
00:42:01 But a salt is unique per user and the salt itself isn't really a secret.
+
00:42:06 Whereas a pepper is unique per system or per organization.
+
00:42:09 And it is a secret.
+
00:42:11 Right.
+
00:42:11 Like a secret key that you set and then it gets factored in there.
+
00:42:14 That's cool.
+
00:42:15 Yeah.
+
00:42:15 Yeah.
+
00:42:15 Also, maybe choose more modern hashing algorithms, right?
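The salt-then-hash idea above can be sketched with only the standard library; real projects more often reach for argon2-cffi or bcrypt, but `hashlib.scrypt` is a memory-hard option that ships with Python (function names and parameters here are illustrative):

```python
import hashlib
import hmac
import os

def hash_password(password: str) -> tuple[bytes, bytes]:
    salt = os.urandom(16)  # unique per user; the salt itself is not a secret
    digest = hashlib.scrypt(password.encode(), salt=salt, n=2**14, r=8, p=1)
    return salt, digest

def verify_password(password: str, salt: bytes, digest: bytes) -> bool:
    candidate = hashlib.scrypt(password.encode(), salt=salt, n=2**14, r=8, p=1)
    return hmac.compare_digest(candidate, digest)  # constant-time comparison
```

A pepper, as described in the episode, would be one extra secret value mixed into every hash and stored outside the database.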
+ +00:42:19 Obviously not MD5, but maybe something memory hard like Argon maybe. + +00:42:23 I don't know. + +00:42:23 Yes. + +00:42:23 Argon 2. + +00:42:24 That would be better for sure. + +00:42:27 And this is something where if you're going to do it, it's very easy to look up what you're supposed to do on the internet. + +00:42:33 This is something where you can ask the AI, like, are you using a good algorithm? + +00:42:38 Are you doing this? + +00:42:39 Like, make sure it's secure. + +00:42:40 And then it's good as long as it does it. + +00:42:43 Because one suggestion, if you are going to VibeCode and not do the 400 other things we'll talk about later when we talk about my prompt library, but ask it to list its security assumptions. + +00:42:54 So whatever it is you prompt, you give it to make a thing. + +00:42:56 You're like, make this, then do that, blah, blah, blah. + +00:42:59 Please list all your security assumptions. + +00:43:01 And it'll be like, oh, yeah, but obviously like you wouldn't do like authentication like that because that's terrible. + +00:43:06 And like in production, you would do this other thing and you're like, oh, yeah, because it'll assume that you're going to change a bunch of things later that it doesn't tell you unless you ask it to tell you its assumptions. + +00:43:18 Yeah. + +00:43:18 We'll use no password here, but when you ship it, you're going to add that, right? + +00:43:21 Like, no, I wasn't going to, but now I will. + +00:43:23 All right. + +00:43:24 I think one of the best known ones has got to be little Bobby tables and friends. + +00:43:29 Number five, injection. + +00:43:31 Yes. + +00:43:32 So injection, tricking an application in, like you put your code, the malicious actor's code into a place where it should be data, but you've tricked it into thinking it's its code. + +00:43:43 And then either it executes it or it interprets it. 
+ 
+
00:43:45 Like if there's an interpreter, there's a compiler, there's the potential for injection.
+
00:43:49 And it, yeah, we don't want to mix data in with commands.
+
00:43:54 We don't want to mix data in with anything that's going to be executed or interpreted.
+
00:43:58 And we do it a lot, Michael.
+
00:43:59 I know we make bad choices, don't we?
+
00:44:01 We make bad choices.
+
00:44:02 So obviously SQL injection is the number one in this world for sure.
+
00:44:08 And still it's popular.
+
00:44:10 Still people don't know.
+
00:44:11 It's still tricky.
+
00:44:12 I mean, we have certainly parametrized queries and ORMs and stuff that should be helping us or does help us if we choose to use them with this.
+
00:44:19 However, I think other ones, I should just give them a quick shout out.
+
00:44:23 Like for example, if you're accepting JSON and converting it to a dictionary in Python, you can do MongoDB injection.
+
00:44:31 Like your password, you know, the Bobby tables one is like quote, semicolon drop table that, you know, like that's what that looks like in T-SQL.
+
00:44:39 But in MongoDB, you can do queries that are dictionaries.
+
00:44:43 That's like kind of how you do your filtering.
+
00:44:44 So if you take something that would be a password in a JSON document, the password could be curly brace greater than, you know, one equals one.
+
00:44:54 Like a really complicated JSON dictionary that is actually the query that is equivalent.
+
00:44:58 So you got to be super careful there as well.
+
00:44:59 And that's really tricky to do that.
+
00:45:02 And then also like the pickles and like serialization, deserialization.
+
00:45:06 There's a lot to this, not just SQL injection.
+
00:45:09 It's a lot about input validation.
+
00:45:11 So using a parametrized query, so stored procedures, prepared statements, whatever you want to call them.
+
00:45:17 What that does is say this is data, only treated as data.
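The parametrized-query idea can be sketched with the stdlib sqlite3 module — table and values are invented for the example, but the same placeholder pattern applies to any DB-API driver or ORM:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.execute("INSERT INTO users VALUES (?)", ("O'Malley",))

attacker_input = "x' OR '1'='1"

# Dangerous: string interpolation lets the input become part of the SQL,
# so the injected OR clause would match every row:
#   f"SELECT name FROM users WHERE name = '{attacker_input}'"

# Safe: the ? placeholder tells the driver this is data, never code.
rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (attacker_input,)
).fetchall()
```

With the placeholder, the attacker string is just an odd-looking name that matches nothing, while legitimate names containing quotes, like O'Malley, still work.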
+ 
+
00:45:21 And then the database can do that.
+
00:45:24 But if we, on top of that, do input validation.
+
00:45:27 So like we're getting the thing that looks correct and we're rejecting, we're not trying to fix it.
+
00:45:33 We're just rejecting everything that looks not correct.
+
00:45:36 And then if we have to accept any special characters, we escape them or sanitize them out.
+
00:45:41 I prefer escaping.
+
00:45:42 I think it's weird to remove stuff.
+
00:45:43 That's my data.
+
00:45:44 I probably want it.
+
00:45:45 So like if you have to accept single quotes because you know you're going to have users named O'Malley, let's say.
+
00:45:50 Right.
+
00:45:50 So we accept the letters.
+
00:45:52 We accept the numbers.
+
00:45:53 We accept a single quote and some dashes, even though those are dangerous.
+
00:45:56 And then we escape those characters because we know they're potentially dangerous.
+
00:45:59 And then we specify, by the way, that's definitely data and not code by making it a parameter.
+
00:46:06 Right.
+
00:46:06 And then we're locked down.
+
00:46:08 We're in really good shape.
+
00:46:09 If we added input validation on everything and then we just rejected everything that looked weird. And we always need to do like a yes list, an allow list, like this is allowed.
+
00:46:21 Not a block list. When I was a pen tester, Michael, I was not.
+
00:46:24 I was a pen tester for a year and a half.
+
00:46:26 I had basically zero training and I could get around those in two seconds.
+
00:46:31 And like I was not particularly superbly talented.
+
00:46:34 And I was just like, pew, pew, pew, ha, ha, ha, ha, ha, block list.
+
00:46:37 You just try.
+
00:46:38 If you just know to look, right?
+
00:46:40 Yeah.
+
00:46:40 Well, and there's cheat sheets all over the internet of how to get around them.
+
00:46:43 So everyone knows how to get around them.
+
00:46:46 So, but you can't get around.
+ 
+
00:46:48 Well, you're only allowed letters and numbers.
+
00:46:49 It's like, well.
+
00:46:50 Let's keep moving a little bit quickly so that we know that we're, we got time for a little retrospective.
+
00:46:54 So insecure design, that's a fun one.
+
00:46:57 It does not matter how perfectly you follow the plan if the plan is bad.
+
00:47:02 Right.
+
00:47:03 And so this is, this was new. The last time we released the list was the first time it was on there.
+
00:47:10 And I'm really glad because all the other items are implementation.
+
00:47:14 And this is the only one that is design, the plan.
+
00:47:17 And so essentially it means, you know, someone designed something and you don't talk about it.
+
00:47:22 You don't analyze it for security.
+
00:47:24 You don't intentionally apply secure design concepts.
+
00:47:27 You don't do a threat model.
+
00:47:29 There's no security review.
+
00:47:30 You YOLO that.
+
00:47:31 You don't even have a list of security requirements usually that you knew you should have added.
+
00:47:35 Like if you're going to do an API and it's, you know, accessible from the internet, to me, it should be behind an API gateway, period.
+
00:47:44 That is my opinion.
+
00:47:45 No, I don't sell one, but I think we all need one.
+
00:47:47 Right.
+
00:47:48 And to me, that should just be a requirement up front.
+
00:47:51 And then when I see your design document and it's there, I'm like, thumbs up, let's go.
+
00:47:55 Right.
+
00:47:55 But if you're not giving clear requirements and then you're not reviewing the design, you are getting a YOLO approach at stuff.
+
00:48:03 And that doesn't mean developers don't care.
+
00:48:06 But if no one's taught them this, no one's asked for this, and then no one checks this.
+
00:48:09 The problem with this one is the code looks fine.
+
00:48:11 It looks like you're doing it right.
+ 
+
00:48:12 It's just there's something important that's just not even there.
+
00:48:15 Like examples that I came up with were like no rate limiting would be one.
+
00:48:19 The login looks fine.
+
00:48:20 You're checking the person's not a duplicate that they're there, et cetera, et cetera.
+
00:48:23 Right.
+
00:48:23 Or client side enforcement in some kind of like Vue.js app.
+
00:48:27 You've got all the validation there, but the API actually just assumes the client is doing it, which is never the way.
+
00:48:32 And there's a lot of business logic issues that will be the way that you're solving the problem.
+
00:48:40 If users do the thing you want, it's fine.
+
00:48:43 But not all users do the thing you want.
+
00:48:45 And some of them are Tanya.
+
00:48:46 And they're like, well, I'm just going to click through your.
+
00:48:48 Exactly.
+
00:48:49 Some of us are curious and we like to click the buttons.
+
00:48:53 We're like, oh, look at that.
+
00:48:54 Oh, there's a next button, even though it says it's the last page.
+
00:48:57 Well, what would happen if we click that?
+
00:48:58 Absolutely.
+
00:48:59 You've got to click that button, right?
+
00:49:01 Authentication failures, number seven.
+
00:49:03 So this is when an attacker can trick the app into thinking they're a different user, usually an admin user.
+
00:49:09 Or if they're not a legitimate user, tricking the app into thinking they're a legitimate user.
+
00:49:13 But we all want to be admin, Michael.
+
00:49:15 Oh, yeah.
+
00:49:15 This can be caused by lots of things.
+
00:49:17 We wrote our own authentication instead of buying a tried, tested, and true product that will do this for us easier, better, faster, and cheaper in the long run when we count maintenance.
+
00:49:29 That's the biggest mistake.
+
00:49:31 But we don't protect against credential stuffing.
+
00:49:33 We don't protect against brute force.
+
00:49:35 Those are the two super, super huge ones.
+ +00:49:37 We let people reuse passwords, use terribly insecure passwords, etc. We don't have multiple forms of authentication. + +00:49:43 So there's no second factor. + +00:49:46 Yeah. + +00:49:46 And there's, Michael, there are ways to do multi-factor that don't, like, have to be awful necessarily. + +00:49:54 Like, if you don't require the same level of security, like posture. + +00:49:57 So, for instance, you know, you do have multi-factor authentication for the first time. + +00:50:03 But then maybe you fingerprint their browser and their device and their network. + +00:50:07 And if they're going to log in from the same device, the same browser, and the same network, maybe you don't require an MFA challenge very often. + +00:50:14 A hundred percent. + +00:50:15 Like, you could say, trust this machine or this browser. + +00:50:18 And you're like, okay, we'll never ask you 2FA again. + +00:50:21 Just your username, password. + +00:50:23 Or we won't, unless you're doing something like deleting your account. + +00:50:26 There's ways to make this not necessarily always super painful. + +00:50:30 And I, you know, pass keys are so nice. + +00:50:33 Those are making things a lot nicer. + +00:50:35 But I still feel like there's a ways to go. + +00:50:38 I dream of the day where we trust our devices so well that, like, I can just touch the thing and I know it's okay. + +00:50:44 Right? + +00:50:44 And I know that someone can't just, like, XKCD hit me with a wrench until I put the thing in front of my face and then it unlocks. + +00:50:52 Exactly. + +00:50:52 That is one of the weakest parts of the security chain there. + +00:50:54 Mm-hmm. + +00:50:55 Software or data integrity failures? + +00:50:58 Number eight. + +00:50:58 So this is one we fought a lot about. + +00:51:01 Yeah. + +00:51:01 Especially me and Neil. + +00:51:02 Because Neil, I wrote this one and Neil wrote the supply chain one or vice versa. 
+ 
+
00:51:08 And there's lots of arguments of how to differentiate.
+
00:51:10 So we need to make sure that things we download are exactly what we think they are and that the integrity, it's not been spoofed or tampered with in the meantime.
+
00:51:20 So no one has changed it.
+
00:51:22 And this is for data and for software.
+
00:51:24 So, you know, third-party components that we're getting.
+
00:51:27 Are we getting the thing we thought?
+
00:51:29 Is there a typosquat?
+
00:51:30 Has someone been able to intercept in between and change it out?
+
00:51:33 Same with data.
+
00:51:35 Like, did someone change the data on the way to us, et cetera.
+
00:51:39 And this is really key, especially for things that require anything medical.
+
00:51:44 Like imagine the insulin pump that gets it wrong sometimes and people have comas.
+
00:51:48 Like that would be so unbelievably bad.
+
00:51:50 That's very bad.
+
00:51:51 Yeah, I worked with a company that did like all the medical devices and instruments in ORs and ERs and security assurance.
+
00:51:59 Hi.
+
00:52:01 It was an awesome project.
+
00:52:02 It was really cool, but it was also like, damn, your job's hard.
+
00:52:06 It's tough to sleep at night in that one.
+
00:52:07 Yeah.
+
00:52:07 The thing is that private industry tends to really focus on availability.
+
00:52:11 Like if their website's down, they can't sell their thing.
+
00:52:14 Clients call, it costs them money.
+
00:52:16 Right.
+
00:52:16 But integrity is like more silent hurt, if that makes sense.
+
00:52:21 Yeah.
+
00:52:22 So many of these are like this, honestly.
+
00:52:24 Like the whole top 10 is only, it only slows you down and it's sand in the gears until something happens.
+
00:52:30 And then it's your fault for not doing it.
+
00:52:31 But before that, it's like all this stuff is a hassle.
+
00:52:34 And integrity, it comes into play in so many situations because sometimes it can fail silently.
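The "is this the thing we thought we downloaded" check discussed above boils down to comparing a digest of what you actually received against one published somewhere you trust. A minimal sketch (the function name is invented for illustration):

```python
import hashlib
import hmac

def verify_sha256(data: bytes, expected_hex: str) -> bool:
    # Hash what we actually received and compare it against the digest
    # published by the source we trust; any tampering changes the hash.
    actual = hashlib.sha256(data).hexdigest()
    return hmac.compare_digest(actual, expected_hex)
```

This is the same idea behind pip's `--require-hashes`, lock-file hashes, and sub-resource integrity on the web, just stripped to its core.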
+
+00:52:42 Things that fail silently are more scary for security teams.
+
+00:52:45 Does that make sense?
+
+00:52:46 People love to use CDNs for their JavaScript and their CSS and so on.
+
+00:52:51 And there've been examples where the CDN was taken over, or a developer was compromised who published malicious JavaScript.
+
+00:53:02 And the danger with this is if you make that hack go through, you don't just take over that app.
+
+00:53:08 You take over all the people who use that app and everyone who uses the CDN to pull it down.
+
+00:53:12 Like the knock-on effects can be really mega.
+
+00:53:16 Yeah.
+
+00:53:16 And like checking the sub-resource integrity, doing that check, that can help.
+
+00:53:22 Sometimes we can do all the right things and we still get hurt.
+
+00:53:26 Because like for instance, with SolarWinds, the compromise was so deep in the organization that they were able to not only push in code that was malicious, have it pass all the security tests in the pipeline, then sign it and then release it.
+
+00:53:43 And then not have customers also notice the problem.
+
+00:53:45 Like that situation is rare.
+
+00:53:48 What we want to do with this one is raise awareness that you should just be checking the integrity of your stuff, period.
+
+00:53:55 Right?
+
+00:53:56 So the software composition analysis companies, the security researchers, they're on it finding those rare edge case zero day situations.
+
+00:54:06 What we need the average developer to do is just check integrity, period.
+
+00:54:10 Like the thing you've got is what you think you've got and it's from the right place.
+
+00:54:15 And if we could all do that, like life would improve greatly.
+
+00:54:18 There's defaults that are not great.
+
+00:54:20 For example, check this out.
+
+00:54:21 jsDelivr for Tailwind.
+
+00:54:24 So here's a real popular CDN delivering a very, very popular library.
+
+00:54:28 Here's how it tells me to use it.
+
+00:54:29 What's missing here?
+
+00:54:30 Sub-resource integrity check.
+
+00:54:32 Yes.
+
+00:54:33 So if I just say, I want to use this Tailwind and it says, great, source equals such and such.
+
+00:54:38 Good to go.
+
+00:54:39 You know what I mean?
+
+00:54:39 And that's it.
+
+00:54:40 So even the really popular CDNs and stuff are just encouraging you to scramble out of the pit of success.
+
+00:54:47 You know, it's not that at all.
+
+00:54:48 Maybe we should write them and be like, I want you to change this, please.
+
+00:54:52 When I worked at Microsoft, I did that all the time.
+
+00:54:54 I'd be like, you need to change your readme page.
+
+00:54:56 It's wrong.
+
+00:54:57 You forgot the security thing.
+
+00:54:59 And they'd be like, Tanya, just, it's a demo.
+
+00:55:01 I'm like, nope.
+
+00:55:02 Two more real quick before we run out of time.
+
+00:55:04 Logging and alerting.
+
+00:55:05 So security logging and alerting.
+
+00:55:08 So developers might be doing lots of logging and they might be doing some alerting for debugging, which is important and you should still do it.
+
+00:55:15 But this is more that we're not logging when security controls are called and especially pass or fail.
+
+00:55:21 So if someone tries to log in 100 times in one second, I don't want to just know the 100th time that they got in.
+
+00:55:28 I want to know all 99 times where they failed in the logs.
+
+00:55:32 Right.
+
+00:55:32 I want to have enough information in those logs that I can do a proper investigation.
+
+00:55:38 Like when I worked in AppSec, my job wasn't called incident responder.
+
+00:55:42 But every time an app got smashed, they're like, okay, Tanya, can you do that weird thing that you do?
+
+00:55:49 And I would go look at the logs.
+
+00:55:50 And I remember a client calling me one day and they're like, Visa called us and 27 of our customers got popped and we need you to go investigate.
+
+00:55:59 And turns out they didn't have any logs.
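The sub-resource integrity attribute missing from that CDN snippet is just a base64-encoded hash of the file, which the browser recomputes and compares before running the script. As a rough sketch of how you could generate that value yourself in Python — the CDN URL and library bytes below are placeholders, not the real Tailwind file:

```python
import base64
import hashlib

def sri_hash(file_bytes: bytes, algorithm: str = "sha384") -> str:
    """Compute a Subresource Integrity value like 'sha384-...' for a file.

    The browser hashes what the CDN actually serves and refuses to run the
    script if it doesn't match, defeating a swapped-out or tampered file.
    """
    digest = hashlib.new(algorithm, file_bytes).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode()}"

# Pretend these are the bytes of the CDN-hosted library you vetted:
library = b"console.log('hello from the library');\n"
integrity = sri_hash(library)

# The script tag you publish then pins that exact content (placeholder URL):
tag = (f'<script src="https://cdn.example.com/lib.js" '
       f'integrity="{integrity}" crossorigin="anonymous"></script>')
print(tag)
```

If the CDN copy ever changes, even by one byte, the tag no longer loads it, which is exactly the "check that the thing you've got is what you think you've got" point above.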
+
+00:56:01 They didn't think they needed to log that.
+
+00:56:03 And so they had absolutely no application logs for that app.
+
+00:56:08 Sorry.
+
+00:56:09 And I was like, what am I supposed to do, investigate by walking around the building with like a magnifying glass and just look cool with a hat on?
+
+00:56:15 Like there's nothing, there's no evidence.
+
+00:56:16 There's probably somebody in the corner with a hoodie, sunglasses looking sort of hacker-ish.
+
+00:56:21 Right.
+
+00:56:22 Like, I'm just like, what am I supposed to investigate, guys?
+
+00:56:24 Like you have no logs at all.
+
+00:56:27 You got to just let it keep going.
+
+00:56:28 Basically, you got to say, well, now we add logging and then we can figure out if there's new stuff happening or something.
+
+00:56:34 It's really bad.
+
+00:56:35 It turns out it wasn't them.
+
+00:56:36 It turned out that there's a sandwich shop downstairs and an employee had swiped cards and everything from our end was fine.
+
+00:56:44 But then I was like, we are rewriting this app so that it does security logging on failure.
+
+00:56:48 So essentially, you're making it so we can't investigate.
+
+00:56:51 You're making it so there's no evidence that a thing happened.
+
+00:56:54 We can't press charges in court.
+
+00:56:55 There's no chain of custody.
+
+00:56:56 We'll never know what happened.
+
+00:56:59 And then that means we don't know how to protect ourselves in the future.
+
+00:57:02 And we really need these logs.
+
+00:57:04 So every time a security thing happens, input validation, output encoding, like anything that is security related, just log that the attempt was made and it worked or it didn't work.
+
+00:57:14 And, you know, which user ID, et cetera, things like that.
+
+00:57:17 And the timestamp.
+
+00:57:18 All right.
+
+00:57:18 Last one.
+
+00:57:19 Let's round it out real quick with mishandling of exceptional conditions.
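The security-logging advice above — log every attempt at a security control, pass or fail, with the user, source, and timestamp — might look like this minimal sketch using Python's standard `logging` module. The event field names are illustrative, not a standard:

```python
import logging

# Security event log: a dedicated logger so these records can be routed to
# an investigation-friendly sink, separate from debug logging.
security_log = logging.getLogger("security")
logging.basicConfig(format="%(asctime)s %(levelname)s %(message)s")
security_log.setLevel(logging.INFO)

def record_login_attempt(user_id: str, source_ip: str, success: bool) -> dict:
    """Log one authentication attempt, whether it passed or failed."""
    event = {
        "event": "login",
        "user_id": user_id,
        "source_ip": source_ip,
        "outcome": "success" if success else "failure",
    }
    level = logging.INFO if success else logging.WARNING
    security_log.log(level, "auth %s user=%s ip=%s",
                     event["outcome"], user_id, source_ip)
    return event

# 99 failures and then a success: all 100 attempts end up in the logs,
# not just the one that finally got in.
for _ in range(99):
    record_login_attempt("tanya", "198.51.100.7", success=False)
event = record_login_attempt("tanya", "198.51.100.7", success=True)
assert event["outcome"] == "success"
```

The same pattern applies to input validation failures, authorization denials, and other security controls mentioned above: record that the attempt happened and its outcome, with enough context to investigate later.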
+
+00:57:22 So this is brand new and this one is related to the other one.
+
+00:57:27 So number nine was basically you're not doing logging when you should or your logs suck.
+
+00:57:33 They're incomplete.
+
+00:57:34 This one is errors happen and you just don't handle them properly.
+
+00:57:38 So I'm sure you've reviewed code and seen this where it's like try and it does a thing and then catch and then there's nothing and then end.
+
+00:57:47 I'm like, what?
+
+00:57:47 You didn't handle anything.
+
+00:57:50 Or the handling is just I'm going to print the entire system error to the screen with the stack trace and a mess.
+
+00:57:58 Nope, that's gross.
+
+00:57:59 I'm just going to not properly recover.
+
+00:58:02 Right.
+
+00:58:02 And so application resilience is important, but you can't have that at all if you're not doing this.
+
+00:58:07 You can't recover.
+
+00:58:09 Yeah.
+
+00:58:10 Or you don't use a database transaction and the data is corrupted, something like that.
+
+00:58:14 This is where a lot of business logic flaws, like really unique bugs, happen that are harder to find because we are not handling our errors at all or we're handling them very, very poorly.
+
+00:58:27 And I was really excited to have this on here because lack of application resilience tied with this one for spot number 10.
+
+00:58:36 But if you solve this, you almost always solve lack of application resilience.
+
+00:58:40 But if you solve lack of application resilience, you do not solve this.
+
+00:58:45 And so that's how I got them to agree to put this one on instead of the other one.
+
+00:58:48 Having technical discussions with really smart people, it's pretty cool.
+
+00:58:54 Absolutely.
+
+00:58:55 So I want to take a moment and talk about AI and security and give you a chance to talk about your prompt library and how people can get it.
+
+00:59:04 And while you're doing that, I'm going to pull up an example I can kick off.
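The "try, do a thing, catch, nothing, end" anti-pattern described above, next to a sketch of handling the same error properly — catch the specific exception, log it as a security/ops event, and recover to a well-defined state. The function names and fallback value are illustrative:

```python
import logging

log = logging.getLogger("app")

# The anti-pattern: swallow every error silently. The caller implicitly
# gets None and nobody ever learns anything went wrong.
def fetch_price_bad(catalog: dict, item: str):
    try:
        return catalog[item]
    except Exception:
        pass  # nothing handled, nothing logged

# A sketch of doing it properly: catch only the expected error, log it,
# and return an explicit, documented fallback.
def fetch_price(catalog: dict, item: str, default: float = 0.0) -> float:
    try:
        return catalog[item]
    except KeyError:
        log.warning("unknown item requested: %r", item)
        return default  # explicit recovery instead of silence

catalog = {"widget": 9.99}
assert fetch_price_bad(catalog, "gadget") is None   # silent failure
assert fetch_price(catalog, "gadget") == 0.0        # handled failure
assert fetch_price(catalog, "widget") == 9.99
```

Note the proper version also avoids the other sin mentioned above: it never dumps the raw stack trace to the user, only to the log.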
+
+00:59:07 So tell people about it and I'll pull up the example.
+
+00:59:08 I give training and I do this bad, better, best thing where I give an example.
+
+00:59:12 So I'm like input validation or whatever the topic is, you know, like a brief lecture on it and best practices.
+
+00:59:17 Then I give an example of bad code.
+
+00:59:19 Then we fix that thing, better code.
+
+00:59:21 And then best code is like layers of defenses.
+
+00:59:23 And when I was creating these examples with the AI, Michael, every time the example was bad code, like no security control whatsoever or completely incorrectly done.
+
+00:59:34 So like you get the input, you use it and then you validate it.
+
+00:59:38 Right.
+
+00:59:39 So it has gotten better.
+
+00:59:41 So over the past two years, I've seen it go from every time bad to maybe half the time.
+
+00:59:46 It's a bad example.
+
+00:59:47 Sometimes I have to dumb it down now, which is encouraging.
+
+00:59:49 But that's obviously not what we want.
+
+00:59:52 And so the AI, I think everyone knows, is not creating great code.
+
+00:59:57 And the reason is it was trained on not great code.
+
+01:00:02 Most code out there is not great code.
+
+01:00:03 The code specifically it used was demos, examples, things on GitHub, publicly available demos where there's no security team involved.
+
+01:00:11 Right.
+
+01:00:12 Right.
+
+01:00:12 So like if you went and scanned the code inside Microsoft that makes the Microsoft products, you better believe that'd probably be pretty darn good code versus some random crap Tanya did five years ago that's on her GitHub.
+
+01:00:25 That might be really crappy or it might even be intentionally vulnerable.
+
+01:00:29 Right.
+
+01:00:30 And it doesn't.
+
+01:00:31 Yeah.
+
+01:00:31 No.
+
+01:00:31 And so as a result, we have this thing that's trained that security is just optional, it's low priority and it's missing.
+
+01:00:40 And so it is doing what it was trained to do.
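The backwards ordering Tanya calls out — get the input, use it, and only then validate it — is worth seeing next to the right ordering. A minimal sketch with an invented allow-list rule; the regex and function names are examples, not a universal standard:

```python
import re

# Allow-list pattern: lowercase letters, digits, underscore, 3-30 chars.
USERNAME_RE = re.compile(r"[a-z0-9_]{3,30}")

def validate_username(raw: str) -> str:
    """Reject anything that doesn't match the allow-list pattern."""
    if not USERNAME_RE.fullmatch(raw):
        raise ValueError("invalid username")
    return raw

def greet(raw_username: str) -> str:
    username = validate_username(raw_username)  # validate first...
    return f"Hello, {username}!"                # ...then use

assert greet("michael_k") == "Hello, michael_k!"
try:
    greet("<script>alert(1)</script>")
except ValueError:
    pass  # rejected before it ever reaches templates, queries, or logs
```

Validating on the way in means the tainted value never touches the rest of the system; validating after use, as the generated examples often did, defends nothing.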
+ +01:00:44 And developers and non-developers are constantly making apps now. + +01:00:50 We have CEOs making apps because they don't like what the marketing team did. + +01:00:54 And they're like, look what I did over the weekend. + +01:00:56 Boom. + +01:00:57 It's publish, please, because I'm the boss. + +01:01:00 Oh, I've seen it. + +01:01:01 Like literally. + +01:01:01 Yeah. + +01:01:01 Who's going to say no, right? + +01:01:02 Yeah, exactly. + +01:01:03 And so here we have very, very insecure code going onto the internet very, very quickly, often with no time for the security team to go look at it. + +01:01:13 All right. + +01:01:13 So you've got this prompt library that people can go and get from your website for free. + +01:01:18 You gave me an example to say, go find problems in this code. + +01:01:22 I took just some random code that I know has trouble in it and threw it in here. + +01:01:25 So the secure code prompt library, if you want to go, just go securemyvibe.ca and you do have to join my newsletter to get it. + +01:01:34 But I feel that's a reasonable price because my newsletter is awesome and you get memes. + +01:01:38 But anyway, so this is from that. + +01:01:41 So the prompt library has many things, but one of them is to review the code for security. + +01:01:46 So this is a code review prompt. + +01:01:48 So after you've generated the code, you would put this in. + +01:01:50 We have high risk findings. + +01:01:52 That looks like an, and what number was that? + +01:01:55 We got more findings, mass assignment, unvalidated JSON. + +01:01:59 Yeah. + +01:01:59 And look how short your code was. + +01:02:01 I gave it 62 lines of code here. + +01:02:03 And you are going to have more vulnerabilities than you have lines of code. + +01:02:07 Okay. + +01:02:07 So I'm not going to go into the details, but wow, I just gave it a little bit and it pulled up a whole bunch. + +01:02:11 So I think that that is. 
+
+01:02:13 Did this find more than when you just asked it to review for vulnerabilities?
+
+01:02:18 Yeah, I think so.
+
+01:02:19 I think it did actually.
+
+01:02:20 Because if you put the AI in the right frame of mind, right?
+
+01:02:23 That's incredible.
+
+01:02:24 Well, and I gave it specific things that I wanted it to look for.
+
+01:02:27 So the prompt library has three levels.
+
+01:02:29 So prompt level one, you would add it to your memory or make a Claude skill, but you would make it run 100% of the times that you generate code.
+
+01:02:38 And it takes most of the first two thirds of my most recent book, Alice and Bob Learn Secure Coding, and it has condensed it into a set of prompts.
+
+01:02:47 Oh, that's awesome.
+
+01:02:48 When you build the code, these are the rules for doing so.
+
+01:02:52 Then, so that runs just every single time.
+
+01:02:54 And then it tells you all of its security assumptions and it flags any potential security issues for you automatically.
+
+01:03:01 So every time you generate code, it's like, I need you to know these things.
+
+01:03:03 And so then you can address them.
+
+01:03:05 And then level two prompts are, well, I'm going to build an API or I'm going to build a serverless app or I'm going to do this or I'm going to do that.
+
+01:03:12 And then you fill in the blanks and it helps basically set security requirements before the code's generated.
+
+01:03:19 So it does the first prompt and then that one as like a double check.
+
+01:03:23 Then after you can run the secure code review check.
+
+01:03:26 And then level three is like where you want to get nitty gritty.
+
+01:03:29 Like you're like, I'm doing a user login feature and I want to hash these passwords very securely.
+
+01:03:36 Like, and then it's very specific about exactly how to do that.
+
+01:03:40 And it's free.
+
+01:03:40 Yeah.
+
+01:03:40 People should definitely check this out.
+
+01:03:41 That's very cool.
+
+01:03:42 All right.
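On the "hash these passwords very securely" example: one reasonable standard-library approach is salted PBKDF2 with a constant-time comparison. This is a sketch, not the prompt library's output; the iteration count is illustrative (on the order OWASP's guidance suggests for PBKDF2-SHA256), and dedicated libraries like argon2 or bcrypt are often preferred when available:

```python
import hashlib
import hmac
import os

# Illustrative work factor for PBKDF2-HMAC-SHA256.
ITERATIONS = 600_000

def hash_password(password: str) -> tuple:
    """Return (salt, digest); the salt is unique per user and stored too."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, digest

def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return hmac.compare_digest(candidate, expected)  # constant-time compare

salt, stored = hash_password("correct horse battery staple")
assert verify_password("correct horse battery staple", salt, stored)
assert not verify_password("hunter2", salt, stored)
```

The per-user random salt defeats rainbow tables, the high iteration count slows offline cracking, and `hmac.compare_digest` avoids timing side channels on verification.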
+ +01:03:42 We are out of time, Tanya. + +01:03:44 Thank you so much for being here. + +01:03:45 Final thoughts. + +01:03:46 People want to get going with the new top 10. + +01:03:49 Please go take a look at it. + +01:03:50 So just look up OWASP top 10 and that will be us. + +01:03:53 Like Google's very good at finding us and maybe give it a read and maybe think about it the next time you are building an app. + +01:04:01 Also, maybe consider visiting your local OWASP chapter. + +01:04:05 Next time you want to, you know, search the internet how to do something that is security related, look up OWASP cheat sheets and then authentication, authorization or wherever you're doing. + +01:04:14 There's over 100 cheat sheets. + +01:04:16 We are a community that lives to serve and help you secure your software. + +01:04:20 And come check out me. + +01:04:21 If you look up She Acts Purple, I am all the things, the newsletter, the podcast, the blog, et cetera. + +01:04:26 And I'm also here to help. + +01:04:28 Well, I know you're doing really good stuff. + +01:04:30 I really appreciate your time here. + +01:04:31 Thank you, Tanya. + +01:04:32 Thank you, Michael. + +01:04:34 This has been another episode of Talk Python To Me. + +01:04:37 Thank you to our sponsors. + +01:04:38 Be sure to check out what they're offering. + +01:04:39 It really helps support the show. + +01:04:41 This episode is brought to you by Temporal, durable workflows for Python. + +01:04:45 Write your workflows as normal Python code and Temporal ensures they run reliably, even across crashes and restarts. + +01:04:52 Get started at talkpython.fm/Temporal. + +01:04:56 If you or your team needs to learn Python, we have over 270 hours of beginner and advanced courses on topics ranging from complete beginners to async code, Flask, Django, HTML, and even LLMs. + +01:05:08 Best of all, there's no subscription in sight. + +01:05:11 Browse the catalog at talkpython.fm. 
+
+01:05:14 And if you're not already subscribed to the show on your favorite podcast player, what are you waiting for?
+
+01:05:19 Just search for Python in your podcast player.
+
+01:05:21 We should be right at the top.
+
+01:05:22 If you enjoyed that geeky rap song, you can download the full track.
+
+01:05:25 The link is actually in your podcast player's show notes.
+
+01:05:28 This is your host, Michael Kennedy.
+
+01:05:30 Thank you so much for listening.
+
+01:05:31 I really appreciate it.
+
+01:05:32 I'll see you next time.
+
+01:06:03 Thank you.
+
diff --git a/transcripts/545-owasp-top-10-transcript-final.vtt b/transcripts/545-owasp-top-10-transcript-final.vtt
new file mode 100644
index 0000000..9085a81
--- /dev/null
+++ b/transcripts/545-owasp-top-10-transcript-final.vtt
@@ -0,0 +1,3436 @@
+WEBVTT
+
+00:00:00.000 --> 00:00:03.380
+The OWASP Top 10 just got a fresh update, and there are some big changes.
+
+00:00:03.900 --> 00:00:07.060
+Supply chain attacks, exceptional condition handling, and more.
+
+00:00:07.780 --> 00:00:11.740
+Tanya Janca is back on Talk Python to walk us through every single one of them.
+
+00:00:12.280 --> 00:00:14.000
+And we're not just talking theory here.
+
+00:00:14.140 --> 00:00:19.580
+We're going to turn Claude Code loose on a particularly crappy web project and see what it finds.
+
+00:00:20.100 --> 00:00:20.960
+Let's do this.
+
+00:00:21.320 --> 00:00:26.460
+It's Talk Python To Me, episode 545, recorded April 8th, 2026.
+
+00:00:28.460 --> 00:00:29.820
+Talk Python To Me.
+
+00:00:30.000 --> 00:00:31.100
+Yeah, we ready to roll.
+
+00:00:31.420 --> 00:00:32.480
+Upgrading the code.
+ +00:00:32.640 --> 00:00:33.940 +No fear of getting old. + +00:00:34.040 --> 00:00:35.160 +Async in the air. + +00:00:35.300 --> 00:00:36.560 +New frameworks in sight. + +00:00:36.700 --> 00:00:37.740 +Geeky rap on deck. + +00:00:38.040 --> 00:00:39.740 +Quarth Crew, it's time to unite. + +00:00:39.860 --> 00:00:41.280 +We started in Pyramid. + +00:00:41.380 --> 00:00:42.760 +Cruising old school lanes. + +00:00:43.040 --> 00:00:44.540 +Had that stable base, yeah, sir. + +00:00:44.540 --> 00:00:49.000 +Welcome to Talk Python To Me, the number one Python podcast for developers and data scientists. + +00:00:49.440 --> 00:00:50.860 +This is your host, Michael Kennedy. + +00:00:51.200 --> 00:00:54.840 +I'm a PSF fellow who's been coding for over 25 years. + +00:00:55.400 --> 00:00:56.540 +Let's connect on social media. + +00:00:56.540 --> 00:01:00.020 +You'll find me and Talk Python on Mastodon, Bluesky, and X. + +00:01:00.220 --> 00:01:02.160 +The social links are all in your show notes. + +00:01:02.880 --> 00:01:06.420 +You can find over 10 years of past episodes at talkpython.fm. + +00:01:06.500 --> 00:01:09.920 +And if you want to be part of the show, you can join our recording live streams. + +00:01:10.100 --> 00:01:10.600 +That's right. + +00:01:10.780 --> 00:01:14.160 +We live stream the raw uncut version of each episode on YouTube. + +00:01:14.160 --> 00:01:19.160 +Just visit talkpython.fm/youtube to see the schedule of upcoming events. + +00:01:19.340 --> 00:01:22.980 +Be sure to subscribe there and press the bell so you'll get notified anytime we're recording. + +00:01:23.600 --> 00:01:27.120 +This episode is brought to you by Temporal, durable workflows for Python. + +00:01:27.520 --> 00:01:34.100 +Write your workflows as normal Python code and Temporal ensures they run reliably, even across crashes and restarts. + +00:01:34.620 --> 00:01:37.400 +Get started at talkpython.fm/Temporal. + +00:01:38.000 --> 00:01:39.160 +Hello, Tanya Janca. 
+ +00:01:39.440 --> 00:01:40.940 +Welcome back to Talk Python To Me. + +00:01:40.960 --> 00:01:41.680 +Awesome to have you here. + +00:01:41.900 --> 00:01:42.600 +Oh my gosh, Michael. + +00:01:42.680 --> 00:01:43.680 +It's so nice to see you. + +00:01:43.680 --> 00:01:45.220 +Yeah, it's really great to see you as well. + +00:01:45.480 --> 00:01:54.140 +I remember last time I was nervous you were on the show because you're going to make me feel concerned about all my software running on the internet that now all has all these issues I just realized. + +00:01:55.400 --> 00:01:57.280 +We're back for the 2025 edition. + +00:01:57.840 --> 00:01:59.260 +And I know the year is 2026. + +00:01:59.580 --> 00:02:00.420 +Please don't email me. + +00:02:00.720 --> 00:02:01.960 +I mean, email me, but not for that reason. + +00:02:02.180 --> 00:02:07.380 +But this is the 2025 OWASP top 10, which is pretty new, right? + +00:02:07.580 --> 00:02:13.580 +Yeah, so we released it December 31st, 2025 so that it could stay 2025 on it. + +00:02:13.580 --> 00:02:17.000 +Do you know how much branding has to change if we don't get it out this year? + +00:02:17.080 --> 00:02:17.660 +Let's just go. + +00:02:18.260 --> 00:02:18.860 +That's incredible. + +00:02:19.000 --> 00:02:20.460 +I didn't realize it was that close to the wire. + +00:02:20.660 --> 00:02:23.040 +We had released the release candidate. + +00:02:23.040 --> 00:02:29.580 +So every time it's released, we release a release candidate to say, this is what we're thinking. + +00:02:29.660 --> 00:02:31.340 +And then we ask the community for feedback. + +00:02:32.040 --> 00:02:39.000 +And I don't know if you remember in previous versions before I joined, there was some drama where sometimes the community is like, absolutely not. + +00:02:39.220 --> 00:02:40.000 +You are incorrect. + +00:02:40.000 --> 00:02:42.080 +Or there's vendor influence or whatever. + +00:02:42.080 --> 00:02:43.640 +And then they've had to rework it. 
+
+00:02:43.640 --> 00:02:53.560
+But this time it was the smoothest it's literally ever been since the first one, where all the GitHub issues were great.
+
+00:02:53.860 --> 00:02:57.680
+Like, hey, you know, here's a great example of that attack.
+
+00:02:57.860 --> 00:02:59.060
+Do you want to use it?
+
+00:02:59.100 --> 00:02:59.740
+And we're like, yes.
+
+00:02:59.840 --> 00:03:03.900
+Or, you know, the grammar is wrong here or the, you know, the links are wrong.
+
+00:03:03.900 --> 00:03:08.200
+We had a couple of, well, this one should be number one because that's what my product solves.
+
+00:03:08.560 --> 00:03:10.660
+We're like, well, we'll hear that feedback.
+
+00:03:11.000 --> 00:03:11.360
+Exactly.
+
+00:03:11.700 --> 00:03:13.020
+You failed to mention our product.
+
+00:03:13.100 --> 00:03:14.880
+I'm like, oh, I do see how that happened.
+
+00:03:15.260 --> 00:03:15.480
+Yeah.
+
+00:03:16.160 --> 00:03:22.620
+But other than that, the feedback was overall just overwhelmingly, yes, we agree, which was very validating.
+
+00:03:22.780 --> 00:03:22.980
+Yeah.
+
+00:03:23.020 --> 00:03:27.980
+I'm sure there was a lot of, boy, if I could get the OWASP top 10 to reference my solution.
+
+00:03:28.380 --> 00:03:29.880
+That's some good marketing right there.
+
+00:03:30.240 --> 00:03:30.660
+Incredible.
+
+00:03:30.660 --> 00:03:39.220
+So before we dive into that with a bit of a Python focus, let's just hear a little bit about you, who you are.
+
+00:03:39.320 --> 00:03:41.800
+You've started a podcast since you've been on the show.
+
+00:03:42.240 --> 00:03:43.220
+Tell us about you.
+
+00:03:43.380 --> 00:03:48.220
+So I'm Tanya and I was a software developer that switched into application security.
+
+00:03:48.340 --> 00:03:49.660
+I went to the dark side, Michael.
+
+00:03:50.200 --> 00:03:55.580
+I started speaking at conferences so I could get in free and writing and then ended up writing two books.
+ +00:03:55.580 --> 00:04:02.460 +And now I teach secure coding and like how to use AI securely and all of those things to large companies. + +00:04:02.460 --> 00:04:04.040 +And then speak at conferences. + +00:04:04.040 --> 00:04:05.080 +Serve as like a habit. + +00:04:05.200 --> 00:04:05.940 +I can't stop. + +00:04:06.500 --> 00:04:08.800 +You don't get paid to do that usually. + +00:04:09.460 --> 00:04:12.360 +And so recently I started a podcast called DevSec Station. + +00:04:12.360 --> 00:04:18.600 +And it's five to 10 minute lessons on secure coding and it's free. + +00:04:19.140 --> 00:04:25.980 +I used to have a podcast called We Hack Purple Podcast and I, like my company got bought and absorbed, et cetera. + +00:04:26.180 --> 00:04:27.920 +And eventually the podcast was retired. + +00:04:28.420 --> 00:04:29.980 +I've missed having a podcast, Michael. + +00:04:30.060 --> 00:04:31.960 +I'm sure as a podcast host, you can relate. + +00:04:31.960 --> 00:04:34.920 +It's really nice to be able to create a piece of art and release it. + +00:04:35.100 --> 00:04:43.380 +It's a very interesting medium and you get to just reach out to people or explore ideas that are just interesting to you. + +00:04:43.480 --> 00:04:46.940 +And long as there's a through thread, you can kind of do whatever you want to. + +00:04:47.300 --> 00:04:48.240 +It's great. + +00:04:48.340 --> 00:04:48.920 +Yeah, I love it. + +00:04:48.980 --> 00:04:53.220 +I wanted to teach some lessons and I wanted them to be just really short. + +00:04:53.980 --> 00:04:58.700 +And the first season I'm exploring the idea that the supply chain is changing. + +00:04:58.700 --> 00:05:02.440 +The supply chain security used to be just dependencies that people worried about. + +00:05:02.580 --> 00:05:07.840 +But now I'm like, what if that tech surface is actually very different than we realize? 
+ +00:05:08.000 --> 00:05:14.320 +And so I'm talking about how developers can protect themselves, protect the organizations, protect their build pipelines, et cetera. + +00:05:14.480 --> 00:05:17.620 +And so, yeah, I'm excited to see what people think of it. + +00:05:17.820 --> 00:05:18.900 +Yeah, I encourage people to subscribe. + +00:05:19.040 --> 00:05:19.640 +That's really cool. + +00:05:19.860 --> 00:05:20.800 +Five to 10 minutes. + +00:05:21.260 --> 00:05:22.300 +Is it daily or weekly? + +00:05:22.500 --> 00:05:27.500 +So I released one two weeks ago and I was kind of thinking of releasing one tomorrow, but I need to just get the edits. + +00:05:27.500 --> 00:05:33.000 +It's, I hired some students and they're learning video editing and it's been very exciting. + +00:05:33.420 --> 00:05:43.120 +Yeah, that's the back end of all this type of work that people don't realize is it's an hour or 10 minutes or whatever, but then there's the whole production distribution, et cetera. + +00:05:43.480 --> 00:05:44.520 +I'm just a control freak. + +00:05:44.600 --> 00:05:45.520 +That's the problem, Michael. + +00:05:46.560 --> 00:05:48.240 +I just want to do it myself. + +00:05:48.400 --> 00:05:49.020 +Yes, exactly. + +00:05:49.680 --> 00:05:51.080 +There's a bit of a wind noise thing. + +00:05:51.140 --> 00:05:51.940 +Can we do that again? + +00:05:52.320 --> 00:05:53.620 +No, it is tough. + +00:05:53.620 --> 00:05:56.540 +It is really, really tough to kind of find that balance. + +00:05:56.640 --> 00:05:57.620 +But yeah, people check that out. + +00:05:57.740 --> 00:05:58.320 +That's awesome. + +00:05:58.740 --> 00:05:58.900 +Yeah. + +00:05:59.040 --> 00:06:01.900 +And is SheHacksPurple still your domain? + +00:06:02.340 --> 00:06:02.540 +Yeah. + +00:06:02.680 --> 00:06:08.140 +So if people go to SheHacksPurple.ca, you will find my website and my services and my blog. 
+
+00:06:08.280 --> 00:06:19.820
+So lately I'm blogging a lot about how we can combine behavioral economic interventions, which is like the science of why people make decisions, with the software development ecosystem
+
+00:06:19.820 --> 00:06:30.480
+so that we basically set up secure defaults and other things that just nudge developers to do the secure thing and make the secure thing always the easiest path.
+
+00:06:30.480 --> 00:06:38.500
+And so not how do we manipulate them and pressure them and make them feel bad, but more how can we remove cognitive load that's not necessary?
+
+00:06:39.120 --> 00:06:42.560
+How can we make it more obvious what we hope that they'll do?
+
+00:06:42.560 --> 00:06:46.500
+How can we make it so like it requires effort to do the bad thing?
+
+00:06:46.700 --> 00:06:53.420
+There's a phrase that I got, I think Scott Guthrie was the one who spoke about it at Microsoft, but it doesn't really matter.
+
+00:06:53.940 --> 00:06:58.600
+To help people, help developers and security folks, fall into the pit of success.
+
+00:06:58.600 --> 00:07:02.860
+Like you've got to climb out of the thing you're supposed to do to actively do it wrong.
+
+00:07:02.960 --> 00:07:03.260
+You know what I mean?
+
+00:07:03.400 --> 00:07:03.800
+Exactly.
+
+00:07:04.140 --> 00:07:04.520
+Exactly.
+
+00:07:04.700 --> 00:07:07.960
+So I'm writing a series on that based on a talk I did.
+
+00:07:08.100 --> 00:07:10.740
+So sometimes I'll do a talk and then I'm really excited about it.
+
+00:07:10.740 --> 00:07:13.480
+And I'm like, well, now I can nerd out as much as I want on my blog.
+
+00:07:13.640 --> 00:07:17.040
+I want to talk about what is the OWASP top 10?
+
+00:07:17.360 --> 00:07:17.920
+What is OWASP?
+
+00:07:17.960 --> 00:07:20.280
+What is the OWASP top 10 and all of that?
+
+00:07:20.620 --> 00:07:34.320
+But before we kind of get into that, I do want to set the stage just a little bit, because when people think about Python security, or you pick your language, every language and framework has its little security gotchas.
+
+00:07:34.320 --> 00:07:40.860
+Like in Python, there's, I think it's a YAML parser, that can run arbitrary code.
+
+00:07:41.040 --> 00:07:43.240
+So you got to do like the safe YAML parsing.
+
+00:07:43.340 --> 00:07:47.580
+And then there's pickle, which is a serialization format that can run arbitrary code.
+
+00:07:47.680 --> 00:07:50.440
+So people say, these are the things to look out for.
+
+00:07:50.520 --> 00:07:52.420
+I actually think those are fairly useless.
+
+00:07:52.880 --> 00:08:03.620
+I think the real problem is things that developers do to write code, and they either omit or add actions or steps that they should or shouldn't have done depending on the situation.
+
+00:08:04.040 --> 00:08:06.720
+And really that's what OWASP focuses on, right?
+
+00:08:07.020 --> 00:08:19.560
+Technically it is a nonprofit organization, but what I feel it is, is an international community of thousands and thousands and thousands of people who want there to be more secure software.
+
+00:08:20.100 --> 00:08:22.800
+And so we have chapters where people meet each month.
+
+00:08:22.920 --> 00:08:25.060
+There's over 300 worldwide, Michael.
+
+00:08:25.240 --> 00:08:26.580
+It's amazing.
+
+00:08:27.000 --> 00:08:27.120
+Wow.
+
+00:08:27.120 --> 00:08:27.440
+Yeah.
+
+00:08:27.560 --> 00:08:30.760
+Almost every big city in all of Canada has one.
+
+00:08:30.960 --> 00:08:34.520
+And like, we can't agree on a lot of things in Canada, but apparently we love OWASP.
+
+00:08:34.720 --> 00:08:37.200
+And like in India, oh my gosh, they have so many.
+
+00:08:37.360 --> 00:08:41.040
+And in the United States, like people love, oh, I love OWASP.
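On the pickle point mentioned above: deserializing untrusted pickle data really can run arbitrary code, because a pickled object may name any callable to invoke during loading. Here is a benign, self-contained demonstration; the payload runs harmless arithmetic, but an attacker's would not be harmless:

```python
import pickle

# A benign demo of why pickle.loads on untrusted data is dangerous:
# __reduce__ tells pickle to call eval("40 + 2") during deserialization.
class Sneaky:
    def __reduce__(self):
        return (eval, ("40 + 2",))

payload = pickle.dumps(Sneaky())
result = pickle.loads(payload)  # code executes here, during loading
assert result == 42
```

The same caution motivates `yaml.safe_load` over `yaml.load` in PyYAML, and plain-data formats like JSON for anything that crosses a trust boundary.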
+ +00:08:41.180 --> 00:08:45.040 +And then we have over 100 active open source projects. + +00:08:45.300 --> 00:08:51.180 +So we have free books, free documents, free tools, free software, like everything you can think of. + +00:08:51.200 --> 00:08:52.580 +We're like, oh, I'll build one for you. + +00:08:53.020 --> 00:08:56.940 +And then, you know, there's a Slack channel with thousands of us on there being nerds together. + +00:08:56.940 --> 00:09:08.500 +And then twice a year, we have one in Europe and one in North America where we have a conference and we gather and there's talks and teachings and all the things, right? + +00:09:08.620 --> 00:09:16.040 +And I've been a part of OWASP since 2015 was the first thing I attended. + +00:09:16.240 --> 00:09:19.700 +And then 2016, I was a chapter leader because I don't know how to do small things. + +00:09:19.700 --> 00:09:23.180 +I only know how to do really big things because I was like, oh, I'm in. + +00:09:23.680 --> 00:09:32.040 +And I've just been, so I'm now a lifetime distinguished member because I've volunteered for over 10 years and I'm their biggest fan. + +00:09:32.220 --> 00:09:34.980 +I'm totally president of the non-existent fan club. + +00:09:35.560 --> 00:09:40.400 +And the thing, so the weird thing, Michael, so OWASP does so many amazing things. + +00:09:40.460 --> 00:09:41.540 +There's so many amazing people. + +00:09:41.540 --> 00:09:46.100 +But the thing that we're absolutely most famous for is called the OWASP top 10. + +00:09:46.520 --> 00:09:55.480 +And I volunteered in Norway at the OWASP booth at a conference because I gave a talk and then I had nothing else to do. + +00:09:55.700 --> 00:09:58.880 +And I don't mean to sound rude, but unless it's a security talk, I'm not going. + +00:09:59.000 --> 00:10:00.080 +And it was a developer talk. + +00:10:00.320 --> 00:10:03.360 +And so I went to the other two security talks and then I was like, what am I going to do? 
+ +00:10:03.420 --> 00:10:05.060 +So I volunteered at the OWASP booth. + +00:10:05.060 --> 00:10:08.100 +And every person that walked by went top 10. + +00:10:08.560 --> 00:10:09.160 +That's awesome. + +00:10:09.680 --> 00:10:12.620 +And any of them that knew us, they only knew the top 10. + +00:10:13.020 --> 00:10:14.060 +They didn't know we had chapters. + +00:10:14.300 --> 00:10:16.300 +They didn't know we had any other things to help them. + +00:10:16.660 --> 00:10:19.340 +But literally person after person, top 10. + +00:10:20.400 --> 00:10:21.280 +That's pretty funny. + +00:10:21.380 --> 00:10:24.600 +Like a few weeks later, the top 10 team invited me to join. + +00:10:24.700 --> 00:10:25.940 +And I was like, me? + +00:10:26.780 --> 00:10:28.120 +And I was like, okay. + +00:10:28.460 --> 00:10:32.300 +I knew about the whole project because of the top 10 as well, honestly. + +00:10:32.560 --> 00:10:34.140 +Well, I mean, then we're doing a great job. + +00:10:34.140 --> 00:10:36.760 +And so we finally wrote a new one. + +00:10:36.880 --> 00:10:39.680 +And the team's like, okay, Tanya, you like talking the most. + +00:10:39.720 --> 00:10:40.540 +So you go tell them. + +00:10:41.240 --> 00:10:42.480 +You go tell them it's ready. + +00:10:42.800 --> 00:10:46.980 +Actually, if you go to the OWASP top 10, there's a GitHub link at the top. + +00:10:47.280 --> 00:10:55.640 +And you can see that there's, if you go there, you can actually see historically the 2003, 4, 7, 10, 2017, 2021, and the 2025. + +00:10:55.960 --> 00:10:59.380 +And you go and there's all the markdown files and presentations and et cetera. + +00:10:59.580 --> 00:11:02.860 +So you can kind of get the historical evolution as well. + +00:11:02.860 --> 00:11:04.300 +People file issues. + +00:11:04.480 --> 00:11:08.860 +And then I have to, me or, you know, we're twisting or that's what we do. + +00:11:09.120 --> 00:11:09.920 +We respond. 
+ +00:11:10.240 --> 00:11:11.880 +Sometimes we respond and we're like, no. + +00:11:13.180 --> 00:11:14.840 +This is the third time you've asked. + +00:11:14.960 --> 00:11:15.200 +No. + +00:11:15.420 --> 00:11:16.740 +We talked about it twice. + +00:11:16.740 --> 00:11:17.220 +Yeah. + +00:11:17.220 --> 00:11:23.020 +We try really hard to always be open because the community, sometimes we don't have the data that supports the thing. + +00:11:23.080 --> 00:11:24.660 +But let me tell everyone what it is. + +00:11:24.720 --> 00:11:37.560 +So if you've been hiding under a rock, the OWASP top 10 is an awareness document of the top 10 things, according to the data we gathered and multiple community surveys of risks to web applications. + +00:11:37.560 --> 00:11:42.940 +Most of them relate to all software, but technically this one is about web applications. + +00:11:43.400 --> 00:11:51.540 +And you might not realize, but underneath the next steps, there's three more secret items because we couldn't decide, Michael. + +00:11:52.060 --> 00:11:54.580 +And Vibe coding needed to be on there. + +00:11:54.580 --> 00:11:57.880 +And then there was sort of a tie for the number 10. + +00:11:58.240 --> 00:12:00.420 +So we want to include the one that was the tie. + +00:12:00.840 --> 00:12:05.560 +And then we felt memory safety is still so unbelievably critical. + +00:12:05.780 --> 00:12:07.620 +We had to comment on that as well. + +00:12:07.920 --> 00:12:10.560 +So we had to talk about those as well. + +00:12:10.760 --> 00:12:11.960 +So that was good. + +00:12:12.060 --> 00:12:16.640 +So it's the top 10, but these are three other things for your consideration that are very important. + +00:12:16.920 --> 00:12:17.780 +I did not realize those. + +00:12:17.880 --> 00:12:18.300 +How interesting. 
+
+00:12:18.300 --> 00:12:32.420
+I do think, whenever you release your next version in two, three, four years, whatever, there's going to be a strong AI bent and we're going to get into the AI angle of security in this conversation, which is going to be really fun.
+
+00:12:32.740 --> 00:12:35.120
+But yeah, I think we're just beginning this.
+
+00:12:35.260 --> 00:12:38.120
+Like you called it out with Vibe coding, but there's more to that even.
+
+00:12:38.380 --> 00:12:42.400
+Making the top 10 is complicated because we have to do it based on data.
+
+00:12:43.140 --> 00:12:54.560
+And the data I would like would be the postmortem from security incidents and would be, you know, the AppSec team telling us the things that are happening, not what we actually
+
+00:12:54.580 --> 00:13:00.060
+get, which is a bunch of SAST vendors and DAST vendors telling us what their automated tools are capable of finding.
+
+00:13:00.600 --> 00:13:08.800
+And then a bunch of boutique pen tester companies who are so generous to give us their reports and to try to normalize that for us.
+
+00:13:09.220 --> 00:13:13.480
+And like we end up with millions and millions and millions of records and that's great.
+
+00:13:13.480 --> 00:13:22.160
+But if a SAST tool is really good at finding X, that doesn't necessarily mean that X is the biggest problem in our industry.
+
+00:13:22.400 --> 00:13:31.480
+And so when we put supply chain security on there and expanded it from just being libraries, like, oh, you're using outdated or vulnerable components.
+
+00:13:31.940 --> 00:13:33.420
+Like, yeah, that's bad.
+
+00:13:33.680 --> 00:13:35.520
+But there's also malicious components.
+
+00:13:35.520 --> 00:13:45.200
+There's also you didn't lock down your CI and then the co-op student, like the co-op, or you call them interns, put too many zeros on the Kubernetes deployment.
+
+00:13:45.200 --> 00:13:50.280
+And then you've got a $30,000 bill you weren't expecting or like I've seen it, right?
+
+00:13:50.400 --> 00:13:54.480
+I feel like it's hard to get data that tells the full picture.
+
+00:13:54.480 --> 00:13:58.080
+And we're not allowed to just say, well, this is what the team thinks, right?
+
+00:13:58.100 --> 00:13:59.820
+So that's where the surveys come in.
+
+00:13:59.940 --> 00:14:02.100
+And we're like, this is what we see.
+
+00:14:02.160 --> 00:14:02.880
+This is what we think.
+
+00:14:02.920 --> 00:14:03.420
+Do you agree?
+
+00:14:03.540 --> 00:14:07.840
+And like across the board, everyone was like, supply chain must be on that list.
+
+00:14:07.900 --> 00:14:08.780
+You must expand it.
+
+00:14:08.860 --> 00:14:09.240
+We agree.
+
+00:14:09.400 --> 00:14:12.800
+I think it's the biggest issue of the year, at least of the last six months.
+
+00:14:12.800 --> 00:14:23.780
+If we look at reports like the Verizon breach report and the CrowdStrike and the like the big reports of many, many breaches, if we look at the big ones, the nation state ones,
+
+00:14:24.460 --> 00:14:30.880
+they're supply chain, or in my opinion, if we really look at it, we think about it, they exploited the developer.
+
+00:14:31.120 --> 00:14:33.960
+They compromised a developer within an organization.
+
+00:14:33.960 --> 00:14:37.480
+It gave them access to multiple parts of the supply chain.
+
+00:14:37.700 --> 00:14:40.040
+And then they owned the entire organization.
+
+00:14:40.040 --> 00:14:46.040
+So if you get SQL injection in one app, you got into one database and maybe you could read sensitive data.
+
+00:14:46.620 --> 00:14:48.700
+Maybe you could delete sensitive data.
+
+00:14:49.080 --> 00:14:56.420
+If that database was completely unpatched in a total like terrible mess, then maybe you could take over that server.
+
+00:14:56.800 --> 00:15:06.040
+Then if your network's totally not secure and crappy, which is not exactly that uncommon, like I'll look at, you know, network diagrams with clients and I'm like, oh, that line, is that a firewall?
+
+00:15:06.180 --> 00:15:07.600
+They're like, it's more aspirational.
+
+00:15:07.840 --> 00:15:09.620
+I've heard that many times, right?
+
+00:15:09.620 --> 00:15:17.900
+Then maybe they could pivot and get to a couple of places, but that's like, maybe, maybe, maybe, maybe, maybe you get a little bit, but you compromise a senior developer.
+
+00:15:18.520 --> 00:15:18.620
+Right.
+
+00:15:18.700 --> 00:15:25.040
+And so, yeah, I was really glad when the team agreed that we would do this and then the community supported it.
+
+00:15:25.100 --> 00:15:26.520
+So I was like, yes, win.
+
+00:15:26.920 --> 00:15:26.940
+Yeah.
+
+00:15:27.000 --> 00:15:29.200
+You compromise a developer, then you get.
+
+00:15:29.400 --> 00:15:30.300
+Especially a senior.
+
+00:15:30.600 --> 00:15:31.120
+Yeah, exactly.
+
+00:15:31.240 --> 00:15:36.160
+You get arbitrary code execution on potentially all the stuff that they send out to the world.
+
+00:15:36.160 --> 00:15:37.240
+It's, it's really bad.
+
+00:15:37.320 --> 00:15:40.220
+You know, an example of that would be the LastPass breach.
+
+00:15:40.360 --> 00:15:40.540
+Yeah.
+
+00:15:40.600 --> 00:15:53.060
+And the way that that all started from what I understand is one of the developers had an outdated version of Plex, the like streaming ripped video player, on his home network that was open on the internet.
+
+00:15:53.060 --> 00:15:54.100
+That got taken over.
+
+00:15:54.200 --> 00:15:57.760
+Then they got into the dev machine and then they got everybody's password vaults.
+
+00:15:57.760 --> 00:16:03.360
+It's like, excuse me, because you had a movie player on your home automation network.
+
+00:16:03.500 --> 00:16:04.280
+That's crazy.
+
+00:16:04.480 --> 00:16:10.640
+There's been some recent hacks where the developers download like a plugin or something and it's malicious.
+
+00:16:10.640 --> 00:16:17.160
+And then not only is it trying to steal like secrets off their computer, but then it robs their crypto wallets.
+ +00:16:17.160 --> 00:16:21.160 +Because why don't you just kick someone when they're down like jerks? + +00:16:21.300 --> 00:16:21.960 +It's terrible. + +00:16:22.320 --> 00:16:23.460 +Well, we'll circle back to that. + +00:16:23.460 --> 00:16:29.040 +But I do think the vibe coding side is going to be a big deal in the future and you'll have a little bit of a challenge there. + +00:16:29.120 --> 00:16:33.900 +I think obviously professional developers love to bag on vibe coding. + +00:16:34.020 --> 00:16:35.560 +And I think that that's totally fair. + +00:16:35.880 --> 00:16:39.460 +But the problem is, I think a lot of that is kind of dark matter. + +00:16:39.800 --> 00:16:42.960 +Like you'll never see the people because the people do vibe coding. + +00:16:43.040 --> 00:16:45.400 +They don't know to go and fill out a survey for OWASP. + +00:16:45.660 --> 00:16:47.700 +They don't even know what a line of code looks like. + +00:16:47.740 --> 00:16:49.160 +They're just make me happy. + +00:16:49.300 --> 00:16:50.200 +You know, it's crazy. + +00:16:50.200 --> 00:17:04.240 +And now some companies are building dark factories, which is a term that was new for me, which is where they replace their entire software development team only with complete AI end to end solutions where there's no human in the loop whatsoever. + +00:17:05.260 --> 00:17:12.940 +And oh my gosh, Michael, imagine the security posture of what's being released there and the fact that they don't know what the posture is, right? + +00:17:13.220 --> 00:17:16.860 +But they're getting to market faster than someone else. + +00:17:16.860 --> 00:17:16.980 +Yes. + +00:17:17.360 --> 00:17:22.980 +And so consumers don't know that what they're buying has been put together with duct tape and glue. + +00:17:23.500 --> 00:17:27.120 +So I actually, we didn't talk about this beforehand, but I hope it's okay to mention. + +00:17:27.440 --> 00:17:30.260 +So I'm trying to push a secure coding law in Canada. 
+
+00:17:30.540 --> 00:17:34.100
+So Canada is cute and quaint, just like all the stereotypes.
+
+00:17:34.420 --> 00:17:39.800
+And a citizen is allowed to create a petition if she could get a member of parliament to support her.
+
+00:17:39.920 --> 00:17:42.160
+And after three years of letter writing, one of them did.
+
+00:17:42.440 --> 00:17:42.540
+Wow.
+
+00:17:42.880 --> 00:17:43.240
+Congratulations.
+
+00:17:43.640 --> 00:17:44.040
+Thank you.
+
+00:17:44.040 --> 00:17:47.980
+In the House of Commons, and I have enough signatures, so it's going to go to a vote.
+
+00:17:47.980 --> 00:17:54.240
+And so right now I'm lobbying all the public to try to get them to call their member of parliament and ask them to vote.
+
+00:17:54.320 --> 00:17:54.500
+Yes.
+
+00:17:54.520 --> 00:18:00.360
+Because what happens is my member of parliament will be like, Hey, seven one, one, five says this.
+
+00:18:01.000 --> 00:18:08.960
+We should create a secure coding law for all governmental organizations, have a standard and then, you know, assure the standard, make, make sure there's compliance.
+
+00:18:09.400 --> 00:18:12.080
+And I also like wrote the standard for them and sent it to them.
+
+00:18:12.080 --> 00:18:15.700
+Cause that's what I'm like, I'm like, you don't have to use it, but like, here it is in case you want one.
+
+00:18:16.300 --> 00:18:17.640
+I've been writing letters for years.
+
+00:18:17.640 --> 00:18:18.320
+I'm very annoying.
+
+00:18:18.720 --> 00:18:19.060
+Guess what?
+
+00:18:19.140 --> 00:18:21.320
+Members of parliament don't know what the word cyber means.
+
+00:18:21.420 --> 00:18:22.300
+Like they're very smart.
+
+00:18:22.520 --> 00:18:23.660
+I'm not trying to mock them.
+
+00:18:23.760 --> 00:18:25.640
+I'm not an expert in what they do either.
+
+00:18:25.800 --> 00:18:26.060
+Right.
+
+00:18:26.480 --> 00:18:28.000
+And so I need members.
+ +00:18:28.000 --> 00:18:33.400 +So Canadians, if you're listening, go like, look up petition E seven one, one five. + +00:18:33.500 --> 00:18:36.420 +You'll find me sign it and then call or write your member of parliament. + +00:18:36.560 --> 00:18:44.980 +So if enough of us call, if like 10 or 20, 30 people call the member of parliament or they receive emails, Michael, when the petition comes up, they're like, Oh yeah, my constituents care. + +00:18:45.400 --> 00:18:46.000 +Therefore I do. + +00:18:46.020 --> 00:18:48.800 +And they'll raise their hand and I get one chance. + +00:18:49.340 --> 00:18:54.720 +And I really want a lot of hands going up specifically at least half of 334 people. + +00:18:54.900 --> 00:18:55.360 +Good luck with that. + +00:18:55.360 --> 00:18:56.420 +I hope that that goes through. + +00:18:56.480 --> 00:18:56.800 +That's cool. + +00:18:56.880 --> 00:18:57.980 +We'll know by June. + +00:18:58.220 --> 00:18:58.480 +Yeah. + +00:18:58.500 --> 00:19:01.660 +That's the challenge with legislatures and government in general. + +00:19:01.780 --> 00:19:08.300 +So many of the people, especially elected officials, they're not elected because they're developers or security specialists or whatever. + +00:19:08.440 --> 00:19:08.680 +Right. + +00:19:08.840 --> 00:19:13.060 +But the difference is that you said it's fine because you don't know what they do. + +00:19:13.180 --> 00:19:14.640 +You're not an expert in law or whatever. + +00:19:14.760 --> 00:19:15.660 +That's that is true. + +00:19:15.660 --> 00:19:23.220 +But they have to choose how it's going to work for us through technology, whereas you don't have to choose how law works for them as a tech. + +00:19:23.280 --> 00:19:23.720 +You know what I mean? + +00:19:23.720 --> 00:19:25.320 +Like it's they decide. + +00:19:25.460 --> 00:19:29.900 +So it's I'm not saying that it's their fault or anything, but it is a very tricky thing to balance. + +00:19:30.180 --> 00:19:30.380 +It is. 
+
+00:19:30.460 --> 00:19:37.680
+And this is why I, as like an influential person or whatever, am trying to use my influence for good.
+
+00:19:37.740 --> 00:19:39.700
+And I'm trying to protect Canada.
+
+00:19:39.820 --> 00:19:46.020
+And here's the thing, Michael, is that if Canada creates a law that does this, that is huge momentum for every other country.
+
+00:19:46.020 --> 00:19:50.180
+And Canada was one of the first countries to have privacy laws.
+
+00:19:50.180 --> 00:19:51.480
+Like we really led the way in that.
+
+00:19:51.480 --> 00:19:56.260
+We really have led the way in like laws for quantum as well.
+
+00:19:56.580 --> 00:20:04.100
+And like, we're not really used to being that. We're just, you know, struggling along, and we could lead the way in this.
+
+00:20:04.100 --> 00:20:04.320
+Right.
+
+00:20:04.320 --> 00:20:07.440
+And then that means other countries can say, well, they have one.
+
+00:20:07.440 --> 00:20:09.460
+Like, do we really want to be behind Canada?
+
+00:20:09.460 --> 00:20:10.540
+I mean, come on.
+
+00:20:10.540 --> 00:20:11.300
+We love Canada.
+
+00:20:11.300 --> 00:20:11.860
+Canada is awesome.
+
+00:20:11.860 --> 00:20:13.060
+Yeah, we're sweet and we're wonderful.
+
+00:20:13.060 --> 00:20:14.740
+And we mean well all the time.
+
+00:20:16.020 --> 00:20:18.580
+This portion of Talk Python To Me is sponsored by Temporal.
+
+00:20:18.580 --> 00:20:25.460
+Ever since I had Mason Egger on the podcast for episode 515, I've been fascinated with durable workflows in Python.
+
+00:20:25.940 --> 00:20:30.620
+That's why I'm thrilled that Temporal has decided to become a podcast sponsor since that episode.
+
+00:20:30.960 --> 00:20:39.200
+If you've built background jobs or multi-step workflows, you know how messy things get with retries, timeouts, partial failures, and keeping state consistent.
+
+00:20:39.820 --> 00:20:45.000
+I'm sure many of you have written brittle code to keep the workflow moving and to track when you run into problems.
+
+00:20:45.000 --> 00:20:46.360
+But it's trickier than that.
+
+00:20:46.360 --> 00:20:51.560
+What if you have a long-running workflow and you need to redeploy the app or restart the server while it's running?
+
+00:20:52.120 --> 00:20:55.460
+This is where Temporal's open source framework is a game changer.
+
+00:20:56.140 --> 00:21:10.400
+You write workflows as normal Python code and Temporal ensures that they execute reliably, even across crashes, restarts, or long-running processes while handling retries, state, and orchestration for you so you don't have to build and maintain that logic yourself.
+
+00:21:10.400 --> 00:21:16.340
+You may be familiar with writing asynchronous code using the async and await keywords in Python.
+
+00:21:16.840 --> 00:21:25.100
+Temporal's brilliant programming model leverages the exact same programming model that you are familiar with but uses it for durability, not just concurrency.
+
+00:21:25.840 --> 00:21:28.080
+Imagine writing await workflow.sleep.
+
+00:21:28.080 --> 00:21:29.960
+timedelta, 30 days.
+
+00:21:30.300 --> 00:21:31.020
+Yes, seriously.
+
+00:21:31.300 --> 00:21:32.220
+Sleep for 30 days.
+
+00:21:32.380 --> 00:21:33.040
+Restart the server.
+
+00:21:33.280 --> 00:21:34.280
+Deploy new versions of the app.
+
+00:21:34.520 --> 00:21:34.940
+That's it.
+
+00:21:35.140 --> 00:21:36.280
+Temporal takes care of the rest.
+
+00:21:36.820 --> 00:21:41.320
+Temporal is used by teams at Netflix, Snap, and NVIDIA for critical production systems.
+
+00:21:41.440 --> 00:21:44.560
+Get started with the open source Python SDK today.
+
+00:21:44.860 --> 00:21:47.280
+Learn more at talkpython.fm/Temporal.
+
+00:21:47.560 --> 00:21:49.600
+The link is in your podcast player's show notes.
+
+00:21:50.000 --> 00:21:52.020
+Thank you to Temporal for supporting the show.
+
+00:21:52.020 --> 00:22:03.500
+Let's shift just a little bit to maybe people who don't necessarily mean well, the people who might exploit, you know, broken access control or other types of things.
+
+00:22:03.720 --> 00:22:13.660
+And to aid us here, going through the OWASP top 10, I've come up with a little example of, well, this is what some of the concrete examples might look like.
+
+00:22:13.760 --> 00:22:16.100
+Maybe I'll put a link to this and I'll reference them.
+
+00:22:16.220 --> 00:22:18.520
+And I don't know how often I'll totally use this.
+
+00:22:18.660 --> 00:22:20.700
+But let's start with this one.
+
+00:22:20.700 --> 00:22:23.540
+And if I understand it correctly, you all don't like suspense.
+
+00:22:24.060 --> 00:22:27.420
+It's the worst comes first, then the second worst, and then the third worst.
+
+00:22:27.620 --> 00:22:29.560
+Yeah, we go in order of this.
+
+00:22:29.920 --> 00:22:32.720
+So this is, it causes lots and lots of damage.
+
+00:22:33.060 --> 00:22:35.220
+It's not that hard to find or exploit.
+
+00:22:35.540 --> 00:22:37.080
+And it's everywhere.
+
+00:22:37.400 --> 00:22:38.000
+It's everywhere.
+
+00:22:38.380 --> 00:22:43.060
+When I talk to people that do pen testing, they're like, yeah, I find this basically every time.
+
+00:22:43.680 --> 00:22:51.060
+Like one of my friends, Katie Paxton-Fear, she does API content and pen testing and bug bounty and stuff.
+
+00:22:51.060 --> 00:22:58.540
+And she said, I have never not once found broken access control in an API, like never once.
+
+00:22:59.160 --> 00:23:08.320
+And it's really hard to get right, Michael, because every single page, every single record, every single access, we should check that the role is allowed.
+
+00:23:08.320 --> 00:23:11.180
+And that they still are that role, right?
+
+00:23:11.240 --> 00:23:15.000
+So we continue to make sure the session's accurate and then grant access.
+
+00:23:15.000 --> 00:23:18.120
+And unfortunately, we just forget to ask, a lot.
+
+00:23:18.420 --> 00:23:23.260
+Or we return the entire record set, and we'll just sort it out on the front end.
+
+00:23:23.340 --> 00:23:25.440
+And the malicious actor is like, thanks for the data set.
+
+00:23:26.680 --> 00:23:28.680
+That's so sweet of you to give that to me.
+
+00:23:29.220 --> 00:23:32.200
+And we just screwed up so much, so much.
+
+00:23:32.400 --> 00:23:35.440
+I would like to point out that these are not a single vulnerability.
+
+00:23:35.600 --> 00:23:38.020
+It's not like this is the number one issue.
+
+00:23:38.220 --> 00:23:39.740
+They're like categories, right?
+
+00:23:39.840 --> 00:23:49.740
+It's like violations of these could be you didn't use least privilege, or you bypassed a control check, or you didn't put access control on the delete part of the API.
+
+00:23:49.900 --> 00:23:52.360
+You know, like there's a bunch of things that fall into each one of these, right?
+
+00:23:52.400 --> 00:23:52.940
+It's a category.
+
+00:23:52.940 --> 00:23:54.340
+They're all a bucket.
+
+00:23:55.040 --> 00:24:00.140
+And when we looked at the data, poor code quality was a bucket.
+
+00:24:00.260 --> 00:24:03.360
+I'm like, no, because all of the things go into that bucket, right?
+
+00:24:03.500 --> 00:24:04.360
+The OWASP top one.
+
+00:24:04.500 --> 00:24:05.240
+OWASP top one.
+
+00:24:05.340 --> 00:24:09.900
+And then also the mitigation advice is, what if you sucked less, right?
+
+00:24:10.000 --> 00:24:12.820
+Like there's no constructive feedback to poor code quality.
+
+00:24:12.940 --> 00:24:14.020
+It's not specific enough.
+
+00:24:14.180 --> 00:24:16.220
+So we didn't want buckets like that.
+
+00:24:16.280 --> 00:24:17.260
+Keep it actionable, right?
+
+00:24:17.380 --> 00:24:17.900
+Yeah, exactly.
+
+00:24:18.040 --> 00:24:21.420
+If it's not actionable, it's not worth raising awareness about, we felt.
+
+00:24:21.420 --> 00:24:24.160
+And broken access control, Michael, it's everywhere.
+
+00:24:24.360 --> 00:24:28.100
+And I wish there was like a product that you could buy that could just do this for you.
+
+00:24:28.160 --> 00:24:34.960
+So you can buy authentication products that manage session and identity like really, really, really well, right?
+
+00:24:35.000 --> 00:24:38.180
+And people buy the crap out of them because they work really well.
+
+00:24:38.320 --> 00:24:48.120
+I'd like to be able to buy an access control tool that was as easy to implement as, I'm going to try not to name brands, but you know the products, right?
+
+00:24:48.120 --> 00:24:57.100
+People pay a lot of money for Okta, you know, or like Active Directory, you know, Cognito, because they work, right?
+
+00:24:57.140 --> 00:24:58.020
+And they work well.
+
+00:24:58.240 --> 00:25:04.080
+And the less painful they are to implement, the more likely, like they're willing to pay more for that, right?
+
+00:25:04.140 --> 00:25:08.600
+And so if we could solve this issue, like I think that could be pretty huge.
+
+00:25:08.940 --> 00:25:09.000
+Yeah.
+
+00:25:09.020 --> 00:25:12.300
+I got a couple of examples here that were, they're not obvious.
+
+00:25:12.660 --> 00:25:15.220
+Some are obvious, kind of like, here's a Django example.
+
+00:25:15.220 --> 00:25:16.660
+We might see this later.
+
+00:25:16.960 --> 00:25:25.160
+So you might have an admin endpoint and it has @login_required, which is a decorator in Django that will do the validation before the function even runs.
+
+00:25:25.500 --> 00:25:26.920
+And so you look at it like, oh yeah, this is fine.
+
+00:25:27.000 --> 00:25:30.460
+It's using authentication here, but it's not using authorization.
+
+00:25:30.460 --> 00:25:33.620
+It's not checking that the person necessarily is an admin.
+
+00:25:33.780 --> 00:25:35.420
+It's just that they're logged in, right?
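The authentication-versus-authorization gap being described here can be sketched framework-agnostically. This is plain Python, not Django's actual decorators; `require_role`, `Forbidden`, and the dict-based users are invented for illustration, but the shape mirrors a `@login_required`-style check that also demands a role:

```python
from functools import wraps

class Forbidden(Exception):
    pass

def require_role(role):
    # Demand a specific role (authorization), not just a valid login.
    def decorator(view):
        @wraps(view)
        def wrapper(user, *args, **kwargs):
            if not user.get("authenticated"):       # authentication check
                raise Forbidden("not logged in")
            if role not in user.get("roles", ()):   # authorization check
                raise Forbidden(f"requires role {role!r}")
            return view(user, *args, **kwargs)
        return wrapper
    return decorator

@require_role("admin")
def admin_dashboard(user):
    return "sensitive admin stuff"

admin = {"authenticated": True, "roles": ["admin"]}
viewer = {"authenticated": True, "roles": ["viewer"]}  # logged in, still not allowed

print(admin_dashboard(admin))   # → sensitive admin stuff
try:
    admin_dashboard(viewer)
except Forbidden as e:
    print("blocked:", e)
```

The bug in the transcript's example is the missing second `if`: the viewer is authenticated, so a login-only check would have waved them through.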
+
+00:25:35.780 --> 00:25:37.260
+Like that's a real simple example.
+
+00:25:37.540 --> 00:25:37.800
+Yep.
+
+00:25:38.060 --> 00:25:39.260
+And I see the problem.
+
+00:25:39.400 --> 00:25:43.980
+Another one would be if you say we're going to let people read and write files.
+
+00:25:43.980 --> 00:25:46.320
+Maybe it's like a WordPress type thing or something.
+
+00:25:46.560 --> 00:25:56.920
+But then if they can put dot dot in their path and break out, let me read the file dot dot slash, dot dot slash, etc slash passwd, and read the passwords or whatever, or usernames.
+
+00:25:57.100 --> 00:26:01.300
+This is the type of stuff that falls under broken access control, or just no login checks at all.
+
+00:26:01.480 --> 00:26:05.460
+I literally did this yesterday, Michael, because someone was like, hey, go get this file from there.
+
+00:26:05.560 --> 00:26:07.400
+And then I go in the folder and it's not there.
+
+00:26:07.600 --> 00:26:08.840
+Like the link wasn't correct.
+
+00:26:08.900 --> 00:26:11.520
+So I just went through the web directory with that.
+
+00:26:11.520 --> 00:26:14.360
+Like, but they wanted to send me the file.
+
+00:26:14.500 --> 00:26:17.860
+So just to be clear, like, like they sent me and told me to go get it.
+
+00:26:18.060 --> 00:26:19.600
+I wasn't stealing anything.
+
+00:26:19.660 --> 00:26:22.060
+And I didn't end up eventually finding it either.
+
+00:26:22.540 --> 00:26:25.300
+So then they had to send me another link that was correct.
+
+00:26:25.400 --> 00:26:27.180
+But I was like, oh, I'll just like not waste their time.
+
+00:26:27.220 --> 00:26:29.840
+I'll just go look for myself because it's that easy.
+
+00:26:30.260 --> 00:26:30.420
+Incredible.
+
+00:26:30.600 --> 00:26:33.780
+It's like, I'm going to use their tools, but not in a way that they necessarily expected.
+
+00:26:33.920 --> 00:26:35.700
+Well, I mean, they could have just sent me the right link.
+
+00:26:35.800 --> 00:26:36.120
+Exactly.
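The dot-dot-slash trick just described is path traversal, and a minimal stdlib guard looks like the sketch below. The base directory and filenames are invented for the example; the key idea is resolving the combined path and refusing anything that lands outside the allowed root (`Path.is_relative_to` needs Python 3.9+):

```python
from pathlib import Path

def resolve_safely(base_dir, requested):
    """Resolve a user-supplied filename; refuse anything escaping base_dir."""
    base = Path(base_dir).resolve()
    target = (base / requested).resolve()
    if not target.is_relative_to(base):
        raise PermissionError(f"path traversal blocked: {requested!r}")
    return target

# Inside the allowed directory: fine.
print(resolve_safely("/var/www/uploads", "report.txt"))

# ../../etc/passwd style escape: blocked before any file is opened.
try:
    resolve_safely("/var/www/uploads", "../../etc/passwd")
except PermissionError as e:
    print(e)
```

Doing the check on the *resolved* path matters: naive string filtering misses encodings and nested `..` segments that only show up after normalization.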
+
+00:26:36.240 --> 00:26:37.900
+They should have just sent you the right link.
+
+00:26:37.900 --> 00:26:45.640
+Yeah, I have some I cannot recount here, but my dad, I've got to help take care of him and stuff now these days.
+
+00:26:45.640 --> 00:26:48.140
+And he can't do a lot of his own paperwork and things.
+
+00:26:48.220 --> 00:26:54.560
+And so I've had to do some crazy stuff to get access to or to help him fill out something that, yeah, it's insane.
+
+00:26:55.060 --> 00:26:57.560
+OK, let's go on to number two.
+
+00:26:57.960 --> 00:27:02.840
+This is just setting the wrong configurations, not following the hardening guide, not doing patching.
+
+00:27:02.940 --> 00:27:06.460
+And this is so easy for a malicious actor to find because there's scanners.
+
+00:27:06.460 --> 00:27:12.860
+I joke whenever anyone's like, oh, like, you know, we don't want to do a pen test because it might break our thing.
+
+00:27:12.940 --> 00:27:15.660
+And I'm like, well, you're actually having a penetration test done all the time.
+
+00:27:15.780 --> 00:27:18.040
+If you're on the Internet, you just aren't receiving the report.
+
+00:27:18.380 --> 00:27:21.520
+That is absolutely so true and so disturbing.
+
+00:27:21.720 --> 00:27:34.520
+If you're out there listening and you have something on the Internet, API, website, whatever, and you have not just tailed the log of it and just seen slash wp slash admin slash this slash that just coming at it left and right.
+
+00:27:34.520 --> 00:27:36.260
+You're like, what is going on?
+
+00:27:36.300 --> 00:27:37.960
+Because it doesn't show up in your analytics.
+
+00:27:37.960 --> 00:27:49.400
+I actually had someone find something on my website that I was surprised about because I had a user from my blog from long ago and they could see my user and then it gave my
+
+00:27:49.400 --> 00:27:57.460
+email address, and it was actually a personal email address, because I had a backup admin account and I used my personal email, and he is like, did you want that on there?
+
+00:27:57.460 --> 00:28:01.340
+I'm like, no, only my mom and my dad email me there because I'm bad.
+
+00:28:01.340 --> 00:28:05.420
+And if my parents write me, I should write back.
+
+00:28:06.060 --> 00:28:06.360
+Right.
+
+00:28:07.160 --> 00:28:11.560
+And so I'm like, oh, that's the personal email that I'm supposed to answer on time.
+
+00:28:12.040 --> 00:28:19.920
+He wrote me and helped me like turn off that setting that I had no idea about despite having multiple security plugins and having run an audit.
+
+00:28:20.020 --> 00:28:20.740
+I'd missed that.
+
+00:28:20.740 --> 00:28:24.680
+This one, for like Python people, could be Django with debug equals true.
+
+00:28:24.820 --> 00:28:30.680
+Now this is probably the most used misconfiguration example for Django apps out there.
+
+00:28:30.740 --> 00:28:34.160
+You're like, well, of course, Michael, I know you don't set debug true in production.
+
+00:28:34.400 --> 00:28:35.000
+You do it all the time.
+
+00:28:35.000 --> 00:28:35.720
+People do it.
+
+00:28:35.840 --> 00:28:36.220
+You're right.
+
+00:28:36.320 --> 00:28:42.480
+And then, too, there's like 10 other settings that should be set in production that are not in production in Django.
+
+00:28:42.720 --> 00:28:43.980
+Like HSTS.
+
+00:28:44.440 --> 00:28:45.260
+Yes, please.
+
+00:28:45.420 --> 00:28:53.520
+And a bunch of other, you know, do not allow me to be put into an iframe and all sorts of other things, content security policies and security headers.
+
+00:28:53.800 --> 00:28:54.620
+Yes, exactly.
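The Django hardening settings being listed here look roughly like this in a production settings.py. The setting names are Django's real ones (debug mode, HSTS, the iframe/clickjacking header, secure cookies), but the values are an illustrative sketch, not a complete hardening checklist:

```python
# settings.py fragment: the misconfigurations discussed above, done right.

DEBUG = False  # never True in production; debug pages leak config and code

# HSTS: tell browsers to refuse plain HTTP for this site going forward
SECURE_HSTS_SECONDS = 31536000   # one year
SECURE_HSTS_INCLUDE_SUBDOMAINS = True
SECURE_SSL_REDIRECT = True       # bounce any http:// request to https://

# "Do not allow me to be put into an iframe" (clickjacking defense)
X_FRAME_OPTIONS = "DENY"

# Assorted header and cookie hardening
SECURE_CONTENT_TYPE_NOSNIFF = True
SESSION_COOKIE_SECURE = True
CSRF_COOKIE_SECURE = True
```

Django's own `manage.py check --deploy` command audits settings like these, which is one way to catch the "debug true in production" slip automatically.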
+
+00:28:54.880 --> 00:28:56.380
+This is what happened with Claude Code.
+
+00:28:56.560 --> 00:29:09.580
+And this is how they, they essentially allowed debug in production, like by not suppressing their map file and then also not having it as part of their git ignore, which is essentially having debug mode in prod.
+
+00:29:09.840 --> 00:29:11.560
+And that's how they lost their source code.
+
+00:29:11.880 --> 00:29:14.360
+Just to be clear, I'm not shaming the developer that did that.
+
+00:29:14.580 --> 00:29:17.260
+They probably didn't have a checklist for that person.
+
+00:29:17.260 --> 00:29:21.560
+They probably didn't have anything that scanned to tell them that those settings were incorrect.
+
+00:29:21.920 --> 00:29:22.320
+Right.
+
+00:29:22.780 --> 00:29:25.020
+They probably don't even have a policy that clarifies.
+
+00:29:25.120 --> 00:29:26.900
+They're like, oh, they should just know not to do that.
+
+00:29:27.160 --> 00:29:27.300
+Right.
+
+00:29:27.340 --> 00:29:28.580
+And then they were rushed.
+
+00:29:28.660 --> 00:29:30.020
+Probably they're in a hurry.
+
+00:29:30.460 --> 00:29:30.620
+And then.
+
+00:29:31.000 --> 00:29:31.120
+Yeah.
+
+00:29:31.140 --> 00:29:34.420
+They're shipping three or four times a day, which I appreciate, but at the same time.
+
+00:29:34.660 --> 00:29:34.900
+Yeah.
+
+00:29:34.960 --> 00:29:37.700
+And then now very bad things are happening.
+
+00:29:38.060 --> 00:29:41.120
+It is going to be the most audited code that has ever happened.
+
+00:29:41.280 --> 00:29:41.560
+Michael.
+
+00:29:41.560 --> 00:29:45.040
+I've seen so many videos parsing that stuff apart.
+
+00:29:45.160 --> 00:29:45.660
+It's wild.
+
+00:29:45.980 --> 00:29:46.380
+Yeah.
+
+00:29:46.420 --> 00:29:54.660
+For people who don't know, I'm sure people heard Claude Code got leaked, but basically the map file in JavaScript says, here's the minified version.
+

00:29:54.740 --> 00:29:59.100
But if you want to show the full source version for helpful debugging, here it is.

00:29:59.180 --> 00:30:00.300
And there's how you get to it.

00:30:00.320 --> 00:30:01.380
And that apparently got shipped.

00:30:01.720 --> 00:30:02.880
And people were just like, you know what?

00:30:02.960 --> 00:30:04.900
Why don't we find out what those files are actually?

00:30:04.900 --> 00:30:13.740
And it was two security misconfigurations because one of them would have stopped it from going out and the other one would have, like, not had it be in the package in the first place.

00:30:14.080 --> 00:30:14.200
Right.

00:30:14.280 --> 00:30:16.020
And it's number two.

00:30:16.160 --> 00:30:23.180
And it happens even to the really, really, really, really, really high profile, you know, high security assurance requiring places.

00:30:23.640 --> 00:30:28.800
I have a third one on this list that I think will really surprise people like for real.

00:30:29.160 --> 00:30:30.280
So imagine this.

00:30:30.280 --> 00:30:35.220
I've got a self-hosted app and I'm going to run it in Docker.

00:30:35.600 --> 00:30:36.560
You know, it could be Kubernetes.

00:30:36.700 --> 00:30:37.460
It could be whatever.

00:30:37.600 --> 00:30:40.360
But I'm going to run it in Docker on my server.

00:30:40.820 --> 00:30:44.440
And it has both a web interface and a database.

00:30:44.640 --> 00:30:47.800
And the database is running on the default port and so on.

00:30:47.860 --> 00:30:48.560
So that's all fine.

00:30:48.640 --> 00:30:51.080
But it's in Docker and everything's locked down.

00:30:51.160 --> 00:30:55.780
And so what you could do is you can use this thing called UFW, Uncomplicated Firewall, on Linux.

00:30:55.980 --> 00:30:58.400
Turn that on and say block or don't block.

00:30:58.500 --> 00:31:00.180
Only allow my web port.
+

00:31:00.280 --> 00:31:08.580
And in your Docker compose file, you often, or even just Docker statements, you see map, say Postgres, port 5432 to 5432.

00:31:09.200 --> 00:31:09.560
Guess what?

00:31:09.620 --> 00:31:13.840
That's actually open on the Internet, probably with the default password on that database.

00:31:13.840 --> 00:31:21.540
Because if you look at the Docker docs and you go to the bottom, it says Uncomplicated Firewall is a frontend that ships with Debian and Ubuntu to let you manage your firewall.

00:31:21.960 --> 00:31:25.820
Docker and UFW use firewall rules in ways that make them incompatible.

00:31:26.020 --> 00:31:30.020
When you publish container ports with Docker, the traffic gets diverted before those firewall rules apply.

00:31:30.020 --> 00:31:30.660
So guess what?

00:31:30.700 --> 00:31:31.840
That's open on the Internet.

00:31:32.280 --> 00:31:33.600
Holy smokes.

00:31:33.860 --> 00:31:39.400
The thing is, it's so common to just see this port on the container mapped to this port on the server.

00:31:39.640 --> 00:31:44.700
And if you're thinking that UFW as your firewall is going to save you, it's actually just open on the Internet.

00:31:45.020 --> 00:31:46.020
Like I didn't realize that.

00:31:46.020 --> 00:31:47.600
And so, for example, what do you do?

00:31:47.660 --> 00:31:51.880
Well, you say localhost colon my port on the server.

00:31:52.140 --> 00:31:54.000
So you're only listening locally.

00:31:54.160 --> 00:31:55.500
Or you just don't do that mapping at all.

00:31:55.580 --> 00:32:00.000
But that's a really subtle and sneaky one that people should be aware of.

00:32:00.220 --> 00:32:02.380
The thing is no one can memorize all of this.

00:32:02.600 --> 00:32:03.840
And so what is the answer, Michael?

00:32:04.420 --> 00:32:04.780
Checklists?

00:32:05.080 --> 00:32:05.880
Checklists are good.

00:32:06.200 --> 00:32:06.440
Scanners?
+ +00:32:06.800 --> 00:32:07.600 +Scanners are good. + +00:32:07.600 --> 00:32:13.020 +Honestly, I think the modern top tier AI agentic tools are really good. + +00:32:13.360 --> 00:32:16.160 +They find a surprising amount of these things. + +00:32:16.360 --> 00:32:23.540 +They find them if you ask them to find them, or they make it part of the code that they give you when you just ask for it. + +00:32:23.600 --> 00:32:24.860 +Because people just say, I want the app. + +00:32:24.920 --> 00:32:26.580 +They don't say, I want a secure app necessarily. + +00:32:26.980 --> 00:32:29.080 +And well, it's more efficient to not worry about the security. + +00:32:29.280 --> 00:32:30.080 +We'll save you some tokens. + +00:32:30.080 --> 00:32:32.960 +Even if you just say, I want a secure app. + +00:32:33.060 --> 00:32:36.880 +So I gave a conference talk two weeks ago at RSA called Insecure Vibes. + +00:32:37.300 --> 00:32:44.100 +In the demo that I recorded in advance that was not part of the slides when I gave my live presentation, but it's on my YouTube. + +00:32:44.580 --> 00:32:49.060 +I just asked Claude, I'm like, can you make a login function that's for an insulin pump? + +00:32:49.480 --> 00:32:51.880 +So this is a medical device that needs to be really secure. + +00:32:52.240 --> 00:32:53.220 +And it does it. + +00:32:53.300 --> 00:32:55.340 +And then after I'm like, analyze it for vulnerabilities. + +00:32:55.960 --> 00:32:59.240 +And multiple AIs found critical vulnerabilities in it. + +00:32:59.240 --> 00:33:00.700 +So I asked for it to be secure. + +00:33:00.860 --> 00:33:02.480 +So you can't just say, I want to be secure. + +00:33:02.720 --> 00:33:05.180 +You have to say, and this is what secure means. + +00:33:05.460 --> 00:33:08.260 +I do think you can find a lot if you use the tools in the right way. + +00:33:08.320 --> 00:33:09.700 +But like you said, you've got to ask. + +00:33:09.820 --> 00:33:11.560 +And it's a proper step. 
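[Editor's note] The Docker port-publishing pitfall from a few minutes ago comes down to how the ports are declared. A compose sketch of the localhost fix; the service and image names here are hypothetical, for illustration only:

```yaml
# docker-compose.yml -- hypothetical services, for illustration.
services:
  web:
    image: myapp:latest
    ports:
      - "8000:8000"            # published on all interfaces: reachable from outside
  db:
    image: postgres:16
    ports:
      # A bare "5432:5432" bypasses UFW rules and listens publicly.
      - "127.0.0.1:5432:5432"  # bound to loopback: only this host can connect
```

Another option is to drop the database's `ports:` entry entirely; containers on the same compose network can still reach it by service name (`db:5432`) without publishing anything to the host.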
+

00:33:11.920 --> 00:33:13.480
Software supply chain failures.

00:33:13.580 --> 00:33:14.200
Number three.

00:33:14.420 --> 00:33:15.400
This is the expansion.

00:33:15.620 --> 00:33:20.800
So this used to be vulnerable and outdated components, which is part of your software supply chain.

00:33:20.880 --> 00:33:25.700
But Michael, I'm sure that I've told you this before, but for people that haven't heard me, blah, blah, blah, about it.

00:33:25.700 --> 00:33:30.920
Every single thing that you use to create and maintain your software is part of your supply chain.

00:33:31.320 --> 00:33:43.740
So that includes your browser, the plugins in the browser, the sandbox you created, your CI, all the settings in the CI, where you're getting your libraries and packages from, how you're getting them.

00:33:43.740 --> 00:33:46.940
So do they maintain integrity across the wire when you get them?

00:33:47.220 --> 00:33:48.860
Could it be that you got something else?

00:33:49.680 --> 00:33:55.880
Like every single thing that you're using to maintain and create is part of the supply chain.

00:33:56.020 --> 00:33:58.680
And so we need to protect the whole thing.

00:33:59.220 --> 00:34:04.200
And like we were saying earlier, developers themselves are becoming targets of malicious actors.

00:34:04.200 --> 00:34:12.660
We need to find ways to defend the developers themselves, protect them, make them safer doing their jobs, right?

00:34:12.720 --> 00:34:19.660
And help them find ways to secure the whole supply chain that's not too painful because they still need flexibility in order to be creative.

00:34:20.180 --> 00:34:23.960
So some Python things that you can do concretely here is pin your dependencies.

00:34:24.680 --> 00:34:28.300
You can use pip-compile or you can use uv lock files.

00:34:28.440 --> 00:34:31.700
There's all sorts of things that are possible there.
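[Editor's note] The pin-your-dependencies advice maps to a couple of concrete commands. A hedged sketch of the flow, assuming a `pyproject.toml` for uv or a `requirements.in` for pip-tools:

```shell
# With uv: resolve the project's dependencies into a lock file, then
# install exactly what the lock file says.
uv lock
uv sync

# With pip-tools: compile loose requirements into fully pinned ones.
pip-compile requirements.in -o requirements.txt
pip install -r requirements.txt
```

Either way, the lock file (uv.lock or the compiled requirements.txt) is what gets committed and promoted through environments, so prod installs exactly what was tested.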
+

00:34:31.700 --> 00:34:37.660
And then you can also, I think the other side that we haven't mentioned, Tanya, is like known vulnerabilities in packages.

00:34:38.000 --> 00:34:50.660
I think a lot of people, I would say over 95% of the people that install libraries from PyPI, they don't even check whether or not there's a vulnerability in that package before they install it.

00:34:50.780 --> 00:34:51.700
I would like to see.

00:34:51.860 --> 00:34:56.100
So in 2022, the first company announced this idea of reachability.

00:34:56.580 --> 00:34:58.220
So let's say you want to do math.

00:34:58.300 --> 00:34:59.740
So you install a math library.

00:34:59.740 --> 00:35:02.540
We don't actually want to do all of math, right?

00:35:02.660 --> 00:35:07.760
We probably just want to do calculus, but maybe the vulnerability is in the statistics function, right?

00:35:07.880 --> 00:35:13.260
And so when your code calls all the calculus functions, you're like, woo, derivatives.

00:35:14.240 --> 00:35:18.480
You're not actually, there's no reachable path from your code to the vulnerability.

00:35:19.120 --> 00:35:24.000
Most of the time, that means it's not exploitable, except for if it's Log4j, then you're just screwed.

00:35:24.200 --> 00:35:26.060
Just to be clear, you're just in trouble, right?

00:35:26.060 --> 00:35:31.400
But for most things, like 99.9% of the time, then you're fine if there's no reachability.

00:35:31.660 --> 00:35:39.720
And so software composition analysis tools, sometimes called supply chain security tools, when the marketing teams got a little out of hand.

00:35:39.980 --> 00:35:47.780
I feel like if you only do one of the 19 attack surfaces within the supply chain, you don't get to call yourself a supply chain tool, but I digress.

00:35:47.780 --> 00:36:17.760
I have strong feels.

00:36:17.760 --> 00:36:18.460
That's so overwhelming.
+

00:36:18.580 --> 00:36:19.800
I'm just not even going to look at it.

00:36:20.120 --> 00:36:24.780
I have a known vulnerability in one of the packages that I am shipping to production.

00:36:25.220 --> 00:36:28.180
I think it's a PDF package or something like that.

00:36:28.220 --> 00:36:28.760
I can't remember.

00:36:29.100 --> 00:36:33.580
And I scan all my builds with pip-audit and it will fail the build.

00:36:33.660 --> 00:36:40.280
So I have to ignore it because it is a vulnerability when you call a path that I don't call and you're running on Windows.

00:36:40.280 --> 00:36:42.320
And I'm trying to deploy it to Docker.

00:36:42.900 --> 00:36:44.400
And I'm not calling that path.

00:36:44.460 --> 00:36:50.100
I'm like, I understand it's a problem that could be an issue under some circumstances, but it doesn't apply here.

00:36:50.100 --> 00:36:51.200
And I just need to use it.

00:36:51.320 --> 00:36:54.340
And it's a Windows problem on my Linux Docker image.

00:36:54.460 --> 00:36:55.940
I don't really care right now.

00:36:56.020 --> 00:36:56.880
I mean, it's fine.

00:36:56.980 --> 00:36:58.440
Until they fix it, I'll be okay.

00:36:58.440 --> 00:37:05.420
Well, and especially if you know what the problem is and you're not going to suddenly switch to Windows, why would you do that?

00:37:05.900 --> 00:37:09.360
So the tools are maturing, but they're not perfect.

00:37:09.800 --> 00:37:13.200
And lots of them are going at different speeds.

00:37:13.260 --> 00:37:13.900
We'll just say that.

00:37:13.900 --> 00:37:18.600
So I look forward to the day where there's reachability done on all of those things.

00:37:20.220 --> 00:37:22.860
This portion of Talk Python To Me is brought to you by us.

00:37:23.000 --> 00:37:30.620
I want to tell you about a course I put together that I'm really proud of, Agentic AI Programming for Python Developers.
+ +00:37:31.260 --> 00:37:37.180 +I know a lot of you have tried AI coding tools and come away thinking, well, this is more hassle than it's worth. + +00:37:37.540 --> 00:37:40.820 +And honestly, all the vibe coding hype isn't helping. + +00:37:40.820 --> 00:37:44.300 +It's a smokescreen that hides what these tools can actually do. + +00:37:44.920 --> 00:37:56.720 +This course is about agentic engineering, applying real software engineering practices with AI that understands your entire code base, runs your tests, and builds complete features under your direction. + +00:37:57.360 --> 00:38:03.580 +I've used these techniques to ship real production code across Talk Python, Python Bytes, and completely new projects. + +00:38:04.020 --> 00:38:10.160 +I migrated an entire CSS framework on a production site with thousands of lines of HTML in a few hours. + +00:38:10.160 --> 00:38:10.760 +Twice. + +00:38:11.120 --> 00:38:15.540 +I shipped a new search feature with caching and async in under an hour. + +00:38:15.980 --> 00:38:24.080 +I built a complete CLI tool for Talk Python from scratch, tested, documented, and published to PyPI in an afternoon. + +00:38:24.580 --> 00:38:28.560 +Real projects, real production code, both Greenfield and legacy. + +00:38:29.020 --> 00:38:30.680 +No toy demos, no fluff. + +00:38:31.240 --> 00:38:37.380 +I'll show you the guardrails, the planning techniques, and the workflows that turn AI into a genuine engineering partner. + +00:38:37.380 --> 00:38:41.440 +Check it out at talkpython.fm/agentic dash engineering. + +00:38:41.740 --> 00:38:44.800 +That's talkpython.fm/agentic dash engineering. + +00:38:45.000 --> 00:38:47.120 +The link is in your podcast player's show notes. + +00:38:47.120 --> 00:38:51.820 +One of the other things I wanted to mention is like, so you said pin dependencies. 
+

00:38:52.240 --> 00:38:59.460
And so I teach this and then inevitably every time someone's like, well, if I pin dependencies forever, then I just have all these really old dependencies.

00:38:59.720 --> 00:39:00.860
That's not what Michael means.

00:39:01.300 --> 00:39:05.080
He means you do development, you update your dependencies to a version.

00:39:05.720 --> 00:39:10.500
Like ideally you're like LTS, you know, the latest stable version of whatever the thing is.

00:39:10.500 --> 00:39:23.760
Because you're trying to keep, like, definitely a supported version, recent, you're not picking terrible things where, you know, it hasn't been updated in two years, or there's one maintainer and they happen to live in Russia and work for the Russian government, right?

00:39:23.780 --> 00:39:25.280
So you're picking like decent ones.

00:39:25.560 --> 00:39:26.840
You're updating it in dev.

00:39:26.920 --> 00:39:27.820
You're like, okay, this is the one.

00:39:27.980 --> 00:39:28.680
Then you pin it.

00:39:28.960 --> 00:39:33.440
So as it goes up to different environments, you don't get a surprise update and it changes.

00:39:33.440 --> 00:39:39.120
And then there's something different in prod than what you tested in UAT and approved with the security tools.

00:39:39.560 --> 00:39:40.720
That's what, that's what.

00:39:40.880 --> 00:39:41.600
And a hundred percent.

00:39:41.680 --> 00:39:42.920
Because it gets misinterpreted.

00:39:43.100 --> 00:39:43.340
Yeah.

00:39:43.500 --> 00:39:52.240
And another thing to do that you can do real simple is like with some of the tools, like with uv, you can say, I have pinned dependencies, update them to the current ones

00:39:52.240 --> 00:39:58.540
with a very important caveat that you can say that are older than a week or older than a day or something.
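[Editor's note] The "only update to releases at least a week old" idea maps, in uv, to the `--exclude-newer` flag, which ignores anything published after a given cutoff. A hedged sketch; the GNU `date` syntax shown here is Linux-specific (macOS uses `date -v-7d` instead):

```shell
# Re-resolve pinned dependencies, but ignore any release published in the
# last 7 days. --exclude-newer accepts a date or RFC 3339 timestamp.
uv lock --upgrade --exclude-newer "$(date -d '7 days ago' +%Y-%m-%d)"
uv sync
```

That week of lag is not hardcore security by itself, but it sidesteps the short-lived compromised releases described next.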
+

00:39:58.540 --> 00:40:05.900
But because, you know, the really big example here is LiteLLM, just this, was that just this week or was that last week?

00:40:05.940 --> 00:40:06.540
I can't keep track.

00:40:06.560 --> 00:40:07.400
It was very recent.

00:40:07.400 --> 00:40:07.680
Yeah.

00:40:07.800 --> 00:40:20.320
This thing has a dependency that itself became compromised, like you talked about, the developer account got taken over, I believe, and a virus got put in and it was only out for like half an hour or something, but it's so popular.

00:40:20.540 --> 00:40:25.220
It took over like 50,000 machines because it gets downloaded millions of times a day.

00:40:26.340 --> 00:40:26.660
Automatically.

00:40:26.760 --> 00:40:36.980
If you say, give me the latest. Obviously waiting a week is not hardcore security, but at the same time, so many of these popular issues that people take, they only last briefly, right?

00:40:36.980 --> 00:40:37.780
For a few moments.

00:40:37.900 --> 00:40:41.320
And then somebody's like, oh my gosh, why is this thing using 100% CPU?

00:40:41.500 --> 00:40:41.900
You know what I mean?

00:40:42.080 --> 00:40:49.180
And here's the thing, Michael, is that not all of those 50,000 got the memo that this happened and they're still vulnerable in prod, and they could be compromised.

00:40:49.280 --> 00:40:49.440
Yeah.

00:40:49.520 --> 00:40:50.760
It could be for a long time.

00:40:50.860 --> 00:40:51.540
Yes, I agree.

00:40:51.700 --> 00:40:56.580
Like update, but to, you know, one that's three days old or one week old.

00:40:56.580 --> 00:40:59.920
So this advice has drastically changed over the past six months.

00:41:00.100 --> 00:41:03.180
The best practice used to be auto update to latest version, period.

00:41:03.480 --> 00:41:06.220
That used to be the advice and that's no longer the advice.
+

00:41:06.220 --> 00:41:09.080
And it's kind of heartbreaking, especially if you use npm.

00:41:09.420 --> 00:41:10.780
NPM is just like under siege.

00:41:11.680 --> 00:41:12.000
It is.

00:41:12.080 --> 00:41:12.260
Yeah.

00:41:12.620 --> 00:41:16.920
PyPI is as well, but it looks over at npm and is thankful for its situation.

00:41:17.420 --> 00:41:18.780
Number four.

00:41:18.780 --> 00:41:20.820
Oh, I got to keep cruising here.

00:41:21.120 --> 00:41:22.420
Four, cryptographic failures.

00:41:22.920 --> 00:41:23.840
Cryptographic failures.

00:41:23.980 --> 00:41:24.180
Yeah.

00:41:24.400 --> 00:41:25.300
Not encrypting.

00:41:25.960 --> 00:41:27.940
Encrypting using something really old.

00:41:28.420 --> 00:41:32.060
You start off encrypted and then briefly you're not encrypted and then you're encrypted again.

00:41:32.560 --> 00:41:34.280
You don't encrypt it when you're supposed to.

00:41:34.720 --> 00:41:41.720
Also in this realm, one-way hashing, not just reversible encryption, right?

00:41:41.780 --> 00:41:42.800
It would probably fall in here.

00:41:42.800 --> 00:41:47.060
Encrypting user passwords and storing them in the database along with the key.

00:41:47.820 --> 00:41:52.520
Ideally, we would hash and we would salt and then hash user passwords.

00:41:52.700 --> 00:41:53.540
That would be the best.

00:41:53.680 --> 00:41:56.820
If you really, really, really, really are intense, you could pepper it too.

00:41:56.940 --> 00:41:58.080
And no, I did not make that up.

00:41:58.440 --> 00:42:01.180
That is a mathematical nerd joke, not an app sec joke.

00:42:01.760 --> 00:42:06.120
But a salt is unique per user and the salt itself isn't really a secret.

00:42:06.460 --> 00:42:09.460
Whereas a pepper is unique per system or per organization.

00:42:09.460 --> 00:42:11.020
And it is a secret.

00:42:11.200 --> 00:42:11.280
Right.
+

00:42:11.360 --> 00:42:14.860
Like a secret key that you set and then it gets factored in there.

00:42:14.920 --> 00:42:15.260
That's cool.

00:42:15.400 --> 00:42:15.560
Yeah.

00:42:15.700 --> 00:42:15.900
Yeah.

00:42:15.900 --> 00:42:19.120
Also, maybe choose more modern hashing algorithms, right?

00:42:19.200 --> 00:42:23.060
Obviously not MD5, but maybe something memory-hard like Argon, maybe.

00:42:23.260 --> 00:42:23.540
I don't know.

00:42:23.640 --> 00:42:23.860
Yes.

00:42:23.960 --> 00:42:24.460
Argon2.

00:42:24.920 --> 00:42:26.360
That would be better for sure.

00:42:27.160 --> 00:42:32.500
And this is something where if you're going to do it, it's very easy to look up what you're supposed to do on the internet.

00:42:33.420 --> 00:42:38.480
This is something where you can ask the AI, like, are you using a good algorithm?

00:42:38.480 --> 00:42:39.060
Are you doing this?

00:42:39.160 --> 00:42:40.000
Like, make sure it's secure.

00:42:40.160 --> 00:42:43.020
And then it's good as long as it does it.

00:42:43.260 --> 00:42:53.600
Because one suggestion, if you are going to vibe code and not do the 400 other things we'll talk about later when we talk about my prompt library, but ask it to list its security assumptions.

00:42:54.260 --> 00:42:56.780
So whatever it is you prompt, you give it to make a thing.

00:42:56.840 --> 00:42:58.600
You're like, make this, then do that, blah, blah, blah.

00:42:59.120 --> 00:43:00.740
Please list all your security assumptions.

00:43:01.300 --> 00:43:06.120
And it'll be like, oh, yeah, but obviously like you wouldn't do like authentication like that because that's terrible.

00:43:06.120 --> 00:43:16.980
And like in production, you would do this other thing and you're like, oh, yeah, because it'll assume that you're going to change a bunch of things later that it doesn't tell you unless you ask it to tell you its assumptions.
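[Editor's note] The salt-then-hash (and optional pepper) approach discussed a moment ago can be sketched with just the standard library. Argon2 itself needs a third-party package such as argon2-cffi; `hashlib.scrypt` is the memory-hard option that ships with Python. The function names below are mine, not a real library's API.

```python
import hashlib
import hmac
import secrets

# Sketch of salted (and optionally peppered) password hashing, using the
# stdlib's memory-hard scrypt. Parameters are illustrative, not a benchmark.

def hash_password(password: str, *, pepper: bytes = b"") -> tuple[bytes, bytes]:
    """Return (salt, digest). The salt is unique per user and not secret;
    the pepper is one system-wide secret kept outside the database."""
    salt = secrets.token_bytes(16)
    digest = hashlib.scrypt(password.encode() + pepper, salt=salt,
                            n=2**14, r=8, p=1)
    return salt, digest

def verify_password(password: str, salt: bytes, digest: bytes,
                    *, pepper: bytes = b"") -> bool:
    candidate = hashlib.scrypt(password.encode() + pepper, salt=salt,
                               n=2**14, r=8, p=1)
    return hmac.compare_digest(candidate, digest)  # constant-time compare
```

Because each user gets a fresh random salt, two users with the same password still end up with different digests, which is the whole point of salting.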
+ +00:43:18.120 --> 00:43:18.300 +Yeah. + +00:43:18.400 --> 00:43:21.080 +We'll use no password here, but when you ship it, you're going to add that, right? + +00:43:21.180 --> 00:43:23.540 +Like, no, I wasn't going to, but now I will. + +00:43:23.740 --> 00:43:24.140 +All right. + +00:43:24.140 --> 00:43:29.340 +I think one of the best known ones has got to be little Bobby tables and friends. + +00:43:29.600 --> 00:43:31.460 +Number five, injection. + +00:43:31.760 --> 00:43:32.160 +Yes. + +00:43:32.580 --> 00:43:43.200 +So injection, tricking an application in, like you put your code, the malicious actor's code into a place where it should be data, but you've tricked it into thinking it's its code. + +00:43:43.260 --> 00:43:45.680 +And then either it executes it or it interprets it. + +00:43:45.860 --> 00:43:49.580 +Like if there's an interpreter, there's a compiler, there's the potential for injection. + +00:43:49.580 --> 00:43:53.820 +And it, yeah, we don't want to mix data in with commands. + +00:43:54.200 --> 00:43:57.940 +We don't want to mix data in with anything that's going to be executed or interpreted. + +00:43:58.220 --> 00:43:59.600 +And we do it a lot, Michael. + +00:43:59.980 --> 00:44:01.560 +I know we make bad choices, don't we? + +00:44:01.800 --> 00:44:02.780 +We make bad choices. + +00:44:02.880 --> 00:44:08.360 +So obviously SQL injection is the number one in this world for sure. + +00:44:08.760 --> 00:44:10.600 +And still it's popular. + +00:44:10.840 --> 00:44:11.800 +Still people don't know. + +00:44:11.860 --> 00:44:12.500 +It's still tricky. + +00:44:12.780 --> 00:44:19.180 +I mean, we have certainly parametrized queries and ORMs and stuff that should be helping us or does help us if we choose to use them with this. + +00:44:19.180 --> 00:44:23.320 +However, I think other ones should just give them a quick shout out. 
+

00:44:23.440 --> 00:44:30.760
Like for example, if you're accepting JSON and converting it to a dictionary in Python, you can do MongoDB injection.

00:44:31.020 --> 00:44:39.260
Like your password, you know, the Bobby Tables one is like quote, semicolon, drop table, that, you know, like that's what that looks like in T-SQL.

00:44:39.440 --> 00:44:43.020
But in MongoDB, you can do queries that are dictionaries.

00:44:43.240 --> 00:44:44.880
That's like kind of how you do your filtering.

00:44:44.880 --> 00:44:53.980
So if you take something that would be a password in a JSON document, the password could be curly brace greater than, you know, one equals one.

00:44:54.140 --> 00:44:58.040
Like a really complicated JSON dictionary that is actually the query that is equivalent.

00:44:58.160 --> 00:44:59.800
So you got to be super careful there as well.

00:44:59.860 --> 00:45:02.220
And that's really tricky.

00:45:02.220 --> 00:45:05.820
And then also like the pickles and like serialization, deserialization.

00:45:06.000 --> 00:45:08.880
There's a lot to this, not just SQL injection.

00:45:09.200 --> 00:45:11.260
It's a lot about input validation.

00:45:11.880 --> 00:45:17.480
So using a parametrized query, so stored procedures, prepared statements, whatever you want to call them.

00:45:17.800 --> 00:45:21.080
What that does is say: this is data, only treat it as data.

00:45:21.080 --> 00:45:23.880
And then the database can do that.

00:45:24.520 --> 00:45:27.580
But if we, on top of that, do input validation.

00:45:27.580 --> 00:45:33.300
So like we're getting the thing that looks correct and we're rejecting, we're not trying to fix it.

00:45:33.420 --> 00:45:35.720
We're just rejecting everything that looks not correct.

00:45:36.060 --> 00:45:41.080
And then if we have to accept any special characters, we escape them or sanitize them out.

00:45:41.300 --> 00:45:42.240
I prefer escaping.
+

00:45:42.500 --> 00:45:43.700
I think it's weird to remove stuff.

00:45:43.800 --> 00:45:44.420
That's my data.

00:45:44.520 --> 00:45:45.260
I probably want it.

00:45:45.380 --> 00:45:50.020
So like if you have to accept single quotes because you know you're going to have users named O'Malley, let's say.

00:45:50.020 --> 00:45:50.460
Right.

00:45:50.860 --> 00:45:52.280
So we accept the letters.

00:45:52.520 --> 00:45:53.260
We accept the numbers.

00:45:53.480 --> 00:45:56.220
We accept a single quote and some dashes, even though those are dangerous.

00:45:56.220 --> 00:45:59.600
And then we escape those characters because we know they're potentially dangerous.

00:45:59.820 --> 00:46:05.760
And then we specify, by the way, that's definitely data and not code by making it a parameter.

00:46:06.060 --> 00:46:06.500
Right.

00:46:06.560 --> 00:46:08.040
And then we're locked down.

00:46:08.300 --> 00:46:09.380
We're in really good shape.

00:46:09.460 --> 00:46:21.400
If we added input validation on everything and then we just rejected everything that looked weird, especially, like, we always need to do a yes list, an allow list, like a this-is-allowed list.

00:46:21.640 --> 00:46:24.640
Not a block list. When I was a pen tester, Michael, I was not.

00:46:24.840 --> 00:46:26.480
I was a pen tester for a year and a half.

00:46:26.840 --> 00:46:31.000
I had basically zero training and I could get around those in two seconds.

00:46:31.000 --> 00:46:33.960
And like I was not particularly superbly talented.

00:46:34.600 --> 00:46:37.640
And I was just like, pew, pew, pew, ha, ha, ha, ha, ha, block list.

00:46:37.720 --> 00:46:38.340
You just try.

00:46:38.600 --> 00:46:39.880
If you just know to look, right?

00:46:40.000 --> 00:46:40.260
Yeah.

00:46:40.420 --> 00:46:43.600
Well, and there's cheat sheets all over the internet of how to get around them.
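[Editor's note] The layered defense described above, reject first with an allow list, then pass the value as a parameter, can be sketched in a few lines. This is a hypothetical helper, shown with sqlite3 since it ships with Python; the type check on the decoded JSON is what blocks the Mongo-style operator-as-password trick mentioned earlier:

```python
import json
import re
import sqlite3

# Allow list: letters, digits, single quote, dash, space. O'Malley passes;
# semicolons, braces, and SQL comment markers are simply rejected.
NAME_OK = re.compile(r"^[A-Za-z0-9'\- ]{1,50}$")

def find_user(conn: sqlite3.Connection, raw_json: str) -> list:
    payload = json.loads(raw_json)
    username = payload.get("username")
    # Reject, don't fix: a dict here is the Mongo-style {"$gt": ""} trick.
    if not isinstance(username, str) or not NAME_OK.match(username):
        raise ValueError("invalid username")
    # Parametrized query: the ? placeholder is always data, never SQL.
    return conn.execute(
        "SELECT id FROM users WHERE name = ?", (username,)
    ).fetchall()
```

Note that even if something slips past the allow list, the `?` parameter means the database never interprets it as a command.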
+

00:46:43.760 --> 00:46:46.300
So everyone knows how to get around them.

00:46:46.400 --> 00:46:48.100
But you can't get around.

00:46:48.320 --> 00:46:49.820
Well, you're only allowed letters and numbers.

00:46:49.820 --> 00:46:50.280
It's like, well.

00:46:50.320 --> 00:46:54.980
Let's keep moving a little bit quickly so that we know we've got time for a little retrospective.

00:46:54.980 --> 00:46:57.460
So insecure design, that's a fun one.

00:46:57.720 --> 00:47:02.080
It does not matter how perfectly you follow the plan if the plan is bad.

00:47:02.340 --> 00:47:02.820
Right.

00:47:03.240 --> 00:47:10.280
And so this was new; the last time we released the list was the first time it was on there.

00:47:10.280 --> 00:47:14.420
And I'm really glad because all the other items are implementation.

00:47:14.420 --> 00:47:17.400
And this is the only one that is design, the plan.

00:47:17.580 --> 00:47:22.360
And so essentially it means, you know, someone designed something and you don't talk about it.

00:47:22.360 --> 00:47:24.020
You don't analyze it for security.

00:47:24.020 --> 00:47:27.720
You don't intentionally apply secure design concepts.

00:47:27.940 --> 00:47:29.080
You don't do a threat model.

00:47:29.180 --> 00:47:30.400
There's no security review.

00:47:30.840 --> 00:47:31.660
You YOLO that.

00:47:31.900 --> 00:47:35.980
You don't even have a list of security requirements usually that you knew you should have added.

00:47:35.980 --> 00:47:44.020
Like if you're going to do an API and it's, you know, accessible from the internet, to me, it should be behind an API gateway, period.

00:47:44.440 --> 00:47:45.140
That is my opinion.

00:47:45.340 --> 00:47:47.660
No, I don't sell one, but I think we all need one.

00:47:47.880 --> 00:47:48.240
Right.
+

00:47:48.360 --> 00:47:51.240
And to me, that should just be a requirement up front.

00:47:51.240 --> 00:47:54.700
And then when I see your design document and it's there, I'm like, thumbs up, let's go.

00:47:55.020 --> 00:47:55.220
Right.

00:47:55.220 --> 00:48:03.440
But if you're not giving clear requirements and then you're not reviewing the design, you are getting a YOLO approach at stuff.

00:48:03.580 --> 00:48:05.760
And that doesn't mean developers don't care.

00:48:06.140 --> 00:48:09.600
But if no one's taught them this, no one's asked for this, and then no one checks this.

00:48:09.800 --> 00:48:11.740
The problem with this one is the code looks fine.

00:48:11.820 --> 00:48:12.920
It looks like you're doing it right.

00:48:12.920 --> 00:48:15.640
It's just there's something important that's just not even there.

00:48:15.640 --> 00:48:19.220
Like examples that I came up with were like no rate limiting would be one.

00:48:19.480 --> 00:48:20.440
The login looks fine.

00:48:20.520 --> 00:48:23.640
You're checking the person's not a duplicate, that they're there, et cetera, et cetera.

00:48:23.780 --> 00:48:23.860
Right.

00:48:23.980 --> 00:48:27.060
Or client-side enforcement in some kind of, like, Vue.js app.

00:48:27.140 --> 00:48:32.960
You've got all the validation there, but the API actually just assumes the client is doing it, which is never the way.

00:48:32.960 --> 00:48:39.820
And there's a lot of business logic issues with the way that you're solving the problem.

00:48:40.160 --> 00:48:43.120
If users do the thing you want, it's fine.

00:48:43.340 --> 00:48:45.520
But not all users do the thing you want.

00:48:45.580 --> 00:48:46.600
And some of them are Tanya.

00:48:46.820 --> 00:48:48.460
And they're like, well, I'm just going to click through your...

00:48:48.660 --> 00:48:49.020
Exactly.
+ +00:48:49.260 --> 00:48:51.560 +Some of us are curious and we like to click the buttons. + +00:48:53.020 --> 00:48:54.580 +We're like, oh, look at that. + +00:48:54.900 --> 00:48:57.520 +Oh, there's a next button, even though it says it's the last page. + +00:48:57.600 --> 00:48:58.740 +Well, what would happen if we click that? + +00:48:58.840 --> 00:48:59.060 +Absolutely. + +00:48:59.140 --> 00:49:00.400 +You've got to click that button, right? + +00:49:01.280 --> 00:49:02.900 +Authentication failures, number seven. + +00:49:03.060 --> 00:49:09.520 +So this is when an attacker can trick the app into thinking they're a different user, usually an admin user. + +00:49:09.520 --> 00:49:13.580 +Or if they're not a legitimate user, tricking them into being a legitimate user. + +00:49:13.700 --> 00:49:15.460 +But we all want to be admin, Michael. + +00:49:15.600 --> 00:49:15.880 +Oh, yeah. + +00:49:15.960 --> 00:49:17.580 +This can be caused by lots of things. + +00:49:17.580 --> 00:49:29.120 +We wrote our own authentication instead of buying a tried, tested, and true product that will do this for us easier, better, faster, and cheaper in the long run when we count maintenance. + +00:49:29.120 --> 00:49:30.980 +That's the biggest mistake. + +00:49:31.100 --> 00:49:33.420 +But we don't protect against credential stuffing. + +00:49:33.540 --> 00:49:35.020 +We don't protect against brute force. + +00:49:35.240 --> 00:49:37.040 +Those are the two super, super huge ones. + +00:49:37.120 --> 00:49:43.880 +We let people reuse passwords, use terribly insecure passwords, etc. We don't have multiple forms of authentication. + +00:49:43.880 --> 00:49:45.860 +So there's no second factor. + +00:49:46.720 --> 00:49:46.900 +Yeah. + +00:49:46.900 --> 00:49:53.820 +And there's, Michael, there are ways to do multi-factor that don't, like, have to be awful necessarily. + +00:49:54.220 --> 00:49:57.780 +Like, if you don't require the same level of security, like posture. 
+
+00:49:57.780 --> 00:50:03.600
+So, for instance, you know, you do have multi-factor authentication for the first time.
+
+00:50:03.780 --> 00:50:07.240
+But then maybe you fingerprint their browser and their device and their network.
+
+00:50:07.420 --> 00:50:14.580
+And if they're going to log in from the same device, the same browser, and the same network, maybe you don't require an MFA challenge very often.
+
+00:50:14.840 --> 00:50:15.400
+A hundred percent.
+
+00:50:15.540 --> 00:50:18.560
+Like, you could say, trust this machine or this browser.
+
+00:50:18.660 --> 00:50:21.560
+And you're like, okay, we'll never ask you 2FA again.
+
+00:50:21.740 --> 00:50:22.860
+Just your username, password.
+
+00:50:23.160 --> 00:50:26.120
+Or we won't, unless you're doing something like deleting your account.
+
+00:50:26.120 --> 00:50:29.660
+There's ways to make this not necessarily always super painful.
+
+00:50:30.140 --> 00:50:33.300
+And I, you know, passkeys are so nice.
+
+00:50:33.600 --> 00:50:35.080
+Those are making things a lot nicer.
+
+00:50:35.580 --> 00:50:38.020
+But I still feel like there's a ways to go.
+
+00:50:38.380 --> 00:50:43.660
+I dream of the day where we trust our devices so well that, like, I can just touch the thing and I know it's okay.
+
+00:50:44.020 --> 00:50:44.240
+Right?
+
+00:50:44.400 --> 00:50:51.660
+And I know that someone can't just, like, XKCD hit me with a wrench until I put the thing in front of my face and then it unlocks.
+
+00:50:52.100 --> 00:50:52.140
+Exactly.
+
+00:50:52.240 --> 00:50:54.840
+That is one of the weakest parts of the security chain there.
+
+00:50:54.840 --> 00:50:55.340
+Mm-hmm.
+
+00:50:55.460 --> 00:50:58.020
+Software or data integrity failures?
+
+00:50:58.100 --> 00:50:58.620
+Number eight.
+
+00:50:58.940 --> 00:51:00.640
+So this is one we fought a lot about.
+
+00:51:01.100 --> 00:51:01.520
+Yeah.
+
+00:51:01.580 --> 00:51:02.460
+Especially me and Neil.
+
+00:51:02.720 --> 00:51:07.960
+Because Neil, I wrote this one and Neil wrote the supply chain one or vice versa.
+
+00:51:08.360 --> 00:51:10.280
+And there's lots of arguments of how to differentiate.
+
+00:51:10.280 --> 00:51:19.980
+So we need to make sure that things we download are exactly what we think they are and that the integrity, it's not been spoofed or tampered with in the meantime.
+
+00:51:20.200 --> 00:51:22.060
+So no one has changed it.
+
+00:51:22.380 --> 00:51:24.240
+And this is for data and for software.
+
+00:51:24.400 --> 00:51:27.500
+So, you know, third-party components that we're getting.
+
+00:51:27.800 --> 00:51:28.960
+Are we getting the thing we thought?
+
+00:51:29.080 --> 00:51:30.040
+Is there a typosquat?
+
+00:51:30.280 --> 00:51:33.840
+Is, has someone been able to intercept in between and change it out?
+
+00:51:33.840 --> 00:51:35.320
+Same with data.
+
+00:51:35.680 --> 00:51:39.120
+Like, did someone change the data on the way to us, et cetera.
+
+00:51:39.400 --> 00:51:43.960
+And this is really key, especially for things that require anything medical.
+
+00:51:44.300 --> 00:51:48.780
+Like imagine the insulin pump that gets it wrong sometimes and people have comas.
+
+00:51:48.780 --> 00:51:50.600
+Like that would be so unbelievably bad.
+
+00:51:50.600 --> 00:51:51.660
+That's very bad.
+
+00:51:51.660 --> 00:51:59.700
+Yeah, I worked with a company that did like all the medical devices and instruments and ORs and ERs and security assurance.
+
+00:51:59.840 --> 00:52:00.200
+Hi.
+
+00:52:01.060 --> 00:52:02.800
+It was an awesome project.
+
+00:52:02.980 --> 00:52:06.020
+It was really cool, but it was also like, damn, your job's hard.
+
+00:52:06.140 --> 00:52:07.640
+It's tough to sleep at night in that one.
+
+00:52:07.700 --> 00:52:07.840
+Yeah.
+
+00:52:07.940 --> 00:52:11.460
+The thing is, is that private industry tends to really focus on availability.
+ +00:52:11.920 --> 00:52:14.140 +Like if their website's down, they can't sell their thing. + +00:52:14.480 --> 00:52:16.260 +Clients call, it costs them money. + +00:52:16.560 --> 00:52:16.780 +Right. + +00:52:16.780 --> 00:52:21.660 +But integrity is like more silent hurt, if that makes sense. + +00:52:21.840 --> 00:52:21.980 +Yeah. + +00:52:22.100 --> 00:52:23.860 +So many of these are like this, honestly. + +00:52:24.060 --> 00:52:30.260 +Like the whole top 10 is only, it only slows you down and it's sand in the gears until something happens. + +00:52:30.260 --> 00:52:31.840 +And then it's your fault for not doing it. + +00:52:31.920 --> 00:52:34.160 +But before that, it's like all this stuff is a hassle. + +00:52:34.340 --> 00:52:41.700 +And integrity, it comes into play in so many situations because some, it can fail silently. + +00:52:42.380 --> 00:52:45.460 +Things that fail silently are more scary for security teams. + +00:52:45.840 --> 00:52:46.580 +Does that make sense? + +00:52:46.580 --> 00:52:51.480 +People love to use CDNs for their JavaScript and their CSS and so on. + +00:52:51.600 --> 00:53:01.580 +And there've been examples where the CDN was taken over or another developer could have been compromised who published a malicious JavaScript. + +00:53:02.240 --> 00:53:08.380 +And the danger with this is if you make that hack go through, you don't just take over that app. + +00:53:08.440 --> 00:53:12.720 +You take over all the people who use that app and everyone who uses the CDN to pull it out. + +00:53:12.720 --> 00:53:16.220 +Like it can be really the knock-on effects are mega. + +00:53:16.220 --> 00:53:16.280 +Yeah. + +00:53:16.280 --> 00:53:22.240 +And like checking the sub-resource integrity, doing that check, that can help. + +00:53:22.240 --> 00:53:26.500 +Sometimes we can do all the right things and we still get hurt. 
+
+00:53:26.500 --> 00:53:43.980
+Because like for instance, with SolarWinds, the compromise was so deep in the organization that they were able to not only push in like code that was malicious, have it pass all the security tests in the pipeline, then sign it and then release it.
+
+00:53:43.980 --> 00:53:45.980
+And then not have customers also notice the problem.
+
+00:53:45.980 --> 00:53:48.980
+Like that situation is rare.
+
+00:53:48.980 --> 00:53:55.140
+What we want to do with this one is raise awareness that you should just be checking the integrity of your stuff, period.
+
+00:53:55.640 --> 00:53:55.820
+Right?
+
+00:53:56.000 --> 00:54:06.040
+So the software composition analysis companies, the security researchers, they're on it finding those rare edge case zero day situations.
+
+00:54:06.040 --> 00:54:10.440
+What we need the average developer to do is just check integrity, period.
+
+00:54:10.820 --> 00:54:14.840
+Like the thing you've got is what you think you've got and it's from the right place.
+
+00:54:15.200 --> 00:54:18.420
+And if we could all do that, like life would improve greatly.
+
+00:54:18.620 --> 00:54:20.040
+There's defaults that are not great.
+
+00:54:20.140 --> 00:54:21.000
+For example, check this out.
+
+00:54:21.060 --> 00:54:23.680
+jsDelivr for Tailwind.
+
+00:54:24.180 --> 00:54:28.120
+So here's a real popular CDN delivering a very, very popular library.
+
+00:54:28.480 --> 00:54:29.520
+Here's how it tells me to use it.
+
+00:54:29.720 --> 00:54:30.380
+What's missing here?
+
+00:54:30.520 --> 00:54:32.400
+Sub-resource integrity check.
+
+00:54:32.400 --> 00:54:32.800
+Yes.
+
+00:54:33.100 --> 00:54:38.260
+So if I just say, I want to use this Tailwind and it says, great, source equals such and such.
+
+00:54:38.380 --> 00:54:39.040
+Good to go.
+
+00:54:39.400 --> 00:54:39.720
+You know what I mean?
+
+00:54:39.760 --> 00:54:40.320
+And that's it.
+
+00:54:40.560 --> 00:54:47.620
+So even the really popular CDNs and stuff are just encouraging you to fall, to scramble from the pit of success.
+
+00:54:47.840 --> 00:54:48.780
+You know, it's not that at all.
+
+00:54:48.860 --> 00:54:51.600
+Maybe we should write them and be like, I want you to change this, please.
+
+00:54:52.000 --> 00:54:54.060
+When I worked at Microsoft, I did that all the time.
+
+00:54:54.100 --> 00:54:56.760
+I'd be like, you need to change your readme page.
+
+00:54:56.820 --> 00:54:57.280
+It's wrong.
+
+00:54:57.660 --> 00:54:59.180
+You forgot the security thing.
+
+00:54:59.620 --> 00:55:01.680
+And they'd be like, Tanya, just, it's a demo.
+
+00:55:01.680 --> 00:55:02.260
+I'm like, nope.
+
+00:55:02.400 --> 00:55:04.000
+Two more real quick before we run out of time.
+
+00:55:04.500 --> 00:55:05.420
+Logging and alerting.
+
+00:55:05.760 --> 00:55:07.420
+So security logging and alerting.
+
+00:55:08.020 --> 00:55:14.980
+So developers might be doing lots of logging and they might be doing some alerting for debugging, which is important and you should still do it.
+
+00:55:15.180 --> 00:55:21.440
+But this is more that we're not logging when security controls are called and especially pass or fail.
+
+00:55:21.440 --> 00:55:28.560
+So if someone tries to log in 100 times in one second, I don't want to know the 100th time that they got in.
+
+00:55:28.560 --> 00:55:32.460
+I want to know all 99 times where they failed in the logs.
+
+00:55:32.460 --> 00:55:32.980
+Right.
+
+00:55:32.980 --> 00:55:38.080
+I want to have enough information in those logs that I can do a proper investigation.
+
+00:55:38.080 --> 00:55:42.740
+Like when I worked in AppSec, my job wasn't called incident responder.
+
+00:55:42.740 --> 00:55:48.880
+But every time an app got smashed, they're like, okay, Tanya, go do that weird thing that you do?
+
+00:55:49.280 --> 00:55:50.900
+And I would go look at the logs.
+
+00:55:50.900 --> 00:55:58.940
+And I remember a client calling me one day and they're like, Visa called us and 27 of our customers got popped and we need you to go investigate.
+
+00:55:59.540 --> 00:56:01.400
+And turns out they didn't have any logs.
+
+00:56:01.860 --> 00:56:03.560
+They didn't think they needed to log that.
+
+00:56:03.560 --> 00:56:08.840
+And so they had absolutely no application logs for that log for that app.
+
+00:56:08.880 --> 00:56:09.060
+Sorry.
+
+00:56:09.260 --> 00:56:15.040
+And I was like, what am I supposed to investigate? Walk around the building with like a magnifying glass and just look cool with a hat on?
+
+00:56:15.120 --> 00:56:16.640
+Like there's nothing, there's no evidence.
+
+00:56:16.920 --> 00:56:21.220
+There's probably somebody in the corner with a hoodie, sunglasses looking sort of hacker-ish.
+
+00:56:21.540 --> 00:56:21.900
+Right.
+
+00:56:22.280 --> 00:56:24.600
+Like, I'm just like, what am I supposed to investigate, guys?
+
+00:56:24.660 --> 00:56:26.980
+Like you have no logs at all.
+
+00:56:27.060 --> 00:56:28.280
+You got to just let it keep going.
+
+00:56:28.280 --> 00:56:34.200
+Basically, you got to say, well, now we add logging and then we can figure out if there's new stuff happening or something.
+
+00:56:34.320 --> 00:56:35.000
+It's really bad.
+
+00:56:35.240 --> 00:56:36.480
+It turns out it wasn't them.
+
+00:56:36.560 --> 00:56:44.140
+It turned out that there's a sandwich shop downstairs and an employee had swiped cards and everything from our end was fine.
+
+00:56:44.540 --> 00:56:48.820
+But then I was like, we are rewriting this app so that it does security logging on failure.
+
+00:56:48.940 --> 00:56:51.460
+So essentially, you're making it so we can't investigate.
+
+00:56:51.740 --> 00:56:54.060
+You're making it so there's no evidence that a thing happened.
+
+00:56:54.320 --> 00:56:55.540
+We can't press charges in court.
+ +00:56:55.620 --> 00:56:56.600 +There's no chain of custody. + +00:56:56.600 --> 00:56:59.300 +We'll never know what happened. + +00:56:59.780 --> 00:57:02.340 +And then that means we don't know how to protect ourselves in the future. + +00:57:02.680 --> 00:57:04.080 +And we really need these logs. + +00:57:04.220 --> 00:57:14.640 +So every time a security thing happens, input validation, output encoding, like anything that is security related, just log that the attempt was made and it worked or it didn't work. + +00:57:14.960 --> 00:57:17.800 +And, you know, which user ID, et cetera, things like that. + +00:57:17.900 --> 00:57:18.540 +And the timestamp. + +00:57:18.760 --> 00:57:18.900 +All right. + +00:57:18.920 --> 00:57:19.420 +Last one. + +00:57:19.520 --> 00:57:22.600 +Let's round it out real quick with mishandling of exceptional conditions. + +00:57:22.600 --> 00:57:27.660 +So this is brand new and this one is related to the other one. + +00:57:27.800 --> 00:57:33.540 +So number nine was basically you're not doing logging when you should or your logs suck. + +00:57:33.660 --> 00:57:34.080 +They're incomplete. + +00:57:34.480 --> 00:57:38.640 +This one is errors happen and you just don't handle them properly. + +00:57:38.640 --> 00:57:46.960 +So I'm sure you've reviewed code and seen this where it's like try and it does a thing and then catch and then there's nothing and then end. + +00:57:47.160 --> 00:57:47.720 +I'm like, what? + +00:57:47.880 --> 00:57:50.360 +You didn't handle anything. + +00:57:50.360 --> 00:57:58.080 +Or the handling is just I'm going to print the entire system error to the screen with the stack trace and a mess. + +00:57:58.180 --> 00:57:58.940 +Nope, that's gross. + +00:57:59.560 --> 00:58:02.260 +I'm just going to not properly recover. + +00:58:02.740 --> 00:58:02.860 +Right. + +00:58:02.980 --> 00:58:07.860 +And so application resilience is important, but you can't have that at all. 
+
+00:58:07.860 --> 00:58:09.540
+If you're not doing this, you can't recover.
+
+00:58:09.920 --> 00:58:09.940
+Yeah.
+
+00:58:10.000 --> 00:58:14.380
+Or you don't use a database transaction and the data is corrupted, something like that.
+
+00:58:14.380 --> 00:58:27.580
+This is where a lot of business logic flaws, like really unique bugs happen that are harder to find because we are not handling our errors at all or we're handling them very, very poorly.
+
+00:58:27.580 --> 00:58:36.320
+And I was really excited to have this on here because lack of application resilience tied for this one for spot number 10.
+
+00:58:36.760 --> 00:58:40.700
+But if you solve this, you almost always solve lack of application resilience.
+
+00:58:40.700 --> 00:58:44.680
+But if you solve lack of application resilience, you do not solve this.
+
+00:58:45.220 --> 00:58:48.440
+And so I was like, and so that's how I got them to agree to put this one on.
+
+00:58:48.480 --> 00:58:53.680
+And so the other one, having technical discussions with really smart people, it's pretty cool.
+
+00:58:54.240 --> 00:58:54.640
+Absolutely.
+
+00:58:55.340 --> 00:59:04.080
+So I want to take a moment and talk about AI and security and give you a chance to talk about your prompt library and how people can get it.
+
+00:59:04.180 --> 00:59:07.120
+And while you're doing that, I'm going to pull up an example I can kick off.
+
+00:59:07.240 --> 00:59:08.800
+So tell people about it and I'll pull up the example.
+
+00:59:08.800 --> 00:59:12.440
+I give training and I do this bad, better, best thing where I give an example.
+
+00:59:12.580 --> 00:59:17.260
+So I'm like input validation or whatever the topic is, you know, like a brief lecture on it and best practices.
+
+00:59:17.600 --> 00:59:19.460
+Then I give an example of bad code.
+
+00:59:19.600 --> 00:59:21.640
+Then we fix that thing, better code.
+
+00:59:21.700 --> 00:59:23.380
+And then best code is, like, layers of defenses.
+ +00:59:23.740 --> 00:59:34.160 +And when I was creating these examples with the AI, Michael, every time the example was bad code, like no security control whatsoever or completely incorrectly done. + +00:59:34.160 --> 00:59:38.220 +So like you get the input, you use it and then you validate it. + +00:59:38.580 --> 00:59:38.960 +Right. + +00:59:39.120 --> 00:59:41.540 +So it has gotten better. + +00:59:41.740 --> 00:59:46.180 +So over the past two years, I've seen it go from every time bad to maybe half the time. + +00:59:46.240 --> 00:59:47.100 +It's a bad example. + +00:59:47.220 --> 00:59:49.640 +Sometimes I have to dumb it down now, which is encouraging. + +00:59:49.640 --> 00:59:51.940 +But that's obviously not what we want. + +00:59:52.560 --> 00:59:57.480 +And so the AI, I think everyone knows, is not creating great code. + +00:59:57.640 --> 01:00:01.840 +And the reason is it was trained on not great code. + +01:00:02.180 --> 01:00:03.380 +Most code out there is not great code. + +01:00:03.580 --> 01:00:11.900 +The code specifically it used was demos, examples, things on GitHub, publicly available demos where there's no security team involved. + +01:00:11.900 --> 01:00:12.380 +Right. + +01:00:12.380 --> 01:00:12.500 +Right. + +01:00:12.680 --> 01:00:25.400 +So like if you went and scanned the code inside Microsoft that makes the Microsoft products, you better believe it, that'd probably be pretty darn good code versus some random crap Tanya did five years ago that's on her GitHub. + +01:00:25.700 --> 01:00:29.100 +That might be really crappy or it might even be intentionally vulnerable. + +01:00:29.520 --> 01:00:29.700 +Right. + +01:00:30.120 --> 01:00:30.680 +And it doesn't. + +01:00:30.980 --> 01:00:31.120 +Yeah. + +01:00:31.380 --> 01:00:31.700 +No. + +01:00:31.920 --> 01:00:40.440 +And so as a result, we have this thing that's trained that security just it's optional, it's low priority and it's missing. 
+ +01:00:40.440 --> 01:00:43.880 +And so it is doing what it was trained to do. + +01:00:44.420 --> 01:00:50.060 +And developers and non-developers are constantly making apps now. + +01:00:50.560 --> 01:00:54.460 +We have CEOs making apps because they don't like what the marketing team did. + +01:00:54.500 --> 01:00:55.940 +And they're like, look what I did over the weekend. + +01:00:56.940 --> 01:00:57.300 +Boom. + +01:00:57.500 --> 01:00:59.840 +It's publish, please, because I'm the boss. + +01:01:00.260 --> 01:01:00.940 +Oh, I've seen it. + +01:01:01.060 --> 01:01:01.480 +Like literally. + +01:01:01.800 --> 01:01:01.860 +Yeah. + +01:01:01.900 --> 01:01:02.760 +Who's going to say no, right? + +01:01:02.840 --> 01:01:03.320 +Yeah, exactly. + +01:01:03.320 --> 01:01:12.940 +And so here we have very, very insecure code going onto the internet very, very quickly, often with no time for the security team to go look at it. + +01:01:13.060 --> 01:01:13.140 +All right. + +01:01:13.160 --> 01:01:18.620 +So you've got this prompt library that people can go and get from your website for free. + +01:01:18.940 --> 01:01:22.360 +You gave me an example to say, go find problems in this code. + +01:01:22.400 --> 01:01:25.520 +I took just some random code that I know has trouble in it and threw it in here. + +01:01:25.520 --> 01:01:34.160 +So the secure code prompt library, if you want to go, just go securemyvibe.ca and you do have to join my newsletter to get it. + +01:01:34.220 --> 01:01:38.240 +But I feel that's a reasonable price because my newsletter is awesome and you get memes. + +01:01:38.480 --> 01:01:40.980 +But anyway, so this is from that. + +01:01:41.220 --> 01:01:46.140 +So the prompt library has many things, but one of them is to review the code for security. + +01:01:46.400 --> 01:01:47.980 +So this is a code review prompt. + +01:01:48.160 --> 01:01:50.480 +So after you've generated the code, you would put this in. 
+
+01:01:50.480 --> 01:01:51.740
+We have high risk findings.
+
+01:01:52.220 --> 01:01:54.840
+That looks like an, and what number was that?
+
+01:01:55.080 --> 01:01:57.960
+We got more findings, mass assignment, unvalidated JSON.
+
+01:01:59.080 --> 01:01:59.320
+Yeah.
+
+01:01:59.400 --> 01:02:00.940
+And look how short your code was.
+
+01:02:01.020 --> 01:02:03.040
+I gave it 62 lines of code here.
+
+01:02:03.120 --> 01:02:07.060
+And you are going to have more vulnerabilities than you have lines of code.
+
+01:02:07.160 --> 01:02:07.420
+Okay.
+
+01:02:07.540 --> 01:02:11.740
+So I'm not going to go into the details, but wow, I just gave it a little bit and it pulled up a whole bunch.
+
+01:02:11.820 --> 01:02:13.560
+So I think that that is.
+
+01:02:13.940 --> 01:02:18.500
+Did this find more than when you just asked it to review for vulnerability?
+
+01:02:18.500 --> 01:02:19.240
+Yeah, I think so.
+
+01:02:19.320 --> 01:02:20.120
+I think it did actually.
+
+01:02:20.480 --> 01:02:23.080
+Because if you put the AI in the right frame of mind, right?
+
+01:02:23.240 --> 01:02:23.980
+That's incredible.
+
+01:02:24.200 --> 01:02:26.820
+Well, and I gave it specific things that I wanted it to look for.
+
+01:02:27.080 --> 01:02:28.960
+So the prompt library has three levels.
+
+01:02:29.120 --> 01:02:37.920
+So prompt level one, you would make, you would add it to your memory or make a Claude skill, but you would make it run 100% of the times that you generate code.
+
+01:02:38.260 --> 01:02:47.360
+And it takes most of the first two thirds of my most recent book, Alice and Bob Learn Secure Coding, and it has condensed it into a set of prompts.
+
+01:02:47.500 --> 01:02:48.140
+Oh, that's awesome.
+
+01:02:48.140 --> 01:02:51.800
+When you build the code, these are the rules for doing so.
+
+01:02:52.420 --> 01:02:54.800
+Then, so that runs just every single time.
+ +01:02:54.800 --> 01:03:00.600 +And then it tells you all of its security assumptions and it flags any potential security issues for you automatically. + +01:03:01.040 --> 01:03:03.880 +So every time you generate code, it's like, I need you to know these things. + +01:03:03.880 --> 01:03:05.500 +And so then you can address them. + +01:03:05.500 --> 01:03:12.640 +And then level two prompts are, well, I'm going to build an API or I'm going to build a serverless app or I'm going to do this or I'm going to do that. + +01:03:12.840 --> 01:03:19.280 +And then you fill in the blanks and it helps basically set security requirements before the code's generated. + +01:03:19.280 --> 01:03:23.460 +So it does the first prompt and then that as like a double check. + +01:03:23.800 --> 01:03:26.740 +Then after you can run the secure code review check. + +01:03:26.980 --> 01:03:29.680 +And then level three is like where you want to get nitty gritty. + +01:03:29.880 --> 01:03:35.920 +Like you're like, I'm doing a user login feature and I want to hash these passwords very securely. + +01:03:36.280 --> 01:03:40.000 +Like, and then it's very specific about exactly how to do that. + +01:03:40.220 --> 01:03:40.580 +And it's free. + +01:03:40.660 --> 01:03:40.760 +Yeah. + +01:03:40.780 --> 01:03:41.940 +People should definitely check this out. + +01:03:41.980 --> 01:03:42.440 +That's very cool. + +01:03:42.520 --> 01:03:42.760 +All right. + +01:03:42.820 --> 01:03:44.100 +We are out of time, Tanya. + +01:03:44.200 --> 01:03:45.400 +Thank you so much for being here. + +01:03:45.400 --> 01:03:46.280 +Final thoughts. + +01:03:46.380 --> 01:03:49.040 +People want to get going with the new top 10. + +01:03:49.240 --> 01:03:50.760 +Please go take a look at it. + +01:03:50.820 --> 01:03:53.820 +So just look up OWASP top 10 and that will be us. 
+
+01:03:53.940 --> 01:04:00.860
+Like Google's very good at finding us and maybe give it a read and maybe think about it the next time you are building an app.
+
+01:04:01.280 --> 01:04:04.360
+Also, maybe consider visiting your local OWASP chapter.
+
+01:04:05.100 --> 01:04:14.280
+Next time you want to, you know, search the internet how to do something that is security related, look up OWASP cheat sheets and then authentication, authorization or whatever you're doing.
+
+01:04:14.280 --> 01:04:15.600
+There's over 100 cheat sheets.
+
+01:04:16.220 --> 01:04:19.820
+We are a community that lives to serve and help you secure your software.
+
+01:04:20.680 --> 01:04:21.800
+And come check me out.
+
+01:04:21.900 --> 01:04:26.700
+If you look up SheHacksPurple, I am all the things, the newsletter, the podcast, the blog, et cetera.
+
+01:04:26.980 --> 01:04:28.360
+And I'm also here to help.
+
+01:04:28.820 --> 01:04:30.040
+Well, I know you're doing really good stuff.
+
+01:04:30.140 --> 01:04:31.220
+I really appreciate your time here.
+
+01:04:31.520 --> 01:04:31.900
+Thank you, Tanya.
+
+01:04:32.020 --> 01:04:32.600
+Thank you, Michael.
+
+01:04:34.660 --> 01:04:36.980
+This has been another episode of Talk Python To Me.
+
+01:04:37.120 --> 01:04:38.080
+Thank you to our sponsors.
+
+01:04:38.280 --> 01:04:39.560
+Be sure to check out what they're offering.
+
+01:04:39.720 --> 01:04:41.120
+It really helps support the show.
+
+01:04:41.120 --> 01:04:45.700
+This episode is brought to you by Temporal, durable workflows for Python.
+
+01:04:45.960 --> 01:04:52.680
+Write your workflows as normal Python code and Temporal ensures they run reliably, even across crashes and restarts.
+
+01:04:52.940 --> 01:04:55.960
+Get started at talkpython.fm/Temporal.
+
+01:04:56.240 --> 01:05:08.740
+If you or your team needs to learn Python, we have over 270 hours of beginner and advanced courses on topics ranging from complete beginners to async code, Flask, Django, HTML, and even LLMs.
+
+01:05:08.740 --> 01:05:11.420
+Best of all, there's no subscription in sight.
+
+01:05:11.840 --> 01:05:13.600
+Browse the catalog at talkpython.fm.
+
+01:05:14.240 --> 01:05:18.920
+And if you're not already subscribed to the show on your favorite podcast player, what are you waiting for?
+
+01:05:19.380 --> 01:05:21.400
+Just search for Python in your podcast player.
+
+01:05:21.500 --> 01:05:22.360
+We should be right at the top.
+
+01:05:22.700 --> 01:05:25.680
+If you enjoyed that geeky rap song, you can download the full track.
+
+01:05:25.780 --> 01:05:27.680
+The link is actually in your podcast player show notes.
+
+01:05:28.400 --> 01:05:29.820
+This is your host, Michael Kennedy.
+
+01:05:30.020 --> 01:05:31.300
+Thank you so much for listening.
+
+01:05:31.500 --> 01:05:32.280
+I really appreciate it.
+
+01:05:32.680 --> 01:05:33.420
+I'll see you next time.
+
+01:06:03.420 --> 01:06:33.400
+Thank you.
diff --git a/transcripts/546-self-hosting-apps-for-python-people-transcript-final.txt b/transcripts/546-self-hosting-apps-for-python-people-transcript-final.txt
new file mode 100644
index 0000000..63494c2
--- /dev/null
+++ b/transcripts/546-self-hosting-apps-for-python-people-transcript-final.txt
@@ -0,0 +1,1638 @@
+00:00:00 The cloud is convenient, until it isn't.
+
+00:00:02 You upload your photos, you sync your contacts, you click through the cookie banners, then prices go up, or you read about the family that lost their entire Google account over a medical photo sent to their doctor.
+
+00:00:12 At some point, the question shifts from, why would I run this myself, to why aren't I?
+
+00:00:18 My guest this week is Alex Kretzschmar.
+
+00:00:20 He's the head of DevRel at Tailscale, the longtime host of the Self-Hosted Podcast, and co-founder of LinuxServer.io.
+
+00:00:27 We cover what self-hosting really means in 2026, the apps worth running yourself, like Immich and Home Assistant, why Docker Compose ties it all together, and how Tailscale lets you reach any of it from anywhere without opening a single port.
+
+00:00:41 If you've been thinking about pulling your digital life back behind your own walls, this is your roadmap.
+
+00:00:46 This is Talk Python To Me, episode 546, recorded April 27th, 2026.
+
+00:00:53 Talk Python To Me, yeah, we ready to roll, upgrading the code, no fear of getting old, async in the air, new frameworks in sight, geeky rap on deck, Quarth Crew, it's time to unite, we started in Pyramid, cruising old school lanes,
+
+00:01:08 had that stable base, yeah, sir.
+
+00:01:09 Welcome to Talk Python To Me, the number one Python podcast for developers and data scientists.
+
+00:01:14 This is your host, Michael Kennedy.
+
+00:01:16 I'm a PSF fellow who's been coding for over 25 years.
+
+00:01:20 Let's connect on social media.
+
+00:01:21 You'll find me and Talk Python on Mastodon, Bluesky, and X.
+
+00:01:25 The social links are all in your show notes.
+
+00:01:28 You can find over 10 years of past episodes at talkpython.fm, and if you want to be part of the show, you can join our recording live streams.
+
+00:01:35 That's right, we live stream the raw, uncut version of each episode on YouTube.
+
+00:01:40 Just visit talkpython.fm/youtube to see the schedule of upcoming events.
+ +00:01:44 Be sure to subscribe there and press the bell so you'll get notified anytime we're recording. + +00:01:48 Temporal is hosting their yearly conference, Temporal Replay. + +00:01:52 Join your peers at Replay, the conference on orchestrating durable workflows and agents. + +00:01:57 May 5 to 7 in San Francisco. + +00:01:59 Visit talkpython.fm/temporal dash replay and use the code TALKPYTHON75, all one word, all caps, to save up to $449 on your ticket. + +00:02:10 Alex, welcome to Talk Python To Me. + +00:02:12 Well, thanks for having me. + +00:02:13 This is Comfy Surroundings. Hello. + +00:02:15 I'm really excited to be talking about self-hosting, something I have talked around on the podcast a little bit, and I had the Home Assistant guys on for a while long ago when Home Assistant was this little boutique thing that people might find interesting. + +00:02:29 Now it's kind of blown up, but I'm really looking forward to talking about digital sovereignty, running your own apps, not being dependent on huge tech companies for every little thing, and just the joy of finding something in open source + +00:02:44 or just out there and going, hey, what if I just run that myself? + +00:02:48 And so I thought of Alex, thought of you, and I said, hey, we got to talk about this. + +00:02:52 Great, yeah. + +00:02:52 Well, thanks for having me. + +00:02:54 Yeah, you bet. + +00:02:54 So before we dive into all those things, give people a bit of a background about yourself. + +00:02:58 Yeah, well, I'm Alex. + +00:02:59 I, as you perhaps can tell from the accent, originally hail from the UK. + +00:03:03 I live in North Carolina these days, though, for my sins. + +00:03:06 And I work for Tailscale. + +00:03:08 I head up their DevRel department and primarily make YouTube videos for them. + +00:03:13 You know, it's an interesting company to work for because it's all product-led growth. 
+ +00:03:18 So my job is to really get people enthused and excited about the product and all the interesting ways in which they can access their stuff remotely. + +00:03:26 And then, I don't know, people bring it to work and that's how the company makes money. + +00:03:31 So I get paid essentially to make YouTube videos about hacking on self-hosted applications. + +00:03:36 And I still don't quite know how that happened. + +00:03:39 I think you guys over at Tailscale are doing a great job. + +00:03:42 We're going to go into it later when we get into sort of the security and accessing stuff of all the self-hosting things. + +00:03:48 But I started using Tailscale a couple of years ago and yeah, it's fabulous. + +00:03:52 So very nicely done. + +00:03:55 So some other stuff I've done, I used to do a podcast called Self-Hosted, wrapped that up last year. + +00:04:00 But I do a new one now called Bitflip with a few of my buddies. + +00:04:03 And again, from the self-hosting universe, you can find out more about that at bitflip.show. + +00:04:07 I hope some self-promotion's okay. + +00:04:09 But please, yes. + +00:04:10 No, that sounds great. + +00:04:11 Because I was a little disappointed to hear that you shut down Self-Hosted, the podcast, because I was just getting into it. + +00:04:16 And then, so you're back. + +00:04:18 Yeah, we did it. + +00:04:18 How does it differ? + +00:04:19 Well, not much really. + +00:04:21 So the weird thing was about Self-Hosted, and I don't know if you felt this with a show with Python in the name, but I kind of felt a little bit limited by the title because I tend to approach things from a very pragmatic angle. + +00:04:34 We were just talking before we pressed record about how important Linux and open source and all this kind of stuff is. + +00:04:41 And yet I'm using a MacBook to record, not Linux, because it's just bulletproof reliable for media applications. 
+
+00:04:47 And there are all these little compromises we make all throughout our digital lives.
+
+00:04:50 And so Self-Hosted as a movement, particularly in the subreddit actually, is very opinionated.
+
+00:04:59 And unless you're doing absolutely everything, lock, stock, 100% yourself, there are some people who say, well, you're holding it wrong.
+
+00:05:06 You're not doing it properly.
+
+00:05:08 My approach has always been, it's okay to have DigitalOcean run a VPS for me, but I've still got root to that VPS and I am hosting my own website.
+
+00:05:17 I'm self-hosting my own websites.
+
+00:05:18 But to some people that definition doesn't sit right.
+
+00:05:21 And so- You got to be running on a Raspberry Pi in your basement.
+
+00:05:25 If that's not the way it is, it's not true.
+
+00:05:27 Right.
+
+00:05:27 And we all know that there are just limitations to doing things, like maybe you're moving house and so your website would be offline for two weeks whilst you move house.
+
+00:05:36 That's probably not okay.
+
+00:05:38 Or there's a storm in your area or a water pipe bursts or like any number of fates can befall things in your house.
+
+00:05:45 And I'm not saying these things can't happen to a data center, but there are just mitigations in place between, you know, even just things like ISP peering and like the data center is probably in the middle of an internet exchange building, whereas my house definitely is not.
+
+00:05:58 So I kind of wrapped up the Self-Hosted podcast just a little bit because I felt like, I don't think I feel this way anymore, but sort of 18 months ago when we wrapped it up, that self-hosting had, we kind of said all we needed to say and that as a movement,
+
+00:06:12 it was just kind of bubbling away in the background and those that had found it were going to find it and it was just sort of ticking over.
+
+00:06:18 But I don't know, self-hosting is all of a sudden trendy these days.
+
+00:06:22 I think I heard Linus on the WAN Show on Friday literally saying that building a NAS is trendy.
+
+00:06:28 And I'm like, what?
+
+00:06:29 Is it?
+
+00:06:30 Okay, cool.
+
+00:06:32 Well, I'm here for it.
+
+00:06:32 Yeah, I'm here for it as well.
+
+00:06:34 And I'm glad to hear you're still carrying on with the podcast under a different banner.
+
+00:06:37 Well, the reality is that a lot of this stuff, like I said, like I do for Tailscale and for Bitflip now is this is stuff I'm doing anyway.
+
+00:06:45 Like my personal YouTube channel as well at KTZ Systems.
+
+00:06:49 Like I just, I'm just always like, just out of shot over here.
+
+00:06:52 There is a desk covered with like five of those little Lenovo mini PCs that I'm putting into a little Proxmox Ceph cluster because I woke up last week and my Home Assistant was down because my little Minisforum MS01 had lit itself on fire in the middle of the night.
+
+00:07:07 And I found, ah, there's a single point of failure.
+
+00:07:09 I can fix that with some clustering and high availability and so the rabbit hole continues.
+
+00:07:14 Yes, I love that you have high availability on your home network.
+
+00:07:18 I'm working on it, which is another story.
+
+00:07:20 But so it turns out these little Lenovo PCs, you can pick them up for about $150 or so.
+
+00:07:27 Even today, even in the hardware apocalypse that we're going through.
+
+00:07:31 These, you know the ones I mean, like the little one-liter PCs, right?
+
+00:07:34 Usually bolted onto the back of a monitor in an office or something.
+
+00:07:37 And you can pick those up for about $150 and they will run every self-hosted app you could possibly throw at them.
+
+00:07:43 In reality, certainly just for individual use, they are absolutely all the average person needs as a home server.
+
+00:07:50 And so one of the things I like to do with them is put what's called Proxmox on it, which is a hypervisor that lets you run virtual machines, something called LXCs, Linux containers, as well as Docker.
+
+00:08:01 We love us some Docker, I understand.
+
+00:08:04 Basically, if it doesn't run in Docker, I don't run it.
+
+00:08:07 I'm just going to trigger some people in the audience, I'm sure.
+
+00:08:10 You know what?
+
+00:08:10 I'm with you.
+
+00:08:11 When I go and look at one of these things that is potentially self-hosted, I'm like, well, where's the Docker Compose file?
+
+00:08:16 Otherwise, I'm not sure we're going to be continuing down this path.
+
+00:08:19 I mean, you Python people know all about standardized packaging formats and stuff like that.
+
+00:08:23 Like the prevalence of pip and then these days, uv, of course.
+
+00:08:27 Like, you know, these things matter.
+
+00:08:29 They're like how you round off those rough edges of how it gets from my keyboard in my lab to your computer and wherever you are.
+
+00:08:38 Docker kind of closed that last 10%.
+
+00:08:40 I mean, a lot of the primitives of Docker existed well before, like cgroups and namespaces in the Linux kernel.
+
+00:08:46 All that stuff existed for years before Docker came along.
+
+00:08:50 All they did really was provide a standardized packaging format, which is really just a tarball, and a standard way of building those tarballs with a Dockerfile, like a recipe.
+
+00:08:59 That was all they did, and provided a little bit of plumbing and networking.
+
+00:09:02 Like, we just ignore all the technical details they did.
+
+00:09:05 But essentially, they just closed that last 10% of usability, and suddenly, me, as a computer science student, could run any application in the world without having to dive into systemd and init scripts and database migrations
+
+00:09:20 and blah, blah, blah, blah.
+
+00:09:21 It was just...
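The "where's the Docker Compose file?" test usually comes down to a file shaped like this. A minimal sketch for a hypothetical self-hosted app; the image name, port, and volume path are placeholders, not a real project:

```yaml
services:
  app:
    image: example/selfhosted-app:1.7.0   # hypothetical image; pin a version rather than :latest
    restart: unless-stopped
    ports:
      - "8080:8080"                       # host port : container port
    volumes:
      - ./data:/app/data                  # app state lives in a plain host folder, easy to back up
    environment:
      - TZ=America/New_York
```

With that file in place, `docker compose up -d` starts the app, and all of its state sits in `./data` where it can be backed up or snapshotted.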
+ +00:09:22 Yeah, complex networking, attached volumes. + +00:09:24 Like, there's a lot of stuff going on there, yeah. + +00:09:26 Yeah. + +00:09:26 Docker is life in this house. + +00:09:28 A long time ago now, I co-founded a website called Linuxserver.io. + +00:09:32 I don't know if anybody in the audience has heard that, but it's the largest, I believe, sort of open-source containerization movement project on the internet, and that was born out of the fact that sort of 10, gosh, yeah, maybe 12 years ago + +00:09:45 that Docker was pre-1.0, so it was very sort of nascent at that point, and it was... + +00:09:52 There was just... + +00:09:53 There were no standards. + +00:09:54 Like, the readmes were all over the place, or if there even was one. + +00:09:58 There were no sort of standardized base images. + +00:10:00 People hadn't cottoned on to, like, supply chains, and, you know, today it's a hot topic, but sort of back then, it was, oh, if it runs, I'm happy, you know? + +00:10:09 So Linux Server was sort of my attempt, our attempt, I should say, at fixing some of those issues, and, you know, we packaged up media server apps back in the day, like Plex, and some of the other slightly less salubrious + +00:10:23 applications you might find on the internet, as well as a bunch of other self-hosting stuff, which we should probably get into talking about some of the apps. + +00:10:32 Yeah, absolutely. + +00:10:32 Well, to kind of put a bookend on your introduction, I do just want to quickly ask you about your racing and VIR and stuff like that. + +00:10:43 You know, and I was looking to contact you, I was going through your About page, and I saw a car racing around a racetrack, and I thought, well, can't not talk about that. + +00:10:51 I've had folks from Formula One and from NASCAR on the show before, and I'm a big fan of these kinds of things. + +00:10:57 Yeah, I do too. + +00:10:58 So, that's one of your hobbies? + +00:11:00 That's pretty awesome. 
+ +00:11:01 Yeah, I've followed Formula One since, well, I remember sitting on my dad's knee as a kid watching Damon Hill, Nigel Mansell, go around Silverstone, so it's been a while. + +00:11:12 There's obviously a new crop of F1 fans, which is amazing, thanks to the Drive to Survive stuff, but I've followed it for years. + +00:11:19 I just enjoy watching, I just enjoy watching the sport. + +00:11:22 It's like a nerd soap opera in a way. + +00:11:26 Not a fan, honestly, of these new regs, though, with the sort of the super clipping and all this kind of stuff. + +00:11:32 It'd be interesting to see what happens when we get to, where's the next one? + +00:11:36 Miami, I think. + +00:11:37 Yeah, I believe it is Miami and then Canada. + +00:11:39 So, for people who don't know out there, Formula One is called Formula One because there's one formula on how to build the cars, but then all the teams generally, almost from scratch, build their cars. + +00:11:49 And every couple, every four or five years, they're like, okay, we're completely doing it differently. + +00:11:54 And so, this year, they've completely done it differently and there's a lot of controversy. + +00:11:58 I don't know, it's interesting, but. + +00:12:00 Yeah, they've gone for like this 50-50 split between the combustion engine and the battery power, but the batteries can't harvest enough energy every lap. + +00:12:09 So, I don't know what genius thought of that, but, so they get halfway around the lap and they lose half of their horsepower, which can mean you've got closing speeds between cars of sort of, I don't know, 50 to 100 miles per hour. + +00:12:22 And we saw in Japan in the last race, quite a bad accident as a consequence. + +00:12:27 Right there in Spoon, it wasn't pretty. + +00:12:30 I know all of the electric stuff and like the hybrid things and IndyCar and even way, way more so in Formula One is for environmental friendliness. 
+
+00:12:39 And hey, I drive an electric car, I love electric cars and I'm all about caring about the environment and stuff, but the 20 cars driving around the track is nothing compared to the 300,000 people that took airplanes to get there.
+
+00:12:51 And then when they ship the cars on planes halfway around the world, like the fuel spent when they're racing, it has nothing to do with, you know, it doesn't even register on the number of the environmental impact of that.
+
+00:13:02 So I don't know, I kind of long for the Damon Hill days with like, Oh, me too.
+
+00:13:06 Fast engines, you know.
+
+00:13:07 On our honeymoon, actually, my wife and I, we ended up in Milan on race weekend, totally by accident, genuinely by accident.
+
+00:13:15 We were booking this like Interrail trip around Europe and my itinerary landed us in Milan on race weekend.
+
+00:13:21 I didn't actually know at the time and all hotels for that weekend spiked.
+
+00:13:25 They were like two or three times the existing cost and I'm like, what's going on?
+
+00:13:28 So I just typed Milan events, September, whatever.
+
+00:13:31 Anyway, turns out, so we went to Monza and I'll never forget we were stood at, it was the Ascari chicane so it's on the opposite side from the start/finish straight and the noise, I think there were V8s, I don't think there were V10s, I think there were V8s then but just the noise of them sitting on the grid
+
+00:13:45 waiting to go, it was like a bunch of angry wasps and you could hear it and it's half a mile away.
+
+00:13:52 Amazing, amazing.
+
+00:13:53 We lost something when they went to the V6 turbo hybrid stuff.
+
+00:13:56 100%.
+
+00:13:57 All right, last bit, I mean a lot of people are fans of F1 and racing, not many of them end up on a race track.
+
+00:14:03 Oh yeah, that's a whole different kind of rush.
+
+00:14:05 Yeah, so I've been into, I've owned seven Volkswagen Golfs over the years, culminating in the Golf R a few years ago and I just had to take it on a track.
+
+00:14:17 Like in England I went on this run what, we call it a run what you brung track evening and I went to Brands Hatch and I literally turned up without even a helmet, without doing any prep or whatever and they just let me on track.
+
+00:14:29 Just, I couldn't believe it.
+
+00:14:30 And then I had the best evening of my life and then we emigrated and came here and I was like, I've got to scratch that itch.
+
+00:14:36 So I went to the internet and found out to go to VIR you have to do all sorts of training and get like instructors and it all sounded a bit much.
+
+00:14:44 But anyway, VIR is a serious racetrack.
+
+00:14:46 Like you can end up I think on the back straight in my little Golf I was doing 140 on the back straight and there are moments coming up through the uphill esses at VIR where you're just like, if this goes wrong she's going to hurt.
+
+00:15:00 And in the end I ended up scaring myself a bit silly but I had real fun but there was just a couple of moments where I was like, you know, I've got a kid at home.
+
+00:15:08 I should probably, this is a young man's game or an old man's game when you've got nothing left to lose, I guess.
+
+00:15:14 Yeah, that's true.
+
+00:15:15 There's a, it's a bimodal sort of experience.
+
+00:15:18 Yeah.
+
+00:15:19 But I learned a lot like I learned how to change brake pads, brake fluid.
+
+00:15:23 I fitted a new intercooler to my car.
+
+00:15:25 I upgraded the turbo.
+
+00:15:26 I did tuning.
+
+00:15:27 Like technical stuff.
+
+00:15:28 I like learning how things work.
+
+00:15:30 Same with software, same with cars.
+
+00:15:31 It's basically just one is slightly more visceral and arguably the stakes are a bit higher if you screw up installing a turbo it can be very expensive.
+
+00:15:40 It's worse than, oh, I got to reinstall that.
+
+00:15:43 Yeah.
+
+00:15:43 Yeah.
+
+00:15:43 Good fun though.
+
+00:15:44 No, I'm sure it's amazing.
+
+00:15:45 That sounds very, very cool.
+
+00:15:47 So what a great experience.
+
+00:15:48 Let's talk the main, main topic.
+
+00:15:51 Like, I guess we've been using the word without really defining it.
+
+00:15:54 Like what is self-hosting for people who are just like, you know, they, they haven't done these sorts of things.
+
+00:15:59 I think as I, as I alluded to earlier, there's a broad spectrum of definitions to what self-hosting means to different people, depending on how tightly you hold certain beliefs around definitions.
+
+00:16:12 But for me, it means the business model that exists is feeding the open source developer or small team that built it.
+
+00:16:21 Like it's, it's not, are you familiar with Cory Doctorow and his idea of enshittification?
+
+00:16:27 Yeah.
+
+00:16:27 The idea that a company will give some, we, we've been accused of this at Tailscale and I don't think it's actually going to happen.
+
+00:16:34 The CEO at Tailscale, I have great faith in Avery's leadership, honestly.
+
+00:16:39 I know I sound like a corporate shill saying that, but I genuinely believe it.
+
+00:16:42 So, the idea of enshittification is that a company takes a bunch of money from venture capital or some other source and gives the product away.
+
+00:16:51 We saw it with Uber, for example.
+
+00:16:53 Like they give the product away at a loss-leading price point to gain market share.
+
+00:16:58 We've seen it in multiple industries over the years.
+
+00:17:00 Walmart is a great example.
+
+00:17:01 They'll put mom-and-pop stores out of business in the local town and then slowly raise the prices.
+
+00:17:06 Right, right.
+
+00:17:06 Once everyone's gone, it's, it's, they have no choice but to go there.
+
+00:17:09 Exactly.
+
+00:17:10 And so the idea of enshittification in software is, is very prevalent.
+
+00:17:15 We've, we're seeing it with streaming services right now where they're just gradually turning the screw, lifting the prices, pulling out shows without your control.
+
+00:17:23 All of these things really boil down to one central point.
+
+00:17:28 I mentioned the business model.
+
+00:17:29 That's one thing, but really it's control.
+
+00:17:31 And do you have control over the services that are running your life?
+
+00:17:36 If you have Google in your life, you probably don't.
+
+00:17:38 If you have Apple in your life, you probably don't.
+
+00:17:40 You feel like you do, but there are countless examples.
+
+00:17:44 For example, there was one a couple of years ago where, I think this was in the New York Times.
+
+00:17:49 We definitely covered this on Self-Hosted a while ago where a mother took pictures of their kids, a medical issue of their kids, private areas, and sent it to their doctors through telehealth.
+
+00:18:03 They also sent the picture to their husband through a messaging app, which then meant that that picture got backed up to, I think it was Google Photos.
+
+00:18:11 It might have been Amazon.
+
+00:18:12 Please don't quote me on this.
+
+00:18:13 I'm just speaking from two-year-ago memory.
+
+00:18:15 And they got flagged as a CSAM issue, like a child pornography issue.
+
+00:18:20 And they had most of their digital life cancelled.
+
+00:18:24 They were locked out of their accounts.
+
+00:18:25 They were basically banned from that company.
+
+00:18:28 Might have been Google.
+
+00:18:29 Let's go with Google.
+
+00:18:31 And just the idea of being locked out of my Gmail.
+
+00:18:34 I mean, just stop and think about how much of your life is in your Gmail inbox.
+
+00:18:37 How long have you had yours?
+
+00:18:39 15 years.
+
+00:18:39 I think there's over a quarter million emails in my Gmail account.
+
+00:18:42 It's ridiculous.
+
+00:18:43 I mean, it is ridiculous.
+
+00:18:45 And extrapolate that from email to photos.
+ +00:18:49 Extrapolate that to music, to videos, to, I don't know, taxis and invoices, all this stuff. + +00:18:56 There are just so many different facets of our lives that we've given up to third parties that are either being used to train the next round of industrial revolution, oligarchy revolution, like AI models, or they're being used to feed an advertiser's + +00:19:11 bottom line and create a profile about you and who you are and what you do and who you associate with. + +00:19:16 Because make no mistake, when your photo gets uploaded to Google Photos, they are making a map of all the faces in that photo. + +00:19:24 Whether you know the person in the background or not, Google will know them because they probably have Google Photos too. + +00:19:29 And they can scan that Alex was stood next to Fred Smith on June the 21st, 1983. + +00:19:36 And like, they can create such incredibly detailed profiles about people. + +00:19:41 And if that doesn't bother you, self-hosting is probably not for you. + +00:19:43 But I don't know about, I don't know about you, but it makes me deeply uncomfortable that I'm giving up these freedoms and this privacy without really appreciating that I'm doing so. + +00:19:53 Like a lot of the transaction is very, what's the word I'm looking for? + +00:19:58 Like it's just not a fair, it's not a fair exchange of value for value. + +00:20:01 It's asymmetric. + +00:20:02 Yeah. + +00:20:03 Asymmetric. + +00:20:03 Very asymmetric. + +00:20:04 Yeah, absolutely. + +00:20:05 Totally. + +00:20:05 And I want to just, while we're sort of setting the stage, I just want to put an idea out there that this kind of stuff is super valuable and a good thing to keep in mind, not just for individuals, which 100% that it is, but also for developers running their software. 
+
+00:20:20 Do you necessarily need to take all of your data and put it into an AWS managed service or an Azure managed service or send all of your users' information through, say, Google Analytics to Google to then turn around
+
+00:20:35 and mine it or send it to other places?
+
+00:20:37 You don't have, I feel like people think they have to.
+
+00:20:39 You don't have to.
+
+00:20:40 It almost feels inevitable, doesn't it, these days?
+
+00:20:43 That, oh, well, everyone else is doing it.
+
+00:20:45 I may as well.
+
+00:20:46 Yeah.
+
+00:20:47 We'll get the cookie banner.
+
+00:20:48 We'll put it up.
+
+00:20:48 People are used to, everywhere they go, they click the cookie banner.
+
+00:20:52 True.
+
+00:20:52 But there are entirely serviceable alternatives to almost every single proprietary service that you have.
+
+00:20:59 Google Analytics, let's start with that one.
+
+00:21:01 There's an open source app called Plausible.
+
+00:21:03 It does almost everything that Google Analytics does.
+
+00:21:07 It just, the analytics stay within your world and they're not, they're not kind of fed into the Google machine.
+
+00:21:14 And whether that's a, like, on feature parity, there's an argument to be made there about, like, well, Google's more invasive so they have more data.
+
+00:21:22 I don't see that as a plus point, personally.
+
+00:21:25 This portion of Talk Python is brought to you by Temporal and the Temporal Replay Conference.
+
+00:21:31 Previously, I've told you about Temporal's open source framework and I've had Mason Egger on the podcast.
+
+00:21:36 If you've built background jobs or multi-step workflows, you know how messy things get with retries, timeouts, partial failures, and keeping state consistent.
+
+00:21:45 This is where Temporal's got your back with their open source framework.
+
+00:21:49 And if that kind of workload is what you're building, you should definitely consider attending the Temporal Replay Conference.
+
+00:21:53 It's hosted May 5-7 in Moscone Center in San Francisco.
+
+00:21:58 Join your peers at Replay.
+
+00:22:00 Temporal's conference on orchestrating durable workflows and agents.
+
+00:22:03 You'll learn real-world patterns for reliability, failure handling, and scale from developers building themselves, including speakers from OpenAI, Replit, and Abridge.
+
+00:22:13 Check out Replay 2026 at talkpython.fm/temporal dash replay and use the code talkpython75 all one word to save up to $449 on your ticket.
+
+00:22:25 That's talkpython.fm/temporal dash replay and code talkpython75 all one word.
+
+00:22:32 The link is in your podcast player's show notes.
+
+00:22:35 Thanks to Temporal for supporting the show.
+
+00:22:38 I don't either.
+
+00:22:39 And I think this is an interesting segue to finding some of the interesting apps here.
+
+00:22:43 So I went to pull up plausible.io and I think you're right.
+
+00:22:46 I think Plausible is really great.
+
+00:22:48 The one that I'm using is umami.is which is sort of a peer to Plausible.
+
+00:22:55 I believe, I think you can pay for both of them.
+
+00:22:58 I'm not 100% sure about Umami right now.
+
+00:23:00 Yeah.
+
+00:23:01 I don't know, your ad blocker must be doing some heavy lifting over there because Plausible works just fine for me.
+
+00:23:05 You're using Umami, are you?
+
+00:23:06 Yeah, I'm using Umami and I looked at Plausible as well and Umami seemed a little more oriented towards self-hosting whereas Plausible's self-hosting seemed like oh, you could do it but we're kind of this like thing that we run in the cloud and you can pay for but you technically could
+
+00:23:21 and I felt like Umami was like self-hosting first and, like I said, I'm pretty sure there is a paid version now as well.
+
+00:23:28 But I wanted to bring up this "This site can't be reached" error because I think another interesting thing is like hosting DNS.
+
+00:23:37 So like Pi-hole, I have nextdns.io which is why I can't go to Plausible right now unless I log in and tell it Plausible is okay.
+
+00:23:45 Same thing for Umami by the way.
+
+00:23:48 I think, what about, let's talk, let's, you're at Tailscale, let's talk networking.
+
+00:23:52 We'll get back to the use of Tailscale when we kind of wrap things up but like, do you use Pi-hole or do you use any of these sort of managed things outside just your browser?
+
+00:24:00 Well, the modern internet basically requires using an ad blocker.
+
+00:24:03 I mean, when you, I'm fortunate to work from home so I'm almost always within these four walls where I have an AdGuard Home instance running and my DHCP server when, whenever a device requests an IP address from the router,
+
+00:24:18 it will hand out the DNS server in my local network as the AdGuard Home instance.
+
+00:24:23 And AdGuard Home's job is to run a list of websites that it thinks are serving ads and it will block those at the DNS level.
+
+00:24:31 So simply what will happen is you will go to try and load a website and it can't load certain components of the webpage and those components happen to be adverts in this case.
+
+00:24:40 It's not 100% coverage but I'd say it's sort of in the 80 to 90% range which is still a heck of a lot better than having no ad blocking whatsoever.
+
+00:24:49 And the idea here is that a lot of these, well, first of all, adverts use a lot of bandwidth.
+
+00:24:54 They also are probably shoving down a ton of JavaScript into your browser so the performance of loading a webpage is worse.
+
+00:25:02 It's using more bandwidth, it's using more processing power and on mobile, of course, that matters.
+
+00:25:07 When I leave the house, I'm not under the umbrella of my AdGuard Home instance anymore because it's running on, I don't know, a Raspberry Pi in my basement.
+
+00:25:15 And so I've got a couple of options.
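The DNS-level blocking described here can be sketched in a few lines of Python. This is a toy model of the decision a sinkhole resolver like Pi-hole or AdGuard Home makes, not their actual code, and the blocklist entries are made up:

```python
# Toy DNS-sinkhole decision: if a queried name (or any parent domain)
# is on the blocklist, answer 0.0.0.0 so the ad never loads;
# otherwise pass the query through to a real upstream resolver.

BLOCKLIST = {"ads.example.net", "tracker.example.com"}  # made-up entries

def resolve(name: str, upstream) -> str:
    # Check the name and every parent domain against the blocklist.
    labels = name.lower().rstrip(".").split(".")
    for i in range(len(labels) - 1):
        if ".".join(labels[i:]) in BLOCKLIST:
            return "0.0.0.0"  # sinkhole: the browser gets an unreachable address
    return upstream(name)  # not blocked: ask the real resolver

fake_upstream = lambda name: "93.184.216.34"  # stand-in for a real DNS lookup
print(resolve("cdn.ads.example.net", fake_upstream))  # blocked via parent domain
print(resolve("talkpython.fm", fake_upstream))        # passes through
```

Because a real resolver does this for every query on the network, ads disappear in mobile apps and on TVs too, not just in a browser with an extension.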
+
+00:25:16 One is I can use a hosted DNS service like you do called NextDNS which basically does the same thing as a Pi-hole except you pay for it.
+
+00:25:26 I don't think it's a huge amount of money if I recall.
+
+00:25:28 It's a couple of bucks.
+
+00:25:29 It's either $1 or $1.99 a month.
+
+00:25:32 It's really small, yeah.
+
+00:25:34 It seems fair.
+
+00:25:35 And the idea behind NextDNS, like I say, is that it does the same thing as a Pi-hole or an AdGuard Home.
+
+00:25:41 It's just a hosted, managed service that you pay for.
+
+00:25:44 Or you can use something like Tailscale and tunnel back through your firewall remotely and set your AdGuard Home as your Tailnet DNS server and then use your AdGuard Home or your Pi-hole from your basement that you're already running, already configured
+
+00:25:59 with all of your ad lists and blah, blah, blah.
+
+00:26:01 You can configure that to be your DNS server.
+
+00:26:03 And my wife loves these sort of like mobile games like the Candy Crushes of the world.
+
+00:26:08 And they are just chock-full of ads.
+
+00:26:10 And we only really talk about it when we're like traveling because she's, oh God, I wish we were at home because then I wouldn't get adverts.
+
+00:26:17 Yeah, we'll just turn on Tailscale and lo and behold, no ads.
+
+00:26:20 You're back to good.
+
+00:26:21 I think one final little note about like running your, either your AdGuard Home or your NextDNS, if you register it at your router level that's really interesting is you block ads in mobile apps as well like you're mentioning or on my TV all the tracking
+
+00:26:36 the TV does is short-circuited because everything on the network is subjected to it.
+
+00:26:42 And I'm, you know, as long as these ad networks are serving up malicious ads, I don't feel bad about blocking them.
+
+00:26:49 That's another angle of course, yeah.
+
+00:26:50 Yeah, I mean, if we go to Talk Python, you know, the website, the ads are still there.
+ +00:26:57 Why? + +00:26:57 Because I'm not using some shady network to deliver it. + +00:27:00 I'm just sharing content and someone who happens to talk about what we're doing, you know, and so I think that that's a, I think that's certainly something worth considering, right? + +00:27:10 I feel like this DNS stuff is part of self-hosting at least the personal level a bit. + +00:27:14 It's the, it's one of the fundamentals, yeah. + +00:27:18 Networking is one of those things that you have to have it if you want to do anything in your house, like even, and I use my mother, who I love dearly, as the example of the non-technical person in my life. + +00:27:29 Even if my mum, like she orders a router from her ISP or something like that to get Wi-Fi in her house, well, she's doing networking, she doesn't realise it. + +00:27:39 She's getting a Wi-Fi SSID broadcast, she's getting an IP address from the router every time she connects. + +00:27:45 The DHCP server provides a DNS server, which is probably your ISP's DNS server by default, and they are recording all of your DNS queries and selling them to the highest bidder also, I might add. + +00:27:56 And so there are just so many layers to this onion, and DNS is the, just what, we have a five-year-old in the house, we just watched Shrek this weekend, hence the onion reference. 
+
+00:28:05 There are just so many layers to this onion that you just, you can keep peeling it forever, and this is one of the things that I genuinely love most about Linux, open source, self-hosting, that whole universe is that this conversation, I could literally sit here for eight hours and talk to you about different,
+
+00:28:20 you know, different things, like DNS is one thing, document management is another, media streaming is another, like each of these things, they're all, they're entire industries in their own right in the real world, but in self-hosting, you can play sysadmin, you can play, you know,
+
+00:28:35 the person who's running these megacorps offline, fully just in your basement, you know, and there's no, there's no business model to feed, it's literally just open source software, the true spirit of it, running in your house under your control.
+
+00:28:49 Yeah, we're definitely in danger of going for eight hours, so, I hope not, but we could, right, we definitely could, and by way of, I think that's a perfect transition to talk about this place called Awesome Self-Hosted here, which is a Git repository and a website,
+
+00:29:04 you know, I do, Alex, I think this is going to be a bit of a fad, it's not really catching on, there's only 288,000 GitHub stars on this.
+
+00:29:12 And if you look at it, you're familiar with the Awesome Lists, of course, there are dozens of these things, but Awesome Self-Hosted, I mean, it's updated daily, like, I look at the recent Git commits and it was last updated yesterday, and there are,
+
+00:29:27 how many categories?
+
+00:29:28 There must be.
+
+00:29:29 I don't know, but let me scroll, like, there's a couple of pages of just categories of things like e-commerce, DNS, for example, analytics.
+
+00:29:38 Right.
+
+00:29:39 You want to replace Jira?
+
+00:29:40 It's in here.
+
+00:29:41 You want to replace, I don't know, a wiki?
+
+00:29:44 It's in here.
+
+00:29:45 You know, it's honestly kind of overwhelming.
+
+00:29:49 And so this speaks a little bit to one of my overall philosophies when it comes to self-hosting of find a problem in your life and solve it, like a real problem.
+
+00:29:59 Don't just contrive one just for the sake of it.
+
+00:30:01 Photos is always the universal example I go to because everybody takes photos.
+
+00:30:05 And so you want to look at something like Immich, I-M-M-I-C-H.
+
+00:30:09 And that is a self-hosted Google Photos clone, and it lives entirely on your hardware that you control.
+
+00:30:14 It has machine learning, so it can learn your face.
+
+00:30:18 It can do, you know, object detection.
+
+00:30:21 It can do basically anything that Google Photos can do, except it lives on your hardware using your files and your compute until the end of time.
+
+00:30:30 And that's the end of it.
+
+00:30:31 Like, that's as deep as the rabbit hole goes.
+
+00:30:34 I love it.
+
+00:30:34 But it also makes me nervous.
+
+00:30:36 Good, it should.
+
+00:30:37 Because the thing with self-hosting is you get to play sysadmin, but it also means you own the data, which means when there's an outage or a hardware failure, you're on the hook for that too.
+
+00:30:48 Yeah, I'm not super concerned about an outage for my self-hosting thing, but I am certainly concerned about an outage of a self-hosted something for my production apps.
+
+00:30:58 And when I said it makes me nervous, yeah, yeah, but the things that make me nervous are twofold.
+
+00:31:04 The first thing that made me nervous would be just backup, backup and restore, or kind of losing access to it.
+
+00:31:11 Like something that I think it takes a while, at least for me, it took a while to learn the lessons through some paper cuts, was, oh, there's a new version of this thing that I'm self-hosting.
+
+00:31:19 How cool.
+
+00:31:19 Let's see what it is.
+ +00:31:20 Docker compose pull, Docker compose up, and then it won't start because there's some incompatible migration or something that I didn't run and I got to go read the docs and it says, oh, did you upgrade from version 1.6 to 1.8? + +00:31:33 You can't do that. + +00:31:33 You got to go to 1.7 and then 1.8. + +00:31:35 I'm like, now I'm an admin. + +00:31:37 But more concerned, like I had all this data, what if I can't get it to work on 1.8, but it's like a half database transition and then neither will run and now what do I do? + +00:31:45 Well, the best answer to that are some of the primitives around things like ZFS and snapshots. + +00:31:52 So there is this concept with, so ZFS, by the way, if you're not familiar, is the Zettabyte file system and it was born out of Sun Microsystems in the early 2000s, I believe. + +00:32:04 It's now unfortunately owned by Oracle, but there is a project called OpenZFS which is dedicated to bringing it to the masses, to normal people. + +00:32:13 There are still some weirdnesses around the licensing with ZFS, so it's not included by default in every single Linux distro, but it is included in things like Proxmox and Ubuntu and you can install it on Arch and NixOS and even Unraid, I think, + +00:32:27 has ZFS these days. + +00:32:29 And so the idea here is you're using what's called a copy-on-write file system. + +00:32:34 Now some of these terms, I will admit, sound a little nerdy and they kind of are, but the idea behind copy-on-write is you take a snapshot at a moment in time and instead of the file system recording everything, you know, + +00:32:48 transactionally forever, it will only record the delta from the previous snapshot. + +00:32:52 And so what that means is that you can fork, you can basically fork file systems on disk and then you can mount the snapshot from three days ago as an actual file system and then restore the files that way. 
+ +00:33:05 So let's say your upgrade scenario, you could restore the database from just before you did the upgrade because as a good sysadmin, you are doing the hygiene of taking a snapshot before you do the risky thing, right? + +00:33:18 You can automate all this stuff with scripts, right? + +00:33:21 And I think there's a pragmatic angle here of how much time do you spend automating versus administering versus just going outside and touching grass. + +00:33:30 But in the age of AI, there's really not, like it's, I installed Arch Linux last night downstairs on my gaming rig. + +00:33:38 I was done, I decided I'm done with Windows for gaming. + +00:33:41 And I thought, right, how far can Codex get me? + +00:33:43 You know, the OpenAI version of Claude Code. + +00:33:46 And I installed Arch myself and then I said, right, I want this desktop to look like this. + +00:33:50 I want this kind of vibe. + +00:33:52 I want like an Ubuntu kind of orange vibe. + +00:33:55 I want Wayland compositor for my display and I want it to all log in seamlessly and blah, blah, blah, blah, blah. + +00:34:03 I want these fonts. + +00:34:04 I want my fan curves to be this. + +00:34:05 And I just let it cook and maybe half an hour later I came back and my system was just configured. + +00:34:10 Wow. + +00:34:10 And it's amazing. + +00:34:11 And you can do the same thing with a lot of like, like backup script. + +00:34:15 You can literally say to Codex, these are my requirements. + +00:34:19 I want you to take a snapshot before you do any kind of Docker compose operation. + +00:34:25 And it will do it, whether it's via an alias or whatever. + +00:34:28 I don't, the mechanics don't matter. + +00:34:31 But the point is a lot of this stuff you can protect yourself from yourself now with so much less cognitive load than you used to have. + +00:34:37 You can then configure it to backup offsite to all sorts of different places. 
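The snapshot-before-upgrade habit described above can be sketched as a small script. This is a hypothetical example, not from the show: the function name, ZFS dataset, and compose directory are made-up placeholders, and it assumes OpenZFS and Docker Compose v2 are installed.

```python
# Hypothetical sketch: take a ZFS snapshot, then pull and restart a
# Compose stack. Dataset and directory names are placeholders.
import datetime
import subprocess


def snapshot_then_update(dataset: str, compose_dir: str, dry_run: bool = False):
    """Build (and, unless dry_run, execute) the upgrade commands."""
    stamp = datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
    commands = [
        # Cheap copy-on-write snapshot to roll back to if the upgrade breaks
        ["zfs", "snapshot", f"{dataset}@pre-upgrade-{stamp}"],
        ["docker", "compose", "pull"],
        ["docker", "compose", "up", "-d"],
    ]
    if not dry_run:
        subprocess.run(commands[0], check=True)
        for cmd in commands[1:]:
            subprocess.run(cmd, check=True, cwd=compose_dir)
    return commands
```

If the new version won't start, you can roll the dataset back to (or mount and copy from) that `@pre-upgrade-...` snapshot instead of untangling a half-finished migration.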
+ +00:34:41 There's a, there's a wonderful service called ZFS.rent, which if you're not familiar is a way of, you basically send them a hard drive and they will put it into a server somewhere and you pay, I think it's $10 a month for that hard drive slot. + +00:34:54 And then you can replicate all of your photos encrypted over the internet to ZFS.rent. + +00:34:59 And it's, it's 10 bucks a month. + +00:35:01 And then you've got that peace of mind. + +00:35:03 That's wild. + +00:35:04 I had no idea about this. + +00:35:05 This is a really interesting way. + +00:35:06 It's a great service. + +00:35:08 I have several friends that use it. + +00:35:09 Okay. + +00:35:10 Yeah, that's really cool because backup is certainly one of them. + +00:35:13 And that, that's not just export the data. + +00:35:14 That's like making sure the app runs so that you can actually get to the data that's in the, you know, Postgres DB that's running in the little Docker composed network that it created when you ran it and so on, right? + +00:35:24 There's plenty of other options with backups too. + +00:35:26 Like Backblaze is a decent one, although they were in the news fairly recently for some, I don't know, they stopped backing up OneDrive folders and just did it silently. + +00:35:35 And I don't know, you know how Reddit likes to go, go, go in on people. + +00:35:40 So I don't know, Backblaze, they've been there for a long time. + +00:35:42 They're a pretty reliable option. + +00:35:44 You could also, if you want to do it fully self-hosted, Hetzner, you know, VPS provider, they have what's called a storage box, which you can usually bid on, which I think they cost somewhere typically between 30 to 50 euros a month. + +00:35:56 So it's not the cheapest option, but if you want that level, that amount of storage offsite, it gets expensive. + +00:36:03 That's just the reality of it. 
+

00:36:04 When the business model is just storage and not farming your data and mining you for advertising stuff, it turns out storage is expensive.

00:36:14 Yeah, that's what you got to pay for it if you're not the product.

00:36:16 Yeah, but these things have enough storage that between you and a few mates, you could probably split it up into different ZFS datasets and replicate that way and, you know, split the bill a little bit as it were.

00:36:27 Are there self-hosting things that really stand out for you that you're a big fan of?

00:36:31 Like apps?

00:36:32 The real problem aspect is one for me, I think, that's critical to it, to the success.

00:36:37 You know, I talked about photos as being one example.

00:36:40 Home automation is another.

00:36:41 As you said, you've had the Home Assistant guys on this podcast before.

00:36:45 We actually had Paulus on Self-Hosted a while ago and, you know, those guys, what they're doing with the Open Home Foundation is amazing.

00:36:53 Again, they're eschewing the status quo of five different apps for five different ecosystems and making everything talk to everything else and it's amazing.

00:37:01 And, you know, for me in this studio, for example, I've got one, two, three different ecosystems just for my studio lights and it's all brought under Home Assistant in one place.

00:37:11 And so for me, that solves a real problem.

00:37:13 So when Home Assistant is down, okay, it's not the biggest deal.

00:37:16 I have to walk around and turn three sets of lights off.

00:37:18 Okay, fine.

00:37:19 But when you start to add all of the different ecosystems in your house together, like your thermostats, you know, I have a mini split up there that I control through an ESP32 with like a serial connection.
+

00:37:31 I then have an Ecobee thermostat downstairs and so that's two ecosystems just for the climate in the house and then my garage doors are another ecosystem and so it continues.

00:37:41 And so solving real problems and bringing them back behind the firewall really is the idea for me.

00:37:47 Just, I don't know, it helps me sleep better at night but it's also in many cases just more convenient and less hassle.

00:37:55 The unification really that Home Assistant brings is really one of the biggest because everybody's got their janky little app that they think is so special, you know what I mean?

00:38:02 Yeah.

00:38:04 And I don't blame manufacturers necessarily for going that route because the way the internet was designed is it's, you know, I have something on this desk, right?

00:38:14 How would the manufacturer talk to it to control it through a smartphone app?

00:38:19 The only guarantee you've got is that a cloud server exists.

00:38:23 You can't control whether the user is necessarily on the same Wi-Fi and in fact, we've seen over the last 20 years as technology's evolved that I remember unboxing products 20 years ago that just the usability was just horrid.

00:38:35 You know, there are so many assumptions the manufacturers have to make about the environment it's going to land in, the Wi-Fi situation, the smartphone it's going to run on, blah, blah, blah.

00:38:43 And the only way you can really guarantee compatibility is to take control of that link and host the cloud component yourself and then have your users talk to your cloud and then have the cloud talk to the device.

00:38:55 Even though I can reach out and touch the light that's up here, it has to go to the cloud first to talk to it just because it guarantees that user experience.

00:39:03 I know.

00:39:03 My lights that I have for my streaming setup, they don't even have an on button, you can't physically turn them on.
+ +00:39:08 The only way you can turn them on is over the network. + +00:39:10 It's weird. + +00:39:11 Yeah. + +00:39:11 Welcome to 2026. + +00:39:14 Exactly. + +00:39:15 How did we accept that that is normal? + +00:39:18 When did that become normal? + +00:39:20 I don't know. + +00:39:21 Now that I think about it, it should at least have an on button. + +00:39:25 Oh, well. + +00:39:25 Right. + +00:39:26 I know. + +00:39:26 So let's talk for a little while. + +00:39:28 Now we've sort of set the stage, talked about some awesome apps and motivation and so on, but let's talk a bit about actually how to do it because I'm sure there's, I don't know, let me throw out, I'll just speculate. + +00:39:39 I bet there's 30 to 40% of the people are like, oh yeah, I'll just SSH into my setup as well and then I know what to do from there. + +00:39:47 And there's like maybe 20% of the people are like, I know what, I know I should SSH in there and the others are like, what is SSH? + +00:39:54 Yeah. + +00:39:54 So there's a lot of hesitation, I think, because you are kind of becoming a DevOps person. + +00:40:01 Like you're running probably in Docker, maybe on Linux. + +00:40:03 It's not on your main machine, most likely. + +00:40:06 And then this whole backup sort of story that we talked about and restore. + +00:40:09 Like talk to people about some of the tech. + +00:40:12 It's inherently still a technical occupation and there isn't still really a great way around some of that. + +00:40:19 Now we're on a Python show. + +00:40:21 We understand that abstractions exist, right? + +00:40:24 Python, of course, itself is an abstraction above something else. + +00:40:27 There are lots of companies that will tell you and will try and sell you abstractions on top of this self-hosting layer that I'm talking about. + +00:40:35 Well, Docker is an abstraction. + +00:40:36 Linux is technically an abstraction, although let's just not talk machine code. 
+

00:40:42 Let's just deal in, let's just treat Linux as the base.

00:40:45 Yeah.

00:40:45 Assume you have an OS.

00:40:46 Okay.

00:40:47 Yeah.

00:40:47 I think that's fair.

00:40:49 I agree.

00:40:50 You know, there are, I have a couple of, I don't know if you can see it in camera, probably not.

00:40:55 I've got a couple of Zima Board 2s on test, which they sent me for review for YouTube.

00:40:59 And they have a, they have something called Zima OS.

00:41:02 Z-I-M-A-OS.

00:41:03 Z-I-M-A-OS.

00:41:04 And, you know, it's pretty good.

00:41:07 Like it's a, it's a one click.

00:41:09 You can, it's got a little app store in it, like you have on your phone and you can install a lot of these apps in one click onto Zima OS.

00:41:16 You can connect a USB hard drive and within maybe 20 minutes, half an hour, you've got a fairly functional setup.

00:41:23 Now, is it the most buttoned up, most secure bulletproof thing in the world?

00:41:28 No, almost certainly not.

00:41:30 But it gets you started.

00:41:31 And I think that is the real key is the best way to learn this stuff is to not think about it too much.

00:41:37 It's just to do it in a fairly low stakes way.

00:41:41 Don't try and switch from Spotify, for example, and convert your wife and your kids and everyone in your life to your self-hosted music streaming service overnight.

00:41:52 Softly, softly, slowly, slowly, catchy monkey.

00:41:54 You know, it's one of those things that you're probably going to need these things running in parallel for a little while until you feel comfortable enough that when you wake up at 7am and the streaming service that you've built in your basement doesn't work and the kid can't

00:42:09 watch their episode of cartoons before school or whatever, do you want to have to log in at 7am via SSH to your server and fix it?

00:42:17 No, I never do.
+

00:42:18 It turns out that's not something I want to do, but it's something I've had to do a few times because I've made mistakes either in not rotating logs properly or a disk filled up or there was a hardware failure or the list goes on and it's just, you know, you're trading some convenience

00:42:32 for ownership and the transaction is different and some of the cost there is in you and your time, but I will always advocate for people to learn these skills because I think in the modern world

00:42:47 they are such basic fundamental skills.

00:42:49 I wouldn't put them quite in the same bracket as learning how to do plumbing or electrical work or something like that, but this stuff, you know, everybody takes photos, everybody listens to music and why should we continue to enrich the pockets

00:43:03 of megacorps when we have the tools and the capabilities to do this stuff ourselves if we're just willing to put a few weekends aside and learn it?

00:43:12 It's a great point.

00:43:13 I guess start small.

00:43:15 These little...

00:43:16 Start small, yeah.

00:43:17 These home or these self-hosting OSes, I guess they sort of call it, it tries to bring kind of an app store experience to the self-hosting.

00:43:26 Another one that I would say is Coolify.

00:43:29 I don't know if you're familiar with Coolify.

00:43:30 Coolify's great.

00:43:31 Yeah, I'm sure.

00:43:32 Yeah, cool.

00:43:32 I did some stuff with Coolify for a while.

00:43:35 It's a little similar.

00:43:35 And you don't even need anything in your house with Coolify.

00:43:38 They will do hosted versions of these self-hosted apps if that even makes sense.

00:43:42 But essentially, you're still running the service, you're paying to run the service on their infrastructure.
+

00:43:49 And so all of the stuff we talked about around digital sovereignty and privacy and business models all remains true except for the fact the compute doesn't live behind your firewall.

00:43:58 It lives somewhere else.

00:43:59 Yeah, and you can even do things with Coolify such as get a server at Hetzner or DigitalOcean, create an account at Coolify, and then basically install their daemon thing on your app.

00:44:11 And then through there, a little management.

00:44:12 You're managing multiple servers running.

00:44:14 I wanted to love Coolify and I think the idea is great.

00:44:18 I found that I ended up juggling so many more UI settings where I'm like, you know, if I just had a Docker Compose file, I could just define and replace or something.

00:44:29 Yeah.

00:44:29 Such is the life of an abstraction, right?

00:44:31 You trade certain complexities for certain decisions that the main, I mean, look at Apple, right?

00:44:39 We're always looking at macOS going, oh, I wish it, why are they doing it that way?

00:44:44 Well, you outsource that decision and the same is true with Coolify.

00:44:48 And any other abstraction that you choose as part of this stack, like even Docker, for example, is an abstraction, as I said, and you are making a certain set of, you're outsourcing a certain set of decisions to Docker in how things work.

00:45:01 It's just a reality of the world.

00:45:02 Yeah, that's a really good point.

00:45:03 That's, you know, you choose your abstraction.

00:45:05 So I bring it up because I do feel like people who are hesitant to do this kind of stuff, this is a really good option to get you started and get you comfortable and like, ah, what if I, maybe I could just run it myself after you're comfortable, you know, you work your way down until you, you know,

00:45:19 gain some of these skills.

00:45:21 What about Linux?
+ +00:45:22 You know, one of the things that I think is both a hesitation for doing this at all, but also a hesitation to use Docker. + +00:45:28 It's like, well, I could just do it on Linux. + +00:45:30 At first you're like, well, I can't do Linux or Linux is intimidating to me. + +00:45:34 Eventually you get that skill. + +00:45:35 You're like, well, I could just put it on my machine. + +00:45:36 Why do I need to actually use all this Docker complexity? + +00:45:39 It is the repeatability for me, at least. + +00:45:44 So what Docker brings to the table is a unified interface to running headless applications. + +00:45:50 I can define using a Docker compose file, which is just a short YAML file in maybe 15 lines. + +00:45:57 I can say, right, this is the name this container is going to get. + +00:46:00 These are the exact directories this application is allowed to access on my system. + +00:46:04 My photos app, for example, doesn't need access to my music library. + +00:46:08 And so you reduce the blast radius of anything going wrong. + +00:46:11 These are the ports it's allowed to access. + +00:46:13 These are the kernel capabilities it's allowed to have if you want to get that deep. + +00:46:18 You can turn off from a security perspective, you know, the photos app, for example, probably doesn't need a huge amount of kernel permissions to operate effectively. + +00:46:26 Turn off the stuff it doesn't need. + +00:46:27 And then that way, if there is a supply chain attack or a vulnerability exposed, the application itself becomes so much less of an attack vector because it literally physically has no access to certain bits of the kernel. + +00:46:42 You know, when you keep going down the list of what Docker Compose can provide for you, within 15 lines you can define an entire application's deployment and then store it in GitHub completely securely, safely. + +00:46:55 Obviously don't put secrets in GitHub, people. + +00:46:57 Please do not do that. 
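To make the "15 lines of YAML" concrete, here's roughly what such a Compose file might look like. This is a hypothetical sketch, not from the show: the image name, paths, and port are placeholders for a photos-style app.

```yaml
# Hypothetical ~15-line Compose file for a photos-style app.
# Image name, host paths, and port are made-up placeholders.
services:
  photos:
    container_name: photos
    image: example/photos-app:latest
    ports:
      - "127.0.0.1:2283:2283"   # only reachable from this host, not 0.0.0.0
    volumes:
      - /tank/photos:/data      # the only host directory it can touch
    cap_drop:
      - ALL                     # strip kernel capabilities it doesn't need
    restart: unless-stopped
```

Everything the host grants the container, the file states explicitly, which is what makes it safe to keep in Git (minus secrets) and hand to someone else.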
+

00:46:58 But there are plenty of ways to sort of store secrets locally.

00:47:03 I think there's something called OpenBao, which is an open source fork of HashiCorp Vault for secret management.

00:47:10 You can use Bitwarden CLI, you can use 1Password.

00:47:13 There's many ways to store secrets.

00:47:15 Again, for me, it's like, why do we need things like Docker to exist?

00:47:19 It's because it's a universal language.

00:47:21 I can ship you a Docker Compose YAML or any developer can ship a Compose file alongside their applications and I don't need to know anything about you or your application.

00:47:32 I just run docker compose pull and up and suddenly all of it runs. It's like in Kubernetes where it's an operator, or in Windows where it's an installer.

00:47:41 You're capturing all of the knowledge that you have about how to run your application successfully into this artifact which I then just pull down and deploy and run and it removes all of that complexity.

00:47:52 Beyond Docker, you mentioned a lot of the Docker Compose stuff.

00:47:56 You're right.

00:47:57 I'm going to define the networking, what things can talk to what.

00:48:00 I'm going to define the storage.

00:48:01 I'm going to define the visibility over the firewall sort of levels of things and it's great.

00:48:08 I just looked on my server.

00:48:10 I have three different versions of Postgres running from different apps that are like, oh no, we use Postgres 16.

00:48:16 Oh, we use 18 or whatever it is.

00:48:18 It's like, how are you going to manage that if you install more than just a handful of things?

00:48:22 They all want these different servers and what a hassle, right?

00:48:26 But because it's all contained within their own little network that they see, it's fine to run through because they all use the same port but they're not conflicting.

00:48:33 Yeah, that version of Postgres has no idea.
+

00:48:36 You could spin up 20 different Postgres 16s on the same server because all a container is really just process isolation in memory.

00:48:43 You want to think of it like that as a mental explanation?

00:48:46 All you're doing is taking your RAM and slicing it up into tiny little boxes and then placing that process inside that box.

00:48:54 It can't, that process then can't see anything outside of that box unless you give it specific and explicit permissions to do so.

00:49:01 And that's why containers have taken over the world if you ask me.

00:49:03 I agree.

00:49:04 I always thought that they were another level of complexity until I realized all the stuff you put in the Dockerfile is basically what you would have had to ad hoc type into your Linux machine anyway.

00:49:13 So you've got to know it anyway.

00:49:14 Yeah, you do.

00:49:15 Yeah.

00:49:15 I mean, the Dockerfile is basically just a bash script just with bells on.

00:49:19 Yeah, yeah.

00:49:20 You just put RUN or ENV or something in front of all the commands.

00:49:24 Let's come back to your comment on Codex and AI because for as intimidating as these things are now, they're way less intimidating if you just have Claude Code or Codex and you say, hey, explain this line to me or I need this to happen.

00:49:40 Here's the file.

00:49:40 Why is it not happening or how do I make it happen?

00:49:42 That is an absolutely achievable thing.

00:49:45 Even stuff like last week, my server was running slowly.

00:49:50 I didn't know why.

00:49:51 The CPU wasn't busy.

00:49:53 The RAM wasn't full.

00:49:55 I looked at things like disk pressure.

00:49:57 I looked at all the things I, as a sysadmin with 15 years of experience, knew where to look.

00:50:02 Didn't see anything.

00:50:03 And so then I had Codex go and look at it via SSH.

00:50:06 I was running it on my laptop and I said, right, you have permission via SSH.
+

00:50:09 Go look at this server.

00:50:10 Tell me what's wrong.

00:50:11 And it turned out there was some spiking on certain NAND chips on the SSD when it was trying to write to certain sectors of the disk.

00:50:19 It was causing massive I/O wait.

00:50:22 And I didn't catch that because it didn't make those writes during, but Codex ran overnight.

00:50:26 And whilst I was sleeping it was still doing the checks and still finding out what was going on.

00:50:30 And it turned out that the SSD, my boot SSD, was on the verge of failing.

00:50:35 It just hadn't marked itself as failing in SMART yet.

00:50:38 And it presented me this report, gave me all the diagnostics it ran, and yada yada.

00:50:41 I would never have caught that.

00:50:43 No.

00:50:44 Not until it failed.

00:50:45 And then I'd have caught it.

00:50:47 But now I have time to go out and research the correct SSD to replace it and not pay rush shipping and all of this stuff because the robots went out and basically did my job for me.

00:51:01 I mean, it's like, on the one hand, AI is one of these things of like, we're ushering in the very thing that's going to replace us as humanity.

00:51:08 But I don't see it that way.

00:51:10 Like, burying your head in the sand and saying, you know, vibe coded, slop this, that and the other.

00:51:14 Like, it's not, it's not really a mature take on it, in my opinion.

00:51:17 Yes, there's a lot of, there's a lot of slop out there.

00:51:19 Yes, there's a lot of, like, but we shouldn't be replacing art with AI.

00:51:23 Like, art fundamentally is a human endeavor and the reason it is valuable is because of the human effort that went into it.

00:51:29 You'll never replace that with a robot.

00:51:31 And, not even including the fact that everything that an AI does by its very nature is derivative of something that's actually been done before.
+

00:51:38 So, you're never getting anything truly new and truly revolutionary.

00:51:42 When it comes to, like, boring, menial tasks, like figuring out why my server's slow, have at it.

00:51:48 I don't want to, I don't really want to be debugging that all night.

00:51:52 Yeah.

00:51:53 The recent thing I did with DevOps, Docker, and AI was I wanted to do a new self-hosting app and I want to serve it out of the same server as some other ones, but I don't want them to interact with each other.

00:52:05 I don't even want them on the same network, but the NGINX front end has to be able to get to both of them.

00:52:11 So, I'm like, all right, Claude Code, how do I create a second network that still the one container can see both of the networks, but this one can't see, you know what I mean?

00:52:20 Like, I'm like, how do I actually make that happen without breaking anything?

00:52:23 It just knows.

00:52:24 Yeah, it's like, this is what you do.

00:52:26 These are the commands you run to, like, create the external network and then here's the settings and all the compose files.

00:52:30 You restart them in this order so stuff doesn't break.

00:52:32 I'm like, wow, okay.

00:52:33 If you know just enough to be dangerous on a topic and you can guide it through the hallucinations that it does, it makes you incredibly powerful and so, for that reason, at least for the foreseeable future, I don't think it's going to replace, you know,

00:52:48 everybody.
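The two-network NGINX layout described a moment ago can be sketched in Compose. This is a hypothetical example, not from the show: service and network names are made up, and only the proxy joins both networks.

```yaml
# Hypothetical sketch: two isolated app networks, with only the NGINX
# proxy attached to both. Service and network names are placeholders.
services:
  nginx:
    image: nginx:alpine
    networks: [net_a, net_b]     # the proxy can reach both apps
    ports:
      - "80:80"
      - "443:443"
  app_a:
    image: example/app-a
    networks: [net_a]            # cannot see app_b at all
  app_b:
    image: example/app-b
    networks: [net_b]            # cannot see app_a at all

networks:
  net_a:
  net_b:
```

Containers on `net_a` and `net_b` can't resolve or route to each other; only NGINX, sitting on both, can proxy traffic to either.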
+

00:52:49 There are for sure certain tasks and certain things that humans will be less required for and I think, you know, we're on the cusp of either the greatest change in humanity's labor since the Industrial Revolution or,

00:53:04 and the economics will bear this out one way or the other, you know, forces at play here much bigger than either of us, or it will just turn out to be inordinately too expensive to do that for a very long time and then the progress and investment will stop

00:53:18 and either a lot of very smart people are betting an awful lot of money and they're all wrong or there is actually something to this and we will see, I guess.

00:53:28 Yeah, I think it's being misused for a lot of stuff but I also think that there's areas where it's incredibly helpful and this computer stuff in general, programming, DevOps, amazing.

00:53:39 So we're getting short on time, Alex.

00:53:41 I feel like we've only scratched the surface like for real but let's talk about Tailscale.

00:53:46 I want to talk, I want to take one step back before we jump into Tailscale and just put out a warning.

00:53:51 This is something that really blew my mind when I saw it.

00:53:54 So when we're running our self-hosted apps, obviously we want to have security, limited access potentially.

00:54:01 You might be running them at home and so how do you access them?

00:54:03 There might be a bunch of funky networking things that people do but just as a quick PSA, I want to point out that, from here, other window, if you're using something like Uncomplicated Firewall and in your Docker Compose file, you say,

00:54:18 listen on 0.0.0.0, just default, like this port maps to that port, that's effectively 0.0.0.0, that port, like listen on all the things and you're using something like Uncomplicated Firewall or one of these other things that manipulates iptables.
+

00:54:32 Docker says, you know, Docker and UFW use firewall rules in ways that make them incompatible.

00:54:38 That is, things like UFW don't block access to your Docker stuff, so you need something else, something stronger like a cloud firewall or things like that, right?

00:54:49 Like on my servers, I have a firewall at the cloud hosting level that says don't let anything access stuff but 80 and 443 or whatever and, you know, limited access to SSH.

00:54:59 But if I didn't have that and I just used UFW, that would be not ideal.

00:55:04 So let's talk about firewalls for a minute.

00:55:06 I think there's a couple of things at play.

00:55:08 One is you're hosting a public facing service like a website, right?

00:55:13 That clearly has to be on the public internet.

00:55:15 There's no way around that.

00:55:16 The whole purpose of a website or an API probably is to be hit remotely and provide a response.

00:55:24 But when we're talking about self-hosted infrastructure, the only customer is you, maybe your family, maybe a few friends.

00:55:32 And so the idea behind Tailscale is to bring that connectivity back to a more personal level.

00:55:40 You know, our free tier, for example, at Tailscale has a six user limit.

00:55:45 It has unlimited devices.

00:55:46 And so the idea there is that you and your family all live in the same tailnet.

00:55:51 You make sure that Tailscale is installed on your server in your basement or wherever it happens to be.

00:55:58 And it's installed on your phone.

00:55:59 It creates a WireGuard tunnel underneath, encrypted, end-to-end.

00:56:04 And Tailscale makes a direct connection between those two devices with no middleman.

00:56:08 And so the way that Tailscale remains free is because we ask people, we give it away for free for a lot of it, but then we ask those people to champion us at work.

00:56:18 And we just crossed 30,000 paying customers just last week, I believe.
+

00:56:22 And so each of those paying customers, well, not all of them, but a large number came through that funnel of, well, this is awesome.

00:56:29 Why are we not using this at work?

00:56:31 Yeah.

00:56:31 So let me just sort of give the elevator pitch for people, I think, how cool this is.

00:56:36 One way to self-host is I've got this running on a spare computer of whatever sort, Mac Mini, small NUC, whatever, on your home network.

00:56:44 You want access to it while you're traveling.

00:56:47 The not great way is just, well, let's just put that on the internet.

00:56:50 I'm going to open up a port on my router.

00:56:53 I mean, just think back to the LastPass thing, right?

00:56:55 How did LastPass get this huge takeover a few years ago?

00:56:59 One of the devs was running a Plex server on the open internet and didn't patch it.

00:57:04 That got taken over.

00:57:05 They got lateral movement inside the network, got the access keys to LastPass, and down it goes, right?

00:57:11 So that's a bad example of self-hosting.

00:57:14 Better would be to use something like Tailscale, never open any ports at all, but when you're on the Tailscale network, you see into the networks where it's running.

00:57:23 You see into your home network even when you're away, or you see into your server infrastructure even though zero ports are open.

00:57:31 And that to me is just kind of magical.

00:57:33 Yeah, if you want to learn more about it, I won't get into the specifics here, but there is a blog post called How Tailscale Works at tailscale.com.

00:57:42 I'll send Michael a link to put in the show notes.

00:57:44 And essentially, the magic there is we abuse, like, stateful firewalls and how they work a little bit to do something called NAT traversal.

00:57:51 So the idea is that there weren't enough IPv4 addresses for every device in the world to get its own address and sit on the public internet.
+
+00:57:58 And so we created this abstraction called Network Address Translation.
+
+00:58:01 Each device sits behind a firewall and gets a local IP address.
+
+00:58:06 You've probably seen the 192.168.whatever numbers.
+
+00:58:10 That's a local IP address versus what you get at whatismyip.com or whatever.
+
+00:58:15 And that'll give you a totally different IP address than what your laptop has inside the Wi-Fi.
+
+00:58:20 And so you've got to have something that's doing that translation between those two things, and that's called NAT.
+
+00:58:24 Then Tailscale punches through that NAT and makes a direct connection from your phone at the coffee shop over 5G through your residential firewall with no ports open to your server running under the stairs.
+
+00:58:35 It's super seamless.
+
+00:58:36 Yeah, it's super seamless.
+
+00:58:37 So I use it for things like I have a local LLM running on my Mac.
+
+00:58:42 Oh, yeah.
+
+00:58:43 And then if I'm at the coffee shop, then I just make sure I'm on the Tailscale network and I can still run apps that talk the OpenAI API to my self-hosted LLM as if it was running on my laptop, but it's not, right?
+
+00:58:56 Yeah.
+
+00:58:56 Remember what we said at the beginning of the show?
+
+00:58:58 Like the rabbit hole goes deep, and if you can think of a proprietary service, there's almost certainly a self-hosted alternative to it.
+
+00:59:05 AI is another one that you can self-host.
+
+00:59:08 So if you have a Mac Mini, we all heard about OpenClaw a few weeks ago, right?
+
+00:59:12 You can put it on your gaming rig.
+
+00:59:15 If you have an NVIDIA GPU in your gaming rig, you can use that for local AI.
+
+00:59:20 I mean, the rabbit hole is, if you're a curious person, I apologize in advance if you've not looked into self-hosting because it will consume you for a little bit.
+
+00:59:28 It's just how it goes.
+
+00:59:29 It is definitely how it goes, and it's very satisfying as you start to make progress in it.
+
+00:59:34 Alex, I think that's it for our time.
+
+00:59:36 Final thoughts for people who want to get started.
+
+00:59:38 How would they get started?
+
+00:59:39 Oh, how would they get, oh gosh, that's a broad question.
+
+00:59:42 Hmm.
+
+00:59:43 Well, if you want to learn more about building a server in and of itself, I run a website at perfectmediaserver.com where you can learn how to build basically a Linux server with some storage in it to replace Netflix or something.
+
+00:59:56 I mean, I don't know.
+
+00:59:57 The awesome-selfhosted list is a good place to get started.
+
+01:00:00 There are dozens of YouTube guides.
+
+01:00:03 Just type self-hosting in and just watch a couple of hours' worth of YouTube and you'll get a pretty good idea.
+
+01:00:09 And then from there, like I say, it's all about figuring out what problems you're trying to solve and then what shape that problem takes versus what your budget is, what your personal risk tolerances are and all that kind of stuff too.
+
+01:00:21 There's a lot that goes into it, but if you want to reach out to me, alex.ktz.me, you can come find me.
+
+01:00:28 I'm on Discord all over the place and I'll say hi.
+
+01:00:30 I'd love to chat.
+
+01:00:31 Yeah, awesome.
+
+01:00:32 I'll certainly link to your connections on the website, on the show notes, and I do want to give a shout out to Tailscale.
+
+01:00:39 I think people should certainly consider it as part of the connectivity of all this stuff because it makes it so much simpler and so much safer.
+
+01:00:45 Not a sponsored episode.
+
+01:00:47 Hashtag not sponsored.
+
+01:00:49 Yeah.
+
+01:00:50 I'm just a corporate shill for free today.
+
+01:00:52 For me, I found out about it a couple years ago and I'm like, this solves all the problems, and I was just such a fan, so I just want to say, you know, I think it's really a way that things get quite simplified.
+
+01:01:05 It was the same for me and I enjoyed it so much, and I've been trying to solve this remote access problem as a self-hoster for, I didn't know it, but for 20 years I opened firewall ports to do remote desktop from school to my house when I was a teenager.
+
+01:01:18 You know, like, I've been trying to solve this problem for a very long time, and I installed Tailscale one weekend three years ago and was like, holy cow, this is amazing, and I got a job here because I liked it so much.
+
+01:01:29 Beautiful.
+
+01:01:30 Well, I really appreciated you coming on the show.
+
+01:01:33 Learned a lot.
+
+01:01:34 Thanks for being here.
+
+01:01:34 It was fun.
+
+01:01:34 Yeah, thanks for having me.
+
+01:01:35 Yeah, see you later.
+
+01:01:37 This has been another episode of Talk Python To Me.
+
+01:01:39 Thank you to our sponsors.
+
+01:01:40 Be sure to check out what they're offering.
+
+01:01:42 It really helps support the show.
+
+01:01:44 Temporal is hosting their yearly conference, Temporal Replay.
+
+01:01:47 Join your peers at Replay, the conference on orchestrating durable workflows and agents.
+
+01:01:52 May 5 to 7 in San Francisco.
+
+01:01:54 Visit talkpython.fm/temporal dash replay and use the code TALKPYTHON75, all one word, all caps, to save up to $449 on your ticket.
+
+01:02:04 If you or your team needs to learn Python, we have over 270 hours of beginner and advanced courses on topics ranging from complete beginners to async code, Flask, Django, HTML, and even LLMs.
+
+01:02:17 Best of all, there's no subscription in sight.
+
+01:02:20 Browse the catalog at talkpython.fm.
+
+01:02:22 And if you're not already subscribed to the show on your favorite podcast player, what are you waiting for?
+
+01:02:27 Just search for Python in your podcast player.
+
+01:02:29 We should be right at the top.
+
+01:02:30 If you enjoy that geeky rap song, you can download the full track.
+
+01:02:33 The link is actually in your podcast player show notes.
+
+01:02:36 This is your host, Michael Kennedy.
+
+01:02:38 Thank you so much for listening.
+
+01:02:39 I really appreciate it.
+
+01:02:40 I'll see you next time.
+
+01:02:41 Talk Python To Me.
+
+01:02:52 Talk Python To Me.
+
+01:02:53 Yeah, we ready to roll.
+
+01:02:57 Upgrading the code.
+
+01:02:59 No fear of getting old.
+
+01:03:02 We tapped into that modern vibe, overcame each storm.
+
+01:03:06 Talk Python To Me.
+
+01:03:07 I think is the norm.
+
diff --git a/transcripts/546-self-hosting-apps-for-python-people-transcript-final.vtt b/transcripts/546-self-hosting-apps-for-python-people-transcript-final.vtt
new file mode 100644
index 0000000..47e1250
--- /dev/null
+++ b/transcripts/546-self-hosting-apps-for-python-people-transcript-final.vtt
@@ -0,0 +1,2458 @@
+WEBVTT
+
+00:00:00.000 --> 00:00:02.000
+The cloud is convenient, until it isn't.
+
+00:00:02.380 --> 00:00:12.320
+You upload your photos, you sync your contacts, you click through the cookie banners, then prices go up, or you read about the family that lost their entire Google account over a medical photo sent to their doctor.
+
+00:00:12.740 --> 00:00:17.560
+At some point, the question shifts from, why would I run this myself, to why aren't I?
+
+00:00:18.180 --> 00:00:20.180
+My guest this week is Alex Kretzschmar.
+
+00:00:20.660 --> 00:00:27.840
+He's the head of DevRel at Tailscale, the longtime host of the Self-Hosted Podcast, and co-founder of LinuxServer.io.
+
+00:00:27.840 --> 00:00:41.520
+We cover what self-hosting really means in 2026, the apps worth running yourself, like Immich and Home Assistant, why Docker Compose ties it all together, and how Tailscale lets you reach any of it from anywhere without opening a single port.
+
+00:00:41.880 --> 00:00:46.140
+If you've been thinking about pulling your digital life back behind your own walls, this is your roadmap.
+
+00:00:46.560 --> 00:00:52.440
+This is Talk Python To Me, episode 546, recorded April 27th, 2026.
+
+00:00:53.720 --> 00:01:08.140
+Talk Python To Me, yeah, we ready to roll, upgrading the code, no fear of getting old, async in the air, new frameworks in sight, geeky rap on deck, Quart Crew, it's time to unite, we started in Pyramid, cruising old school lanes,
+
+00:01:08.400 --> 00:01:09.920
+had that stable base, yeah, sir.
+
+00:01:09.920 --> 00:01:14.360
+Welcome to Talk Python To Me, the number one Python podcast for developers and data scientists.
+
+00:01:14.800 --> 00:01:16.240
+This is your host, Michael Kennedy.
+
+00:01:16.580 --> 00:01:20.220
+I'm a PSF fellow who's been coding for over 25 years.
+
+00:01:20.760 --> 00:01:21.900
+Let's connect on social media.
+
+00:01:21.900 --> 00:01:25.380
+You'll find me and Talk Python on Mastodon, Bluesky, and X.
+
+00:01:25.620 --> 00:01:27.540
+The social links are all in your show notes.
+
+00:01:28.240 --> 00:01:35.300
+You can find over 10 years of past episodes at talkpython.fm, and if you want to be part of the show, you can join our recording live streams.
+
+00:01:35.460 --> 00:01:39.520
+That's right, we live stream the raw, uncut version of each episode on YouTube.
+
+00:01:40.000 --> 00:01:44.520
+Just visit talkpython.fm/youtube to see the schedule of upcoming events.
+
+00:01:44.680 --> 00:01:48.340
+Be sure to subscribe there and press the bell so you'll get notified anytime we're recording.
+
+00:01:48.340 --> 00:01:52.220
+Temporal is hosting their yearly conference, Temporal Replay.
+
+00:01:52.720 --> 00:01:57.080
+Join your peers at Replay, the conference on orchestrating durable workflows and agents.
+
+00:01:57.540 --> 00:01:59.220
+May 5 to 7 in San Francisco.
+
+00:01:59.740 --> 00:02:09.140
+Visit talkpython.fm/temporal dash replay and use the code TALKPYTHON75, all one word, all caps, to save up to $449 on your ticket.
+
+00:02:10.120 --> 00:02:12.000
+Alex, welcome to Talk Python To Me.
+
+00:02:12.260 --> 00:02:13.120
+Well, thanks for having me.
+ +00:02:13.200 --> 00:02:15.680 +This is Comfy Surroundings. Hello. + +00:02:15.680 --> 00:02:29.320 +I'm really excited to be talking about self-hosting, something I have talked around on the podcast a little bit, and I had the Home Assistant guys on for a while long ago when Home Assistant was this little boutique thing that people might find interesting. + +00:02:29.560 --> 00:02:44.140 +Now it's kind of blown up, but I'm really looking forward to talking about digital sovereignty, running your own apps, not being dependent on huge tech companies for every little thing, and just the joy of finding something in open source + +00:02:44.140 --> 00:02:47.840 +or just out there and going, hey, what if I just run that myself? + +00:02:48.280 --> 00:02:51.980 +And so I thought of Alex, thought of you, and I said, hey, we got to talk about this. + +00:02:52.400 --> 00:02:52.740 +Great, yeah. + +00:02:52.980 --> 00:02:53.880 +Well, thanks for having me. + +00:02:54.100 --> 00:02:54.520 +Yeah, you bet. + +00:02:54.760 --> 00:02:58.020 +So before we dive into all those things, give people a bit of a background about yourself. + +00:02:58.260 --> 00:02:59.100 +Yeah, well, I'm Alex. + +00:02:59.300 --> 00:03:03.080 +I, as you perhaps can tell from the accent, originally hail from the UK. + +00:03:03.460 --> 00:03:06.040 +I live in North Carolina these days, though, for my sins. + +00:03:06.460 --> 00:03:08.180 +And I work for Tailscale. + +00:03:08.180 --> 00:03:13.240 +I head up their DevRel department and primarily make YouTube videos for them. + +00:03:13.940 --> 00:03:18.020 +You know, it's an interesting company to work for because it's all product-led growth. + +00:03:18.380 --> 00:03:26.060 +So my job is to really get people enthused and excited about the product and all the interesting ways in which they can access their stuff remotely. + +00:03:26.500 --> 00:03:31.000 +And then, I don't know, people bring it to work and that's how the company makes money. 
+ +00:03:31.160 --> 00:03:36.100 +So I get paid essentially to make YouTube videos about hacking on self-hosted applications. + +00:03:36.100 --> 00:03:38.500 +And I still don't quite know how that happened. + +00:03:39.500 --> 00:03:42.380 +I think you guys over at Tailscale are doing a great job. + +00:03:42.620 --> 00:03:48.380 +We're going to go into it later when we get into sort of the security and accessing stuff of all the self-hosting things. + +00:03:48.500 --> 00:03:52.480 +But I started using Tailscale a couple of years ago and yeah, it's fabulous. + +00:03:52.740 --> 00:03:54.640 +So very nicely done. + +00:03:55.100 --> 00:04:00.000 +So some other stuff I've done, I used to do a podcast called Self-Hosted, wrapped that up last year. + +00:04:00.240 --> 00:04:03.320 +But I do a new one now called Bitflip with a few of my buddies. + +00:04:03.320 --> 00:04:07.680 +And again, from the self-hosting universe, you can find out more about that at bitflip.show. + +00:04:07.980 --> 00:04:09.360 +I hope some self-promotion's okay. + +00:04:09.700 --> 00:04:10.600 +But please, yes. + +00:04:10.680 --> 00:04:11.320 +No, that sounds great. + +00:04:11.560 --> 00:04:16.680 +Because I was a little disappointed to hear that you shut down Self-Hosted, the podcast, because I was just getting into it. + +00:04:16.800 --> 00:04:18.020 +And then, so you're back. + +00:04:18.020 --> 00:04:18.380 +Yeah, we did it. + +00:04:18.500 --> 00:04:19.380 +How does it differ? + +00:04:19.740 --> 00:04:21.460 +Well, not much really. + +00:04:21.860 --> 00:04:34.620 +So the weird thing was about Self-Hosted, and I don't know if you felt this with a show with Python in the name, but I kind of felt a little bit limited by the title because I tend to approach things from a very pragmatic angle. + +00:04:34.920 --> 00:04:40.980 +We were just talking before we pressed record about how important Linux and open source and all this kind of stuff is. 
+
+00:04:41.360 --> 00:04:47.300
+And yet I'm using a MacBook to record, not Linux, because it's just bulletproof reliable for media applications.
+
+00:04:47.600 --> 00:04:50.940
+And there are all these little compromises we make all throughout our digital lives.
+
+00:04:50.940 --> 00:04:59.020
+And so Self-Hosted as a movement, particularly in the subreddit actually, is very opinionated.
+
+00:04:59.300 --> 00:05:06.400
+And unless you're doing absolutely everything, lock, stock, 100% yourself, there are some people who say, well, you're holding it wrong.
+
+00:05:06.480 --> 00:05:08.120
+You're not doing it properly.
+
+00:05:08.500 --> 00:05:17.400
+My approach has always been, it's okay to have DigitalOcean run a VPS for me, but I've still got root to that VPS and I am hosting my own website.
+
+00:05:17.460 --> 00:05:18.920
+I'm self-hosting my own websites.
+
+00:05:18.920 --> 00:05:21.680
+But to some people that definition doesn't sit right.
+
+00:05:21.820 --> 00:05:25.280
+And so- You got to be running on a Raspberry Pi in your basement.
+
+00:05:25.440 --> 00:05:26.940
+If that's not the way it is, it's not true.
+
+00:05:27.200 --> 00:05:27.440
+Right.
+
+00:05:27.880 --> 00:05:35.800
+And we all know that there are just limitations to doing things, like maybe you're moving house and so your website would be offline for two weeks whilst you move house.
+
+00:05:36.420 --> 00:05:37.400
+That's probably not okay.
+
+00:05:38.040 --> 00:05:45.820
+Or there's a storm in your area or a water pipe bursts or like any number of fates can befall things in your house.
+
+00:05:45.940 --> 00:05:58.340
+And I'm not saying these things can't happen to a data center, but there are just mitigations in place between, you know, even just things like ISP peering, and like the data center is probably in the middle of an internet exchange building, whereas my house definitely is not.
+
+00:05:58.960 --> 00:06:12.420
+So I kind of wrapped up the Self-Hosted podcast just a little bit because I felt like, I don't think I feel this way anymore, but sort of 18 months ago when we wrapped it up, that self-hosting had, we kind of said all we needed to say and that as a movement,
+
+00:06:12.600 --> 00:06:18.580
+it was just kind of bubbling away in the background and those that had found it were going to find it and it was just sort of ticking over.
+
+00:06:18.800 --> 00:06:22.340
+But I don't know, self-hosting is all of a sudden trendy these days.
+
+00:06:22.460 --> 00:06:28.100
+I think I heard Linus on The WAN Show on Friday literally saying that building a NAS is trendy.
+
+00:06:28.460 --> 00:06:29.220
+And I'm like, what?
+
+00:06:29.740 --> 00:06:30.080
+Is it?
+
+00:06:30.520 --> 00:06:31.720
+Okay, cool.
+
+00:06:32.000 --> 00:06:32.620
+Well, I'm here for it.
+
+00:06:32.880 --> 00:06:34.260
+Yeah, I'm here for it as well.
+
+00:06:34.300 --> 00:06:37.620
+And I'm glad to hear you're still carrying on with the podcast under a different banner.
+
+00:06:37.880 --> 00:06:45.600
+Well, the reality is that a lot of this stuff, like I said, like I do for Tailscale and for Bitflip now, is this is stuff I'm doing anyway.
+
+00:06:45.720 --> 00:06:49.340
+Like my personal YouTube channel as well at KTZ Systems.
+
+00:06:49.520 --> 00:06:52.440
+Like I just, I'm just always like, just out of shot over here.
+
+00:06:52.500 --> 00:07:07.080
+There is a desk covered with like five of those little Lenovo mini PCs that I'm putting into a little Proxmox Ceph cluster because I woke up last week and my Home Assistant was down because my little Minisforum MS-01 had lit itself on fire in the middle of the night.
+
+00:07:07.080 --> 00:07:09.460
+And I thought, ah, there's a single point of failure.
+
+00:07:09.460 --> 00:07:13.580
+I can fix that with some clustering and high availability and so the rabbit hole continues.
+
+00:07:14.080 --> 00:07:17.740
+Yes, I love that you have high availability on your home network.
+
+00:07:18.040 --> 00:07:20.180
+I'm working on it, which is another story.
+
+00:07:20.500 --> 00:07:27.380
+But so it turns out these little Lenovo PCs, you can pick them up for about $150 or so.
+
+00:07:27.800 --> 00:07:31.000
+Even today, even in the hardware apocalypse that we're going through.
+
+00:07:31.260 --> 00:07:33.920
+These, you know the ones I mean, like the little one-liter PCs, right?
+
+00:07:34.140 --> 00:07:37.200
+Usually bolted onto the back of a monitor in an office or something.
+
+00:07:37.460 --> 00:07:43.600
+And you can pick those up for about $150 and they will run every self-hosted app you could possibly throw at them.
+
+00:07:43.820 --> 00:07:49.880
+In reality, certainly just for individual use, they are absolutely all the average person needs as a home server.
+
+00:07:50.200 --> 00:08:00.660
+And so one of the things I like to do with them is put what's called Proxmox on it, which is a hypervisor that lets you run virtual machines, something called LXCs, Linux containers, as well as Docker.
+
+00:08:01.020 --> 00:08:03.240
+We love us some Docker, I understand.
+
+00:08:04.180 --> 00:08:07.320
+Basically, if it doesn't run in Docker, I don't run it.
+
+00:08:07.320 --> 00:08:09.580
+I'm just going to trigger some people in the audience, I'm sure.
+
+00:08:10.200 --> 00:08:10.680
+You know what?
+
+00:08:10.700 --> 00:08:11.180
+I'm with you.
+
+00:08:11.320 --> 00:08:16.740
+When I go and look at one of these things that is potentially self-hosted, I'm like, well, where's the Docker Compose file?
+
+00:08:16.840 --> 00:08:19.240
+Otherwise, I'm not sure we're going to be continuing down this path.
+
+00:08:19.460 --> 00:08:23.580
+I mean, you Python people know all about standardized packaging formats and stuff like that.
+
+00:08:23.660 --> 00:08:27.500
+Like the prevalence of pip and then these days, uv, of course.
+
+00:08:27.660 --> 00:08:29.760
+Like, you know, these things matter.
+
+00:08:29.760 --> 00:08:37.980
+They're how you round off those rough edges of how it gets from my keyboard in my lab to your computer, wherever you are.
+
+00:08:38.260 --> 00:08:40.300
+Docker kind of closed that last 10%.
+
+00:08:40.300 --> 00:08:46.620
+I mean, a lot of the primitives of Docker existed well before, like cgroups and namespaces in the Linux kernel.
+
+00:08:46.920 --> 00:08:50.300
+All that stuff existed for years before Docker came along.
+
+00:08:50.620 --> 00:08:59.520
+All they did really was provide a standardized packaging format, which is really just a tarball, and a standard way of building those tarballs with a Dockerfile, like a recipe.
+
+00:08:59.760 --> 00:09:02.500
+That was all they did, and provided a little bit of plumbing and networking.
+
+00:09:02.680 --> 00:09:04.880
+Like, I'm just glossing over all the technical details of what they did.
+
+00:09:05.900 --> 00:09:20.100
+But essentially, they just closed that last 10% of usability, and suddenly, me, as a computer science student, could run any application in the world without having to dive into systemd and init scripts and database migrations
+
+00:09:20.100 --> 00:09:21.100
+and blah, blah, blah, blah.
+
+00:09:21.360 --> 00:09:22.080
+It was just...
+
+00:09:22.080 --> 00:09:24.260
+Yeah, complex networking, attached volumes.
+
+00:09:24.420 --> 00:09:25.980
+Like, there's a lot of stuff going on there, yeah.
+
+00:09:26.300 --> 00:09:26.560
+Yeah.
+
+00:09:26.840 --> 00:09:28.100
+Docker is life in this house.
+
+00:09:28.100 --> 00:09:32.160
+A long time ago now, I co-founded a website called Linuxserver.io.
+ +00:09:32.280 --> 00:09:45.260 +I don't know if anybody in the audience has heard that, but it's the largest, I believe, sort of open-source containerization movement project on the internet, and that was born out of the fact that sort of 10, gosh, yeah, maybe 12 years ago + +00:09:45.260 --> 00:09:52.820 +that Docker was pre-1.0, so it was very sort of nascent at that point, and it was... + +00:09:52.820 --> 00:09:53.480 +There was just... + +00:09:53.480 --> 00:09:54.220 +There were no standards. + +00:09:54.700 --> 00:09:58.140 +Like, the readmes were all over the place, or if there even was one. + +00:09:58.360 --> 00:10:00.540 +There were no sort of standardized base images. + +00:10:00.700 --> 00:10:08.920 +People hadn't cottoned on to, like, supply chains, and, you know, today it's a hot topic, but sort of back then, it was, oh, if it runs, I'm happy, you know? + +00:10:09.540 --> 00:10:23.240 +So Linux Server was sort of my attempt, our attempt, I should say, at fixing some of those issues, and, you know, we packaged up media server apps back in the day, like Plex, and some of the other slightly less salubrious + +00:10:23.240 --> 00:10:31.780 +applications you might find on the internet, as well as a bunch of other self-hosting stuff, which we should probably get into talking about some of the apps. + +00:10:32.120 --> 00:10:32.560 +Yeah, absolutely. + +00:10:32.940 --> 00:10:43.900 +Well, to kind of put a bookend on your introduction, I do just want to quickly ask you about your racing and VIR and stuff like that. + +00:10:43.980 --> 00:10:51.080 +You know, and I was looking to contact you, I was going through your About page, and I saw a car racing around a racetrack, and I thought, well, can't not talk about that. + +00:10:51.140 --> 00:10:57.780 +I've had folks from Formula One and from NASCAR on the show before, and I'm a big fan of these kinds of things. + +00:10:57.900 --> 00:10:58.580 +Yeah, I do too. 
+ +00:10:58.900 --> 00:11:00.340 +So, that's one of your hobbies? + +00:11:00.420 --> 00:11:00.960 +That's pretty awesome. + +00:11:01.240 --> 00:11:10.900 +Yeah, I've followed Formula One since, well, I remember sitting on my dad's knee as a kid watching Damon Hill, Nigel Mansell, go around Silverstone, so it's been a while. + +00:11:12.280 --> 00:11:19.560 +There's obviously a new crop of F1 fans, which is amazing, thanks to the Drive to Survive stuff, but I've followed it for years. + +00:11:19.720 --> 00:11:22.320 +I just enjoy watching, I just enjoy watching the sport. + +00:11:22.440 --> 00:11:25.300 +It's like a nerd soap opera in a way. + +00:11:26.860 --> 00:11:32.600 +Not a fan, honestly, of these new regs, though, with the sort of the super clipping and all this kind of stuff. + +00:11:32.680 --> 00:11:36.640 +It'd be interesting to see what happens when we get to, where's the next one? + +00:11:36.740 --> 00:11:37.540 +Miami, I think. + +00:11:37.860 --> 00:11:39.520 +Yeah, I believe it is Miami and then Canada. + +00:11:39.880 --> 00:11:49.480 +So, for people who don't know out there, Formula One is called Formula One because there's one formula on how to build the cars, but then all the teams generally, almost from scratch, build their cars. + +00:11:49.840 --> 00:11:54.280 +And every couple, every four or five years, they're like, okay, we're completely doing it differently. + +00:11:54.440 --> 00:11:58.280 +And so, this year, they've completely done it differently and there's a lot of controversy. + +00:11:58.500 --> 00:12:00.000 +I don't know, it's interesting, but. + +00:12:00.000 --> 00:12:09.300 +Yeah, they've gone for like this 50-50 split between the combustion engine and the battery power, but the batteries can't harvest enough energy every lap. 
+
+00:12:09.500 --> 00:12:22.080
+So, I don't know what genius thought of that, but, so they get halfway around the lap and they lose half of their horsepower, which can mean you've got closing speeds between cars of sort of, I don't know, 50 to 100 miles per hour.
+
+00:12:22.240 --> 00:12:26.760
+And we saw in Japan in the last race, quite a bad accident as a consequence.
+
+00:12:27.840 --> 00:12:29.340
+Right there at Spoon, it wasn't pretty.
+
+00:12:30.000 --> 00:12:38.840
+I know all of the electric stuff and like the hybrid things in IndyCar and even way, way more so in Formula One is for environmental friendliness.
+
+00:12:39.320 --> 00:12:51.160
+And hey, I drive an electric car, I love electric cars and I'm all about caring about the environment and stuff, but the 20 cars driving around the track is nothing compared to the 300,000 people that took airplanes to get there.
+
+00:12:51.420 --> 00:13:01.940
+And then when they ship the cars on planes halfway around the world, like the fuel spent when they're racing, it doesn't even register next to the environmental impact of that.
+
+00:13:02.000 --> 00:13:06.400
+So I don't know, I kind of long for the Damon Hill days with like, Oh, me too.
+
+00:13:06.520 --> 00:13:07.320
+Fast engines, you know.
+
+00:13:07.640 --> 00:13:15.560
+On our honeymoon, actually, my wife and I, we ended up in Milan on race weekend, totally by accident, genuinely by accident.
+
+00:13:15.560 --> 00:13:21.960
+We were booking this like Interrail trip around Europe and my itinerary landed us in Milan on race weekend.
+
+00:13:21.960 --> 00:13:25.560
+I didn't actually know at the time and all hotels for that weekend spiked.
+
+00:13:25.720 --> 00:13:28.340
+They were like two or three times the existing cost and I'm like, what's going on?
+
+00:13:28.380 --> 00:13:31.280
+So I just typed Milan events, September, whatever.
+
+00:13:31.620 --> 00:13:45.840
+Anyway, turns out, so we went to Monza and I'll never forget, we were stood at, it was the Ascari chicane, so it's on the opposite side from the start-finish straight, and the noise, I think there were V8s, I don't think there were V10s, I think there were V8s then, but just the noise of them sitting on the grid
+
+00:13:45.840 --> 00:13:51.720
+waiting to go, it was like a bunch of angry wasps and you could hear it and it's half a mile away.
+
+00:13:52.180 --> 00:13:53.160
+Amazing, amazing.
+
+00:13:53.360 --> 00:13:56.620
+We lost something when they went to the V6 turbo hybrid stuff.
+
+00:13:56.880 --> 00:13:57.240
+100%.
+
+00:13:57.240 --> 00:14:02.700
+All right, last bit, I mean a lot of people are fans of F1 and racing, not many of them end up on a race track.
+
+00:14:03.040 --> 00:14:05.240
+Oh yeah, that's a whole different kettle of fish.
+
+00:14:05.540 --> 00:14:17.480
+Yeah, so I've been into, I've owned seven Volkswagen Golfs over the years, culminating in the Golf R a few years ago and I just had to take it on a track.
+
+00:14:17.700 --> 00:14:28.960
+Like in England, I went on this, run what, we call it a run-what-you-brung track evening, and I went to Brands Hatch and I literally turned up without even a helmet, without doing any prep or whatever, and they just let me on track.
+
+00:14:29.300 --> 00:14:30.340
+Just, I couldn't believe it.
+
+00:14:30.740 --> 00:14:36.180
+And then I had the best evening of my life and then we emigrated and came here and I was like, I've got to scratch that itch.
+
+00:14:36.520 --> 00:14:44.220
+So I went to the internet and found out to go to VIR you have to do all sorts of training and get like instructors and it all sounded a bit much.
+
+00:14:44.500 --> 00:14:46.940
+But anyway, VIR is a serious racetrack.
+
+00:14:46.940 --> 00:14:58.980
+Like you can end up, I think on the back straight in my little Golf I was doing 140 on the back straight, and there are moments coming up through the uphill esses at VIR where you're just like, if this goes wrong, she's going to hurt.
+
+00:15:00.200 --> 00:15:08.620
+And in the end I ended up scaring myself a bit silly, but I had real fun, but there were just a couple of moments where I was like, you know, I've got a kid at home.
+
+00:15:08.800 --> 00:15:14.960
+I should probably, this is a young man's game or an old man's game when you've got nothing left to lose, I guess.
+
+00:15:14.960 --> 00:15:15.480
+Yeah, that's true.
+
+00:15:15.620 --> 00:15:18.160
+There's a, it's a bimodal sort of experience.
+
+00:15:18.400 --> 00:15:18.580
+Yeah.
+
+00:15:19.300 --> 00:15:22.540
+But I learned a lot, like I learned how to change brake pads, brake fluid.
+
+00:15:23.000 --> 00:15:25.560
+I fitted a new intercooler to my car.
+
+00:15:25.640 --> 00:15:26.600
+I upgraded the turbo.
+
+00:15:26.860 --> 00:15:27.480
+I did tuning.
+
+00:15:27.920 --> 00:15:28.840
+Like technical stuff.
+
+00:15:28.920 --> 00:15:30.160
+I like learning how things work.
+
+00:15:30.300 --> 00:15:31.740
+Same with software, same with cars.
+
+00:15:31.840 --> 00:15:40.040
+It's basically just one is slightly more visceral and arguably the stakes are a bit higher. If you screw up installing a turbo, it can be very expensive.
+
+00:15:40.040 --> 00:15:42.560
+It's worse than, oh, I got to reinstall that.
+
+00:15:43.240 --> 00:15:43.460
+Yeah.
+
+00:15:43.660 --> 00:15:43.900
+Yeah.
+
+00:15:43.920 --> 00:15:44.640
+Good fun though.
+
+00:15:44.900 --> 00:15:45.800
+No, I'm sure it's amazing.
+
+00:15:45.900 --> 00:15:46.980
+That sounds very, very cool.
+
+00:15:47.100 --> 00:15:48.260
+So what a great experience.
+
+00:15:48.580 --> 00:15:51.000
+Let's talk the main, main topic.
+
+00:15:51.240 --> 00:15:54.700
+Like, I guess we've been using the word without really defining it.
+
+00:15:54.760 --> 00:15:59.360
+Like what is self-hosting for people who are just like, you know, they, they haven't done these sorts of things.
+
+00:15:59.640 --> 00:16:12.060
+I think as I, as I alluded to earlier, there's a broad spectrum of definitions to what self-hosting means to different people, depending on how tightly you hold certain beliefs around definitions.
+
+00:16:12.440 --> 00:16:21.420
+But for me, it means the business model that exists is feeding the open source developer or small team that built it.
+
+00:16:21.500 --> 00:16:27.040
+Like it's, it's not, are you familiar with Cory Doctorow and his idea of enshittification?
+
+00:16:27.500 --> 00:16:27.560
+Yeah.
+
+00:16:27.720 --> 00:16:34.800
+The idea that a company will give some, we, we've been accused of this at Tailscale and I don't think it's actually going to happen.
+
+00:16:34.800 --> 00:16:39.520
+The CEO at Tailscale, I have great faith in Avery's leadership, honestly.
+
+00:16:39.720 --> 00:16:42.440
+I know I sound like a corporate shill saying that, but I genuinely believe it.
+
+00:16:42.540 --> 00:16:51.740
+So, the idea of enshittification is that a company takes a bunch of money from venture capital or some other source and gives the product away.
+
+00:16:51.780 --> 00:16:53.020
+We saw it with Uber, for example.
+
+00:16:53.020 --> 00:16:57.760
+Like they give the product away at a loss-leading price point to gain market share.
+
+00:16:58.080 --> 00:16:59.880
+We've seen it in multiple industries over the years.
+
+00:17:00.040 --> 00:17:01.120
+Walmart is a great example.
+
+00:17:01.240 --> 00:17:05.460
+They'll put mom-and-pop stores out of business in the local town and then slowly raise the prices.
+
+00:17:06.080 --> 00:17:06.400
+Right, right.
+
+00:17:06.460 --> 00:17:09.460
+Once everyone's gone, it's, it's, they have no choice but to go there.
+
+00:17:09.600 --> 00:17:09.900
+Exactly.
+
+00:17:10.340 --> 00:17:14.880
+And so the idea of enshittification in software is, is very prevalent.
+
+00:17:15.100 --> 00:17:23.640
+We've, we're seeing it with streaming services right now where they're just gradually turning the screw, lifting the prices, pulling out shows without your control.
+
+00:17:23.980 --> 00:17:28.100
+All of these things really boil down to one central point.
+
+00:17:28.360 --> 00:17:29.260
+I mentioned the business model.
+
+00:17:29.260 --> 00:17:31.340
+That's one thing, but really it's control.
+
+00:17:31.640 --> 00:17:35.780
+And do you have control over the services that are running your life?
+
+00:17:36.040 --> 00:17:38.300
+If you have Google in your life, you probably don't.
+
+00:17:38.400 --> 00:17:40.540
+If you have Apple in your life, you probably don't.
+
+00:17:40.820 --> 00:17:44.280
+You feel like you do, but there are countless examples.
+
+00:17:44.440 --> 00:17:49.560
+For example, there was one a couple of years ago where, I think this was in the New York Times.
+
+00:17:49.680 --> 00:18:03.580
+We definitely covered this on Self-Hosted a while ago where a mother took pictures of their kids, a medical issue of their kids, private areas, and sent it to their doctors through telehealth.
+
+00:18:03.880 --> 00:18:11.160
+They also sent the picture to their husband through a messaging app, which then meant that that picture got backed up to, I think it was Google Photos.
+
+00:18:11.220 --> 00:18:11.920
+It might have been Amazon.
+
+00:18:12.260 --> 00:18:13.240
+Please don't quote me on this.
+
+00:18:13.280 --> 00:18:15.160
+I'm just speaking from two-year-ago memory.
+
+00:18:15.920 --> 00:18:20.580
+And they got flagged as a CSAM issue, like a child pornography issue.
+
+00:18:20.580 --> 00:18:23.920
+And they had most of their digital life cancelled.
+
+00:18:24.200 --> 00:18:25.380
+They were locked out of their accounts.
+
+00:18:25.800 --> 00:18:28.260
+They were basically banned from that company.
+
+00:18:28.680 --> 00:18:29.520
+Might have been Google.
+
+00:18:29.720 --> 00:18:30.380
+Let's go with Google.
+
+00:18:31.200 --> 00:18:33.800
+And just the idea of being locked out of my Gmail.
+
+00:18:34.020 --> 00:18:37.600
+I mean, just stop and think about how much of your life is in your Gmail inbox.
+
+00:18:37.920 --> 00:18:38.680
+How long have you had yours?
+
+00:18:39.000 --> 00:18:39.440
+15 years.
+
+00:18:39.700 --> 00:18:42.540
+I think there's over a quarter million emails in my Gmail account.
+
+00:18:42.860 --> 00:18:43.000
+It's ridiculous.
+
+00:18:43.000 --> 00:18:45.020
+I mean, it is ridiculous.
+
+00:18:45.860 --> 00:18:49.420
+And extrapolate that from email to photos.
+
+00:18:49.820 --> 00:18:56.520
+Extrapolate that to music, to videos, to, I don't know, taxes and invoices, all this stuff.
+
+00:18:56.780 --> 00:19:11.580
+There are just so many different facets of our lives that we've given up to third parties that are either being used to train the next round of industrial revolution, oligarchy revolution, like AI models, or they're being used to feed an advertiser's
+
+00:19:11.580 --> 00:19:16.740
+bottom line and create a profile about you and who you are and what you do and who you associate with.
+
+00:19:16.960 --> 00:19:23.960
+Because make no mistake, when your photo gets uploaded to Google Photos, they are making a map of all the faces in that photo.
+
+00:19:24.260 --> 00:19:29.360
+Whether you know the person in the background or not, Google will know them because they probably have Google Photos too.
+
+00:19:29.600 --> 00:19:35.500
+And they can scan that Alex was stood next to Fred Smith on June the 21st, 1983.
+
+00:19:36.460 --> 00:19:40.840
+And like, they can create such incredibly detailed profiles about people.
+
+00:19:41.140 --> 00:19:43.580
+And if that doesn't bother you, self-hosting is probably not for you.
+
+00:19:43.800 --> 00:19:53.440
+But I don't know about, I don't know about you, but it makes me deeply uncomfortable that I'm giving up these freedoms and this privacy without really appreciating that I'm doing so.
+
+00:19:53.700 --> 00:19:58.280
+Like a lot of the transaction is very, what's the word I'm looking for?
+
+00:19:58.360 --> 00:20:01.900
+Like it's just not a fair, it's not a fair exchange of value for value.
+
+00:20:01.900 --> 00:20:02.360
+It's asymmetric.
+
+00:20:02.700 --> 00:20:02.900
+Yeah.
+
+00:20:03.160 --> 00:20:03.520
+Asymmetric.
+
+00:20:03.520 --> 00:20:04.220
+Very asymmetric.
+
+00:20:04.420 --> 00:20:05.020
+Yeah, absolutely.
+
+00:20:05.240 --> 00:20:05.420
+Totally.
+
+00:20:05.600 --> 00:20:20.560
+And I want to just, while we're sort of setting the stage, I just want to put an idea out there that this kind of stuff is super valuable and a good thing to keep in mind, not just for individuals, which 100% that it is, but also for developers running their software.
+
+00:20:20.940 --> 00:20:35.400
+Do you necessarily need to take all of your data and put it into an AWS managed service or an Azure managed service or send all of your users' information through, say, Google Analytics to Google to then turn around
+
+00:20:35.400 --> 00:20:36.920
+and mine it or send it to other places?
+
+00:20:37.320 --> 00:20:39.680
+You don't have, I feel like people think they have to.
+
+00:20:39.940 --> 00:20:40.780
+You don't have to.
+
+00:20:40.780 --> 00:20:43.040
+It almost feels inevitable, doesn't it, these days?
+
+00:20:43.360 --> 00:20:45.680
+That, oh, well, everyone else is doing it.
+
+00:20:45.720 --> 00:20:46.320
+I may as well.
+
+00:20:46.660 --> 00:20:46.840
+Yeah.
+
+00:20:47.100 --> 00:20:48.080
+We'll get the cookie banner.
+
+00:20:48.260 --> 00:20:48.860
+We'll put it up.
+
+00:20:48.980 --> 00:20:51.700
+People are used to, everywhere they go, they click the cookie banner.
+
+00:20:52.080 --> 00:20:52.360
+True.
+ +00:20:52.660 --> 00:20:59.080 +But there are entirely serviceable alternatives to almost every single proprietary service that you have. + +00:20:59.160 --> 00:21:00.960 +Google Analytics, let's start with that one. + +00:21:01.680 --> 00:21:03.640 +There's an open source app called Plausible. + +00:21:03.820 --> 00:21:06.760 +It does almost everything that Google Analytics does. + +00:21:07.540 --> 00:21:14.100 +It just, the analytics stay within your world and they're not, they're not kind of fed into the Google machine. + +00:21:14.660 --> 00:21:22.440 +And whether that's a, like, on feature parity, there's an argument to be made there about, like, well, Google's more invasive so they have more data. + +00:21:22.740 --> 00:21:24.420 +I don't see that as a plus point, personally. + +00:21:25.880 --> 00:21:30.760 +This portion of Talk Python is brought to you by Temporal and the Temporal Replay Conference. + +00:21:31.060 --> 00:21:36.180 +Previously, I've told you about Temporal's open source framework and I've had Mason Egger on the podcast. + +00:21:36.780 --> 00:21:45.260 +If you've built background jobs or multi-step workflows, you know how messy things get with retries, timeouts, partial failures, and keeping state consistent. + +00:21:45.800 --> 00:21:48.740 +This is where Temporal's got your back with their open source framework. + +00:21:49.000 --> 00:21:53.960 +And if that kind of workload is what you're building, you should definitely consider attending the Temporal Replay Conference. + +00:21:53.960 --> 00:21:57.780 +It's hosted May 5-7 in Moscone Center in San Francisco. + +00:21:58.560 --> 00:21:59.700 +Join your peers at Replay. + +00:22:00.060 --> 00:22:03.120 +Temporal's conference on orchestrating durable workflows and agents. + +00:22:03.340 --> 00:22:13.440 +You'll learn real-world patterns for reliability, failure handling, and scale from developers building themselves, including speakers from OpenAI, Replit, and Abridge. 
+
+00:22:13.880 --> 00:22:25.060
+Check out Replay 2026 at talkpython.fm/temporal dash replay and use the code talkpython75 all one word to save up to $449 on your ticket.
+
+00:22:25.700 --> 00:22:32.360
+That's talkpython.fm/temporal dash replay and code talkpython75 all one word.
+
+00:22:32.820 --> 00:22:34.640
+The link is in your podcast player's show notes.
+
+00:22:35.360 --> 00:22:37.020
+Thanks to Temporal for supporting the show.
+
+00:22:38.600 --> 00:22:39.520
+I don't either.
+
+00:22:39.800 --> 00:22:43.780
+And I think this is an interesting segue to finding some of the interesting apps here.
+
+00:22:43.920 --> 00:22:46.900
+So I went to pull up plausible.io and I think you're right.
+
+00:22:46.940 --> 00:22:48.340
+I think Plausible is really great.
+
+00:22:48.640 --> 00:22:54.640
+The one that I'm using is umami.is which is sort of a peer to Plausible.
+
+00:22:55.040 --> 00:22:57.800
+I believe, I think you can pay for both of them.
+
+00:22:58.080 --> 00:23:00.380
+I'm not 100% sure about Umami right now.
+
+00:23:00.580 --> 00:23:00.800
+Yeah.
+
+00:23:01.080 --> 00:23:05.220
+I don't know, your ad block must be doing some hard lifting over there because Plausible works just fine for me.
+
+00:23:05.540 --> 00:23:06.660
+You're using Umami, are you?
+
+00:23:06.900 --> 00:23:21.620
+Yeah, I'm using Umami and I looked at Plausible as well and Umami seemed a little more oriented towards self-hosting whereas Plausible self-hosting seemed like oh, you could do it but we're kind of this like thing that we run in the cloud and you can pay for but you technically could
+
+00:23:21.620 --> 00:23:27.960
+and I felt like Umami was like self-hosting first with, I don't even, like I said, I'm pretty sure there is a, you now can pay for it as well.
+
+00:23:28.200 --> 00:23:37.080
+But I wanted to bring up this "the site can't be reached" because I think another interesting thing is like hosting DNS.
+
+00:23:37.720 --> 00:23:45.580
+So like Pi-holes, I have nextdns.io which is why I can't go to Plausible right now unless I log in and tell it Plausible is okay.
+
+00:23:45.920 --> 00:23:47.440
+Same thing for Umami by the way.
+
+00:23:48.320 --> 00:23:51.740
+I think, what about, let's talk, let's, you're at Tailscale, let's talk networking.
+
+00:23:52.080 --> 00:23:59.680
+We'll get back to the use of Tailscale when we kind of wrap things up but like, do you use Pi-hole or do you use any of these sort of managed things outside just your browser?
+
+00:24:00.080 --> 00:24:03.380
+Well, the modern internet basically requires using an ad blocker.
+
+00:24:03.520 --> 00:24:18.240
+I mean, when you, I'm fortunate to work from home so I'm almost always within these four walls where I have an AdGuard Home instance running and my DHCP server when, whenever a device requests an IP address from the router,
+
+00:24:18.240 --> 00:24:22.900
+it will hand out the DNS server in my local network as the AdGuard Home instance.
+
+00:24:23.260 --> 00:24:31.320
+And AdGuard Home's job is to run a list of websites that it thinks are serving ads and it will block those at the DNS level.
+
+00:24:31.460 --> 00:24:40.040
+So simply what will happen is you will go to try and load a website and it can't load certain components of the webpage and those components happen to be adverts in this case.
+
+00:24:40.280 --> 00:24:48.440
+It's not 100% coverage but I'd say it's sort of in the 80 to 90% range which is still a heck of a lot better than having no ad blocking whatsoever.
+
+00:24:49.440 --> 00:24:54.460
+And the idea here is that a lot of these, well, first of all, adverts use a lot of bandwidth.
+
+00:24:54.820 --> 00:25:02.060
+They also are probably shoving down a ton of JavaScript into your browser so the performance of loading a webpage is worse.
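The DNS-level blocking described here can be sketched in a few lines of Python. This is a toy illustration of the block-or-forward decision a Pi-hole or AdGuard Home makes for every query, not their actual code; the blocklist entries and upstream answers below are invented for the example.

```python
# Toy sketch of DNS-level ad blocking, the idea behind Pi-hole and
# AdGuard Home. Real resolvers speak the DNS wire protocol; this only
# shows the block/allow decision. Blocklist entries are made up.

BLOCKLIST = {"ads.example.com", "tracker.example.net"}

def resolve(domain: str, upstream: dict) -> str:
    """Answer a sinkhole address for blocked names, else ask 'upstream'."""
    parts = domain.lower().rstrip(".").split(".")
    # Match the exact name or any parent domain, so that
    # "banner.ads.example.com" is caught by "ads.example.com".
    for i in range(len(parts)):
        if ".".join(parts[i:]) in BLOCKLIST:
            return "0.0.0.0"  # blocked: the browser gets nothing to load
    return upstream.get(domain, "NXDOMAIN")  # pass through to upstream

upstream = {"talkpython.fm": "203.0.113.7"}  # made-up upstream answer
print(resolve("banner.ads.example.com", upstream))  # -> 0.0.0.0
print(resolve("talkpython.fm", upstream))           # -> 203.0.113.7
```

Because the lookup happens at the resolver rather than in the browser, every device pointed at that DNS server gets the same filtering, which is the point made later about phones and TVs.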
+
+00:25:02.400 --> 00:25:06.460
+It's using more bandwidth, it's using more processing power and on mobile, of course, that matters.
+
+00:25:07.120 --> 00:25:15.100
+When I leave the house, I'm not under the umbrella of my AdGuard Home instance anymore because it's running on, I don't know, a Raspberry Pi in my basement.
+
+00:25:15.420 --> 00:25:16.400
+And so I've got a couple of options.
+
+00:25:16.720 --> 00:25:25.380
+One is I can use a hosted DNS service like you do called NextDNS which basically does the same thing as a Pi-hole except you pay for it.
+
+00:25:26.040 --> 00:25:28.360
+I don't think it's a huge amount of money if I recall.
+
+00:25:28.460 --> 00:25:29.380
+It's a couple of bucks.
+
+00:25:29.800 --> 00:25:32.480
+It's either $1 or $1.99 a month.
+
+00:25:32.560 --> 00:25:34.000
+It's really small, yeah.
+
+00:25:34.260 --> 00:25:35.300
+It seems fair.
+
+00:25:35.660 --> 00:25:40.820
+And the idea behind NextDNS, like I say, is that it does the same thing as a Pi-hole or an AdGuard Home.
+
+00:25:41.120 --> 00:25:43.940
+It's just a hosted service that you pay for, a managed service.
+
+00:25:44.760 --> 00:25:59.060
+Or you can use something like Tailscale and tunnel back through your firewall remotely and set your AdGuard Home as your Tailnet DNS server and then use your AdGuard Home or your Pi-hole from your basement that you're already running already configured
+
+00:25:59.060 --> 00:26:00.780
+with all of your ad lists and blah, blah, blah.
+
+00:26:01.460 --> 00:26:03.660
+You can configure that to be your DNS server.
+
+00:26:03.900 --> 00:26:08.160
+And my wife loves these sort of like mobile games like the Candy Crushes of the world.
+
+00:26:08.480 --> 00:26:10.140
+And they are just chock-full of ads.
+
+00:26:10.540 --> 00:26:17.120
+And we only really talk about it when we're like traveling because she's, oh God, I wish we were at home because then I wouldn't get adverts.
+ +00:26:17.800 --> 00:26:20.860 +Yeah, we'll just turn on Tailscale and lo and behold, no ads. + +00:26:20.860 --> 00:26:21.560 +You're back to good. + +00:26:21.820 --> 00:26:36.560 +I think one final little note about like running your, either your AdGuard at home or your Next DNS if you register at your router level that's really interesting is you block ads in mobile apps as well like you're mentioning or on my TV all the tracking + +00:26:36.560 --> 00:26:42.020 +the TV does is short-circuited because everything on the network is subjected to it. + +00:26:42.520 --> 00:26:48.980 +And I'm, you know, as long as these ad networks are serving up malicious ads, I don't feel bad about blocking them. + +00:26:49.220 --> 00:26:50.500 +That's another angle of course, yeah. + +00:26:50.760 --> 00:26:57.220 +Yeah, I mean, if we go to Talk Python, you know, the website, there's, ads are still there. + +00:26:57.320 --> 00:26:57.500 +Why? + +00:26:57.620 --> 00:26:59.960 +Because I'm not using some shady network to deliver it. + +00:27:00.160 --> 00:27:10.320 +I'm just sharing content and someone who happens to talk about what we're doing, you know, and so I think that that's a, I think that's certainly something worth considering, right? + +00:27:10.480 --> 00:27:14.460 +I feel like this DNS stuff is part of self-hosting at least the personal level a bit. + +00:27:14.820 --> 00:27:17.740 +It's the, it's one of the fundamentals, yeah. + +00:27:18.160 --> 00:27:28.940 +Networking is one of those things that you have to have it if you want to do anything in your house, like even, and I use my mother, who I love dearly, as the example of the non-technical person in my life. + +00:27:29.500 --> 00:27:39.200 +Even if my mum, like she orders a router from her ISP or something like that to get Wi-Fi in her house, well, she's doing networking, she doesn't realise it. 
+
+00:27:39.200 --> 00:27:44.980
+She's getting a Wi-Fi SSID broadcast, she's getting an IP address from the router every time she connects.
+
+00:27:45.300 --> 00:27:55.920
+The DHCP server provides a DNS server, which is probably your ISP's DNS server by default, and they are recording all of your DNS queries and selling them to the highest bidder also, I might add.
+
+00:27:56.140 --> 00:28:04.520
+And so there are just so many layers to this onion, and DNS is the, just what, we have a five-year-old in the house, we just watched Shrek this weekend, hence the onion reference.
+
+00:28:05.480 --> 00:28:20.420
+There are just so many layers to this onion that you just, you can keep peeling it forever, and this is one of the things that I genuinely love most about Linux, open source, self-hosting, that whole universe is that this conversation, I could literally sit here for eight hours and talk to you about different,
+
+00:28:20.600 --> 00:28:34.900
+you know, different things, like DNS is one thing, document management is another, media streaming is another, like each of these things, they're all, they're entire industries in their own right in the real world, but in self-hosting, you can play sysadmin, you can play, you know,
+
+00:28:35.180 --> 00:28:49.580
+the person who's running these mega corps offline, fully just in your basement, you know, and there's no, there's no business model to feed, it's literally just open source software, the true spirit of it, running in your house under your control.
+
+00:28:49.880 --> 00:29:04.580
+Yeah, we're definitely in danger of going for eight hours, so, I hope not, but we could, right, we definitely could, and by way of, I think that's a perfect transition to talk about this place called Awesome Self-Hosted here, which is a Git repository and a website,
+
+00:29:04.580 --> 00:29:11.860
+you know, I do, Alex, I think this is going to be a bit of a fad, it's not really catching on, there's only 288,000 GitHub stars in this.
+
+00:29:12.980 --> 00:29:27.540
+And if you look at it, you're familiar with the Awesome Lists, of course, there are dozens of these things, but Awesome Self-Hosted, I mean, it's updated daily, like, I look at the recent Git commits and it was last updated yesterday, and there are,
+
+00:29:27.620 --> 00:29:28.500
+how many categories?
+
+00:29:28.740 --> 00:29:29.360
+There must be.
+
+00:29:29.360 --> 00:29:38.460
+I don't know, but let me scroll, like, there's a couple of pages of just categories of things like e-commerce, DNS, for example, analytics.
+
+00:29:38.680 --> 00:29:38.840
+Right.
+
+00:29:39.080 --> 00:29:40.020
+You want to replace Jira?
+
+00:29:40.340 --> 00:29:40.940
+It's in here.
+
+00:29:41.080 --> 00:29:44.340
+You want to replace, I don't know, a wiki?
+
+00:29:44.600 --> 00:29:45.560
+It's in here.
+
+00:29:45.760 --> 00:29:49.400
+You know, it's honestly kind of overwhelming.
+
+00:29:49.840 --> 00:29:59.660
+And so this speaks a little bit to one of my overall philosophies when it comes to self-hosting of find a problem in your life and solve it, like a real problem.
+
+00:29:59.720 --> 00:30:01.680
+Don't just contrive one just for the sake of it.
+
+00:30:01.960 --> 00:30:05.280
+Photos is always the universal example I go to because everybody takes photos.
+
+00:30:05.620 --> 00:30:09.260
+And so you want to look at something like Immich, I-M-M-I-C-H.
+
+00:30:09.620 --> 00:30:14.980
+And that is a self-hosted Google Photos clone, and it lives entirely on your hardware that you control.
+
+00:30:14.980 --> 00:30:18.480
+It has machine learning, so it can learn your face.
+
+00:30:18.680 --> 00:30:20.740
+It can do, you know, object detection.
+
+00:30:21.020 --> 00:30:30.280
+It can do basically anything that Google Photos can do, except it lives on your hardware using your files and your compute until the end of time.
+
+00:30:30.480 --> 00:30:31.820
+And that's an end of it.
+
+00:30:31.940 --> 00:30:34.000
+Like, that's as deep as the rabbit hole goes.
+
+00:30:34.340 --> 00:30:34.720
+I love it.
+
+00:30:34.800 --> 00:30:36.140
+But it also makes me nervous.
+
+00:30:36.500 --> 00:30:37.160
+Good, it should.
+
+00:30:37.320 --> 00:30:47.680
+Because the thing with self-hosting is you get to play sysadmin, but it also means you own the data, which means when there's an outage or a hardware failure, you're on the hook for that too.
+
+00:30:48.120 --> 00:30:57.780
+Yeah, I'm not super concerned about an outage for my self-hosting thing, but I am certainly concerned about an outage of a self-hosted something for my production apps.
+
+00:30:58.760 --> 00:31:04.520
+And when I said it makes me nervous, yeah, yeah, but the things that make me nervous are twofold.
+
+00:31:04.880 --> 00:31:11.000
+The first thing that made me nervous would be just backup, backup and restore, or kind of losing access to it.
+
+00:31:11.060 --> 00:31:19.240
+Like something that I think it takes a while, at least for me, it took a while to learn the lessons through some paper cuts, was, oh, there's a new version of this thing that I'm self-hosting.
+
+00:31:19.300 --> 00:31:19.660
+How cool.
+
+00:31:19.720 --> 00:31:20.460
+Let's see what it is.
+ +00:31:20.460 --> 00:31:33.040 +Docker compose pull, Docker compose up, and then it won't start because there's some incompatible migration or something that I didn't run and I got to go read the docs and it says, oh, did you upgrade from version 1.6 to 1.8? + +00:31:33.060 --> 00:31:33.580 +You can't do that. + +00:31:33.580 --> 00:31:35.500 +You got to go to 1.7 and then 1.8. + +00:31:35.960 --> 00:31:37.240 +I'm like, now I'm an admin. + +00:31:37.580 --> 00:31:45.320 +But more concerned, like I had all this data, what if I can't get it to work on 1.8, but it's like a half database transition and then neither will run and now what do I do? + +00:31:45.320 --> 00:31:52.100 +Well, the best answer to that are some of the primitives around things like ZFS and snapshots. + +00:31:52.740 --> 00:32:04.620 +So there is this concept with, so ZFS, by the way, if you're not familiar, is the Zettabyte file system and it was born out of Sun Microsystems in the early 2000s, I believe. + +00:32:04.920 --> 00:32:13.040 +It's now unfortunately owned by Oracle, but there is a project called OpenZFS which is dedicated to bringing it to the masses, to normal people. + +00:32:13.040 --> 00:32:27.920 +There are still some weirdnesses around the licensing with ZFS, so it's not included by default in every single Linux distro, but it is included in things like Proxmox and Ubuntu and you can install it on Arch and NixOS and even Unraid, I think, + +00:32:27.980 --> 00:32:29.140 +has ZFS these days. + +00:32:29.640 --> 00:32:34.280 +And so the idea here is you're using what's called a copy-on-write file system. + +00:32:34.540 --> 00:32:47.880 +Now some of these terms, I will admit, sound a little nerdy and they kind of are, but the idea behind copy-on-write is you take a snapshot at a moment in time and instead of the file system recording everything, you know, + +00:32:48.120 --> 00:32:52.560 +transactionally forever, it will only record the delta from the previous snapshot. 
+ +00:32:52.920 --> 00:33:05.360 +And so what that means is that you can fork, you can basically fork file systems on disk and then you can mount the snapshot from three days ago as an actual file system and then restore the files that way. + +00:33:05.700 --> 00:33:17.200 +So let's say your upgrade scenario, you could restore the database from just before you did the upgrade because as a good sysadmin, you are doing the hygiene of taking a snapshot before you do the risky thing, right? + +00:33:18.980 --> 00:33:21.380 +You can automate all this stuff with scripts, right? + +00:33:21.400 --> 00:33:30.700 +And I think there's a pragmatic angle here of how much time do you spend automating versus administering versus just going outside and touching grass. + +00:33:30.840 --> 00:33:38.200 +But in the age of AI, there's really not, like it's, I installed Arch Linux last night downstairs on my gaming rig. + +00:33:38.200 --> 00:33:40.960 +I was done, I decided I'm done with Windows for gaming. + +00:33:41.300 --> 00:33:43.780 +And I thought, right, how far can Codex get me? + +00:33:43.920 --> 00:33:45.720 +You know, the OpenAI version of Claude Code. + +00:33:46.020 --> 00:33:50.780 +And I installed Arch myself and then I said, right, I want this desktop to look like this. + +00:33:50.860 --> 00:33:52.520 +I want this kind of vibe. + +00:33:52.620 --> 00:33:55.260 +I want like an Ubuntu kind of orange vibe. + +00:33:55.260 --> 00:34:02.900 +I want Wayland compositor for my display and I want it to all log in seamlessly and blah, blah, blah, blah, blah. + +00:34:03.040 --> 00:34:04.020 +I want these fonts. + +00:34:04.120 --> 00:34:05.440 +I want my fan curves to be this. + +00:34:05.680 --> 00:34:10.320 +And I just let it cook and maybe half an hour later I came back and my system was just configured. + +00:34:10.640 --> 00:34:10.720 +Wow. + +00:34:10.760 --> 00:34:11.420 +And it's amazing. 
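The "take a snapshot before you do the risky thing" hygiene described here is easy to script. A minimal sketch, assuming the app's Docker volumes live on a ZFS dataset (the name tank/appdata is hypothetical) and that the standard zfs and docker CLIs are available; it only builds and prints the commands so you can inspect the plan before wiring in subprocess.run:

```python
# Sketch: snapshot a ZFS dataset before a Docker Compose upgrade so a
# bad migration can be undone with `zfs rollback`. The dataset name is
# an assumption -- substitute the one that actually holds your volumes.
from datetime import datetime

DATASET = "tank/appdata"  # hypothetical dataset for the app's volumes

def pre_upgrade_commands(dataset: str, when: datetime) -> list[list[str]]:
    """Commands to run in order: snapshot first, then pull and restart."""
    snap = f"{dataset}@pre-upgrade-{when:%Y%m%d-%H%M%S}"
    return [
        ["zfs", "snapshot", snap],          # cheap copy-on-write snapshot
        ["docker", "compose", "pull"],      # fetch the new images
        ["docker", "compose", "up", "-d"],  # restart on the new version
    ]

# Print the plan; to execute, run each with subprocess.run(cmd, check=True).
for cmd in pre_upgrade_commands(DATASET, datetime.now()):
    print(" ".join(cmd))
```

If the new version's migration then fails, stopping the containers and rolling the dataset back to that pre-upgrade snapshot should put the data back exactly as it was before the pull.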
+ +00:34:11.920 --> 00:34:15.700 +And you can do the same thing with a lot of like, like backup script. + +00:34:15.820 --> 00:34:18.980 +You can literally say to Codex, these are my requirements. + +00:34:19.520 --> 00:34:25.080 +I want you to take a snapshot before you do any kind of Docker compose operation. + +00:34:25.460 --> 00:34:28.240 +And it will do it, whether it's via an alias or whatever. + +00:34:28.420 --> 00:34:31.220 +I don't, the mechanics don't matter. + +00:34:31.300 --> 00:34:37.760 +But the point is a lot of this stuff you can protect yourself from yourself now with so much less cognitive load than you used to have. + +00:34:37.940 --> 00:34:41.880 +You can then configure it to backup offsite to all sorts of different places. + +00:34:41.880 --> 00:34:54.580 +There's a, there's a wonderful service called ZFS.rent, which if you're not familiar is a way of, you basically send them a hard drive and they will put it into a server somewhere and you pay, I think it's $10 a month for that hard drive slot. + +00:34:54.720 --> 00:34:59.760 +And then you can replicate all of your photos encrypted over the internet to ZFS.rent. + +00:34:59.940 --> 00:35:01.760 +And it's, it's 10 bucks a month. + +00:35:01.840 --> 00:35:03.460 +And then you've got that peace of mind. + +00:35:03.460 --> 00:35:04.280 +That's wild. + +00:35:04.440 --> 00:35:05.380 +I had no idea about this. + +00:35:05.440 --> 00:35:06.800 +This is a really interesting way. + +00:35:06.820 --> 00:35:07.780 +It's a great service. + +00:35:08.020 --> 00:35:09.240 +I have several friends that use it. + +00:35:09.440 --> 00:35:09.680 +Okay. + +00:35:10.020 --> 00:35:12.820 +Yeah, that's really cool because backup is certainly one of them. + +00:35:13.140 --> 00:35:14.980 +And that, that's not just export the data. 
+ +00:35:14.980 --> 00:35:24.660 +That's like making sure the app runs so that you can actually get to the data that's in the, you know, Postgres DB that's running in the little Docker composed network that it created when you ran it and so on, right? + +00:35:24.880 --> 00:35:26.580 +There's plenty of other options with backups too. + +00:35:26.700 --> 00:35:35.260 +Like Backblaze is a decent one, although they were in the news fairly recently for some, I don't know, they stopped backing up OneDrive folders and just did it silently. + +00:35:35.880 --> 00:35:39.540 +And I don't know, you know how Reddit likes to go, go, go in on people. + +00:35:40.000 --> 00:35:42.460 +So I don't know, Backblaze, they've been there for a long time. + +00:35:42.540 --> 00:35:43.720 +They're a pretty reliable option. + +00:35:44.080 --> 00:35:56.440 +You could also, if you want to do it fully self-hosted, Hetzner, you know, VPS provider, they have what's called a storage box, which you can usually bid on, which I think they cost somewhere typically between 30 to 50 euros a month. + +00:35:56.640 --> 00:36:02.900 +So it's not the cheapest option, but if you want that level, that amount of storage offsite, it gets expensive. + +00:36:03.060 --> 00:36:04.160 +That's just the reality of it. + +00:36:04.740 --> 00:36:14.200 +When the business model is just storage and not farming your data and mining you for advertising stuff, it turns out storage is expensive. + +00:36:14.560 --> 00:36:16.360 +Yeah, that's what you got to pay for it if you're not the product. + +00:36:16.720 --> 00:36:27.580 +Yeah, but these things have enough storage that between you and a few mates, you could probably split it up into different ZFS datasets and replicate that way and, you know, split the bill a little bit as it were. + +00:36:27.580 --> 00:36:31.500 +Are there self-hosting things that really stand out for you that you're a big fan of? + +00:36:31.780 --> 00:36:32.340 +Like apps? 
+
+00:36:32.580 --> 00:36:37.760
+The real problem aspect is one for me, I think, that's critical to it, to the success.
+
+00:36:37.860 --> 00:36:40.040
+You know, I talked about photos as being one example.
+
+00:36:40.400 --> 00:36:41.440
+Home automation is another.
+
+00:36:41.540 --> 00:36:44.580
+As you said, you've had the Home Assistant guys on this podcast before.
+
+00:36:45.160 --> 00:36:53.060
+We actually had Paulus on Self-Hosted a while ago and, you know, those guys, what they're doing with the Open Home Foundation is amazing.
+
+00:36:53.480 --> 00:37:01.480
+Again, they're eschewing the status quo of five different apps for five different ecosystems and making everything talk to everything else and it's amazing.
+
+00:37:01.820 --> 00:37:11.720
+And, you know, for me in this studio, for example, I've got one, two, three different ecosystems just for my studio lights and it's all brought under Home Assistant in one place.
+
+00:37:11.880 --> 00:37:13.240
+And so for me, that solves a real problem.
+
+00:37:13.320 --> 00:37:16.380
+So when Home Assistant is down, okay, it's not the biggest deal.
+
+00:37:16.380 --> 00:37:18.280
+I have to walk around and turn three sets of lights off.
+
+00:37:18.480 --> 00:37:19.260
+Okay, fine.
+
+00:37:19.260 --> 00:37:30.940
+But when you start to add all of the different ecosystems in your house together, like your thermostats, you know, I have a mini split up there that I control through an ESP32 with like a serial connection.
+
+00:37:31.300 --> 00:37:41.660
+I then have an Ecobee thermostat downstairs and so that's two ecosystems just for the climate in the house and then my garage doors are another ecosystem and so it continues.
+
+00:37:41.820 --> 00:37:47.900
+And so solving real problems and bringing them back behind the firewall really is the idea for me.
+ +00:37:47.900 --> 00:37:54.760 +Just, I don't know, it helps me sleep better at night but it's also in many cases just more convenient and less hassle. + +00:37:55.100 --> 00:38:02.420 +The unification really that Home Assistant brings is really one of the biggest because everybody's got their janky little app that they think is so special, you know what I mean? + +00:38:02.720 --> 00:38:03.140 +Yeah. + +00:38:04.080 --> 00:38:14.400 +And I don't blame manufacturers necessarily for going that route because the way the internet was designed is it's, you know, I have something on this desk, right? + +00:38:14.460 --> 00:38:19.220 +How would the manufacturer talk to it to control it through a smartphone app? + +00:38:19.380 --> 00:38:22.720 +The only guarantee you've got is that a cloud server exists. + +00:38:23.080 --> 00:38:35.360 +You can't control whether the user is necessarily on the same Wi-Fi and in fact, we've seen over the last 20 years as technology's evolved that I remember unboxing products 20 years ago that just the usability was just horrid. + +00:38:35.720 --> 00:38:43.780 +You know, there are so many assumptions the manufacturers have to make about the environment it's going to land in, the Wi-Fi situation, the smartphone it's going to run on, blah, blah, blah. + +00:38:43.780 --> 00:38:55.400 +And the only way you can really guarantee compatibility is to take control of that link and host the cloud component yourself and then have your users talk to your cloud and then have the cloud talk to the device. + +00:38:55.560 --> 00:39:02.940 +Even though I can reach out and touch the light that's up here, it has to go to the cloud first to talk to it just because it guarantees that user experience. + +00:39:03.340 --> 00:39:03.540 +I know. + +00:39:03.860 --> 00:39:08.740 +My lights that I have for my streaming setup, they don't even have on, you can't physically turn them on. 
+ +00:39:08.800 --> 00:39:10.380 +The only way you can turn them on is over the network. + +00:39:10.500 --> 00:39:10.820 +It's weird. + +00:39:11.120 --> 00:39:11.280 +Yeah. + +00:39:11.280 --> 00:39:13.500 +Welcome to 2026. + +00:39:14.020 --> 00:39:14.380 +Exactly. + +00:39:15.160 --> 00:39:18.060 +How did we accept that that is normal? + +00:39:18.480 --> 00:39:20.440 +When did that become normal? + +00:39:20.760 --> 00:39:21.420 +I don't know. + +00:39:21.760 --> 00:39:24.860 +Now that I think about it, it should at least have an on button. + +00:39:25.160 --> 00:39:25.520 +Oh, well. + +00:39:25.760 --> 00:39:26.080 +Right. + +00:39:26.300 --> 00:39:26.660 +I know. + +00:39:26.900 --> 00:39:28.580 +So let's talk for a little while. + +00:39:28.640 --> 00:39:39.660 +Now we've sort of set the stage, talked about some awesome apps and motivation and so on, but let's talk a bit about actually how to do it because I'm sure there's, I don't know, let me throw out, I'll just speculate. + +00:39:39.660 --> 00:39:47.280 +I bet there's 30 to 40% of the people are like, oh yeah, I'll just SSH into my setup as well and then I know what to do from there. + +00:39:47.540 --> 00:39:54.100 +And there's like maybe 20% of the people are like, I know what, I know I should SSH in there and the others are like, what is SSH? + +00:39:54.400 --> 00:39:54.560 +Yeah. + +00:39:54.660 --> 00:40:00.920 +So there's a lot of hesitation, I think, because you are kind of becoming a DevOps person. + +00:40:01.080 --> 00:40:03.560 +Like you're running probably in Docker, maybe on Linux. + +00:40:03.700 --> 00:40:05.720 +It's not on your main machine, most likely. + +00:40:06.040 --> 00:40:09.440 +And then this whole backup sort of story that we talked about and restore. + +00:40:09.720 --> 00:40:11.780 +Like talk to people about some of the tech. + +00:40:12.180 --> 00:40:19.760 +It's inherently still a technical occupation and there isn't still really a great way around some of that. 
+
00:40:19.980 --> 00:40:21.280
Now we're on a Python show.

00:40:21.520 --> 00:40:24.420
We understand that abstractions exist, right?

00:40:24.660 --> 00:40:27.360
Python, of course, itself is an abstraction above something else.

00:40:27.820 --> 00:40:35.180
There are lots of companies that will tell you and will try and sell you abstractions on top of this self-hosting layer that I'm talking about.

00:40:35.280 --> 00:40:36.240
Well, Docker is an abstraction.

00:40:36.640 --> 00:40:41.860
Linux is technically an abstraction, although let's just not talk machine code.

00:40:42.140 --> 00:40:45.140
Let's just deal in, let's just treat Linux as the base.

00:40:45.480 --> 00:40:45.600
Yeah.

00:40:45.660 --> 00:40:46.480
Assume you have an OS.

00:40:46.600 --> 00:40:46.740
Okay.

00:40:47.040 --> 00:40:47.320
Yeah.

00:40:47.600 --> 00:40:48.680
I think that's fair.

00:40:49.160 --> 00:40:49.420
I agree.

00:40:50.560 --> 00:40:55.340
You know, there are, I have a couple of, I don't know if you can see it in camera, probably not.

00:40:55.440 --> 00:40:58.780
I've got a couple of ZimaBoard 2s on test, which they sent me for review for YouTube.

00:40:59.180 --> 00:41:01.700
And they have a, they have something called ZimaOS.

00:41:02.020 --> 00:41:03.800
Z-I-M-A-OS.

00:41:04.280 --> 00:41:07.140
And, you know, it's pretty good.

00:41:07.340 --> 00:41:08.860
Like it's a, it's a one click.

00:41:09.060 --> 00:41:16.440
You can, it's got a little app store in it, like you have on your phone and you can install a lot of these apps in one click onto ZimaOS.

00:41:16.820 --> 00:41:23.500
You can connect a USB hard drive and within maybe 20 minutes, half an hour, you've got a fairly functional setup.

00:41:23.660 --> 00:41:28.580
Now, is it the most buttoned up, most secure bulletproof thing in the world?
+ +00:41:28.700 --> 00:41:30.040 +No, almost certainly not. + +00:41:30.040 --> 00:41:31.300 +But it gets you started. + +00:41:31.740 --> 00:41:37.860 +And I think that is the real key is the best way to learn this stuff is to not think about it too much. + +00:41:37.860 --> 00:41:40.960 +It's just to do it in a fairly low stakes way. + +00:41:41.360 --> 00:41:51.840 +Don't try and switch from Spotify, for example, and convert your wife and your kids and everyone in your life to your self-hosted music streaming service overnight. + +00:41:52.320 --> 00:41:54.420 +Softly, softly, slowly, slowly, catchy monkey. + +00:41:54.820 --> 00:42:09.200 +You know, it's one of those things that you're probably going to need these things running in parallel for a little while until you feel comfortable enough that when you wake up at 7am and the streaming service that you've built in your basement doesn't work and the kid can't + +00:42:09.200 --> 00:42:17.280 +watch their episode of cartoons before school or whatever, do you want to have to log in at 7am via SSH to your server and fix it? + +00:42:17.460 --> 00:42:18.440 +No, I never do. + +00:42:18.560 --> 00:42:32.660 +It turns out that's not something I want to do, but it's something I've had to do a few times because I've made mistakes either in not rotating logs properly or a disk filled up or there was a hardware failure or the list goes on and it's just, you know, you're trading some convenience + +00:42:32.660 --> 00:42:47.140 +for ownership and the transaction is different and some of the cost there is in you and your time, but I will always advocate for people to learn these skills because I think in the modern world + +00:42:47.140 --> 00:42:49.320 +they are such basic fundamental skills. 
+ +00:42:49.420 --> 00:43:03.380 +I wouldn't put them quite in the same bracket as learning how to do plumbing or electrical work or something like that, but this stuff, you know, everybody takes photos everybody listens to music and why should we continue to enrich the pockets + +00:43:03.380 --> 00:43:12.680 +of Megacorps when we have the tools and the capabilities to do this stuff ourselves if we're just willing to put a few weekends aside and learn it? + +00:43:12.920 --> 00:43:13.400 +It's a great point. + +00:43:13.820 --> 00:43:15.460 +I guess start small. + +00:43:15.740 --> 00:43:16.420 +These little... + +00:43:16.420 --> 00:43:17.460 +Start small, yeah. + +00:43:17.540 --> 00:43:26.860 +These home or these self-hosting OSes, I guess they sort of call it, it tries to bring kind of an app store experience to the self-hosting. + +00:43:26.860 --> 00:43:29.220 +Another one that I would say is Coolify. + +00:43:29.640 --> 00:43:30.560 +I don't know if you're familiar with Coolify. + +00:43:30.560 --> 00:43:31.200 +Coolify's great. + +00:43:31.440 --> 00:43:31.940 +Yeah, I'm sure. + +00:43:32.120 --> 00:43:32.420 +Yeah, cool. + +00:43:32.620 --> 00:43:34.700 +I did some stuff with Coolify for a while. + +00:43:35.120 --> 00:43:35.660 +It's a little similar. + +00:43:35.660 --> 00:43:38.000 +And you don't even need anything in your house with Coolify. + +00:43:38.360 --> 00:43:42.260 +They will do hosted versions of these self-hosted apps if that even makes sense. + +00:43:42.400 --> 00:43:48.620 +But essentially, you're still running the service, you're paying to run the service on their infrastructure. + +00:43:49.160 --> 00:43:58.000 +And so all of the stuff we talked about around digital sovereignty and privacy and business models all remains true except for the fact the compute doesn't live behind your firewall. + +00:43:58.180 --> 00:43:59.200 +It lives somewhere else. 
+
00:43:59.500 --> 00:44:10.960
Yeah, and you can even do things with Coolify such as get a server at Hetzner or DigitalOcean, create an account at Coolify, and then basically install their daemon thing on your server.

00:44:11.080 --> 00:44:12.600
And then through there, a little management.

00:44:12.700 --> 00:44:14.200
You're managing multiple servers running.

00:44:14.620 --> 00:44:18.520
I wanted to love Coolify and I think the idea is great.

00:44:18.520 --> 00:44:28.480
I found that I ended up juggling so many more UI settings where I'm like, you know, if I just had a Docker Compose file, I could just define and replace or something.

00:44:29.060 --> 00:44:29.460
Yeah.

00:44:29.680 --> 00:44:31.840
Such is the life of an abstraction, right?

00:44:31.860 --> 00:44:39.060
You trade certain complexities for certain decisions that the main, I mean, look at Apple, right?

00:44:39.100 --> 00:44:44.060
We're always looking at macOS going, oh, I wish it, why are they doing it that way?

00:44:44.360 --> 00:44:48.100
Well, you outsource that decision and the same is true with Coolify.

00:44:48.520 --> 00:45:00.900
And any other abstraction that you choose as part of this stack, like even Docker, for example, is an abstraction, as I said, and you are making a certain set of, you're outsourcing a certain set of decisions to Docker in how things work.

00:45:01.040 --> 00:45:02.540
It's just a reality of the world.

00:45:02.700 --> 00:45:03.500
Yeah, that's a really good point.

00:45:03.600 --> 00:45:05.000
That's, you know, you choose your abstraction.
+ +00:45:05.320 --> 00:45:19.700 +So I bring it up because I do feel like people who are hesitant to do this kind of stuff, this is a really good option to get you started and get you comfortable and like, ah, what if I, maybe I could just run it myself after you're comfortable, you know, you work your way down until you, you know, + +00:45:19.940 --> 00:45:20.820 +gain some of these skills. + +00:45:21.200 --> 00:45:22.560 +What about Linux? + +00:45:22.940 --> 00:45:28.640 +You know, one of the things that I think is both a hesitation for doing this at all, but also a hesitation to use Docker. + +00:45:28.740 --> 00:45:30.220 +It's like, well, I could just do it on Linux. + +00:45:30.560 --> 00:45:33.700 +At first you're like, well, I can't do Linux or Linux is intimidating to me. + +00:45:34.020 --> 00:45:34.920 +Eventually you get that skill. + +00:45:35.020 --> 00:45:36.560 +You're like, well, I could just put it on my machine. + +00:45:36.760 --> 00:45:39.700 +Why do I need to actually use all this Docker complexity? + +00:45:39.700 --> 00:45:43.720 +It is the repeatability for me, at least. + +00:45:44.060 --> 00:45:50.300 +So what Docker brings to the table is a unified interface to running headless applications. + +00:45:50.520 --> 00:45:57.020 +I can define using a Docker compose file, which is just a short YAML file in maybe 15 lines. + +00:45:57.160 --> 00:45:59.800 +I can say, right, this is the name this container is going to get. + +00:46:00.040 --> 00:46:04.420 +These are the exact directories this application is allowed to access on my system. + +00:46:04.740 --> 00:46:08.080 +My photos app, for example, doesn't need access to my music library. + +00:46:08.420 --> 00:46:11.640 +And so you reduce the blast radius of anything going wrong. + +00:46:11.920 --> 00:46:13.640 +These are the ports it's allowed to access. + +00:46:13.960 --> 00:46:18.320 +These are the kernel capabilities it's allowed to have if you want to get that deep. 
+
00:46:18.560 --> 00:46:26.140
You can turn off from a security perspective, you know, the photos app, for example, probably doesn't need a huge amount of kernel permissions to operate effectively.

00:46:26.520 --> 00:46:27.880
Turn off the stuff it doesn't need.

00:46:27.880 --> 00:46:41.180
And then that way, if there is a supply chain attack or a vulnerability exposed, the application itself becomes so much less of an attack vector because it literally physically has no access to certain bits of the kernel.

00:46:42.100 --> 00:46:55.260
You know, when you keep going down the list of what Docker Compose can provide for you, within 15 lines you can define an entire application's deployment and then store it in GitHub completely securely, safely.

00:46:55.640 --> 00:46:57.400
Obviously don't put secrets in GitHub, people.

00:46:57.560 --> 00:46:58.820
Please do not do that.

00:46:58.920 --> 00:47:03.060
But there are plenty of ways to sort of store secrets locally.

00:47:03.320 --> 00:47:10.340
I think there's something called OpenBao, which is a local fork of HashiCorp Vault for secret management.

00:47:10.500 --> 00:47:13.000
You can use Bitwarden CLI, you can use 1Password.

00:47:13.400 --> 00:47:15.160
There's many ways to store secrets.

00:47:15.920 --> 00:47:19.420
Again, for me, it's like, why do we need things like Docker to exist?

00:47:19.700 --> 00:47:21.500
It's because it's a universal language.

00:47:21.760 --> 00:47:32.060
I can ship you a Docker Compose YAML or any developer can ship a Compose file alongside their applications and I don't need to know anything about you or your application.

00:47:32.460 --> 00:47:41.040
I just run docker compose pull and up and suddenly all of it's there. It's like in the Kubernetes world where it's an operator, in the Windows world where it's an installer.
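[Editor's note: the roughly fifteen-line Compose file described above might look something like this sketch. The service name, image, ports, and paths are invented for illustration, not taken from the episode.]

```yaml
# docker-compose.yml -- hypothetical "photos" app
services:
  photos:
    container_name: photos          # the name this container is going to get
    image: example/photos-app:1.2   # placeholder image; pin a real tag
    ports:
      - "8080:8080"                 # only the ports it needs
    volumes:
      - /srv/photos:/data           # the exact directories it may access
    env_file: .env                  # secrets live here, outside of Git
    cap_drop:
      - ALL                         # drop kernel capabilities it doesn't need
    restart: unless-stopped
```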
+ +00:47:41.260 --> 00:47:52.660 +You're capturing all of the knowledge that you have about how to run your application successfully into this artifact which I then just pull down and deploy and run and it removes all of that complexity. + +00:47:52.660 --> 00:47:56.620 +Beyond Docker, you mentioned a lot of the Docker Compose stuff. + +00:47:56.860 --> 00:47:57.320 +You're right. + +00:47:57.600 --> 00:48:00.280 +I'm going to define the networking, what things can talk to what. + +00:48:00.360 --> 00:48:01.480 +I'm going to define the storage. + +00:48:01.640 --> 00:48:08.700 +I'm going to define the visibility over the firewall sort of levels of things and it's great. + +00:48:08.920 --> 00:48:10.380 +I just looked on my server. + +00:48:10.520 --> 00:48:15.860 +I have three different versions of Postgres running from different apps that are like, oh no, we use Postgres 16. + +00:48:16.000 --> 00:48:18.160 +Oh, we use 18 or whatever it is. + +00:48:18.560 --> 00:48:22.320 +It's like, how are you going to manage that if you install more than just a handful of things? + +00:48:22.320 --> 00:48:26.340 +They all want these different servers and what a hassle, right? + +00:48:26.380 --> 00:48:33.620 +But because it's all contained within their own little network that they see, it's fine to run through because they all use the same port but they're not conflicting. + +00:48:33.900 --> 00:48:35.940 +Yeah, that version of Postgres has no idea. + +00:48:36.100 --> 00:48:43.580 +You could spin up 20 different Postgres 16s on the same server because all a container is really just process isolation in memory. + +00:48:43.820 --> 00:48:46.340 +You want to think of it like that as a mental explanation? + +00:48:46.880 --> 00:48:53.620 +All you're doing is taking your RAM and slicing it up into tiny little boxes and then placing that process inside that box. 
+
00:48:54.240 --> 00:49:01.140
It can't, that process then can't see anything outside of that box unless you give it specific and explicit permissions to do so.

00:49:01.360 --> 00:49:03.620
And that's why containers have taken over the world if you ask me.

00:49:03.920 --> 00:49:04.160
I agree.

00:49:04.260 --> 00:49:13.480
I always thought that they were another level of complexity until I realized all the stuff you put in the Dockerfile is basically what you would have had to ad hoc type into your Linux machine anyway.

00:49:13.580 --> 00:49:14.560
So you've got to know it anyway.

00:49:14.880 --> 00:49:15.520
Yeah, you do.

00:49:15.840 --> 00:49:15.960
Yeah.

00:49:15.960 --> 00:49:19.740
I mean, the Dockerfile is basically just a bash script with bells on.

00:49:19.760 --> 00:49:19.980
Yeah, yeah.

00:49:20.160 --> 00:49:24.460
You just put RUN or ENV or something in front of all the commands.

00:49:24.700 --> 00:49:39.600
Let's come back to your comment on Codex and AI because for as intimidating as these things are now, they're way less intimidating if you just have Claude Code or Codex and you say, hey, explain this line to me or I need this to happen.

00:49:40.040 --> 00:49:40.840
Here's the file.

00:49:40.980 --> 00:49:42.860
Why is it not happening or how do I make it happen?

00:49:42.860 --> 00:49:45.540
That is an absolutely achievable thing.

00:49:45.940 --> 00:49:50.460
Even stuff like last week, my server was running slowly.

00:49:50.780 --> 00:49:51.460
I didn't know why.

00:49:51.740 --> 00:49:53.440
The CPU wasn't busy.

00:49:53.740 --> 00:49:54.780
The RAM wasn't full.

00:49:55.100 --> 00:49:57.400
I looked at things like disk pressure.

00:49:57.640 --> 00:50:02.440
I looked at all the things I, as a sysadmin with 15 years of experience, knew to look at.

00:50:02.560 --> 00:50:03.300
Didn't see anything.
+
00:50:03.580 --> 00:50:06.620
And so then I had Codex go and look at it via SSH.

00:50:06.800 --> 00:50:09.340
I was running it on my laptop and I said, right, you have permission via SSH.

00:50:09.520 --> 00:50:10.620
Go look at this server.

00:50:10.960 --> 00:50:11.740
Tell me what's wrong.

00:50:11.740 --> 00:50:19.660
And it turned out there was some spiking on certain NAND chips on the SSD when it was trying to write to certain sectors of the disk.

00:50:19.760 --> 00:50:21.900
It was causing massive I/O wait.

00:50:22.220 --> 00:50:26.200
And I didn't catch that because it didn't make those writes while I was looking, but Codex ran overnight.

00:50:26.780 --> 00:50:30.860
And whilst I was sleeping it was still doing the checks and still finding out what was going on.

00:50:30.940 --> 00:50:35.220
And it turned out that the SSD, my boot SSD was on the verge of failing.

00:50:35.420 --> 00:50:38.020
It just hadn't marked itself as failing in SMART yet.

00:50:38.280 --> 00:50:41.900
And it presented me this report, gave me all the diagnostics it ran, and yada yada.

00:50:41.900 --> 00:50:43.740
I would never have caught that.

00:50:43.900 --> 00:50:44.000
No.

00:50:44.420 --> 00:50:45.600
Not until it failed.

00:50:45.940 --> 00:50:47.120
And then I'd have caught it.

00:50:47.180 --> 00:51:00.820
But now I have time to go out and research the correct SSD to replace it and not pay rush shipping and all of this stuff because the robots went out and basically did my job for me.

00:51:01.100 --> 00:51:08.860
I mean, it's like, on the one hand, AI is one of these things of like, we're ushering in the very thing that's going to replace us as humanity.

00:51:08.860 --> 00:51:10.500
But I don't see it that way.

00:51:10.640 --> 00:51:13.960
Like, burying your head in the sand and saying, you know, vibe coded, slop this, that and the other.
+
00:51:14.060 --> 00:51:17.080
Like, it's not, it's not really a mature take on it, in my opinion.

00:51:17.200 --> 00:51:19.280
Yes, there's a lot of, there's a lot of slop out there.

00:51:19.400 --> 00:51:23.520
Yes, there's a lot of, like, but we shouldn't be replacing art with AI.

00:51:23.760 --> 00:51:29.600
Like, art fundamentally is a human endeavor and the reason it is valuable is because of the human effort that went into it.

00:51:29.760 --> 00:51:31.360
You'll never replace that with a robot.

00:51:31.720 --> 00:51:38.500
And, not even including the fact that everything that an AI does by its very nature is derivative of something that's actually been done before.

00:51:38.800 --> 00:51:41.640
So, you're never getting anything truly new and truly revolutionary.

00:51:42.520 --> 00:51:48.360
When it comes to, like, boring, menial tasks, like figuring out why my server's slow, have at it.

00:51:48.860 --> 00:51:52.660
I don't want to, I don't really want to be debugging that all night.

00:51:52.940 --> 00:51:53.080
Yeah.

00:51:53.300 --> 00:52:05.080
The recent thing I did with DevOps, Docker, and AI was I wanted to do a new self-hosting app and I want to serve it out of the same server as some other ones, but I don't want them to interact with each other.

00:52:05.080 --> 00:52:10.680
I don't even want them on the same network, but the NGINX front end has to be able to get to both of them.

00:52:11.240 --> 00:52:20.280
So, I'm like, all right, Claude Code, how do I create a second network that still the one container can see both of the networks, but this one can't see, you know what I mean?

00:52:20.320 --> 00:52:23.740
Like, I'm like, how do I actually make that happen without breaking anything?

00:52:23.900 --> 00:52:24.400
It just knows.

00:52:24.760 --> 00:52:26.060
Yeah, it's like, this is what you do.
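[Editor's note: the setup being described, one NGINX container that can see two otherwise-isolated networks, can be sketched in Compose along these lines. Service names, network names, and images are invented for illustration; when the apps live in separate Compose files, you would instead create the shared network up front with `docker network create` and mark it `external: true` in each file.]

```yaml
services:
  nginx:
    image: nginx:1.27
    ports: ["80:80", "443:443"]
    networks: [app_a_net, app_b_net]   # the proxy sits on both networks
  app_a:
    image: example/app-a               # placeholder image
    networks: [app_a_net]              # app A cannot reach app B
  app_b:
    image: example/app-b               # placeholder image
    networks: [app_b_net]              # app B cannot reach app A

networks:
  app_a_net:
  app_b_net:
```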
+ +00:52:26.160 --> 00:52:30.600 +This is the commands you run to, like, create the external network and then here's the settings and all the compose files. + +00:52:30.740 --> 00:52:32.880 +You restart them in this order so stuff doesn't break. + +00:52:32.880 --> 00:52:33.920 +I'm like, wow, okay. + +00:52:33.920 --> 00:52:48.280 +If you know just enough to be dangerous on a topic and you can guide it through the hallucinations that it does, it makes you incredibly powerful and so, for that reason, at least for the foreseeable future, I don't think it's going to replace, you know, + +00:52:48.400 --> 00:52:48.960 +everybody. + +00:52:49.760 --> 00:53:04.160 +There are for sure certain tasks and certain things that humans will be less required for and I think, you know, we're on the cusp of either the greatest change in humanity's labor since the Industrial Revolution or, + +00:53:04.680 --> 00:53:18.820 +and the economics will bear this out one way or the other, you know, forces at play here much bigger than either of us, or it will just turn out to be inordinately too expensive to do that for a very long time and then the progress and investment will stop + +00:53:18.820 --> 00:53:28.320 +and either a lot of very smart people are betting an awful lot of money and they're all wrong or there is actually something to this and we will see, I guess. + +00:53:28.320 --> 00:53:38.920 +Yeah, I think it's being misused for a lot of stuff but I also think that there's areas where it's incredibly helpful and this computer stuff in general, programming, DevOps, amazing. + +00:53:39.160 --> 00:53:40.980 +So we're getting short on time, Alex. + +00:53:41.100 --> 00:53:45.920 +I feel like we've only scratched the surface like for real but let's talk about Tailscale. + +00:53:46.280 --> 00:53:50.980 +I want to talk, I want to take one step back before we jump into Tailscale and just put out a warning. 
+
00:53:51.240 --> 00:53:54.120
This is something that really blew my mind when I saw it.

00:53:54.120 --> 00:54:00.760
So when we're running our self-hosted apps, obviously we want to have security, limited access potentially.

00:54:01.200 --> 00:54:03.500
You might be running them at home and so how do you access them?

00:54:03.540 --> 00:54:17.940
There might be a bunch of funky networking things that people do but just as a quick PSA, I want to point out that if, in your Docker Compose file, you say,

00:54:18.240 --> 00:54:31.860
listen on 0.0.0.0, just the default, like this port maps to that port, that's effectively 0.0.0.0 to that port, like listen on all the things, and you're using something like Uncomplicated Firewall (UFW) or one of these other things that manipulates iptables.

00:54:32.280 --> 00:54:38.560
Docker says, you know, Docker and UFW use firewall rules in ways that make them incompatible.

00:54:38.660 --> 00:54:49.400
That is, things like UFW don't block access to your Docker stuff, so you need something else, something stronger like a cloud firewall or things like that, right?

00:54:49.460 --> 00:54:59.820
Like on my servers, I have a rule at the cloud hosting level: don't let anything access anything but 80 and 443 or whatever and, you know, limited access to SSH.

00:54:59.980 --> 00:55:03.700
But if I didn't have that and I just used UFW, that would be not ideal.

00:55:04.220 --> 00:55:06.080
So let's talk about firewalls for a minute.

00:55:06.240 --> 00:55:08.460
I think there's a couple of things at play.

00:55:08.840 --> 00:55:13.700
One is you're hosting a public-facing service like a website, right?

00:55:13.700 --> 00:55:15.580
That clearly has to be on the public internet.

00:55:15.800 --> 00:55:16.600
There's no way around that.
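[Editor's note: one common mitigation for the UFW gotcha in this PSA is to publish container ports on loopback only, so Docker's own iptables rules don't expose them to the world. The service name and port below are just examples.]

```yaml
services:
  app:
    image: example/app              # placeholder image
    ports:
      # - "8080:8080"               # shorthand for 0.0.0.0:8080:8080; Docker's
      #                             # iptables rules publish this past UFW
      - "127.0.0.1:8080:8080"       # loopback only; reach it via a local
                                    # reverse proxy or an SSH/VPN tunnel
```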
+
00:55:16.680 --> 00:55:24.500
The whole purpose of a website or an API probably is to be hit remotely and provide a response.

00:55:24.820 --> 00:55:32.360
But when we're talking about self-hosted infrastructure, the only customer is you, maybe your family, maybe a few friends.

00:55:32.680 --> 00:55:40.100
And so the idea behind Tailscale is to bring that connectivity back to a more personal level.

00:55:40.460 --> 00:55:45.080
You know, our free tier, for example, at Tailscale has a six-user limit.

00:55:45.380 --> 00:55:46.820
It has unlimited devices.

00:55:46.820 --> 00:55:51.300
And so the idea there is that you and your family all live in the same tailnet.

00:55:51.560 --> 00:55:57.300
You make sure that Tailscale is installed on your server in your basement or wherever it happens to be.

00:55:58.080 --> 00:55:59.580
And it's installed on your phone.

00:55:59.720 --> 00:56:03.980
It creates a WireGuard tunnel underneath, encrypted end-to-end.

00:56:04.240 --> 00:56:08.280
And Tailscale makes a direct connection between those two devices with no middleman.

00:56:08.620 --> 00:56:18.160
And so the way that Tailscale remains free is because we ask people, we give it away for free for a lot of it, but then we ask those people to champion us at work.

00:56:18.480 --> 00:56:21.920
And we just crossed 30,000 paying customers just last week, I believe.

00:56:22.220 --> 00:56:29.820
And so each of those paying customers, well, not all of them, but a large number came through that funnel of, well, this is awesome.

00:56:29.900 --> 00:56:31.000
Why are we not using this at work?

00:56:31.260 --> 00:56:31.360
Yeah.

00:56:31.620 --> 00:56:36.260
So let me just sort of give the elevator pitch for people, I think, how cool this is.

00:56:36.480 --> 00:56:44.780
One way to self-host is I've got this running on a spare computer of whatever sort, Mac Mini, small NUC, whatever, on your home network.
+
00:56:44.980 --> 00:56:46.840
You want access to it while you're traveling.

00:56:47.240 --> 00:56:50.220
The not great way is just, well, let's just put that on the internet.

00:56:50.480 --> 00:56:53.400
I'm going to open up a port on my router.

00:56:53.760 --> 00:56:55.900
I mean, just think back to the LastPass thing, right?

00:56:55.920 --> 00:56:58.940
How did LastPass get this huge takeover a few years ago?

00:56:59.340 --> 00:57:04.380
One of the devs was running a Plex server on the open internet and didn't patch it.

00:57:04.500 --> 00:57:05.400
That got taken over.

00:57:05.660 --> 00:57:11.300
They got lateral movement inside the network, got the access keys to LastPass, and down it goes, right?

00:57:11.300 --> 00:57:14.140
So that's a bad example of self-hosting.

00:57:14.480 --> 00:57:23.840
Better would be use something like Tailscale, never open any ports at all, but when you're on the Tailscale network, you see into the networks where it's running.

00:57:23.940 --> 00:57:31.560
You see into your home network even when you're away, or you see into your server infrastructure even though zero ports are open.

00:57:31.880 --> 00:57:33.280
And that to me is just kind of magical.

00:57:33.280 --> 00:57:42.140
Yeah, if you want to learn more about it, I won't get into the specifics here, but there is a blog post called How Tailscale Works at tailscale.com.

00:57:42.220 --> 00:57:44.120
I'll send Michael a link to put in the show notes.

00:57:44.740 --> 00:57:51.400
And essentially, the magic there is we abused like stateful firewalls and how they work a little bit to do something called NAT traversal.

00:57:51.780 --> 00:57:58.340
So the idea is that there weren't enough IPv4 addresses for every device in the world to get its own address and sit on the public internet.

00:57:58.640 --> 00:58:01.800
And so we created this abstraction called Network Address Translation.
+
00:58:01.800 --> 00:58:05.740
Each device sits behind a firewall and gets a local IP address.

00:58:06.060 --> 00:58:09.780
You've probably seen the 192.168.whatever numbers.

00:58:10.120 --> 00:58:15.500
That's a local IP address versus what you get at whatismyip.com or whatever.

00:58:15.760 --> 00:58:19.980
And that'll give you a totally different IP address than what your laptop has inside the Wi-Fi.

00:58:20.280 --> 00:58:23.780
And so you've got to have something that's doing that translation between those two things and that's called NAT.

00:58:24.140 --> 00:58:34.580
Then Tailscale punches through that NAT and makes a direct connection from your phone at the coffee shop over 5G through your residential firewall with no ports open to your server running under the stairs.

00:58:35.220 --> 00:58:36.260
It's super seamless.

00:58:36.440 --> 00:58:37.200
Yeah, it's super seamless.

00:58:37.360 --> 00:58:42.680
So I use it for things like I have a local LLM running on my Mac.

00:58:42.680 --> 00:58:43.000
Oh, yeah.

00:58:43.360 --> 00:58:56.180
And then if I'm at the coffee shop, then I just make sure I'm on the Tailscale network and I can still run apps that talk the OpenAI API to my self-hosted LLM as if it was running on my laptop, but it's not, right?

00:58:56.240 --> 00:58:56.400
Yeah.

00:58:56.560 --> 00:58:58.160
Remember what we said at the beginning of the show?

00:58:58.160 --> 00:59:05.620
Like the rabbit hole goes deep and if you can think of a proprietary service, there's almost certainly a self-hosted alternative to it.

00:59:05.900 --> 00:59:08.200
AI is another one that you can self-host.

00:59:08.320 --> 00:59:12.020
So if you have a Mac Mini, we all heard about OpenClaw a few weeks ago, right?

00:59:12.780 --> 00:59:15.080
You can put it on your gaming rig.

00:59:15.160 --> 00:59:19.980
If you have an NVIDIA GPU in your gaming rig, you can use that for local AI.
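[Editor's note: the flow just described can be sketched with Tailscale's CLI. This is illustrative only: "basement-server" is a made-up machine name, and port 11434 with the /v1/models path assumes an Ollama-style OpenAI-compatible server, not anything stated in the episode.]

```shell
# On the server under the stairs (Debian-ish assumed):
curl -fsSL https://tailscale.com/install.sh | sh   # official install script
sudo tailscale up                                  # authenticate; joins your tailnet

# On your laptop at the coffee shop, with the Tailscale app running:
tailscale status                               # lists every device on the tailnet
curl http://basement-server:11434/v1/models    # MagicDNS name, no router ports opened
```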
+
00:59:20.380 --> 00:59:28.680
I mean, the rabbit hole is, if you're a curious person, I apologize in advance if you've not looked into self-hosting because it will consume you for a little bit.

00:59:28.860 --> 00:59:29.560
It's just how it goes.

00:59:29.860 --> 00:59:34.460
It is definitely how it goes and it's very satisfying as you start to make progress in it.

00:59:34.720 --> 00:59:36.000
Alex, I think that's it for our time.

00:59:36.420 --> 00:59:38.300
Final thoughts for people who want to get started.

00:59:38.400 --> 00:59:39.120
How would they get started?

00:59:39.820 --> 00:59:42.080
Oh, how would they get, oh gosh, that's a broad question.

00:59:42.760 --> 00:59:43.120
Hmm.

00:59:43.440 --> 00:59:55.280
Well, if you want to learn more about building a server in and of itself, I run a website at perfectmediaserver.com where you can learn how to build basically a Linux server with some storage in it to replace Netflix or something.

00:59:56.160 --> 00:59:57.560
I mean, I don't know.

00:59:57.720 --> 01:00:00.440
Awesome-Selfhosted is a good place to get started.

01:00:00.640 --> 01:00:02.500
There are dozens of YouTube guides.

01:00:03.000 --> 01:00:08.820
Just type self-hosting in and just watch a couple of hours' worth of YouTube and you'll get a pretty good idea.

01:00:09.320 --> 01:00:21.720
And then from there, like I say, it's all about figuring out what problems you're trying to solve and then what shape that problem takes versus what your budget is, what your personal risk tolerances are and all that kind of stuff too.

01:00:21.800 --> 01:00:28.500
There's a lot that goes into it, but if you want to reach out to me, alex.ktz.me, you can come find me.

01:00:28.540 --> 01:00:30.580
I'm on Discord all over the place and I'll say hi.

01:00:30.780 --> 01:00:31.280
I'd love to chat.

01:00:31.560 --> 01:00:32.040
Yeah, awesome.
+ +01:00:32.160 --> 01:00:39.140 +I'll certainly link to your connections on the website, on the show notes and I do want to give a shout out to Tailscale. + +01:00:39.480 --> 01:00:45.620 +I think people should certainly consider it as part of the connectivity of all this stuff because it makes it so much simpler and so much safer. + +01:00:45.860 --> 01:00:47.360 +Not a sponsored episode. + +01:00:47.880 --> 01:00:48.880 +Hashtag not sponsored. + +01:00:49.220 --> 01:00:49.360 +Yeah. + +01:00:50.300 --> 01:00:52.660 +I'm just a corporate shill for free today. + +01:00:52.660 --> 01:01:04.840 +For me, I found out about it a couple years ago and I'm like, this solves all the problems and I was just such a fan and so I just want to make, you know, I think it's really a way that things get quite simplified for it. + +01:01:05.100 --> 01:01:18.120 +It was the same for me and I enjoyed it so much and I've been trying to solve this remote access problem as a self-hoster for, I didn't know it, but for 20 years I opened firewall ports to do remote desktop from school to my house when I was a teenager. + +01:01:18.320 --> 01:01:29.600 +You know, like, I've been trying to solve this problem for a very long time and I installed Tailscale one weekend three years ago and was like, holy cow, this is amazing and I got a job here because I liked it so much. + +01:01:29.940 --> 01:01:30.300 +Beautiful. + +01:01:30.720 --> 01:01:32.900 +Well, I really appreciated you coming on the show. + +01:01:33.260 --> 01:01:33.740 +Learned a lot. + +01:01:34.000 --> 01:01:34.300 +Thanks for being here. + +01:01:34.300 --> 01:01:34.440 +It was fun. + +01:01:34.620 --> 01:01:35.400 +Yeah, thanks for having me. + +01:01:35.600 --> 01:01:36.360 +Yeah, see you later. + +01:01:37.100 --> 01:01:39.400 +This has been another episode of Talk Python To Me. + +01:01:39.560 --> 01:01:40.520 +Thank you to our sponsors. + +01:01:40.720 --> 01:01:42.000 +Be sure to check out what they're offering. 
+
+01:01:42.180 --> 01:01:43.560
+It really helps support the show.
+
+01:01:44.120 --> 01:01:47.340
+Temporal is hosting their yearly conference, Temporal Replay.
+
+01:01:47.820 --> 01:01:52.200
+Join your peers at Replay, the conference on orchestrating durable workflows and agents.
+
+01:01:52.620 --> 01:01:54.320
+May 5 to 7 in San Francisco.
+
+01:01:54.840 --> 01:02:04.240
+Visit talkpython.fm/temporal dash replay and use the code talkpython75, all one word, all caps, to save up to $449 on your ticket.
+
+01:02:04.740 --> 01:02:16.920
+If you or your team needs to learn Python, we have over 270 hours of beginner and advanced courses on topics ranging from complete beginners to async code, Flask, Django, HTML, and even LLMs.
+
+01:02:17.160 --> 01:02:19.600
+Best of all, there's no subscription in sight.
+
+01:02:20.020 --> 01:02:21.780
+Browse the catalog at talkpython.fm.
+
+01:02:22.420 --> 01:02:27.100
+And if you're not already subscribed to the show on your favorite podcast player, what are you waiting for?
+
+01:02:27.500 --> 01:02:29.580
+Just search for Python in your podcast player.
+
+01:02:29.680 --> 01:02:30.540
+We should be right at the top.
+
+01:02:30.900 --> 01:02:33.860
+If you enjoy that geeky rap song, you can download the full track.
+
+01:02:33.860 --> 01:02:35.880
+The link is actually in your podcast player show notes.
+
+01:02:36.440 --> 01:02:38.000
+This is your host, Michael Kennedy.
+
+01:02:38.200 --> 01:02:39.500
+Thank you so much for listening.
+
+01:02:39.680 --> 01:02:40.460
+I really appreciate it.
+
+01:02:40.860 --> 01:02:41.620
+I'll see you next time.
+
+01:02:41.620 --> 01:02:52.960
+Talk Python to me.
+
+01:02:52.960 --> 01:02:53.700
+Talk Python to me.
+
+01:02:53.700 --> 01:02:55.660
+Can we be ready to roll?
+
+01:02:57.040 --> 01:02:58.400
+Upgrading the code.
+
+01:02:59.160 --> 01:03:00.860
+No fear of getting old.
+
+01:03:02.220 --> 01:03:05.840
+We tapped into that modern vibe, overcame each storm.
+
+01:03:06.580 --> 01:03:07.840
+Talk Python to me.
+
+01:03:07.840 --> 01:03:09.260
+I think is the norm.
diff --git a/transcripts/547-parallel-python-at-any-scale-with-ray-transcript.txt b/transcripts/547-parallel-python-at-any-scale-with-ray-transcript.txt
new file mode 100644
index 0000000..d0f1e10
--- /dev/null
+++ b/transcripts/547-parallel-python-at-any-scale-with-ray-transcript.txt
@@ -0,0 +1,740 @@
+00:00:00 When OpenAI trained GPT-3, they didn't roll their own orchestration layer.
+
+00:00:04 They used Ray, an open-source Python framework born out of the same Berkeley research lab lineage that gave us Apache Spark. And here's the twist. Ray was originally built for reinforcement
+
+00:00:15 learning research and then quietly faded as RL hit a wall. Until ChatGPT showed up. Suddenly reinforcement learning was back as the post-training step that turns a raw language
+
+00:00:26 model into something genuinely useful. Edward Oakes and Richard Liaw, two founding engineers behind Ray and Anyscale, joined me on Talk Python to tell that story. We'll trace Ray from its
+
+00:00:37 RISE Lab origins at UC Berkeley to powering some of the largest training runs in the world.
+
+00:00:43 We'll talk about what Ray actually is, a distributed execution engine for AI workloads, and how a few lines of Python become work running across hundreds of GPUs. We'll cover Ray Data for
+
+00:00:54 multimodal pipelines, the dashboard, the VS Code remote debugger, KubeRay for Kubernetes, and where Ray fits alongside Dask, multiprocessing, and asyncio. If you've ever stared at a single
+
+00:01:07 machine Python script and thought, there has to be a better way to scale this, this one's for you.
+
+00:01:11 It's Talk Python To Me, episode 547, recorded April 27th, 2026.
+
+00:01:18 Welcome to Talk Python To Me, the number one Python podcast for developers and data scientists.
+
+00:01:40 This is your host, Michael Kennedy. I'm a PSF fellow who's been coding for over 25 years.
+
+00:01:47 Let's connect on social media. You'll find me and Talk Python on Mastodon, BlueSky, and X. The social links are all in your show notes. You can find over 10 years of past episodes at talkpython.fm. And if you want to be part of the show, you can join our recording live streams.
+
+00:02:01 That's right, we live stream the raw uncut version of each episode on YouTube. Just visit talkpython.fm/youtube to see the schedule of upcoming events. Be sure to subscribe there
+
+00:02:12 and press the bell so you'll get notified anytime we're recording. This episode is sponsored by Sentry's Seer. If you're tired of debugging in the dark, give Seer a try. There are plenty of AI tools that help you write code, but Sentry's Seer is built to help you fix it when it breaks.
+
+00:02:27 Visit talkpython.fm/sentry and use the code talkpython26, all one word, no spaces, for $100 in Sentry credits. What if your AI agents worked like FastAPI microservices,
+
+00:02:39 typed, autonomous, and discovering each other at runtime? That's the world AgentField is building.
+
+00:02:45 Join them at talkpython.fm/AgentField. Edward, Richard, welcome to Talk Python To Me. Great to have both of you here and talk about parallel computing and beyond. Thanks for having us on.
+
+00:02:56 Excited to be here and share some hopefully interesting information about Ray with the audience.
+
+00:03:01 Thanks for having us. I don't know how many people know about Ray, but it's a really cool parallel computing framework that's got this sort of big data angle and it's got an AI angle. We're going to talk about both of those and dive into the history and maybe even the future, who knows?
+
+00:03:16 But before we get into those, let's just start with your stories. Edward, I'll let you go first.
+
+00:03:21 Introduce yourselves, please. Yeah, my name is Edward, also go by Ed, and I've been working on Ray since I think about 2019, maybe late 2018. At that time, I was a grad student at UC Berkeley.
So that's
+
+00:03:33 actually where Richard and I met, and that's where Ray kind of originated. So we were grad students in what was called the RISE Lab under Professor Ion Stoica. So he's also the professor that ran the predecessor to that lab,
+
+00:03:45 and the predecessor to that lab is what Spark came out of. Oh yeah, really? Wow. Yeah. So a lot of people view Ray as like kind of a successor to Spark. That's not really how we talk about it. I think
+
+00:03:56 it's kind of a different system solving different problems, but we did originate from the same university and sort of a similar lab. Yeah. And just kind of about me, what I'm interested in,
+
+00:04:05 I would say I'm not really like an AI person as much as I am like an infrastructure and like distributed computing person. So the reason why I was originally attracted to working on Ray and why
+
+00:04:15 I'm still doing it however many years later is I just really feel motivated by this idea of like providing an easier way for our users to leverage like large scale computing and sort of like building
+
+00:04:29 that like abstraction or like bridge layer that enables people to do it. Incredible. Richard, how about you? I'm one of the founding engineers here with Edward, and currently I'm on more of the
+
+00:04:40 product management side at Anyscale. And my background here is that I was actually an undergrad
+
+00:04:46 that was working on various like machine learning research projects. And at the time, Ray was still not like a very, it wasn't even like an early project yet. But the thing that was very exciting
+
+00:05:01 at Berkeley was reinforcement learning. At the time, like DeepMind was getting a lot of popularity and press for a game, like sort of innovations they were doing for game AI. And eventually that
+
+00:05:14 sort of culminated in the AlphaGo moment. Tell people what that is.
I'm sure some of us know, but that was kind of the first time that an AI system beat other competitors, where it wasn't just
+
+00:05:27 a memorization, or like a, we're going to load every possible combination of moves into the system, right? Tell us about that. I didn't follow it too closely, but at the time there were previous
+
+00:05:39 game AIs, like, you know, IBM sort of. Yeah. Stockfish, I think is what it's called. The original like chess AI. Right. And I think Go was a much more high dimensional complex game. So there
+
+00:05:52 was a lot. The first one, the IBM one, beat one of the grandmasters, but people were like, yeah, but it doesn't really count because it just knew all the possibilities and played it out, you know, which I think is a fair criticism. Yeah. And the other thing is it was like a very like hand
+
+00:06:06 tuned algorithm that took like years to build. So it was, it was like many people kind of using chess knowledge to like build a search algorithm that was like, you know, very specific to chess.
+
+00:06:16 The AlphaGo one was, first of all, like the game was much harder than chess. Second, like it was, you know, a widely staged event. And then in terms of the learning algorithms, they did use
+
+00:06:28 reinforcement learning to train the model. And as far as I understand, like a lot of the ways they applied the machine learning techniques were not memorization or were not caching, but rather like
+
+00:06:39 having sort of like neural networks that could estimate the state and the value of the current position and to be able to sort of extend and decide what the next move was given
+
+00:06:50 their internal representation of what the state was. So yeah, so that was obviously very, very impressive. And a lot of the technology that led to that moment was reinforcement learning.
+ +00:07:00 For us in Berkeley, we were interested in being able to sort of provide that sort of technology to researchers also at Berkeley that didn't have access to large engineering teams and Google's + +00:07:14 infrastructure and stuff like that. And so that's kind of where Ray came out of. Like it was baked out of doing reinforcement learning research and machine learning research and sort of evolved from that. + +00:07:25 Give people a look inside this research lab that y'all are talking about. It sounds super interesting. + +00:07:30 And I guess I have a couple of things that are wondering about. One is just, you know, what is a lab that generates like grid computing systems and, you know, large big data systems? + +00:07:41 How do you think about problems and then solve them? I know what a chemistry lab does, but I'm not entirely sure what this thing does to result in that coming out. And then two, + +00:07:50 how does it go from being something created in the lab that's really powerful or useful to either an open source product or even a product product service type product? Like what's that journey look like? + +00:08:02 One thing that I think is pretty unique. Well, let me take a step back for this type of like computer systems research where, you know, like grid computing or like networking or like large scale data + +00:08:11 processing. It can be hard to do that in an academic setting because a lot of times the like requirements and the infrastructure are like, well, they're expensive. And also like the types of problems + +00:08:22 that you work on, you know, like data center networking algorithms are only relevant to like the few companies that operate data centers. So it can be kind of hard to do that in an academic setting. + +00:08:32 Yeah. 
I was thinking about that when I was preparing for the show is like, I really want to try out some things with Ray and some of this computing stuff, but I just don't have the problems or the data that justify like genuinely using it, not just taking it through a sample. You know what I mean?
+
+00:08:45 I feel like academics would have a similar issue.
+
+00:08:48 The thing that was unique. So the lab that we were in was called the RISE Lab and the one before it was called the AMP Lab and the one after it was called the Sky Lab. And each of them kind of had a theme. So the AMP Lab was like mostly about like big data. So that was like the one that generated
+
+00:09:01 Spark. The RISE Lab was about like machine learning and reinforcement learning. And then the Sky Lab is about like sky computing. So like cross cloud and stuff like that. Richard and I are a little bit less familiar with that one because it was after we left. But the thing about the AMP and specifically the
+
+00:09:14 RISE Lab is that it was very like interdisciplinary. So the professor I mentioned that we work with, Ion, he had really intentionally set it up so that, you know, the students who are really passionate about like distributed systems and networking were working like really closely with the students who
+
+00:09:28 were the like machine learning and reinforcement learning experts. And then there were also folks who were really interested in security who were like working closely with both of them.
+
+00:09:37 And I think that kind of like cross pollination really helped yield like interesting project ideas and more kind of like realistic requirements. Because what Ray originally came from was like
+
+00:09:49 our classmates and the co-founders of Anyscale, the two of them, Robert and Philipp, they were more like ML focused people.
And they were trying to do reinforcement learning research,
+
+00:09:59 but they were trying to sort of put a square peg in a round hole by doing it on Spark. And it turned out that Spark like just really wasn't built for the requirements of reinforcement learning,
+
+00:10:09 which are a little bit more like dynamic in nature. And then they had access to, you know, professors and students who were passionate about like distributed systems
+
+00:10:21 and data systems and stuff. So that's kind of where Ray came from was like organically, you had students who were trying to do reinforcement learning, they kind of hit this wall that the tools like didn't help them solve. So it was like, okay, let's start a new project and build the tool that we need.
+
+00:10:34 Yeah, makes a lot of sense. Richard, anything you want to add to that?
+
+00:10:37 Edward comes a little bit from the more systems side. And I was a little bit more on like the machine learning applied side. And I remember when I was in the RISE Lab, there was a lot of
+
+00:10:49 interactions with one of the best machine learning groups in Berkeley as well. Like Mike Jordan, who is one of the most famous AI professors, had his group sort of co-located in the same space,
+
+00:11:03 in addition to all these systems people.
+
+00:11:05 You're talking about BAIR, right? Berkeley AI Research.
+
+00:11:08 There's BAIR. And then there's also like a subset, which is like a lot of Mike's students were also in the RISE Lab. And in addition to that, there was also a biannual. So every six months,
+
+00:11:19 we would have an industry retreat. So there'd be about 200, 250 people that show up at like a conference
+
+00:11:27 or like a hotel. And 70 of them would be the students that we just talked about. And 180 of them would be like top researchers or like executives from the industry.
So we were able to
+
+00:11:42 sort of cross pollinate and share ideas and collaborate and get feedback from folks like Bill Dally, who's NVIDIA's chief scientist, or, you know, like a lot of really,
+
+00:11:52 you know, top people at Google who were doing recommendation systems and so on and so forth.
+
+00:11:57 So that sort of moment was very often recurring. So every six months, and then we would just have this opportunity to actually touch base with what was happening in
+
+00:12:08 the industry and therefore drive innovation so that we could be impactful and do impactful projects.
+
+00:12:14 What's the relationship between reinforcement learning and like the transformer stuff that we see powering LLMs these days? How similar or different is that?
+
+00:12:23 Reinforcement learning is more of a, you can think of it as like a learning paradigm, right? It's kind of like this framework that you would use to set up a problem. And then,
+
+00:12:35 and like, it's fundamentally about like having an agent or like a, some actor or agent that interacts with the world, gets rewards or like some feedback signal from that world, and then sort of learns
+
+00:12:49 from that and continually updates its like, its policy. It's more focused on solving a single problem, you might say, or like a category of problems, you know? It's just this very, very generic framework,
+
+00:13:01 right? And it can apply to like, you can imagine like the same thing as how like a mouse would interact with a maze or like a child would interact with a toy, right? So it's just a framework. It's like a
+
+00:13:13 symbolic representation of this framework. And whereas Transformers is like a, it's like a model architecture, right?
It's like a way for us to be able to ingrain a particular modeling heuristic
+
+00:13:26 that tells us that like, hey, for certain types of data, in particular sequence data, there are patterns that you can learn across the sequences, and that can improve like the quality of modeling.
+
+00:13:39 And so like the two can be used together, you can do reinforcement learning with a transformer, but you can also have a transformer that stands by itself, trained with supervised learning, and reinforcement learning that is done without a transformer model.
+
+00:13:52 Interesting.
+
+00:13:52 That question that you asked is actually, I think, like tightly intertwined with the history of Ray, because as we mentioned in like the 2017-2018 era, Ray was kind of originally motivated by
+
+00:14:04 reinforcement learning. But that reinforcement learning had like very little to do with like transformer models or LLMs. It was things along the line of the AlphaGo project that we talked about,
+
+00:14:14 or it was also being used a lot for robotics at Berkeley. And then reinforcement learning actually, like sort of, I would say died out for a while or like got less popular, kind of like hit a wall
+
+00:14:25 and it was like viewed as not that practical. So the original Ray library, like the most popular one in the early days is called RLlib. And that was like far and away the most successful Ray library for a long time. And then it kind of like petered out for a while.
+
+00:14:39 RL for reinforcement learning, right? Something like that?
+
+00:14:41 Yeah, that's right. Reinforcement learning.
+
+00:14:43 Okay.
+
+00:14:43 And then we had this kind of ChatGPT or like LLM moment, which by the way, Ray is also like tightly intertwined with because GPT-3 and I think 4, I'm not actually sure about 4, but at least 3 was
+
+00:14:57 trained using Ray as like the compute framework by OpenAI.
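The agent/reward/policy loop Richard describes (an agent acts, the world returns a feedback signal, the agent updates its policy) can be illustrated with a toy multi-armed bandit. The deterministic arm rewards and the epsilon-greedy policy below are simplifying assumptions for illustration, not anything from RLlib or the episode:

```python
import random

# Toy reinforcement learning loop: act, observe a reward, update the policy.
# Deterministic per-arm rewards keep the example exact and reproducible.
TRUE_REWARDS = [0.1, 0.5, 0.9]  # reward for pulling each "arm"

def run_bandit(steps=200, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    estimates = [0.0] * len(TRUE_REWARDS)  # the agent's learned value estimates
    counts = [0] * len(TRUE_REWARDS)

    for step in range(steps):
        if step < len(TRUE_REWARDS):
            a = step                                 # try each action once first
        elif rng.random() < epsilon:
            a = rng.randrange(len(TRUE_REWARDS))     # explore
        else:
            a = estimates.index(max(estimates))      # exploit the current policy
        reward = TRUE_REWARDS[a]                     # feedback from the "world"
        counts[a] += 1
        # Incremental average: nudge the estimate toward the observed reward.
        estimates[a] += (reward - estimates[a]) / counts[a]
    return estimates

est = run_bandit()
print(est.index(max(est)))  # prints 2: the agent settles on the best action
```

Post-training an LLM uses the same loop shape, just with a transformer as the policy and human or programmatic feedback as the reward.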
And the really big innovation that went from like GPT to ChatGPT was by applying reinforcement learning to the transformer models.
+
+00:15:11 So this technique is called post-training, which is like you have, you do the supervised learning that Richard was kind of talking about, or you do like what they call pre-training and you generate these like model weights that basically encode like a huge amount of information, like the whole internet.
+
+00:15:24 And then they are, but they're kind of unrefined, right? You can think of it as like a, I don't know, a child with a lot of intelligence, but not very good at communication or something. And they applied
+
+00:15:34 reinforcement learning techniques as a way to sort of tailor the model to specific use cases.
+
+00:15:39 So the first one was for this like chat application. So that's how you go from like GPT to ChatGPT.
+
+00:15:45 And then another example of that more recently is like these coding agents are also a different version of like post-trained LLMs or transformers. So we originally had Ray kind
+
+00:15:57 of used for reinforcement learning, it kind of dipped, and it was used for like LLM things. And now we're actually seeing a huge resurgence in reinforcement learning specifically for this like post-training use case that I was talking about.
+
+00:16:08 Are you guys surprised just how far these GPT type things and Claude Code and so on have come given that you saw a little bit before then?
+
+00:16:17 I remember like Ion would occasionally pull me aside and say like, hey, you should work on like program synthesis, and program synthesis is effectively like a
+
+00:16:30 machine learning problem where you try to get models to write code. And that was definitely not the right approach.
Like, it wasn't like the program synthesis line of work that ended up with coding agents, but like Ion was always
+
+00:16:44 like, hey, why don't we go work on program synthesis? I have no idea what program synthesis is. I like, I have no expertise in this thing, but he wanted to work on the problem. Well, which is funny, because like in five years, seven years later, it turns out like this is like the biggest known
+
+00:16:57 economically valuable sort of application of this machine learning.
+
+00:17:00 And solved in just a completely different way that I don't think anybody really saw coming.
+
+00:17:06 That was definitely an emergent thing. At least for me, I didn't expect that at all.
+
+00:17:10 Yeah. Well, I'm blown away by it. I honestly, I'm happy that it exists. I get to do cool stuff with it, but sure didn't see it coming. This portion of Talk Python To Me is brought to you by Sentry and
+
+00:17:22 Seer AI. There are plenty of AI tools that help you write code, but Sentry's Seer is built to help you fix it when it breaks. The difference is context. Seer isn't just guessing based on syntax. It's
+
+00:17:34 analyzing your actual Sentry data, your stack traces, logs, and failure patterns. Because it has the full context, it can, A, spot buggy code in review and help prevent issues before they happen,
+
+00:17:45 and, B, identify the root cause of production errors. It can even draft a fix and hand the work off to an agent like Cursor to open a PR for you. Seer turns Sentry into a complete loop. You have your
+
+00:17:57 traces, errors, logs, and replays to see the problem, and now AI to help solve it. Join millions of devs at companies like Claude, Disney Plus, and even Talk Python who use Sentry to move
+
+00:18:07 faster. Check them out at talkpython.fm/sentry and use code talkpython26, all one word, for $100 in Sentry credits. Thank you to Sentry for supporting Talk Python.
+
+00:18:21 Let's switch over and maybe set the foundations for what we're talking about with Ray a little bit.
+
+00:18:26 And by that, I mean, let's talk about like different options for parallel computing and that kind of thing. So we have this sort of spectrum of compute, and it sounds to me like
+
+00:18:37 the history, the idea is, hey, let's move towards scaling this compute out across all the cores, across multiple machines, so that when you're doing training and reinforcement learning, things like
+
+00:18:50 that, you can actually take advantage of all the compute. And I'm guessing GPUs as well, right?
+
+00:18:55 Yeah, GPUs are definitely like bread and butter for Ray.
+
+00:18:57 So at the very smallest layer of parallelism, at least in Python land, we've got asyncio, which really still runs on a single thread, but it uses waiting periods like waiting on databases,
+
+00:19:07 waiting on API calls, and so on to interlace work without true parallelism, but still kind of.
+
+00:19:13 We have threads, which really, until recently, didn't do anything much different.
+
+00:19:18 Right? It's just less control structures, right? Because we had the GIL, and now we've got free-threaded Python. So it's a little bit better, but you got to have the library support. We have multiprocessing and sub processes. And that's
+
+00:19:29 kind of what we have out of the box in Python. But then we have stuff that both of you all are familiar with, or have built, things like Spark, or Ray, we've also got Dask and Coiled,
+
+00:19:42 which is, I'm interested to hear how you all see yourself as the same or different than Dask and Coiled and so on, which itself is different than when it started, at least Coiled. So maybe like,
+
+00:19:52 just speak to this arc of trying to get more compute out of our apps.
+
+00:19:58 I would kind of try to organize like a framework for thinking about those. So, and this is a little bit off the cuff.
So hopefully you guys can follow it. But I would say there's kind of like
+
+00:20:08 two axes I would think about. So the first one is like how specific versus kind of how general of like a parallelism framework you have. So something that is like really specific, like the most specific would be something that is like completely tailored to one use case, like a,
+
+00:20:23 this is not really Python, but like a SQL database. Like it's really good at like processing SQL queries, you can't really use it for anything else. And then a little bit more general than that is something
+
+00:20:33 like Spark. So you can use it for this kind of like big data processing type workload, you can use it for some streaming. But if you try to do anything that kind of goes outside the bounds of that, you start to
+
+00:20:44 run into a little bit of trouble because it has kind of an opinionated, like high level API, and an opinionated way that like data moves throughout the system, for example. And then kind of on
+
+00:20:56 the more general purpose end, you have like Ray, and I would say Dask is also more general purpose than the others. So you have like specific to general purpose. And then there's also, I think,
+
+00:21:07 like the scale. So like asyncio is extremely useful for making many like concurrent, like IO bound requests, like HTTP requests, database queries, file operations, like anything like that. But it only
+
+00:21:20 works within one thread. Yeah, it feels a little bit like a scale up lever, even though you're not technically scaling up the hardware. It's like, yeah, you're still in the same box, just the box
+
+00:21:30 can do a little bit more. So asyncio is kind of scale up within a thread even. And then you can also have like scale up within a process. So if you have like multi threading, of course, like with free
What most people do to scale up within a process, like historically with Python is they call into like native code, right? So you're using NumPy, you have basically like Python bindings, but in reality, almost all of the compute is happening in + +00:21:56 like a C extension library. And that's, that's also true for Torch. So those allow you to kind of like scale up to varying degrees, so like scale up within a thread within a process. And then + +00:22:07 multiprocessing also lets you scale up within like a whole host that you could use, you know, 64 cores of machine. And then at some point, you can't even fit on one machine anymore, you need to scale + +00:22:18 out even more. And that's where you need some kind of like parallel computing or like grid computing or kind of cluster framework, like Ray or Dask. It could be you need to scale up because of memory, + +00:22:29 or it could be a CPU, right? I think often people think just CPU, right? We just got to compute more, but it could be we've got a terabyte of stuff to try to process. Could be or it could be also for, + +00:22:40 because you need to use more GPUs, either for compute or for memory, like some of these large scale LLMs, you can't even fit it inside of like one single GPU. So you need to kind of like shard it across many machines. Yeah, we'll even see there's some, + +00:22:54 some ways to put these together, right? Like, I guess it's probably pretty straightforward, but we'll talk about the programming model and stuff. But you theoretically could use, I don't know, multiprocessing or something in your code, but then scale that across machines + +00:23:07 with Ray. Is that possible? You could. Does it make sense? + +00:23:10 Between a lot of these things, I think there's like some kind of unique parts and then some overlap. So like Ray can be used just on one machine. 
In that case, you know, Ray kind of manages its own processes and does like the delegation of work from like what we call your
+
+00:23:24 driver process, which is like the main Python program to the other processes, which in Ray terminology are like tasks and actors. If you're running Ray on one machine, then it looks quite similar to
+
+00:23:36 multiprocessing just with a little bit more opinionated of an API and some like integrated like observability features and stuff like that. But Ray definitely like is designed around the like
+
+00:23:47 multi-node kind of larger scale cluster use case. That's like where the value really comes in.
+
+00:23:52 I think you had a question about like Dask and Coiled. I think Dask and Coiled, they were more of like a
+
+00:23:59 comparison point for Ray, especially because like there was a Pandas on Ray project in 2018. And at that point, I think they were, yeah, it did get brought up more often, but more recently, we don't
+
+00:24:13 hear about Coiled as often. I think in particular because we've sort of, you know, focused our product efforts a little bit more towards the AI side, whereas Coiled, I think is more like a
+
+00:24:24 scientific computing slash like general, you know, scale up Pandas, scale up NumPy sort of approach. So we diverged and we don't see each other that often.
+
+00:24:33 Yeah, from the last time I spoke with Matthew Rocklin, not too long ago, it looked like they were really focused on kind of creating and configuring and managing the infrastructure that allows for grid computing with
+
+00:24:47 data science type of stuff. A lot of like managing AWS and scaling them and so on, more than the original Dask story, I think. All right. So, well, that brings us to what is Ray? I mean,
+
+00:25:00 we talked a little bit about it, but like, just give us the, like, what would you tell people if you made it a conference talk or something?
+
+00:25:06 You want to take this, Richard? You want me to?
+ +00:25:07 Yeah. I mean, I can start. + +00:25:09 We've both given that conference talk many times, by the way, so we should be good at this. + +00:25:12 Here's a rehearsal. + +00:25:14 So Ray is, the way I would probably put it is like, it's a, it's a distributed execution engine for AI workloads. And in particular, it handles a lot of the orchestration aspects of the AI workloads and + +00:25:27 also has a variety of first party and third party libraries that are built on top of it to help scale these AI workloads that we, we often see. So two popular, very, very popular applications of Ray today are, + +00:25:41 Reinforcement Learning and then Multimodal Data Processing. Both of them are very, very relevant in today's AI world, but Reinforcement Learning libraries, a lot of the third party ones, they will use Ray for + +00:25:53 coordinating the different components that you need to do Reinforcement Learning with. There's like an inference engine that's involved. There's a training engine that's involved. And there's also like agents and sandboxes that are involved. So all three things, all, all these things need to be + +00:26:07 coordinated by one central orchestration system. And it's way easier to write this in Ray because Ray gives you that, that ability to control all these components as if you're writing single-threaded + +00:26:18 code. Multimodal Data Processing is the other big one where existing data processing libraries will focus on the ability to handle tabular data and work with Parquet, Iceberg, Delta, so on and so forth. + +00:26:29 Whereas like Ray finds its niche more in the, like the intersection between the data and the GPU. + +00:26:36 And so typically you're working with like larger unstructured data, for example, like images or embeddings. And oftentimes that requires like more complex scheduling and more complex orchestration + +00:26:48 that Ray is really good at.
Given the origins, it certainly makes sense that you've got this focus on really nailing ML training and other types of workloads. Is it relevant to people who are just doing, I don't know, time series work or? We were going to talk about this at some point, but the, + +00:27:03 we kind of organize Ray in terms of like layers in a way. So there's what we call like the base, like Python API, which is quite simple. It's really just like, you know, for like people very familiar with Python, + +00:27:14 you could think of it as like multiprocessing for a cluster. So that's what we call kind of Ray Core, it's like that base, like distributed execution engine, sort of like core primitives for scaling up, + +00:27:25 distributing work and handling failures and like just overall kind of parallelism. And then on top of it, we have like a lot of library integrations, like that's what the Ray libraries are, + +00:27:36 like Ray Train and Serve. And then some of these post-training libraries. So that core layer is like absolutely relevant for non kind of AI workloads. And we do have many, many users that use it for + +00:27:48 things like in the finance world, they use it for parallel back testing or time series analysis, like you mentioned. Yeah. And any kind of like generic, just like parallel workload that you + +00:27:59 need to scale beyond the single machine. Now I'm thinking of it in finance and real-time trading type stuff. You could be running a whole bunch of scenarios in reverse. And there are many of the + +00:28:09 largest hedge funds that do exactly that using Ray. From my understanding, we could use Ray even on one machine. And it has some capabilities to help you sort of take better advantage of all your hardware. + +00:28:20 Like even my little streaming Mac mini has 10 CPUs and I just write regular Python code, I get like 16% or something or 10% of that. Right? Yeah. You certainly can use Ray on one node.
+ +00:28:31 I think actually the kind of most compelling part of that is you can do it for development. So you can like, if you're working on this kind of large scale post-training thing, if it's useful to kind + +00:28:43 of think about what you'd have to do without Ray. So you would have like four different containers, each one would have its own like Python entry point, and you'd have to kind of like run and + +00:28:53 orchestrate them as like these independent services. So eventually maybe you'd like deploy them on Kubernetes or something like that. But even when testing locally, it's like, if you want to run all of them and like, make sure that kind of the integration points work well, and like quickly be + +00:29:07 able to like iterate and debug stuff. It's really painful if those are all kind of like loosely coupled as different processes. And especially if the way that you start them on your local machine is going to + +00:29:18 be very different than when you actually go to like scale it up in a cluster. Even if you just make a change, like, okay, now I got to go restart all the workers and so on. Right? I think a lot of people can relate to that pain. And with Ray, the thing that's really cool is you can, you can write kind + +00:29:32 of one Python script that like starts all those different processes and does the orchestration. + +00:29:37 You can run it just on your like local Mac or whatever local machine you have. And then once you kind of like have it working, then you can run it on a cluster and like scale it up using like + +00:29:46 the same code. Does it come with cluster management in terms of like infrastructure's code type of stuff? + +00:29:53 Will it spin up nodes and so on? Or do you have to have your cluster set up and then just it knows about it? You know what I mean? The answer is kind of both depending on your use case. + +00:30:03 So I'd categorize it as like there are maybe three or four ways that people run Ray clusters. 
+ +00:30:09 So the first is using a tool that we call like the cluster launcher. So this is kind of like if you're an individual practitioner and you just want something like really low friction, + +00:30:19 we have a tool that will basically like spin up a Ray cluster on like AWS or GCP or Azure, or even on your own set of hardware, like you can kind of like bring your own set of machines. + +00:30:31 But that's not really like a fully managed experience. You can also run Ray on Kubernetes. + +00:30:36 So there's a community-led project called KubeRay, which is a pretty tightly integrated like Kubernetes operator that makes it really easy to like run Ray clusters on Kubernetes. + +00:30:46 Or you can use like a more managed service like AnyScale, obviously where Richard and I work, we have like managed infrastructure for Ray clusters. But there are also, I think there are some other providers where you can run Ray clusters too. Like AWS has an offering or + +00:31:00 Domino Data Labs has an offering. And I think there are a few more as well. + +00:31:03 You know, it makes a lot of sense that you guys have this sort of let us run the infrastructure side. We'll talk more about that later. With KubeRay though, do you just say like, as long as you have a Kubernetes cluster, you can just let it kind of create pods and scale up or down + +00:31:17 as demand is needed there, something like that. + +00:31:19 When you install KubeRay into your cluster, it will basically run the KubeRay controller as like a background pod. + +00:31:24 It's called like an operator in Kubernetes lingo. And then at that point, you now have these like custom resources. So you can like create a Ray cluster or a Ray job as like a custom resource. + +00:31:36 And then it will get spun up as a bunch of pods and they will connect to each other and get health checked. And all of that infrastructure management is done. + +00:31:43 KubeRay is pretty, pretty active.
2.5 thousand GitHub stars, commits 17 hours ago. Nice. + +00:31:49 There's a huge community kind of initiative behind KubeRay and like we're involved with it too, but it really kind of is like kind of taken on a life of its own. And it's really useful too, because like even on Kubernetes, everyone's environment is a little bit different. So having + +00:32:04 maintainers and committers from like many different companies and people who are running in like different environments makes it easier to sort of cover all the bases. + +00:32:12 For sure. Yeah. That diversity of use cases and stuff is always nice to create a better, better API, better library, and so on. + +00:32:22 This portion of Talk Python To Me is brought to you by Agentfield. What happens when you give hundreds of AI agents a shared code base and let them write code, review each other's work, + +00:32:31 and ship to production? Well, that's exactly what the team behind Agentfield AI built. And the wild part, it's not some proprietary system locked behind a paywall. It's an open source Python library. + +00:32:44 Now, where most agent frameworks have you wiring up DAGs and workflows, Agentfield lets you build AI agents the way you'd build FastAPI microservices. Think typed Python functions that become autonomous + +00:32:57 services. They discover each other at runtime, call each other like APIs, scale independently, fail independently, and recover on their own. And here's the thing. You're not just orchestrating + +00:33:08 LLM calls. You can orchestrate entire autonomous tools, spin up multiple Claude Code instances, Codex sessions, any coding harness you want, all running as live nodes on the same architecture, + +00:33:21 collaborating and verifying each other's output. That's how they built the factory. And it's completely free and open source. Check it out at talkpython.fm/agentfield. That's talkpython.fm + +00:33:32 slash agentfield. The link is in your podcast player show notes.
Thank you to Agentfield for supporting the show. Let's talk through an example. You have a bunch of examples. So you have examples, + +00:33:43 and then you've got, is that also the gallery? Are these the same thing? I think those are the same. + +00:33:47 There's a ton here. This is kind of like all of them, and the others are like the highlighted ones. + +00:33:52 Some highlighted ones. Sure. Got it. So I think it would be nice to talk through the experience of doing a project in Ray, keeping in mind that it's always hard to talk about code over audio, + +00:34:05 but you know, let's maybe, maybe we could just like sort of skim over whoever wants to sort of narrate this experience of like going through one of the examples, you have an audio batch inference type of scenario. Maybe we could talk. + +00:34:17 Can you scroll down so that I know where I'm going to end up? + +00:34:20 Yeah. Do some Whisper stuff, do some GPU stuff, some LLM stuff, persist a curated subset, that sort of thing. Cool. Yeah. I kind of get the sense. So Ray is basically very similar to writing a + +00:34:34 standard Python script. So like ideally the way you sort of think about things in or in the way you read the code, it should be very similar to, should be minimally intrusive and should be very familiar + +00:34:45 with how you're, how you might sort of reason about, about like, you know, serial code or like single thread code. And so like, obviously the, the, a lot of the things that we do here don't demonstrate, + +00:34:58 or like demonstrate how you might sort of set up a project by yourself. So including like standard pip installations, you can use uv if you want and then like standard imports. Right. And then moving down, + +00:35:08 we started to enter like using Ray Data, which is the data processing multimodal data system that we have. It's a library on top of Ray and it provides a lot of simple abstractions to do all sorts of like + +00:35:23 big data tasks.
So like here you have example, which is simply just like reading the dataset and then like subsampling it. + +00:35:28 So let me ask you a question about this. So you basically say ray.data.read_parquet and you give it an S3 link to a parquet file, presumably either signed or public. When I say that, does that + +00:35:39 load it into one machine or does that instruct all of the workers all to go and load this? + +00:35:44 It actually doesn't load anything, but if you do end up executing it, right? So it's lazy. So, so right now what you're doing is you're just actually just like constructing this, this program. + +00:35:56 But when you do execute it, it will execute on all the processes or like, you know, across like the entire cluster. + +00:36:02 In this scenario, it doesn't necessarily need to have one of them populate the data for all the others. They can all go straight to S3 and get it. + +00:36:08 And particularly in this example, this has, it probably points to a folder and the folder has many different files. + +00:36:15 Ah, so maybe it breaks. Yeah. Yeah. Maybe it breaks it up. + +00:36:18 We have a thing where every single line of the parquet file, every single row has some set of bytes. + +00:36:25 And what we want to do is transform those bytes into a, you know, something that's more manageable, like a numpy array. So that's kind of what we're doing here. We're loading the data + +00:36:36 with torchaudio, and then we're doing some resampling and then, and then we're sort of like returning that back to Ray Data. So this is like a single map task, with like a single function. + +00:36:47 So you write a function that does this, what you just described. It passes in an item. + +00:36:52 It's a row basically. Yeah. So I think it's like a row in the parquet file. And then you just say, go to your data that you, you know, you loaded with Ray and you say map, give it the function, not call the function, right?
Just give it the pointer to the function. + +00:37:05 That's right. + +00:37:06 And it figures out like, okay, here's how we'll distribute it across the cluster. + +00:37:10 This map, this resample function will be executed on like hundreds of processes across the cluster. + +00:37:16 And maybe it'll do something smart, like say I'm on row 1000. So it could do a skip, maybe, or something like that, potentially. + +00:37:23 All the data is already like sharded. + +00:37:25 Got it. + +00:37:25 So it will take the, whatever is available, and then it will just like run the function. + +00:37:31 That's pretty cool. And then you've got your Whisper processor. Definitely have written some Whisper processing code lately. This uses a class, not a function. And the reason for this is that, + +00:37:44 as you might have experienced, like loading Whisper might take a little bit of time. + +00:37:47 Yes. + +00:37:47 If you scroll to the right on this. Okay. So here we don't use it, but like, you can also move the Whisper model onto a GPU. And the way you would do that is you set on the bottom, and you just use like, you know, num GPUs equals one. + +00:37:58 Right here, it says device equals CPU, but yeah, but you could put GPU here, huh? + +00:38:02 You could. And also in map_batches, you would put like the num GPUs, whatever. + +00:38:07 Yeah. + +00:38:07 What's happening is that as you are doing the execution, what we will do is we will spawn a bunch of these classes across different processes on the cluster. And so they'll be + +00:38:19 able to like preload the model, and then you can send data to this class, and then it will call the double under call. And then you have this basically like operator that's streaming data in and out. + +00:38:31 I have something very embarrassing to admit, which is these double underscore methods. I always knew they were called dunder methods, but I didn't know that it's because it's like double underscore.
+ +00:38:41 I just put that together when Richard said double under. I've been using Python for like, you know, well over a decade and I never put that together. + +00:38:49 You know, what's really interesting, because I have to talk about so much of the stuff that is written and yeah, I've certainly gone through stages where like, I'll get a message, Michael, not like that. They say it like this. Like really, but how are we supposed to know? There are so many + +00:39:02 projects. I mean, dunder doesn't necessarily fall under this, but there's a lot of open source projects that could be pronounced so differently, so many ways. And I've seen a few that will have an MP3 file or an audio file that says, this is how it's pronounced. Press play. You know what I mean? + +00:39:17 Yeah. I'm right there with you. Amazing. One thing I wanted to cover with that. So that num GPUs thing is like really powerful. This is kind of like one of the core like powers of Ray. So this means that + +00:39:28 like, you know, if you think about this pipeline, right, we had first, we're kind of like chunking up the data and reading it across a bunch of processes in the cluster. So that's like a like IO bound + +00:39:38 operation. And then we had some kind of like pre-processing logic where we were like transforming those audio files, which is like a CPU bound operation. And then now we're doing this like + +00:39:48 GPU step, which here it's like this whisper preprocessor, or it could be any kind of like ML model inference or anything that runs on a GPU. So you have these like kind of very different + +00:39:59 like compute profiles, like the IO bound, the CPU bound, the GPU bound. And Ray, like the thing that makes it so powerful is that you can express this in like one program. And then you can also like + +00:40:10 efficiently use all of those resources. Okay. So maybe I've got five GPUs, but I've got a whole bunch of cores on each machine. 
Would it maybe make different choices about how it scales, given the different resources, like thinking about GPUs versus CPUs? + +00:40:25 Yeah, that's exactly right. So you would, you know, maybe you need like four CPUs per GPU to like keep the GPU busy. So Ray Data will, will basically do that kind of auto scaling itself in order to like + +00:40:37 keep the GPU as busy as possible. And this Ray Data, it says ray, a raw ds. + +00:40:42 That's a dataset. Yeah. Dataset. Does this have any analogies or sort of similar APIs to like Dask or not Dask, to Polars, Polars or Pandas or any of these other, does it try to pretend to be one + +00:40:57 of these other things or is it just its own library? So the way you would do like a data frame library, I think would heavily index on the interactive experience. And that's not something that we + +00:41:10 focus so heavily on. In fact, like there's oftentimes where like, and also the other thing is like all those libraries, they will, like Dask and Polars and Pandas and so on. Like they will focus a lot on + +00:41:24 tabular data. And I think that's like, that's important, but it's not like our strong suit. + +00:41:31 Like our, the thing I think we would want to be 10x better is, is being able to do this sort of like heterogeneous compute and being able to orchestrate like very complex pipelines very simply. Whereas, + +00:41:43 and then like come back and sort of improve and make the tabular support like just on par and usable. + +00:41:50 I think that makes a lot of sense. It absolutely does. I guess maybe the last little bit, we don't have to go through this whole example, but maybe the persist story is a little bit interesting. + +00:41:59 The, if you go up one more, like the, to the tab before, I think actually, this is also very interesting where we're actually using the LLM-based quality filter. Okay. + +00:42:08 We're using vLLM as part of the pipeline. So vLLM is like an optimized inference engine for LLM models.
+ +00:42:16 And what you can do with Ray Data is you can actually just say like, Hey, I just want to shove vLLM into one of the stages. And I want to, you can even do like more complex parallelism and you can see like, Hey, this model is like a trillion parameters. + +00:42:28 And I just want to like put it somewhere inside. And that's something that you can very easily do with Ray Data. Is this an open weights, local running model or is, is that something like an API call to + +00:42:39 this? I mean, you can do here in this example, it is an open weights model. So you would be able to self host and you can, there's also APIs to do like Anthropic calls. Yeah. That is an interesting idea to + +00:42:50 put that in the middle there. And finally, like, yeah, writing out, you can write out to any sort of storage, like S3, NFS, so on and so forth. It's useful for like the data transformation tasks. + +00:43:01 This again, well, it's not like you're pulling all the data to one process and then writing, it's like a distributed kind of partitioned write, right? To the same file or to a set of files? + +00:43:10 To a set of files. Yeah. That makes sense. That seems a lot easier to coordinate. Like they just have. Yeah. Otherwise you'll have problems. Yeah, exactly. A bit of a race condition or something. + +00:43:19 Okay. This is super neat. I think this is a cool way to start writing the code, but then you've got to, you know, visualize it, right? See what's going on. So you have a dashboard, which is pretty cool. + +00:43:30 I'll scroll down and try to find some pictures of the dashboard. There's some, there's nice videos here as well, but it gives you, tell us about the dashboard. It gives you a lot of views into what's happening. The first thing I'd say is like, you know, the mission of Ray is sort of like make + +00:43:42 distributed computing easy.
And I think anyone who's ever written like a multi-node, like application of any kind knows that like observability and debugging are like one of the core problems + +00:43:55 anytime that you're scaling out. So yeah, we invest a lot in this like observability tooling. + +00:43:59 So the Ray dashboard, it kind of mirrors the rest of Ray where we have sort of this like core, like parallel computing, like primitive part. So the Ray dashboard, you know, you can get like a + +00:44:10 cluster level view where you see like a summary of each node and like the resource consumption, like, you know, is it fully utilizing the CPUs and GPUs? What is running on that node? Like that + +00:44:21 kind of physical layout. But then we also have like more logical views. So what's shown on the screen now is this like task and actor breakdown. So you can see, you know, if you've submitted a thousand of, like, a read task, if you think about how that Ray Data pipeline works, you're like + +00:44:36 submitting a bunch of tasks that are reading the data, you can see how many of those are running, how many have completed, if they failed, you can get like a summary of the stack traces. And then we + +00:44:46 also have some like higher level views that are specific to the Ray libraries. So you can imagine like this Ray Core layer, it's really like kind of generic. So you have like tasks and actors and + +00:44:59 nodes, but it doesn't necessarily tell you about like, you know, the high level summary of what's happening in that data pipeline that we were talking about a few minutes ago. So we also have some, + +00:45:08 some high level visualizations for like serving and training that help you understand what's happening in that. + +00:45:14 There's a bunch of different libraries that you've talked about.
I don't know how much time we really have to go all into them, but you've got Ray Core, which we talked about, and then Ray Data, which we were using to read the data, but Train, Tune, Serve, RL for reinforcement learning. + +00:45:28 And then even more libraries. + +00:45:29 Yeah. + +00:45:31 Expanded out to more libraries. + +00:45:33 One like high level comment is, I think Richard kind of mentioned this earlier, but like one of the things that we've really invested in a lot is like building this ecosystem around Ray. We want + +00:45:42 people to feel like Ray is not just a tool for like one workload. It's really something you can like build a platform around. So if you're doing any kind of like a large scale, like machine learning + +00:45:53 or AI, you know, Ray is, it's like, if you kind of build the infrastructure or like you use managed infrastructure for like the cluster setup and all that stuff. And then the people who are actually + +00:46:04 like writing the applications are like really empowered because they can write just like Python scripts to do all these different types of use cases from like training to tuning to RL + +00:46:14 to data processing. So yeah, we see, I think it's very common that people who are using Ray are not just using one of these libraries. They're really kind of using a slew of them or maybe even all of them. + +00:46:25 I do think it empowers people quite a bit. Like write code, kind of like, you know, but call a Ray function instead. And then guess what? It's distributed across a bunch of machines, which is a really hard problem to solve. One of the extra libraries that's cool is the multiprocessing pool. + +00:46:40 I just saw that one. We expanded it. That's kind of cool because if you're already trying to do scale out through multiprocessing, just to take advantage of the local cores, you could just say, use the Ray util multiprocessing Pool and then boom, off it goes. Right.
+ +00:46:54 I haven't looked at this in a long time. This is something that I wrote like eight years ago or something. + +00:46:59 2020 probably. + +00:47:00 It kind of one of those, I think that would be very general purpose. + +00:47:03 It's also, I think a good like conceptual introduction to Ray because, you know, people are familiar with multi-processing and they know that they can like use it to scale out on one node. Well, then Ray is just kind of like the next step if you want to scale out across multiple nodes. + +00:47:17 One thing that I thought is really cool is also you've got a debugger and a VS Code, presumably open VSX as well, extension that you can install and like look at the cluster, look at the jobs + +00:47:30 running. If something crashes, it'll like break and wait for a debugger to attach potentially. + +00:47:35 You want to talk about that? + +00:47:36 It's kind of like if you could use PDB, but across the cluster. So you can, you can like set a break point, like inside a remote function, that remote function might be running on like a different, a different machine. And then if like an exception is raised or like, there's + +00:47:51 just something happening there that like you couldn't debug locally, then you can like attach remotely to that process. And you can, you know, you can get like a backtrace and you can inspect local variables and stuff like that. + +00:48:03 It's very useful in the cases where maybe you did like local development and everything was working fine. And then for some reason, when you like deploy to a cluster, something is going wrong. Like maybe there's one piece of data that like is behaving in an unexpected + +00:48:17 way. This kind of gives you a way to directly debug that without having to write a ton of print statements and filter through them as I'm sure many people have. + +00:48:25 Exactly. You don't, you don't have to like print step one, step two, step 2.1, step 2.2, step 3. 
Like, cause you had to insert some more like to like break it down. + +00:48:36 The step 2.2.3.a has saved me a lot of times in my life though. + +00:48:41 I mean, it's like basically a bisection algorithm to find the problem, but like the, it's like having to go and redo the line numbers and basically eventually you just need to leave a gap. + +00:48:51 But it is really nice to use in VS Code because it gives you nearly the same debugger experience as you would get just for like a regular debugger. I saw a YouTube video about this and the question that somebody asked was, Hey, is there a PyCharm version of this? + +00:49:04 Is there a PyCharm version of it or just, just the VS Code derivatives? + +00:49:08 I think it's only VS Code, but Hey, we're always looking for contributors. It's probably not, it's probably not that hard to extend. It's just a, as you can see from the number of libraries over there. The Ray team is quite busy. Let's talk real briefly about the ecosystem. + +00:49:22 We're getting a little short on time, but what is this ecosystem compared to like all of your tools? + +00:49:27 So integrations with say like Airflow, Apache Airflow, or even Dask, which is kind of interesting that it integrates with Dask. And so what's the story with this? + +00:49:36 I think there are two aspects to integration. Actually, I'm reminded, I need to update this page. + +00:49:42 There's like projects where you want to interoperate with Ray. So they sit side by side or like, it's like a complementary tool. Airflow is an example of that. Like Dask would be like something + +00:49:55 where you can do a lot more of your data processing on the side and then, and then Ray stuff on the other side. Flyte would be like another, so, you know, workflow or automation, you would like use + +00:50:05 that with Ray, but not like in Ray or around Ray. Whereas like there are other projects that are built on top of Ray.
So like Modin that you just saw, Daft, these are libraries that, that leverage Ray + +00:50:18 to, to orchestrate and scale. And there's like a separate API and Ray isn't necessarily exposed as the API to the users. So yeah, so I think that's something that is particularly like lively, especially + +00:50:30 now in the reinforcement learning and multimodal data processing space. Frankly, I'm looking through this, like a lot of these projects have sort of like gone, gone, like have sort of evolved or like, + +00:50:42 have like lost their community. And I think there's a, actually a massive Ray ecosystem that isn't represented on this, this screen here that is like actively building on top of Ray. + +00:50:52 All right. Well, just give you some homework. There you go. + +00:50:54 Yeah. Richard kind of mentioned this, but the way I think about it is like, kind of like things above Ray and things below Ray. So like above Ray is like the, like higher level libraries, like the reinforcement learning library, data processing library. And then below Ray is like + +00:51:08 integrating Ray into like the different infrastructure. So like with Airflow and KubeRay, and basically like allowing you to run Ray on top of like any type of like hardware cluster + +00:51:19 management solution. So we really like try to view Ray as this kind of like, like if people, I don't know if I'm dating myself, but you know, in the internet model, there's like the narrow waist, right? Which is like TCP/IP. So we view Ray as kind of like the narrow waist of the like AI, + +00:51:33 like distributed computing ecosystem. + +00:51:35 One more thing. I think we're, we've got time to talk just a little bit about the business model. + +00:51:40 So over on Ray.io, I can see that I can go to like GitHub or go to the docs, but also you've got AnyScale, which basically is the infrastructure behind running Ray, right? Is that + +00:51:53 sort of the business side of Ray?
+ +00:51:56 AnyScale is a company, but also a product. So for example, like Ray is like a software library that you can run, but there is a lot of, if you're sort of deploying Ray for like an internal + +00:52:08 platform for a company, like there's still a lot of other bells and whistles that you'll, you'll sort of want. So for example, like being able to have a fast interactive development, + +00:52:18 being able to optimize, like the time it takes for the workloads to start up, having great observability and debuggability and being able to sort of like share resources across different teams within, + +00:52:31 within like across different Ray jobs. And, and then also being able to optimize your Ray workloads. + +00:52:37 So these are all like features and capabilities that you'd get with AnyScale. And, and yeah, and then also like, you know, support, being able to sort of deploy and manage and upstream fixes to + +00:52:48 Ray that sort of help your enterprise, like your company, achieve the goals and needs of your machine learning platform. That's like a lot of stuff that we do. + +00:52:56 You know, I think this is one of the core ways that people are making open source stuff, their business, right? Like we built you a great library, but there's this whole operational side of it that you maybe either don't want to do, or you don't have a bunch of servers or whatever. + +00:53:10 And we'll just, for a price, we'll just take care of that. Right. + +00:53:12 There's like a couple of ways that you can go. Like, so one thing I want to, I want to say is that having a company, like a successful company behind Ray is like critical for its health. Like, there's no way that we could have, that we could have built like as many of the + +00:53:27 libraries and like funded as many of like the ecosystem integrations.
And like, I mean, just built something with as big of a scope as Ray, if we didn't have like a company backing it, like paying as many people to work on it as we are. And yeah, I think there's like a few different

+

00:53:40 ways that you can go about this, like kind of open source monetization thing. Like AnyScale's model is, is largely this, yeah, like managed infrastructure and like the hard parts around it. You know, there's some people that also kind of go for the more like support expertise model. I think that

+

00:53:54 could work if, you know, if you really want to like stay small, like if you have a smaller open source project, it's just a couple of people. And like, you know, you're trying to make enough money to survive and keep working on that project. Then honestly, I think that's the easier route

+

00:54:06 than trying to build a whole managed product because it's, it's not easy.

+

00:54:10 It's kind of a, kind of just a consulting story. This, this other side you're talking about is like, I will be your X open source project, X consultant. And guess what? I created it. So I'm, who else is going to be better? You know what I mean?

+

00:54:23 That's very real. Like if I would recommend like a lot of open source people, like consider that, even if it's just the, like the start of something is like, that's the way that you really like engage with people and understand their problems and like understand where the business value is.

+

00:54:37 A hundred percent. Let me ask you one more tech oriented question before we call it.

+

00:54:41 What about deployment? I have 10 servers in my cluster. I changed one line in my code and I want to try it now. Now what? How hard is it to get it to update everywhere?

+

00:54:51 So that is something that we, that I personally spent a lot of time working on. I think that Ray actually has a very good story for it. So there's like, there's kind of a tiered approach. So it sort of

+

00:55:02 depends.
Like obviously if you're changing, like if you need a different, like, like CUDA version or something, then that will require you to basically like redeploy the cluster. But that's something that happens like pretty seldom. Like, you know, maybe you do that every couple of months, + +00:55:16 something like that. If you're just changing, like, you know, in Ray, you have this like driver script, which is the main like orchestration code. So if you're just changing that, and that's like kind of what you're iterating on, like more frequently, then you can just change like that code + +00:55:30 inline. And then when you submit the job or like connect to the cluster, Ray has this thing called runtime environment, which includes basically auto packaging your local code. So what it does is it + +00:55:40 actually just like zips up the local files, uploads them to like a coordinator process in the cluster. + +00:55:46 And then when you go to actually run the tasks and actors that require that code, they have like kind of a, an internal ID that points to it, and they'll pull it down. So that means that you can, like, if you're just editing your script and rerunning, it's a matter of like less than + +00:56:00 one second to update. Oh, that's nice. Yeah. Yeah. That's a huge productivity gain. + +00:56:05 Yeah. I was thinking this must, the more you scale out, the harder it's going to be as well. Right? + +00:56:09 Yeah. And if you need to wait for a hundred nodes to pull a Docker image, every time you change one line of code, you're going to have a bad time. That makes me think of one more real quick thing is, so I have a job that's running. Maybe it takes 10 minutes. I make a change three minutes after + +00:56:23 submitting it, a new version gets deployed. What's the story with versioning running workflows? + +00:56:28 That's something where, that we kind of like leave to the outside of Ray layer. So a lot of people have different ways to do that. 
Like if you're running on Kubernetes, like maybe you're like checking in

+

00:56:39 your CRD into your like repo, or maybe you're using something like Apache Airflow. So we kind of leave that to like the orchestration layer. Like inside of AnyScale, we have a concept of like an AnyScale

+

00:56:49 job, which is sort of the code artifact and like the cluster configuration and your like infrastructure configuration. So that's like inside of AnyScale, that's kind of like the unit

+

00:56:59 of like reproducibility or versioning. And yeah, folks basically build that kind of on top of Ray.

+

00:57:04 Well, very cool project, Richard and Edward. Thank you both for being here. How about a final call to action? People are interested. They want to get started with Ray. What do you tell them?

+

00:57:13 Go to the Ray website and try it out.

+

00:57:15 Check out the documentation. We've got a whole lot of examples.

+

00:57:17 Awesome.

+

00:57:18 Yeah. I would say any kind of machine learning workload or, or just general, like parallel Python, like just give it a spin. Amazing. Well, thanks for being here and talk to y'all later. Thank you.

+

00:57:27 Thank you.

+

00:57:29 This has been another episode of Talk Python To Me. Thank you to our sponsors. Be sure to check out what they're offering. It really helps support the show. This episode is sponsored by Sentry's Seer. If you're tired of debugging in the dark, give Seer a try. There are plenty of AI tools that

+

00:57:43 help you write code, but Sentry's Seer is built to help you fix it when it breaks. Visit talkpython.fm/sentry and use the code talkpython26, all one word, no spaces, for $100

+

00:57:54 in Sentry credits. What if your AI agents worked like FastAPI microservices, typed, autonomous, and discovering each other at runtime? That's the world Agent Field is building. Join them

+

00:58:05 at talkpython.fm/agentfield.
If you or your team needs to learn Python, we have over 270 hours of beginner and advanced courses on topics ranging from complete beginners to async code,

+

00:58:18 Flask, Django, HTMX, and even LLMs. Best of all, there's no subscription in sight. Browse the catalog at talkpython.fm. And if you're not already subscribed to the show on your favorite

+

00:58:29 podcast player, what are you waiting for? Just search for Python in your podcast player. We should be right at the top. If you enjoy that geeky rap song, you can download the full track.

+

00:58:38 The link is actually in your podcast player's show notes. This is your host, Michael Kennedy.

+

00:58:42 Thank you so much for listening. I really appreciate it. I'll see you next time.

+

00:58:46 Bye.

+

00:59:16 Bye.

+

diff --git a/transcripts/parallel-python-at-any-scale-with-ray-transcript.vtt b/transcripts/547-parallel-python-at-any-scale-with-ray-transcript.vtt
similarity index 100%
rename from transcripts/parallel-python-at-any-scale-with-ray-transcript.vtt
rename to transcripts/547-parallel-python-at-any-scale-with-ray-transcript.vtt

From 60c53d5915654f16a089060b6c3393e3d11cc5ce Mon Sep 17 00:00:00 2001
From: Michael Kennedy
Date: Wed, 6 May 2026 14:24:18 -0700
Subject: [PATCH 15/16] transcripts

---
 ...ent-sourcing-with-chris-may-transcript.txt | 2140 +++++++++++
 ...ent-sourcing-with-chris-may-transcript.vtt | 3211 +++++++++++++++++
 2 files changed, 5351 insertions(+)
 create mode 100644 transcripts/548-event-sourcing-with-chris-may-transcript.txt
 create mode 100644 transcripts/548-event-sourcing-with-chris-may-transcript.vtt

diff --git a/transcripts/548-event-sourcing-with-chris-may-transcript.txt b/transcripts/548-event-sourcing-with-chris-may-transcript.txt
new file mode 100644
index 0000000..9d2e5b5
--- /dev/null
+++ b/transcripts/548-event-sourcing-with-chris-may-transcript.txt
@@ -0,0 +1,2140 @@
+00:00:00 What if your database worked more like Git?
+

00:00:02 Every change captured as an immutable event instead of a single mutating row that quietly forgets its own history.

+

00:00:09 That's event sourcing.

+

00:00:10 And Chris May is back on Talk Python, fresh off our Datastar panel, to walk us through what event sourcing actually looks like in Python.

+

00:00:18 We'll cover core patterns, the libraries to reach for, when not to use it, and why event sourcing turns out to be a surprisingly good fit for AI-assisted coding.

+

00:00:27 This is Talk Python To Me, episode 548.

+

00:00:29 Recorded May 5th, 2026.

+

00:00:47 Welcome to Talk Python To Me, the number one Python podcast for developers and data scientists.

+

00:00:54 This is your host, Michael Kennedy.

+

00:00:56 I'm a PSF fellow who's been coding for over 25 years.

+

00:01:01 Let's connect on social media.

+

00:01:02 You'll find me and Talk Python on Mastodon, Bluesky, and X.

+

00:01:05 The social links are all in your show notes.

+

00:01:08 You can find over 10 years of past episodes at talkpython.fm.

+

00:01:12 And if you want to be part of the show, you can join our recording live streams.

+

00:01:15 That's right.

+

00:01:16 We live stream the raw, uncut version of each episode on YouTube.

+

00:01:20 Just visit talkpython.fm/youtube to see the schedule of upcoming events.

+

00:01:24 Be sure to subscribe there and press the bell so you'll get notified anytime we're recording.

+

00:01:29 This episode is sponsored by Sentry's Seer.

+

00:01:32 If you're tired of debugging in the dark, give Seer a try.

+

00:01:35 There are plenty of AI tools that help you write code, but Sentry's Seer is built to help you fix it when it breaks.

+

00:01:40 Visit talkpython.fm/sentry and use the code talkpython26, all one word, no spaces, for $100 in Sentry credits.

+

00:01:49 And it's brought to you by Temporal, durable workflows for Python.
+

00:01:53 Write your workflows as normal Python code and Temporal ensures they run reliably, even across crashes and restarts.

+

00:02:00 Get started at talkpython.fm/Temporal.

+

00:02:04 Hey, Chris.

+

00:02:05 Hey there.

+

00:02:05 How's it going?

+

00:02:06 I'm well.

+

00:02:07 How are you?

+

00:02:08 Oh, good.

+

00:02:08 Good to hear it.

+

00:02:09 I'm happy to have you back on the show.

+

00:02:11 Last time we had you on the show, you were part of the panel around Datastar, and that was cool.

+

00:02:16 Now we're going to talk about event sourcing, but we'll find a way to tie it back to Datastar just a bit, I think.

+

00:02:21 I see it on the horizon.

+

00:02:23 Or on the show notes.

+

00:02:25 Sounds good.

+

00:02:25 One of these.

+

00:02:26 Yeah.

+

00:02:26 Well, not everyone listens to every episode.

+

00:02:28 We've got new listeners coming all the time, so give us the quick intro.

+

00:02:32 Who are you, Chris?

+

00:02:33 Who am I?

+

00:02:33 Let's see.

+

00:02:34 I am a Python developer for about 20 years and a long-time listener to the show, so it's a very big privilege to be on, finally, my own self.

+

00:02:43 And by the way, great job continuing going, Michael.

+

00:02:46 You're constantly putting out great content, so I really appreciate all your work.

+

00:02:50 But as far as me, I learned to program as an adult.

+

00:02:53 A friend suggested I learn Python.

+

00:02:56 I hitched my wagon to this engine, and I've loved it ever since.

+

00:03:01 I was a technical coach for a little while.

+

00:03:04 I started the Python group here in Richmond, Virginia, PyRVA.

+

00:03:09 So if you're local, come out.

+

00:03:10 In fact, we're just two weeks away from the next meeting.

+

00:03:14 Awesome.

+

00:03:14 How frequently do you have meetings?

+

00:03:15 Once a month now.

+

00:03:16 In fact, if you look behind me, you can see all these blue dots are the meetings.

+

00:03:20 Okay, excellent.

+

00:03:22 Keep planning.
+ +00:03:22 What are some of the kind of topics you all have? + +00:03:24 Oh, man. + +00:03:24 We range pretty much whatever we can figure out that people are interested in. + +00:03:29 We've had a number of AI discussions. + +00:03:32 In fact, those have been really powerful. + +00:03:34 We've kind of lined up the chairs in a big circle and just have a discussion. + +00:03:38 And it's really incredible. + +00:03:40 You know, you have people on the whole spectrum of opinions about AI. + +00:03:47 So, yeah, it was very, very... + +00:03:49 That was one of my favorite... + +00:03:50 I was going to say episodes. + +00:03:52 One of my favorite meetings. + +00:03:54 I think we might bookend this podcast episode with a little AI at the start, a little AI at the end. + +00:04:00 Sounds good. + +00:04:01 Cool. + +00:04:01 So, yeah, people will just go attend that. + +00:04:04 And I guess probably a meetup.com is where they find you. + +00:04:07 It is, yeah. + +00:04:08 And you can go to pyrva.org to get to the meetup page either way. + +00:04:12 Yeah, easiest way. + +00:04:13 Because you can find a lot of crazy stuff. + +00:04:14 You'll probably find people with pet pythons that want to meet up in Virginia as well. + +00:04:18 And it's not the same thing. + +00:04:20 Indeed, yeah. + +00:04:21 Thankfully, and we are, gosh, over 10 years old. + +00:04:24 So we are one of the first ones to pop up just because of age, I think, sometimes. + +00:04:28 Although, honestly, some of the newer ones pop up ahead of us sometimes, too. + +00:04:32 So, you know. + +00:04:32 It's such a contentious topic, this AI stuff. + +00:04:36 You know, talking about your meetup. + +00:04:37 How did that go? + +00:04:38 Were people frustrated? + +00:04:39 People excited? + +00:04:40 Yeah, it was kind of the whole spectrum. + +00:04:42 You know, I felt like it was great because everybody respected each other. + +00:04:45 And there were, I'd say, two people anchored on each of the sides. 
+ +00:04:51 You know, two people who, actually, maybe there were three people who, you know, have bought their own hardware and are really going deep into AI and, you know, encouraging people to, like, really dig into it. + +00:05:01 And we had two people who were very AI skeptic and very, you know, worried about the environment, you know, all sorts of different things. + +00:05:10 And so, I thought it was a very healthy, very good discussion. + +00:05:14 You know, I walked away with, you know, a healthier respect for both sides. + +00:05:20 Some ideas that I incorporated into my work, especially now that the company I work for gives us a mandate to use AI to write every piece of code. + +00:05:28 So, that's been a very fascinating transition for me. + +00:05:32 But it also gives me, you know, a little bit of agency or a little bit of permission to experiment and see what to do, you know, how to actually function in this new paradigm. + +00:05:44 It is engineering. + +00:05:45 And we'll talk about it more later, but it's quite wild that your company's, that is the position. + +00:05:50 I don't necessarily think it's the wrong position. + +00:05:52 I think it might be the right one, but it's, I think it's got to be brought on from a, from a, this is a skill you need to learn, not just throw stuff at the chat bot and now that's your job. + +00:06:03 Like, these are not the same things, you know? + +00:06:04 But I think a lot of people do, especially when they're getting started, treat them as the same thing and then say it doesn't work and then they're frustrated. + +00:06:10 Yeah, totally. + +00:06:11 Yeah. + +00:06:11 Yeah. + +00:06:12 Yeah. + +00:06:13 All right. + +00:06:14 More of that. + +00:06:15 We'll speak further about certain things, but yes, I agree. + +00:06:19 Absolutely. + +00:06:19 Absolutely. + +00:06:20 All right. + +00:06:20 Well, we'll, we'll come back to that and, and get there, but let's talk about a little bit of this event sourcing thing. 
+

00:06:27 Yeah.

+

00:06:28 What exactly is event sourcing?

+

00:06:30 That's a question.

+

00:06:30 Okay, cool.

+

00:06:31 I wasn't sure if you had more.

+

00:06:32 I saw your.

+

00:06:33 No, no.

+

00:06:33 I think I, when you, when you reached out and said, Hey, let's talk about this.

+

00:06:36 I'm like, I, so let's take a step back.

+

00:06:38 Okay.

+

00:06:39 Design patterns, refactoring ideas, like all of these architecture.

+

00:06:44 I love to talk about this stuff.

+

00:06:45 I love to think about this stuff.

+

00:06:47 I think also, maybe that's a why, one of the reasons I'm not super frustrated with the AI things because I will tell it, I want you to use, like you could say like, I want to build a system with event sourcing and it's going to work this way.

+

00:06:57 Like for me, the fun is like, Oh, I got to build some of the event source and it clicks together with this and it does that.

+

00:07:02 And like, yeah, I didn't have to do all the little checks and details and like the file IO, but whatever.

+

00:07:07 Like I can skip that and build just like, think a little bit bigger and a little more, more big building blocks.

+

00:07:12 So I'm a big fan of design patterns and paying attention to what you're doing, I can tell that you are too.

+

00:07:18 Yes.

+

00:07:19 Especially event sourcing, I would say.

+

00:07:22 So I have been following the topic of event sourcing for over a decade.

+

00:07:27 I listened to a podcast with a couple of developers in the PHP realm and they were talking about event sourcing and it just inspired me.

+

00:07:36 Specifically, the things that I really loved.

+

00:07:39 Before I was a programmer, I was a graphic designer.

+

00:07:42 And so creating websites with exceptional user experiences is something that just makes me excited.
+

00:07:48 And at the time, I was working at a creative agency building websites that would just slow, get slower and slower and slower, the more data we put into it or, you know, the more we configure the routing or the navigation and whatnot.

+

00:08:01 And so the thought of never having a slow page load again was intoxicating to me.

+

00:08:08 However, that was in a portion of my career where I was, had so much imposter syndrome, having learned to program as an adult and I just felt like I couldn't suggest to my lead developer we should lean into this.

+

00:08:23 And even when I was a lead developer, I just didn't feel confident that I could lead a whole team into like a redesign of the code.

+

00:08:32 But a couple years ago, I got laid off and I was like, you know what?

+

00:08:35 I've been wanting to explore event sourcing for 10 years.

+

00:08:39 I'm going to do it.

+

00:08:40 And it turns out it's a lot easier than I thought.

+

00:08:43 But anyway, all this to say, let me define event sourcing since I've kind of danced around it.

+

00:08:47 Event sourcing is how, has to do with how you save data to the database.

+

00:08:52 The best way to contrast it is most apps are kind of CRUD based, right?

+

00:08:56 So you create a table, like let's say you're a shopping cart application.

+

00:09:00 You have a table called carts and you have a bunch of columns and one of the columns has data for the product IDs that are in the cart.

+

00:09:10 And so if you have that kind of situation and you have one user who adds five items to the cart, then removes two and then checks out.

+

00:09:20 And then you have another user who adds just three items to the cart and checks out.

+

00:09:23 Well, if they check out at the same time and you look at the two database rows, they are very similar.

+

00:09:28 They're the same user, basically.

+

00:09:30 They get put in the same cohort from your marketing side, right?

+

00:09:34 Yeah.
+ +00:09:35 Both of them checked out exactly the same products. + +00:09:39 And if you look at the database, you'd have no idea that one of them removed two items from their shopping cart. + +00:09:44 So event sourcing, so the reason that is because CRUD based applications will mutate state in the database to ensure it's always up to date with the current state and optimize for data integrity. + +00:09:58 Event sourcing, on the other hand, captures every change that happens within the system. + +00:10:04 So you would have an event of like cart created, item added to cart, item added to cart. + +00:10:10 You'd have five item added to cart events and two removals. + +00:10:14 And so you could have the whole history of how each user interacts with your system in the database. + +00:10:19 That is essentially the core of event sourcing. + +00:10:22 Everything else is actually adding more design patterns onto event sourcing, which is really wonderful. + +00:10:27 But that is what event sourcing is. + +00:10:29 Right. + +00:10:29 So your database sort of becomes almost an audit log. + +00:10:33 It feels very source control-ish. + +00:10:35 It feels like Git, right? + +00:10:36 Like what you store is the file and then the diff, then the diff, then the diff. + +00:10:41 You get back to the file by running all the operations on it, potentially. + +00:10:45 Yeah, exactly. + +00:10:46 Yeah, that's exactly what it is. + +00:10:47 And it's funny because like, I don't know about you, but like when I first heard about this, I'm like, well, isn't it slower? + +00:10:52 Because it just seems like you're doing so much more work. + +00:10:54 Every time you're updating the cart, you're pulling out all the events from the event store and building up the state of where it's at right now and saying, okay, you know, does that cart exist in the, does that item exist in the cart? + +00:11:06 Can I remove it? + +00:11:07 Okay, let's remove it. 
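To make the replay step just described concrete, here is a minimal sketch in plain Python. The event names and dict shapes are invented for illustration, not taken from any particular event sourcing library:

```python
# Rebuild a cart's current state by folding its event stream.
# The current state is never stored; it is derived from history.

def replay_cart(events):
    """Return the list of product IDs currently in the cart."""
    cart = []
    for event in events:
        if event["type"] == "cart_created":
            cart = []
        elif event["type"] == "item_added":
            cart.append(event["product_id"])
        elif event["type"] == "item_removed":
            cart.remove(event["product_id"])
    return cart

events = [
    {"type": "cart_created"},
    {"type": "item_added", "product_id": "a"},
    {"type": "item_added", "product_id": "b"},
    {"type": "item_added", "product_id": "c"},
    {"type": "item_removed", "product_id": "b"},
]
print(replay_cart(events))  # ['a', 'c']
```

Note that unlike a mutated CRUD row, the removal of "b" remains visible in the history even though it no longer affects the final state.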
+

00:11:08 You know, and it turns out computers are fast.

+

00:11:12 And so it's negligibly slower depending on the query.

+

00:11:17 Yeah.

+

00:11:17 So I saw you and Bob Belderbos talking about this on YouTube and that was one of my first thoughts too.

+

00:11:22 It was like, this is super cool, but it kind of sounds a lot slower to answer questions.

+

00:11:27 Yeah.

+

00:11:28 And I think, but then I thought about it and I thought, I think there's actually a decent amount of stuff that you can do to make it quite a bit faster.

+

00:11:36 Right.

+

00:11:36 So let me throw some ideas out to you and then you tell me how they land as somebody who's actually done this stuff.

+

00:11:43 First of all, back to your comment on computers are fast.

+

00:11:46 Computers are so fast.

+

00:11:47 They're so much faster than people realize how fast they are and databases are fast too.

+

00:11:51 If you put indexes on them, then they're so much faster.

+

00:11:55 I just don't understand how websites are slow.

+

00:11:58 It's just, it's not just, oh, I'm a little frustrated.

+

00:12:01 It's like, how is it possible that somebody built this and accepted that it takes three seconds to load this page and yet they do and you just know that it's most likely that there's not a database index somewhere or maybe on an extreme situation there should be better caching,

+

00:12:16 but it's just like, ah.

+

00:12:18 So like when done right, you're right, it absolutely blazes, right?

+

00:12:21 Yeah, absolutely.

+

00:12:22 But how, you could make it faster.

+

00:12:25 So a couple of thoughts that came to mind.

+

00:12:26 One was, I have three.
+

00:12:29 The one is you could have a operational database which has, for a particular user, it might have the five adds, two removals and that's their shopping cart and the way you get that is either you query it and then in code you add it up or you do some kind of aggregation thing

+

00:12:43 that says get all these things and then somehow plus minus them together, right?

+

00:12:48 Yeah.

+

00:12:49 Probably in the shopping cart it feels like you probably need to actually pull them back.

+

00:12:53 But still, you could do that super quick and then just run that bit of code and every time you make a change you could also write that to a second table, second database that doesn't have the, it just has the current state.

+

00:13:04 I just have three for that user shopping cart.

+

00:13:06 That sounds okay to me but it's, I mean databases hate duplication.

+

00:13:10 That's like kind of their third normal form job.

+

00:13:13 Do databases hate duplication or is it people that don't like it?

+

00:13:18 Yeah, that's fair.

+

00:13:19 That's fair.

+

00:13:19 I mean they were built to avoid duplication, right?

+

00:13:21 So in that sense.

+

00:13:22 So, but another, you know, another possibility would just be something like Valkey.

+

00:13:27 You could, I could have said Redis but I'm a fan of Valkey over Redis.

+

00:13:31 I find this is like a little bit nicer project.

+

00:13:33 Are you familiar with Valkey?

+

00:13:35 I am not and I'm taking notes mentally right now.

+

00:13:37 Yeah, here we go.

+

00:13:38 So Valkey, or if I find its repository which is hiding down here, it describes itself.

+

00:13:45 Where does it describe itself?

+

00:13:47 So a fork of the open source Redis project right before the transition to their new source available but not open source models.

+

00:13:54 So if you have say the Redis Python library, you just tell it that it talks to this thing, this endpoint, this IP address and port and it thinks it's Redis, right?
+

00:14:03 But it's like more open source friendly.

+

00:14:05 I'm going to star it for the world.

+

00:14:06 So, and it's got 26,000 stars, right?

+

00:14:08 So it's pretty popular.

+

00:14:09 Not that it's totally really relevant exactly which version this is but just having a cache that has that information of what is, you know, shopping cart is three, period.

+

00:14:20 Yeah.

+

00:14:20 And put something, just put two pieces in place.

+

00:14:23 One, when you do a query, first you check the cache.

+

00:14:25 If it's not there, you get the playback, you compute it and you put it in the cache.

+

00:14:30 And if you make any change to that thing, then you invalidate the cache, right?

+

00:14:33 Those two things would give you basically seamless ephemeral answers to the questions that if you ask them very often, you get super quick responses, right?

+

00:14:42 Absolutely.

+

00:14:43 Yeah.

+

00:14:43 And then the other one is, if you, you know, I always find like these extra caching servers are honestly not necessary.

+

00:14:49 You know, we already talked about fast.

+

00:14:50 So I'm a big fan of disk cache, which I had Vincent Warmerdam on to talk about and disk cache is awesome.

+

00:14:57 So you could just have like a local file store on your Docker image or volume or whatever.

+

00:15:02 Then you get the same thing, but you don't have to have the infrastructure, right?

+

00:15:05 There's, there's different options.

+

00:15:06 So basically, I guess to sum this up, I'll throw it to you now.

+

00:15:09 Operational database that also has the answers or some kind of caching story.

+

00:15:14 It could be in memory, it could be a server like Valkey.

+

00:15:17 It could be the disk cache, whatever.

+

00:15:19 What do you think about those?

+

00:15:20 Are you guys all exploring them?

+

00:15:22 I am exploring two, essentially, yeah, I guess you could say all three in a way.
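The two pieces described in this exchange, check the cache first and rebuild on a miss, then invalidate on any write, are the classic cache-aside pattern. Here's a rough sketch with a plain dict standing in for Valkey, Redis, or disk cache; all the names and event shapes are illustrative, not any library's API:

```python
# Cache-aside over an event store: a dict stands in for Valkey/Redis
# or disk cache. A miss replays the stream; any write invalidates.

event_store = {}  # stream_id -> list of event dicts
cache = {}        # stream_id -> computed cart contents

def append_event(stream_id, event):
    event_store.setdefault(stream_id, []).append(event)
    cache.pop(stream_id, None)  # invalidate: state must be rebuilt

def get_cart(stream_id):
    if stream_id in cache:      # fast path: cached answer
        return cache[stream_id]
    items = []                  # slow path: replay the stream
    for event in event_store.get(stream_id, []):
        if event["type"] == "item_added":
            items.append(event["product_id"])
        elif event["type"] == "item_removed":
            items.remove(event["product_id"])
    cache[stream_id] = items
    return items

append_event("cart-1", {"type": "item_added", "product_id": "a"})
append_event("cart-1", {"type": "item_added", "product_id": "b"})
print(get_cart("cart-1"))  # ['a', 'b']  (replayed, then cached)
append_event("cart-1", {"type": "item_removed", "product_id": "b"})
print(get_cart("cart-1"))  # ['a']  (cache invalidated, replayed again)
```

With a shared cache server, the same two hooks apply: a GET before replaying and a DEL whenever a new event lands on that stream.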
+

00:15:27 I'm, I'm experimenting with NATS to do kind of the, I guess, Redis kind of side of it.

+

00:15:31 But all that to say is all three are viable options depending on your use case.

+

00:15:36 The first thing is like just the event store itself.

+

00:15:39 Like you, getting the current state of any individual item should not take, you know, essentially should take milliseconds if, if that long.

+

00:15:48 One of the, so there are two people that I kind of, who are my North star for all of my event sourcing knowledge.

+

00:15:54 They are Martin Dilger and Adam Dymitruk.

+

00:15:57 And Martin wrote a book called Understanding Event Sourcing, which was huge in helping me go from somebody who created an event sourcing project to adopting event sourcing as my kind of default strategy.

+

00:16:12 And in, in the book, he mentioned that he will tend to use the event store for essentially, like, if you start hitting 2000 events in a stream, then he'll think about optimizing it, you know, changing, going to one of the alternate

+

00:16:26 approaches that we just mentioned.

+

00:16:28 So, like, that's really impressive.

+

00:16:30 I, he uses Java or one of the Java derivative languages.

+

00:16:34 So, chances are, it's slower in Python.

+

00:16:36 And so you need, you know, that we need to adjust that number for Python.

+

00:16:41 But honestly, we should have shorter event streams anyway.

+

00:16:44 2000 seems like a lot.

+

00:16:46 It does.

+

00:16:48 Seems like a lot.

+

00:16:48 This portion of Talk Python To Me is brought to you by Sentry and Seer AI.

+

00:16:54 There are plenty of AI tools that help you write code, but Sentry Seer is built to help you fix it when it breaks.

+

00:17:01 The difference is context.

+

00:17:02 Seer isn't just guessing based on syntax.

+

00:17:05 It's analyzing your actual Sentry data, your stack traces, logs, and failure patterns.
+

00:17:10 Because it has the full context, it can, A, spot buggy code in review and help prevent issues before they happen, and B, identify the root cause of production errors.

+

00:17:20 It can even draft a fix and hand the work off to an agent like Cursor to open a PR for you.

+

00:17:26 Seer turns Sentry into a complete loop.

+

00:17:28 You have your traces, errors, logs, and replays to see the problem and now AI to help solve it.

+

00:17:33 Join millions of devs at companies like Claude, Disney+, and even Talk Python who use Sentry to move faster.

+

00:17:39 Check them out at talkpython.fm/sentry and use code talkpython26, all one word, for $100 in Sentry credits.

+

00:17:49 Thank you to Sentry for supporting Talk Python.

+

00:17:52 Kind of the first fallback for me would be what Martin and Adam call the read model, which is essentially one way you can make a read model would be the database-backed read model where you have code that subscribes

+

00:18:05 to only the events it cares about.

+

00:18:08 And so whenever an event comes in, it'll incrementally update that database cache or your file cache or Redis cache or whatever.

+

00:18:14 And then I guess the third thing that I appreciate is using Redis or in my case, NATS, is when you have a front-end, a very high-frequency, high-updating web UI or something

+

00:18:28 that you want to really make sure that the user has up-to-date information, I would lean towards that having ways to push down to the client.

+

00:18:36 Yeah, anytime you've got a live stream, it seems perfect, right?

+

00:18:39 Yeah, absolutely.

+

00:18:40 Yeah, I mean, hook up some JavaScript or hook in some Textual or whatever it is you're trying to do and just say, or just write, you know, arbitrary web sockets or server-sent events and just say, when this changes, send me the delta and then we'll adjust.

+

00:18:53 Yeah.

+

00:18:54 Which is pretty, pretty nice.

+

Yeah.
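In code, a read model of the kind described here boils down to a subscriber that handles only the events it cares about and folds each one into a query-ready shape as it arrives, so reads never replay history. A toy in-memory version follows; the class and event names are invented for illustration, not from a specific library:

```python
# A projection subscribes to selected events and updates incrementally,
# so queries read a precomputed answer instead of replaying a stream.

class CartCountProjection:
    """Maintains a per-cart item count, one event at a time."""

    interested_in = {"item_added", "item_removed"}

    def __init__(self):
        self.counts = {}  # cart_id -> current number of items

    def handle(self, event):
        if event["type"] not in self.interested_in:
            return  # ignore events this read model doesn't care about
        delta = 1 if event["type"] == "item_added" else -1
        cart_id = event["cart_id"]
        self.counts[cart_id] = self.counts.get(cart_id, 0) + delta

projection = CartCountProjection()
for event in [
    {"type": "cart_created", "cart_id": "c1"},
    {"type": "item_added", "cart_id": "c1"},
    {"type": "item_added", "cart_id": "c1"},
    {"type": "item_removed", "cart_id": "c1"},
]:
    projection.handle(event)
print(projection.counts)  # {'c1': 1}
```

The same handle method could just as easily write to a database table or a disk cache, or push the delta down to a live web UI, which is the high-frequency case discussed here.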
+

00:18:55 So there's a book that you recommended by one of the guys you mentioned, Martin Dilger.

+

00:19:00 Yeah.

+

00:19:01 Tell people about that real quick.

+

00:19:02 Yeah, this is incredible to me.

+

00:19:05 He realized he needs, there's a gap.

+

00:19:08 One of the biggest problems with event sourcing is that the material is sometimes there's gaps in where there's good material and because event sourcing came out of the domain driven design community, there's a lot of jargon that you have to kind of get through.

+

00:19:22 And they do like their jargon in the DDD space.

+

00:19:25 They really do.

+

00:19:27 And it, once you understand it, it makes perfect sense.

+

00:19:29 But yeah, yeah.

+

00:19:30 Getting onboarded does, does take.

+

00:19:32 Get your bounded context working and then off to the races.

+

00:19:35 Yeah, exactly.

+

00:19:36 And so Martin essentially saw this gap in knowledge and was like, I need to fill this with an e-book.

+

00:19:44 And I can't remember, I want to say in like two months he wrote this thing.

+

00:19:47 And it's so amazing because it introduces the way that the two of them work.

+

00:19:52 Like they both independently kind of came to similar conclusions, which is to use event sourcing as your base, but to also leverage two other or one main other, whatever, a couple of other patterns, vertical slice architecture, CQRS, which is kind of that idea

+

00:20:07 of like having those read models ready to, you know, optimized for you to download and use.

+

00:20:14 And then using a documentation technique called event modeling diagrams.

+

00:20:18 And that, that is a huge key too because as someone who has been on a couple teams to do the event-driven transition to try to really help, you know, do more asynchronously, you need to have a good communication pattern

+

00:20:33 to keep everybody up to date on what does what.
+

00:20:35 And I find that all three, especially event modeling diagrams, they have refined this to make it simpler and simpler and simpler to the point where there's really just a few elements put together and you can understand the whole life cycle of a, of an application.

00:20:49 Yeah, cool.

00:20:49 I'll link to the book on, over on Amazon.

00:20:52 You know, another, before we carry on, I thought of another more optimized scenario.

00:20:57 What about a document database like Mongo?

00:20:59 Yeah.

00:21:00 Your top-level elements are just like the computed fields like total lifetime value or, you know, cart value or cart count, item count, but then have like maybe a cart item events, which could be a, a nested list

00:21:14 of, like, rich, you know, mini documents that are like item added, item added, timestamp, value, like category, all, and actually storing them in the same record.

00:21:24 Yeah.

00:21:24 I haven't tried that, but I think it's a really interesting approach.

00:21:27 You know, I actually use a document database, Firestore, as my, as my event store.

00:21:34 And I haven't really kind of dug into like kind of the optimizations I could do.

00:21:38 But I find it curious because like the, I mean, what you're suggesting is slightly different than how I think about it because it sounds like you have, for lack of a better word, a model that you're storing all of its events in as well.

00:21:50 And some of the neat, some of the, it's kind of inside out of what the real design pattern is, right?

00:21:55 Well, it's not so much that as much as, one thing that I have found interesting is, you know, since this came out of the domain-driven design group, everything is about an aggregate, what they call aggregate.

00:22:07 Many people call it a model.

00:22:08 And so you, you know, set up boundaries of this is your shopping cart and these are the events that modify the shopping cart.
+

00:22:15 What has been a new movement in event sourcing is to, essentially be model-less, is to like focus on the events themselves because they are so flexible and so many times we as developers can kind of create

00:22:29 boundaries around what we think are the models that, but the models change.

00:22:35 Yeah.

00:22:35 Yeah.

00:22:36 And coupling is like the hardest thing.

00:22:38 The bounded context, as I would say, actually changes because the problem you're solving might change and the models don't match.

00:22:44 Exactly.

00:22:44 Yeah.

00:22:45 Yeah.

00:22:45 So all that to say is I don't, not that I want to especially say like, I don't think that that's a bad idea.

00:22:50 I think it could be really fascinating.

00:22:52 especially as like a secondary approach because, you know, well, whatever, you know, like I, one of the things I really find fascinating about this is this is such a flexible pattern that people, I mean, they've done so many different ways of optimizing

00:23:05 for their event store or anything like this.

00:23:08 So I think that's a very much a valid approach.

00:23:11 It's, it's, the reason it came to mind is you can atomically update documents and therefore you could atomically update both the computed value and the series of events as a single action, which is interesting, you know?

00:23:25 Yeah, absolutely.

00:23:26 Very interesting.

00:23:27 And they kind of, that, that one thing becomes the source of truth for what you're tracking.

00:23:30 I don't, there might be something there.

00:23:32 I don't, but it does sound a little bit too focused on the model.

00:23:35 I do think.

00:23:36 It's worth experimenting with for sure.

00:23:38 So just, to throw out a little street cred there, look at this, purchased April 18th, 2005, the Domain-Driven Design by Eric Evans.

00:23:47 So this is kind of the greater space, right?
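The single-document idea Michael floats, computed fields at the top level plus an embedded list of the events that produced them, all written in one atomic operation, can be sketched as the update document you might pass to a MongoDB driver. The field names here (`item_count`, `cart_value_cents`, `events`) are hypothetical; `$inc` and `$push` are real MongoDB update operators, and no server is needed just to build the document.

```python
from datetime import datetime, timezone

def item_added_update(price_cents: int, category: str) -> dict:
    """Build the update doc for one ItemAdded: computed fields and the
    raw event land in the same atomic write (e.g. pymongo update_one)."""
    event = {
        "type": "ItemAdded",
        "price_cents": price_cents,
        "category": category,
        "at": datetime.now(timezone.utc).isoformat(),
    }
    return {
        "$inc": {"item_count": 1, "cart_value_cents": price_cents},
        "$push": {"events": event},
    }

update = item_added_update(999, "books")
print(sorted(update))  # ['$inc', '$push']
```

Because both the increment and the append happen in one document write, the computed totals can never drift from the embedded event history, which is the appeal of the approach.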
+

00:23:50 Maybe the book is called Domain-Driven Design, Tackling the Complexity in the Heart of Software.

00:23:54 It's pretty interesting.

00:23:55 I think it's kind of the follow-on of the refactoring movement that Martin Fowler and all those folks were working on, like in the late 90s, early 2000s.

00:24:05 Yeah.

00:24:05 Kind of in that space, right?

00:24:06 Yeah.

00:24:07 I must say, I haven't bought that book.

00:24:08 The closest I've come is the Cosmic Python book or the Architecture Patterns in Python book that Harry and, oh, I always forget the other guy's name, but that's such an amazing book.

00:24:19 So that's the closest I've come to DDD.

00:24:21 There's a video.

00:24:24 I think, I think you gave me this video, right?

00:24:25 Event sourcing explained with football.

00:24:27 Yeah.

00:24:27 And this looks like futbol-type football, not American football.

00:24:31 Indeed.

00:24:32 I love American football, but I do believe it's slightly misnamed.

00:24:36 Like, you don't use your feet that much, other than the red card.

00:24:38 Yeah.

00:24:40 True that.

00:24:41 I mean, it's like calling Formula One, like, foot car, because you make the car go with your foot, but it's not really the main thing of the sport anyway.

00:24:49 Yeah.

00:24:50 But yeah, I love this video.

00:24:51 Yeah.

00:24:52 Okay.

00:24:52 Tell me about it.

00:24:53 I'll put it in the show notes.

00:24:54 Yeah.

00:24:54 This is, you know, I feel like event sourcing, it's like one of those things where it's hard to explain until you get it.

00:25:00 And then I feel like it's like a good board game.

00:25:02 You know, if you have a board game fan in your life, they're like, oh, this is such a great board game.

00:25:06 It's so simple.

00:25:07 And then they start explaining it and like half an hour later, you're like, are we ever going to actually play this?
+ +00:25:11 I don't know that I want to anymore. + +00:25:12 But I felt like the, this person has done a really good job of kind of really distilling it down and showing like why it matters. + +00:25:19 And it's a 10 minute video, you know, five minutes at 2X. + +00:25:22 And it's, it's really kind of charming. + +00:25:24 Really good. + +00:25:25 Really well done. + +00:25:25 Okay, cool. + +00:25:25 Yeah, yeah. + +00:25:26 I'll put it on there. + +00:25:26 Can you watch videos at 2X? + +00:25:28 Oh yeah, all the time. + +00:25:29 Yeah. + +00:25:29 My daughter does that. + +00:25:30 I'm like, how do you actually take it in? + +00:25:33 I just, I'm a 1X sort of person. + +00:25:34 I do slow it down or rewind to, to pull in things, but yeah. + +00:25:38 Let's see. + +00:25:39 It's like a, like a seek and then focus sort of deal. + +00:25:42 Exactly. + +00:25:42 Exactly. + +00:25:43 Not that one. + +00:25:44 This one. + +00:25:45 I also heard there's this ebook I can get. + +00:25:47 What's up with this? + +00:25:48 Yeah. + +00:25:49 So scheduling this podcast episode, I didn't know, you know, what we're going to talk about. + +00:25:55 And I know like, there's going to be things I forget. + +00:25:57 And I feel like part of the reason I am so excited about this is I know that there's someone like me, the 10 year ago version of me who has heard about these things and is curious, but just needs more information. + +00:26:08 And so I've spent the last couple of weeks creating this ebook and I put the first version up and I'm going to continue to improve it as time goes on. + +00:26:18 So if anything in this conversation is interesting to you and you just need a little bit more and you want to understand a little bit more what's going on, absolutely download it. + +00:26:25 And, you know, it's free. + +00:26:26 So why not? + +00:26:27 Cool. + +00:26:27 And then we'll have to make you do a, an audio version, put that on audible or something like that. 
+

00:26:32 Sounds good.

00:26:33 Yeah, sure.

00:26:34 I always want to do audio books for stuff that I'm working on, but just the concept of trying to speak code or config file, I just like, I got to stop.

00:26:42 You know, it's, it's tough.

00:26:43 It's a tough balance to do with audio books and tech, like developer stuff, but still.

00:26:47 For sure.

00:26:48 Yeah.

00:26:48 Cool.

00:26:49 All right.

00:26:49 So people can check that out.

00:26:50 It's for free.

00:26:50 We'll put that in the show notes.

00:26:51 Now, I think that this both has really positive or really big possibilities for data science, but also potential challenges.

00:27:02 Let me throw it out to you and then you, you take us through it.

00:27:04 Super benefits, incredibly obvious.

00:27:06 You have an event stream that tells you over time what times the things happened.

00:27:11 You have, both the additions and subtractions or the permutations that it goes through until it ends up in its final state.

00:27:17 Not just show me all the customers from California who bought this month, but like show me all the Californians who abandoned the cart, but then came back and then did the, you know what I mean?

00:27:26 Like you can just answer way more interesting questions.

00:27:29 You got time series.

00:27:30 On the other hand, maybe I would just want to load up a pandas data frame with the answers of what's the average cart size during checkout.

00:27:38 And that, that becomes like a big computation out of an event source based database.

00:27:44 If you don't have one of those things.

00:27:45 Okay.

00:27:45 Well, let's hear it.

00:27:46 Let's hear it.

00:27:47 I'm going to say it.

00:27:47 Like if you, if you don't do one of those caching or multi-database things or CQRS, I don't remember the patterns.

00:27:56 Anyway, that looks like it maybe is a little bit of a challenge.

00:27:59 You could do it.
+ +00:27:59 I think you could do it more easily in pandas, but like maybe I just, you know, some people do data science just through SQL and they just, I'm just going to write queries against a warehouse database, you know? + +00:28:09 Yeah, absolutely. + +00:28:10 Why not have it both ways? + +00:28:11 For example, on my, okay, so let's start the, for the hardcore data scientist, actually, you know, I don't even think the event store is the right format for them. + +00:28:21 You know, I would definitely have some kind of script that would run on some kind of loop that, you know, maybe every day or every couple hours or whatever would transform the raw events into some format that would be great for. + +00:28:34 Sure. + +00:28:35 And you hear about all these like OLAP cubes and all these other like super BI type of systems. + +00:28:40 None of those, no, no, I can't say none of those. + +00:28:42 Many of those are not running out of the operational database. + +00:28:45 They're like a, some kind of like warehouse data lake. + +00:28:49 We've transformed this so you answer questions. + +00:28:51 So it's not necessarily just event sourcing. + +00:28:52 Like we just want to avoid five joins so we can just ask the question directly, right? + +00:28:57 Yeah. + +00:28:57 Yeah. + +00:28:57 And in fact, my current project that I have in production at work, it is a service that multiple other services use to process items. + +00:29:07 And the project manager of one of these services reached out to me and said that they have a BigQuery table that has all this analytical information and they wanted to add the information we have to theirs. + +00:29:19 And so, you know, we set up a conversation. + +00:29:22 I created the code and every day I'm sending information to their BigQuery instance. + +00:29:26 And three days after we did a go live, you know, I created a meeting to like kind of circle back with them to make sure everything's working the way they wanted. 
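The kind of scheduled transform script Chris describes, flattening raw events into an analytics-friendly table before handing them to data scientists, can be sketched with the standard library. The event shape and field names are illustrative; in practice the output would go to BigQuery or a warehouse rather than a CSV string.

```python
import csv
import io
import json

# Raw events as they might come out of the event store (JSON payloads).
raw_events = [
    '{"stream": "cart-1", "version": 1, "type": "ItemAdded", "price_cents": 999}',
    '{"stream": "cart-1", "version": 2, "type": "ItemAdded", "price_cents": 500}',
    '{"stream": "cart-2", "version": 1, "type": "ItemAdded", "price_cents": 250}',
]

# Flatten each event into one tabular row, ready for bulk load.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["stream", "version", "type", "price_cents"])
writer.writeheader()
for line in raw_events:
    writer.writerow(json.loads(line))

rows = out.getvalue().strip().splitlines()
print(len(rows))  # header + 3 data rows -> 4
```

Run on a loop (daily, hourly), a script like this is also what makes the "backfill from day one" moment possible: the transform can start from the very first event, not from the day the integration went live.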
+

00:29:34 And when they opened up the BigQuery database, they were shocked because they were expecting three days' worth of data.

00:29:40 I had, I sent every piece of data I had for months, which was how long they've been sending things to my service.

00:29:48 And so, like, I was just, I just, you know, this person was elated because they were like, they knew their data scientists wanted this information and they, now they have all this information going back to day one, so to speak.

00:29:59 And also just recently, my boss asked me, like, I have a reports view that's just a webpage that has like how, stats on how my service is doing.

00:30:07 And he's like, it'd be nice to have like, some, like a table of some of these, you know, last few days or whatever.

00:30:12 And I was like, okay, how many, how many days would you like?

00:30:15 And he's like, 30 days.

00:30:16 So I created, the HTML was pretty easy and I just created a script to like pull the events out of the event store and populate this table, you know, as exactly as we needed.

00:30:25 It went into production live and we were immediately there with 30 days of history.

00:30:28 It was, it was so exciting.

00:30:29 And, and like, this is what I get to experience every week is like, you know, having the ability to like, go back into history and answer questions that we've had that we didn't even think we knew.

00:30:38 We didn't have any idea that we would want to know, you know, a month ago.

00:30:42 And to be able to answer those questions with precision is, is intoxicating.

00:30:46 Yeah.

00:30:46 I certainly see the value.

00:30:47 Like you don't necessarily know the questions you're going to ask.

00:30:50 And if you don't have enough data or you don't store it in the right way, you literally can't answer them.

00:30:56 Yeah.

00:30:56 Right.
+ +00:30:57 But with, it sounds like with, events or so, you can go back and like, well, what if we ask this over time instead of by region? + +00:31:02 Like, okay, slightly different query, no problem. + +00:31:04 Exactly. + +00:31:05 Yeah. + +00:31:05 Yeah. + +00:31:05 It's, it's really quite something. + +00:31:07 I had no idea. + +00:31:09 Like my, I kind of mentioned earlier, my biggest thing was, I was, I can't wait to have fast UI. + +00:31:15 And now that I realized that I feel like our applications obviously serve the primary purpose of whatever it is that the business needs, but I didn't realize how much there was a secondary need of understanding how it works and enabling the business to make decisions + +00:31:29 based on how customers are actually using the application. + +00:31:32 We're going to talk about the AI side later, but I do just want to throw out as different constituents who might care to answer these questions. + +00:31:38 Like I was just thinking, you've got, you've got the operational side of say the website or app or, you know, driving an API for the app or something like that. + +00:31:46 That's one view. + +00:31:47 That's kind of the traditional view, but now you have this much more increasingly popular view of like data scientists and BI tools and the CEO wants a dashboard that updates live type, you know, so events are a clear trigger for those kinds of things, right? + +00:32:01 Absolutely. + +00:32:01 But then also you might ask your AI Opus or Codex or whatever, hey, find me some trends or let's look at this and, you know, it has more to work with as well, right? + +00:32:11 Yeah. + +00:32:12 Just thinking of the different constituencies, yeah? + +00:32:14 Totally. + +00:32:14 In fact, just today I was looking into a bug that was happening in production and I asked Claude, hey, can you query the GCP logs? + +00:32:23 Can you query the event store and help me understand what's going on? 
+

00:32:26 And it was like, sure enough, here you go and made fixing the bug much easier.

00:32:32 This portion of Talk Python is sponsored by Temporal.

00:32:36 Ever since I had Mason Egger on the podcast for episode 515, I've been fascinated with durable workflows in Python.

00:32:43 That's why I'm thrilled that Temporal has decided to become a podcast sponsor since that episode.

00:32:48 If you've built background jobs or multi-step workflows, you know how messy things get with retries, timeouts, partial failures, and keeping state consistent.

00:32:56 I'm sure many of you have written brutal code to keep the workflow moving and to track when you run into problems, but it's trickier than that.

00:33:03 What if you have a long-running workflow and you need to redeploy the app or restart the server while it's running?

00:33:09 This is where Temporal's open-source framework is a game-changer.

00:33:13 You write workflows as normal Python code and Temporal ensures that they execute reliably, even across crashes, restarts, or long-running processes while handling retries, state, and orchestration for you so you don't have to build and maintain that logic yourself.

00:33:27 You may be familiar with writing asynchronous code using the async and await keywords in Python.

00:33:33 Temporal's brilliant programming model leverages the exact same programming model that you are familiar with but uses it for durability, not just concurrency.

00:33:42 Imagine writing await workflow.sleep(timedelta(days=30)).

00:33:47 Yes, seriously, sleep for 30 days.

00:33:49 Restart the server, deploy new versions of the app.

00:33:51 That's it.

00:33:52 Temporal takes care of the rest.

00:33:53 Temporal is used by teams at Netflix, Snap, and NVIDIA for critical production systems.

00:33:58 Get started with the open-source Python SDK today.

00:34:01 Learn more at talkpython.fm/Temporal.
+ +00:34:04 The link is in your podcast player's show notes. + +00:34:07 Thank you to Temporal for supporting the show. + +00:34:09 Yeah, I guess you know why. + +00:34:11 You have more granularity on what, if the thing in the database doesn't look like you expected, you much, have a much more granular way of knowing like it was this step that made it look like that because I've had problems before where I completely + +00:34:26 upgraded, swapped out the data access layer for Talk Python training for the courses. + +00:34:32 Yeah. + +00:34:32 And for the website, it was perfect. + +00:34:33 Everything was great. + +00:34:34 But under certain circumstances on Android, the app was resulting in something, it was sending in something that would make the data not right, right? + +00:34:43 Like there was some field that was null instead of just taking on the default value. + +00:34:47 Oh, man. + +00:34:48 Which is fine. + +00:34:49 But then when the person logged in on the website, the website didn't assume that that thing could be null because it was, at a minimum, had a non-nullable default value. + +00:34:57 I'm like, why do we need to check this for null? + +00:34:59 How did it get to be null? + +00:35:01 It makes no sense. + +00:35:01 It took forever to figure that out. + +00:35:04 Oh, wow. + +00:35:04 But with event sourcing, you could see this was the event that made it null. + +00:35:08 Not just, it is null. + +00:35:09 What in the world is going on? + +00:35:11 Why could it, how could it possibly be null? + +00:35:13 Yeah, absolutely. + +00:35:14 So I think it's got some interesting debugging. + +00:35:17 And one more thing, like I know this is quite the data science side, but another constituency could be PCI, HIPAA, GDPR, like all the compliance frameworks you got to deal with for auditing or sort of audit trail + +00:35:32 or something that happens. 
+ +00:35:33 I mean, a lot of times logs serve that value, but that might be a, they updated the record like, oh, what? + +00:35:38 Yeah, totally. + +00:35:39 Yeah. + +00:35:40 And that, you know, I, even though I've been in insurance and I've been in healthcare, I haven't had anything where I have to certify these things, but like you've, the audit log is the way you interact with everything. + +00:35:51 It is the source of truth. + +00:35:53 And so, but what's funny is I have worked on teams that created history tables to try to essentially do that work. + +00:36:00 and it was like two or three months after I started working there before I learned that that table existed. + +00:36:06 And so, there were two or three months of work I should have been putting in the history table that I didn't. + +00:36:12 And from what I hear among other developers, a lot of teams work that way. + +00:36:17 Like, only a few people really know and understand how to maintain that history table. + +00:36:22 And a lot of times, like when they try to replay it, it just doesn't work. + +00:36:25 And it's, it's unfortunate. + +00:36:26 Yeah, it's like, well, there is history in the history table. + +00:36:30 When we run it again, we don't get the same output as the final database. + +00:36:34 What's going on? + +00:36:34 Yeah, true. + +00:36:36 Yeah, but with the event sourcing, it reverses it. + +00:36:38 Basically, the events are the source of truth and the other one is some kind of dynamically generated sort of deal, yeah? + +00:36:44 Yeah, yeah. + +00:36:45 And it's a lot like, you know, a backup strategy. + +00:36:49 You know, if you never test your backup strategy, you don't really. + +00:36:52 Exactly. + +00:36:52 And I feel like it's the same thing with the history table. + +00:36:54 And honestly, to be totally honest, event sourcing is similar in that it's easy to accidentally migrate event versions. 
+

00:37:01 You know, like for myself, I was working on a new event to kind of, you know, on my app and I introduced a new attribute or actually, I guess it was a full, whatever the point being is, at some point I decided I wanted to change

00:37:16 the name of the attribute because it would reflect better what it meant in the domain and not realizing that I had already published that event to production.

00:37:24 And so at one point I was, I don't know, I don't remember what I was doing, looking up issues or honestly, it might have been a view that it was rendering that was throwing errors and I couldn't understand why and I looked at it and sure enough, it was because I accidentally created a different version

00:37:39 of the same event.

00:37:40 Thankfully, all I had to do was change the code to say, well, if this attribute doesn't exist, look for this attribute and everything was fixed.

00:37:47 But, you know, you can honestly fall into some of those things with event sourcing too if you're not careful.

00:37:51 But the nice thing is because the events are still there, you have the ability to recover from them.

00:37:56 Let's talk about versioning for a little bit.

00:37:58 Sure.

00:37:59 On a sort of operational third normal form type of database, you know, you might run a migration.

00:38:04 One of the reasons I really like using MongoDB is because I almost never have to run migrations, but that's a different, it's a different debate.

00:38:11 However, you might run the migration to say like, okay, we're going to add a column or we're going to split this data apart and move this stuff over here and that over there and then create a foreign key relationship or whatever.

00:38:23 Yeah.

00:38:23 But I can see if you've got this kind of history of things.

00:38:26 Like, let's say, I don't know, how do you deal with versioning, right?
+ +00:38:29 Like, I've got these old events and the way you're not storing the current state. + +00:38:34 So with the migration or something like that, you're like, well, let's just transform the current state into the new state. + +00:38:38 With these, you've got like old events and new events and they might be in a real way incompatible. + +00:38:45 Yeah. + +00:38:45 Yeah, sure. + +00:38:46 What do you think about with that? + +00:38:47 You have so many strategies. + +00:38:50 You just have to choose which one works for your situation. + +00:38:52 So the first one is kind of what I mentioned just a minute ago, kind of like the MongoDB or I should say document database way of working with things where if you're adding things to an event, adding fields, then if, you know, if the code, you know, most code will be blissfully ignorant + +00:39:07 that you're adding new attributes to it. + +00:39:09 So it doesn't really matter. + +00:39:11 And then those that do care can kind of have fallbacks. + +00:39:14 And the way that Adam and Martin suggest, you essentially have like upcasters or some kind of code that essentially says like, okay, the previous version, actually, that's two different things. + +00:39:25 I'm sorry. + +00:39:25 The code that does care about the new attribute, if it encounters an older event that doesn't have that attribute, then you can have a default fallback. + +00:39:32 And it's best to have those close to, you know, the domain logic that you want to update. + +00:39:37 This is a vertical slice architecture approach as opposed to having like a global upcaster. + +00:39:41 Again, I realized that I want to keep going to the upcasting, which is the second option that I've been trying to get to. + +00:39:47 But so essentially one is, you know, have a default fallback. 
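The first strategy, a default fallback close to the code that cares about the new attribute, is tiny in practice. The event shape and the `discount_cents` field here are made up for illustration: a later version of ItemAdded gained the field, and older stored events simply don't have it.

```python
def discount_for(event: dict) -> int:
    """Read a field added in a later event version, defaulting for old events."""
    # "discount_cents" didn't exist when the first events were written,
    # so supply the default right where the value is consumed.
    return event.get("discount_cents", 0)

old_event = {"type": "ItemAdded", "price_cents": 999}
new_event = {"type": "ItemAdded", "price_cents": 999, "discount_cents": 100}

print(discount_for(old_event), discount_for(new_event))  # 0 100
```

Keeping the fallback next to the handler that needs it, rather than in a global layer, is the vertical-slice flavor of the strategy mentioned above.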
+ +00:39:50 Second one is to have an upcaster that says, okay, you know, especially if they're two different versions, like truly different versions of the event, you know, you have add to shopping or item added and item added version two. + +00:40:02 You know, you might have completely different fields. + +00:40:03 It's good to have a piece of code that can, you know, upcast to the second one. + +00:40:08 You put that in your, like your data access layer. + +00:40:11 It might write a thing and say, give me all the items of the cart and it looks at the type or the version flag and then does a little processing or would you rewrite the database? + +00:40:20 Oh, well, that's, I was going to say that's your third option. + +00:40:22 So let me take a step back and ask your first, your last question, which was like, at least to me, the tent, what you're doing with your events is essentially rebuilding state. + +00:40:30 And so wherever you're in that loop to rebuild the state, you would probably say like, oh, is it this event or is it this event or whatever behave this way? + +00:40:37 Right there. + +00:40:38 Like in my story of using one of the data caches, you do a request, it's not in the cache. + +00:40:44 So you've got to run your build it up code and put it in the cache. + +00:40:46 And that just, I mean, that part right there would be the part that goes, okay, we're going to transform it differently now and then we'll still store the answer in the cache and the next time they ask, it's just fast. + +00:40:55 Yeah. + +00:40:56 So yeah, you would have, the way I would code it is you'd have a piece of code that's listening into the events. + +00:41:02 So you have, it knows about the old version of the event and the new version of the event. + +00:41:06 And it knows what format that data needs to be in in that Redis cache or whatever it was that you suggested. + +00:41:13 Yeah, Valkey. + +00:41:13 Come on, Valkey. + +00:41:14 Valkey. + +00:41:14 Let's go, Valkey. 
+

00:41:15 I'm going to remember this by the end of the conversation.

00:41:18 So yeah, so what it does is it will be able to say like, okay, this event is an old one, but I can upcast it to a new one and convert it into the format or however we need to save it to Valkey.

00:41:28 And then if it's a new one, it also knows, okay, I'm going to convert it this way to save to Valkey.

00:41:33 So it's nice that it's kind of localized to, you know, the process that you need to update, upcast.

00:41:39 And then we've touched on like kind of the quote unquote nuclear option, which is to apply a transform to your entire event store.

00:41:46 And so you would, by doing that, you would create a new database from your database table, which would be your new event store.

00:41:50 And you, for each event, you just copy it over, do some kind of map or transform to upcast the entire, your entire history.

00:41:58 Yeah.

00:41:59 I see value in all of them.

00:42:01 You know, if you're doing a lot of direct SQL data science-y stuff, you probably want to transform the database.

00:42:06 If it's primarily coming out of an API, like just let that thing handle it as it reads them, you know, computers are fast.

00:42:12 Keep it in mind.

00:42:13 Indeed.

00:42:13 And I forgot to mention that, oh, his name escapes me right now.

00:42:16 The guy who popularized event sourcing back in the 2000s wrote a book on this topic specifically.

00:42:22 So he kind of listed out all of your strategies and when you would want to choose them and why.

00:42:27 Are we talking Martin Dilger?

00:42:29 No.

00:42:30 Greg Young.

00:42:31 Greg Young.

00:42:32 Okay.

00:42:32 I think it's on LeanPub.

00:42:34 Yeah, that sounds about right.

00:42:35 In fact, Martin Dilger's book is cheaper on LeanPub as well and you get additional content there too.

00:42:39 Okay.

00:42:40 Yeah, very interesting.
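An upcaster, as discussed above, is just a function that rewrites old event versions into the current shape before handlers see them, and the "nuclear option" is the same function mapped over the entire store to write out a new one. The event shapes here (v1 `name` renamed to v2 `sku`) are invented for illustration.

```python
def upcast(event: dict) -> dict:
    """Rewrite a v1 ItemAdded into the v2 shape; pass everything else through."""
    if event["type"] == "ItemAdded" and event.get("version", 1) == 1:
        return {
            "type": "ItemAdded",
            "version": 2,
            "sku": event["name"],  # v1's "name" field became v2's "sku"
            "price_cents": event["price_cents"],
        }
    return event

v1_event = {"type": "ItemAdded", "name": "book", "price_cents": 999}
v2_event = {"type": "ItemAdded", "version": 2, "sku": "tea", "price_cents": 350}

# Per-read upcasting: handlers only ever see the v2 shape.
replayed = [upcast(e) for e in (v1_event, v2_event)]
print([e["sku"] for e in replayed])  # ['book', 'tea']

# The "nuclear option": apply the same transform over the whole store
# and persist the result as a brand-new event store.
new_store = list(map(upcast, (v1_event, v2_event)))
```

Either way the original events stay intact, which is what makes recovering from an accidental version change possible at all.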
+ +00:42:42 Let's, let's make this a little bit concrete for people. + +00:42:46 Sure. + +00:42:46 We talked about how you might, in theory, architect some software, could be Python or something else to follow these patterns, but there is a Python library, right? + +00:42:55 Yeah, absolutely. + +00:42:56 Do you recommend it? + +00:42:57 I do. + +00:42:58 You know, I, it's funny, I don't have any production code with it, but I have used it a lot over the years. + +00:43:04 John Bywater has done an incredible job maintaining this, this repo and, you know, all his, all the people who have contributed as well. + +00:43:12 He has shepherded this through and is, has been really making it such an incredible application. + +00:43:18 When I wrote my applications, you know, I, the first one I did, I was like, I want to do it myself so I can understand it. + +00:43:26 And so I can really understand and respect what he's done. + +00:43:29 But also, you know, there's a part of me that really, like a lot of the, you know, essentially I feel like it depends on who you are. + +00:43:35 If you're somebody who wants to grab a framework and run with it, this is an exceptional one to do it with. + +00:43:40 It, essentially, you just write Python classes and you decorate them or subclass from some of his classes and all the magic of event sourcing happens for you. + +00:43:50 And it just make, leaves you with really readable, understandable code. + +00:43:54 And then you'll have other people who, you know, a lot of people in the event sourcing space says it's actually not that complicated. + +00:44:00 You can write your own code to do it. + +00:44:02 And I did. + +00:44:03 And I recommend it for the right type of person. + +00:44:06 For me, it was hard because there's so many decisions that you need to make. + +00:44:09 And I am not the best Python programmer. + +00:44:11 I do not know all the concurrency issues and all these things. + +00:44:15 I'm getting to learn them more. 
+

00:44:17 But, you know, I have software that's running in production that's, that's doing well.

00:44:20 So all that to say is, yeah, I highly recommend this package, especially if you're new to it.

00:44:25 It can really show you, you know, how you, one option of, of things can work.

00:44:30 And I love that it, by default, it uses, well, you can, you can use SQLite, Postgres, or one of the, a couple of the doc, databases that are optimized for event sourcing.

00:44:42 And so you can kind of see how it's, you know, some of the many ways you can pattern things to make it easy for you.

00:44:48 Interesting.

00:44:49 So I probably need to file a PR or something here.

00:44:52 It says, the way you get it is pip install eventsourcing.

00:44:56 I feel like it should be pip install event-sourcing.

00:44:58 No, just kidding.

00:44:59 But that's cool.

00:45:00 However, I am, you know, so it's, you know, you can get it off PyPI, but I'm having a hard time resisting pressing this.

00:45:07 Ask DeepWiki.

00:45:08 Have you ever gone, what is DeepWiki?

00:45:10 Yeah.

00:45:11 It's an AI-powered documentation thing that he opted into, which I thought was very fascinating.

00:45:16 I was following, I was at the time he did it.

00:45:19 I was, I think I was writing my own code.

00:45:21 And so I had Slack open.

00:45:22 He has a good Slack channel and was like showing all the things that he was able to, to all the insights that was able to be gleaned from it.

00:45:30 This is epic.

00:45:31 I love it.

00:45:31 So it had, the DeepWiki apparently is like, and knows the source and the docs.

00:45:37 And then it's just a chat.

00:45:38 And even on fast, I asked, I said, give me an example of using this library.

00:45:41 So sure.

00:45:42 Here's a complete example of a dog school application, all the code using event sourcing, the, like, the package.

00:45:48 This is nuts.

00:45:50 Yeah.
+

00:45:51 Side quest unlocked.

00:45:53 Must figure out how to get my packages into DeepWiki.

00:45:55 This is nice.

00:45:57 And I also want to add, he has other companion packages that for example, connect into Django and I believe Flask and some other ones too.

00:46:05 So one of my side projects, I'm leveraging this with Django and it's really cool because one of the things that enables you to do is configure your events table to be similar to your,

00:46:19 I guess in the same database as your Django table or at least configurable from the way Django would do it.

00:46:24 And yeah, so getting all these read models are very easy with all the migrations.

00:46:28 You just say, this is what I want my data to look like with Django and of course apply migrations and there it is in production.

00:46:34 So that's really nice.

00:46:35 Yeah, cool.

00:46:36 It also has extension projects.

00:46:38 What are these?

00:46:39 The Django one, the KurrentDB, K-U-R-R-E-N-T.

00:46:43 I imagine KurrentDB is probably a...

00:46:45 I believe that is, yeah, it's one of the first event sourcing specific databases.

00:46:51 I think it was called...

00:46:51 Oh, it's for intelligent and responsive systems.

00:46:55 I don't know, Chris, I just got to rant a little bit.

00:46:57 Like there's all these projects that are cool and they do neat stuff and now I feel when I go to them that it's like, this is the AI compute data frame or this is the AI, the intelligent AI.

00:47:06 It's like, it's just a database or just a data frame that AIs can use.

00:47:10 That doesn't make it an AI data frame, you know, it's like, but they all want to capture the excitement.

00:47:14 It drives me crazy.

00:47:15 Yeah.

00:47:16 And what's worse is when they don't even say exactly what they do.

00:47:18 It is, it is your answer for AI-ing the thing that we're not going to tell you.

00:47:22 I hate it.

00:47:23 Yeah.
+ +00:47:24 Yeah, exactly. + +00:47:24 And it just obscures what the heck it is, but it's the H1 and the H2. + +00:47:27 You're like, oh my gosh. + +00:47:29 Yeah. + +00:47:29 Yeah. + +00:47:30 Okay. + +00:47:30 But that does look pretty interesting. + +00:47:32 Like, yeah, look, it's example is create a client, new event data, new UUID, et cetera, order place, serialize. + +00:47:39 So right here, this example is basically it's got primary key, a category or type of event, like a, just an event, I guess is the way you would call it. + +00:47:48 But then it has JSON serialized, like a JSON blob. + +00:47:53 That is the details of the event. + +00:47:55 Is this how you typically do it? + +00:47:56 I would say so. + +00:47:57 Or is it more column oriented where like this one has an order ID and a total. + +00:48:01 So you might have an order ID and a total in the data structure or is it in a blob level? + +00:48:06 Yeah. + +00:48:06 The implementations I've seen generally will have some kind of blob or JSON serialized or bytecode serialized optimization of it. + +00:48:17 You know, because each event, you know, when you're saving things to the database, you know, you're going to save an event. + +00:48:22 It's going to have an event stream. + +00:48:23 It's going to have generally speaking, there's probably an event version. + +00:48:26 Like there's all these specific things, but the actual payload of the event. + +00:48:29 If there's not an event version, you're going to wish there was an event version at some point probably. + +00:48:33 Exactly. + +00:48:34 Yes. + +00:48:34 And so the payload is usually some kind of blob or JSON body or something like that. + +00:48:39 It sounds very good to be a document database. + +00:48:41 Indeed. + +00:48:42 Yeah. + +00:48:43 Yeah. + +00:48:43 Because you can put indexes on like the sub items and then if they're not in that event, it just doesn't use the index for those particular ones. 
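The row shape discussed here, a stream identifier, a version, an event type, and the payload as a JSON blob, can be sketched with sqlite3 from the standard library. The column names and the composite primary key are illustrative choices, not any particular library's schema.

```python
import json
import sqlite3
import uuid

# Sketch of the events-table shape being described; column names and the
# composite key are illustrative, not a specific library's schema.
con = sqlite3.connect(":memory:")
con.execute("""
    CREATE TABLE events (
        stream_id TEXT    NOT NULL,  -- which aggregate the event belongs to
        version   INTEGER NOT NULL,  -- position within that stream
        type      TEXT    NOT NULL,  -- e.g. 'OrderPlaced'
        payload   TEXT    NOT NULL,  -- JSON blob holding the event details
        PRIMARY KEY (stream_id, version)  -- rejects two writers saving the same version
    )
""")

order_id = str(uuid.uuid4())
con.execute(
    "INSERT INTO events VALUES (?, ?, ?, ?)",
    (order_id, 1, "OrderPlaced", json.dumps({"order_id": order_id, "total": 42.5})),
)

row = con.execute(
    "SELECT type, payload FROM events WHERE stream_id = ?", (order_id,)
).fetchone()
print(row[0], json.loads(row[1])["total"])  # OrderPlaced 42.5
```

The composite key doubles as the version guard mentioned in the conversation: two concurrent writers saving version 2 of the same stream can't both succeed.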
+ +00:48:50 I mean, it's a lot of things. + +00:48:52 Yeah. + +00:48:52 Yeah, exactly. + +00:48:53 It's pretty sweet. + +00:48:54 Yeah. + +00:48:54 And again, well, I haven't said it quite, but like one of the things I have been thinking about for a decade is the more I kind of thought about it, the event sourcing and the patterns and unlocks really gives you so much flexibility. + +00:49:07 You know, you can use document data stores and like really take those, the power of that. + +00:49:13 You know, if you have, and part of this is too is just data, you know, vertical slicing or whatever, it's both multiple patterns put together. + +00:49:19 But like, you know, if you have a view that would be so much better served by having a graph query, a graph database, then use it. + +00:49:28 You know, it's, I remember at one time it took me a while, but like somebody told me that they were using, I can't remember their, their open query. + +00:49:39 Is that what it's called? + +00:49:40 The, I forgot the old name of what I'm trying to think of, but essentially it's, you know, like a database that optimizes for saving text. + +00:49:47 So you can like search for it. + +00:49:49 You know, they use that as just for one, you know, to serve the purpose of one item. + +00:49:55 And honestly, this isn't unique to event sourcing. + +00:49:57 You can do this with event-driven architecture as well. + +00:50:00 But what I love about event sourcing is like you have the benefits of event-driven architecture and the benefits of a monolith in one if you choose to go that way. + +00:50:08 And yeah, it's just, it's, I guess really what it comes down to is what I love about it and was surprised by is how flexible it gives you the ability. + +00:50:16 Yeah, in the book, it reminds me, one of my users was complaining about the status screen that I show for the users and he had all these great ideas and I was like, you know what? + +00:50:27 I want to take advantage of that. 
+

00:50:29 So I actually cloned my, the vertical slice for that view and created a new database column or collection for that, to power that view.

00:50:39 and we iterated and iterated to make this thing better and with each iteration, sometimes I needed to change how the read model reacted to events and so I could just blow away the read model, regenerate it from events and we ended up with something really great so that when it

00:50:54 was time to go live, I just changed which, where the URL went, pointed to the new one and was able to delete the old code and delete the database table and it was wonderful.

00:51:03 Yeah, it just gives you so much flexibility to do whatever you need.

00:51:08 So, a couple more, one more, I guess one more really relevant thing, two more things to give a shout out with this event sourcing.

00:51:14 There's eventsourcing-django, which is a Python package for using it with Django.

00:51:18 Imagine that probably somehow it integrates with the ORM, don't know.

00:51:20 But also eventsourcing-sqlalchemy, which is kind of cool.

00:51:23 So if you use SQLAlchemy, yeah, very nice.

00:51:25 All right, so this stuff is great but I imagine that it has times you should use it maybe more and times you should go, well, square peg round hole maybe not this time.

00:51:36 Sure, yeah.

00:51:37 What do you think?

00:51:38 For me, I feel like it's usually the way I think about it first is because most people are very comfortable with not using event sourcing, right?

00:51:45 And so I usually answer it the opposite way which is when should you?

00:51:48 Sure, exactly.

00:51:49 The two biggest, the best piece of advice that I heard over the last decade was number one, use it, a good opportunity to use it is if you have a database column called status because if you have a column named status then that means that

00:52:03 one item can be in multiple different statuses, right?
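The "blow away the read model and regenerate it from events" workflow described above works because a read model is just a fold over the event stream: change how the view reacts to events, delete it, and replay from the start. A minimal sketch, with made-up event shapes:

```python
# A read model is a fold over the event stream, so it can be deleted and
# regenerated at will. Event shapes here are made up for illustration.

events = [
    {"type": "TaskCreated",   "data": {"id": 1, "title": "Ship it"}},
    {"type": "TaskCompleted", "data": {"id": 1}},
    {"type": "TaskCreated",   "data": {"id": 2, "title": "Write docs"}},
]

def build_status_view(events):
    """Regenerate the status screen's read model from scratch."""
    view = {}
    for event in events:
        data = event["data"]
        if event["type"] == "TaskCreated":
            view[data["id"]] = {"title": data["title"], "done": False}
        elif event["type"] == "TaskCompleted":
            view[data["id"]]["done"] = True
    return view

print(build_status_view(events))
# {1: {'title': 'Ship it', 'done': True}, 2: {'title': 'Write docs', 'done': False}}
```

Changing the view logic and re-running the replay is the whole migration story for that read model.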
+ +00:52:07 Different states and if you're having different states each states behave different in some way or form and so you are definitely not dealing with true CRUD create, read, update, delete patterns and so event sourcing would be a great option for that. + +00:52:20 The second piece of advice is do you ever, are you ever concerned about losing data? + +00:52:25 Because by default event sourcing does not and what it enables you to do is choose when to lose data, right? + +00:52:30 Because you don't have to keep every event around forever. + +00:52:32 You can just say like after 90 days let's just put it to cold storage or just delete it, you know, it depends. + +00:52:37 Yeah, exactly. + +00:52:38 Out in the audience Mike says, I'm scared of the physical storage requirements of this potentially. + +00:52:42 I guess it depends how many data, how many events make up a final state in your system. + +00:52:47 Like a cart checkout, big deal? + +00:52:49 No, probably not. + +00:52:50 Like if used as an app log, that might be a problem. + +00:52:54 Yeah, yeah. + +00:52:54 Most, I'll say models, will have maybe a dozen events, maybe two, depends on your, obviously depends on your domain. + +00:53:03 But ideally, you will keep your events short and they have practices called closing the books where you will use events in your domain to kind of keep it short. + +00:53:12 So like, for example, a store will want to know their revenue across the entire year, but every night they shut down, they get their cash registers or if they still have cash registers and they kind of reconcile how much money they made that day. + +00:53:25 And so, you know, kind of keeping your event stream short really helps. + +00:53:29 If you're going to go back, you would just say, well, we'll just read the daily summary and then add today's events or something like to get the final output, something like that. + +00:53:37 Yeah. 
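The "closing the books" idea mentioned here can be sketched in a few lines: a nightly summary event carries the running total forward, so a query only has to replay from the last summary instead of from day one. The event shapes are made up for illustration:

```python
from datetime import date

# "Closing the books": a nightly summary event carries the running total
# forward, so reads replay from the last summary rather than from day one.
# Event shapes are made up for illustration.

events = [
    {"type": "DayClosed", "on": date(2026, 5, 4), "data": {"running_total": 1250.0}},
    {"type": "SaleMade",  "on": date(2026, 5, 5), "data": {"amount": 20.0}},
    {"type": "SaleMade",  "on": date(2026, 5, 5), "data": {"amount": 5.0}},
]

def revenue_to_date(events):
    total = 0.0
    for event in events:
        if event["type"] == "DayClosed":
            total = event["data"]["running_total"]  # summary stands in for all prior events
        elif event["type"] == "SaleMade":
            total += event["data"]["amount"]
    return total

print(revenue_to_date(events))  # 1275.0
```

Older events can then be archived to cold storage, since the summary event preserves what day-to-day reads actually need.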
+

00:53:38 Unless you want to go all the way back to day one, in which case you can, you know, read and say like, okay, the, you know, it all depends on how you want to do it, right?

00:53:46 This is again, the flexibility side of it because you could just like, say, start from today and read forward or you can start from today and say, okay, what was the event stream before this and read that and keep going back to the originating?

00:53:57 And I think you put up Mike's comment that said, I'm afraid of the physical storage requirements.

00:54:02 And it's like, that is the trade-off.

00:54:04 There is, it will take more space, but thankfully, storage space is the cheapest commodity in all of online or, you know, in today's world.

00:54:15 And it's, and most expensive is memory and then compute and then bandwidth and then storage.

00:54:20 I think that's probably the breakdown, right?

00:54:22 Yeah, I think so.

00:54:24 And why things like DiskCache are awesome versus like another thing that's just in a memory cache, but another process, right?

00:54:29 Like, exactly.

00:54:30 Yeah.

00:54:31 And having the ability to say like, you know, we, you know, like my current application, I have not yet deleted any events, but truly like the only reason we have events older than even a week are just for analytical purposes and, and just me understanding how our system works.

00:54:45 And so I'm planning to make a way to offload that event or those events.

00:54:49 And a lot of people just put them straight to cold storage, you know, just so that they always have a backup just in case, but, you know, chances are they rarely ever use it.

00:54:57 Interesting.

00:54:58 And one other thing I did want to add is to go answer your question when not to use event sourcing.

00:55:02 Exactly.

00:55:03 Would be essentially like, you know, so let's say you don't care about losing data.
+ +00:55:08 There are just a number of just simple applications that are truly crud, right? + +00:55:12 Like I've worked on a number of these where they're just forms over data. + +00:55:16 It's exactly the term I was thinking, forms over data. + +00:55:18 Defining that for people if they don't know. + +00:55:20 Yeah, it's essentially something where like in my case, one of the first ones I worked on is like you have a web page that almost exactly mirrors the database table that you're saving the data to. + +00:55:30 You know, maybe it's a contact form. + +00:55:32 Who knows what it could be? + +00:55:33 You know, the idea is like there is so the web UI or whatever you're building is just an easy way to get data into the database. + +00:55:41 And chances are you don't have status field. + +00:55:44 You don't have all these different ways of different rules for how things behave. + +00:55:49 And in fact, in my event source application, I have a model that is not event sourced. + +00:55:54 It is truly a crud model. + +00:55:56 And so just by saying, you know, adopting event sourcing doesn't mean you have to do it for everything. + +00:56:00 You can use it for even just a small bit of your project, especially if you want to try it out and see how it could be. + +00:56:05 That's a really good point. + +00:56:06 It's not an all or nothing sort of thing. + +00:56:07 Because you have a properly factored data access layer and you're not doing that inside of your Jinja template, are you? + +00:56:15 No? + +00:56:16 Right. + +00:56:17 Not even in your view, but like you've got just an opaque layer of actions. + +00:56:22 Some of those actions can be driven by events. + +00:56:25 Some of those actions can just be straight crud. + +00:56:27 Create, read, update, delete for those who don't know. 
+ +00:56:29 One of the people who inspired me to really dig into event sourcing, he has a line of business that, well, he'll go to a company who is struggling because their database schema is holding them back. + +00:56:42 You know, for whatever decision they made, they cannot, they're having such a hard problem, hard time creating a new feature because of their database schema. + +00:56:50 So he goes in, teaches them event sourcing and uses the event sourcing event store to publish both the dream schema that they wish they would have and the old schema. + +00:57:01 And they live side by side and the event, you know, once the features are complete, they'll, you know, put it up and they'll start slowly migrating traffic over to the new event source version and, you know, eventually they can delete the old database table or schema, you know, database. + +00:57:15 And most of the teams he's worked with have kept with the event source version and gone on from there. + +00:57:21 Yeah. + +00:57:21 Oh, and then finally, one other thing I want to mention too is when not to use is it's up to your teammates because, you know, I am sold. + +00:57:30 I think this is such an incredible pattern. + +00:57:33 It is just unlocking so much joy again and so much flexibility as I've said before that I cannot imagine having to go back. + +00:57:41 That said, if I join a new team and my team members are like, I don't know, I'm going to go with them, you know. + +00:57:48 Yeah. + +00:57:49 One thing is to not use an optimal pattern. + +00:57:52 What's worse is to try to use an optimal pattern but have nobody else want to do it and then they work around it and you, you know, it sounds a little similar to like people who don't want to do unit testing. + +00:58:02 Yeah. 
+

00:58:02 So some of the people write the unit tests and they set up CI/CD that'll fail if the unit tests failed but then the other people will check in work without running the tests at all and then they break it and you're like, what are you doing?

00:58:11 Like, well, I don't want to run these crappy tests.

00:58:13 You're like, well, now the whole CI/CD is not just not helpful.

00:58:16 It's inhibiting me working because you won't even, you know what I mean?

00:58:19 It's just like, and it seems like you do need a certain level of buy-in for this to make sense.

00:58:24 Absolutely.

00:58:24 Yeah.

00:58:25 And maybe they should listen to this podcast and they can see it.

00:58:28 And maybe you create an example of one feature in an event-sourced way so they can see some of the benefits.

00:58:36 But, you know.

00:58:37 Yeah, yeah, yeah.

00:58:37 Like your partial example, indeed.

00:58:39 Yeah.

00:58:39 All right.

00:58:40 We're getting short on time here, Chris, but let's talk this AI flow.

00:58:45 First of all, let's circle back to your comment of your company having a mandate to use AI.

00:58:52 What the heck is going on here?

00:58:54 How is this received and how are you receiving it?

00:58:56 And also tell us, are you actually writing, you know, make, shipping more features and be more productive or not?

00:59:03 Like what, give us your assessment as much as you're willing to share.

00:59:06 Like you don't have to like.

00:59:08 Yeah, yeah.

00:59:09 I will, I will hide certain things, but to say.

00:59:13 Names and places have been changed to protect the parties.

00:59:16 Yes, and emotions and conversations with multiple other people.

00:59:20 I would say at times I am so much more productive.

00:59:23 At times it has brought down the production, you know.

00:59:30 So it is a mixed case.

00:59:33 I see that Mike in the chat said it's an overconfident intern and I'm like 110%.
+

00:59:39 Like this is exactly what it is.

00:59:40 But very smart intern.

00:59:41 It is.

00:59:41 Oh, absolutely.

00:59:42 Very confident.

00:59:43 Oh, well, he said that.

00:59:43 Yeah, overconfident.

00:59:44 And I find this fascinating because my production app is actually three services in one monorepo and I'm responsible for one and, you know, a couple other people are responsible for the other ones, but, you know, we're all interacting.

00:59:59 And so my, when I need to change something on my code, you know, and I'm required to use Claude Code, I say this is what I need to do and generally it does a really great job.

01:00:08 And I think a lot of this has to do with the vertical slice architecture because vertical slices only hold code that is responsible for one feature.

01:00:15 And so that really fits very nice into a context window.

01:00:18 Yeah.

01:00:19 It doesn't have to scan 200,000 lines and 100 files.

01:00:23 It looks at five.

01:00:24 Yeah.

01:00:24 And it knows event sourcing.

01:00:26 So it knows, okay, I'm subscribing to events.

01:00:28 These are the events.

01:00:29 I know where they are, you know, all these different things.

01:00:31 When I work on one of the other services, it takes a lot more context to understand the state of the code.

01:00:38 And I really have to work harder to do, to do what I need to do in those, in those parts of the code base.

01:00:46 Yeah.

01:00:46 So it's been a very interesting experiment.

01:00:48 And additionally, kind of when I am, I had to curb this, but when I have been more productive is creating git worktrees.

01:00:56 So it's like, I have a kind of main repo that I work out of.

01:00:59 And then I, if I have a feature that I, you know, I'm like looking at the code base and like, oh, or the web app or the logs.

01:01:04 And I'm like, oh, it'd be good to like optimize this.
+

01:01:07 Then I create a new worktree and set Claude up in there and get it working on a thing.

01:01:11 And so I have found that I can only, I need to limit myself to two or three worktrees because any more than that, I start losing context.

01:01:18 And now I know what the LLM is.

01:01:20 Yeah, exactly.

01:01:21 If you over, overdo it, it's, you just send off five agents and don't look like that's how you end up like, oh, we have kind of like bugs in our code about architecture.

01:01:28 Well, you never look at it.

01:01:30 It's like, we got the super energetic, super smart intern and we kicked him off and said, go on that feature.

01:01:35 But you know, they need guidance, right?

01:01:38 All the tests are passing because I changed all the tests to pass.

01:01:40 I know.

01:01:41 The problematic data has been removed from the database.

01:01:43 It works now.

01:01:45 Why is it empty?

01:01:46 Yeah.

01:01:48 Back to your backup comments.

01:01:50 Honestly, I'm having like an insane amount of productivity with Claude Code and with AI and stuff.

01:01:55 But it's an engineering skill.

01:01:57 It is not just, let's fire it up and ask for it.

01:01:59 Like one of the things I'm doing lately that I'm really appreciating is going through like a planning session, which I know a lot of people do that and like talk about it.

01:02:07 But, and now, if you have the GitHub CLI installed, just the gh thing, you can tell it, hey, create, you know, instead of just running this plan, create a GitHub issue of this plan, write all the details in GitHub and then your next comment

01:02:22 can be, let's work on issue 127 and it'll go work on it when it gets done.

01:02:26 Like, let's make a retrospective comment on the issue.

01:02:29 Let's create a PR that closes that issue.
+

01:02:32 That's, you're like, there's some really interesting team dynamics that you can put in there that, you know, talking to a chat box is not covered, but if you know what, you know what to ask for.

01:02:41 Yeah, I'm really inspired by Martin and Adam because both of them in one way or another have, let me take a step back.

01:02:48 They, I mentioned the event modeling diagram.

01:02:51 It is a visual diagram that really has a reduced visual language and what was mind-blowing to us a couple years ago was that AI understands it.

01:03:01 And so, the fact that you can essentially say like, here's the diagram, can you implement the slice and it can get you from, well, let me take a step back.

01:03:08 Martin and Adam have both had successful research spikes where they took an event modeling diagram.

01:03:16 Actually, no, they even did, what they did even worse was they started with a conversation with a client and recorded it, created the transcript.

01:03:23 They generated the diagram.

01:03:24 Generated the diagram and then generated code from the diagram that didn't solve everything but got it, I think, 80 or 85% of the way there in hours from, you know, like cutting months of work down to weeks is impressive.

01:03:38 And I've had some similar, you know, I'm still working on my personal one because like, I just, you know, after work, I just tend to shut down my computers and I'm not like dedicated to like really going at it but like, I found some really incredible benefits

01:03:52 of doing something like that.

01:03:53 That's awesome.

01:03:54 I created this open source project.

01:03:56 I mean, it's more source open, whatever, it's not really a project but I called it Python Package Guides for Agents and all the projects that I work on, I'll go and download the source and the documentation.
+

01:04:05 So like, if I'm working on DiskCache, I'll like literally clone it, clone the documentation and then make Claude write a super detailed, like, not use their documentation or its old version but like, have it like legit, write down examples, study the source code, study the documentation,

01:04:20 source code like trumps documentation because if the docs are out of date and so on and then I'll drop, you know, if I'm using like two of these like maybe data classes and DiskCache, I'll drop those things into my project and tell Claude about them and that's been a pretty neat thing to do as well.

01:04:34 Yeah.

01:04:34 But I want to leave this portion of our conversation with an incredible joke.

01:04:38 Okay.

01:04:39 Okay.

01:04:40 Just because I feel like this has to be said right now.

01:04:42 It's just so, the joke, this is the word Copilot, it could be Claude, it could be Codex, it could be Chat, whatever, like just AI, right?

01:04:48 Friends outside of tech.

01:04:50 Lol, Copilot is dumb.

01:04:51 Friends in tech.

01:04:52 In tech.

01:04:52 I just bought iodine tablets and I've made an offer on land upstate.

01:04:55 My supplies of antibiotics and potable water are sufficient but I need to set up for the hydroponics to make it through the first few years.

01:05:01 Like I feel like that's where we are, you know?

01:05:03 Yeah, yeah, totally, totally.

01:05:06 And that maybe also sums up your meetup as well.

01:05:10 Yeah, yeah, yeah.

01:05:11 Yeah, absolutely.

01:05:12 It's quite a spectrum both of friends outside tech and inside tech.

01:05:17 Yeah, exactly.

01:05:17 Like it's more like believers and non-believers.

01:05:19 I'm not really sure.

01:05:20 All right.

01:05:21 Final call to action.

01:05:22 What do we got here?

01:05:24 People are interested.

01:05:24 They want to get started.

01:05:25 What do you tell them?
+

01:05:26 Get your ebook, your free ebook that you put up?

01:05:28 That's right.

01:05:29 Yeah, so my website is everydaysuperpowers.dev.

01:05:32 If you want an ebook that kind of introduces you into event sourcing and kind of gives you kind of this kind of fundamental background and a couple other things, go to everydaysuperpowers.dev slash es intro and it'll take you right there.

01:05:46 I'm on Mastodon mostly, but I'm also on Bluesky and sometimes X, underscore chrismay on all of those.

01:05:55 With Mastodon, I think I'm on fosstodon.org.

01:05:57 What else?

01:05:59 I mentioned everydaysuperpowers, so that's...

01:06:01 I also have a Discord from there too, so if you go through my website, you can see how you can join that.

01:06:07 Sweet.

01:06:07 Maybe check out the event sourcing library.

01:06:09 100%.

01:06:09 For them people.

01:06:10 Yeah.

01:06:10 Yeah.

01:06:11 And if you...

01:06:12 Oh, oh, the...

01:06:13 Martin and Adam have a podcast called the Event Modeling and Event Sourcing Podcast, which is verbosely named, but it also is really great.

01:06:21 You know, this is how I learn just from these great people, you know.

01:06:25 There you go.

01:06:26 Just kind of every...

01:06:27 Almost every week talking about patterns they do and stuff like that.

01:06:30 They also talk about a bunch of other stuff that isn't relevant, but I've learned so much from listening to them and they also have Discords.

01:06:37 So go ahead.

01:06:37 Cool.

01:06:37 I'm honestly impressed.

01:06:39 Like an entire podcast on a single design pattern.

01:06:41 Let's go.

01:06:42 That's commitment to it.

01:06:43 Yeah.

01:06:44 Yeah.

01:06:44 And it's incredible.

01:06:45 I mean, as someone who has played the board game and really enjoys it, it's amazing.
+ +01:06:51 Well, Chris, I really appreciate you coming on here and sharing all your experience and excitement and all the things. + +01:06:56 Great to talk to you. + +01:06:57 Likewise. + +01:06:58 Thanks for having me. + +01:06:59 This has been another episode of Talk Python To Me. + +01:07:02 Thank you to our sponsors. + +01:07:03 Be sure to check out what they're offering. + +01:07:04 It really helps support the show. + +01:07:06 This episode is sponsored by Sentry's Sear. + +01:07:09 If you're tired of debugging in the dark, give Sear a try. + +01:07:12 There are plenty of AI tools that help you write code, but Sentry's Sear is built to help you fix it when it breaks. + +01:07:18 Visit talkpython.fm/sentry and use the code talkpython26, all one word, no spaces, for $100 in Sentry credits. + +01:07:27 And it's brought to you by Temporal, durable workflows for Python. + +01:07:31 Write your workflows as normal Python code and Temporal ensures they run reliably, even across crashes and restarts. + +01:07:38 Get started at talkpython.fm/Temporal. + +01:07:41 If you or your team needs to learn Python, we have over 270 hours of beginner and advanced courses on topics ranging from complete beginners to async code, Flask, Django, HTMX, and even LLMs. + +01:07:54 Best of all, there's no subscription in sight. + +01:07:57 Browse the catalog at talkpython.fm. + +01:07:59 And if you're not already subscribed to the show on your favorite podcast player, what are you waiting for? + +01:08:04 Just search for Python in your podcast player. + +01:08:06 We should be right at the top. + +01:08:08 If you enjoyed that geeky rap song, you can download the full track. + +01:08:11 The link is actually in your podcast blur show notes. + +01:08:13 This is your host, Michael Kennedy. + +01:08:15 Thank you so much for listening. + +01:08:16 I really appreciate it. + +01:08:18 I'll see you next time. 
+

01:08:48 Thank you.

diff --git a/transcripts/548-event-sourcing-with-chris-may-transcript.vtt b/transcripts/548-event-sourcing-with-chris-may-transcript.vtt
new file mode 100644
index 0000000..40fe32f
--- /dev/null
+++ b/transcripts/548-event-sourcing-with-chris-may-transcript.vtt
@@ -0,0 +1,3211 @@
+WEBVTT
+
+00:00:00.000 --> 00:00:01.920
+What if your database worked more like Git?
+
+00:00:02.200 --> 00:00:09.340
+Every change captured as an immutable event instead of a single mutating row that quietly forgets its own history.
+
+00:00:09.860 --> 00:00:10.700
+That's event sourcing.
+
+00:00:10.820 --> 00:00:17.860
+And Chris May is back on Talk Python, fresh off our Datastar panel, to walk us through what event sourcing actually looks like in Python.
+
+00:00:18.360 --> 00:00:26.600
+We'll cover core patterns, the libraries to reach for, when not to use it, and why event sourcing turns out to be a surprisingly good fit for AI-assisted coding.
+
+00:00:27.180 --> 00:00:29.460
+This is Talk Python To Me, episode 548.
+
+00:00:29.460 --> 00:00:32.340
+Recorded May 5th, 2026.
+
+00:00:47.880 --> 00:00:54.600
+Welcome to Talk Python To Me, the number one Python podcast for developers and data scientists.
+
+00:00:54.880 --> 00:00:56.480
+This is your host, Michael Kennedy.
+
+00:00:56.840 --> 00:01:00.460
+I'm a PSF fellow who's been coding for over 25 years.
+
+00:01:01.020 --> 00:01:02.160
+Let's connect on social media.
+ +00:01:02.460 --> 00:01:05.640 +You'll find me and Talk Python on Mastodon, Bluesky, and X. + +00:01:05.820 --> 00:01:07.780 +The social links are all in your show notes. + +00:01:08.480 --> 00:01:12.040 +You can find over 10 years of past episodes at talkpython.fm. + +00:01:12.120 --> 00:01:15.540 +And if you want to be part of the show, you can join our recording live streams. + +00:01:15.700 --> 00:01:16.220 +That's right. + +00:01:16.420 --> 00:01:19.760 +We live stream the raw, uncut version of each episode on YouTube. + +00:01:20.240 --> 00:01:24.780 +Just visit talkpython.fm/youtube to see the schedule of upcoming events. + +00:01:24.920 --> 00:01:28.600 +Be sure to subscribe there and press the bell so you'll get notified anytime we're recording. + +00:01:29.200 --> 00:01:31.900 +This episode is sponsored by Sentry's Seer. + +00:01:32.200 --> 00:01:34.680 +If you're tired of debugging in the dark, give Seer a try. + +00:01:35.200 --> 00:01:40.500 +There are plenty of AI tools that help you write code, but Sentry's Seer is built to help you fix it when it breaks. + +00:01:40.980 --> 00:01:48.980 +Visit talkpython.fm/sentry and use the code Talk Python26, all one word, no spaces, for $100 in Sentry credits. + +00:01:49.460 --> 00:01:53.140 +And it's brought to you by Temporal, durable workflows for Python. + +00:01:53.140 --> 00:02:00.160 +Write your workflows as normal Python code and Temporal ensures they run reliably, even across crashes and restarts. + +00:02:00.420 --> 00:02:03.440 +Get started at talkpython.fm/Temporal. + +00:02:04.460 --> 00:02:05.120 +Hey, Chris. + +00:02:05.520 --> 00:02:05.840 +Hey there. + +00:02:05.840 --> 00:02:06.480 +How's it going? + +00:02:06.780 --> 00:02:07.300 +I'm well. + +00:02:07.380 --> 00:02:07.800 +How are you? + +00:02:08.100 --> 00:02:08.640 +Oh, good. + +00:02:08.780 --> 00:02:09.540 +Good to hear it. + +00:02:09.560 --> 00:02:11.600 +I'm happy to have you back on the show. 
+
+00:02:11.920 --> 00:02:16.660
+Last time we had you on the show, you were part of the panel around Datastar, and that was cool.
+
+00:02:16.920 --> 00:02:21.820
+Now we're going to talk about event sourcing, but we'll find a way to tie it back to Datastar just a bit, I think.
+
+00:02:21.820 --> 00:02:22.960
+I see it on the horizon.
+
+00:02:23.340 --> 00:02:24.000
+Or on the show notes.
+
+00:02:25.080 --> 00:02:25.640
+Sounds good.
+
+00:02:25.640 --> 00:02:25.960
+One of these.
+
+00:02:26.260 --> 00:02:26.480
+Yeah.
+
+00:02:26.660 --> 00:02:28.460
+Well, not everyone listens to every episode.
+
+00:02:28.680 --> 00:02:32.020
+We've got new listeners coming all the time, so give us the quick intro.
+
+00:02:32.240 --> 00:02:32.860
+Who are you, Chris?
+
+00:02:33.120 --> 00:02:33.540
+Who am I?
+
+00:02:33.600 --> 00:02:33.940
+Let's see.
+
+00:02:34.060 --> 00:02:42.880
+I am a Python developer of about 20 years and a long-time listener to the show, so it's a very big privilege to be on, finally, my own self.
+
+00:02:43.640 --> 00:02:46.180
+And by the way, great job continuing going, Michael.
+
+00:02:46.260 --> 00:02:49.680
+You're constantly putting out great content, so I really appreciate all your work.
+
+00:02:50.480 --> 00:02:53.240
+But as far as me, I learned to program as an adult.
+
+00:02:53.960 --> 00:02:55.820
+A friend suggested I learn Python.
+
+00:02:56.240 --> 00:03:00.500
+I hitched my wagon to this engine, and I've loved it ever since.
+
+00:03:01.340 --> 00:03:04.620
+I was a technical coach for a little while.
+
+00:03:04.800 --> 00:03:08.680
+I started the Python group here in Richmond, Virginia, PyRVA.
+
+00:03:09.000 --> 00:03:10.580
+So if you're local, come out.
+
+00:03:10.940 --> 00:03:13.420
+In fact, we're just two weeks away from the next meeting.
+
+00:03:14.120 --> 00:03:14.240
+Awesome.
+
+00:03:14.340 --> 00:03:15.460
+How frequently do you have meetings?
+
+00:03:15.800 --> 00:03:16.580
+Once a month now.
+ +00:03:16.740 --> 00:03:20.480 +In fact, if you look behind me, you can see all these blue dots are the meetings. + +00:03:20.860 --> 00:03:21.860 +Okay, excellent. + +00:03:22.080 --> 00:03:22.420 +Keep planning. + +00:03:22.480 --> 00:03:23.900 +What are some of the kind of topics you all have? + +00:03:24.100 --> 00:03:24.520 +Oh, man. + +00:03:24.520 --> 00:03:29.160 +We range pretty much whatever we can figure out that people are interested in. + +00:03:29.160 --> 00:03:31.820 +We've had a number of AI discussions. + +00:03:32.040 --> 00:03:33.820 +In fact, those have been really powerful. + +00:03:34.000 --> 00:03:38.180 +We've kind of lined up the chairs in a big circle and just have a discussion. + +00:03:38.420 --> 00:03:40.060 +And it's really incredible. + +00:03:40.260 --> 00:03:47.140 +You know, you have people on the whole spectrum of opinions about AI. + +00:03:47.420 --> 00:03:49.560 +So, yeah, it was very, very... + +00:03:49.560 --> 00:03:50.840 +That was one of my favorite... + +00:03:50.840 --> 00:03:52.440 +I was going to say episodes. + +00:03:52.600 --> 00:03:53.560 +One of my favorite meetings. + +00:03:54.020 --> 00:03:59.960 +I think we might bookend this podcast episode with a little AI at the start, a little AI at the end. + +00:04:00.180 --> 00:04:00.680 +Sounds good. + +00:04:01.420 --> 00:04:01.620 +Cool. + +00:04:01.960 --> 00:04:03.520 +So, yeah, people will just go attend that. + +00:04:04.280 --> 00:04:06.920 +And I guess probably a meetup.com is where they find you. + +00:04:07.120 --> 00:04:07.820 +It is, yeah. + +00:04:08.020 --> 00:04:12.740 +And you can go to pyrva.org to get to the meetup page either way. + +00:04:12.840 --> 00:04:13.160 +Yeah, easiest way. + +00:04:13.320 --> 00:04:14.900 +Because you can find a lot of crazy stuff. + +00:04:14.980 --> 00:04:18.580 +You'll probably find people with pet pythons that want to meet up in Virginia as well. + +00:04:18.680 --> 00:04:19.520 +And it's not the same thing. 
+ +00:04:20.040 --> 00:04:20.940 +Indeed, yeah. + +00:04:21.180 --> 00:04:24.340 +Thankfully, and we are, gosh, over 10 years old. + +00:04:24.340 --> 00:04:28.620 +So we are one of the first ones to pop up just because of age, I think, sometimes. + +00:04:28.940 --> 00:04:31.880 +Although, honestly, some of the newer ones pop up ahead of us sometimes, too. + +00:04:32.160 --> 00:04:32.680 +So, you know. + +00:04:32.880 --> 00:04:35.960 +It's such a contentious topic, this AI stuff. + +00:04:36.040 --> 00:04:37.080 +You know, talking about your meetup. + +00:04:37.320 --> 00:04:38.080 +How did that go? + +00:04:38.160 --> 00:04:39.000 +Were people frustrated? + +00:04:39.240 --> 00:04:39.820 +People excited? + +00:04:40.200 --> 00:04:42.460 +Yeah, it was kind of the whole spectrum. + +00:04:42.600 --> 00:04:45.280 +You know, I felt like it was great because everybody respected each other. + +00:04:45.420 --> 00:04:51.200 +And there were, I'd say, two people anchored on each of the sides. + +00:04:51.380 --> 00:05:01.860 +You know, two people who, actually, maybe there were three people who, you know, have bought their own hardware and are really going deep into AI and, you know, encouraging people to, like, really dig into it. + +00:05:01.980 --> 00:05:10.440 +And we had two people who were very AI skeptic and very, you know, worried about the environment, you know, all sorts of different things. + +00:05:10.760 --> 00:05:13.940 +And so, I thought it was a very healthy, very good discussion. + +00:05:14.580 --> 00:05:19.200 +You know, I walked away with, you know, a healthier respect for both sides. + +00:05:20.180 --> 00:05:28.640 +Some ideas that I incorporated into my work, especially now that the company I work for gives us a mandate to use AI to write every piece of code. + +00:05:28.760 --> 00:05:31.880 +So, that's been a very fascinating transition for me. 
+ +00:05:32.420 --> 00:05:43.820 +But it also gives me, you know, a little bit of agency or a little bit of permission to experiment and see what to do, you know, how to actually function in this new paradigm. + +00:05:44.320 --> 00:05:44.940 +It is engineering. + +00:05:45.320 --> 00:05:50.260 +And we'll talk about it more later, but it's quite wild that your company's, that is the position. + +00:05:50.440 --> 00:05:52.700 +I don't necessarily think it's the wrong position. + +00:05:52.940 --> 00:06:02.840 +I think it might be the right one, but it's, I think it's got to be brought on from a, from a, this is a skill you need to learn, not just throw stuff at the chat bot and now that's your job. + +00:06:03.040 --> 00:06:04.680 +Like, these are not the same things, you know? + +00:06:04.760 --> 00:06:10.260 +But I think a lot of people do, especially when they're getting started, treat them as the same thing and then say it doesn't work and then they're frustrated. + +00:06:10.260 --> 00:06:11.100 +Yeah, totally. + +00:06:11.480 --> 00:06:11.660 +Yeah. + +00:06:11.940 --> 00:06:12.160 +Yeah. + +00:06:12.740 --> 00:06:13.100 +Yeah. + +00:06:13.420 --> 00:06:13.740 +All right. + +00:06:14.380 --> 00:06:15.200 +More of that. + +00:06:15.400 --> 00:06:18.520 +We'll speak further about certain things, but yes, I agree. + +00:06:19.220 --> 00:06:19.620 +Absolutely. + +00:06:19.820 --> 00:06:20.220 +Absolutely. + +00:06:20.400 --> 00:06:20.680 +All right. + +00:06:20.740 --> 00:06:27.640 +Well, we'll, we'll come back to that and, and get there, but let's talk about a little bit of this event sourcing thing. + +00:06:27.900 --> 00:06:28.220 +Yeah. + +00:06:28.520 --> 00:06:29.860 +What exactly is event sourcing? + +00:06:30.180 --> 00:06:30.640 +That's a question. + +00:06:30.720 --> 00:06:31.000 +Okay, cool. + +00:06:31.060 --> 00:06:32.200 +I wasn't sure if you had more. + +00:06:32.420 --> 00:06:32.980 +I saw your. + +00:06:33.080 --> 00:06:33.480 +No, no. 
+
+00:06:33.540 --> 00:06:36.820
+I think I, when you, when you reached out and said, Hey, let's talk about this.
+
+00:06:36.940 --> 00:06:38.920
+I'm like, I, so let's take a step back.
+
+00:06:38.920 --> 00:06:39.320
+Okay.
+
+00:06:39.500 --> 00:06:43.820
+Design patterns, refactoring ideas, like all of these architecture.
+
+00:06:44.040 --> 00:06:45.500
+I love to talk about this stuff.
+
+00:06:45.560 --> 00:06:46.960
+I love to think about this stuff.
+
+00:06:47.020 --> 00:06:57.800
+I think also, maybe that's a why, one of the reasons I'm not super frustrated with the AI things because I will tell it, I want you to use, like you could say like, I want to build a system with event sourcing and it's going to work this way.
+
+00:06:57.800 --> 00:07:02.360
+Like for me, the fun is like, Oh, I got to build some of the event source and it clicks together with this and it does that.
+
+00:07:02.400 --> 00:07:07.280
+And like, yeah, I didn't have to do all the little checks and details and like the file IO, but whatever.
+
+00:07:07.400 --> 00:07:12.520
+Like I can skip that and build just like, think a little bit bigger and a little more, more big building blocks.
+
+00:07:12.820 --> 00:07:17.660
+So I'm a big fan of design patterns and paying attention to what you're doing, I can tell that you are too.
+
+00:07:18.000 --> 00:07:18.260
+Yes.
+
+00:07:19.700 --> 00:07:21.420
+Especially event sourcing, I would say.
+
+00:07:22.040 --> 00:07:27.180
+So I have been following the topic of event sourcing for over a decade.
+
+00:07:27.180 --> 00:07:36.100
+I listened to a podcast with a couple of developers in the PHP realm and they were talking about event sourcing and it just inspired me.
+
+00:07:36.100 --> 00:07:39.140
+Specifically the things that I really loved.
+
+00:07:39.480 --> 00:07:41.760
+Before I was a programmer, I was a graphic designer.
+
+00:07:42.160 --> 00:07:47.640
+And so creating websites with exceptional user experiences is something that just makes me excited.
+
+00:07:48.180 --> 00:08:01.760
+And at the time, I was working at a creative agency building websites that would just slow, get slower and slower and slower, the more data we put into it or, you know, the more we configure the routing or the navigation and whatnot.
+
+00:08:01.760 --> 00:08:08.480
+And so the thought of never having a slow page load again was intoxicating to me.
+
+00:08:08.980 --> 00:08:22.600
+However, that was in a portion of my career where I had so much imposter syndrome, having learned to program as an adult, and I just felt like I couldn't suggest to my lead developer we should lean into this.
+
+00:08:23.000 --> 00:08:32.100
+And even when I was a lead developer, I just didn't feel confident that I could lead a whole team into like a redesign of the code.
+
+00:08:32.720 --> 00:08:35.740
+But a couple years ago, I got laid off and I was like, you know what?
+
+00:08:35.820 --> 00:08:39.300
+I've been wanting to explore event sourcing for 10 years.
+
+00:08:39.540 --> 00:08:40.340
+I'm going to do it.
+
+00:08:40.660 --> 00:08:43.300
+And it turns out it's a lot easier than I thought.
+
+00:08:43.760 --> 00:08:47.460
+But anyway, all this to say, let me define event sourcing since I've kind of danced around it.
+
+00:08:47.720 --> 00:08:51.620
+Event sourcing has to do with how you save data to the database.
+
+00:08:52.100 --> 00:08:56.760
+The best way to contrast it is most apps are kind of CRUD based, right?
+
+00:08:56.820 --> 00:09:00.540
+So you create a table, like let's say you're a shopping cart application.
+
+00:09:00.940 --> 00:09:10.200
+You have a table called carts and you have a bunch of columns and one of the columns has data for the product IDs that are in the cart.
+ +00:09:10.560 --> 00:09:19.940 +And so if you have that kind of situation and you have one user who adds five items to the cart, then removes two and then checks out. + +00:09:20.200 --> 00:09:23.680 +And then you have another user who adds just three items to the cart and checks out. + +00:09:23.880 --> 00:09:28.680 +Well, if they check out at the same time and you look at the two database rows, they are very similar. + +00:09:28.680 --> 00:09:30.480 +They're the same user, basically. + +00:09:30.720 --> 00:09:34.420 +They get put in the same cohort from your marketing side, right? + +00:09:34.680 --> 00:09:34.940 +Yeah. + +00:09:35.180 --> 00:09:38.560 +Both of them checked out exactly the same products. + +00:09:39.060 --> 00:09:44.600 +And if you look at the database, you'd have no idea that one of them removed two items from their shopping cart. + +00:09:44.840 --> 00:09:58.360 +So event sourcing, so the reason that is because CRUD based applications will mutate state in the database to ensure it's always up to date with the current state and optimize for data integrity. + +00:09:58.720 --> 00:10:03.960 +Event sourcing, on the other hand, captures every change that happens within the system. + +00:10:04.300 --> 00:10:10.000 +So you would have an event of like cart created, item added to cart, item added to cart. + +00:10:10.100 --> 00:10:13.760 +You'd have five item added to cart events and two removals. + +00:10:14.180 --> 00:10:19.640 +And so you could have the whole history of how each user interacts with your system in the database. + +00:10:19.780 --> 00:10:22.200 +That is essentially the core of event sourcing. + +00:10:22.200 --> 00:10:27.060 +Everything else is actually adding more design patterns onto event sourcing, which is really wonderful. + +00:10:27.220 --> 00:10:29.120 +But that is what event sourcing is. + +00:10:29.440 --> 00:10:29.520 +Right. + +00:10:29.620 --> 00:10:32.900 +So your database sort of becomes almost an audit log. 
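The cart walkthrough just described can be sketched in a few lines of Python. This is a minimal illustration of the idea, not code from any particular event sourcing library; the event names (`item_added`, `item_removed`) and the in-memory dict standing in for the event store are all hypothetical.

```python
# Minimal event-sourced shopping cart: an append-only stream of events
# per cart, with current state rebuilt by replaying the events in order.
from dataclasses import dataclass, field


@dataclass
class Event:
    kind: str                              # e.g. "item_added", "item_removed"
    data: dict = field(default_factory=dict)


# The "event store": one append-only list (stream) per cart.
streams: dict[str, list[Event]] = {}


def append(cart_id: str, event: Event) -> None:
    streams.setdefault(cart_id, []).append(event)


def current_items(cart_id: str) -> list[str]:
    """Rebuild the cart's current state by replaying every event."""
    items: list[str] = []
    for e in streams.get(cart_id, []):
        if e.kind == "item_added":
            items.append(e.data["product_id"])
        elif e.kind == "item_removed":
            items.remove(e.data["product_id"])
    return items


# One user adds five items and removes two; another adds three.
for pid in ["a", "b", "c", "d", "e"]:
    append("cart-1", Event("item_added", {"product_id": pid}))
for pid in ["d", "e"]:
    append("cart-1", Event("item_removed", {"product_id": pid}))
for pid in ["a", "b", "c"]:
    append("cart-2", Event("item_added", {"product_id": pid}))

# Both carts end in the same state, but the histories differ:
# seven events for cart-1, three for cart-2.
print(current_items("cart-1"))  # ['a', 'b', 'c']
print(current_items("cart-2"))  # ['a', 'b', 'c']
```

A plain CRUD row would store only the final three items for both users; here the two removals survive as part of cart-1's history.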
+
+00:10:33.080 --> 00:10:35.680
+It feels very source control-ish.
+
+00:10:35.880 --> 00:10:36.900
+It feels like Git, right?
+
+00:10:36.900 --> 00:10:40.740
+Like what you store is the file and then the diff, then the diff, then the diff.
+
+00:10:41.120 --> 00:10:45.120
+You get back to the file by running all the operations on it, potentially.
+
+00:10:45.500 --> 00:10:46.320
+Yeah, exactly.
+
+00:10:46.680 --> 00:10:47.880
+Yeah, that's exactly what it is.
+
+00:10:47.960 --> 00:10:52.460
+And it's funny because like, I don't know about you, but like when I first heard about this, I'm like, well, isn't it slower?
+
+00:10:52.600 --> 00:10:54.560
+Because it just seems like you're doing so much more work.
+
+00:10:54.620 --> 00:11:06.080
+Every time you're updating the cart, you're pulling out all the events from the event store and building up the state of where it's at right now and saying, okay, you know, does that item exist in the cart?
+
+00:11:06.200 --> 00:11:06.900
+Can I remove it?
+
+00:11:07.100 --> 00:11:08.100
+Okay, let's remove it.
+
+00:11:08.240 --> 00:11:10.860
+You know, and it turns out computers are fast.
+
+00:11:12.540 --> 00:11:16.160
+And so it's negligibly slower depending on the query.
+
+00:11:17.000 --> 00:11:17.080
+Yeah.
+
+00:11:17.080 --> 00:11:21.960
+So I saw you and Bob Belderbos talking about this on YouTube and that was one of my first thoughts too.
+
+00:11:22.040 --> 00:11:27.420
+It was like, this is super cool, but it kind of sounds a lot slower to answer questions.
+
+00:11:27.720 --> 00:11:27.920
+Yeah.
+
+00:11:28.180 --> 00:11:35.980
+And I think, but then I thought about it and I thought, I think there's actually a decent amount of stuff that you can do to make it quite a bit faster.
+
+00:11:36.240 --> 00:11:36.620
+Right.
+
+00:11:36.680 --> 00:11:43.380
+So let me throw some ideas out to you and then you tell me how they land as somebody who's actually done this stuff.
+ +00:11:43.540 --> 00:11:46.260 +First of all, back to your comment on computers are fast. + +00:11:46.540 --> 00:11:47.420 +Computers are so fast. + +00:11:47.740 --> 00:11:51.680 +They're so much faster than people realize how fast they are and databases are fast too. + +00:11:51.760 --> 00:11:55.340 +If you put indexes on them, then they're so much faster. + +00:11:55.560 --> 00:11:57.680 +I just don't understand how websites are slow. + +00:11:58.060 --> 00:12:01.780 +It's just, it's not just, oh, I'm a little frustrated. + +00:12:01.880 --> 00:12:16.400 +It's like, how is it possible that somebody built this and accepted that it takes three seconds to load this page and yet they do and you just know that it's most likely that there's not a database index somewhere or maybe on an extreme situation there should be better caching, + +00:12:16.560 --> 00:12:17.820 +but it's just like, ah. + +00:12:18.180 --> 00:12:21.240 +So like when done right, you're right, it's absolutely blazes, right? + +00:12:21.480 --> 00:12:22.300 +Yeah, absolutely. + +00:12:22.900 --> 00:12:24.960 +But how, you could make it faster. + +00:12:25.040 --> 00:12:26.700 +So a couple of thoughts that came to mind. + +00:12:26.960 --> 00:12:28.940 +One was, I have three. + +00:12:29.120 --> 00:12:43.680 +The one is you could have a operational database which has, for a particular user, it might have the five ads, two removals and that's their shopping cart and the way you get that is either you query it and then code you add it up or you do some kind of aggregation thing + +00:12:43.680 --> 00:12:48.600 +that says get all these things and then somehow plus minus them together, right? + +00:12:48.820 --> 00:12:48.980 +Yeah. + +00:12:49.200 --> 00:12:53.300 +Probably in the shopping cart it feels like you probably need to actually pull them back. 
+ +00:12:53.340 --> 00:13:04.400 +But still, you could do that super quick and then just run that bit of code and every time you make a change you could also write that to a second table, second database that doesn't have the, it just has the current state. + +00:13:04.620 --> 00:13:06.560 +I just have three for that user shopping cart. + +00:13:06.760 --> 00:13:10.320 +That sounds okay to me but it's, I mean databases hate duplication. + +00:13:10.320 --> 00:13:12.900 +That's like kind of their third normal form job. + +00:13:13.180 --> 00:13:17.060 +Do databases hate duplication or is it people that don't like it? + +00:13:18.140 --> 00:13:18.960 +Yeah, that's fair. + +00:13:19.040 --> 00:13:19.440 +That's fair. + +00:13:19.780 --> 00:13:21.720 +I mean they were built to avoid duplication, right? + +00:13:21.820 --> 00:13:22.480 +So in that sense. + +00:13:22.880 --> 00:13:27.380 +So, but another, you know, another possibility would just be something like Valkey. + +00:13:27.440 --> 00:13:30.800 +You could, I could have said Redis but I'm a fan of Valkey over Redis. + +00:13:31.040 --> 00:13:33.120 +I find this is like a little bit nicer project. + +00:13:33.800 --> 00:13:35.080 +Are you familiar with Valkey? + +00:13:35.340 --> 00:13:37.660 +I am not and I'm taking notes mentally right now. + +00:13:37.660 --> 00:13:38.380 +Yeah, here we go. + +00:13:38.940 --> 00:13:45.700 +So Valkey, or if I find its repository which is hiding down here, it describes itself. + +00:13:45.860 --> 00:13:46.820 +Where does it describe itself? + +00:13:47.220 --> 00:13:54.360 +So a fork of the open source Redis project right before the transition to their new source available but not open source models. + +00:13:54.720 --> 00:14:03.680 +So if you have say the Redis Python library, you just tell it that it talks to this thing, this endpoint, this IP address and port and it thinks it's Redis, right? + +00:14:03.700 --> 00:14:05.480 +But it's like more open source friendly. 
+ +00:14:05.560 --> 00:14:06.420 +I'm going to star it for the world. + +00:14:06.680 --> 00:14:08.620 +So, and it's got 26,000 stars, right? + +00:14:08.680 --> 00:14:09.680 +So it's pretty popular. + +00:14:09.980 --> 00:14:20.040 +Not that it's totally really relevant exactly which version this is but just having a cache that has that information of what is, you know, shopping cart is three, period. + +00:14:20.280 --> 00:14:20.520 +Yeah. + +00:14:20.780 --> 00:14:23.220 +And put something, just put two pieces in place. + +00:14:23.340 --> 00:14:25.700 +One, when you do a query, first you check the cache. + +00:14:25.780 --> 00:14:30.300 +If it's not there, you get the playback, you compute it and you put it in the cache. + +00:14:30.460 --> 00:14:33.860 +And if you make any change to that thing, then you invalidate the cache, right? + +00:14:33.860 --> 00:14:41.820 +Those two things would give you basically seamless ephemeral answers to the questions that if you ask them very often, you get super quick responses, right? + +00:14:42.080 --> 00:14:42.420 +Absolutely. + +00:14:43.140 --> 00:14:43.320 +Yeah. + +00:14:43.400 --> 00:14:48.960 +And then the other one is, if you, you know, I always find like these extra caching servers are honestly not necessary. + +00:14:49.480 --> 00:14:50.820 +You know, we already talked about FAST. + +00:14:50.980 --> 00:14:57.440 +So I'm a big fan of disk cache, which I had Vincent Warmerdom on to talk about and disk cache is awesome. + +00:14:57.520 --> 00:15:02.360 +So you could just have like a local file store on your Docker image or volume or whatever. + +00:15:02.360 --> 00:15:05.660 +Then you get the same thing, but you don't have to have the infrastructure, right? + +00:15:05.720 --> 00:15:06.680 +There's, there's different options. + +00:15:06.800 --> 00:15:09.260 +So basically, I guess to sum this up, I'll throw it to you now. 
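The check-the-cache-first, invalidate-on-write pattern just described (cache-aside) might look roughly like this on top of an event store. The plain dict cache stands in for Valkey, Redis, or diskcache, and every name here is illustrative rather than from a real library.

```python
# Cache-aside over an event store: check the cache first, replay events
# on a miss, and invalidate the cached state on every new event.
events: dict[str, list[tuple[str, str]]] = {}  # stream -> [(kind, product_id)]
cache: dict[str, list[str]] = {}               # stands in for Valkey/Redis/diskcache


def replay(cart_id: str) -> list[str]:
    """The 'slow path': rebuild state from the full event stream."""
    items: list[str] = []
    for kind, pid in events.get(cart_id, []):
        if kind == "added":
            items.append(pid)
        else:
            items.remove(pid)
    return items


def get_cart(cart_id: str) -> list[str]:
    if cart_id not in cache:          # miss: compute once, then store
        cache[cart_id] = replay(cart_id)
    return cache[cart_id]             # hit: no replay at all


def record(cart_id: str, kind: str, pid: str) -> None:
    events.setdefault(cart_id, []).append((kind, pid))
    cache.pop(cart_id, None)          # invalidate on any change


record("cart-1", "added", "book")
record("cart-1", "added", "pen")
print(get_cart("cart-1"))   # ['book', 'pen']  (computed, then cached)
record("cart-1", "removed", "pen")
print(get_cart("cart-1"))   # ['book']         (cache was invalidated)
```

Frequently asked questions come out of the cache nearly for free, while the event stream stays the single source of truth.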
+
+00:15:09.640 --> 00:15:14.640
+Operational database that also has the answers or some kind of caching story.
+
+00:15:14.640 --> 00:15:17.560
+It could be in memory, it could be a server like Valkey.
+
+00:15:17.700 --> 00:15:19.240
+It could be the disk cache, whatever.
+
+00:15:19.600 --> 00:15:20.460
+What do you think about those?
+
+00:15:20.740 --> 00:15:21.820
+Are you guys all exploring them?
+
+00:15:22.160 --> 00:15:26.780
+I am exploring two, essentially, yeah, I guess you could say all three in a way.
+
+00:15:27.100 --> 00:15:31.860
+I'm, I'm experimenting with NATS to do kind of the, I guess, Redis kind of side of it.
+
+00:15:31.860 --> 00:15:36.360
+But all that to say is all three are viable options depending on your use case.
+
+00:15:36.740 --> 00:15:39.560
+The first thing is like just the event store itself.
+
+00:15:39.900 --> 00:15:47.980
+Like you, getting the current state of any individual item should not take, you know, essentially should take milliseconds if, if that long.
+
+00:15:48.280 --> 00:15:54.720
+So there are two people who are my North Star for all of my event sourcing knowledge.
+
+00:15:54.720 --> 00:15:57.940
+They are Martin Dilger and Adam Dymitruk.
+
+00:15:57.940 --> 00:16:11.840
+And Martin wrote a book called Understanding Event Sourcing, which was huge in helping me go from somebody who created an event sourcing project to adopting event sourcing as my kind of default strategy.
+
+00:16:12.380 --> 00:16:26.840
+And in the book, he mentioned that he will tend to use the event store for essentially, like, if you start hitting 2000 events in a stream, then he'll think about optimizing it, you know, changing, going to one of the alternate
+
+00:16:26.840 --> 00:16:28.500
+approaches that we just mentioned.
+
+00:16:28.680 --> 00:16:30.180
+So, like, that's really impressive.
+
+00:16:30.420 --> 00:16:33.980
+I, he uses Java or one of the Java derivative languages.
+
+00:16:34.240 --> 00:16:36.820
+So, chances are, it's slower in Python.
+
+00:16:36.820 --> 00:16:40.900
+And so, you know, we need to adjust that number for Python.
+
+00:16:41.020 --> 00:16:44.640
+But honestly, we should have shorter event streams anyway.
+
+00:16:44.940 --> 00:16:45.920
+2000 seems like a lot.
+
+00:16:46.100 --> 00:16:46.540
+It does.
+
+00:16:48.080 --> 00:16:48.980
+Seems like a lot.
+
+00:16:48.980 --> 00:16:54.240
+This portion of Talk Python To Me is brought to you by Sentry and Seer AI.
+
+00:16:54.820 --> 00:17:00.640
+There are plenty of AI tools that help you write code, but Sentry Seer is built to help you fix it when it breaks.
+
+00:17:01.080 --> 00:17:02.220
+The difference is context.
+
+00:17:02.820 --> 00:17:05.060
+Seer isn't just guessing based on syntax.
+
+00:17:05.320 --> 00:17:09.820
+It's analyzing your actual Sentry data, your stack traces, logs, and failure patterns.
+
+00:17:10.240 --> 00:17:20.060
+Because it has the full context, it can, A, spot buggy code in review and help prevent issues before they happen, and B, identify the root cause of production errors.
+
+00:17:20.900 --> 00:17:25.520
+It can even draft a fix and hand the work off to an agent like Cursor to open a PR for you.
+
+00:17:26.060 --> 00:17:28.160
+Seer turns Sentry into a complete loop.
+
+00:17:28.580 --> 00:17:33.240
+You have your traces, errors, logs, and replays to see the problem and now AI to help solve it.
+
+00:17:33.520 --> 00:17:39.500
+Join millions of devs at companies like Claude, Disney+, and even Talk Python who use Sentry to move faster.
+
+00:17:39.960 --> 00:17:48.760
+Check them out at talkpython.fm/sentry and use code talkpython26, all one word, for $100 in Sentry credits.
+
+00:17:49.220 --> 00:17:51.100
+Thank you to Sentry for supporting Talk Python.
+
+00:17:52.160 --> 00:18:05.940
+Kind of the first fallback for me would be what Martin and Adam call the read model, which is essentially, one way you can make a read model would be the database-backed read model where you have code that subscribes
+
+00:18:05.940 --> 00:18:07.900
+to only the events it cares about.
+
+00:18:08.240 --> 00:18:14.240
+And so whenever an event comes in, it'll incrementally update that database cache or your file cache or Redis cache or whatever.
+
+00:18:14.240 --> 00:18:28.440
+And then I guess the third thing that I appreciate, using Redis or in my case, NATS, is when you have a front-end, a very high-frequency, high-updating web UI or something
+
+00:18:28.440 --> 00:18:36.620
+that you want to really make sure that the user has up-to-date information, I would lean towards that having ways to push down to the client.
+
+00:18:36.880 --> 00:18:39.620
+Yeah, anytime you've got a live stream, it seems perfect, right?
+
+00:18:39.840 --> 00:18:40.660
+Yeah, absolutely.
+
+00:18:40.660 --> 00:18:53.620
+Yeah, I mean, hook up some JavaScript or hook in some Textual or whatever it is you're trying to do, or just write, you know, arbitrary WebSockets or server-sent events and just say, when this changes, send me the delta and then we'll adjust.
+
+00:18:53.960 --> 00:18:54.140
+Yeah.
+
+00:18:54.180 --> 00:18:55.460
+Which is pretty, pretty nice.
+
+00:18:55.720 --> 00:18:55.900
+Yeah.
+
+00:18:55.980 --> 00:19:00.640
+So there's a book that you recommended by one of the guys you mentioned, Martin Dilger.
+
+00:19:00.960 --> 00:19:01.140
+Yeah.
+
+00:19:01.140 --> 00:19:02.180
+Tell people about that real quick.
+
+00:19:02.400 --> 00:19:05.640
+Yeah, this is incredible to me.
+
+00:19:05.760 --> 00:19:08.460
+He realized there's a gap.
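The database-backed read model described a moment ago can be sketched as a projection that handles only the events it cares about and updates its state incrementally, instead of replaying the whole stream on every read. This shape is hypothetical, not code from Martin Dilger's book.

```python
# A read-model projection: subscribe to just the events this view needs
# and incrementally update a small, query-optimized structure.
# In production the dict would be a database table, file, or cache.

read_model: dict[str, int] = {}   # cart_id -> current item count


def project(event: dict) -> None:
    """Apply one incoming event to the read model; ignore the rest."""
    kind, cart_id = event["kind"], event["cart_id"]
    if kind == "item_added":
        read_model[cart_id] = read_model.get(cart_id, 0) + 1
    elif kind == "item_removed":
        read_model[cart_id] = read_model.get(cart_id, 0) - 1
    # "cart_created", "checked_out", ... are not this model's concern.


# Feed the projection a stream of events as they arrive.
for event in [
    {"kind": "cart_created", "cart_id": "c1"},
    {"kind": "item_added", "cart_id": "c1"},
    {"kind": "item_added", "cart_id": "c1"},
    {"kind": "item_removed", "cart_id": "c1"},
    {"kind": "checked_out", "cart_id": "c1"},
]:
    project(event)

print(read_model)   # {'c1': 1}
```

Reads then hit the already-materialized model directly, which is the CQRS split mentioned next: writes append events, reads query a view built for the question being asked.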
+
+00:19:08.460 --> 00:19:22.400
+One of the biggest problems with event sourcing is that sometimes there are gaps in where there's good material, and because event sourcing came out of the domain-driven design community, there's a lot of jargon that you have to kind of get through.
+
+00:19:22.980 --> 00:19:25.700
+And they do like their jargon in the DDD space.
+
+00:19:25.960 --> 00:19:26.920
+They really do.
+
+00:19:27.320 --> 00:19:29.560
+And, once you understand it, it makes perfect sense.
+
+00:19:29.660 --> 00:19:30.320
+But yeah, yeah.
+
+00:19:30.340 --> 00:19:32.420
+Getting onboarded does, does take.
+
+00:19:32.500 --> 00:19:35.400
+Get your bounded context working and then off to the races.
+
+00:19:35.640 --> 00:19:36.400
+Yeah, exactly.
+
+00:19:36.400 --> 00:19:44.520
+And so Martin essentially saw this gap in knowledge and was like, I need to fill this with an e-book.
+
+00:19:44.620 --> 00:19:47.280
+And I can't remember, I want to say in like two months he wrote this thing.
+
+00:19:47.480 --> 00:19:52.840
+And it's so amazing because it introduces the way that the two of them work.
+
+00:19:52.960 --> 00:20:07.900
+Like they both independently kind of came to similar conclusions, which is to use event sourcing as your base, but to also leverage a couple of other patterns: vertical slice architecture, CQRS, which is kind of that idea
+
+00:20:07.900 --> 00:20:14.800
+of like having those read models ready, you know, optimized for you to download and use.
+
+00:20:14.940 --> 00:20:18.820
+And then using a documentation technique called event modeling diagrams.
+
+00:20:18.960 --> 00:20:33.080
+And that is a huge key too because as someone who has been on a couple teams to do the event-driven transition to try to really help, you know, do more asynchronously, you need to have a good communication pattern
+
+00:20:33.080 --> 00:20:35.580
+to keep everybody up to date on what does what.
+ +00:20:35.940 --> 00:20:48.860 +And I find that all three, especially event modeling diagram, they have refined this to make it simpler and simpler and simpler to the point where there's really just a few elements put together and you can understand the whole life cycle of a, of an application. + +00:20:49.240 --> 00:20:49.460 +Yeah, cool. + +00:20:49.520 --> 00:20:52.240 +I'll link to the book on, over on Amazon. + +00:20:52.480 --> 00:20:57.360 +You know, another, before we carry on, I thought of another more optimized scenario. + +00:20:57.680 --> 00:20:59.560 +What about a document database like Mongo? + +00:20:59.920 --> 00:21:00.160 +Yeah. + +00:21:00.160 --> 00:21:14.200 +Your top level elements are just like the computed fields like total lifetime value or, you know, cart value or cart count, item count, but then have like maybe a cart item events, which could be a, a nested list + +00:21:14.200 --> 00:21:24.140 +of acts of like rich, you know, many documents that are like item added, item added, timestamp, value, like category, all, and actually storing them in the same record. + +00:21:24.540 --> 00:21:24.680 +Yeah. + +00:21:24.920 --> 00:21:27.740 +I haven't tried that, but I think it's a really interesting approach. + +00:21:27.740 --> 00:21:33.560 +You know, I actually use a document database as my fire store, as my, as my event store. + +00:21:34.400 --> 00:21:38.640 +And I haven't really kind of dug into like kind of the optimizations I could do. + +00:21:38.880 --> 00:21:50.360 +But I find it curious because like the, I mean, what you're suggesting is slightly different than how I think about it because it sounds like you have like a, like a better word, a model that you're storing all of its events in as well. + +00:21:50.620 --> 00:21:55.680 +And some of the neat, some of the, it's kind of inside out of what the real design pattern is, right? 
+ +00:21:55.680 --> 00:22:06.840 +Well, it's not so much that as much as, one thing that I have found interesting is, you know, since this came out of the domain driven design group, everything is about an aggregate, what they call aggregate. + +00:22:07.060 --> 00:22:08.120 +Many people call it a model. + +00:22:08.320 --> 00:22:14.840 +And so you, you know, set up boundaries of this is your shopping cart and these are the events that modify the shopping cart. + +00:22:15.020 --> 00:22:29.520 +What has been a new movement in event sourcing is to, essentially be model less is to like focus on the events themselves because they are so flexible and so many times we as developers can kind of create + +00:22:29.520 --> 00:22:35.000 +boundaries around what we think are the models that, but the models change. + +00:22:35.260 --> 00:22:35.540 +Yeah. + +00:22:35.840 --> 00:22:36.120 +Yeah. + +00:22:36.120 --> 00:22:38.800 +And coupling is like the hardest thing. + +00:22:38.800 --> 00:22:43.880 +The bounded context, as I would say, actually changes because the problem you're solving might change and the models don't match. + +00:22:44.140 --> 00:22:44.540 +Exactly. + +00:22:44.940 --> 00:22:45.180 +Yeah. + +00:22:45.440 --> 00:22:45.660 +Yeah. + +00:22:45.660 --> 00:22:50.320 +So all that to say is I don't, not that I want to especially say like, I don't think that that's a bad idea. + +00:22:50.320 --> 00:22:51.760 +I think it could be really fascinating. + +00:22:52.100 --> 00:23:05.980 +especially as like a secondary approach because, you know, well, whatever, you know, like I, one of the things I really find fascinating about this is this is such a flexible pattern that people, I mean, they've done so many different ways of optimizing + +00:23:05.980 --> 00:23:08.980 +for their event store or anything like this. + +00:23:08.980 --> 00:23:11.100 +So I think that's a very much a valid approach. 
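The document-database idea just discussed might be sketched like this. The `apply` helper below imitates MongoDB's `$inc` and `$push` update operators against a plain dict so the sketch runs without a server; with pymongo, a spec of the same shape could be passed to `collection.update_one`. The field names and document layout are invented for illustration.

```python
# Sketch of the atomic-document idea: keep computed fields and the
# embedded event list in ONE document, so a single update touches both.
# apply() mimics MongoDB's $inc/$push operators on an in-memory dict.

def apply(doc: dict, spec: dict) -> None:
    for key, amount in spec.get("$inc", {}).items():
        doc[key] = doc.get(key, 0) + amount      # bump computed fields
    for key, value in spec.get("$push", {}).items():
        doc.setdefault(key, []).append(value)    # append to nested lists


cart = {"_id": "cart-1", "item_count": 0, "events": []}

# One logical change = one update spec: the count and the history
# move together, so they can never drift out of sync.
apply(cart, {
    "$inc": {"item_count": 1},
    "$push": {"events": {"kind": "item_added", "product_id": "book"}},
})
apply(cart, {
    "$inc": {"item_count": 1},
    "$push": {"events": {"kind": "item_added", "product_id": "pen"}},
})

print(cart["item_count"])   # 2
print(len(cart["events"]))  # 2
```

Because MongoDB applies a single update document atomically, the computed value and the event history stay consistent without a transaction, which is the appeal Michael points to next.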
+
+00:23:11.420 --> 00:23:25.280
+It's, it's, the reason it came to mind is you can atomically update documents and therefore you could atomically update both the computed value and the series of events as a single action, which is interesting, you know?
+
+00:23:25.500 --> 00:23:26.500
+Yeah, absolutely.
+
+00:23:26.840 --> 00:23:27.380
+Very interesting.
+
+00:23:27.440 --> 00:23:30.860
+And they kind of, that, that one thing becomes the source of truth for what you're tracking.
+
+00:23:30.960 --> 00:23:32.420
+I don't, there might be something there.
+
+00:23:32.520 --> 00:23:35.540
+I don't, but it does sound a little bit too focused on the model.
+
+00:23:35.640 --> 00:23:36.220
+I do think.
+
+00:23:36.420 --> 00:23:37.760
+It's worth experimenting with for sure.
+
+00:23:38.120 --> 00:23:47.660
+So just, to throw out a little street cred there, look at this, purchased April 18th, 2005, Domain-Driven Design by Eric Evans.
+
+00:23:47.740 --> 00:23:50.240
+So this is kind of the greater space, right?
+
+00:23:50.320 --> 00:23:54.680
+The book is called Domain-Driven Design: Tackling Complexity in the Heart of Software.
+
+00:23:54.840 --> 00:23:55.640
+It's pretty interesting.
+
+00:23:55.940 --> 00:24:05.000
+I think it's kind of the follow-on of the refactoring movement that Martin Fowler and all those folks were working on, like in the late 90s, early 2000s.
+
+00:24:05.200 --> 00:24:05.420
+Yeah.
+
+00:24:05.460 --> 00:24:06.320
+Kind of in that space, right?
+
+00:24:06.620 --> 00:24:06.880
+Yeah.
+
+00:24:07.220 --> 00:24:08.700
+I must say, I haven't bought that book.
+
+00:24:08.700 --> 00:24:19.660
+The closest I've come is the Cosmic Python book or the Architecture Patterns with Python book that Harry and, oh, I always forget the other guy's name, but that's such an amazing book.
+
+00:24:19.740 --> 00:24:21.240
+So that's the closest I've come to DDD.
+
+00:24:21.600 --> 00:24:24.000
+There's a video.
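The single-document idea just discussed can be sketched in a few lines. This is a hypothetical illustration, not a real schema: the cart document carries both the computed summary fields and its own nested event list, so that one write updates both together (in a document database like Mongo, that would be a single atomic update combining `$inc` and `$push` on one record). All field names here are invented.

```python
# Sketch: one cart document holding computed fields plus its event history.
# In a real document DB, apply_item_added would be one atomic update.
from datetime import datetime, timezone


def new_cart(cart_id):
    """An empty cart document: summary fields plus nested event list."""
    return {"_id": cart_id, "item_count": 0, "cart_value": 0.0, "events": []}


def apply_item_added(cart, item, price):
    """Apply one 'item added' event, updating summary and history together."""
    event = {
        "type": "item_added",
        "item": item,
        "value": price,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    cart["item_count"] += 1
    cart["cart_value"] += price
    cart["events"].append(event)
    return cart


cart = new_cart("cart-123")
apply_item_added(cart, "book", 35.0)
apply_item_added(cart, "pen", 5.0)
print(cart["item_count"], cart["cart_value"], len(cart["events"]))  # 2 40.0 2
```

Because the summary and the events live in the same record, the document stays self-consistent even without transactions across tables.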
+
+00:24:24.140 --> 00:24:25.680
+I think, I think you gave me this video, right?
+
+00:24:25.780 --> 00:24:27.040
+Event sourcing explained with football.
+
+00:24:27.480 --> 00:24:27.660
+Yeah.
+
+00:24:27.660 --> 00:24:31.580
+And this looks like Fußball-type football, not American football.
+
+00:24:31.980 --> 00:24:32.200
+Indeed.
+
+00:24:32.440 --> 00:24:35.880
+I love American football, but I do believe it's slightly misnamed.
+
+00:24:36.160 --> 00:24:38.340
+Like, you don't use your feet that much, other than the red card.
+
+00:24:38.680 --> 00:24:38.820
+Yeah.
+
+00:24:40.820 --> 00:24:41.520
+True that.
+
+00:24:41.840 --> 00:24:49.200
+I mean, it's like calling Formula One, like, foot car, because you make the car go with your foot, but it's not really the main thing of the sport anyway.
+
+00:24:49.460 --> 00:24:49.680
+Yeah.
+
+00:24:50.480 --> 00:24:51.480
+But yeah, I love this video.
+
+00:24:51.860 --> 00:24:52.020
+Yeah.
+
+00:24:52.140 --> 00:24:52.280
+Okay.
+
+00:24:52.360 --> 00:24:53.120
+Tell me about it.
+
+00:24:53.160 --> 00:24:53.740
+I'll put it in the show notes.
+
+00:24:54.040 --> 00:24:54.200
+Yeah.
+
+00:24:54.260 --> 00:25:00.900
+This is, you know, I feel like event sourcing, it's like one of those things where it's hard to explain until you get it.
+
+00:25:00.920 --> 00:25:02.620
+And then I feel like it's like a good board game.
+
+00:25:02.880 --> 00:25:06.640
+You know, if you have a board game fan in your life, they're like, oh, this is such a great board game.
+
+00:25:06.680 --> 00:25:07.380
+It's so simple.
+
+00:25:07.560 --> 00:25:11.540
+And then they start explaining it and like half an hour later, you're like, are we ever going to actually play this?
+
+00:25:11.580 --> 00:25:12.600
+I don't know that I want to anymore.
+
+00:25:12.920 --> 00:25:18.940
+But I felt like the, this person has done a really good job of kind of really distilling it down and showing like why it matters.
+ +00:25:19.440 --> 00:25:22.360 +And it's a 10 minute video, you know, five minutes at 2X. + +00:25:22.520 --> 00:25:24.300 +And it's, it's really kind of charming. + +00:25:24.520 --> 00:25:24.900 +Really good. + +00:25:25.120 --> 00:25:25.500 +Really well done. + +00:25:25.580 --> 00:25:25.780 +Okay, cool. + +00:25:25.860 --> 00:25:26.120 +Yeah, yeah. + +00:25:26.120 --> 00:25:26.660 +I'll put it on there. + +00:25:26.800 --> 00:25:27.960 +Can you watch videos at 2X? + +00:25:28.340 --> 00:25:29.180 +Oh yeah, all the time. + +00:25:29.420 --> 00:25:29.600 +Yeah. + +00:25:29.840 --> 00:25:30.600 +My daughter does that. + +00:25:30.640 --> 00:25:33.120 +I'm like, how do you actually take it in? + +00:25:33.160 --> 00:25:34.700 +I just, I'm a 1X sort of person. + +00:25:34.880 --> 00:25:38.780 +I do slow it down or rewind to, to pull in things, but yeah. + +00:25:38.780 --> 00:25:39.000 +Let's see. + +00:25:39.100 --> 00:25:42.000 +It's like a, like a seek and then focus sort of deal. + +00:25:42.120 --> 00:25:42.480 +Exactly. + +00:25:42.820 --> 00:25:43.200 +Exactly. + +00:25:43.500 --> 00:25:44.480 +Not that one. + +00:25:44.480 --> 00:25:45.280 +This one. + +00:25:45.580 --> 00:25:47.780 +I also heard there's this ebook I can get. + +00:25:47.860 --> 00:25:48.440 +What's up with this? + +00:25:48.780 --> 00:25:49.080 +Yeah. + +00:25:49.320 --> 00:25:55.100 +So scheduling this podcast episode, I didn't know, you know, what we're going to talk about. + +00:25:55.160 --> 00:25:57.080 +And I know like, there's going to be things I forget. + +00:25:57.380 --> 00:26:08.580 +And I feel like part of the reason I am so excited about this is I know that there's someone like me, the 10 year ago version of me who has heard about these things and is curious, but just needs more information. + +00:26:08.980 --> 00:26:18.280 +And so I've spent the last couple of weeks creating this ebook and I put the first version up and I'm going to continue to improve it as time goes on. 
+
+00:26:18.420 --> 00:26:25.200
+So if anything in this conversation is interesting to you and you just need a little bit more and you want to understand a little bit more what's going on, absolutely download it.
+
+00:26:25.400 --> 00:26:26.580
+And, you know, it's free.
+
+00:26:26.820 --> 00:26:27.260
+So why not?
+
+00:26:27.540 --> 00:26:27.660
+Cool.
+
+00:26:27.820 --> 00:26:32.780
+And then we'll have to make you do a, an audio version, put that on Audible or something like that.
+
+00:26:32.960 --> 00:26:33.540
+Sounds good.
+
+00:26:33.820 --> 00:26:34.360
+Yeah, sure.
+
+00:26:34.620 --> 00:26:42.380
+I always want to do audio books for stuff that I'm working on, but just the concept of trying to speak code or config file, I just like, I got to stop.
+
+00:26:42.580 --> 00:26:43.540
+You know, it's, it's tough.
+
+00:26:43.600 --> 00:26:47.340
+It's a tough balance to do with audio books and tech, like developer stuff, but still.
+
+00:26:47.600 --> 00:26:48.140
+For sure.
+
+00:26:48.500 --> 00:26:48.700
+Yeah.
+
+00:26:48.760 --> 00:26:48.980
+Cool.
+
+00:26:49.040 --> 00:26:49.220
+All right.
+
+00:26:49.220 --> 00:26:50.100
+So people can check that out.
+
+00:26:50.120 --> 00:26:50.620
+It's for free.
+
+00:26:50.740 --> 00:26:51.960
+We'll put that in the show notes.
+
+00:26:51.960 --> 00:27:01.560
+Now, I think that this has both really big possibilities for data science, but also potential challenges.
+
+00:27:02.040 --> 00:27:03.980
+Let me throw it out to you and then you, you take us through it.
+
+00:27:04.200 --> 00:27:06.540
+Super benefits, incredibly obvious.
+
+00:27:06.540 --> 00:27:10.900
+You have an event stream that tells you over time what times the things happened.
+
+00:27:11.020 --> 00:27:17.800
+You have both the additions and subtractions, or the permutations that it goes through, until it ends up in its final state.
+
+00:27:17.800 --> 00:27:26.660
+Not just show me all the customers from California who bought this month, but like show me all the Californians who abandoned the cart, but then came back and then did the, you know what I mean?
+
+00:27:26.660 --> 00:27:29.020
+Like you can just answer way more interesting questions.
+
+00:27:29.020 --> 00:27:30.300
+You got time series.
+
+00:27:30.460 --> 00:27:38.180
+On the other hand, maybe I would just want to load up a pandas data frame with the answers of what's the average cart size during checkout.
+
+00:27:38.360 --> 00:27:43.620
+And that, that becomes like a big computation out of an event-sourced database.
+
+00:27:44.080 --> 00:27:44.800
+If you don't have one of those things.
+
+00:27:45.020 --> 00:27:45.420
+Okay.
+
+00:27:45.480 --> 00:27:46.720
+Well, let's hear it.
+
+00:27:46.840 --> 00:27:47.240
+Let's hear it.
+
+00:27:47.240 --> 00:27:47.680
+I'm going to say it.
+
+00:27:47.680 --> 00:27:56.000
+Like if you, if you don't do one of those caching or multi-database things or CQRS, I don't remember the patterns.
+
+00:27:56.100 --> 00:27:58.640
+Anyway, that looks like it maybe is a little bit of a challenge.
+
+00:27:59.020 --> 00:27:59.700
+You could do it.
+
+00:27:59.760 --> 00:28:09.000
+I think you could do it more easily in pandas, but like maybe I just, you know, some people do data science just through SQL and they just, I'm just going to write queries against a warehouse database, you know?
+
+00:28:09.000 --> 00:28:09.920
+Yeah, absolutely.
+
+00:28:10.120 --> 00:28:11.000
+Why not have it both ways?
+
+00:28:11.660 --> 00:28:21.540
+For example, on my, okay, so let's start the, for the hardcore data scientist, actually, you know, I don't even think the event store is the right format for them.
+ +00:28:21.740 --> 00:28:34.720 +You know, I would definitely have some kind of script that would run on some kind of loop that, you know, maybe every day or every couple hours or whatever would transform the raw events into some format that would be great for. + +00:28:34.880 --> 00:28:35.080 +Sure. + +00:28:35.120 --> 00:28:40.340 +And you hear about all these like OLAP cubes and all these other like super BI type of systems. + +00:28:40.840 --> 00:28:42.800 +None of those, no, no, I can't say none of those. + +00:28:42.860 --> 00:28:45.720 +Many of those are not running out of the operational database. + +00:28:45.820 --> 00:28:49.000 +They're like a, some kind of like warehouse data lake. + +00:28:49.100 --> 00:28:50.860 +We've transformed this so you answer questions. + +00:28:51.000 --> 00:28:52.700 +So it's not necessarily just event sourcing. + +00:28:52.940 --> 00:28:56.860 +Like we just want to avoid five joins so we can just ask the question directly, right? + +00:28:57.080 --> 00:28:57.260 +Yeah. + +00:28:57.540 --> 00:28:57.720 +Yeah. + +00:28:57.780 --> 00:29:07.260 +And in fact, my current project that I have in production at work, it is a service that multiple other services use to process items. + +00:29:07.500 --> 00:29:19.080 +And the project manager of one of these services reached out to me and said that they have a BigQuery table that has all this analytical information and they wanted to add the information we have to theirs. + +00:29:19.460 --> 00:29:22.000 +And so, you know, we set up a conversation. + +00:29:22.280 --> 00:29:26.380 +I created the code and every day I'm sending information to their BigQuery instance. + +00:29:26.860 --> 00:29:33.760 +And three days after we did a go live, you know, I created a meeting to like kind of circle back with them to make sure everything's working the way they wanted. 
+
+00:29:34.220 --> 00:29:40.400
+And when they opened up the BigQuery database, they were shocked because they were expecting three days worth of data.
+
+00:29:40.760 --> 00:29:48.160
+I had, I sent every piece of data I had for months, which was how long they've been sending things to my service.
+
+00:29:48.520 --> 00:29:58.720
+And so, like, I was just, I just, you know, this person was elated because they were like, they knew their data scientists wanted this information and they, now they have all this information going back to day one, so to speak.
+
+00:29:59.480 --> 00:30:07.380
+And also just recently, my boss asked me, like, I have a reports view that's just a webpage that has, like, stats on how my service is doing.
+
+00:30:07.660 --> 00:30:12.460
+And he's like, it'd be nice to have like, some, like a table of some of these, you know, last few days or whatever.
+
+00:30:12.620 --> 00:30:15.040
+And I was like, okay, how many, how many days would you like?
+
+00:30:15.080 --> 00:30:16.000
+And he's like, 30 days.
+
+00:30:16.120 --> 00:30:24.900
+So I created, the HTML was pretty easy and I just created a script to like pull the events out of the event store and populate this table, you know, as exactly as we needed.
+
+00:30:25.060 --> 00:30:28.260
+It went into production live and we were immediately there with 30 days of history.
+
+00:30:28.260 --> 00:30:29.880
+It was, it was so exciting.
+
+00:30:29.880 --> 00:30:38.640
+And, and like, this is what I get to experience every week is like, you know, having the ability to like, go back into history and answer questions that we've had that we didn't even think we knew.
+
+00:30:38.900 --> 00:30:42.200
+We didn't have any idea that we would want to know, you know, a month ago.
+
+00:30:42.260 --> 00:30:45.700
+And to be able to answer those questions with precision is, is intoxicating.
+
+00:30:46.080 --> 00:30:46.200
+Yeah.
+
+00:30:46.460 --> 00:30:47.420
+I certainly see the value.
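The script described above, pulling events back out of the store to build a report that is instantly backfilled with history, can be sketched roughly like this. The event shape and field names are invented for illustration; the point is that a fold over the raw events yields a per-day summary table going back to day one.

```python
# Sketch: replay raw events out of the event store and fold them into
# one summary row per day. Event shape is hypothetical.
from collections import defaultdict
from datetime import date

# Imagine these were read back from the event store, going back to day one.
events = [
    {"type": "item_processed", "day": date(2025, 6, 1)},
    {"type": "item_processed", "day": date(2025, 6, 1)},
    {"type": "item_failed", "day": date(2025, 6, 1)},
    {"type": "item_processed", "day": date(2025, 6, 2)},
]


def daily_rollup(events):
    """Count events per day, per event type."""
    rows = defaultdict(lambda: defaultdict(int))
    for e in events:
        rows[e["day"]][e["type"]] += 1
    return {day: dict(counts) for day, counts in sorted(rows.items())}


report = daily_rollup(events)
print(report[date(2025, 6, 1)])  # {'item_processed': 2, 'item_failed': 1}
```

The same fold could just as well emit rows for BigQuery or a pandas DataFrame; since the events are the source of truth, the report exists for the entire history the moment the script first runs.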
+
+00:30:47.500 --> 00:30:50.900
+Like you don't necessarily know the questions you're going to ask.
+
+00:30:50.920 --> 00:30:56.240
+And if you don't have enough data or you don't store it in the right way, you literally can't answer them.
+
+00:30:56.380 --> 00:30:56.560
+Yeah.
+
+00:30:56.760 --> 00:30:56.960
+Right.
+
+00:30:57.200 --> 00:31:02.160
+But it sounds like with event sourcing, you can go back and like, well, what if we ask this over time instead of by region?
+
+00:31:02.320 --> 00:31:04.180
+Like, okay, slightly different query, no problem.
+
+00:31:04.500 --> 00:31:04.780
+Exactly.
+
+00:31:05.160 --> 00:31:05.340
+Yeah.
+
+00:31:05.600 --> 00:31:05.800
+Yeah.
+
+00:31:05.960 --> 00:31:07.580
+It's, it's really quite something.
+
+00:31:07.700 --> 00:31:09.180
+I had no idea.
+
+00:31:09.360 --> 00:31:15.060
+Like my, I kind of mentioned earlier, my biggest thing was, I was, I can't wait to have fast UI.
+
+00:31:15.200 --> 00:31:29.320
+And now that I realized that I feel like our applications obviously serve the primary purpose of whatever it is that the business needs, but I didn't realize how much there was a secondary need of understanding how it works and enabling the business to make decisions
+
+00:31:29.320 --> 00:31:31.980
+based on how customers are actually using the application.
+
+00:31:32.340 --> 00:31:38.680
+We're going to talk about the AI side later, but I do just want to throw out as different constituents who might care to answer these questions.
+
+00:31:38.840 --> 00:31:46.260
+Like I was just thinking, you've got, you've got the operational side of say the website or app or, you know, driving an API for the app or something like that.
+
+00:31:46.380 --> 00:31:47.240
+That's one view.
+ +00:31:47.320 --> 00:32:00.880 +That's kind of the traditional view, but now you have this much more increasingly popular view of like data scientists and BI tools and the CEO wants a dashboard that updates live type, you know, so events are a clear trigger for those kinds of things, right? + +00:32:01.140 --> 00:32:01.500 +Absolutely. + +00:32:01.840 --> 00:32:11.560 +But then also you might ask your AI Opus or Codex or whatever, hey, find me some trends or let's look at this and, you know, it has more to work with as well, right? + +00:32:11.800 --> 00:32:12.000 +Yeah. + +00:32:12.160 --> 00:32:13.920 +Just thinking of the different constituencies, yeah? + +00:32:14.200 --> 00:32:14.500 +Totally. + +00:32:14.880 --> 00:32:23.640 +In fact, just today I was looking into a bug that was happening in production and I asked Claude, hey, can you query the GCP logs? + +00:32:23.740 --> 00:32:26.460 +Can you query the event store and help me understand what's going on? + +00:32:26.480 --> 00:32:30.340 +And it was like, sure enough, here you go and made fixing the bug much easier. + +00:32:32.920 --> 00:32:35.660 +This portion of Talk Python is sponsored by Temporal. + +00:32:36.020 --> 00:32:42.520 +Ever since I had Mason Egger on the podcast for episode 515, I've been fascinated with durable workflows in Python. + +00:32:43.040 --> 00:32:47.700 +That's why I'm thrilled that Temporal has decided to become a podcast sponsor since that episode. + +00:32:48.140 --> 00:32:56.280 +If you've built background jobs or multi-step workflows, you know how messy things get with retries, timeouts, partial failures, and keeping state consistent. + +00:32:56.900 --> 00:33:03.460 +I'm sure many of you have written brutal code to keep the workflow moving and to track when you run into problems, but it's trickier than that. + +00:33:03.640 --> 00:33:08.640 +What if you have a long-running workflow and you need to redeploy the app or restart the server while it's running? 
+
+00:33:09.220 --> 00:33:12.560
+This is where Temporal's open-source framework is a game-changer.
+
+00:33:13.240 --> 00:33:27.480
+You write workflows as normal Python code and Temporal ensures that they execute reliably, even across crashes, restarts, or long-running processes while handling retries, state, and orchestration for you so you don't have to build and maintain that logic yourself.
+
+00:33:27.700 --> 00:33:33.420
+You may be familiar with writing asynchronous code using the async and await keywords in Python.
+
+00:33:33.920 --> 00:33:42.160
+Temporal's brilliant programming model leverages the exact same programming model that you are familiar with but uses it for durability, not just concurrency.
+
+00:33:42.160 --> 00:33:47.040
+Imagine writing await workflow.sleep(timedelta(days=30)).
+
+00:33:47.380 --> 00:33:49.320
+Yes, seriously, sleep for 30 days.
+
+00:33:49.460 --> 00:33:51.380
+Restart the server, deploy new versions of the app.
+
+00:33:51.620 --> 00:33:52.040
+That's it.
+
+00:33:52.220 --> 00:33:53.380
+Temporal takes care of the rest.
+
+00:33:53.920 --> 00:33:58.420
+Temporal is used by teams at Netflix, Snap, and NVIDIA for critical production systems.
+
+00:33:58.920 --> 00:34:01.620
+Get started with the open-source Python SDK today.
+
+00:34:01.940 --> 00:34:04.360
+Learn more at talkpython.fm/Temporal.
+
+00:34:04.660 --> 00:34:06.680
+The link is in your podcast player's show notes.
+
+00:34:07.080 --> 00:34:09.100
+Thank you to Temporal for supporting the show.
+
+00:34:09.840 --> 00:34:11.220
+Yeah, I guess you know why.
+
+00:34:11.360 --> 00:34:26.080
+You have more granularity on what, if the thing in the database doesn't look like you expected, you much, have a much more granular way of knowing like it was this step that made it look like that because I've had problems before where I completely
+
+00:34:26.080 --> 00:34:31.700
+upgraded, swapped out the data access layer for Talk Python training for the courses.
+ +00:34:32.100 --> 00:34:32.200 +Yeah. + +00:34:32.200 --> 00:34:33.600 +And for the website, it was perfect. + +00:34:33.820 --> 00:34:34.600 +Everything was great. + +00:34:34.660 --> 00:34:43.820 +But under certain circumstances on Android, the app was resulting in something, it was sending in something that would make the data not right, right? + +00:34:43.840 --> 00:34:47.300 +Like there was some field that was null instead of just taking on the default value. + +00:34:47.540 --> 00:34:48.100 +Oh, man. + +00:34:48.360 --> 00:34:49.000 +Which is fine. + +00:34:49.060 --> 00:34:57.180 +But then when the person logged in on the website, the website didn't assume that that thing could be null because it was, at a minimum, had a non-nullable default value. + +00:34:57.280 --> 00:34:59.060 +I'm like, why do we need to check this for null? + +00:34:59.060 --> 00:35:01.040 +How did it get to be null? + +00:35:01.100 --> 00:35:01.880 +It makes no sense. + +00:35:01.960 --> 00:35:03.620 +It took forever to figure that out. + +00:35:04.020 --> 00:35:04.340 +Oh, wow. + +00:35:04.480 --> 00:35:08.540 +But with event sourcing, you could see this was the event that made it null. + +00:35:08.660 --> 00:35:09.700 +Not just, it is null. + +00:35:09.800 --> 00:35:11.100 +What in the world is going on? + +00:35:11.180 --> 00:35:12.980 +Why could it, how could it possibly be null? + +00:35:13.380 --> 00:35:14.320 +Yeah, absolutely. + +00:35:14.800 --> 00:35:17.460 +So I think it's got some interesting debugging. + +00:35:17.960 --> 00:35:32.940 +And one more thing, like I know this is quite the data science side, but another constituency could be PCI, HIPAA, GDPR, like all the compliance frameworks you got to deal with for auditing or sort of audit trail + +00:35:32.940 --> 00:35:33.600 +or something that happens. + +00:35:33.700 --> 00:35:37.840 +I mean, a lot of times logs serve that value, but that might be a, they updated the record like, oh, what? 
+ +00:35:38.560 --> 00:35:39.320 +Yeah, totally. + +00:35:39.620 --> 00:35:39.900 +Yeah. + +00:35:40.220 --> 00:35:51.620 +And that, you know, I, even though I've been in insurance and I've been in healthcare, I haven't had anything where I have to certify these things, but like you've, the audit log is the way you interact with everything. + +00:35:51.620 --> 00:35:53.600 +It is the source of truth. + +00:35:53.600 --> 00:36:00.960 +And so, but what's funny is I have worked on teams that created history tables to try to essentially do that work. + +00:36:00.960 --> 00:36:06.680 +and it was like two or three months after I started working there before I learned that that table existed. + +00:36:06.840 --> 00:36:12.640 +And so, there were two or three months of work I should have been putting in the history table that I didn't. + +00:36:12.900 --> 00:36:17.000 +And from what I hear among other developers, a lot of teams work that way. + +00:36:17.060 --> 00:36:21.600 +Like, only a few people really know and understand how to maintain that history table. + +00:36:22.000 --> 00:36:25.080 +And a lot of times, like when they try to replay it, it just doesn't work. + +00:36:25.160 --> 00:36:26.180 +And it's, it's unfortunate. + +00:36:26.180 --> 00:36:30.040 +Yeah, it's like, well, there is history in the history table. + +00:36:30.360 --> 00:36:34.040 +When we run it again, we don't get the same output as the final database. + +00:36:34.160 --> 00:36:34.700 +What's going on? + +00:36:34.920 --> 00:36:35.700 +Yeah, true. + +00:36:36.020 --> 00:36:38.360 +Yeah, but with the event sourcing, it reverses it. + +00:36:38.420 --> 00:36:44.500 +Basically, the events are the source of truth and the other one is some kind of dynamically generated sort of deal, yeah? + +00:36:44.740 --> 00:36:45.480 +Yeah, yeah. + +00:36:45.600 --> 00:36:48.840 +And it's a lot like, you know, a backup strategy. 
+
+00:36:49.100 --> 00:36:51.700
+You know, if you never test your backup strategy, you don't really.
+
+00:36:52.260 --> 00:36:52.700
+Exactly.
+
+00:36:52.960 --> 00:36:54.580
+And I feel like it's the same thing with the history table.
+
+00:36:54.660 --> 00:37:01.660
+And honestly, to be totally honest, event sourcing is similar in that it's easy to accidentally migrate event versions.
+
+00:37:01.980 --> 00:37:16.220
+You know, like for myself, I was working on a new event to kind of, you know, on my app and I introduced a new attribute or actually, I guess it was a full, whatever the point being is, at some point I decided I wanted to change
+
+00:37:16.220 --> 00:37:24.480
+the name of the attribute because it would reflect better what it meant in the domain and not realizing that I had already published that event to production.
+
+00:37:24.480 --> 00:37:39.440
+And so at one point I was, I don't know, I don't remember what I was doing, looking up issues or honestly, it might have been a view that it was rendering that was throwing errors, and I couldn't understand why, and I looked at it and sure enough, it was because I accidentally created a different version
+
+00:37:39.440 --> 00:37:40.180
+of the same event.
+
+00:37:40.500 --> 00:37:47.200
+Thankfully, all I had to do was change the code to say, well, if this attribute doesn't exist, look for this attribute and everything was fixed.
+
+00:37:47.340 --> 00:37:51.800
+But, you know, you can honestly fall into some of those things with event sourcing too if you're not careful.
+
+00:37:51.800 --> 00:37:56.600
+But the nice thing is because the events are still there, you have the ability to recover from them.
+
+00:37:56.860 --> 00:37:58.420
+Let's talk about versioning for a little bit.
+
+00:37:58.680 --> 00:37:59.080
+Sure.
+
+00:37:59.380 --> 00:38:04.760
+On a sort of operational third normal form type of database, you know, you might run a migration.
+
+00:38:04.960 --> 00:38:11.600
+One of the reasons I really like using MongoDB is because I almost never have to run migrations, but that's a different, it's a different debate.
+
+00:38:11.860 --> 00:38:22.740
+However, you might run the migration to say like, okay, we're going to add a column or we're going to split this data apart and move this stuff over here and that over there and then create a foreign key relationship or whatever.
+
+00:38:23.060 --> 00:38:23.280
+Yeah.
+
+00:38:23.520 --> 00:38:26.320
+But I can see if you've got this kind of history of things.
+
+00:38:26.320 --> 00:38:29.400
+Like, let's say, I don't know, how do you deal with versioning, right?
+
+00:38:29.440 --> 00:38:34.100
+Like, I've got these old events and the way you're not storing the current state.
+
+00:38:34.180 --> 00:38:38.760
+So with the migration or something like that, you're like, well, let's just transform the current state into the new state.
+
+00:38:38.880 --> 00:38:44.940
+With these, you've got like old events and new events and they might be in a real way incompatible.
+
+00:38:45.220 --> 00:38:45.420
+Yeah.
+
+00:38:45.680 --> 00:38:46.120
+Yeah, sure.
+
+00:38:46.420 --> 00:38:47.560
+What do you think about with that?
+
+00:38:47.840 --> 00:38:49.380
+You have so many strategies.
+
+00:38:50.200 --> 00:38:52.340
+You just have to choose which one works for your situation.
+
+00:38:52.860 --> 00:39:07.760
+So the first one is kind of what I mentioned just a minute ago, kind of like the MongoDB or I should say document database way of working with things where if you're adding things to an event, adding fields, then if, you know, if the code, you know, most code will be blissfully ignorant
+
+00:39:07.760 --> 00:39:09.640
+that you're adding new attributes to it.
+
+00:39:09.720 --> 00:39:10.980
+So it doesn't really matter.
+
+00:39:11.320 --> 00:39:14.400
+And then those that do care can kind of have fallbacks.
+ +00:39:14.400 --> 00:39:25.100 +And the way that Adam and Martin suggest, you essentially have like upcasters or some kind of code that essentially says like, okay, the previous version, actually, that's two different things. + +00:39:25.180 --> 00:39:25.480 +I'm sorry. + +00:39:25.760 --> 00:39:32.300 +The code that does care about the new attribute, if it encounters an older event that doesn't have that attribute, then you can have a default fallback. + +00:39:32.560 --> 00:39:37.120 +And it's best to have those close to, you know, the domain logic that you want to update. + +00:39:37.220 --> 00:39:41.540 +This is a vertical slice architecture approach as opposed to having like a global upcaster. + +00:39:41.940 --> 00:39:46.760 +Again, I realized that I want to keep going to the upcasting, which is the second option that I've been trying to get to. + +00:39:47.220 --> 00:39:50.100 +But so essentially one is, you know, have a default fallback. + +00:39:50.340 --> 00:40:01.720 +Second one is to have an upcaster that says, okay, you know, especially if they're two different versions, like truly different versions of the event, you know, you have add to shopping or item added and item added version two. + +00:40:02.060 --> 00:40:03.620 +You know, you might have completely different fields. + +00:40:03.620 --> 00:40:08.180 +It's good to have a piece of code that can, you know, upcast to the second one. + +00:40:08.780 --> 00:40:11.000 +You put that in your, like your data access layer. + +00:40:11.000 --> 00:40:19.660 +It might write a thing and say, give me all the items of the cart and it looks at the type or the version flag and then does a little processing or would you rewrite the database? + +00:40:20.140 --> 00:40:21.800 +Oh, well, that's, I was going to say that's your third option. 
+
+00:40:22.440 --> 00:40:30.920
+So let me take a step back and answer your, your last question, which was like, at least to me, the intent, what you're doing with your events is essentially rebuilding state.
+
+00:40:30.920 --> 00:40:37.940
+And so wherever you're in that loop to rebuild the state, you would probably say like, oh, is it this event or is it this event or whatever behave this way?
+
+00:40:37.960 --> 00:40:38.140
+Right there.
+
+00:40:38.140 --> 00:40:44.060
+Like in my story of using one of the data caches, you do a request, it's not in the cache.
+
+00:40:44.120 --> 00:40:46.240
+So you've got to run your build it up code and put it in the cache.
+
+00:40:46.340 --> 00:40:55.760
+And that just, I mean, that part right there would be the part that goes, okay, we're going to transform it differently now and then we'll still store the answer in the cache and the next time they ask, it's just fast.
+
+00:40:55.760 --> 00:40:56.200
+Yeah.
+
+00:40:56.940 --> 00:41:02.480
+So yeah, you would have, the way I would code it is you'd have a piece of code that's listening into the events.
+
+00:41:02.600 --> 00:41:06.140
+So you have, it knows about the old version of the event and the new version of the event.
+
+00:41:06.260 --> 00:41:12.880
+And it knows what format that data needs to be in in that Redis cache or whatever it was that you suggested.
+
+00:41:13.180 --> 00:41:13.640
+Yeah, Valkey.
+
+00:41:13.760 --> 00:41:14.240
+Come on, Valkey.
+
+00:41:14.260 --> 00:41:14.500
+Valkey.
+
+00:41:14.560 --> 00:41:15.300
+Let's go, Valkey.
+
+00:41:15.300 --> 00:41:16.960
+I'm going to remember this by the end of the conversation.
+
+00:41:18.620 --> 00:41:27.980
+So yeah, so what it does is it will be able to say like, okay, this event is an old one, but I can upcast it to a new one and convert it into the format or however we need to save it to Valkey.
+
+00:41:28.240 --> 00:41:32.480
+And then if it's a new one, it also knows, okay, I'm going to convert it this way to save to Valkey.
+
+00:41:33.040 --> 00:41:39.320
+So it's nice that it's kind of localized to, you know, the process that you need to update, upcast.
+
+00:41:39.800 --> 00:41:45.860
+And then we've touched on like kind of the quote unquote nuclear option, which is to apply a transform to your entire event store.
+
+00:41:46.180 --> 00:41:50.180
+And so you would, by doing that, you would create a new database from your database table, which would be your new event store.
+
+00:41:50.320 --> 00:41:58.280
+And you, for each event, you just copy it over, do some kind of map or transform to upcast the entire, your entire history.
+
+00:41:58.920 --> 00:41:59.280
+Yeah.
+
+00:41:59.520 --> 00:42:01.220
+I see value in all of them.
+
+00:42:01.280 --> 00:42:05.860
+You know, if you're doing a lot of direct SQL data science-y stuff, you probably want to transform the database.
+
+00:42:06.160 --> 00:42:12.240
+If it's primarily coming out of an API, like just let that thing handle it as it reads them, you know, computers are fast.
+
+00:42:12.420 --> 00:42:12.940
+Keep it in mind.
+
+00:42:13.160 --> 00:42:13.420
+Indeed.
+
+00:42:13.560 --> 00:42:16.600
+And I forgot to mention that, oh, his name escapes me right now.
+
+00:42:16.640 --> 00:42:22.720
+The guy who popularized event sourcing back in the 2000s wrote a book on this topic specifically.
+
+00:42:22.720 --> 00:42:27.400
+So he kind of listed out all of your strategies and when you would want to choose them and why.
+
+00:42:27.860 --> 00:42:28.920
+Are we talking Martin Dilger?
+
+00:42:29.320 --> 00:42:29.660
+No.
+
+00:42:30.680 --> 00:42:31.340
+Greg Young.
+
+00:42:31.580 --> 00:42:31.960
+Greg Young.
+
+00:42:32.080 --> 00:42:32.260
+Okay.
+
+00:42:32.480 --> 00:42:33.840
+I think it's on LeanPub.
+
+00:42:34.140 --> 00:42:35.120
+Yeah, that sounds about right.
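The versioning strategies just discussed, a default fallback for a missing attribute, an upcast-on-read step, and the "nuclear option" of rewriting the whole store with the same map, can be sketched in a few lines. The event name, the renamed attribute, and the default are all invented examples, not from the guest's actual system.

```python
# Sketch: upcasting a v1 event to a v2 shape. Suppose v2 renamed 'name'
# to 'sku' and added a 'currency' field. All names are hypothetical.

def upcast_item_added_v1_to_v2(event):
    """Rewrite a v1 'ItemAdded' event into the v2 shape."""
    return {
        "type": "ItemAdded",
        "version": 2,
        "sku": event["name"],   # the renamed attribute
        "price": event["price"],
        "currency": "USD",      # default fallback for old events
    }


def load_event(raw):
    """Upcast-on-read: callers only ever see the latest version."""
    if raw.get("version", 1) == 1:
        return upcast_item_added_v1_to_v2(raw)
    return raw


old = {"type": "ItemAdded", "version": 1, "name": "book", "price": 35.0}
new = {"type": "ItemAdded", "version": 2, "sku": "pen", "price": 5.0,
       "currency": "USD"}

assert load_event(old)["sku"] == "book"
assert load_event(new)["sku"] == "pen"

# The "nuclear option" reuses the same map to rewrite the whole store once.
store_v2 = [load_event(e) for e in [old, new]]
assert all(e["version"] == 2 for e in store_v2)
```

Keeping the upcaster next to the code that cares about the new attribute matches the vertical-slice approach mentioned above; running it over the entire store is the one-time migration alternative.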
+ +00:42:35.220 --> 00:42:39.500 +In fact, Martin Dilger's book is cheaper on LeanPub as well and you get additional content there too. + +00:42:39.860 --> 00:42:40.000 +Okay. + +00:42:40.400 --> 00:42:41.380 +Yeah, very interesting. + +00:42:42.280 --> 00:42:45.700 +Let's, let's make this a little bit concrete for people. + +00:42:46.020 --> 00:42:46.260 +Sure. + +00:42:46.520 --> 00:42:55.440 +We talked about how you might, in theory, architect some software, could be Python or something else to follow these patterns, but there is a Python library, right? + +00:42:55.440 --> 00:42:56.480 +Yeah, absolutely. + +00:42:56.820 --> 00:42:57.600 +Do you recommend it? + +00:42:57.840 --> 00:42:58.160 +I do. + +00:42:58.380 --> 00:43:03.840 +You know, I, it's funny, I don't have any production code with it, but I have used it a lot over the years. + +00:43:04.460 --> 00:43:11.780 +John Bywater has done an incredible job maintaining this, this repo and, you know, all his, all the people who have contributed as well. + +00:43:12.040 --> 00:43:17.780 +He has shepherded this through and is, has been really making it such an incredible application. + +00:43:18.580 --> 00:43:25.900 +When I wrote my applications, you know, I, the first one I did, I was like, I want to do it myself so I can understand it. + +00:43:26.140 --> 00:43:28.720 +And so I can really understand and respect what he's done. + +00:43:29.520 --> 00:43:35.700 +But also, you know, there's a part of me that really, like a lot of the, you know, essentially I feel like it depends on who you are. + +00:43:35.700 --> 00:43:40.720 +If you're somebody who wants to grab a framework and run with it, this is an exceptional one to do it with. + +00:43:40.840 --> 00:43:50.800 +It, essentially, you just write Python classes and you decorate them or subclass from some of his classes and all the magic of event sourcing happens for you. 
+

00:43:50.860 --> 00:43:54.200
And it just leaves you with really readable, understandable code.

00:43:54.300 --> 00:44:00.360
And then you'll have other people who, you know, a lot of people in the event sourcing space say it's actually not that complicated.

00:44:00.840 --> 00:44:02.560
You can write your own code to do it.

00:44:02.900 --> 00:44:03.800
And I did.

00:44:03.920 --> 00:44:06.120
And I recommend it for the right type of person.

00:44:06.120 --> 00:44:09.780
For me, it was hard because there's so many decisions that you need to make.

00:44:09.800 --> 00:44:11.520
And I am not the best Python programmer.

00:44:11.680 --> 00:44:15.340
I do not know all the concurrency issues and all these things.

00:44:15.340 --> 00:44:16.580
I'm getting to learn them more.

00:44:17.080 --> 00:44:20.860
But, you know, I have software that's running in production that's, that's doing well.

00:44:20.940 --> 00:44:25.440
So all that to say is, yeah, I highly recommend this package, especially if you're new to it.

00:44:25.580 --> 00:44:30.200
It can really show you, you know, one option of how things can work.

00:44:30.200 --> 00:44:42.020
And I love that, by default, it uses, well, you can use SQLite, Postgres, or a couple of the databases that are optimized for event sourcing.

00:44:42.020 --> 00:44:48.760
And so you can kind of see how it's, you know, some of the many ways you can pattern things to make it easy for you.

00:44:48.760 --> 00:44:49.160
Interesting.

00:44:49.420 --> 00:44:52.600
So I probably need to file a PR or something here.

00:44:52.800 --> 00:44:55.960
It says, the way you get it is pip install eventsourcing.

00:44:56.060 --> 00:44:57.980
I feel like it should be pip install event-sourcing.

00:44:58.680 --> 00:44:59.500
No, just kidding.

00:44:59.580 --> 00:45:00.120
But that's cool.
+

00:45:00.420 --> 00:45:06.820
However, I am, you know, so it's, you know, you can get it off PyPI, but I'm having a hard time resisting pressing this.

00:45:07.180 --> 00:45:08.080
Ask DeepWiki.

00:45:08.480 --> 00:45:09.840
Have you ever gone, what is DeepWiki?

00:45:10.240 --> 00:45:10.640
Yeah.

00:45:11.000 --> 00:45:15.860
It's an AI-powered documentation thing that he opted into, which I thought was very fascinating.

00:45:16.140 --> 00:45:19.540
I, I was following, I was at the time he did it.

00:45:19.540 --> 00:45:20.980
I was, I think I was writing my own code.

00:45:21.060 --> 00:45:22.300
And so I had Slack open.

00:45:22.380 --> 00:45:30.260
He has a good Slack channel and was like showing all the things that he was able to, all the insights that were able to be gleaned from it.

00:45:30.400 --> 00:45:31.040
This is epic.

00:45:31.160 --> 00:45:31.600
I love it.

00:45:31.840 --> 00:45:37.380
So it had, the DeepWiki apparently, like, knows the source and the docs.

00:45:37.480 --> 00:45:38.640
And then it's just a chat.

00:45:38.820 --> 00:45:41.860
And even on fast, I asked, I said, give me an example of using this library.

00:45:41.960 --> 00:45:42.280
So sure.

00:45:42.300 --> 00:45:48.560
Here's a complete example of a dog school application, all the code using eventsourcing, the library, the package.

00:45:48.760 --> 00:45:50.140
This is nuts.

00:45:50.480 --> 00:45:50.660
Yeah.

00:45:51.940 --> 00:45:52.900
Side quest unlocked.

00:45:53.120 --> 00:45:55.700
Must figure out how to get my packages into DeepWiki.

00:45:55.940 --> 00:45:56.580
This is nice.

00:45:57.080 --> 00:46:05.180
And I also want to add, he has other companion packages that, for example, connect into Django and I believe Flask and some other ones too.
+

00:46:05.180 --> 00:46:19.280
So one of my side projects, I'm leveraging this with Django and it's really cool because one of the things it enables you to do is configure your events table to be similar to your,

00:46:19.320 --> 00:46:24.340
I guess in the same database as your Django table or at least configurable from the way Django would do it.

00:46:24.740 --> 00:46:28.440
And yeah, so getting all these read models is very easy with all the migrations.

00:46:28.440 --> 00:46:34.300
You just say, this is what I want my data to look like with Django and of course apply migrations and there it is in production.

00:46:34.520 --> 00:46:35.160
So that's really nice.

00:46:35.180 --> 00:46:35.860
Yeah, cool.

00:46:36.060 --> 00:46:37.920
It also has extension projects.

00:46:38.400 --> 00:46:39.120
What are these?

00:46:39.400 --> 00:46:43.160
The Django one, the KurrentDB, K-U-R-R-E-N-T.

00:46:43.300 --> 00:46:45.640
I imagine KurrentDB is probably a...

00:46:45.640 --> 00:46:50.680
I believe that is, yeah, it's one of the first event sourcing specific databases.

00:46:51.140 --> 00:46:51.960
I think it was called...

00:46:51.960 --> 00:46:54.780
Oh, it's for intelligent and responsive systems.

00:46:55.320 --> 00:46:57.160
I don't know, Chris, I just got to rant a little bit.

00:46:57.240 --> 00:47:05.920
Like there's all these projects that are cool and they do neat stuff and now I feel when I go to them that it's like, this is the AI compute data frame or this is the AI, the intelligent AI.

00:47:06.060 --> 00:47:10.500
It's like, it's just a database or just a data frame that AIs can use.

00:47:10.560 --> 00:47:14.740
That doesn't make it an AI data frame, you know, it's like, but they all want to capture the excitement.

00:47:14.880 --> 00:47:15.600
It drives me crazy.

00:47:15.980 --> 00:47:16.200
Yeah.
+

00:47:16.500 --> 00:47:18.880
And what's worse is when they don't even say exactly what they do.

00:47:18.880 --> 00:47:22.640
It is, it is your answer for AI-ing the thing that we're not going to tell you.

00:47:22.740 --> 00:47:23.300
I hate it.

00:47:23.420 --> 00:47:23.580
Yeah.

00:47:24.300 --> 00:47:24.880
Yeah, exactly.

00:47:24.960 --> 00:47:27.780
And it just obscures what the heck it is, but it's the H1 and the H2.

00:47:27.880 --> 00:47:28.780
You're like, oh my gosh.

00:47:29.020 --> 00:47:29.280
Yeah.

00:47:29.780 --> 00:47:30.140
Yeah.

00:47:30.200 --> 00:47:30.420
Okay.

00:47:30.480 --> 00:47:32.140
But that does look pretty interesting.

00:47:32.360 --> 00:47:39.980
Like, yeah, look, its example is create a client, new event data, new UUID, et cetera, order placed, serialize.

00:47:39.980 --> 00:47:48.700
So right here, this example is basically it's got primary key, a category or type of event, like a, just an event, I guess is the way you would call it.

00:47:48.800 --> 00:47:53.380
But then it has JSON serialized, like a JSON blob.

00:47:53.580 --> 00:47:55.120
That is the details of the event.

00:47:55.200 --> 00:47:56.600
Is this how you typically do it?

00:47:56.920 --> 00:47:57.760
I would say so.

00:47:57.760 --> 00:48:01.280
Or is it more column oriented where like this one has an order ID and a total.

00:48:01.380 --> 00:48:05.760
So you might have an order ID and a total in the data structure or is it in a blob level?

00:48:06.100 --> 00:48:06.300
Yeah.

00:48:06.440 --> 00:48:16.880
The implementations I've seen generally will have some kind of blob or JSON serialized or bytecode serialized optimization of it.

00:48:17.000 --> 00:48:22.060
You know, because each event, you know, when you're saving things to the database, you know, you're going to save an event.

00:48:22.140 --> 00:48:23.340
It's going to have an event stream.
+ +00:48:23.340 --> 00:48:26.060 +It's going to have generally speaking, there's probably an event version. + +00:48:26.520 --> 00:48:29.780 +Like there's all these specific things, but the actual payload of the event. + +00:48:29.780 --> 00:48:32.960 +If there's not an event version, you're going to wish there was an event version at some point probably. + +00:48:33.360 --> 00:48:33.760 +Exactly. + +00:48:34.100 --> 00:48:34.380 +Yes. + +00:48:34.700 --> 00:48:39.200 +And so the payload is usually some kind of blob or JSON body or something like that. + +00:48:39.340 --> 00:48:41.440 +It sounds very good to be a document database. + +00:48:41.860 --> 00:48:42.180 +Indeed. + +00:48:42.580 --> 00:48:42.800 +Yeah. + +00:48:43.040 --> 00:48:43.300 +Yeah. + +00:48:43.400 --> 00:48:50.480 +Because you can put indexes on like the sub items and then if they're not in that event, it just doesn't use the index for those particular ones. + +00:48:50.480 --> 00:48:51.760 +I mean, it's a lot of things. + +00:48:52.080 --> 00:48:52.180 +Yeah. + +00:48:52.520 --> 00:48:53.100 +Yeah, exactly. + +00:48:53.200 --> 00:48:53.740 +It's pretty sweet. + +00:48:54.000 --> 00:48:54.160 +Yeah. + +00:48:54.300 --> 00:49:06.860 +And again, well, I haven't said it quite, but like one of the things I have been thinking about for a decade is the more I kind of thought about it, the event sourcing and the patterns and unlocks really gives you so much flexibility. + +00:49:07.380 --> 00:49:13.220 +You know, you can use document data stores and like really take those, the power of that. + +00:49:13.400 --> 00:49:19.480 +You know, if you have, and part of this is too is just data, you know, vertical slicing or whatever, it's both multiple patterns put together. + +00:49:19.580 --> 00:49:28.680 +But like, you know, if you have a view that would be so much better served by having a graph query, a graph database, then use it. 
+ +00:49:28.800 --> 00:49:39.220 +You know, it's, I remember at one time it took me a while, but like somebody told me that they were using, I can't remember their, their open query. + +00:49:39.380 --> 00:49:40.040 +Is that what it's called? + +00:49:40.160 --> 00:49:47.340 +The, I forgot the old name of what I'm trying to think of, but essentially it's, you know, like a database that optimizes for saving text. + +00:49:47.340 --> 00:49:48.860 +So you can like search for it. + +00:49:49.300 --> 00:49:55.440 +You know, they use that as just for one, you know, to serve the purpose of one item. + +00:49:55.740 --> 00:49:57.540 +And honestly, this isn't unique to event sourcing. + +00:49:57.620 --> 00:50:00.120 +You can do this with event-driven architecture as well. + +00:50:00.380 --> 00:50:08.360 +But what I love about event sourcing is like you have the benefits of event-driven architecture and the benefits of a monolith in one if you choose to go that way. + +00:50:08.840 --> 00:50:16.060 +And yeah, it's just, it's, I guess really what it comes down to is what I love about it and was surprised by is how flexible it gives you the ability. + +00:50:16.500 --> 00:50:27.680 +Yeah, in the book, it reminds me, one of my users was complaining about the status screen that I show for the users and he had all these great ideas and I was like, you know what? + +00:50:27.880 --> 00:50:29.300 +I want to take advantage of that. + +00:50:29.420 --> 00:50:39.460 +So I actually cloned my, the vertical slice for that view and created a new database column or collection for that, to power that view. 
+

00:50:39.460 --> 00:50:54.020
and we iterated and iterated to make this thing better and with each iteration, sometimes I needed to change how the read model reacted to events and so I could just blow away the read model, regenerate it from events and we ended up with something really great so that when it

00:50:54.020 --> 00:51:03.300
was time to go live, I just changed which, where the URL went, pointed to the new one and was able to delete the old code and delete the database table and it was wonderful.

00:51:03.720 --> 00:51:08.180
Yeah, it just gives you so much flexibility to do whatever you need.

00:51:08.900 --> 00:51:14.480
So, a couple more, one more, I guess one more really relevant thing, two more things to give a shout out with this event sourcing.

00:51:14.580 --> 00:51:18.000
There's eventsourcing-django, which is the Python package for it with Django.

00:51:18.320 --> 00:51:20.680
Imagine that probably somehow it upgrades with the ORM, don't know.

00:51:20.900 --> 00:51:23.280
But also eventsourcing-sqlalchemy, which is kind of cool.

00:51:23.540 --> 00:51:25.480
So if you use SQLAlchemy, yeah, very nice.

00:51:25.760 --> 00:51:36.680
All right, so this stuff is great but I imagine that it has times you should use it maybe more and times you should go, well, square peg round hole maybe not this time.

00:51:36.860 --> 00:51:37.700
Sure, yeah.

00:51:37.760 --> 00:51:38.100
What do you think?

00:51:38.260 --> 00:51:45.000
For me, I feel like it's usually the way I think about it first is because most people are very comfortable with not using event sourcing, right?

00:51:45.160 --> 00:51:47.860
And so I usually answer it the opposite way which is when should you?

00:51:48.040 --> 00:51:48.700
Sure, exactly.
+

00:51:49.080 --> 00:52:03.600
The two biggest, the best piece of advice that I heard over the last decade was number one, use it, a good opportunity to use it is if you have a database column called status because if you have a column called status then that means that

00:52:03.600 --> 00:52:07.040
one item can be in multiple different statuses, right?

00:52:07.080 --> 00:52:19.980
Different states and if you're having different states, each state behaves differently in some way or form and so you are definitely not dealing with true CRUD create, read, update, delete patterns and so event sourcing would be a great option for that.

00:52:20.200 --> 00:52:24.960
The second piece of advice is do you ever, are you ever concerned about losing data?

00:52:25.200 --> 00:52:30.500
Because by default event sourcing does not and what it enables you to do is choose when to lose data, right?

00:52:30.500 --> 00:52:32.380
Because you don't have to keep every event around forever.

00:52:32.800 --> 00:52:37.400
You can just say like after 90 days let's just put it to cold storage or just delete it, you know, it depends.

00:52:37.580 --> 00:52:37.880
Yeah, exactly.

00:52:38.040 --> 00:52:42.320
Out in the audience Mike says, I'm scared of the physical storage requirements of this potentially.

00:52:42.700 --> 00:52:47.500
I guess it depends how many data, how many events make up a final state in your system.

00:52:47.740 --> 00:52:49.620
Like a cart checkout, big deal?

00:52:49.800 --> 00:52:50.540
No, probably not.

00:52:50.860 --> 00:52:53.840
Like if used as an app log, that might be a problem.

00:52:54.100 --> 00:52:54.720
Yeah, yeah.

00:52:54.820 --> 00:53:02.660
Most, I'll say models, will have maybe a dozen events, maybe two, depends on your, obviously depends on your domain.
+

00:53:03.180 --> 00:53:12.640
But ideally, you will keep your events short and they have practices called closing the books where you will use events in your domain to kind of keep it short.

00:53:12.740 --> 00:53:25.580
So like, for example, a store will want to know their revenue across the entire year, but every night they shut down, they get their cash registers or if they still have cash registers and they kind of reconcile how much money they made that day.

00:53:25.880 --> 00:53:28.860
And so, you know, kind of keeping your event stream short really helps.

00:53:29.080 --> 00:53:37.440
If you're going to go back, you would just say, well, we'll just read the daily summary and then add today's events or something like to get the final output, something like that.

00:53:37.680 --> 00:53:37.900
Yeah.

00:53:38.040 --> 00:53:46.000
Unless you want to go all the way back to day one, in which case you can, you know, read and say like, okay, the, you know, it all depends on how you want to do it, right?

00:53:46.060 --> 00:53:57.060
This is again, the flexibility side of it because you could just like, say, start from today and read forward or you can start from today and say, okay, what was the event stream before this and read that and keep going back to the originating?

00:53:57.320 --> 00:54:02.680
And I think you put up Mike's comment that said, I'm afraid of the physical storage requirements.

00:54:02.820 --> 00:54:04.480
And it's like, that is the trade-off.

00:54:04.580 --> 00:54:14.440
There is, it will take more space, but thankfully, storage space is the cheapest commodity in all of online or, you know, in today's world.

00:54:15.200 --> 00:54:20.800
And it's, and most expensive is memory and then compute and then storage and then bandwidth.

00:54:20.900 --> 00:54:22.760
I think that's probably the breakdown, right?

00:54:22.960 --> 00:54:23.880
Yeah, I think so.
+

00:54:24.200 --> 00:54:29.580
And why things like DiskCache are awesome versus like another thing that's just in a memory cache, but another process, right?

00:54:29.680 --> 00:54:30.440
Like, exactly.

00:54:30.840 --> 00:54:31.060
Yeah.

00:54:31.300 --> 00:54:45.040
And having the ability to say like, you know, we, you know, like my current application, I have not yet deleted any events, but truly like the only reason we have events older than even a week are just for analytical purposes and, and just me understanding how our system works.

00:54:45.040 --> 00:54:49.700
And so I'm planning to make a way to offload that event or those events.

00:54:49.860 --> 00:54:57.780
And a lot of people just put them straight to cold storage, you know, just so that they always have a backup just in case, but, you know, chances are they rarely ever use it.

00:54:57.960 --> 00:54:58.160
Interesting.

00:54:58.600 --> 00:55:02.840
And one other thing I did want to add is to go answer your question when not to use event sourcing.

00:55:02.980 --> 00:55:03.280
Exactly.

00:55:03.700 --> 00:55:07.880
Would be essentially like, you know, so let's say you don't care about losing data.

00:55:08.360 --> 00:55:12.360
There are just a number of just simple applications that are truly CRUD, right?

00:55:12.420 --> 00:55:15.420
Like I've worked on a number of these where they're just forms over data.

00:55:16.080 --> 00:55:18.540
It's exactly the term I was thinking, forms over data.

00:55:18.780 --> 00:55:20.060
Defining that for people if they don't know.

00:55:20.280 --> 00:55:30.540
Yeah, it's essentially something where like in my case, one of the first ones I worked on is like you have a web page that almost exactly mirrors the database table that you're saving the data to.

00:55:30.800 --> 00:55:31.940
You know, maybe it's a contact form.

00:55:32.360 --> 00:55:33.620
Who knows what it could be?
+

00:55:33.760 --> 00:55:41.180
You know, the idea is like there is so the web UI or whatever you're building is just an easy way to get data into the database.

00:55:41.180 --> 00:55:44.200
And chances are you don't have a status field.

00:55:44.340 --> 00:55:49.200
You don't have all these different ways of different rules for how things behave.

00:55:49.400 --> 00:55:53.960
And in fact, in my event source application, I have a model that is not event sourced.

00:55:54.200 --> 00:55:56.000
It is truly a CRUD model.

00:55:56.180 --> 00:56:00.560
And so just by saying, you know, adopting event sourcing doesn't mean you have to do it for everything.

00:56:00.680 --> 00:56:05.660
You can use it for even just a small bit of your project, especially if you want to try it out and see how it could be.

00:56:05.660 --> 00:56:06.260
That's a really good point.

00:56:06.400 --> 00:56:07.940
It's not an all or nothing sort of thing.

00:56:07.940 --> 00:56:14.980
Because you have a properly factored data access layer and you're not doing that inside of your Jinja template, are you?

00:56:15.220 --> 00:56:15.320
No?

00:56:16.180 --> 00:56:16.580
Right.

00:56:17.940 --> 00:56:22.660
Not even in your view, but like you've got just an opaque layer of actions.

00:56:22.820 --> 00:56:25.040
Some of those actions can be driven by events.

00:56:25.420 --> 00:56:27.340
Some of those actions can just be straight CRUD.

00:56:27.540 --> 00:56:29.220
Create, read, update, delete for those who don't know.

00:56:29.440 --> 00:56:42.040
One of the people who inspired me to really dig into event sourcing, he has a line of business that, well, he'll go to a company who is struggling because their database schema is holding them back.

00:56:42.360 --> 00:56:49.940
You know, for whatever decision they made, they cannot, they're having such a hard problem, hard time creating a new feature because of their database schema.
+

00:56:50.200 --> 00:57:01.020
So he goes in, teaches them event sourcing and uses the event sourcing event store to publish both the dream schema that they wish they would have and the old schema.

00:57:01.400 --> 00:57:15.280
And they live side by side and the event, you know, once the features are complete, they'll, you know, put it up and they'll start slowly migrating traffic over to the new event source version and, you know, eventually they can delete the old database table or schema, you know, database.

00:57:15.700 --> 00:57:21.280
And most of the teams he's worked with have kept with the event source version and gone on from there.

00:57:21.560 --> 00:57:21.720
Yeah.

00:57:21.860 --> 00:57:30.240
Oh, and then finally, one other thing I want to mention too is when not to use is it's up to your teammates because, you know, I am sold.

00:57:30.380 --> 00:57:33.020
I think this is such an incredible pattern.

00:57:33.140 --> 00:57:41.680
It is just unlocking so much joy again and so much flexibility as I've said before that I cannot imagine having to go back.

00:57:41.860 --> 00:57:47.560
That said, if I join a new team and my team members are like, I don't know, I'm going to go with them, you know.

00:57:48.120 --> 00:57:48.480
Yeah.

00:57:49.480 --> 00:57:51.700
One thing is to not use an optimal pattern.

00:57:52.020 --> 00:58:01.860
What's worse is to try to use an optimal pattern but have nobody else want to do it and then they work around it and you, you know, it sounds a little similar to like people who don't want to do unit testing.

00:58:02.200 --> 00:58:02.360
Yeah.

00:58:02.360 --> 00:58:11.580
So some of the people write the unit tests and they set up CI/CD that'll fail if the unit tests fail but then the other people will check in work without running the tests at all and then they break it and you're like, what are you doing?
+

00:58:11.780 --> 00:58:13.360
Like, well, I don't want to run these crappy tests.

00:58:13.480 --> 00:58:16.260
You're like, well, now the whole CI/CD is not just not helpful.

00:58:16.440 --> 00:58:19.380
It's inhibiting me working because you won't even, you know what I mean?

00:58:19.380 --> 00:58:23.700
It's just like, and it seems like you do need a certain level of buy-in for this to make sense.

00:58:24.040 --> 00:58:24.360
Absolutely.

00:58:24.860 --> 00:58:25.060
Yeah.

00:58:25.360 --> 00:58:28.660
And maybe they should listen to this podcast and they can see it.

00:58:28.660 --> 00:58:36.180
And maybe you create an example of one feature in an event sourced way so they can see some of the benefits.

00:58:36.480 --> 00:58:36.820
But, you know.

00:58:37.040 --> 00:58:37.680
Yeah, yeah, yeah.

00:58:37.680 --> 00:58:39.220
Like your partial example, indeed.

00:58:39.480 --> 00:58:39.720
Yeah.

00:58:39.980 --> 00:58:40.260
All right.

00:58:40.640 --> 00:58:45.640
We're getting short on time here, Chris, but let's talk this AI flow.

00:58:45.640 --> 00:58:52.020
First of all, let's circle back to your comment of your company having a mandate to use AI.

00:58:52.440 --> 00:58:54.000
What the heck is going on here?

00:58:54.380 --> 00:58:56.680
How is this received and how are you receiving it?

00:58:56.900 --> 00:59:03.720
And also tell us, are you actually writing, you know, make, shipping more features and being more productive or not?

00:59:03.940 --> 00:59:06.620
Like what, give us your assessment as much as you're willing to share.

00:59:06.700 --> 00:59:07.500
Like you don't have to like.

00:59:08.820 --> 00:59:09.440
Yeah, yeah.

00:59:09.440 --> 00:59:13.560
I will, I will hide certain things, but to say.

00:59:13.560 --> 00:59:16.260
Names and places have been changed to protect the parties.
+

00:59:16.260 --> 00:59:19.300
Yes, and emotions and conversations with multiple other people.

00:59:20.160 --> 00:59:23.840
I would say at times I am so much more productive.

00:59:23.840 --> 00:59:30.240
At times it has brought down the production, you know.

00:59:30.500 --> 00:59:32.820
So it is a mixed case.

00:59:33.400 --> 00:59:39.000
I see that Mike in the chat said it's an overconfident intern and I'm like 110%.

00:59:39.000 --> 00:59:40.500
Like this is exactly what it is.

00:59:40.500 --> 00:59:41.040
But very smart intern.

00:59:41.360 --> 00:59:41.660
It is.

00:59:41.720 --> 00:59:42.220
Oh, absolutely.

00:59:42.380 --> 00:59:43.060
Very confident.

00:59:43.260 --> 00:59:43.840
Oh, well, he said that.

00:59:43.880 --> 00:59:44.420
Yeah, overconfident.

00:59:44.800 --> 00:59:58.620
And I find this fascinating because my production app is actually three services in one monorepo and I'm responsible for one and, you know, a couple other people are responsible for the other ones, but, you know, we're all interacting.

00:59:59.100 --> 01:00:07.960
And so my, when I need to change something on my code, you know, and I'm required to use Claude Code, I say this is what I need to do and generally it does a really great job.

01:00:08.040 --> 01:00:15.600
And I think a lot of this has to do with the vertical slice architecture because vertical slices only hold code that is responsible for one feature.

01:00:15.960 --> 01:00:18.780
And so that really fits very nice into a context window.

01:00:18.780 --> 01:00:19.180
Yeah.

01:00:19.380 --> 01:00:22.900
It doesn't have to scan 200,000 lines and 100 files.

01:00:23.060 --> 01:00:24.000
It looks at five.

01:00:24.260 --> 01:00:24.520
Yeah.

01:00:24.820 --> 01:00:26.160
And it knows event sourcing.

01:00:26.300 --> 01:00:28.560
So it knows, okay, I'm subscribing to events.

01:00:28.700 --> 01:00:29.360
These are the events.
+

01:00:29.440 --> 01:00:31.480
I know where they are, you know, all these different things.

01:00:31.780 --> 01:00:38.360
When I work on one of the other services, it takes a lot more context to understand the state of the code.

01:00:38.540 --> 01:00:45.920
And I really have to work harder to do, to do what I need to do in those, in those parts of the code base.

01:00:46.240 --> 01:00:46.360
Yeah.

01:00:46.600 --> 01:00:48.360
So it's been a very interesting experiment.

01:00:48.500 --> 01:00:56.000
And additionally, kind of when I am, I had to curb this, but when I have been more productive is creating Git worktrees.

01:00:56.320 --> 01:00:59.440
So it's like, I have a kind of main repo that I work out of.

01:00:59.480 --> 01:01:04.920
And then I, if I have a feature that I, you know, I'm like looking at the code base and like, oh, or the web app or the logs.

01:01:04.920 --> 01:01:06.960
And I'm like, oh, it'd be good to like optimize this.

01:01:07.060 --> 01:01:11.580
Then I create a new worktree and set Claude up in there and get it working on a thing.

01:01:11.680 --> 01:01:18.220
And so I have found that I can only, I need to limit myself to two or three worktrees because any more than that, I start losing context.

01:01:18.220 --> 01:01:19.960
And now I know what the LLM is.

01:01:20.300 --> 01:01:21.000
Yeah, exactly.

01:01:21.240 --> 01:01:28.460
If you over, overdo it, it's, you just send off five agents and don't look, like that's how you end up like, oh, we have kind of like bugs in our code about architecture.

01:01:28.600 --> 01:01:30.000
Well, you never look at it.

01:01:30.280 --> 01:01:35.520
It's like, we got the super energetic, super smart intern and we kicked him off and said, go on that feature.

01:01:35.660 --> 01:01:37.780
But you know, they need guidance, right?

01:01:38.020 --> 01:01:40.700
All the tests are passing because I changed all the tests to pass.
+

01:01:40.860 --> 01:01:41.280
I know.

01:01:41.440 --> 01:01:43.440
The problematic data has been removed from the database.

01:01:43.640 --> 01:01:44.080
It works now.

01:01:45.240 --> 01:01:46.080
Why is it empty?

01:01:46.080 --> 01:01:46.480
Yeah.

01:01:48.320 --> 01:01:49.780
Back to your backup comments.

01:01:50.220 --> 01:01:55.300
Honestly, I'm having like an insane amount of productivity with Claude Code and with AI and stuff.

01:01:55.500 --> 01:01:57.200
But it's an engineering skill.

01:01:57.360 --> 01:01:59.780
It is not just, let's fire it up and ask for it.

01:01:59.820 --> 01:02:07.500
Like one of the things I'm doing lately that I'm really appreciating is going through like a planning session, which I know a lot of people do that and like talk about it.

01:02:07.620 --> 01:02:22.220
But, and now, if you have the GitHub CLI installed, just the gh thing, you can tell it, hey, create, you know, instead of just running this plan, create a GitHub issue of this plan, write all the details in GitHub and then your next comment

01:02:22.220 --> 01:02:26.680
can be, let's work on issue 127 and it'll go work on it when it gets done.

01:02:26.780 --> 01:02:29.600
Like, let's make a retrospective comment on the issue.

01:02:29.740 --> 01:02:31.960
Let's create a PR that closes that issue.

01:02:32.120 --> 01:02:41.860
That's, you're like, there's some really interesting team dynamics that you can put in there that, you know, talking to a chatbot is not covered, but if you know what, you know what to ask for.

01:02:41.860 --> 01:02:48.780
Yeah, I'm really inspired by Martin and Adam because both of them in one way or another have, let me take a step back.

01:02:48.980 --> 01:02:50.640
They, I mentioned the event modeling diagram.
+

01:02:51.160 --> 01:03:00.900
It is a visual diagram that really has a reduced visual language and what was mind-blowing to us a couple years ago was that AI understands it.

01:03:01.120 --> 01:03:07.860
And so, the fact that you can essentially say like, here's the diagram, can you implement the slice and it can get you from, well, let me take a step back.

01:03:08.140 --> 01:03:16.220
Martin and Adam have both had successful research spikes where they took an event modeling diagram.

01:03:16.360 --> 01:03:23.280
Actually, no, they even did, what they did even worse was they started with a conversation with a client and recorded it, created the transcript.

01:03:23.280 --> 01:03:24.120
They generated the diagram.

01:03:24.400 --> 01:03:38.100
Generated the diagram and then generated code from the diagram that didn't solve everything but got it, I think, 80 or 85% of the way there in hours from, you know, like cutting months of work down to weeks is impressive.

01:03:38.360 --> 01:03:52.700
And I've had some similar, you know, I'm still working on my personal one because like, I just, you know, after work, I just tend to shut down my computers and I'm not like dedicated to like really going at it but like, I found some really incredible benefits

01:03:52.700 --> 01:03:53.580
of doing something like that.

01:03:53.680 --> 01:03:54.140
That's awesome.

01:03:54.380 --> 01:03:55.900
I created this open source project.

01:03:56.100 --> 01:04:05.420
I mean, it's more source open, whatever, it's not really a project but I called it Python Package Guides for Agents and all the projects that I work on, I'll go and download the source and the documentation.
+

01:04:05.840 --> 01:04:19.880
So like, if I'm working on DiskCache, I'll like literally clone it, clone the documentation and then make Claude write a super detailed, like, not use their documentation or its old version but like, have it like legit, write down examples, study the source code, study the documentation,

01:04:20.280 --> 01:04:34.000
source code like trumps documentation because if the docs are out of date and so on and then I'll drop, you know, if I'm using like two of these like maybe data classes and DiskCache, I'll drop those things into my project and tell Claude about them and that's been a pretty neat thing to do as well.

01:04:34.100 --> 01:04:34.300
Yeah.

01:04:34.500 --> 01:04:38.320
But I want to leave this portion of our conversation with an incredible joke.

01:04:38.860 --> 01:04:39.220
Okay.

01:04:39.500 --> 01:04:39.860
Okay.

01:04:40.120 --> 01:04:42.120
Just because I feel like this has to be said right now.

01:04:42.300 --> 01:04:48.740
It's just so, the joke, this is the word Copilot, it could be Claude, it could be Codex, it could be Chat, whatever, like just AI, right?

01:04:48.960 --> 01:04:49.880
Friends outside of tech.

01:04:50.220 --> 01:04:51.340
Lol, Copilot is dumb.

01:04:51.500 --> 01:04:52.020
Friends in tech.

01:04:52.020 --> 01:04:52.680
In tech.

01:04:52.940 --> 01:04:55.500
I just bought iodine tablets and I've made an offer on land out of state.

01:04:55.580 --> 01:05:01.380
My supplies of antibiotics and potable water are sufficient but I need to set up for the hydroponics to make it through the first few years.

01:05:01.580 --> 01:05:03.020
Like I feel like that's where we are, you know?

01:05:03.460 --> 01:05:05.240
Yeah, yeah, totally, totally.

01:05:06.120 --> 01:05:10.040
And that maybe also sums up your meetup as well.

01:05:10.340 --> 01:05:11.260
Yeah, yeah, yeah.

01:05:11.560 --> 01:05:12.520
Yeah, absolutely.
+

01:05:12.760 --> 01:05:16.660
It's quite a spectrum, both of friends outside tech and inside tech.

01:05:17.000 --> 01:05:17.460
Yeah, exactly.

01:05:17.620 --> 01:05:19.860
Like it's more like believers and non-believers.

01:05:19.980 --> 01:05:20.680
I'm not really sure.

01:05:20.680 --> 01:05:21.400
All right.

01:05:21.400 --> 01:05:22.460
Final call to action.

01:05:22.580 --> 01:05:23.720
What do we got here?

01:05:24.000 --> 01:05:24.780
People are interested.

01:05:24.920 --> 01:05:25.500
They want to get started.

01:05:25.600 --> 01:05:26.080
What do you tell them?

01:05:26.680 --> 01:05:28.580
Get your ebook, your free ebook that you put up?

01:05:28.580 --> 01:05:28.960
That's right.

01:05:29.260 --> 01:05:32.320
Yeah, so my website is everydaysuperpowers.dev.

01:05:32.600 --> 01:05:46.260
If you want an ebook that kind of introduces you into event sourcing and kind of gives you kind of this kind of fundamental background and a couple other things, go to everydaysuperpowers.dev slash es intro and it'll take you right there.

01:05:46.260 --> 01:05:54.720
I'm on Mastodon mostly, but I'm also on Bluesky and sometimes X, underscore chrismay on all of those.

01:05:55.120 --> 01:05:57.680
With Mastodon, I think I'm on fosstodon.org.

01:05:57.880 --> 01:05:58.340
What else?

01:05:59.080 --> 01:06:01.580
I mentioned everydaysuperpowers, so that's...

01:06:01.580 --> 01:06:06.180
I also have a Discord from there too, so if you go through my website, you can see how you can join that.

01:06:07.040 --> 01:06:07.200
Sweet.

01:06:07.320 --> 01:06:08.940
Maybe check out the event sourcing library.

01:06:09.280 --> 01:06:09.840
100%.

01:06:09.840 --> 01:06:10.260
For them people.

01:06:10.260 --> 01:06:10.700
Yeah.

01:06:10.860 --> 01:06:11.080
Yeah.

01:06:11.340 --> 01:06:12.400
And if you...

01:06:12.400 --> 01:06:13.780
Oh, oh, the...
+

01:06:13.780 --> 01:06:21.360
Martin and Adam have a podcast called the Event Modeling and Event Sourcing Podcast, which is verbosely named, but it also is really great.

01:06:21.540 --> 01:06:25.260
You know, this is how I learn just from these great people, you know.

01:06:25.460 --> 01:06:25.780
There you go.

01:06:26.000 --> 01:06:27.560
Just kind of every...

01:06:27.560 --> 01:06:30.920
Almost every week talking about patterns they do and stuff like that.

01:06:30.960 --> 01:06:36.960
They also talk about a bunch of other stuff that isn't relevant, but I've learned so much from listening to them and they also have Discords.

01:06:37.180 --> 01:06:37.360
So go ahead.

01:06:37.500 --> 01:06:37.720
Cool.

01:06:37.820 --> 01:06:38.820
I'm honestly impressed.

01:06:39.140 --> 01:06:41.180
Like an entire podcast on a single design pattern.

01:06:41.300 --> 01:06:41.640
Let's go.

01:06:42.360 --> 01:06:43.640
That's commitment to it.

01:06:43.880 --> 01:06:44.100
Yeah.

01:06:44.420 --> 01:06:44.640
Yeah.

01:06:44.740 --> 01:06:45.480
And it's incredible.

01:06:45.660 --> 01:06:50.440
I mean, as someone who has played the board game and really enjoys it, it's amazing.

01:06:51.540 --> 01:06:56.660
Well, Chris, I really appreciate you coming on here and sharing all your experience and excitement and all the things.

01:06:56.940 --> 01:06:57.440
Great to talk to you.

01:06:57.680 --> 01:06:57.960
Likewise.

01:06:58.120 --> 01:06:58.720
Thanks for having me.

01:06:59.780 --> 01:07:02.140
This has been another episode of Talk Python To Me.

01:07:02.280 --> 01:07:03.260
Thank you to our sponsors.

01:07:03.440 --> 01:07:04.720
Be sure to check out what they're offering.

01:07:04.900 --> 01:07:06.280
It really helps support the show.

01:07:06.780 --> 01:07:09.340
This episode is sponsored by Sentry's Seer.
+

01:07:09.340 --> 01:07:12.120
If you're tired of debugging in the dark, give Seer a try.

01:07:12.620 --> 01:07:17.920
There are plenty of AI tools that help you write code, but Sentry's Seer is built to help you fix it when it breaks.

01:07:18.460 --> 01:07:26.420
Visit talkpython.fm/sentry and use the code talkpython26, all one word, no spaces, for $100 in Sentry credits.

01:07:27.220 --> 01:07:30.800
And it's brought to you by Temporal, durable workflows for Python.

01:07:31.080 --> 01:07:37.780
Write your workflows as normal Python code and Temporal ensures they run reliably, even across crashes and restarts.

01:07:38.280 --> 01:07:41.080
Get started at talkpython.fm/Temporal.

01:07:41.800 --> 01:07:54.220
If you or your team needs to learn Python, we have over 270 hours of beginner and advanced courses on topics ranging from complete beginners to async code, Flask, Django, HTMX, and even LLMs.

01:07:54.460 --> 01:07:56.880
Best of all, there's no subscription in sight.

01:07:57.320 --> 01:07:59.060
Browse the catalog at talkpython.fm.

01:07:59.720 --> 01:08:04.400
And if you're not already subscribed to the show on your favorite podcast player, what are you waiting for?

01:08:04.400 --> 01:08:06.880
Just search for Python in your podcast player.

01:08:06.980 --> 01:08:07.860
We should be right at the top.

01:08:08.000 --> 01:08:11.160
If you enjoyed that geeky rap song, you can download the full track.

01:08:11.260 --> 01:08:13.180
The link is actually in your podcast player's show notes.

01:08:13.760 --> 01:08:15.300
This is your host, Michael Kennedy.

01:08:15.500 --> 01:08:16.800
Thank you so much for listening.

01:08:16.980 --> 01:08:17.760
I really appreciate it.

01:08:18.180 --> 01:08:18.940
I'll see you next time.
+ +01:08:18.940 --> 01:08:48.940 +to the show toHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHD + +01:08:48.940 --> 01:09:18.920 +Thank you. From fcf594191513313e6411f5e3bb0662d79fcbaf37 Mon Sep 17 00:00:00 2001 From: Michael Kennedy Date: Wed, 6 May 2026 14:30:41 -0700 Subject: [PATCH 16/16] tx fixes. --- transcripts/548-event-sourcing-with-chris-may-transcript.txt | 2 +- transcripts/548-event-sourcing-with-chris-may-transcript.vtt | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/transcripts/548-event-sourcing-with-chris-may-transcript.txt b/transcripts/548-event-sourcing-with-chris-may-transcript.txt index 9d2e5b5..97610ca 100644 --- a/transcripts/548-event-sourcing-with-chris-may-transcript.txt +++ b/transcripts/548-event-sourcing-with-chris-may-transcript.txt @@ -2134,7 +2134,7 @@ 01:08:18 I'll see you next time. -01:08:18 to the show toHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHD +01:08:18 to the show. 01:08:48 Thank you. 
diff --git a/transcripts/548-event-sourcing-with-chris-may-transcript.vtt b/transcripts/548-event-sourcing-with-chris-may-transcript.vtt index 40fe32f..d613c47 100644 --- a/transcripts/548-event-sourcing-with-chris-may-transcript.vtt +++ b/transcripts/548-event-sourcing-with-chris-may-transcript.vtt @@ -3205,7 +3205,7 @@ I really appreciate it. I'll see you next time. 01:08:18.940 --> 01:08:48.940 -to the show toHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHDHD +to the show. 01:08:48.940 --> 01:09:18.920 Thank you.