James Phoenix — Mastering Code & AI for the Modern Developer

Reading Time: 36 minutes

Coding with AI changes everything. It changes how we design, test, and improve our software projects. Today, I’m talking to generative AI expert James Phoenix.

He’s written the book on prompt engineering and shares his hard-earned AI insights freely on the show — including a crash course in developing effectively with the Cursor IDE.

Are developers just AI wranglers now? Technical managers? Will we ever code again?

You’ll find out today.

00:00 – Arvid (Host)
Hey, I’m Arvid. Welcome to the Bootstrapped Founder. Today I’m talking to James Phoenix, an AI expert and software developer, who’ll teach you how to approach building software projects quickly and effectively with AI tooling. We will dive into the limitations and promises of Cursor and similar LLM-based coding environments. We will talk about the future of software engineering as a whole, and James will give you the biggest, most high-impact things you can do to get up and running with Cursor. He helped me understand it, and it’s really, really cool.

00:36
This episode is sponsored by Paddle.com. If you’re looking for a payment provider for your business that lets you build the things you want to build instead of having to deal with things like taxes and invoices, go check out Paddle. It’s worth it. I use it too. Now here’s my conversation with James. James, you’re the co-author of the book Prompt Engineering for Generative AI, which is an amazing book. I’ve been reading it over the last couple of weeks and I’ve been really, really enjoying it. I’ve been learning loads. So you’re an expert on getting AI systems to do what you want them to do, and you’re an engineer, so we have a lot of overlap here. Is coding still the same thing to you as it was back in the day, before AI? I guess the question is: has your understanding of what software development fundamentally is, or is about, changed over the last couple of years?

01:24 – James (Guest)
Yeah, a really great question, Arvid. I think the way it’s changed is that we’re all being lifted up and we’re doing less menial work. Back in the day, when you had people using ChatGPT to write blog posts, there were a couple of people playing around with the idea of generating React components in GPT-3, and it didn’t really have enough cognitive reasoning power to actually produce high-quality code. What’s happened in the last year and a half is that we’ve reached the point where LLMs are almost better programmers, able to generate high-quality code directly from just one example, or not even an example.

02:03
And so I think we’ve all reached the point where we should be AI-first and we should primarily act like an engineering manager, where you have almost a bunch of juniors going off with your project requirements or your task requirements, executing them, and coming back to you with some output. You’re either giving them guidance and telling them they’re wrong, and that feels like a lot of what I’m doing with Claude right now. I’m treating it like a junior: it comes back, and I go, you’re like 80% of the way there, but you’re not 100% of the way there. That can happen. Or it just gets it absolutely right, nails it, and I’m like, you’re the best intern, considering the price point. So that’s kind of where I think we’re at at the moment.

02:47 – Arvid (Host)
Yeah, it does feel like that. I’ve been AI-first for a while too, at least the last half year while I’ve been building my own business. It feels like an ongoing pull request, this constant pull request that’s happening every single day, several times an hour, where you’re just saying: yep, sure; oh no, here’s a mistake, and I want this different; okay, now try another way, maybe you can do it better. This is what coding with AI feels like, so much more than before, where you would iteratively explore through reading documentation, through taking other people’s examples and shaping them into what you need. It feels much more like a judgment call now than an intellectual exercise in creating a logical structure. Do we atrophy? Do you think this might be a problem for engineers who are afraid? There are a lot of engineers out there who try not to touch it because they think their skill set might be going away. Do you think that’s the case?

03:39 – James (Guest)
So ThePrimeagen made a good point about a week and a half ago in a video, which is that we’re going to run into this problem of beginner experts, where you basically have someone who uses Composer and creates a full-stack app but has no idea how it works. And that’s a hugely dangerous point, right? You’ve just built this entire app and you’ve basically fumbled your way there by working till 2am and just saying fix it, fix it, fix it. So I think there’s a danger there where we just end up prompting too much. What I tell most people is that you should always understand the lower-level primitives. If you’re working in React.js, you need to understand state management, you need to understand useEffect, you need to understand the rendering cycle. But when you’re generating the code, you don’t need to understand that code unless you need to fix the implementation. It’s almost like, because you’ve been lifted up to an engineering manager, you no longer need to understand every part of the implementation. However, when the implementation needs to change, then you obviously need to dive into the code. And if you don’t understand those lower-level primitives, then you’re going to spend a lot more time in learning.

04:50
I call it the learning and discovery phase of Claude. So I think there are two main ways we’re using any kind of LLM. The primary way is using it in worker mode: here’s a task, here’s context X, generate me outcome Y. That’s basically how most programmers work.

05:09
But then there’s another side that a lot of the juniors will use, which is that it’ll write some code and a good junior will say: I don’t understand this, can you explain this to me? They’ll be feeding off Claude and upskilling themselves, almost like one-on-one mentoring. What’s interesting is that a senior will do a lot more worker mode and a junior will do a lot more learning and discovery mode. So if you’re learning a new language and you don’t understand the syntax, that’s the point where you should flip modes and say: you’ve just generated this code, great, you’ve solved it, but I didn’t understand these three bits. That’s where I think you can get a lot of power from using LLMs: you can basically ride off the back of the implementation, as long as they describe those lower-level primitives to you in real time.

05:58 – Arvid (Host)
Yeah, I think there’s also a discovery mode on the other side, on the expert side of things, where you’re so siloed in your own knowledge, where the only thing you know is the stuff you’ve always been doing, your best practices, what you think is right. When you present your work to the AI, to any LLM that has these capabilities, and tell it, well, find me alternative ways to think about this, it can look at it from an outside perspective. I’ve found this extremely useful in trying to curtail the limited vision I have into my own software as it grew. Do you suggest people lean into that more? Because you say a lot of experts are in worker mode because they need to get stuff done. Should they also be in this kind of explorer mode?

06:39 – James (Guest)
Yeah. So I think there’s a danger of committing to one implementation too fast, because with these kinds of tools you can just generate a load of code. One thing I often do, and I actually do this in Cursor, just in the chat window, is ask it before making a larger decision: here’s all my code, here’s what I’m thinking about doing, here are a couple of different approaches. Which approach do you recommend, and why should I go with that? I’ll give you an example. I’m building a data pipeline at the moment, and I had a choice between a callback architecture and a polling architecture. I gave it all the code and then said: which do you think we should do, given the implementation we’ve already got? And it comes up with a series of trade-offs. That can be a really great way for you as a senior, or someone that’s an intermediate, to not just dive straight into let’s produce it, let’s get it working. So I think there is some value there.
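
For context, a bare-bones sketch of the two approaches being weighed; the endpoint, the Job type and the framework choice (Express) are hypothetical, not James’s actual pipeline:

```typescript
import express from "express";

type Job = { id: string; status: "pending" | "done" | "failed" };

// Polling: the pipeline repeatedly asks the upstream service for job status.
// Simple and self-contained, at the cost of latency and wasted requests.
async function pollUntilDone(jobId: string, intervalMs = 5_000): Promise<Job> {
  for (;;) {
    const res = await fetch(`https://api.example.com/jobs/${jobId}`);
    const job = (await res.json()) as Job;
    if (job.status !== "pending") return job;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}

// Callback: the upstream service calls us back when the job finishes.
// No polling loop, but we now expose a public endpoint and must handle
// retries, duplicate deliveries and webhook authentication.
const app = express();
app.post("/webhooks/job-finished", express.json(), (req, res) => {
  const job = req.body as Job;
  // ...enqueue downstream processing for `job` here...
  res.sendStatus(200);
});
app.listen(3000);
```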

07:35 – Arvid (Host)
Does it ever suggest stuff to you that’s way out of your field of understanding? I’ve done similar things: I’ve said, I need this and I have many different approaches, could be a callback, could be polling, could be an RPC, and then it started suggesting, what is it, an Apache Kafka situation, a message queue implementation, something I’ve maybe done before but probably wouldn’t have thought of in this particular infrastructure context. So how much exploration should be put into all the different varieties? Because it feels like you have this selection overflow, this overwhelm of all the different things you could be doing. How do we judge this when the options are presented so eloquently by the LLMs?

08:15 – James (Guest)
Yeah. This is kind of an architectural question, but I think it all comes back to this: you should always choose the simplest architecture, and you should always have the fewest lines of code. Ideally you’d have no code and a simple architecture; you should basically start with those premises. Assume that all code is rotting in real time. There’s that phrase: software is a lot like milk. You have to keep going and improving it over time, so it has a high maintenance cost. So when you’re thinking about architectural patterns, it’s basically always the simplest. As long as you can get away with a simple architecture pattern and it’s not going to shoot you in the foot in a couple of months, then you should generally go for that.

09:01 – Arvid (Host)
As a software engineer, when you start a new project, greenfield, or maybe even a legacy project, do you first go through this kind of infrastructure and architecture phase, or do you dive right into building, into prototyping?

09:15 – James (Guest)
Right. So this is going to sound crazy, but I actually start with a README file. I basically say: okay, let’s map out all of what we want to achieve. And they won’t be high-level goals like “we’re going to create a CMS”. There’ll be sub-goals within that, because a CMS is too vague a concept; you’ve got things like Contentful and WordPress. But if we start to dive into some lower-level subtasks, then you can take a boilerplate and this highly refined product specification, almost written like a product manager would write it, and say: here’s my boilerplate code, here’s my product features, I need you to start banging out SQL migrations in Cursor to get it from this boilerplate.

09:59
The boilerplate kind of does auth and payments, but it doesn’t have any of that business logic built in, and rightly so.

10:06
But then, combined with that readme, you can actually go really far.

10:11
So that’s actually how I start every project: I pull in some boilerplate, and I’ve already got a README that’s very rich. In that README I will include things like what API services we’re going to use, the rate limits for those API services, all of their limitations, and sometimes I’ll even throw in the API documentation and the links to it. So not only does it have the product features, it also has the recommended tech stack and the recommended documentation, and it’s already got the boilerplate database migrations built in, so it already knows where we’re up to. Because what you’re trying to do is avoid going forward, then going backwards, then going forward, then going backwards. As long as you bring it up to context, in the sense of: we’ve got these three tables, we’ve got Next.js, we’ve got Stripe, and here’s what we’re trying to do right now with these API docs and these sub-goals, then I think that’s a really good place to start.
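
As a concrete illustration, a README along the lines James describes might start like this; the product, features and stack here are hypothetical:

```markdown
# Acme Podcast CMS — Product Spec

## Goal
A small CMS for managing podcast episodes (not a general-purpose CMS).

## Features (sub-goals)
1. Email + OAuth auth (already in boilerplate)
2. Episode upload: audio file + metadata, stored in S3
3. Transcription pipeline via the OpenAI Batch API

## Tech stack (opinionated)
- Next.js 15, Postgres (existing migrations: users, accounts, subscriptions)
- Stripe for payments (boilerplate)

## API services & limitations
- OpenAI Batch API: max 50,000 requests per file, files under 100 MB
  Docs: https://platform.openai.com/docs/guides/batch
```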

11:08 – Arvid (Host)
It does remind me a lot of the waterfall model of software development in many ways. It’s fully specced out, but the process itself is agile, right? You let it build and you go: okay, this is a feature, this is kind of how I want it, sure, for now, but this is kind of different. It feels like a mix of different software development archetypes.

11:32 – James (Guest)
Yes. I think the reason it’s good to specify everything up front is that writing text is really cheap from an implementation point of view. Let’s say, for example, you define that this app is going to have three features or five features. Maybe Cursor would produce different code if you actually told it it’s going to have these seven features, right? So there’s value in just stating, roughly, the end goal of this app, knowing we might change some things as we go along: it’s going to have these kinds of features.

11:58
It’s very cheap to do. It takes you 20 minutes to write this all in a README file, and because of that, it already has the context when it goes to start building out new migrations, in SQL or whatever database you’re using, or new Mongo schemas if you’re using MongoDB. Because of that, you’re already getting a lot of the benefits of the foresight. You’re trying to avoid living in hindsight and live a bit in foresight. So that’s the theory behind that approach.

12:29 – Arvid (Host)
That’s cool. I like it, because you’re anticipating that the model also anticipates your future features. It’s kind of a nice double anticipation. I wonder, when you talk about this with me: I think this is a great approach, and I think I subconsciously use it. When I write my own features, I write them out in a Notion document the way I want them, and then I throw that, plus some files from my code base that are similar to the feature or that it has to fit into, into Claude, and just let it generate the whole thing. That’s how I used to do it. I recently jumped into Cursor; we can talk about that in a minute. But I want to get back to this approach for a second. How detailed are those steps? Because what I hear is that you have the features in this kind of grand scheme of things, but when it comes to the data models and the APIs, you’re getting very specific. So it’s a mix of highly specific and general. Is that correct? How do you approach it?

13:20 – James (Guest)
Yeah. So I think the key is to give Cursor, or some type of LLM, a rough idea of what the app is doing. Then I think it’s up to you as a developer to go in and start doing the first feature. Once you’ve done that first feature, you should revisit the README with anything you’ve found that’s going to potentially change. You can almost think of the README, or some instruction MD file, as a place where you’re storing the rough end-goal vision of what this app is going to do.

13:51
And I think it’s dangerous to be in disbelief that things are going to change, right?

13:58
You’re going to get into one feature and you’re going to realize: oh, I need a different type of data structure, or a different API service, because of how this feature behaves. As you explore the problem, you will incrementally gain knowledge about how to appropriately solve it. For example, if you’re using the OpenAI Batch API, there are some limitations on that, right? You can only put 50,000 requests in per file, and each file has to be under 100 megabytes in size. That’s why I think that if you put some of these limitations inside the features, as well as just mocking out “I’m going to build these three features”, you can avoid some of those pitfalls of having to go back and change things. So a mixture of sub-goals, limitations and opinionated choices means you’re going to have to do a lot fewer iterations when you start diving into the lower-level implementation. But you will always have some backwards and forwards, for sure.
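
As an illustration, a minimal sketch of code that bakes those two Batch API limits in; the function name and the JSONL-lines input are assumptions for the example:

```typescript
// OpenAI Batch API limits as cited above: 50,000 requests per file,
// and each file must stay under 100 MB.
const MAX_REQUESTS_PER_FILE = 50_000;
const MAX_FILE_BYTES = 100 * 1024 * 1024;

// Split JSONL request lines into batch files that respect both limits.
function chunkBatchLines(lines: string[]): string[][] {
  const files: string[][] = [];
  let current: string[] = [];
  let bytes = 0;

  for (const line of lines) {
    const lineBytes = Buffer.byteLength(line + "\n", "utf8");
    const full =
      current.length >= MAX_REQUESTS_PER_FILE ||
      bytes + lineBytes > MAX_FILE_BYTES;
    if (full && current.length > 0) {
      files.push(current); // flush the current file and start a new one
      current = [];
      bytes = 0;
    }
    current.push(line);
    bytes += lineBytes;
  }
  if (current.length > 0) files.push(current);
  return files;
}
```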

14:59 – Arvid (Host)
So does this README file, this initial spark of the project, grow over time? Do you just keep adding more things to it?

15:06 – James (Guest)
Yeah, I do sometimes. It depends on how much it’s changing. I would say that in general, the first version of it is the most useful, because it will give you a start and immediately push you into a feature. After you’ve done that feature, you probably have enough code as a reference point to piggyback off, plus the README, to start building sub-features or different features. So it’s almost like you’re using the README as a rocket ship to get you to understanding. But then, once you’ve got enough code, the code itself is also partially the business context, right?

15:48
So I think there’s an element of that, because as you start to build out different types of schemas in whatever you’re doing, the LLM is very good at figuring it out. If you looked at the PodScan schema, we’d maybe have a podcasts table and maybe a transcriptions table. Over time you could give that schema directly to an LLM and you wouldn’t need the README file anymore, because it knows it’s a podcast app. But there’s this weird intermediate stage where, if you’ve got no extra tables and you’ve just set up a boilerplate, the README acts like a rocket ship to get you to that structured schema. From there you can just pull in the schema and the migrations, and it’s like, okay, right, we’re back to where we are. So I think it’s more of an initial jump that you’re using it for.
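
A hypothetical sketch of the kind of schema being described; the table and column names are illustrative, not PodScan’s actual schema:

```sql
-- Illustrative only: two tables a podcast-monitoring app might start with.
CREATE TABLE podcasts (
  id         BIGSERIAL PRIMARY KEY,
  title      TEXT NOT NULL,
  feed_url   TEXT NOT NULL UNIQUE,
  created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);

CREATE TABLE transcriptions (
  id            BIGSERIAL PRIMARY KEY,
  podcast_id    BIGINT NOT NULL REFERENCES podcasts (id),
  episode_title TEXT,
  transcript    TEXT NOT NULL,
  published_at  TIMESTAMPTZ
);
```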

16:37 – Arvid (Host)
Yeah, that makes perfect sense, because then you don’t need to update it: the code is actually the truth of the project. Instead of the wishful thinking the README expresses, it’s the actual thinking that happened, plus its implementation in code. That gives a lot more information. And the schema, yeah, for sure. I do wonder, and this is a thought I’m having for the very first time, sadly, I should have had it way earlier: how important is the naming of tables, of properties on tables, or whatever kind of properties you might have in whatever database system you use? Because if the schema and its connectivity are so relevant for the LLM to understand the data and the relationships it has, how specific do you need to be when you name variables or fields in there?

17:21 – James (Guest)
Yeah. So I think we don’t have to worry about this problem, and the reason why is that whenever I’ve used Claude or GPT-4o, the field names they suggest are already pretty good, quite good semantically. You are in danger if all your columns say something like time, col, xyz; that’s a very dangerous place to be, because there isn’t any semantic linkage or context. But generally speaking, when LLMs are providing these generations for your database, they tend to be quite good. I think it’s almost a byproduct of developers naturally having written good field names, which is already in the training data. So Claude’s already just saying: yeah, this seems like a good name for this column, or this key value in MongoDB, for example.

18:14 – Arvid (Host)
Yeah. It makes me wonder, whenever I think about these machines now ingesting data that has potentially also been created by machines, how many self-sustaining, self-aggrandizing problems, or benefits, we might get in the future from how these machines express things. Are the things these machines express what they will then expect future texts to look like? I don’t know if that makes a lot of sense to you, but this AI-ingesting-AI-results problem: do you see it being a thing with AI-generated code?

18:51 – James (Guest)
I think what they’re going to do is just use synthetic generation to solve this problem. They’re basically going to use LLMs to generate data, and then use human reviewers on that data, as the way to get around the fact that we’ve already used quite a lot of the available training data for text. So that’s mainly why I think they’re going to use synthetic data generation to get around that.

19:21 – Arvid (Host)
Yeah, but what if these humans have been trained by LLMs? What if they’ve only learned everything they know about coding by discovering these kinds of things from LLMs? I mean, it’s a joke, right, but it’s a potential problem.

19:36 – James (Guest)
It is. It makes me think that the frameworks and everything we have at the moment are going to remain the primary way. Because of what’s happened with so much training data on Stack Overflow and elsewhere online, there’s a point where maybe we’ll just stick with some of the frameworks we’re used to, because a newer framework is competing with so much training data, so many answers. So React’s kind of become the golden standard, for example.

20:04 – Arvid (Host)
Okay, that makes sense to me. That is an interesting point. The LLMs are so capable of dealing with the languages and frameworks they know, yet they’d probably really struggle with something very new, something that people could pick up and deal with within days of it being released or invented. Do you think this is a drawback for new paradigms in software engineering, that these machines are so trained on decades of the other stuff?

20:36 – James (Guest)
I think the danger is that if you release a framework and it’s not indexed into something like Claude or GPT-4o, you’re going to struggle with developer adoption, because someone is just going to be so much faster in React or Laravel or any framework that’s got a lot of training data.

20:53
So I think what’s going to end up happening is that it won’t stop frameworks being developed, but the newer frameworks won’t do as well until web search and just-in-time context retrieval are really very powerful. And what I would say at the moment is that even the documentation feature in Cursor is generally not that great. I normally just copy and paste the actual documentation page in, and I find that works a lot better than using RAG and chunked-up embeddings to get portions of the documents in. So I do think that in the short term it’s a disadvantage for framework developers. In the long term it’s probably going to be okay, when the context window is good enough that you can just throw in 100 pages from a web search and it just gets the framework, right?

21:49 – Arvid (Host)
So in the short term it’s definitely a problem, but in the long term I think we’ll be fine. I’m looking forward to the days when the interval between new information being out there and new information being integrated into the system gets shorter and shorter, where it’s almost at the real-time level. That’s my Star Trek fantasy: a computer system that just knows when things happen, because it has this data inflow all the time. I guess that’s what PodScan is also trying to help people with; that’s why I’ve built this thing, for real-time information. I’m trying to sell it to companies like OpenAI and Anthropic, because they have training needs and I have the data, right? So that’s a thing. But let’s maybe get to something the people listening to this can actually use, which is Cursor.

22:25
I think you’ve been talking about this quite a bit. I started using Cursor three days ago, so I’m super behind on this. I just want to give you a really short glimpse into my first couple of days of experiencing this tool, which were: oh, this is VS Code. And then I did something, and then: oh, this is not VS Code anymore. All of a sudden, what to me was an editor for decades, just a thing where I type text and maybe get some intelligent suggestion for a variable name that’s already in the same file, turned into a wholly different experience. And I’ve got to say, initially I was quite overwhelmed by the idea of allowing this system to write the code for me.

23:08
I think I kind of slowly iterated myself into AI-assisted coding, by first using ChatGPT for questions I had, then using Claude by putting my prompts in there, taking the code and manually transferring it over. You know, kind of the boomer way of coding, but that’s what it is. And now I got into Cursor, and all of a sudden everything changed, and I feel overwhelmed. So maybe you can give me the first steps I should be taking to make this tool usable and comprehensible to me, a tool that has so many capabilities that I probably haven’t even seen 5% of at this point. What are the best first steps here?

23:46 – James (Guest)
Yeah.

23:46
So for anyone listening, in terms of what you should be thinking about with Cursor: VS Code has basically got your GitHub Copilot, which gives you autocompletions in a single file, and they’ve started doing multi-file edits now as well. But Cursor has basically got three main modes. You’ve got inline edits, which are great if you’re just changing a block of code, with Command-K or Ctrl-K, and you can even ask follow-up questions, so if it doesn’t give you the answer you’re looking for the first time, you can ask again. Command-L or Ctrl-L is your chat mode. I would personally recommend using that on one or two or three files. If you’re doing huge sweeping changes, you don’t want to use chat mode; it’s too much effort to manually apply all the different blocks of code, and at that point you’ll want to use Composer, which is Command-I, or Ctrl-I on Windows. So there are three different modalities, and in each of them you can also use the @ symbol, and I’m sure you’ve probably started using that quite a lot. It’s really great. It’s basically replaced that entire workflow.

24:56
The old workflow was: I’m going to go into Claude or ChatGPT, I’m going to copy my code in, get the answer, and paste it back in. There were a lot of files you’d need to manually trawl through to make sure that Claude or GPT-4o had the relevant file context, and so it saves developers a lot of time by just adding a folder or adding a file. You can even use @codebase, which will look at the local embeddings and, given your query, find relevant files, which it will then pull in. When you’ve built a feature, you can even use @codebase to figure out which files it should actually be using. My personal favorite is actually @folder. I think that when you have direct control over which files are being referenced in the prompt, you’re going to get the best generations. @codebase is kind of good, but again, it’s all based on the semantic similarity between the query and the file names, the function names, the class names and all that kind of stuff in those files.

25:59
I think you should default to Composer mode if you’re building indie-hacking apps, and you should only use chat mode and inline edits when you really need to make sure that a certain part of a workflow works really well, or you’re struggling in Composer and you’ve got 99% of it done and you’re just working on a single file or a couple of files. That’s when I start to switch into that mode. There was a time, about a month ago, when we didn’t have the newer Claude model, when sometimes even Claude would struggle with features, and at that point it’s sometimes worth going in and looking at the logic yourself. So, depending on the intelligence of these models, you can give them bigger and bigger goals, and as you do, you should default to using Composer, because it can handle that multi-file editing experience. It’s almost like, as the models get more performant and more intelligent, we should generally be doing multi-file editing as the primary way of working on a project. The only danger with this is that when you change a lot of files, you lose a lot of context.

27:12
So you do have to go and read through those files and understand what’s changed, because if you don’t, it’s almost like you’ve given away a little bit of your project. Someone’s gone and done it in some other country, they’ve given you the code, they’ve slotted it in, and you’ve got no idea how this entire piece works. So that’s one danger of Composer. The other danger of Composer is regressions. Once you’ve done like 90% of a feature or a file and then you say, can you add this, sometimes it will actually randomly delete stuff in the file, and you have to say: no, actually put this back. So always be on the lookout for regressions when you’re using Composer.

27:53 – Arvid (Host)
Do you think this regression stuff is something the team at Cursor will fix, or is that an innate problem?

28:06 – James (Guest)
I think that’s an innate problem with the LLMs, yeah, and we’ll see it go away as we get newer and newer models. In a year and a half it probably won’t even be a problem anymore. But right now it’s a problem, because the model is only focusing on a certain portion of the generation. There is a feature they’ve implemented in Cursor to kind of patch this, called Composer checkpoints, which allows you to revert, almost like a git reset or a git checkout, back to a specific commit SHA, and you can do that directly in the Composer chat window. So you can already revert state; you can almost YOLO your way there, and then YOLO a little bit more.

28:47 – Arvid (Host)
And if it fails, you can always go back to the previous YOLO. So it’s not too bad; it’s pretty good. It reminds me so much of machine learning and the way we do gradient descent, where you wiggle the hyperparameters around a little to see if it gets better or worse. It’s funny how all of these things come together in the approach we have with AI: this very experimental but also iterative way of building stuff. Does this work? Maybe, let’s see, let’s try this, let’s try an alternative. Are there other things you see in your own use of Cursor, or LLMs in general, where you have to look very carefully to make sure they don’t mess up too much? Like the random deletion you just mentioned; I’ve had that too in many ways, where it just forgot that something was part of the thing I wanted it to build. Are there other things you have in mind?

29:29 – James (Guest)
Yeah, I think when you start building distributed services, that can get quite tricky. If, for example, this API calls this other API, and then a load of jobs happen, and those jobs get put into a table somewhere or into Redis, and then all of that gets collected, that’s an example of where you as a software engineer should focus a lot more on the flow of data and the potential errors that can occur between these distributed systems. What I have found is that the more monolithic the app, the better Cursor is. If you’re using Next.js without a separate API backend, and you’ve got it all in one monorepo and you’ve got types, you can just smash out CRUD like it’s nobody’s business, right?

30:19
When you start having an API server, and you’ve got specific AWS Step Functions or a Google Cloud workflow, anywhere you’re adding on cloud services and breaking up the flow of the data, that is where you need to spend a bit more time sense-checking: okay, I have this service A, it’s calling service B, and then that spins up a load of jobs. That, to me, is where we should be spending our time, and we should also be doing a lot of the architectural work up front. One thing I do, and I did this recently, is use a tool like Excalidraw to mock up your architecture for a certain portion of the cloud, and then you can just push that into Cursor, give it your database types, and say: generate me the API routes to do this part of it. So it’s still quite useful for converting architectural diagrams into service-level code that executes them. So those are my thoughts on that: the more distributed the system, the harder it is to use these code generation tools.

31:25 – Arvid (Host)
Also, I’ve been trying to have it generate AWS-compliant policy JSON objects, that kind of stuff, you know, these bucket policies for S3, and sometimes it works, sometimes it doesn’t. There’s a lot of hit-or-miss in this as well, in my own experience.
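
For context, an S3 bucket policy of the kind being generated looks roughly like this; this is the standard public-read shape, with a placeholder bucket name:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowPublicRead",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::example-bucket/*"
    }
  ]
}
```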

31:43 – James (Guest)
Right. So my recommendation for that is: if it fails more than twice, go to the website, get the relevant web pages, and just select-all-copy them in or add them using Cursor. And if that fails, then you’ve reached the limits of the current model; that’s generally how I think about things. If it does it without any context, then it’s already baked into the training data. If it doesn’t, you need to add the relevant context. That could be your database types, it could be the web pages on AWS; maybe, for example, the policy API has changed, and then you need to go and fetch that context manually. If that’s failing, then the LLM probably can’t do it at that point, and you’ve got to go and sit down and do it yourself, basically. So that’s currently how I think about things, if that makes sense.

32:31 – Arvid (Host)
That does make a lot of sense, and it sounds like we’re still doing the stuff we used to do in the pre-AI days. We still try to find the proper documentation; now it’s just that we don’t want to read it ourselves. We feed it into the system, which then turns the knowledge in there into the code our brain would have produced, which is why the atrophy idea is so strong, right? We’re almost there: we know where the documentation is, we literally copy the URL to the docs, but we don’t read them. We just give them to the AI. I don’t want to sound like an old man yelling at cloud, but it does feel like a risk in the career of a software engineer to not read the docs, to hope that the machine does it well enough. Do you still read docs? Do you even still manually write code at this point?

33:18 – James (Guest)
Yeah, so my advice on this is just what we were discussing earlier: as long as you understand the primitives of what those docs are talking about. I don’t read the docs anymore, because to me the docs are describing ways of interfacing with some type of service via language. So as long as you understand that language. For example, if we take Terraform: Terraform has a series of resources and reference names, you’ve got data, you’ve got outputs, you’ve got variables, in Terraform Cloud you’ve got this idea of Terraform state, and you have the ability to do version control on your Terraform.

33:59
And as long as I know all those things, the usefulness of going and understanding one specific doc is fairly low in terms of long-term impact. So I’m actually more interested in learning more different primitives. Once I know the primitives, like 80 or 90%, that’s when I’m thinking: well, there’s no point in me reading these docs, because you can already do web search with Cursor; you can actually say, every time you do a query, do a web search. But I basically think the primitives are the main thing you should be focusing on. For example, I didn’t know a bit of for-loop syntax inside Terraform. I said: explain this, break it down for me, and then I understood it. That, to me, is the usefulness of this stuff.
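
For reference, Terraform’s for constructs look roughly like this; the variable and resource names are illustrative, not from James’s project:

```hcl
variable "bucket_names" {
  type    = list(string)
  default = ["logs", "uploads", "backups"]
}

# A for expression transforms a list into another list...
output "bucket_arns" {
  value = [for name in var.bucket_names : "arn:aws:s3:::${name}"]
}

# ...and for_each stamps out one resource per element.
resource "aws_s3_bucket" "this" {
  for_each = toset(var.bucket_names)
  bucket   = each.value
}
```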

34:42 – Arvid (Host)
And generally, I guess, in any skill-based endeavor, it’s good to know the fundamentals and then extrapolate from there the complex relationships between them, which is the same thing these machines do, right? They just turn it into code, while we turn into judgmental product managers.

34:58 – James (Guest)
Yeah. Having said that, you can do things like saying: I’m learning about this, here’s what I know, X, Y and Z, can you teach me some new stuff? I did that recently, I think it was with SQL, and it started talking to me about window functions and partitioned indexing.

35:18
So you get into some really interesting and advanced topics, which I think is a way to grow your skill set. You basically take anything you’re learning, whether it’s React or Next.js or Python, and you say: here’s what I know; I want to become a 10x engineer; teach me what I don’t know. And you can get some really interesting stuff out. It was teaching me about property-based testing and mutation-based testing, which is really advanced stuff, or reverse debugging, the kind of thing a 10x engineer would do that you wouldn’t otherwise hear about. So there’s some really great stuff you can get, and that, to me, is the usefulness, because you’re exploring these higher-level fundamentals that are built on what you already know.

36:01 – Arvid (Host)
Yeah, definitely, and it’s an infinite treasure trove of knowledge too, right? For each of these things, you could dive into what is probably a lifetime of other people’s knowledge and insight, just compressed into the LLM. Particularly with testing. I think we should talk about testing, because that’s one thing I’ve never done. Sorry, I’m outing myself as a non-test-driven engineer here. But do you test? Do you let the thing write tests? And how much of that impacts the actual building part of your software development?

36:32 – James (Guest)
Yeah. So I think I’m very much in two minds with regards to testing. If we just talk about the drawbacks: it has a lot of maintenance involved, and it often means you’re spending time setting up testing infrastructure. The benefits are that you can figure out that something is working, and when you change something else, you don’t end up with regressions; you always know whether something has caused something else to fail.

37:02
Now, for indie hackers: you guys should get on integration tests, right? Playwright is your friend. Don’t go for unit-level testing; basically go for: I have this one test that runs, and it tests the entire feature end to end. That’s it. That could be, for example, logging in, clicking on a table, uploading a PDF, clicking the go button, and then, after a bunch of API endpoints, a result comes back. That’s one feature, and you can write a single Playwright test to do it. Then you get the benefit of an integration test, which tests these larger flows without having to test all the bits of the app.
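
A minimal sketch of the kind of single end-to-end Playwright test being described; the URL, selectors and credentials are hypothetical:

```typescript
import { test, expect } from "@playwright/test";

test("upload a PDF and get a result, end to end", async ({ page }) => {
  // Log in
  await page.goto("https://app.example.com/login");
  await page.fill('input[name="email"]', "test@example.com");
  await page.fill('input[name="password"]', "test-password");
  await page.click('button[type="submit"]');

  // Click into a table row and upload a PDF
  await page.click("table tbody tr:first-child");
  await page.setInputFiles('input[type="file"]', "fixtures/sample.pdf");

  // Kick off the feature and wait for the result of the whole API chain
  await page.click('button:has-text("Go")');
  await expect(page.locator('[data-testid="result"]')).toBeVisible({
    timeout: 30_000,
  });
});
```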

37:51
Personally, I do write tests. In my cursor rules file I have Jest tests and Playwright tests and pytest tests, but basically I tell it as I write the code: I’m feature-first with empty tests, and sometimes those empty tests get filled if it means the product is going to be more reliable. So I think there’s value in both. People that don’t test at all have got it wrong, and people that are militant about testing have also got it wrong. And we’re strictly talking about the startup space here. If you’re building an enterprise banking app, then you can’t not test, because the risk of something going wrong outweighs the cost of the tests.

38:37
But when we’re talking about indie hacking, I think you should be feature-first with empty tests, and then you should fill some of those empty tests with Playwright integration tests. They’re the best bang for your buck. They’re called black-box tests: you’re not testing every single function signature call, you’re basically just testing, does the thing work? And you’ll see Levels has done that with loads of his robots: he’s just writing integration tests. He’s not testing at the function level or at the class level, and that’s what you should do, in my opinion.

39:08 – Arvid (Host)
Yeah, and when things break, you can dive into the test, see where it broke, and then look at that.

39:14 – James (Guest)
There’s also this concept called defensive programming, which is basically where, as you write code, if you run into something that shouldn’t happen, you log an error or raise an exception, and that ends up partially being a testing suite in some way. I’ll give an example: if you have a job that should create 10 jobs in the next service, and you check whether those 10 jobs exist, and if they don’t, you throw a Sentry error, then that’s a good way of not necessarily testing the actual architecture, but, as the app experiences bugs in real time, you’re logging a Sentry error: this shouldn’t have happened, so we raise an assertion. To me, that actually gets you quite a lot of the way anyway.
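
As a sketch of that pattern, assuming the Sentry Node SDK (initialized elsewhere at app startup) and a hypothetical createChildJobs helper:

```typescript
import * as Sentry from "@sentry/node";

const EXPECTED_CHILD_JOBS = 10;

async function fanOutJobs(parentJobId: string): Promise<void> {
  const created = await createChildJobs(parentJobId); // hypothetical helper

  // Defensive programming: this "can't happen", so if it does, we want
  // a Sentry error rather than silent data loss downstream.
  if (created.length !== EXPECTED_CHILD_JOBS) {
    Sentry.captureException(
      new Error(
        `Expected ${EXPECTED_CHILD_JOBS} child jobs for ${parentJobId}, ` +
          `got ${created.length}`
      )
    );
  }
}

// Stand-in for whatever job queue the app actually uses.
declare function createChildJobs(parentJobId: string): Promise<string[]>;
```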

40:03 – Arvid (Host)
Yeah, canary testing. That makes perfect sense. That’s. It’s a very interesting approach, like you’re going more on the general grand scheme of things and you try to see do things work mostly, and then you go then back into the details. You mentioned one thing the cursor rules file. What is this? I’ve never heard of this before. This sounds like something that I should probably know a lot about.

40:27 – James (Guest)
Yeah, you should know a lot about this. There’s a hidden file called the .cursorrules file. You’re meant to put it at the root level of your directory, but I’ve actually found it works better if you put single cursor rules files in different folders. So if you’ve got a client and an API, you should default to putting the cursor rules file for your Next.js app, or Vue or whatever other framework it is, in the client folder, and it should talk about that framework.

40:58
And when you load your Cursor project, load it just from the client app if you’re going to work in the client for a bit, because cursor rules don’t seem to work that well when you have multiple different folders. But anyway, a cursor rules file is basically a file that will change the outputs by constantly feeding in a prompt. There’s a website everyone should check out called cursor.directory, which gives you a bunch of different prompts you can copy and paste, along the lines of: you’re a Next.js developer using Next.js 15, you like writing server-side React components, minimize “use client”. So it’s trying to encapsulate all the proper conventions that ensure that when it produces code, it produces it in a certain style and flavor that you, as the developer, like.

41:45 – Arvid (Host)
That sounds like something all the big frameworks, I’m thinking about Laravel and React, should supply with the framework. This opinionated stuff that goes into the framework might as well be expressed very clearly in a cursor rules file. I wonder if that becomes a new standard, where it’s not going to be cursor rules forever, but for any tool that has this kind of capacity, just like we have robots.txt or things like that: some kind of neutral format that allows you to express these kinds of things. That is very smart. That’s cool. What else do you put in there?

42:18 – James (Guest)
So anytime you experience a bug because the LLM currently has that bug baked inside of it, that will also go in the cursor rules file.

42:23
I’ll give you a good example of this. Supabase migrated, maybe three or four months ago, from their auth helpers to their SSR package, which is a node package, and so whenever Cursor was generating files for me, it would actually generate them using the old auth helpers.

42:41
And I’m going: come on, man, this is ridiculous, we’ve changed this. So my cursor rules file now says: if you need to look at the client libraries for Supabase, always default to the @supabase/ssr package, and the clients are in these folders. Then it knows it’s already got the clients created, which have types attached, and it knows it shouldn’t hallucinate and start using the auth helpers; it defaults to the newer package. So as you experience bugs through knowledge that’s missing from the LLM’s pre-training data, you just add those to the cursor rules, with the hope that the new LLM, when it’s trained, will have all those bugs trained out of it. It’s almost like you’re putting a band-aid on the fact that this LLM isn’t perfect.
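
A paraphrased sketch of what such a .cursorrules file might contain, combining the style conventions and the Supabase workaround just described; the folder paths are hypothetical:

```text
You are an expert Next.js 15 developer.
- Prefer server-side React components; minimize "use client".
- API routes and server actions must follow the existing patterns in /app.

Known model bugs to work around:
- Supabase migrated from the auth-helpers packages to @supabase/ssr.
  NEVER import from @supabase/auth-helpers-nextjs; always use @supabase/ssr.
  Typed clients already exist in /utils/supabase; reuse them.
```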

43:26 – Arvid (Host)
And that’s just the progression of these systems: they will get smarter, they will get better, they will integrate this in the future. It just takes some time to train them, and it’s probably just a matter of time before these rules get outdated. But that sounds like something developers should maybe share amongst each other a little bit more. It sounds like something that, in an enterprise or any kind of team development setting, you could share with each other. It’s almost like a linter, but on a different level, right?

43:52 – James (Guest)
It’s like a probabilistic linter, is how I would describe it. Yeah, it’s good. But definitely use it for any bugs you experience, and also for following certain rules. So: when you produce an API endpoint, it needs to be structured like this; or, when you produce server actions, they should look like this, and you can put an example of the server action in there. You’re really trying to make it so that when you tell it to generate a new server action or a new set of API routes, it does the same thing every time, which means you won’t have to go and change all of those manually or reprompt it.
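
For example, a server-action template of the sort you might embed in the rules file; the schema, the names and the db helper are invented for illustration:

```typescript
"use server";

import { z } from "zod";
import { revalidatePath } from "next/cache";

const CreateEpisode = z.object({
  title: z.string().min(1),
  audioUrl: z.string().url(),
});

// Convention: validate input, do the write, revalidate, return a typed result.
export async function createEpisode(formData: FormData) {
  const parsed = CreateEpisode.safeParse({
    title: formData.get("title"),
    audioUrl: formData.get("audioUrl"),
  });
  if (!parsed.success) {
    return { ok: false as const, error: parsed.error.flatten() };
  }

  await db.insert("episodes", parsed.data); // hypothetical data layer
  revalidatePath("/episodes");
  return { ok: true as const };
}

// Stand-in for the app's real database client.
declare const db: { insert(table: string, row: unknown): Promise<void> };
```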

44:36 – Arvid (Host)
You’re just getting the actual output that you want, kind of zero-shot, first time around. That’s really cool. I’m really going to use this, man. I’m looking forward to writing all the little idiosyncratic things that I want right into this file. I never knew about this; it immediately makes the tool more usable to me, because now I can really put my personal touch on this code base without constantly having to explain myself. This is really, really awesome.

44:58 – James (Guest)
I have a question for you what do you do when you have, um, when you have problems inside of your, your problems tab at the moment, like what’s your workflow? So you know, like, when you generate code and you get these problems forming, do you do anything with those at the moment?

45:13 – Arvid (Host)
Well, usually I take a nap. But no, the way I solve this is by giving myself like 15 minutes to deal with it, maybe 30, depending on the complexity of the thing. And if I can’t solve it, I just go back, blank slate, almost to a checkpoint like you mentioned before, and try to do it again, but this time thinking about it differently. I take an outside perspective maybe, describe it differently, describe outcomes, user outcomes, infrastructure outcomes, where before I described features. I just try to switch it up, and then I go down the same rabbit hole again, and usually after one or two tries I get there. So I think checkpointing is kind of what I do.

45:53 – James (Guest)
Yeah. So the reason I was asking about this is that there’s a tab inside Cursor called the Problems tab, and it will highlight all the syntax errors and type errors you experience. One thing you can do when Cursor generates code is to restart your TypeScript server, or your Pylance server, or whatever language server you’re running. After it’s done that, if it’s generated these errors, you go into the Problems tab, hit Command-A and Command-C (or Ctrl-A, Ctrl-C), take all those errors, and re-dump them back into Claude or some type of LLM. So give that a go: the next time you’re experiencing problems because there’s a type error or some syntax error, get the LLM to generate the output, assuming you’ve got all the right context to start with, and then just say, I’ve got these type errors now, after restarting my language server. You can restart it by hitting Command-Shift-P or Ctrl-Shift-P, depending on whether you’re on Mac or Windows, and there’ll be a restart TypeScript server command, or whatever the server is.

46:57
And you have to restart the TypeScript server because when you use Composer, it can generate 10 files and claim that these imports aren’t actually there. Once you’ve restarted the language server, the number of problems will naturally decrease, and you’re left with real type problems, compilation bugs, syntax bugs. You can take all those and dump them straight back in, and you can go around in a loop a couple of times. If you’re still doing it after three or four more times, then generally you won’t find a solution, but for one or two loops you can use the Problems tab like that, and it works like magic.

47:34 – Arvid (Host)
That sounds way more efficient than my nap-and-reset strategy. I don’t think I’ve even gotten far enough into Cursor yet to have noticed this happening. I’m just trying to take it all in, but hey, it’s still extremely useful in so many ways, even when it comes up with problems. And I’m using PHP, which is pretty forgiving about types; there’s a lot of type coercion there, so I’m not running into these issues as much. But I’ve had my little issues, and I think, now that I’ve talked to you, I’ll have way fewer. It’s really cool. Do you ever code without AI-assisted help anymore?

48:17 – James (Guest)
Oh my gosh, that is the perfect question, right? So the time to do that is when you’re learning new primitives. If you’re learning a new programming language, say if I was going to go and learn some more Rust or Go, I would turn off tab completion and I would code in a vanilla style, basically. The reason why is that if you don’t, the risk is you’re basically being an engineering manager that’s got a Go intern, but you don’t know how to write Go, and there’s a real danger of you not understanding the lower-level fundamentals. So that’s a really great example of where you shouldn’t be using Cursor, because you won’t gain the foundational knowledge to then leverage at a higher rate when you’re using chat or Composer. So definitely always turn it off when you’re learning something new.

49:06 – Arvid (Host)
Yeah, that sounds about right. Otherwise it becomes a crutch.

49:10 – James (Guest)
Now, there is an argument for: I don’t want to learn how to write AppleScript, but I want to use AppleScript to automate my Mac. So you could basically say, I’m going to Cursor my way through it, go through the pain, but not understand it, and still utilize that tech. But if you want compounding growth, and you want to use that knowledge from project to project, then that’s where you go: I’m going to take the guardrails off, I’m actually going to learn it myself. So it depends on whether you don’t mind not knowing, or whether you’re going to need that knowledge for future projects.

49:44 – Arvid (Host)
Yeah, that’ll set people apart, right? That will set apart the people who care about deeply understanding things from the people who just want to get stuff done, and the different levels of growth between them.

49:57 – James (Guest)
One tip that I’ve got is emotional prompting. If you tell it things like, you know, I’m going to lose my job if I don’t get this right, that will actually give you better outputs than if you just ask it to do the task.

50:11 – Arvid (Host)
do you know the, the technical background?

50:13 – James (Guest)
no for this no, I mean, they’ve they’ve seen it in science, you know in in loads of different papers, um, in research papers, but they I I guess it’s because when you’re emotionally blackmailed, there’s a sense of urgency, um and so. So yes, I will do that from time to time. You must write my Terraform code or my Nextjs code, otherwise I’m going to lose my job. I just feel like I’m basically threatening this life form that I hope doesn’t become sentient one day. Yeah right.

50:44 – Arvid (Host)
Yeah, that’s pretty funny. But that’s also, I think, the big difference between being an actual technical manager, who has to have people skills, and being able to just yell at an LLM all day long. That’s the big difference. That’s really cool. Thank you so much for sharing all of this. I’m really excited to dive deeper into Cursor, and obviously I work with AI all day, every day. It’s part of my business, part of my software development strategy, and I’m just excited about the technology. I bet there are a lot of people out there who have really enjoyed this as well and would like to know more about you. So if people want to find you, want to follow you, where should they go and what should they look at?

51:23 – James (Guest)
Yeah, so I’m on LinkedIn and on X. On LinkedIn it’s James Phoenix, and on X it’s JamesAPhoenix12. I’ve also got a product, a Next.js boilerplate that comes with Cursor already set up, and it comes with integration testing, so Playwright and Jest tests. That’s called CursorDevKit.com, and the idea behind it is to give you a really nice boilerplate that’s also got all the benefits of everything you should have with Cursor. So those are the places I hang out, mainly.

51:59 – Arvid (Host)
Yeah, I follow you on Twitter, and I often watch you on Twitch when you stream. That’s really cool too; seeing you build software is quite enjoyable. It’s bizarre watching people build software as an entertainment thing, but it’s really fun, it’s really cool. Thanks so much, man. I appreciate everything you do to teach people how to deal with this stuff, and also to supply them with software to get things started. That is really, really cool. Thanks, man. I really appreciate everything you did.

52:23 – James (Guest)
Yeah, yeah. Thanks a lot for having me, Arvid. I really appreciate it. And yeah, see you around.

52:44 – Arvid (Host)
And if you want to support the show, head over to ratethispodcast.com/founder. It makes a massive difference if you show up there, because then the podcast will show up in other people’s feeds. Any of this will help the show. Thanks so much for listening. Have a wonderful day, and bye-bye.
