Is there space for 'craft' in the world of AI?
My friend Alan wrote yesterday in a thread on Bluesky:
In my end-of-year review I was described as a 'craft engineer' more than once. I don't disagree with this, but I have to wonder if this approach (one that has been comfortable for me for more than 10 years) is totally incompatible with modern AI tooling and a 'deliver it' mindset in a team?
I want to split this across what I do; use the new tools and deliver quickly in my workplace, and not use AI at all for my personal work and use that for the 'craft'
I can see these being complimentary to each other, but I am struggling with the change in my day job
My current AI use at work is mostly limited to using the tools to collaborate on a plan and scope for a change, then writing the change we have agreed upon by hand. This is a happy medium for me, but it's not very conventional and I am noticeably slower than team mates on pure pace of shipping
When we have things like a large state machine to determine lazy loading in a component so dense it would take me some time to really appreciate its nuance, offloading that to AI seems reasonable. But it also keeps me in a place of misunderstanding, and I really don't love that
When AI can explain code back to us as well as write it, does that misunderstanding matter anymore? Is this just the next layer of abstraction?
Having had a think about this, I think the TL;DR is yes, we should still have some level of "craft" in the world of Large Language Models (LLMs), but there's a balance - as ever - around how much we focus on the craft.
As I've written about before, I'm a self-described "cautious skeptic" of AI and LLMs, and I'm trying to do more with AI where it makes sense, which makes me more of a "moderate" in the discussion, somewhere between AI maximalist and "never use AI".
I think I'd argue I'm somewhere more towards the "craft" side of the spectrum, and have been someone who's had similar feedback in the past of being a bit too focussed on "doing the right thing".
"engineer"
I don't want to rehash the "what's the difference between a software engineer and a programmer" discourse (because it's largely gatekeeping), or "are software engineers 'real' engineers", but an interesting point is that our job titles include the word "engineer". As software engineers, our job is not only to write some code, but to consider the trade-offs between decisions we make, how users will use the features, what patterns will make our code clear to the reader, and how to address scalability and security concerns in our designs.
On top of the core job, we also have areas where we do want to work on our craft - writing elegant code or introducing abstractions and reusable patterns that reduce complexity and provide a common language for interactions, improving our tooling and processes to make it easier (but safer!) for folks to ship code, and even learning how to be more productive so we're more effective engineers.
In the world of AI, some of these things can be thought through in collaboration with - or completely outsourced to - an LLM, but I still believe there's a lot of value in having a human in the loop, guiding the resulting code and its patterns to deliver a better outcome overall, as well as honing the craft of how to use LLMs effectively. Although there's discussion about engineers becoming primarily "spec authors" or leaning towards a more product-y role, I feel that it's still near necessary for an engineer who's using an LLM to understand what is being produced, so they can critique the implementation, rather than only thinking about the high-level structure.
These engineers are able to highlight when an LLM is reaching for an existing pattern which is no longer recommended (despite it being more popular in the codebase than the new pattern we're moving towards), and are aware of features that aren't in an LLM's training data. These aren't necessarily straightforward to make the LLM aware of due to context window limitations, which don't affect humans in the same way, given our own means to recall knowledge and pattern-match.
Additionally, several of these things can also be more of a "gut feel" which is hard to explain to an LLM, but is something that humans pick up over months/years of working in a codebase, and so they can surface this at review time. The code is getting reviewed by someone, right?
You're (probably) not Vibe Coding?
If we take the original definition of Vibe Coding from Andrej Karpathy:
There's a new kind of coding I call "vibe coding", where you fully give in to the vibes, embrace exponentials, and forget that the code even exists
It's likely that in a corporate environment, there's at least one review (whether human or LLM) of LLM-generated code. While it may be true you're not reviewing the code when working on a hackathon or proof-of-concept, it's probably not a great (nor secure) idea to launch thousands of lines of code straight into production.
Given we're most likely reviewing the code, I'd argue it's very important to make sure that the resulting code is not only "good enough", but generally as good as if a human had sat down to write it well. Not only does this approach keep things consistent, but it ensures that we're guiding LLMs towards a better overall style, similar to how, as more senior engineers, we would coach more junior engineers.
We should be working iteratively to keep the bar high with generated code, in my opinion, regardless of whether "we could just rewrite it with AI", as we should be treating production codebases the same, regardless of the author.
Token count go brrrrr
Another key reason I see that there's space for craft in this age is that LLMs generally tend to write more verbose code than a human would.
This is not a problem for human reviewers if you truly Vibe Code it, but the more code is written, the higher the token count, which means that it'll cost more for an LLM to then interact with and understand the codebase.
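As a rough sketch of that arithmetic, here's a comparison of two functions that do the same thing. Everything in it is an assumption for illustration: the "4 characters per token" figure is a commonly quoted rule of thumb rather than a real tokenizer, the price is hypothetical and not any provider's actual pricing, and the names (`estimate_tokens`, `estimate_read_cost`, `CONCISE`, `VERBOSE`) are made up for this example.

```python
# Assumption: a rough rule of thumb of ~4 characters per token.
# Real tokenizers differ by model and by language.
CHARS_PER_TOKEN = 4

# Assumption: a hypothetical price, not any provider's real pricing.
PRICE_PER_MILLION_INPUT_TOKENS = 3.00


def estimate_tokens(source: str) -> int:
    """Roughly estimate the token count of some source code (rounding up)."""
    return -(-len(source) // CHARS_PER_TOKEN)


def estimate_read_cost(source: str, reads: int) -> float:
    """Estimate what it costs for a model to read `source` `reads` times."""
    total_tokens = estimate_tokens(source) * reads
    return total_tokens * PRICE_PER_MILLION_INPUT_TOKENS / 1_000_000


# Two hypothetical implementations of the same behaviour.
CONCISE = '''def total(prices):
    return sum(prices)
'''

VERBOSE = '''def total(prices):
    # Initialise the running total to zero
    running_total = 0
    # Loop over every price in the list of prices
    for price in prices:
        # Add the current price to the running total
        running_total = running_total + price
    # Return the final running total
    return running_total
'''

if __name__ == "__main__":
    for name, source in (("concise", CONCISE), ("verbose", VERBOSE)):
        tokens = estimate_tokens(source)
        cost = estimate_read_cost(source, reads=1_000)
        print(f"{name}: ~{tokens} tokens, ~${cost:.4f} per 1,000 reads")
```

The absolute numbers are meaningless on their own, but the ratio is the point: every extra line gets paid for again each time the model has to read the file.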
I've joked before that it may be intentional by LLM providers to guide LLM-generated code to being more expensive to maintain due to token count (therefore lining their pockets), and who knows if it's true?
Something I've mentioned before on Fallthrough is that we're seeing changes to LLM providers where they're starting to reduce their offerings to folks on paid plans, nudging them towards higher plans for more powerful models. I feel this may be the start of LLM providers moving towards the "real pricing" of what it costs to actually serve a model, with subscriptions and API prices tending closer to an "at cost" model.
In a time where you're then starting to really feel the burden of the per-token cost, how will your organisation handle incredibly verbose code that you're now stuck with?
Even without this perceived potential price hike, what happens if you're on a train/plane/have a power cut and have no Internet, but need to keep working? Does that mean you're effectively blocked if there's no way to reason with your complex codebase without these tools? Although local models are great for some use cases, it's very unlikely they will be powerful enough for you to get similar output to a frontier model, and so you won't even have a reasonable backup.
(Related: isaiprofitable.com)
The more we rely on using whatever comes out of the model, instead of honing it and treating it like valuable code that needs to be readable, maintainable and something both humans and LLMs can contribute to, the worse off we'll be in these situations.
Do we actually care about quality any more?
Something I've also seen and heard from folks across the industry is that - as Alan notes - there's a drive to "deliver", and work at a higher pace now we have robots to go away and do more work for us.
Over the different companies I've worked at over the years, something my leadership has found is that I like to make the implicit explicit. My neurospicy brain doesn't like when it's unclear what is being asked of me - especially if it seems to be a top-down mandate like this - and I've found that it helps others when we make these things clearer, because there may also be language barriers or cultural differences at play that lead to a lack of clarity, so we may as well clarify for everyone involved.
If we had this conversation with our leadership about why "craft" is holding us back from "delivery", it's unlikely that companies want to admit on paper that "we don't care what the level of quality of software we're shipping is, only that we are shipping quickly, regardless of quality", right? But this seems to be what companies are asking of us.
I've found having this (sometimes tough) conversation about making that explicit to be really useful - in the best case, they clarify the position is "no, we don't want that, what we actually mean is [...]", which leads to clarity for everyone involved. But in some cases, you have leadership who don't want to explicitly say yes, that is what they're asking, because making that clear makes them uncomfortable.
I've found that leaning on that and trying to make leadership reflect on it helps them understand, at least a little, the hypocrisy here. Even if it doesn't change anything, letting them admit (at least to themselves) that they're uncomfortable with this realisation helps, as does making it clear for everyone else.
A good practice I've found over the years is also to "keep receipts" - make sure you write notes from conversations like this, ideally somewhere that others have access to, as it makes sure that a) you have them to refer back to when you need to and b) it's clearer to everyone involved what was actually discussed and what the result was.
For instance, if the answer is "no, we only care about speed, not quality", then that can be documented, shared with the team, and you can then point to that in the future when these discussions come up again.
It Depends?
Unfortunately the "speed vs quality" discussion isn't a new one, but the age of shipping more code with LLMs does make it a bit more difficult when it's much easier to ship so much more (mediocre) code. There's always been a balance between the two - aiming for "perfect" leads to taking a lot longer to deliver (and possibly finding out you had some wrong assumptions), while going for "we don't care at all" only hurts your users/customers - and a lot of the time, it's context dependent, too.
I think - as engineers within an organisation - there's a little bit of needing to "go with the flow", and follow the guidance we're receiving, while also doing what we can to try and raise the bar of those around us.
That being said, I do feel there's still a lot of value in being a craftsperson who works to produce better overall results while honing their skills, like there is still space for artisanal, farm-to-table code in the world.