blowviate

tuesday, december 16, 2025

Today's Daily Double is... the current coding setup + workflow!

change is constant: the modern software engineer maximizing the efficacy of an ai-heavy workflow has to tweak and update their approach constantly. with the way that models and the tools around them are evolving, the only way to stay on top of it is to relentlessly experiment with new releases while studying how others are leveraging these tools in novel and effective ways. i've enjoyed this fast iteration of new toys to play with, though i have heard others bemoan the ever-shifting landscape. i think i'm advantaged in keeping pace because i didn't spend many years in the industry before ai rocked it: i never stopped being a sponge for new info or reached complacency, and was instead happy to have what felt like a more level playing field, where i have the tools to self-serve my own learning and growth.

voice first: in my current workflow, probably 60-80% of my text output is dictated by whispering into a gooseneck microphone. having struggled in the past with rsi and tendonitis, i'm always looking for ways to reduce repetitive movement in my hands. with wispr flow having become as good as it is, and with the ability to very quietly whisper into a microphone instead of talking out loud where everyone else can hear me, there's a new possibility of true accessibility in interacting with a computer by voice. i press a hotkey before whispering into the mic, and the text lands in slack, cursor, documents i'm writing, pr review comments, or whatever else i would normally need to type out. it's most effective for natural-language tasks, which are increasingly common given that much of what i do is interacting with one or more agents via prompting. to get a little more use out of this, i've built a tool on top of wispr flow that i call sayCast. it lets me use a second hotkey to have my voice interpreted as commands to the computer rather than as dictation. what this looks like: i hit the hotkey and say "right half," or "fullscreen," or "open repo ${reponame}," or "next display," or "open chrome browser to localhost," etc etc etc. i've got loads of snippets for bespoke behavioral flows i need ("start jeopardy fire" opens iterm to the jeopardy repo, pulls latest main, starts the server and client in separate tabs, launches quicktime player, and uses the terminal to input adb commands to launch the jeopardy game on firetv), as well as a heads-up display that shows what my voice input registered as and whether it matched a command. with this i've been able to automate probably a bit over half of my general computer-interfacing tasks by voice. as i spend more time developing the tool, i aim for this to become my dominant method of interfacing.
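to make the shape of this concrete, here's a minimal python sketch of what a sayCast-style command-matching layer could look like, assuming the speech-to-text layer (wispr flow, in my case) hands you a transcribed string. the specific commands, apps, and paths are hypothetical placeholders, not sayCast's actual implementation, and the hotkey wiring and hud are out of scope:

```python
# hypothetical sketch of a sayCast-style dispatcher: take a transcribed
# phrase, run the matching command if one exists, and report the result
# so a hud could display it. commands and paths are made-up examples.
import subprocess

COMMANDS: dict[str, list[str]] = {
    # window management via rectangle's url scheme (assumes rectangle is installed)
    "right half": ["open", "-g", "rectangle://execute-action?name=right-half"],
    "open chrome browser to localhost": ["open", "-a", "Google Chrome", "http://localhost:3000"],
}

def dispatch(transcript: str) -> str:
    """normalize the transcript, run any matching command, return what the hud should show."""
    phrase = transcript.strip().lower().rstrip(".")
    # parameterized phrases like "open repo <name>" need prefix matching, not an exact lookup
    if phrase.startswith("open repo "):
        repo = phrase.removeprefix("open repo ")
        subprocess.run(["open", "-a", "Cursor", f"/Users/me/code/{repo}"])  # hypothetical path
        return f"matched: open repo {repo}"
    if phrase in COMMANDS:
        subprocess.run(COMMANDS[phrase])
        return f"matched: {phrase}"
    return f"no match: {phrase!r}"
```

the core is just normalization plus a lookup table; compound flows like "start jeopardy fire" would map a single phrase to a list of steps instead of a single command.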

commanding the fleet: with models getting as powerful as they are and the tooling around them improving in step, i find that my role involves ever less actual pen-to-paper code writing and ever more analyzing the responses of one or more agents i've set to a task. say i need to implement an additional feature for the web checkout app my company is building. i come in with my own preconceived opinions about what i might like to do, what tools to use, and how i would architect the addition. i lay out my thoughts, then set three agents on three different models to explore the task, giving each a high-level outline of the feature and a prompt like "explore this area and detail an approach to building this feature in a way that adheres to the standards and patterns already present in the repo," or "investigate the best possible solutions using the x, y, and z frameworks and the a, b, and c paradigms i prefer," or "here is the task i'm trying to accomplish and my idea for how to accomplish it; thoroughly review it to determine whether it's the best path forward, or else suggest improvements." once they've all returned, i compare and contrast their ideas with each other's and my own, often iterating by asking agent 2 why it didn't take agent 1's approach. that gives me a broader understanding of the problem space, as if i were talking to a few senior engineers, except i get it immediately and i don't have to interrupt anyone. commanding a few agents concurrently, rather than relying on the output of just one, is extremely helpful for understanding the task at hand and the pros and cons of various approaches to solving it. the practice also helps me understand the strong and weak points of the ever-evolving models by seeing how they respond to the same or similar prompts.
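the fan-out itself is simple to sketch. the snippet below is a hypothetical illustration, assuming some headless agent cli (the `agent-cli` name and its flags are made up) that takes a model name and a prompt and prints its write-up:

```python
# hypothetical fan-out: send the same exploration prompt to several models
# concurrently and collect their write-ups for side-by-side comparison.
import subprocess
from concurrent.futures import ThreadPoolExecutor

MODELS = ["model-a", "model-b", "model-c"]  # stand-ins for three different models

def run_agent(model: str, prompt: str) -> str:
    """shell out to a (made-up) headless agent cli and capture its response."""
    result = subprocess.run(
        ["agent-cli", "--model", model, "--prompt", prompt],  # placeholder cli
        capture_output=True, text=True, check=True,
    )
    return result.stdout

def fan_out(prompt: str) -> dict[str, str]:
    """run the same prompt against every model at once; key responses by model."""
    with ThreadPoolExecutor(max_workers=len(MODELS)) as pool:
        futures = {m: pool.submit(run_agent, m, prompt) for m in MODELS}
        return {m: f.result() for m, f in futures.items()}

if __name__ == "__main__":
    responses = fan_out(
        "explore this area and detail an approach to building this feature "
        "in a way that adheres to the standards and patterns already in the repo"
    )
    for model, text in responses.items():
        print(f"--- {model} ---\n{text}\n")
```

however the agents are actually launched, the pattern is the same: identical prompt, several models, compare the returns.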

when it comes to actually implementing the task, i prefer to use a single instance of the most powerful agent available to me, which, of course, changes constantly. while i've tried running many in parallel, cursor gets a little buggy when navigating between each agent's diffs, and i don't often find much benefit in seeing three very similar coding implementations. once the agent has completed its task, i provide code review and suggestions: "hey, look at the pattern we're using in this other nearby service; can we more closely imitate these aspects of it?", or "hey, these unit tests don't seem to be nailing the spec of what we were worried about testing here; how could we better prove this will have no regressions?", or "how can we refactor this to be more semantically meaningful, more concise, and built on more shared components?", etc. especially with the advent of the browser mcp and agents that use it more frequently even without direct guidance, the loop feels closer to being closed. the scope of how much work can be done before a human needs to be in the loop is surely increasing, but at present, if i just sent the agents off on our linear board unsupervised, it would be chaos. current agentic tools meaningfully lack the broader ecosystem context: deployment and infra processes, which platforms the apps run on, and the ability to validate changes by examining user data in tools like datadog. for this reason, i think a fully closed loop of agent contribution is still a while out. for now, i'll enjoy my personal team of advisors and coders who help me expand the purview of what i can accomplish while dramatically reducing how long it takes.
