Published: 2024 October 15

Making a dictation app in a weekend

My friend Geoffrey Litt has been tweeting about how useful it is to dictate — literally speak out loud — to LLMs when coding:

The key is that you can get very detailed with your prompt since voice makes it so fast. Just say everything you’d say to a junior, be verbose.

If you’re doing it right, it should feel like your prompts are absurdly long. Like, I’ll sometimes ramble for minutes. You should rarely stop to re-record – if you make a mistake just keep going and correct yourself. Feel free to think aloud on the fly

Even though I can type okay — 136 WPM w/ 99% accuracy, according to MonkeyType — I’m curious about how it’d feel to speak to the computer. (I also suspect it’d be good to practice speaking clearly and off-the-cuff about technical stuff.)

Unfortunately, the app Geoffrey uses (SuperWhisper) and the handful of other ones I found all require a newer version of macOS than the one I was running.

Since it’s out of the question to either:

I made my own lil’ app instead: Whispertron.

My two primary requirements were:

I never touch Xcode and barely know anything about Swift, so I had Claude write most of the code via Cursor. (Claude also helped me navigate the byzantine Xcode nested menus and incantations required to allow an app to access the microphone.)

The functionality itself is pretty straightforward:

That’s it, that’s the whole app.

I’m extremely chuffed to have gotten it done in one weekend. According to my notes, it took about 9 hours across two sessions, though that’s a bit of an overestimate, as a few hours were spent noodling with word-by-word “live” transcription before I gave up and just did single-shot transcription (which is what the Whisper model is designed for).

I find it quite liberating and enjoyable to have LLMs generate code for a language/ecosystem that I don’t have any preconceived notions about. I just don’t know (or care!) about the “proper” way to make a macOS app, how to structure my AppDelegate, organize code between methods vs. free functions, etc. Especially once I got the basic app itself working, I could just speak out loud to the LLM to request lots of boring code tweaks (“Hey thanks looks good, but can you give the popup window an opaque background so it pops out a bit more?”).

Anyway, feel free to check out the app (you’ll have to build it yourself). If you have experience writing macOS apps properly (or just want to have a go at managing an LLM on a toy project), I’m open to pairing and/or reviewing pull requests that:

An art wall

LLMs saved me so much time making my goofy app last weekend that I still had enough time to make a lil’ art wall behind my desk!

[Image: A bunch of framed stuff on the wall behind Kevin's desk]

There’s no big story here, I’m just happy with how it turned out =D

As it so happens, putting random postcards, Dutch illustrated animal books, and miscellaneous pandemic-hobby-project keyboards and flexures into assorted frames from Ikea goes a long way.

There’s something magical about starting a new project quickly and keeping momentum through completion. In this case that meant refusing to leave my house to buy “proper” supplies and instead cobbling together the knolled keyboards by sticky-tacking them onto the Ikea frame’s acrylic sheet, which I frosted with leftover sandpaper and placed over white fusible interfacing ironed onto white cardboard.

Next I’d like to design some kind of nicer-looking storage unit for that pile of transparent boxes underneath the shelf. I’ve got several square meters of felt and a dozen wood rounds left over from my desk build, so maybe there’s something I can do with 3D-printed tube connectors.

Misc. stuff