Organizing computer windows
Published: 2022 November 13
I find it satisfying to have a large physical desk, where the top surface lets me spread out the stuff I’m working on in various piles, with plenty of drawers to hide stuff away for later.
The physical affordances of printed documents, file folders, and small cardboard boxes are exceptional: They can be quickly moved, stacked/unstacked, and strewn about without conscious thought or precision motor control.
Unfortunately, the same isn’t true for virtual documents and file folders. Most MacOS windows require substantial mental effort to rearrange and manipulate — both to visually locate and then to precisely move the cursor to one of a few “safe” regions where the window can be manipulated. Even determining where a window can be picked up is hard, as fashionable minimalism has eliminated visual cues and interactive widgets like tabs have invaded the formerly sovereign title bar regions.
Can you tell exactly where this window is safe to “pick up”?
It’s even harder when the window isn’t focused, which is the typical case when you are trying to move several windows at once.
I can tolerate these limitations on my small laptop screen, where I usually just switch between two fullscreen windows. However, on a larger monitor (or when using several), the limitations of the MacOS window manager grate on my nerves. As I mindlessly try to arrange two windows to be side-by-side, I can’t help but recall Gentner and Nielsen’s 1996 Anti-Mac Interface:
The dark side of a direct manipulation interface is that you have to directly manipulate everything
How I organize my windows
Thankfully, there are dozens of fellow nerds upset with this status quo, and some of them have done the grueling, thankless work of reverse engineering and patching up MacOS internals to provide a set of tools to get closer to that tangible “it’s fast to shuffle working papers around on a surface and throw the rest into grouped boxes to pull back out later” physical ideal.
Here are my requirements:
- Windows are assigned to one of 10 groups (AKA “spaces” or “virtual desktops”)
- A monitor shows exactly one group
- Each group is shown on at most one monitor (I have no need to see the same windows on multiple monitors)
- Windows are automatically resized to fill up the available space.
- There are no time-wasting, keyboard-input-eating animations
- It should be fast and thoughtless to:
  - move a window to a different group
  - display a different group on the current monitor
  - change focus between visible windows within a group
  - change focus between monitors
Many of these requirements come from fond memories of using XMonad on Linux in the late 2000s. Basically I just want that, but on Mac (because I also want battery life, to drag my signature onto PDFs, and the ability to print).
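The group and monitor rules above can be written down as a small check. This is only a sketch: the monitor->group map and the valid-layout? name are made up for illustration and are not part of the script further down.

```clojure
;; Hypothetical model: a map from monitor id to the group number it shows.
;; Because it is a map, each monitor shows exactly one group by construction.
(defn valid-layout?
  "True when every shown group is one of the 10 groups and no group
  appears on more than one monitor."
  [monitor->group]
  (let [groups (vals monitor->group)]
    (and (every? #(<= 1 % 10) groups)
         (= (count groups) (count (set groups))))))

(valid-layout? {:left 1 :right 2}) ; two monitors showing different groups
(valid-layout? {:left 1 :right 1}) ; the same group on both monitors
```

The first call satisfies the rules and the second does not, since group 1 would be visible on two monitors at once.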
A concrete example
(Note: the following would be much clearer as a screen recording or animation, which I’ll backfill eventually.)
To give a concrete example of the workflow this supports, here are some hotkeys:
- alt j/k: move focus to next/previous window
- alt g/i: move focus to next/previous monitor
- alt 1: show group 1 on the focused monitor (alt 2, alt 3, etc. work similarly)
- shift alt 1: move the focused window to group 1
Imagine I have two side-by-side monitors with no open windows. The left monitor shows an empty group 1 and the right monitor group 2.
Say, I’d like to build a web app, so I open up my code editor on the left monitor (group 1), which automatically becomes fullscreen.
I then focus the other monitor (alt i) and open some documentation to review while I’m writing code.
I can use alt g/i to switch focus between monitors.
When I’m ready to see my app-in-progress, I focus the right monitor (alt g/i again), have it display an empty new group (alt 3), and open up a terminal and web browser there, which will automatically resize to evenly fill the screen.
I can then work on the code (on the left monitor) and see both the app and log output (on the two windows on the right monitor) simultaneously.
If I want to switch back to the documentation, I can use alt 2 to bring it up fullscreen instantly.
I find it quite helpful to have persistent groups like this: I can keep my music player on one group, Slack and other chats in another one, etc.
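The walkthrough can be replayed as a pure function over a monitor->group map. This is a sketch under assumptions: show-group and the :left/:right monitor ids are hypothetical names, and the swap rule (if the requested group is already visible on another monitor, the two monitors trade groups) is taken from the docstring of the script listed later. No Yabai calls are made here.

```clojure
(defn show-group
  "Returns the layout after showing `group` on `monitor`. If `group` is
  already visible on another monitor, the two monitors trade groups."
  [layout monitor group]
  (let [current (get layout monitor)
        ;; the other monitor currently showing `group`, if any
        other   (some (fn [[m g]] (when (and (= g group) (not= m monitor)) m))
                      layout)]
    (cond-> (assoc layout monitor group)
      other (assoc other current))))

(-> {:left 1 :right 2}      ; editor on group 1, docs on group 2
    (show-group :right 3)   ; alt 3: terminal and browser go here
    (show-group :right 2))  ; alt 2: back to the documentation
```

Asking for a group that is visible elsewhere, e.g. (show-group {:left 1 :right 3} :right 1), swaps the two monitors instead of showing group 1 twice.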
How it’s implemented
The workflow described above is implemented with:
- A custom script that uses Yabai to partition windows across MacOS spaces, invoked via
- system-wide keyboard shortcuts implemented via Karabiner-Elements, configured by a GokuRakuJoudo file (also below).
I have also been happy with Shortcat, which cleverly allows one to “click” on anything by typing a few characters and matching descriptions provided by the system accessibility APIs. This works particularly well in Chrome, and navigating the web via the keyboard this way can be substantially faster than dragging the mouse around. (It reminds me of the Conkeror Emacs-inspired web browser, another fond 00’s memory.)
TODO / notes
I’ve been using this system for about a month, and so far it works OK:
- It can feel laggy at times when switching focus. I’m not sure if that’s the ~100ms it takes to run Yabai and my wrapper, or if it’s my M1 MacBook Air struggling to drive two 4K monitors over DisplayPort in addition to its own screen.
- MacOS moves windows across spaces behind my back. I haven’t spent the time to track this down yet, but suspect it’s related to plugging/unplugging external monitors.
- The Yabai space labels don’t correspond to the ones shown by MacOS after that “swipe 4 fingers up to zoom out” gesture, so it’s hard to quickly see which space is on which monitor.
I suspect I’m close to the limit of what can be bolted on or otherwise hacked onto a consumer-centric window manager.
At some point, I’d love to work with a team to explore more substantial ideas and designs that lean into windows as a foundational interface to the computer. A window manager that, for example:
- allows “broadcasting” either live or pre-recorded keyboard and mouse events into multiple windows, allowing for uniform automation/macro capabilities without explicit “opt-in” from app developers
- allows windows to represent applications running on remote machines with different operating systems; or just have the entire window manager in the browser, with every window corresponding to a different process / machine
- enables other people to “join” specific (groups of) windows with their cursors and keyboard events, making any desktop app “multiplayer”
- can crop or otherwise “lie” to the underlying application about its window size, so I could, e.g., reclaim the space taken up by Chrome’s tab and location bar (which I never use)
- does live character recognition / capture within window regions, effectively exposing any information visible on the screen as a computable API
- allows one to make windows transparent, so they can be overlaid like physical transparencies or that translucent paper architects use to sketch on top of blueprints
- allows one to draw on and attach labels to “live” windows like post-it notes, enabling information-at-the-point-of-need workflows rather than screenshots arranged in a static doc
Defining the commands I want using Yabai
#!/Users/dev/.dotfiles/bin/bb
;;explicit path above because I can't figure out how to update Karabiner-Elements's PATH to find babashka via `env`.
(require '[clojure.java.shell :refer [sh]]
         '[cheshire.core :as json]
         '[clojure.core.match :refer [match]])
;;Space indexes aren't persistent, seems like MacOS rearranges them somehow.
;;So use labels s1, s2, ..., s10 for all spaces so they can have a consistent identity
;;Moving spaces between monitors tears the background =( https://github.com/koekeishiya/yabai/issues/781
(def yabai
  "/opt/local/bin/yabai")
(def placeholder-space-label "s11")
(defn active-display-idx []
  (-> (sh yabai "-m" "query" "--displays" "--display" "mouse")
      :out
      (json/parse-string true)
      :index
      str))
(defn current-state
  []
  (let [spaces (->
                ;;yabai takes about 20--30ms to run
                (sh yabai "-m" "query" "--spaces")
                :out
                (json/parse-string true)
                (->> (map #(update % :display str))
                     (map #(assoc % :idx (str (:index %))))))]
    {:display->spaces (reduce (fn [m {:keys [idx display] :as space}]
                                (assoc-in m [display idx] space))
                              {} spaces)
     :spaces-by-label (into {} (map (juxt :label identity) spaces))}))
(defn focus-space!
  [idx]
  (sh yabai "-m" "space" "--focus" idx))
(defn focus-display!
  [idx]
  (sh yabai "-m" "display" "--focus" idx))
(defn move-space-to-display!
  [{:keys [display->spaces spaces-by-label]} space-label display-idx]
  (let [space (spaces-by-label space-label)
        old-display-idx (:display space)]
    (when (not= display-idx old-display-idx)
      (let [only-space-on-old-display? (= 1 (count (display->spaces old-display-idx)))]
        (when only-space-on-old-display?
          ;;have to put placeholder on old display first, so that moving the space to the new display always works.
          ;;there's no atomic space swapping in yabai yet: https://github.com/koekeishiya/yabai/pull/555
          (prn "Space" space-label "is only space on" old-display-idx ", moving placeholder to take its place")
          (sh yabai "-m" "space" placeholder-space-label "--display" old-display-idx))
        ;;move space to new display
        (sh yabai "-m" "space" space-label "--display" display-idx)
        ;;if the space we're moving was visible, then swap whatever was on the target display to the old display.
        (when (:is-visible space)
          (let [new-display-visible-space-label (->> (display->spaces display-idx)
                                                     vals
                                                     (filter :is-visible)
                                                     first
                                                     :label)]
            (sh yabai "-m" "space" new-display-visible-space-label "--display" old-display-idx)))))))
(defn switch-to-space-on-active-display!
  "Switches current display to space and focuses it. If space is already visible on another display, swap the current space to that display"
  [state space-label]
  (let [display-idx (active-display-idx)]
    (move-space-to-display! state space-label display-idx)
    (focus-space! space-label)
    (focus-display! display-idx)))
(defn focused-space
  [state]
  (->> (vals (:spaces-by-label state))
       (filter :has-focus)
       first))
(defn toggle-stack!
  [state]
  (let [{:keys [idx type]} (focused-space state)]
    (sh yabai "-m" "config" "--space" idx "layout" (case type
                                                     "bsp" "stack"
                                                     "stack" "bsp"))))
(defn focus-window!
  [state next-or-prev]
  (let [{:keys [label type]} (focused-space state)]
    (case type
      "bsp"
      (or (= 0 (:exit (sh yabai "-m" "window" "--focus" next-or-prev)))
          (= 0 (:exit (sh yabai "-m" "window" "--focus" (case next-or-prev
                                                           "next" "first"
                                                           "prev" "last")))))
      "stack"
      (or (= 0 (:exit (sh yabai "-m" "window" "--focus" (str "stack." next-or-prev))))
          (= 0 (:exit (sh yabai "-m" "window" "--focus" (case next-or-prev
                                                           "next" "stack.first"
                                                           "prev" "stack.last"))))))))
(time ;;hmm, most commands are still on the order of 100ms, probably the query and json parsing...
 (let [state (current-state)]
   (match (vec *command-line-args*)
     ["info"]
     (clojure.pprint/pprint (current-state))
     ["active-display"]
     (prn (active-display-idx))
     ["switch-to-space" space-label]
     (switch-to-space-on-active-display! state space-label)
     ["toggle-stack"]
     (toggle-stack! state)
     ["focus-window" next-or-prev]
     (focus-window! state next-or-prev)
     :else
     (prn "Unknown: " *command-line-args*))))
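To make the shape of the state map concrete, here is a sketch that runs the same indexing as current-state over hand-written data instead of live Yabai output. The sample-spaces vector is made up (it only carries the keys the script reads), and index-spaces is a hypothetical name for the pure part of current-state.

```clojure
;; Made-up stand-in for the parsed result of `yabai -m query --spaces`.
(def sample-spaces
  [{:index 1 :label "s1" :display 1 :is-visible true  :has-focus true  :type "bsp"}
   {:index 2 :label "s2" :display 2 :is-visible true  :has-focus false :type "bsp"}
   {:index 3 :label "s3" :display 2 :is-visible false :has-focus false :type "stack"}])

(defn index-spaces
  "Same indexing as current-state, minus the shell call and JSON parsing."
  [spaces]
  (let [spaces (->> spaces
                    (map #(update % :display str))
                    (map #(assoc % :idx (str (:index %)))))]
    {:display->spaces (reduce (fn [m {:keys [idx display] :as space}]
                                (assoc-in m [display idx] space))
                              {} spaces)
     :spaces-by-label (into {} (map (juxt :label identity) spaces))}))

(def state (index-spaces sample-spaces))

;; Display "1" holds a single space, which is the case where
;; move-space-to-display! first parks the placeholder space there.
(count (get-in state [:display->spaces "1"]))

;; Look up a space by label to find which display it lives on.
(get-in state [:spaces-by-label "s3" :display])
```

Display and space indexes are kept as strings throughout, so they can be passed straight back to the yabai command line.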
System-wide keyboard shortcuts
;; See: https://github.com/yqrashawn/GokuRakuJoudo/blob/master/tutorial.md
;; Keys specified as physical ones (i.e., ignoring alternative keyboard layouts like colemak, dvorak, etc.)
;; update by running:
;;
;; GOKU_EDN_CONFIG_FILE=~/.dotfiles/scripts/karabiner.edn goku
;; ! | means mandatory
;; # | means optional
;; C | left_command
;; T | left_control
;; O | left_option
;; S | left_shift
;; F | fn
;; Q | right_command
;; W | right_control
;; E | right_option
;; R | right_shift
;; !! | mandatory command + control + optional + shift (hyper)
;; ## | optional any
{:applications {:chrome ["^com\\.google\\.Chrome$"]
                :emacs ["^org\\.gnu\\.Emacs$"]}
 :templates {:yabai "/opt/local/bin/yabai -m %s"
             :yabaik "/Users/dev/.dotfiles/bin/yabaik %s"}
 :main [{:des "Caps lock to control"
         :rules [[:##caps_lock :left_control]]}
        {:des "Yabai"
         :rules [
                 [:!Oj [:yabaik "focus-window next"]]
                 [:!Ok [:yabaik "focus-window prev"]]
                 [:!Ou [:yabai "display --focus prev || /opt/local/bin/yabai -m display --focus last"]]
                 [:!Oi [:yabai "display --focus next || /opt/local/bin/yabai -m display --focus first"]]
                 [:!OSu [:yabai "window --display prev || /opt/local/bin/yabai -m window --display last"]]
                 [:!OSi [:yabai "window --display next || /opt/local/bin/yabai -m window --display first"]]
                 [:!Ospacebar [:yabai "space --rotate 90"]]
                 [:!Otab [:yabai "display --focus recent"]]
                 [:!Oy [:yabaik "toggle-stack"]]
                 [:!O1 [:yabaik "switch-to-space s1"]]
                 [:!O2 [:yabaik "switch-to-space s2"]]
                 [:!O3 [:yabaik "switch-to-space s3"]]
                 [:!O4 [:yabaik "switch-to-space s4"]]
                 [:!O5 [:yabaik "switch-to-space s5"]]
                 [:!O6 [:yabaik "switch-to-space s6"]]
                 [:!O7 [:yabaik "switch-to-space s7"]]
                 [:!O8 [:yabaik "switch-to-space s8"]]
                 [:!O9 [:yabaik "switch-to-space s9"]]
                 [:!O0 [:yabaik "switch-to-space s10"]]
                 [:!OS1 [:yabai "window --space s1"]]
                 [:!OS2 [:yabai "window --space s2"]]
                 [:!OS3 [:yabai "window --space s3"]]
                 [:!OS4 [:yabai "window --space s4"]]
                 [:!OS5 [:yabai "window --space s5"]]
                 [:!OS6 [:yabai "window --space s6"]]
                 [:!OS7 [:yabai "window --space s7"]]
                 [:!OS8 [:yabai "window --space s8"]]
                 [:!OS9 [:yabai "window --space s9"]]
                 [:!OS0 [:yabai "window --space s10"]]
                 ]}
        {:des "Remap emacs-muscle-memory to similar keys outside of emacs"
         :rules [
                 ;;arrows
                 [:!Tl :down_arrow :!emacs]
                 [:!Tr :up_arrow :!emacs]
                 [:!Ty :right_arrow :!emacs]
                 [:!Tn :left_arrow :!emacs]
                 [:!Tu :escape :!emacs]
                 [:!Cr :page_up :chrome]
                 [:!Cl :page_down :chrome]
                 ]}]
 }
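The twenty per-group rules in the config above follow one pattern, so they could be generated rather than typed. The sketch below is an assumption-laden aside, not part of the original setup: space-rules is a hypothetical helper, and since the config is a plain EDN file, its output would have to be pasted into the :rules vector by hand.

```clojure
(defn space-rules
  "Builds the alt-N (switch to group) and shift-alt-N (move window to group)
  rules for groups 1 through 10. Group 10 is bound to the 0 key."
  []
  (let [groups  (range 1 11)
        key-for #(mod % 10)]
    (vec (concat
          (for [n groups]
            [(keyword (str "!O" (key-for n))) [:yabaik (str "switch-to-space s" n)]])
          (for [n groups]
            [(keyword (str "!OS" (key-for n))) [:yabai (str "window --space s" n)]])))))

;; Print one rule per line, ready to paste into the config.
(run! prn (space-rules))
```

Keeping the list written out by hand, as the config does, has the advantage that the file stays readable without running anything.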