Alex Chernysh · Agentic behaviorist · Tel Aviv

RightLayout: Shipping a Mac AI Tool, Then Letting Go

Why a solo Mac keyboard-layout corrector with a CoreML model trained from scratch beat a 10-person dictionary tool, and why I open-sourced it.

May 8, 2026 · 7 min read
AI Systems
On this page

  • 1. The problem with dictionary punto-switchers
  • 2. The bet
  • 3. Where the model actually wins
  • 4. Why I open-sourced it instead of scaling it
  • 5. What you can take
  • Repositories and downloads
  • Related reading

I built RightLayout because every keyboard-layout corrector I tried for macOS broke on names, code, and typos. It was a small bet: train a CoreML model from scratch, three layouts, on-device. It worked. Then the maintenance bill came due, and I open-sourced it.

What this post covers

  • why dictionary-based punto-switchers fail on the things you actually type
  • what "trained from scratch" means here, and the rough shape of the model
  • the maintenance trap that solo macOS system programming sets for you
  • why I handed it over to the community instead of pretending I would scale it

1. The problem with dictionary punto-switchers

If you type in two or three languages on a Mac, you have lived this. You start a sentence in English, the layout is still on Russian, and the screen fills with Cyrillic noise. The fix exists in theory. There are tools that watch your input and flip the layout when the word "looks wrong".

The classical version of that tool is dictionary-based. It checks each word against a frozen vocabulary and corrects when the word does not appear. That works for the easy cases. It also fails the moment a real human starts typing real text.

Names break it. Code breaks it. Acronyms break it. URLs break it. The word kubectl is not in any dictionary, English or Russian, but it is not a wrong-layout word either. A typo like helo is missing from the dictionary, so the tool helpfully turns it into рудщ. And in mixed-language paragraphs the dictionary does not even know which language to anchor against.
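
To make the failure mode concrete, here is a minimal sketch of the dictionary approach. The vocabulary and the layout map are tiny stand-ins I made up for illustration, not the real tool's data; the shape of the bug is the point.

```python
# Toy dictionary-based punto-switcher. EN_TO_RU maps each US-QWERTY key
# to the Cyrillic letter on the same physical key (standard ЙЦУКЕН layout).
EN_TO_RU = dict(zip("qwertyuiop[]asdfghjkl;'zxcvbnm,.",
                    "йцукенгшщзхъфывапролджэячсмитьбю"))

# Frozen vocabularies -- stand-ins for a real dictionary.
EN_VOCAB = {"hello", "world"}
RU_VOCAB = {"привет", "мир"}

def dictionary_correct(word: str) -> str:
    """Flip the word to the other layout if it is not in any vocabulary."""
    if word.lower() in EN_VOCAB or word.lower() in RU_VOCAB:
        return word
    # Unknown word: blindly assume wrong layout and convert.
    return "".join(EN_TO_RU.get(ch, ch) for ch in word.lower())

dictionary_correct("hello")    # known word, left alone
dictionary_correct("kubectl")  # not in the vocabulary, mangled into Cyrillic
```

The second call is the false positive: any identifier, name, or acronym outside the frozen vocabulary gets "corrected" into noise.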

The commercial Mac version of this tool was shipped by a team I respect. I think there are about ten people on it. It is solid, it is polished, and it still gets the same class of false positives, because the underlying model has no idea what you are typing in. It has only a vocabulary check.

I wanted something that read context.

2. The bet

The bet was small enough to attempt over a few weekends. Train a tiny character-level model that takes a short window of recent input and predicts which of nine classes it belongs to. The nine classes are the three native layouts (EN, RU, HE) plus the six cross-layout misfires: en_from_ru, ru_from_en, en_from_he, he_from_en, ru_from_he, he_from_ru. That class set is the whole trick. Once the model says "this looks like Russian typed on an English layout", a deterministic mapper handles the actual character substitution.
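
The division of labor between the classifier and the mapper can be sketched like this. The shipped app is Swift; this Python stand-in only illustrates the dispatch, and it covers just the EN/RU pair with an invented `remap` helper:

```python
# EN_TO_RU maps each US-QWERTY key to the Cyrillic letter on the same
# physical key (standard ЙЦУКЕН layout); RU_TO_EN is its inverse.
EN_TO_RU = dict(zip("qwertyuiopasdfghjkl;zxcvbnm",
                    "йцукенгшщзфывапролджячсмить"))
RU_TO_EN = {v: k for k, v in EN_TO_RU.items()}

def remap(text: str, predicted_class: str) -> str:
    """Apply the deterministic character substitution for a misfire class."""
    table = {
        "ru_from_en": EN_TO_RU,   # Russian intent, English layout active
        "en_from_ru": RU_TO_EN,   # English intent, Russian layout active
    }.get(predicted_class)
    if table is None:             # native class (or a pair not sketched here)
        return text
    return "".join(table.get(ch, ch) for ch in text)

remap("ghbdtn", "ru_from_en")  # → "привет"
remap("hello", "en")           # native class, untouched
```

The model only ever answers "which of the nine situations is this"; everything after that is a table lookup, which keeps the risky learned part small and the rewriting part auditable.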

The training pipeline is in the repo and is unromantic. Wikipedia and subtitle corpora for the three languages, generation of clean and cross-layout-mistyped pairs, character-level tokenization, augmentation for typos and case noise, mixup, label smoothing. The model itself is an ensemble of a small multi-scale CNN and a four-layer character Transformer, both pooled into a single linear head. It runs over a fixed 20-character window. The export goes through PyTorch into CoreML.

# from Tools/CoreMLTrainer/train.py
CLASSES = [
    'ru', 'en', 'he',
    'ru_from_en', 'he_from_en',
    'en_from_ru', 'en_from_he',
    'he_from_ru', 'ru_from_he'
]
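
The pair-generation step of that pipeline can be sketched in a few lines. This is a simplified stand-in for what lives in Tools/CoreMLTrainer/, with an invented `make_example` helper and no typo or case augmentation:

```python
# Generate (text, label) training pairs: clean Russian text, or the same
# text rendered as if typed with the English layout left active.
import random

# Cyrillic letter -> the US-QWERTY key on the same physical position.
RU_TO_EN = dict(zip("йцукенгшщзфывапролджячсмить",
                    "qwertyuiopasdfghjkl;zxcvbnm"))

def mistype(text: str) -> str:
    """Render Russian text as its English-layout misfire."""
    return "".join(RU_TO_EN.get(ch, ch) for ch in text.lower())

def make_example(clean_ru: str, rng: random.Random) -> tuple[str, str]:
    """Emit either the clean string labeled 'ru' or its misfire labeled 'ru_from_en'."""
    if rng.random() < 0.5:
        return clean_ru, "ru"
    return mistype(clean_ru), "ru_from_en"
```

The real pipeline layers typo injection, case noise, mixup, and label smoothing on top, but the core supervision signal is exactly this: the misfire label is free, because the corruption is deterministic.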

The CoreML model that ships inside the .pkg is around 14 MB. It runs entirely on-device. Inference per token window is fast enough that the correction logic stays well under the 50 ms budget I set for the whole pipeline (event tap to replacement). It is small enough to fit in the bundle and ship with no cloud dependency.
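
The fixed window means the preprocessing is trivial and constant-time, which is most of why the latency budget holds. A sketch of the window preparation (illustrative; the shipped tokenizer may differ in padding and normalization):

```python
# Prepare the fixed 20-character input window for the classifier.
WINDOW = 20

def make_window(recent_input: str) -> str:
    """Keep the last WINDOW characters, left-padding short input with spaces."""
    return recent_input[-WINDOW:].rjust(WINDOW)

make_window("ghbdtn")   # 14 spaces + "ghbdtn", always exactly 20 chars
```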

The first time it correctly turned ghbdtn into "привет" in the middle of a sentence with a code snippet in it, I knew it was going to work. The dictionary tool I had been using for years would have eaten the snippet.

3. Where the model actually wins

Three places it beats a dictionary cleanly.

It handles typos. A short word with one missing or duplicated character is still recognizable as the right language to a character-level model. Dictionary tools either silently miss the word or, worse, "correct" it into nonsense.

It handles names and code. The model has seen enough mixed-language and mixed-script text in training that an English snippet inside a Russian sentence does not trigger a flip. The dictionary version of this is a hand-maintained whitelist that grows forever.

It handles Hebrew, which is the genuinely hard one. RTL text plus a character set with no overlap with Latin or Cyrillic plus a layout that maps Hebrew letters onto English keys means the dictionary approach has to maintain three pairwise tables and a context heuristic on top. The model just learned that akuo is "שלום" typed on an English layout and moves on.
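
The akuo example works because, once the class is known, the Hebrew direction is the same table lookup as the others. A toy version, using my own transcription of the standard Hebrew layout rather than the project's actual table:

```python
# US-QWERTY key -> Hebrew letter on the same physical key (standard layout).
EN_TO_HE = dict(zip("qwertyuiopasdfghjkl;zxcvbnm",
                    "/'קראטוןםפשדגכעיחלךףזסבהנמצ"))

def fix_hebrew(mistyped: str) -> str:
    """Rewrite Hebrew text that was typed with the English layout active."""
    return "".join(EN_TO_HE.get(ch, ch) for ch in mistyped)

fix_hebrew("akuo")  # → "שלום"
```

All of the RTL and script complexity lives in the classifier's training data; the substitution itself never has to reason about direction at all.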

For three or four months I used my own tool every day. It was the first time the corrector was invisible enough to forget about.

4. Why I open-sourced it instead of scaling it

Then the maintenance bill arrived.

A free macOS utility with a learned model has a long tail of unglamorous work. The accessibility-API event tap needs to keep working across macOS versions. Apple loves to silently change permission semantics between releases. The CoreML runtime drifts. The model has no test infrastructure for "real users typing real text" because that is, by definition, not in the training set. The undo-ratio learning loop, where the tool watches users undo a correction and adapts, is hard to make safe and harder to validate without telemetry I refuse to collect.

For a paid product with ten engineers, those costs are absorbable. For a free tool maintained by one person who has a day job, they compound. Every macOS major release became a week of evening debugging. Every CoreML version bump was a small risk. Every GitHub issue was a fork in the road: do I become a Mac systems engineer in my spare time, or do I let the project rot quietly while pretending it is still maintained?

I picked a third option. I marked the project community-maintained, wrote an honest banner on the homepage and the README, kept the model in the bundle so installs still work, and moved my attention to Bernstein. The repo is public. The training pipeline is public. Pull requests are reviewed. There are no gatekeepers. If you ship good PRs, you get commit access.

That is a more honest position than "v2 coming soon, watch this space".

5. What you can take

If you want the tool, the .pkg is on the releases page. macOS 13 or newer, Accessibility permission, free. The model is inside the bundle.

If you want the code, the repo is small, the architecture doc is in .sdd/, and the training pipeline is in Tools/CoreMLTrainer/. Adding a fourth language is a few-hour exercise: extend the class enum, add the layout map, retrain, ship.
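
For a sense of scale, here is what "extend the class enum" could look like. The fourth language ('uk') is an invented example, not a planned feature; each new language adds one native class plus two misfire classes for every existing layout it can collide with, so three languages give 9 classes and four give up to 16:

```python
# Hypothetical extension of the class list to a fourth layout.
# Shown here: the three existing languages plus 'uk', with only the
# English pairings spelled out for the new language.
CLASSES = [
    'ru', 'en', 'he', 'uk',
    'ru_from_en', 'he_from_en', 'uk_from_en',
    'en_from_ru', 'en_from_he', 'en_from_uk',
    'he_from_ru', 'ru_from_he',
    # ...plus ru/uk and he/uk misfire pairs if those layouts can collide
]
```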

If you want the lesson, I think it is short. A solo developer can outship a small team on a focused product because the team carries coordination overhead the solo developer does not. The same solo developer cannot outmaintain a small team, because maintenance is a coordination problem and there is no coordination shortcut. Choose accordingly.

I am genuinely glad it is in the wild. Take it, fix it, ship it.

Resources

Repositories and downloads

  • RightLayout on GitHub
  • Latest .pkg release
  • Product page
  • Bernstein, what I work on now

Related reading

  • Interface design for serious products
  • Need a job? Sip your drink. We'll look for you.
  • Bernstein: multi-agent orchestration that holds up
