Alex Chernysh · Agentic behaviorist · Tel Aviv

RightLayout: Shipping a Mac AI Tool, Then Letting Go

Why a solo Mac keyboard-layout corrector with a CoreML model trained from scratch beat a 10-person dictionary tool, and why I open-sourced it.

May 8, 2026 · 7 min read
AI Systems
On this page

  • 1. The problem with dictionary punto-switchers
  • 2. The bet
  • 3. Where the model actually wins
  • 4. Why I open-sourced it instead of scaling it
  • 5. What you can take
  • Repositories and downloads
  • Related reading

I built RightLayout because every keyboard-layout corrector I tried for macOS broke on names, code, and typos. It was a small bet: train a CoreML model from scratch, three layouts, on-device. It worked. Then the maintenance bill came due, and I open-sourced it.

What this post covers

  • why dictionary-based punto-switchers fail on the things you actually type
  • what "trained from scratch" means here, and the rough shape of the model
  • the maintenance trap that solo macOS system programming sets for you
  • why I handed it over to the community instead of pretending I would scale it

1. The problem with dictionary punto-switchers

If you type in two or three languages on a Mac, you have lived this. You start a sentence in English, the layout is still on Russian, and the screen fills with Cyrillic noise. The fix exists in theory. There are tools that watch your input and flip the layout when the word "looks wrong".

The classical version of that tool is dictionary-based. It checks each word against a frozen vocabulary and corrects when the word does not appear. That works for the easy cases. It also fails the moment a real human starts typing real text.

Names break it. Code breaks it. Acronyms break it. URLs break it. The word kubectl is not in any dictionary, English or Russian, but it is not a wrong-layout word either. A typo like helo is missing from the dictionary, so the tool helpfully turns it into рудщ. And in mixed-language paragraphs the dictionary does not even know which language to anchor against.
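
To make the failure mode concrete, here is a minimal sketch of the dictionary approach. The vocabulary and the layout map are tiny stand-ins I made up for illustration, not the real tool's data; the shape of the bug is the point.

```python
# Toy dictionary-based punto-switcher. EN_TO_RU maps each US-QWERTY key
# to the Cyrillic letter on the same physical key (standard ЙЦУКЕН layout).
EN_TO_RU = dict(zip("qwertyuiop[]asdfghjkl;'zxcvbnm,.",
                    "йцукенгшщзхъфывапролджэячсмитьбю"))

# Frozen vocabularies -- stand-ins for a real dictionary.
EN_VOCAB = {"hello", "world"}
RU_VOCAB = {"привет", "мир"}

def dictionary_correct(word: str) -> str:
    """Flip the word to the other layout if it is not in any vocabulary."""
    if word.lower() in EN_VOCAB or word.lower() in RU_VOCAB:
        return word
    # Unknown word: blindly assume wrong layout and convert.
    return "".join(EN_TO_RU.get(ch, ch) for ch in word.lower())

dictionary_correct("hello")    # known word, left alone
dictionary_correct("kubectl")  # not in the vocabulary, mangled into Cyrillic
```

The second call is the false positive: any identifier, name, or acronym outside the frozen vocabulary gets "corrected" into noise.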

The commercial Mac version of this tool was shipped by a team I respect. I think there are about ten people on it. It is solid, it is polished, and it still gets the same class of false positives, because the underlying model has no idea what you are typing in. It has only a vocabulary check.

I wanted something that read context.

2. The bet

The bet was small enough to attempt over a few weekends. Train a tiny character-level model that takes a short window of recent input and predicts which of nine classes it belongs to. The nine classes are the three native layouts (EN, RU, HE) plus the six cross-layout misfires: en_from_ru, ru_from_en, en_from_he, he_from_en, ru_from_he, he_from_ru. That class set is the whole trick. Once the model says "this looks like Russian typed on an English layout", a deterministic mapper handles the actual character substitution.
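
The division of labor between the classifier and the mapper can be sketched like this. The shipped app is Swift; this Python stand-in only illustrates the dispatch, and it covers just the EN/RU pair with an invented `remap` helper:

```python
# EN_TO_RU maps each US-QWERTY key to the Cyrillic letter on the same
# physical key (standard ЙЦУКЕН layout); RU_TO_EN is its inverse.
EN_TO_RU = dict(zip("qwertyuiopasdfghjkl;zxcvbnm",
                    "йцукенгшщзфывапролджячсмить"))
RU_TO_EN = {v: k for k, v in EN_TO_RU.items()}

def remap(text: str, predicted_class: str) -> str:
    """Apply the deterministic character substitution for a misfire class."""
    table = {
        "ru_from_en": EN_TO_RU,   # Russian intent, English layout active
        "en_from_ru": RU_TO_EN,   # English intent, Russian layout active
    }.get(predicted_class)
    if table is None:             # native class (or a pair not sketched here)
        return text
    return "".join(table.get(ch, ch) for ch in text)

remap("ghbdtn", "ru_from_en")  # → "привет"
remap("hello", "en")           # native class, untouched
```

The model only ever answers "which of the nine situations is this"; everything after that is a table lookup, which keeps the risky learned part small and the rewriting part auditable.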

The training pipeline is in the repo and is unromantic. Wikipedia and subtitle corpora for the three languages, generation of clean and cross-layout-mistyped pairs, character-level tokenization, augmentation for typos and case noise, mixup, label smoothing. The model itself is an ensemble of a small multi-scale CNN and a four-layer character Transformer, both pooled into a single linear head. It runs over a fixed 20-character window. The export goes through PyTorch into CoreML.

# from Tools/CoreMLTrainer/train.py
CLASSES = [
    'ru', 'en', 'he',
    'ru_from_en', 'he_from_en',
    'en_from_ru', 'en_from_he',
    'he_from_ru', 'ru_from_he'
]
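
The pair-generation step of that pipeline can be sketched in a few lines. This is a simplified stand-in for what lives in Tools/CoreMLTrainer/, with an invented `make_example` helper and no typo or case augmentation:

```python
# Generate (text, label) training pairs: clean Russian text, or the same
# text rendered as if typed with the English layout left active.
import random

# Cyrillic letter -> the US-QWERTY key on the same physical position.
RU_TO_EN = dict(zip("йцукенгшщзфывапролджячсмить",
                    "qwertyuiopasdfghjkl;zxcvbnm"))

def mistype(text: str) -> str:
    """Render Russian text as its English-layout misfire."""
    return "".join(RU_TO_EN.get(ch, ch) for ch in text.lower())

def make_example(clean_ru: str, rng: random.Random) -> tuple[str, str]:
    """Emit either the clean string labeled 'ru' or its misfire labeled 'ru_from_en'."""
    if rng.random() < 0.5:
        return clean_ru, "ru"
    return mistype(clean_ru), "ru_from_en"
```

The real pipeline layers typo injection, case noise, mixup, and label smoothing on top, but the core supervision signal is exactly this: the misfire label is free, because the corruption is deterministic.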

The CoreML model that ships inside the .pkg is around 14 MB. It runs entirely on-device. Inference per token window is fast enough that the correction logic stays well under the 50 ms budget I set for the whole pipeline (event tap to replacement). It is small enough to fit in the bundle and ship with no cloud dependency.
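
The fixed window means the preprocessing is trivial and constant-time, which is most of why the latency budget holds. A sketch of the window preparation (illustrative; the shipped tokenizer may differ in padding and normalization):

```python
# Prepare the fixed 20-character input window for the classifier.
WINDOW = 20

def make_window(recent_input: str) -> str:
    """Keep the last WINDOW characters, left-padding short input with spaces."""
    return recent_input[-WINDOW:].rjust(WINDOW)

make_window("ghbdtn")   # 14 spaces + "ghbdtn", always exactly 20 chars
```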

The first time it correctly turned ghbdtn into "привет" in the middle of a sentence with a code snippet in it, I knew it was going to work. The dictionary tool I had been using for years would have eaten the snippet.

3. Where the model actually wins

Three places it beats a dictionary cleanly.

It handles typos. A short word with one missing or duplicated character is still recognizable as the right language to a character-level model. Dictionary tools either silently miss the word or, worse, "correct" it into nonsense.

It handles names and code. The model has seen enough mixed-language and mixed-script text in training that an English snippet inside a Russian sentence does not trigger a flip. The dictionary version of this is a hand-maintained whitelist that grows forever.

It handles Hebrew, which is the genuinely hard one. RTL text plus a character set with no overlap with Latin or Cyrillic plus a layout that maps Hebrew letters onto English keys means the dictionary approach has to maintain three pairwise tables and a context heuristic on top. The model just learned that akuo is "שלום" typed on an English layout and moves on.
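
The akuo example works because, once the class is known, the Hebrew direction is the same table lookup as the others. A toy version, using my own transcription of the standard Hebrew layout rather than the project's actual table:

```python
# US-QWERTY key -> Hebrew letter on the same physical key (standard layout).
EN_TO_HE = dict(zip("qwertyuiopasdfghjkl;zxcvbnm",
                    "/'קראטוןםפשדגכעיחלךףזסבהנמצ"))

def fix_hebrew(mistyped: str) -> str:
    """Rewrite Hebrew text that was typed with the English layout active."""
    return "".join(EN_TO_HE.get(ch, ch) for ch in mistyped)

fix_hebrew("akuo")  # → "שלום"
```

All of the RTL and script complexity lives in the classifier's training data; the substitution itself never has to reason about direction at all.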

For three or four months I used my own tool every day. It was the first time the corrector was invisible enough to forget about.

4. Why I open-sourced it instead of scaling it

Then the maintenance bill arrived.

A free macOS utility with a learned model has a long tail of unglamorous work. The accessibility-API event tap needs to keep working across macOS versions. Apple loves to silently change permission semantics between releases. The CoreML runtime drifts. The model has no test infrastructure for "real users typing real text" because that is, by definition, not in the training set. The undo-ratio learning loop, where the tool watches users undo a correction and adapts, is hard to make safe and harder to validate without telemetry I refuse to collect.

For a paid product with ten engineers, those costs are absorbable. For a free tool maintained by one person who has a day job, they compound. Every macOS major release became a week of evening debugging. Every CoreML version bump was a small risk. Every GitHub issue was a fork in the road: do I become a Mac systems engineer in my spare time, or do I let the project rot quietly while pretending it is still maintained?

I picked a third option. I marked the project community-maintained, wrote an honest banner on the homepage and the README, kept the model in the bundle so installs still work, and moved my attention to Bernstein. The repo is public. The training pipeline is public. Pull requests are reviewed. There are no gatekeepers. If you ship good PRs, you get commit access.

That is a more honest position than "v2 coming soon, watch this space".

5. What you can take

If you want the tool, the .pkg is on the releases page. macOS 13 or newer, Accessibility permission, free. The model is inside the bundle.

If you want the code, the repo is small, the architecture doc is in .sdd/, and the training pipeline is in Tools/CoreMLTrainer/. Adding a fourth language is a few-hour exercise: extend the class enum, add the layout map, retrain, ship.
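
For a sense of scale, here is what "extend the class enum" could look like. The fourth language ('uk') is an invented example, not a planned feature; each new language adds one native class plus two misfire classes for every existing layout it can collide with, so three languages give 9 classes and four give up to 16:

```python
# Hypothetical extension of the class list to a fourth layout.
# Shown here: the three existing languages plus 'uk', with only the
# English pairings spelled out for the new language.
CLASSES = [
    'ru', 'en', 'he', 'uk',
    'ru_from_en', 'he_from_en', 'uk_from_en',
    'en_from_ru', 'en_from_he', 'en_from_uk',
    'he_from_ru', 'ru_from_he',
    # ...plus ru/uk and he/uk misfire pairs if those layouts can collide
]
```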

If you want the lesson, I think it is short. A solo developer can outship a small team on a focused product because the team carries coordination overhead the solo developer does not. The same solo developer cannot outmaintain a small team, because maintenance is a coordination problem and there is no coordination shortcut. Choose accordingly.

I am genuinely glad it is in the wild. Take it, fix it, ship it.

Resources

Repositories and downloads

  • RightLayout on GitHub
  • Latest .pkg release
  • Product page
  • Bernstein, what I work on now

Related reading

  • Interface design for serious products
  • Need a job? Sip your drink. We'll look for you.
  • Bernstein: multi-agent orchestration that holds up
