New ideas using Wayland Input Methods


What are Input Methods?

Input Methods are a way of allowing third parties programs to manipulate text in text input boxes in applications.

The two traditional uses of this are on-screen keyboards like one has on a mobile phone, and Chinese, Japanese and Korean (CJK) input where a keyboard doesn't map directly to the text we want to see on screen.

It's a lot more than merely injecting keys; the input method needs to have an awareness of the text cursor and the ability to manipulate whole words and sentences at a time. This is extremely important for the CJK way of typing where multiple keyboard keys are used to compose a single character. It's also important for virtual keyboards that have advanced features like auto-completion and spell-checking words.

X vs Wayland

On X11, the standard protocol (XIM) is hardly used, we ended up with every input method creating a plugin for every toolkit. This sufficed, but was not very scalable, doesn't have the best desktop interaction, and is messy to deploy.

The story with wayland is also far from perfect. With the compositor acting as a broker, we have a protocol from the plugins themselves (InputMethod) and protocols to clients (TextInput). Unfortunately there are at least 6 active versions of these protocols, which needs some work.

A recent meeting with other wayland devs has hopefully helped agree on a path forward for the TextInput side, with me taking on a lot of active tasks moving this forward. But we still need to sort out the InputMethod side.

I used Akademy to speak to people using CJK input and got very valuable feedback about the Korean language and some changes needed to make FICTX work properly there.

If you have experience with Chinese or Japanese input on wayland, please do reach out.

A Playground of Input Method Ideas

I don't use CJK keyboard entry nor a virtual keyboard on a daily basis, but there are things I do use that could benefit from the InputMethod tech. Now we have a standard we can look at using this stack for more creative ideas. Doing this now can help work out what the requirements are for the protocols as we stretch it in new directions.

This post describes some ideas of how else we can use Input Methods to improve the Linux desktop experience.

Convenient Clipboard Connections

With a shortcut we can enter clipboard entries, previewing in place. We're able to use the preedit text to display metadata on available entries which won't get committed to provide a streamlined interface, or we could bring up a menu, or both.

Compared to the existing klipper integration we can open at the current carret position, and provide something a bit more streamlined. The preview also helps solve the frustrating problem of pasting only to find you've now got too much or too little whitespace where you hit paste.

Easy Emoji Entry

My attempt at understanding young people. Typing ':' and then a string launches an inbuilt emoji browser based on the typed string.

These days a lot of clients now have in-built support, including all GTK4 text areas, so it might not be needed. Maybe the ship for this has sailed, or there could be still value in centralising it and having a common pattern.

Direct Diacritic Display

Building on an existing merge-request by Janet Blackquill, pressing and holding a key will bring up a prompt with relevant characters based on the held key.

The original version merge request injected itself into the application repurposing a hook meant for themeing, this broke too many clients to be released as part of Plasma and it was unfortunately reverted.

Taking that idea but using input methods should avoid those problems, whilst also solving the issue of key-repeats being sometimes needed.

Timely Translation Tasks

This was solved by having a second text field open when activated where the user can type their native text, so the translations can be shown in the main text area. The current code utilises the command line tool "trans" which in turn makes web requests to proprietory web services, but we could look at other options if we turned this into a product.

Simply Speak

The one I'm most excited about. Working local dictation software that can integrate into every textfield out of the box.

I got best results using whisper-fast with a voice-activity filter on top. This was wrapped into a daemon with a DBus interface we can then use to start and stop on demand. It needs some work to be a shippable product.

Current State

All of the above are fully working examples, but everything was pieced together in a few evenings at Akademy and some time at the airport.

It's not something I would ship to users; the point right now is to be a playground, uncover issues throughout the stack, and try a few different things. That's why we use different techniques for bringing up the input methods and prompts within it rather than trying to be coherent.

That said, I have made an effort to separate the high level application code from the wayland boiler plate so we can quickly iterate to InputMethodV3 to see what's needed, and turn this into a product as we start to come up with a long term direction.

Running it for yourself

The repo exists at the readme has instructions on how to get set up.

What does it work with?

A form of TextInput is supported in almost all toolkits from Electron, GTK, Qt, SDL and more. This means it's fairly universal across wayland clients.

On the compositor level, InputMethodV1, the only specification in upstream protocols, is supported by only Kwin and Weston, but as noted in the intro this is subject to change in the future.


The playground has exposed several pre-existing deficiencies, especially regarding the placeholder text styling, input panel positioning, key repeats. We can take those forward into fixes that might affect the more important CJK input methods.

As well as being a playground I think there are some ideas worth exploring further, to find a path into the desktop and is yet another reason to be excited about the path to Wayland,

Have more ideas? Let us know in the discussion at