Allow CJK input

Mapping database

The data could be gathered from the CEDICT dictionary (CC 4.0 license) at https://www.mdbg.net/chinese/dictionary?page=cedict

Excerpt:

當日 当日 [dang1 ri4] /on that day/
當日 当日 [dang4 ri4] /that very day/the same day/
當時 当时 [dang1 shi2] /then/at that time/while/
當時 当时 [dang4 shi2] /at once/right away/

當日 = traditional Chinese
当日 = simplified Chinese
dang1 ri4 = pinyin with tone
dang ri = pinyin without tone. This is how pinyin is normally typed, so the tone numbers inside the brackets would need to be removed.
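The tone stripping described above could be sketched roughly as follows. This is only an illustration under assumptions: the `CedictEntry` struct and the `parse_cedict_line` function are hypothetical names, and the line layout is taken from the excerpt above (traditional, simplified, bracketed pinyin, glosses).

```rust
/// One CEDICT entry, reduced to the fields needed for pinyin lookup.
/// (Hypothetical type, not part of any existing project.)
#[derive(Debug, PartialEq)]
struct CedictEntry {
    traditional: String,
    simplified: String,
    /// Pinyin with tone digits and spaces removed, e.g. "dangri".
    pinyin_no_tone: String,
}

/// Parses a line of the form `TRAD SIMP [pin1 yin1] /gloss/...`.
/// Returns None for empty lines, comment lines (starting with '#')
/// and lines that do not match the expected layout.
fn parse_cedict_line(line: &str) -> Option<CedictEntry> {
    let line = line.trim();
    if line.is_empty() || line.starts_with('#') {
        return None;
    }
    // Locate the bracketed pinyin section.
    let open = line.find('[')?;
    let close = open + line[open..].find(']')?;

    // The two words before the bracket are traditional and simplified.
    let mut words = line[..open].split_whitespace();
    let traditional = words.next()?.to_string();
    let simplified = words.next()?.to_string();

    // Keep only ASCII letters: this drops tone digits and spaces.
    // Assumption: anything else (such as the ':' in "u:") is dropped too;
    // a real importer may want to map "u:" to "v" instead.
    let pinyin_no_tone = line[open + 1..close]
        .chars()
        .filter(|c| c.is_ascii_alphabetic())
        .collect::<String>()
        .to_lowercase();

    Some(CedictEntry {
        traditional,
        simplified,
        pinyin_no_tone,
    })
}
```

Called on the first excerpt line, `parse_cedict_line("當日 当日 [dang1 ri4] /on that day/")` yields an entry whose `pinyin_no_tone` is `dangri`.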

Input

I would expect the following behavior:

1. Type dangri.
2. Display both dangri and the corresponding hanzi: 当日.

Example when typing gugepinyin:

3a. By hitting space the first displayed character will be input.

3b. Touching one of the displayed characters (with your finger) will input it.
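The expected behavior above could be modelled with a small composition state. This is a minimal sketch, not how the keyboard works today: `Composition` and its methods are hypothetical names, and the dictionary lookup is passed in as a closure so the sketch does not assume any particular database.

```rust
/// Hypothetical composition state: the pinyin typed so far plus the
/// candidates currently offered for it.
struct Composition {
    typed: String,
    candidates: Vec<String>,
}

impl Composition {
    fn new() -> Self {
        Composition {
            typed: String::new(),
            candidates: Vec::new(),
        }
    }

    /// Steps 1 and 2: append a typed letter and refresh the candidates.
    /// `lookup` stands in for whatever dictionary query is used.
    fn push(&mut self, c: char, lookup: impl Fn(&str) -> Vec<String>) {
        self.typed.push(c);
        self.candidates = lookup(&self.typed);
    }

    /// Step 3a: hitting space commits the first displayed candidate.
    fn commit_first(&mut self) -> Option<String> {
        self.commit(0)
    }

    /// Step 3b: touching a candidate commits the one at `index`.
    /// Returns None (and keeps the state) if the index is out of range.
    fn commit(&mut self, index: usize) -> Option<String> {
        if index >= self.candidates.len() {
            return None;
        }
        let chosen = self.candidates.swap_remove(index);
        // Committing ends the composition.
        self.typed.clear();
        self.candidates.clear();
        Some(chosen)
    }
}
```

Keeping the lookup outside the state means the same flow would work with a SQLite table, an in-memory map, or an external engine.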

Nice to have

Quick input: typing only the first letter of each syllable is enough:

dr = 当日, 点儿 and 丢人
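One way to support this would be to derive the abbreviation from the toned pinyin and index words by it. The sketch below is an assumption-laden illustration: `abbreviation` and `build_short_index` are hypothetical helpers, and the input is assumed to be space-separated syllables as in the CEDICT brackets.

```rust
use std::collections::HashMap;

/// Builds the quick-input key from space-separated pinyin syllables,
/// e.g. "dang1 ri4" -> "dr". Syllables that do not start with an ASCII
/// letter are skipped.
fn abbreviation(pinyin: &str) -> String {
    pinyin
        .split_whitespace()
        .filter_map(|syllable| syllable.chars().next())
        .filter(|c| c.is_ascii_alphabetic())
        .map(|c| c.to_ascii_lowercase())
        .collect()
}

/// Groups words by their abbreviation. Each input pair is
/// (word, pinyin with tones); words keep their input order.
fn build_short_index(entries: &[(&str, &str)]) -> HashMap<String, Vec<String>> {
    let mut index: HashMap<String, Vec<String>> = HashMap::new();
    for (word, pinyin) in entries {
        index
            .entry(abbreviation(pinyin))
            .or_default()
            .push(word.to_string());
    }
    index
}
```

With this, `abbreviation("dang1 ri4")` gives `dr`, and every word sharing that key ends up in the same bucket of the index.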

Prediction: If there are multiple results for the pinyin input, the most used one will be shown first:

dr = 点儿 will be displayed first because it was used more often by the user; 当日 will be displayed second etc.
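The ordering described here could be sketched with a per-user usage counter. `UsageRanker` is a hypothetical name, and persisting the counts between sessions is left out.

```rust
use std::cmp::Reverse;
use std::collections::HashMap;

/// Hypothetical per-user counter of how often each word was committed.
struct UsageRanker {
    counts: HashMap<String, u32>,
}

impl UsageRanker {
    fn new() -> Self {
        UsageRanker {
            counts: HashMap::new(),
        }
    }

    /// Call whenever the user commits `word`.
    fn record(&mut self, word: &str) {
        *self.counts.entry(word.to_string()).or_insert(0) += 1;
    }

    /// Reorders candidates so the most used come first. The sort is
    /// stable, so words with equal counts keep their original order.
    fn rank(&self, candidates: &mut [String]) {
        candidates.sort_by_key(|word| Reverse(self.counts.get(word).copied().unwrap_or(0)));
    }
}
```

Because the sort is stable, candidates that were never used stay in whatever order the dictionary returned them.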

Character list: For a single expression in pinyin there can be many potential characters. Thus a list view is needed to be able to select the right one.

he = 何, 盒, 和, 河, 合, 禾, 核, 喝, 鹤, 吓, 贺, 劾, 涸, 纥 etc.

Handwriting input: Write the characters with your finger and get suggestions to choose from.

added helpwanted label

I am not a developer but will try to contribute as much as possible.

I found the following project, which converts the CEDICT dictionary into a SQLite database:

https://gitlab.com/jmatthin/cedict_to_sqlite

I made some changes to get a table with traditional character, simplified character, first letter of each syllable and plain pinyin (without tone and no spacing between the syllables):

https://gitlab.com/rinokeros/cedict_to_sqlite

traditional	simplified	short	pinyin_no_tone
指導	指导	zd	zhidao
知道	知道	zd	zhidao

I looked a bit into Rust and tried a query for zhidao:

 use rusqlite::{params, Connection, Result};

 // One row of the `entries` table: a word in both scripts.
 #[derive(Debug)]
 struct Entries {
     traditional: String,
     simplified: String,
 }

 fn main() -> Result<()> {
     let path = "../target/debug/build/cedict.db";
     let conn = Connection::open(path)?;

     // Bind the pinyin as a parameter instead of embedding it in the SQL text.
     let mut stmt = conn.prepare(
         "SELECT traditional, simplified FROM entries WHERE pinyin_no_tone = ?1",
     )?;
     let entries = stmt.query_map(params!["zhidao"], |row| {
         Ok(Entries {
             traditional: row.get(0)?,
             simplified: row.get(1)?,
         })
     })?;

     // Each item is a Result; propagate row errors instead of panicking.
     for entry in entries {
         println!("Found {:?}", entry?);
     }
     Ok(())
 }

This gives me:

Found Entries { traditional: "制導", simplified: "制导" }
Found Entries { traditional: "執導", simplified: "执导" }
Found Entries { traditional: "指到", simplified: "指到" }
Found Entries { traditional: "指導", simplified: "指导" }
Found Entries { traditional: "直到", simplified: "直到" }
Found Entries { traditional: "直搗", simplified: "直捣" }
Found Entries { traditional: "知道", simplified: "知道" }

@dorota.czaplejewicz What would be a next step from here?

The keyboard is currently missing two components required for suggestions: some suggestion UI, and the ability to use the input-method interface for anything more than popup/popdown.

When it comes to the UI, #99 is the "endgame" solution, but it requires work across projects and so I'm fine experimenting with other things.

When it comes to input-method, the imservice.rs file is partially fleshed out already. To inform the predictor of the current state, you would need to work with imservice_handle_surrounding_text, imservice_handle_text_change_cause and imservice_handle_commit_state. Preedit strings would then have to be submitted based on that state; this part is missing.

mentioned in issue #54

For mapping Latin input to Chinese characters, I suggest using an existing library rather than writing new code, as Chinese users have many different preferences for the conversion.

https://github.com/rime/librime may be a choice.
