Funtoo Multilingual Project

Revision as of 03:36, September 6, 2022 by Adbosco (talk | contribs)
   Summary
Our goal is to bring Grade A support for using Funtoo Linux in different languages without hassle. The user should only need to change a few centralized settings, and everything should "just work".
   Related Pages
  • [[1]]
  •    People

    Welcome to the Funtoo Multilingual project! If you'd like to join our effort to bring Grade A support for multiple languages to Funtoo, or if you just want it to work better in your native language, come chat with us on our Discord channel and join the pack!

    Introduction

    Historically, support for any language other than English was not a central concern of developers, for various reasons. As the need grew to offer a friendlier interface to users outside the English-speaking world, different vendors came up with different solutions. The result was annoying incompatibilities, even within the relatively small set of characters needed to support all Western European languages. In the CJK world, too, different standards appeared for each language and country, and each of them supported only that one Asian language plus English. This made it really hard for someone using, say, a Japanese system to type or display German.

    With the widespread adoption of Unicode, a good part of the problems caused by competing character encoding standards went away, but the situation is still not perfect. UTF-8 encodes ASCII in one byte per character and most other Latin-script letters in two, but other scripts were left with the higher code points: CJK characters typically take three bytes each. It can therefore still make sense, for example, to encode Japanese in Shift JIS (S-JIS) when storage space or network bandwidth is at a premium, since that encoding represents the most frequently used characters in two bytes, considerably fewer than UTF-8 needs.
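
    The size difference is easy to check. The following is a minimal sketch using Python's standard codecs; the helper name `encoded_sizes` and the sample strings are made up for illustration and are not part of any Funtoo tooling.

```python
def encoded_sizes(text, encodings=("utf-8", "shift_jis")):
    """Return the number of bytes `text` occupies in each named encoding."""
    return {enc: len(text.encode(enc)) for enc in encodings}


# Sample strings chosen for illustration only.
japanese = "日本語のテキスト"  # 8 characters: kanji, hiragana and katakana
german = "Grüße"               # 5 characters, two of them outside ASCII

# Each Japanese character takes 3 bytes in UTF-8 but 2 bytes in Shift JIS.
print(encoded_sizes(japanese))                      # {'utf-8': 24, 'shift_jis': 16}

# Accented Latin letters take 2 bytes in UTF-8 and 1 byte in Latin-1.
print(encoded_sizes(german, ("utf-8", "latin-1")))  # {'utf-8': 7, 'latin-1': 5}
```

    For this sample, Shift JIS saves a third of the bytes, which is the kind of saving that keeps legacy encodings attractive in some settings.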

    Another issue that arises from using Unicode as a common encoding is that it does not encode Chinese, Japanese and Korean characters separately (a design decision known as Han unification). The encoding itself is therefore oblivious to the language a character belongs to. This leads to text written in one language being displayed with glyphs that actually belong to a different language. However, there are ways to work around this problem when the underlying system knows which language is supposed to be displayed and chooses a good-quality font designed for that language.
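
    A small sketch can make this concrete. It shows that a shared ideograph has a single code point whose Unicode name carries no language information, and that the language has to be supplied separately in order to pick a font. The mapping `FONT_FOR_LANG` and the helpers `describe` and `pick_font` are hypothetical and only for illustration; the family names are those of the Noto CJK fonts, but how Funtoo would actually select them is not specified here.

```python
import unicodedata

# Hypothetical mapping from a language tag to a regional font family.
FONT_FOR_LANG = {
    "ja": "Noto Sans CJK JP",
    "zh-CN": "Noto Sans CJK SC",
    "zh-TW": "Noto Sans CJK TC",
    "ko": "Noto Sans CJK KR",
}


def describe(ch):
    """Return the code point and Unicode name of a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


def pick_font(lang, default="Noto Sans"):
    """Pick a font family from the language tag, not from the text itself."""
    return FONT_FOR_LANG.get(lang, default)


# This ideograph is drawn differently in Japanese and Chinese typography,
# yet both languages share one code point for it.
ch = "直"
print(describe(ch))
for lang in ("ja", "zh-CN"):
    print(lang, "->", pick_font(lang))
```

    The point of the sketch is that the font decision depends on the `lang` argument alone: nothing in the character data can stand in for it.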

    Finally, there is the problem of language input. For most languages, the letters on the keyboard correspond exactly to the characters being input. That is not true for more complicated scripts, which need an additional helper system known as an "input method engine" (IME). There are several IMEs, each with its own advantages, disadvantages and fandom, so a minimal number of them needs to be supported to make everyone happy.

    CJK Project

    As part of the larger project of making Funtoo multilingual, we have a sub-project that deals specifically with the languages that need an input method engine for input and fonts with good coverage of their large character sets. Traditionally, this group has been referred to as CJK, which stands for Chinese/Japanese/Korean. These languages are notorious for needing additional setup, such as environment variables and services like the IME itself, before the system becomes usable. In any major distribution today, this is a major hassle that these users have to go through just to carry out mundane activities, such as writing an email or a blog post.
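
    To give an idea of the kind of settings involved, the sketch below generates the environment variables that are conventionally exported for an IME framework in an X11 session. This is an assumption based on common practice with ibus and fcitx, not a description of how Funtoo configures them; the helper `ime_environment` is hypothetical, and Wayland sessions or newer framework versions may need different settings.

```python
def ime_environment(framework):
    """Return the variables conventionally exported for an IME framework.

    Assumption: an X11 session using either "ibus" or "fcitx".
    """
    if framework not in ("ibus", "fcitx"):
        raise ValueError(f"unsupported IME framework: {framework!r}")
    return {
        "GTK_IM_MODULE": framework,        # input module used by GTK applications
        "QT_IM_MODULE": framework,         # input module used by Qt applications
        "XMODIFIERS": f"@im={framework}",  # XIM server name for plain X11 clients
    }


# Print the settings as shell export lines, e.g. for a login profile.
for name, value in ime_environment("ibus").items():
    print(f'export {name}="{value}"')
```

    Having to get three variables right, in addition to starting the IME service itself, is exactly the sort of manual step that a few centralized settings are meant to replace.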

    i18n-kit

    Some language-related packages clearly belong in the desktop-kit, such as serif/sans-serif fonts with support for European languages, and the IME front ends themselves may belong there too, or in the gnome-kit (app-i18n/ibus) or the kde-kit (fcitx). The majority of other language-related packages, however, should move or be added to the i18n-kit: fonts specialized for a given language (most of the noto fonts, for example), spellcheck dictionaries (but not the engines themselves, such as hunspell or aspell), and unbundled translations where they exist, as in the case of app-office/libreoffice.

    Generally speaking, I believe that if a user chooses not to use the i18n-kit, they should still have support for at least the major European languages, such as English, Spanish and French. If they need support for "more complicated" (e.g. Chinese) or "minor" (e.g. Welsh) languages, they will find it in the i18n-kit.

    Any packages related to translation (e.g. CAT tools) or language learning are also good candidates for the i18n-kit.