Implement translation service (i18n) #387

Reinmar · 2017-01-09T09:42:45Z

Process

We code the features using the t() functions in which we define the English value of a string (and, optionally, a context). E.g. t( 'Bold' ) or t( 'Button [context: clothing]' ) (the [context: clothing] will be automatically removed upon build on, in the dev mode, on runtime). Each context must be defined in lang/contexts.json of a package which uses it or in the ckeditor5-core/lang/contexts.json (for common strings).
From time to time, we run a tool which scans code for all t() usages, extract strings from it, builds a contexts map (based on all lang/contexts.json files) and checks whether all used strings have defined contexts. Then builds a PO file with English strings and contexts and upload it to Transifex.
Then we run a tool which downloads all PO files for all defined language and put them back in lang/ directories.
When building an editor, a specific bundler read PO files and create translation map(s) for chosen language(s) based on them and bundles those as defined in the next section.

Implementation

`ckeditor5-core/src/editor~Editor#constructor`

I propose to merge the ctx name into the string in order to keep the same t() params in every possible environment t( str, values ). If it was a separate param, then in the build version (in which we replace strings with ids) it would have to be left unused, or, we'd need to change the implementation of t() on the fly.

The CKE_LANG will be defined by the bundler. It will be the editor class that's going to set it which means that it will only need to be set when bundling for the browser environment. Or we could go a bit further than that and define utils/global object which would retrieve the global scope depending on the env. In the browser that would be a window, in Node.js that would... something which is global there. That would allow this code to work without the bundler needing to preset this value.

PS. we already have utils/dom/global, but I think that it makes sense to keep them separated.

this.config.define( 'lang', global.CKE_LANG || 'en' );

this.locale = new Locale( lang );

const t = this.locale.t;

// Usage:
t( 'OK' );
t( 'button' );
t( 'button [context: clothing]' );

`ckeditor5-*/lang/contexts.json`

Contexts for used messages.

Examples: https://github.com/ckeditor/ckeditor-dev/tree/master/dev/langtool/meta

ckeditor5-core/lang/contexts.json:

{
	"OK": "Label for OK button in a dialog."
}

ckeditor5-form/lang/contexts.json:

{
	"button": "Name of a clickable form control, e.g. 'OK button'."
}

ckeditor5-tailor/lang/contexts.json:

{
	"button [context: clothing]": "Button as used in clothes."
}

The button is first defined in the ckeditor5-form package without a context, because, e.g. historically, we could've used it there without a context (cause, in our case, button is a UI button most of the time). Then, while working on the CKEditor 5 Tailor plugin we realised that button is already used, but in a different context, so we can't use t( 'button' ) as it will point to a wrong context definition (contexts are global for all packages). Instead, we'll use t( 'button [context: clothing] ' ) and add its own definition in the ckeditor5-tailor/lang/contexts.json.

`ckeditor5-utils/src/locale`

No magic here – uses the translate() function of the utils/translation-service module to get translated string and replaces placeholder values ($1, $2, etc.) with passed args.

import { translate } from './translation-service';

export default class Locale {
	constructor( lang ) {
		this.lang = lang;
	}

	t( str, ...values ) {
		const translatedString = translate( this.lang, str );

		// ... do the rest (values replacement)
	}
}

`ckeditor5-utils/src/translation-service`

The goal of this module is to encapsulate the whole "translations repository" logic from the rest of the code.

It may be dirty with some preprocessor rules or may need to be generated on the fly depending on the build type (e.g. with or without code splitting – i.e. with separate or built in language file). However, it would be good if it was defined in the repository so for dev purposes we wouldn't have to do anything. It'd just return the passed string.

Development mode implementation:

export function translate( lang, str ) {
	// Remove the ` [context: ...]` suffix.
	return str.replace( / \[context: [^\]]+\]$/, '' );
}

For the bundles, we have two cases:

where only one language is needed,
where there are multiple languages.

In case of just one language, it'll be best to simply replace the str param of t() calls
with the proper value (without the ctx now). This will allow for code splitting and hot-loading plugins without any tricks. It may happen, though, that some strings will then repeat multiple times – e.g. the ones from the core package. While this is going to make an un-gzipped package bigger, there shouldn't be a difference in case of a gzipped code. Besides, this will be just a few strings anyway and we'll save some space having such a simple implementation too.

export function translate( lang, str ) {
	return str;
}

In case of multiple languages we need to have some registry. The translate() implementation will be again simple:

export function translate( lang, str ) {
	return translations[ lang ][ str ];
	// Let this be objects, not maps, because we control the values of lang and str
	// and it will be a bit easier to generate a source of an object programatically.
	// Objects may also be better in terms of code size.
}

What's the str in this case? In order to ensure that we don't have name collisions (important for bundle splitting) I'd say that this should be either:

A totally unique string – like typical uid, but preferably shorter, using a wider range of unicode characters. A 5 chars long string using a range of all Unicode chars will give us comparable complexity to utils/uid:
```
> Math.pow( Math.pow( 2, 16 ), 5 )
1.2089258196146292e+24
> Math.pow( 35, 16 )
5.070942774902497e+24
```
The thing which worries me is whether there won't be any issues with encoding – we've seen people sourcing CKEditor in some weird encodings and it was always blowing up. But one could use a minifier which encodes special characters and it was be working again.

Another thing is ability to debug such code. With unreadble ids it may be tricky.

Finally – creating objects in which these uids are keys may be tricky. I can't find now whether there are any characters which needs to be escaped (other than the closing quote).

Anyway, this solution may create unpredictable and stupid issues.
Therefore, we may just use sequential ids from a short range (e.g. [a-z]). What about code splitting? There are two cases:
- Code splitting was done when building the whole setup – then there won't be any problems in using non-conflicting ids, because it's done by one bundler at one time.
- Someone, first built e.g. ckeditor5-preset-article-editor and then, separately (might be a different person), built ckeditor5-image. In this case, the preset bundle will use normal, short, ids (because it's the main bundle) and the image feature bundle will use prefixed ids (e.g. ckeditor5-image/<id>). The idea is that a developer releasing his/her package will use a special bundler setup which configures CKEditor plugin for Webpack to use prefixed ids.

Anyway, this is nothing we have to worry today, because, most likely, we'll work on releasing standalone package bundles after 1.0.0.

Another thing to notice is that support for multiple languages and code splitting is an optional feature, so we can implement it for just one bundler, i.e. Webpack.

Let's say, that we want to split a package to some preset X and packages Y and Z.
This will make for 3 files – x.js, y.js, z.js.

The idea is that each files will have an accompanying language(s) files defining a translations needed for this file. So there will be:

x.js, x-pl.js, x-en.js, ...
y.js, y-pl.js, y-en.js, ...
z.js, z-pl.js, z-en.js, ...

In order to run an the X preset with plugins with Polish translations one will need to load:
x.js, y.js, z.js, x-pl.js, y-pl.js, z-pl.js.

The language files will be built using entry points like this:

import { define } from '@ckeditor/ckeditor5-utils/src/translation-service';

define( 'pl', {
	'a': 'OK',
	'b': 'Anuluj',
	// ...
} );

And the define() function will merge these translations into other that it already has.

Tadam!

PS. PO file generation

We've been changing the idea which part of t() call is msgctxt, msgid and msgstr twice already, so let's clarify this:

For the following t() calls and contexts.json:

t( 'OK' );
t( 'button' );
t( 'button [context: clothing]' );

ckeditor5-core/lang/contexts.json:

{
	"OK": "Label for OK button in a dialog."
}

ckeditor5-form/lang/contexts.json:

{
	"button": "Name of a clickable form control, e.g. 'OK button'."
}

ckeditor5-tailor/lang/contexts.json:

{
	"button [context: clothing]": "Button as used in clothes."
}

The en.po file would look like this:

msgid "OK"
msgstr "OK"
msgctxt "Label for OK button in a dialog."

msgid "button"
msgstr "button"
msgctxt "Name of a clickable form control, e.g. 'OK button'."

msgid "button"
msgstr "button"
msgctxt "Button as used in clothes."

In this case the msgstr can be empty because then gettext will use msgid. In fact, in the samples I found in #387 (comment) it's empty.

Regarding #387 (comment):

As msgid we use the id of that string, not the "Bold" string. Why? The reasons were explained in the post I quoted – the msgid has limited length and, more importantly, you lose all the translations if you change the id. So, if we'd use the real string "Bold" as msgid, if we decided that for some reason it should be "Bold text", then translations would need to repeat their work and for some time we wouldn't have this string translated at all. This was pointed out by @wwalc and it happened (the English text was changed) in the past in CKE4 multiple times.

We ignore this issue. It's very rare situation.

The text was updated successfully, but these errors were encountered:

jodator · 2017-01-09T09:45:23Z

Do you plan to sue whole API from gettext? Like plural handling and stuff?

Reinmar · 2017-01-09T09:50:06Z

Nope, I don't think so. Maybe I'm wrong, cause finally I haven't checked that precisely, but I think that it's too much work and we have very little use cases for those things, so this would be a low priority.

I mean – knowing CKE4 – there were just a couple of places where we had to format a plural, which you can always somehow workaround, even if not very gracefully. So this wasn't high on the priority list.

OTOH, perhaps we could dive into this, and if it's possible, predict a full gettext support. I wouldn't like to work on it now, but if it would be possible to implement it in the future, then we'd be safe. Or it may turn out that it's simpler than I think... I'll better check it :D.

Reinmar · 2017-01-09T10:15:15Z

From what I understand from https://www.gnu.org/software/gettext/manual/gettext.html#Translating-plural-forms we need to pass a count number to the t() function, so it knows which translation to use.

ngettext [option] [textdomain] msgid msgid-plural count

So it'd be like:

t( 'file-upload/uploaded-message: Uploaded a file. | Uploaded %0 files.', fileCount, fileCount );

Why fileCount is twice? Because the first one is the count, the second is the %0. It's a bit ugly, I know. The first param gets very complicated as well... perhaps these would need to be 3 separate strings then or an object (better, because t() won't change its number of params in the built code). Actually, in this case the count could be moved to that object too:

t( {
    context: 'file-upload',
    id: 'uploaded-message',
    singular: 'Upladed a file.',
    plural: 'Uploaded %0 files',
    count: fileCount
}, fileCount );

Now, thet() function needs to be able to understand which of the plural forms it needs to use. So we'd need to include in the bundles the function which represents a plural formulas, the translate() function will know it and return the proper string based on the passed count. Then, the normal %0 substitution will take place.

If I don't miss anything, I think that we should be able to add support for plural forms in the future. It will require more a complicated logic for t() extractions and getting the plural formula functions somehow...

Reinmar · 2017-01-09T10:19:39Z

BTW, I haven't mentioned one thing here – the use of external libraries. Obviously, there are libraries which parse and write PO files and we'll use one. However, for the string resolution logic (so what the translate() function should do) I don't think that we'd be able to use any library. It needs to be perfectly integrated with our code and bundling process, so I don't see much space for any 3rd party helpers here.

wwalc · 2017-01-09T11:33:46Z

Looks great. It would be nice if the tool could list some errors like:

entries without context,
translations of the same context which overwrite each other,
using context which is not described in any contexts.json file,
context definitions that overwrite each other (the same context defined across multiple contexts.json more than once),
finding a pipe in the text and no vars defined, which would mean that the pipe was used as a part of the text (some people will create language files based on existing files, they will not read the full spec to understand the special meaning of a pipe)
other?

ma2ciek · 2017-01-09T12:03:53Z

@Reinmar Maybe it's more an edge-case, but words in many languages can have more than one plural form.
All in all you have written a very clean view of what we need to implement.

jodator · 2017-01-09T12:05:59Z

@ma2ciek gettext has it already defined and its standard is used in many places.

Checkout this: https://www.gnu.org/software/gettext/manual/html_node/Plural-forms.html

It handles polish irregular plural forms.

Reinmar · 2017-01-09T12:10:39Z

entries without context,

Yep.

translations of the same context which overwrite each other,

Yep.

using context which is not described in any contexts.json file,

Yep.

context definitions that overwrite each other (the same context defined across multiple contexts.json more than once),

Yep.

finding a pipe in the text and no vars defined, which would mean that the pipe was used as a part of the text (some people will create language files based on existing files, they will not read the full spec to understand the special meaning of a pipe)

You mean, in the plural usage form? I don't want to use "|" there, that's why I proposed the object format, as more explicit and less error prone. OTOH, the pipe format could be an option and totally satisfactory in most of the case. But we won't work on this now anyway, so we can discuss this in the future.

Reinmar · 2017-01-09T12:11:45Z

@Reinmar Maybe it's more an edge-case, but words in many languages can have more than one plural form.
All in all you have written a very clean view of what we need to implement.

The idea is that en has 2 forms, so our t() usage needs to predict only that + the count param. The translate() function must support multiple forms, because only it will have to worry about non-eng languages.

wwalc · 2017-01-09T12:21:03Z

I don't want to use "|" there, that's why I proposed the object format

👍

Reinmar · 2017-01-09T17:08:07Z

I found a quick tutorial how gettext workflow looks like and it's a bit different than how I imagined it: http://www.labri.fr/perso/fleury/posts/programming/a-quick-gettext-tutorial.html

You write code with e.g. _("Hello World\n").
You run xgettext on this code and create a .pot file (PO template) with such a content:
```
msgid "Hello World\n"
msgstr ""
```
You run msginit to create a .po file for a concrete language.
Now you add translations of this text in that .po file.
Finally, you run msgfmt to create a binary for that .po file, which equals our build step.

It's also important that when you run xgettext again, you run it with the -j (--join-existing) option which... I don't precisely know what it's doing :D. Especially that then you run msgmerge to merge these changes with the concrete .po files.

It would be worth checking how this merging process look and how sending this to Transifex looks like too. E.g. can we just generate en.po files from the code and send them to Transifex skipping the whole update and .pots generation process?

For me the process could look like this, but perhaps I miss something:

Generate en.pos out of the code.
Send them to Transifex.
Get .po files for all other languages back.

jodator · 2017-01-09T17:30:02Z

AFAIR the .pot is the file which is used as the source of currently available msgs.
In time some messages are added and some are removed. Current state of app's messages are in current .pot file.

The msgmerge will update translation files (.po files) so obsolete messages are removed and new strings are added.

This workflow is nicely supported by some tools (namely poedit ) in which you can run .pot file against .po file to update translations and create updated .po file.

So in such workflow the .pot file is needed for updating the .po files in future.

I think that the en.po would be the same as ckeditor5.pot. But I don't know if creating en.po makes sense (dunno how transifex works in that matter).

Reinmar · 2017-01-09T17:33:02Z

Also, I found https://lingohub.com/blog/2013/07/php-internationalization-with-gettext-tutorial/#What_form_of_msgids_should_be_used which clarifies how the msgids should be created:

As previously mentioned, msgid is used by gettext to identify the msgstr that should be displayed in its place. If multiple equal msgids are present, msgctxt is used to tell them apart. There are several more important roles that msgid plays:

msgid is what the programmer sees in the code

msgid is what is being displayed by default if the proper msgstr in the current locale can not be found (as a fall back)

in traditional gettext usage, the translator translates from msgid to the target locale, instead of from source msgstr to target msgstr; so, msgid is what he sees and what gives him information what should be translated

Therefore it is important to chose the form of the msgid carefully. The value of the msgid can be any string. The gettext manual suggests that the original string in the source (or English) language should be used. That might pose certain problems, for example if msgstr of the original string is changed and if the programmer decides to synchronize the msgid with a new msgstr, he would have to do it in all places in the code where it is used as well as in all other PO files. Also, there is no limit to msgid or msgstr length. Some lengthy msgids would make the code less readable.

Some developers propose that the msgids should be constructed in a more structured way [5] [6], where msgid describes its role in the application or role in the code (in which view, partial, container are they located) instead of its content. It is a great way to organize the strings from the developer’s perspective, but it might pose problems to the translator, since the translator no longer has the a source string in the msgid to translate from.

This reminded me, what I forgot to describe and what have been confusing for you – where in the t() calls do we have msgctxt, msgid and msgstr. So, to clarify this. For the following t() call and contexts.json:

t( 'basic-styles/bold: Bold' );

basic-styles/lang/contexts.json (note, I was previously using a deeper structure for the contexts.json file, but it doesn't make sense – both levels (package name and the id can be together)):

{
  "basic-styles/bold": "Bold button label."
}

The en.po file would look like this:

msgid "basic-styles/bold"
msgstr "Bold"
msgctxt "Bold button label."

So:

As msgid we use the id of that string, not the "Bold" string. Why? The reasons were explained in the post I quoted – the msgid has limited length and, more importantly, you lose all the translations if you change the id. So, if we'd use the real string "Bold" as msgid, if we decided that for some reason it should be "Bold text", then translations would need to repeat their work and for some time we wouldn't have this string translated at all. This was pointed out by @wwalc and it happened (the English text was changed) in the past in CKE4 multiple times.
So, if the msgid of the string is not the string itself, then why defining the contexts at all? We could have e.g. this:
```
t( 'core/button: Button' );
// in other context:
t( 'tailor/button: Button' );
```
Theoretically, the msgid clearly differentiates the use. However, as @wwalc pointed out again, translators on Transifex don't know CKEditor that well, so they might not be able to translate strings well enough based on short msgids. Hence, we need textual, human readable contexts just like we had in CKEditor 4 (e.g. https://github.com/ckeditor/ckeditor-dev/blob/master/dev/langtool/meta/ckeditor.plugin-basicstyles/meta.txt):
Finally, by requiring specifying the msgid in every t() usage, we'll be sure that the developer thought about what context it is, we'll be able to reuse translations from the core package, we'll know whether a certain text was defined in some other package already (we can have a tool to print msgstrs which appear with more than one msgid), etc. It's safe and pretty concise and useful at the same time.

Reinmar · 2017-01-09T17:39:23Z

Thanks @jodator. TBH, I have still troubles in understanding the real purpose of pot. Perhaps there are situations in which it allows resolving some unclear combinations of which string is defined in which po file. Or maybe it's also related to the topic I've just described – what is msgid? If you don't use real en strings for msgids, then the .pot file contains a real template of all other po files. Even to create the (base) en.po file, you need to add the msgstrs for all the entries, because, AFAICS, the _() function is used only with msgid. So in this case en.pot !== en.po.

But, if we accept what I described above, then en.pot === en.po, so we can generate just the en.po file and, if anyone wants it, the en.po can be used as en.pot.

Reinmar · 2017-01-09T17:39:50Z

Anyway, @ma2ciek will need to dig into Transifex docs to learn about the proper process.

franciscop · 2017-01-09T19:36:42Z

Just wanted to chip in to say that, with the new and awesome ES6 string templates, there is a lot of new functionality that can be applied here. For example, I just made a demo in 5 min (jsfiddle):

let translations = {
  'I am $1 $2!': {
    en: 'I am $1 $2!',
    es: '¡Soy $1 $2!',
    jp: (name, lastname) => `私は${lastname.toUpperCase()} ${name.toUpperCase()}です`
  }
};

let name = 'Francisco';
let lastname = 'Presencia';

let translation = t`I am ${name} ${lastname}!`;

Even in this simple example this looks more natural than:

let translation = t('I am $1 $2', name, lastname);

Of course you could modify many things and improve others. The point is, with template strings the syntax is closer to natural English while giving you more flexibility than normal functions.

Mavrin · 2017-01-09T20:08:32Z

Hi all. I implemented internationalization service for targetprocess. And I talk you how we made it. Maybe it will be useful for you. We use Transifex as well.
This is my talk at one conference https://mavrin.github.io/i18n-pres (Russian).
I try explain the main idea.
We don't use syntactic keys. I mean something like that __("basic-styles/bold") . We use original English phrase __("Bold button label.") . What it gives us? We see in development mode good string and we don't need dictionary for 'en' lang. Function __ have implementation like this

const currentLang ='en'; //this variable we get from user settings
const defaultLang = 'en';
const dictionary = { // it load dynamically from separate bundle
'ru': {
  'Bold button label.': 'Толстая кнопка'
}
}
const translate =() => {/*function which apply plural format or come variable*/} // we use https://github.com/yahoo/intl-messageformat for it
cost __ = (string, params) => {
  if(currentLang === defaultLang) {
    return translate(string, params);
  } else {
    return  translate(dictionary[currentLang][string] || string, params);
  }
}

How we made lang files for others. We put all dictionary to github repo.

We run tool on our CI which parse all code base and make files with dictionary. This tool support context for folder or single file, just specific comment to file or add meta file to folder . You could look it in test https://github.com/TargetProcess/tau-extract-gettext/tree/master/test
We push this dictionary to Transifex with help https://github.com/TargetProcess/tau-transifex/blob/master/index.js#L136-L152
We run sync between Transifex and Github with help https://github.com/TargetProcess/transifex-github-integration. It get all dictionaries from Transifex and push to GitHub and create tag.
We created web service wich we can install in your server and call update route. This service allows you to see the full version of the dictionary.

Reinmar · 2017-01-10T09:33:24Z

Of course you could modify many things and improve others. The point is, with template strings the syntax is closer to natural English while giving you more flexibility than normal functions.

Hi! Thanks for chipping in :) The use of templates was already proposed: ckeditor/ckeditor5-design#136 (comment) and we came to a conclusion that it's not going to work in our case.

We need to synchronise translations through a service like Transifex which means that they need to be plain text strings. Also, with the t() function we're able to pass an object there for a more complicated cases (see #387 (comment)).

EDIT:

@franciscop, I took a bit closer look at your demo and it's really interesting. I see that in the translation repository there are plain text strings and that you can change the order of placeholders in a translation and everything works fine:

en: 'I am $1 $2!',
es: '¡Soy $2 $1!',

If we were able to ensure that in the future we'd be able to implement the plural forms (see #387 (comment)), then using template strings could be an option. It doesn't sound particularly tricky to have the t() function supporting templates and objects at the same time, so it seems feasible.

Reinmar · 2017-01-10T09:55:44Z

I edited my previous comment because it didn't make a lot of sense initially and I created https://github.com/ckeditor/ckeditor5-utils/issues/117 – it'll be worth considering adding support for template strings to the t() function.

ma2ciek · 2017-01-10T14:13:02Z

So I'm working now on the very basic support without plural forms and template strings which we'll add later. I'll be pushing changes to the pull request for the https://github.com/ckeditor/ckeditor5-utils/issues/116

After small searching for the phrase t( ... ), which we want to find, collect and export to the .po file, I have found linkformview.js which is using generic form of t() that crashes a little bit this idea. Are these generic forms more?

Reinmar · 2017-01-10T15:05:52Z

I have found linkformview.js which is using generic form of t() that crashes a little bit this idea. Are these generic forms more?

That's a bug. t() should always be used with a string. You should throw an error if it's not.

Reinmar · 2017-01-10T15:13:10Z

I talked with @fredck and we decided to simplify the t() calls by removing the msg id from them. It has some downsides (most of the things listed in #387 (comment)), but it has one important upside – the t() calls will be shorter. Since the downsides will occur in rare situations, it's better to focus on something which most of us see – the t() calls.

I updated the spec in the issue description and for you to be able to understand what I changed I created https://gist.github.com/Reinmar/2ca4400d5fb724fd9be88fd903aee8eb/revisions.

Reinmar · 2017-01-10T16:00:15Z

@Mavrin

Hi all. I implemented internationalization service for targetprocess. And I talk you how we made it. Maybe it will be useful for you. We use Transifex as well.

Thanks for the tips. I can see that we're planning to implement the same solution (especially after simplifying the t() calls), so we're going in the right direction :)

ma2ciek · 2017-01-12T11:32:01Z

So for the clarification @Reinmar @wwalc I'm going to implement errors after spec update for the following cases:

t() call parameters without relevant translation key in package/lang/contexts.json and core/lang/contexts.json,
translation keys in contexts.json without relevant parameter t() call,
context definitions that are defined across multiple contexts.json more than once for the same translation key.

Reinmar · 2017-01-12T12:40:15Z

Sounds good to me.

ma2ciek · 2017-01-30T09:10:54Z

EDITED

I'm not sure about this path. The spec said translations should be placed in lang/ directly. I guess you wanted to separate those po files from contexts.json and en.pot, I guess it makes sense. However, then the correct path would be lang/translations/

Yes, I made a typo. Translations are saving under the lang/translations/ path.

Is any of these on review two? Please make PRs in every repo if those branches are supposed to be reviewed together.

I'll fix few issues and I'll create a pull request within an hour.

I've been checking the state of the env utils and I found this: https://github.com/ckeditor/ckeditor5-dev/compare/t/44#diff-71629f5ae239e9d11642ad3fec3911c1R33.
This is not going to work in practice. The t() call may be formatted in various ways and used in at least two contexts (after space and after tabulation), but that happens only in our code style. If different code style is used, there are many more variations. Besides, it's going to break with a simple string containing ")" character.

I had to misunderstand implementation of this part. I hope it's not going be hard to add babel or similar parser here to find t() call occurrences. There is already a Webpack parser that solves this issue: https://github.com/webpack-contrib/i18n-webpack-plugin/blob/master/index.js, but it won't be able to share code logic across Webpack and Rollup plugins.

ma2ciek · 2017-02-06T12:53:08Z

After a week I improved the t() calls find and replace using the Acorn parser and modified version of Escodegen generator (which master do not use latest version of Estraverse and throws errors during export default declaration parsing). So for now ckeditor5-dev-packages Uses the fork of the Escodegen. I'll report that issue.

I changed the way the .po files are being imported from the packages, but it turned out, that this part has to be explicitly declared in the CKEditorWebpackPlugin options. The ckeditor5-core package which is the default translation package must be resolved first to provide translations that other packages may use.

It took quite a lot of time compare to the effects, mainly because of the researching in out of date webpack plugins and webpack documentation. I hope it'll change soon.

Reinmar · 2017-02-06T13:37:38Z

I changed the way the .po files are being imported from the packages, but it turned out, that this part has to be explicitly declared in the CKEditorWebpackPlugin options. The ckeditor5-core package which is the default translation package must be resolved first to provide translations that other packages may use.

How does the process look like? Is it split into two parts:

discovering paths to packages,
modifying the files

or not?

If it is, then I don't see a problem in sorting the translations in a way that the core will always be first. Besides, why does it have to be first in the first place? There should be no conflicts between translations provided by packages anyway.

Reinmar · 2017-02-06T13:39:26Z

After a week I improved the t() calls find and replace using the Acorn parser and modified version of Escodegen generator (which master do not use latest version of Estraverse and throws errors during export default declaration parsing). So for now ckeditor5-dev-packages Uses the fork of the Escodegen. I'll report that issue.

How's that possible that Escodegen isn't supporting all ES6 yet? How's Webpack working with ES6? Is it always pre-transpiling the modules to CJS or some custom format?

ma2ciek · 2017-02-06T13:53:38Z

How does the process look like? Is it split into two parts:
discovering paths to packages,
modifying the files
or not?

The process isn't split. For each js file in ckeditor5-* the po file of the package is being added (if wasn't added earlier) and then the file is being translated. So the package can't use ckiedtor5-core translations which may be defined later.

Reinmar · 2017-02-06T14:06:38Z

The process isn't split.

Can't we split it? ;< Or somehow locate and add ckeditor5-core automatically?

ma2ciek · 2017-02-06T14:10:10Z

We can add manually ckeditor5-core. But we have to know where the package physically is. So this should be written explicitly in CKEditorWebpackPlugin options.

ma2ciek · 2017-02-06T14:12:23Z

Can't we split it?

The loader is used just after resolver for each file. So this part can't be separate if we don't know paths to the po files in the packages first.

Reinmar · 2017-02-06T14:17:38Z

We can add manually ckeditor5-core. But we have to know where the package physically is. So this should be written explicitly in CKEditorWebpackPlugin options.

That's why I asked whether we cannot use Webpack's module resolution logic to find this out ;) We can't hardcode this in any way.

ma2ciek · 2017-02-06T15:39:32Z

We can't hardcode this in any way.

So I have no idea how to get the informations about used packages before the loader is executed. This is a similar problem to the multi-language support problem.

Reinmar · 2017-02-06T15:43:11Z

So I have no idea how to get the informations about used packages before the loader is executed. This is a similar problem to the multi-language support problem.

It doesn't have to be an information about used packages. It has to be a path to the @ckeditor/ckeditor5-core package. I guess Webpack must have some API which lets you get path to a package based on its name. That's what triggers the "resolve" hooks.

ma2ciek · 2017-02-07T14:05:48Z

I used webpack compiler.resolvers.normal.resolve (it was very hard to find that function and any information about it other than compiler.resolvers.normal.resolve.toString() ), which has to be run after compiler.plugin( 'after-resolvers', cb ) hook. This function can resolve the path and then it's possible to get core translations before the loader translates t() calls. So this problem no longer exists.

ma2ciek · 2017-02-07T14:20:16Z

How's that possible that Escodegen isn't supporting all ES6 yet? How's Webpack working with ES6? Is it always pre-transpiling the modules to CJS or some custom format?

Webpack doesn't use Escodegen. I'm trying to figure out what is that package doing before files emitting. But I think, that Webpack may have own generator, because it transforms Acorn output to own objects and is injecting its __webpack_require__ function.

Reinmar · 2017-03-02T20:56:46Z

Big news :)

The editor can finally be localised:

There's still a lot of work ahead of us (mainly stabilisation and adding missing features), but it's good to finally see this working :).

Reinmar · 2017-03-02T20:57:15Z

PS. To build that sample I used https://github.com/ckeditor/ckeditor5-labs/tree/master/integrations/webpack-integration-localization

Reinmar added status:confirmed type:feature This issue reports a feature request (an idea for a new functionality or a missing option). labels Jan 9, 2017

Reinmar assigned ma2ciek Jan 9, 2017

Reinmar added this to the iteration 6 milestone Jan 9, 2017

Reinmar mentioned this issue Jan 10, 2017

Localizing strings (translations) ckeditor/ckeditor5-design#136

Closed

This was referenced Jan 30, 2017

Add translation tasks to gulpfile. #394

Merged

Scripts for collecting, uploading and downloading translations. ckeditor/ckeditor5-dev#60

Merged

Plugin doesn't work with webpack@2.2.0 webpack-contrib/i18n-webpack-plugin#31

Open

Reinmar mentioned this issue Feb 2, 2017

Webpack support #139

Closed

Reinmar modified the milestones: iteration 8, iteration 6 Feb 2, 2017

Reinmar mentioned this issue Feb 6, 2017

🏁 Release: Iteration 6&7 – 7th developer preview ckeditor/ckeditor5-design#173

Open

Reinmar closed this as completed in 72af30c Mar 2, 2017

Reinmar mentioned this issue Mar 13, 2017

🏁 Release: Iteration 8 – 8th developer preview ckeditor/ckeditor5-design#175

Open

ma2ciek mentioned this issue Mar 14, 2017

T/92: Provide more secure translations upload / download ckeditor/ckeditor5-dev#124

Merged

Reinmar mentioned this issue Oct 20, 2017

Implement translation service (i18n) v2 #624

Closed

ma2ciek mentioned this issue Nov 23, 2017

Implement font size and family feature #638

Closed

ma2ciek mentioned this issue Oct 9, 2019

Implement translation service #4929

Closed

Reinmar mentioned this issue Oct 9, 2019

Consider adding support for template strings to t() #4930

Closed

ma2ciek mentioned this issue May 18, 2020

Create scripts which collect, upload and download translations #6882

Closed

Implement translation service (i18n) #387

Implement translation service (i18n) #387

Comments

Reinmar commented Jan 9, 2017 • edited Loading

Process

Implementation

ckeditor5-core/src/editor~Editor#constructor

ckeditor5-*/lang/contexts.json

ckeditor5-utils/src/locale

ckeditor5-utils/src/translation-service

PS. PO file generation

jodator commented Jan 9, 2017

Reinmar commented Jan 9, 2017

Reinmar commented Jan 9, 2017

Reinmar commented Jan 9, 2017

wwalc commented Jan 9, 2017

ma2ciek commented Jan 9, 2017

jodator commented Jan 9, 2017

Reinmar commented Jan 9, 2017

Reinmar commented Jan 9, 2017 • edited Loading

wwalc commented Jan 9, 2017

Reinmar commented Jan 9, 2017

jodator commented Jan 9, 2017 • edited Loading

Reinmar commented Jan 9, 2017

Reinmar commented Jan 9, 2017

Reinmar commented Jan 9, 2017

franciscop commented Jan 9, 2017 • edited Loading

Mavrin commented Jan 9, 2017 • edited Loading

Reinmar commented Jan 10, 2017 • edited Loading

Reinmar commented Jan 10, 2017

ma2ciek commented Jan 10, 2017

Reinmar commented Jan 10, 2017

Reinmar commented Jan 10, 2017

Reinmar commented Jan 10, 2017

ma2ciek commented Jan 12, 2017 • edited Loading

Reinmar commented Jan 12, 2017

ma2ciek commented Jan 30, 2017 • edited Loading

ma2ciek commented Feb 6, 2017

Reinmar commented Feb 6, 2017

Reinmar commented Feb 6, 2017

ma2ciek commented Feb 6, 2017 • edited Loading

Reinmar commented Feb 6, 2017

ma2ciek commented Feb 6, 2017

ma2ciek commented Feb 6, 2017

Reinmar commented Feb 6, 2017

ma2ciek commented Feb 6, 2017

Reinmar commented Feb 6, 2017

ma2ciek commented Feb 7, 2017 • edited Loading

ma2ciek commented Feb 7, 2017 • edited Loading

Reinmar commented Mar 2, 2017

Reinmar commented Mar 2, 2017

Reinmar commented Jan 9, 2017 •

edited

Loading

`ckeditor5-core/src/editor~Editor#constructor`

`ckeditor5-*/lang/contexts.json`

`ckeditor5-utils/src/locale`

`ckeditor5-utils/src/translation-service`

Reinmar commented Jan 9, 2017 •

edited

Loading

jodator commented Jan 9, 2017 •

edited

Loading

franciscop commented Jan 9, 2017 •

edited

Loading

Mavrin commented Jan 9, 2017 •

edited

Loading

Reinmar commented Jan 10, 2017 •

edited

Loading

ma2ciek commented Jan 12, 2017 •

edited

Loading

ma2ciek commented Jan 30, 2017 •

edited

Loading

ma2ciek commented Feb 6, 2017 •

edited

Loading

ma2ciek commented Feb 7, 2017 •

edited

Loading

ma2ciek commented Feb 7, 2017 •

edited

Loading