Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Identify and translate missing plural keys #596

Open
wants to merge 18 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 4 additions & 0 deletions config/locales/en.yml
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,7 @@ en:
nostdin: Do not read from stdin
out_format: 'Output format: %{valid_text}'
pattern_router: 'Use pattern router: keys moved per config data.write'
skip_interpolation: Skip translating strings that are straight interpolations (eg. "%{comment}")
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than adding an option, perhaps we should always simply copy such keys verbatim? There is never a reason to translate them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would be preferential in our use case but I didn't want to assume...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually we don't even copy them, we have Rails configured with fallback which then just works... Not sure that's a safe assumption either though.

Copy link
Owner

@glebm glebm Sep 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is never any reason to pass interpolation-only values through machine translation, so let's copy them verbatim?

Copy link
Contributor Author

@markedmondson markedmondson Oct 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me! :)

Edit: Although in our case we don't want to copy them either, I'd prefer to skip and leave them to fallback. Should I leave a mechanism in place for that?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, makes sense to have that as an option.

strict: >-
Avoid inferring dynamic key usages such as t("cats.#{cat}.name"). Takes precedence over
the config setting if set.
Expand Down Expand Up @@ -112,8 +113,11 @@ en:
missing:
details_title: Value in other locales or source
none: No translations are missing.
plural: Missing plural keys
openai_translate:
errors:
invalid_json: invalid JSON returned from backend.
invalid_size: 'Invalid size returned, expected: %{expected}, received: %{actual}.'
no_api_key: >-
Set OpenAI API key via OPENAI_API_KEY environment variable or translation.openai_api_key
in config/i18n-tasks.yml. Get the key at https://openai.com/.
Expand Down
5 changes: 5 additions & 0 deletions config/locales/ru.yml
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,8 @@ ru:
nostdin: Не читать дерево из стандартного ввода
out_format: 'Формат вывода: %{valid_text}.'
pattern_router: 'Использовать pattern_router: ключи распределятся по файлам согласно data.write'
skip_interpolation: Пропустить перевод строк, которые являются прямыми интерполяциями (например,
"%{comment}")
strict: Не угадывать динамические использования ключей, например `t("category.#{category.key}")`
translation_backend: Движок перевода [google, deepl, yandex, openai]
value: >-
Expand Down Expand Up @@ -111,8 +113,11 @@ ru:
missing:
details_title: На других языках или в коде
none: Всё переведено.
plural: Отсутствуют ключи множественного числа
openai_translate:
errors:
invalid_json: недействительный JSON, возвращенный из бэкэнда
invalid_size: 'Возвращен неверный размер, ожидалось: %{expected}, получено: %{actual}.'
no_api_key: |-
Установить ключ API Яндекса с помощью переменной среды OPENAI_API_KEY или translation.openai_api_key
в config / i18n-tasks.yml. Получите ключ по адресу https://openai.com/.
Expand Down
3 changes: 3 additions & 0 deletions lib/i18n/tasks/base_task.rb
Original file line number Diff line number Diff line change
Expand Up @@ -39,6 +39,9 @@ class BaseTask
include Data
include Stats

INTERPOLATION_KEY_RE = /%\{[^}]+}/.freeze
INTERPOLATION_ONLY_KEY_RE = /^%\{[^}]+\}$/.freeze

def initialize(config_file: nil, **config)
@config_override = config_file
self.config = config || {}
Expand Down
13 changes: 11 additions & 2 deletions lib/i18n/tasks/command/commands/missing.rb
Original file line number Diff line number Diff line change
Expand Up @@ -40,10 +40,19 @@ def missing(opt = {})
cmd :translate_missing,
pos: '[locale ...]',
desc: t('i18n_tasks.cmd.desc.translate_missing'),
args: [:locales, :locale_to_translate_from, arg(:out_format).from(1), :translation_backend, :pattern]
args: [
:locales, :locale_to_translate_from, arg(:out_format).from(1),
:translation_backend, :pattern, :skip_interpolation
]

def translate_missing(opt = {})
missing = i18n.missing_diff_forest opt[:locales], opt[:from]
missing = i18n.missing_keys(
locales: opt[:locales],
types: %w[diff plural],
base_locale: opt[:from],
skip_interpolation: (opt[:skip_interpolation] != 'false')
)

if opt[:pattern]
pattern_re = i18n.compile_key_pattern(opt[:pattern])
missing.select_keys! { |full_key, _node| full_key =~ pattern_re }
Expand Down
5 changes: 5 additions & 0 deletions lib/i18n/tasks/command/options/common.rb
Original file line number Diff line number Diff line change
Expand Up @@ -33,6 +33,11 @@ module Common
'--config FILE',
t('i18n_tasks.cmd.args.desc.config')

arg :skip_interpolation,
'-s',
'--skip_interpolation',
t('i18n_tasks.cmd.args.desc.skip_interpolation')

def arg_or_pos!(key, opts)
opts[key] ||= opts[:arguments].try(:shift)
end
Expand Down
91 changes: 62 additions & 29 deletions lib/i18n/tasks/missing_keys.rb
Original file line number Diff line number Diff line change
Expand Up @@ -19,12 +19,12 @@

# @param types [:used, :diff, :plural] all if `nil`.
# @return [Siblings]
def missing_keys(locales: nil, types: nil, base_locale: nil)
def missing_keys(locales: nil, types: nil, base_locale: nil, skip_interpolation: false)
locales ||= self.locales
types ||= missing_keys_types
base = base_locale || self.base_locale
types.inject(empty_forest) do |f, type|
f.merge! send(:"missing_#{type}_forest", locales, base)
f.merge! send(:"missing_#{type}_forest", locales, base, skip_interpolation)
end
end

Expand All @@ -35,47 +35,31 @@
end
end

def missing_diff_forest(locales, base = base_locale)
def missing_diff_forest(locales, base = base_locale, skip_interpolation = false)
tree = empty_forest

# present in base but not locale
(locales - [base]).each do |locale|
tree.merge! missing_diff_tree(locale, base)
tree.merge! missing_diff_tree(locale, base, skip_interpolation)
end
if locales.include?(base)
# present in locale but not base
(self.locales - [base]).each do |locale|
tree.merge! missing_diff_tree(base, locale)
tree.merge! missing_diff_tree(base, locale, skip_interpolation)
end
end
tree
end

def missing_used_forest(locales, _base = base_locale)
def missing_used_forest(locales, _base = base_locale, _skip_interpolation = false)
locales.inject(empty_forest) do |forest, locale|
forest.merge! missing_used_tree(locale)
end
end

def missing_plural_forest(locales, _base = base_locale)
locales.each_with_object(empty_forest) do |locale, forest|
required_keys = required_plural_keys_for_locale(locale)
next if required_keys.empty?

tree = empty_forest
plural_nodes data[locale] do |node|
children = node.children
present_keys = Set.new(children.map { |c| c.key.to_sym })
next if ignore_key?(node.full_key(root: false), :missing)
next if present_keys.superset?(required_keys)

tree[node.full_key] = node.derive(
value: children.to_hash,
children: nil,
data: node.data.merge(missing_keys: (required_keys - present_keys).to_a)
)
end
tree.set_root_key!(locale, type: :missing_plural)
forest.merge!(tree)
def missing_plural_forest(locales, base = base_locale, _skip_interpolation = false)
locales.inject(empty_forest) do |forest, locale|
forest.merge! missing_plural_tree(locale, base)
end
end

Expand All @@ -93,9 +77,14 @@
end

# keys present in compared_to, but not in locale
def missing_diff_tree(locale, compared_to = base_locale)
data[compared_to].select_keys do |key, _node|
locale_key_missing? locale, depluralize_key(key, compared_to)
def missing_diff_tree(locale, compared_to = base_locale, skip_interpolation = false)
data[compared_to].select_keys do |key, node|
# Skip if the string is just an interpolation, eg: "%{body}"
if skip_interpolation && node.value.to_s.match?(BaseTask::INTERPOLATION_ONLY_KEY_RE)
false
else
locale_key_missing? locale, depluralize_key(key, compared_to)
end
end.set_root_key!(locale, type: :missing_diff).keys do |_key, node|
# change path and locale to base
data = { locale: locale, missing_diff_locale: node.data[:locale] }
Expand All @@ -106,6 +95,50 @@
end
end

def missing_plural_tree(locale, base = base_locale, _skip_interpolation = false) # rubocop:disable Metrics/AbcSize

Check failure on line 98 in lib/i18n/tasks/missing_keys.rb

View workflow job for this annotation

GitHub Actions / lint

Metrics/MethodLength: Method has too many lines. [28/25]
tree = empty_forest

required_keys = required_plural_keys_for_locale(locale)
return tree if required_keys.empty?

plural_nodes data[base] do |node|
children = node.children

# Get the existing translated node if it exists
node_key = node.full_key(root: false)
translated_node = data[locale].get("#{locale}.#{node_key}")
present_keys = if translated_node
# If it's a single existing (non-plural) translation, skip it
next unless translated_node.value.nil?

# Otherwise get the existing plural keys
Set.new(translated_node.children.map { |c| c.key.to_sym })
else
Set.new
end
# Compare the keys to those existing in base
next if present_keys.superset?(required_keys)

# Mark for removal any existing keys that are not required
base_keys = Set.new(node.children.map { |c| c.key.to_sym })
missing_keys = (required_keys - present_keys).to_a
remove_keys = (present_keys + base_keys) - required_keys

# Remove any ignored plural keys, eg: 'something.key.few' or '*.many'
missing_keys.reject! do |plural_key|
ignore_key?("#{node.full_key(root: false)}.#{plural_key}", :missing, locale)
end
next if missing_keys.empty? && remove_keys.empty?

tree[node.full_key] = node.derive(
value: children.to_hash,
children: nil,
data: node.data.merge(missing_keys: missing_keys, remove_keys: remove_keys)
)
end
tree.set_root_key!(locale, type: :missing_plural)
end

# keys used in the code missing translations in locale
def missing_used_tree(locale)
used_tree(strict: true).select_keys do |key, _node|
Expand Down
3 changes: 2 additions & 1 deletion lib/i18n/tasks/reports/terminal.rb
Original file line number Diff line number Diff line change
Expand Up @@ -116,7 +116,8 @@ def missing_key_info(leaf)
when :missing_used
first_occurrence leaf
when :missing_plural
leaf[:data][:missing_keys].join(', ')
"#{Rainbow(I18n.t('i18n_tasks.missing.plural')).cyan} " +
leaf[:data][:missing_keys].join(', ')
else
"#{Rainbow(leaf[:data][:missing_diff_locale]).cyan} " \
"#{format_value(leaf[:value].is_a?(String) ? leaf[:value].strip : leaf[:value])}"
Expand Down
43 changes: 37 additions & 6 deletions lib/i18n/tasks/translators/base_translator.rb
Original file line number Diff line number Diff line change
Expand Up @@ -14,8 +14,9 @@ def initialize(i18n_tasks)
# @return [I18n::Tasks::Tree::Siblings] translated forest
def translate_forest(forest, from)
forest.inject @i18n_tasks.empty_forest do |result, root|
pairs = root.key_values(root: true)
merge_missing_plural_nodes!(root)

pairs = root.key_values(root: true)
@progress_bar = ProgressBar.create(total: pairs.flatten.size, format: '%a <%B> %e %c/%C (%p%%)')

translated = translate_pairs(pairs, to: root.key, from: from)
Expand All @@ -25,6 +26,37 @@ def translate_forest(forest, from)

protected

# Recursively loop through the tree until all :missing_plural nodes have
# been added to the base
# @param [I18n::Tasks::Data::Tree::Node] node
# @return [void]
def merge_missing_plural_nodes!(node) # rubocop:disable Metrics/AbcSize
return unless node.children?

node.children.each do |child|
if child.data[:type] == :missing_plural
child.data.delete(:type)
# Add missing plural keys
child.data.delete(:missing_keys).each do |missing_key|
# If the key is present use it, otherwise use the 'other' value
other_value = child.value[missing_key.to_s] || child.value['other']
new_node = Data::Tree::Node.new(
key: missing_key,
value: other_value,
parent: child.parent,
data: { type: :missing_plural_key, locale: node.root.key }
)
child.append!(new_node)
end
# Remove uneeded plural keys
remove_keys = child.data.delete(:remove_keys) || Set.new
child.select_nodes! { |n| !remove_keys.include?(n.key.to_sym) }
else
merge_missing_plural_nodes!(child)
end
end
end

# @param [Array<[String, Object]>] list of key-value pairs
# @return [Array<[String, Object]>] translated list
def translate_pairs(list, opts)
Expand Down Expand Up @@ -55,7 +87,7 @@ def fetch_translations(list, opts)
# @param [Array<[String, Object]>] list of key-value pairs
# @return [Array<String>] values for translation extracted from list
def to_values(list, opts)
list.map { |l| dump_value(l[1], opts) }.flatten.compact
list.map { |l| dump_value(l[1], opts) }.flatten(1).compact
end

# @param [Array<[String, Object]>] list
Expand Down Expand Up @@ -108,14 +140,13 @@ def parse_value(untranslated, each_translated, opts)
end
end

INTERPOLATION_KEY_RE = /%\{[^}]+}/.freeze
UNTRANSLATABLE_STRING = 'X__'

# @param [String] value
# @return [String] 'hello, %{name}' => 'hello, <round-trippable string>'
def replace_interpolations(value)
i = -1
value.gsub INTERPOLATION_KEY_RE do
value.gsub BaseTask::INTERPOLATION_KEY_RE do
i += 1
"#{UNTRANSLATABLE_STRING}#{i}"
end
Expand All @@ -125,9 +156,9 @@ def replace_interpolations(value)
# @param [String] translated
# @return [String] 'hello, <round-trippable string>' => 'hello, %{name}'
def restore_interpolations(untranslated, translated)
return translated if untranslated !~ INTERPOLATION_KEY_RE
return translated if untranslated !~ BaseTask::INTERPOLATION_KEY_RE

values = untranslated.scan(INTERPOLATION_KEY_RE)
values = untranslated.scan(BaseTask::INTERPOLATION_KEY_RE)
translated.gsub(/#{Regexp.escape(UNTRANSLATABLE_STRING)}\d+/i) do |m|
values[m[UNTRANSLATABLE_STRING.length..].to_i]
end
Expand Down
Loading
Loading