 *`SPRING_BOOT_ADMIN_CLIENT_INSTANCE_SERVICE-BASE-URL` -- the URL of your Qanary component (has to be visible to the Qanary pipeline)
 *`SERVICE_NAME_COMPONENT` -- the name of your Qanary component (for better identification)
 *`SERVICE_DESCRIPTION_COMPONENT` -- the description of your Qanary component
-*`SOURCE_LANGUAGE` -- (optional) the source language of the text (the component will use langdetect if no source language is given)
+*`SOURCE_LANGUAGE` -- (optional) the default source language of the translation
+*`TARGET_LANGUAGE` -- (optional) the default target language of the translation

 4. Build the Docker image:

@@ -82,18 +84,43 @@ docker-compose build .
 docker-compose up
 ```

-After execution, component creates Qanary annotation in the Qanary triplestore:
+After successful execution, the component creates a Qanary annotation in the Qanary triplestore:
 ```
 GRAPH <uuid> {
-?a a qa:AnnotationOfQuestionLanguage .
-?a qa:translationResult "translation result" .
-?a qa:sourceLanguage "ISO_639-1 language code" .
-?a oa:annotatedBy <urn:qanary:app_name> .
-?a oa:annotatedAt ?time .
-}
+?a a qa:AnnotationOfQuestionTranslation .
+?a oa:hasTarget <urn:myQanaryQuestion> .
+?a oa:hasBody "translation_result"@ISO_639-1 language code .
+?a oa:annotatedBy <urn:qanary:app_name> .
+?a oa:annotatedAt ?time .
 }
 ```

+### Support for multiple Source and Target Languages
+
+This component relies on the presence of one or more existing annotations that associate a question text with a language.
+This can be in the form of an `AnnotationOfQuestionLanguage`, as created by language detection (LD) components, or an `AnnotationOfQuestionTranslation`, as created by machine translation (MT) components.
+
+It supports multiple combinations of source and target languages.
+You can specify a desired source and target language independently, or simply use all available language pairings.
+
+If a `SOURCE_LANGUAGE` is set, then only texts in this specific language are considered for translation.
+If none is set, then all configured source languages will be used to find candidates for translation.
+
+Similarly, if a `TARGET_LANGUAGE` is set, then texts are only translated into that language.
+If none is set, then the texts are translated into all target languages that are supported for their respective source language.
+
+Note that while configured source languages naturally determine the possible target languages,
+the configured target languages also determine which source languages can be supported!
+
+### Pre-configured Docker Images
+
+You may use the included file `docker-compose-pairs.yml` to build a list of images that are preconfigured for specific language pairs.
+Note that if you intend to use these containers at the same time, you need to assign a different `SERVER_PORT` value to each image.
+
+```bash
+docker-compose -f docker-compose-pairs.yml build
+```
+
 ## How To Test This Component

 This component uses [pytest](https://docs.pytest.org/).
python -c "from transformers.models.marian.modeling_marian import MarianMTModel; from transformers.models.marian.tokenization_marian import MarianTokenizer; supported_langs = ['ru', 'es', 'de', 'fr']; models = {lang: MarianMTModel.from_pretrained('Helsinki-NLP/opus-mt-{lang}-en'.format(lang=lang)) for lang in supported_langs}; tokenizers = {lang: MarianTokenizer.from_pretrained('Helsinki-NLP/opus-mt-{lang}-en'.format(lang=lang)) for lang in supported_langs}"
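The language selection rules described under "Support for multiple Source and Target Languages" can be sketched in a few lines of Python. This is only an illustration and not the component's actual code: the function `select_language_pairs`, the `SUPPORTED_PAIRS` table, and the language codes in it are hypothetical names and values chosen for the example. The sketch assumes that the languages found in existing annotations are already available as a list of ISO 639-1 codes.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical configuration: source language -> target languages that a
# translation model is configured for. Not taken from the component itself.
SUPPORTED_PAIRS: Dict[str, List[str]] = {
    "de": ["en"],
    "en": ["de", "es"],
}


def select_language_pairs(
    annotated_languages: List[str],
    source_language: Optional[str] = None,
    target_language: Optional[str] = None,
    supported: Dict[str, List[str]] = SUPPORTED_PAIRS,
) -> List[Tuple[str, str]]:
    """Return the (source, target) pairs that would be translated.

    annotated_languages: languages of the question texts found in existing
    annotations (assumed input). source_language and target_language play
    the role of SOURCE_LANGUAGE and TARGET_LANGUAGE; None means "not set".
    """
    pairs: List[Tuple[str, str]] = []
    for src in annotated_languages:
        # If a source language is set, only texts in that language qualify.
        if source_language is not None and src != source_language:
            continue
        # Only targets configured for this source language are possible.
        for tgt in supported.get(src, []):
            # If a target language is set, translate only into that language.
            if target_language is not None and tgt != target_language:
                continue
            pairs.append((src, tgt))
    return pairs


if __name__ == "__main__":
    # No restrictions: every supported pairing of the annotated languages.
    print(select_language_pairs(["de", "en"]))
    # Restricting the target also narrows the usable source languages.
    print(select_language_pairs(["de", "en"], target_language="en"))
```

The second call shows the interaction noted in the README text: with the target fixed to `en`, only the German text remains a candidate, because the hypothetical table contains no model that translates English into English.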