-
New Feature
-
Resolution: Unresolved
-
Medium
-
Problem
LLMs can now be used to judge the quality of a translation, or, more precisely, the risk that a translation is wrong.
Solution
This issue contains the basic work that needs to be done to use quality estimation in translate5.
Additional info and changes to the ticket
- locked and blocked segments are also evaluated. Basically, all segments are sent for TQE (translation quality estimation)
- the task workflow in which the TQE is started should also be saved in a separate field, for information
- for displaying the results of the TQE we use the same layout as the match analysis panel
- with the current implementation, only ONE resource per TQE run is possible
- only LLM resources are available for TQE. Currently we hardcode this into the loader (load only Azure, Llama, OpenAI services)
- analogous to "Use resource by default", we add a new dropdown on language resource create/edit: "Use by default for TQE". We show it only for LLM-based services (Azure, Llama, OpenAI)
- use the provided prompt from the PDF below as much as possible. Adjust it so that it fits our structure. The PDF does not define batch requests, so we have to merge and adjust the single-segment prompt to work for batches
- new column with a checkbox in the pricing preset definition: "Standard for TQE", right after the "Standard" column checkbox. If checked, those pricing presets are pre-selected by default in the TQE analysis overview. Pricing presets can also be defined on client level, and the same checkbox should be added there too.
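The resource rules in the list above (only LLM services, a "Use by default for TQE" flag, exactly one resource per run) can be sketched as follows. This is a minimal illustration, not the translate5 loader: the class `LanguageResource`, the function names, and the sample resources are all hypothetical, and only the three service names come from this ticket.

```python
from dataclasses import dataclass

# Service names taken from the ticket; how translate5 identifies them internally
# is an assumption of this sketch.
LLM_SERVICE_TYPES = {"Azure", "Llama", "OpenAI"}


@dataclass(frozen=True)
class LanguageResource:
    """Hypothetical stand-in for a language resource record."""
    name: str
    service_type: str
    default_for_tqe: bool = False


def eligible_for_tqe(resources):
    """Keep only LLM-based resources (hard-coded list, as described above)."""
    return [r for r in resources if r.service_type in LLM_SERVICE_TYPES]


def preselected_for_tqe(resources):
    """Eligible resources flagged 'Use by default for TQE'."""
    return [r for r in eligible_for_tqe(resources) if r.default_for_tqe]


def resource_for_run(selected):
    """Enforce the current limit of exactly one resource per TQE run."""
    if len(selected) != 1:
        raise ValueError(f"a TQE run needs exactly one resource, got {len(selected)}")
    return selected[0]


# Illustrative data only.
resources = [
    LanguageResource("llm-a", "OpenAI", default_for_tqe=True),
    LanguageResource("tm-1", "TM"),
    LanguageResource("llm-b", "Llama"),
]

print([r.name for r in eligible_for_tqe(resources)])
print(resource_for_run(preselected_for_tqe(resources)).name)
```

Keeping the one-resource check in its own function would make it easy to relax later if several resources per run become possible.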
New language resource type: AI quality estimation
The current resource type "ChatGPT (OpenAI / MS Azure)" is renamed to "AI translation". Analogous for our other AI resources.
When one of them is selected for creation, a new radio-button alternative appears to select whether the resource should be used for translation or for quality estimation.
A new language resource type "AI quality estimation" is created.
All options, pre-assignment definitions for tasks, etc. work identically to "AI translation", except that the resource is used in a different context (not as "Translation AI") and only prompts from the prompt management that have the value "AI quality estimation" set can be used (see below).
In turn, for "AI translation" it must be ensured that only prompts that have "AI translation" set as the value for usage are offered (see below).
In the prompt building window of the language resource (not the prompt management in the preferences), the test area is changed for quality estimation. In contrast to the current test area, which continues to be used for translation, the new one contains fields for source and target. Both are sent to the AI, and as the result we show:
- the unparsed answer as json
- the parsed answer
- the actual result that would be saved as estimation value
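The three outputs of the test area could be derived roughly as in the sketch below. The answer format is an assumption made for illustration: the ticket does not define the JSON structure, so the `score` field, the 0 to 100 range, and the function name `parse_estimation` are hypothetical.

```python
import json


def parse_estimation(raw_answer: str) -> dict:
    """Return the three values the test area would show for one answer."""
    # Parsed answer: the JSON decoded into Python objects.
    parsed = json.loads(raw_answer)
    # Assumption: the model returns a numeric "score"; clamp it to 0..100
    # and round it to get the value that would be stored.
    score = float(parsed["score"])
    estimation = max(0, min(100, round(score)))
    return {"raw": raw_answer, "parsed": parsed, "estimation": estimation}


result = parse_estimation('{"score": 87.4, "comment": "minor terminology issue"}')
print(result["raw"])         # the unparsed answer as JSON
print(result["parsed"])      # the parsed answer
print(result["estimation"])  # the value that would be saved
```

Showing all three side by side makes it visible whether a surprising estimation value comes from the model's answer or from the parsing step.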
Basic system prompt for quality estimation
As a first basic hard-coded quality estimation prompt (to which prompts from the prompt management can be added in the way it works for "AI translation") the prompt from
https://arxiv.org/pdf/2502.12404v1 page 10
is used.
Integration of quality estimation in prompt management
Prompts in the prompt management get a new property: Area of usage.
It will be a pick list field with multi select.
Currently 2 values are offered: "AI translation" and "AI quality estimation".
In the future, "Translation proposals" and "Length shortening" will be added (once those features are implemented).
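A minimal sketch of how the multi-select "Area of usage" could restrict which prompts are offered to each resource type. The `Prompt` class, the function `prompts_for`, and the sample prompt titles are hypothetical; only the two area values come from this ticket.

```python
from dataclasses import dataclass, field

# The two values currently offered for "Area of usage".
AI_TRANSLATION = "AI translation"
AI_QUALITY_ESTIMATION = "AI quality estimation"


@dataclass
class Prompt:
    """Hypothetical stand-in for a prompt from the prompt management."""
    title: str
    # Multi-select: a prompt may be usable in several areas.
    areas_of_usage: set = field(default_factory=set)


def prompts_for(area, prompts):
    """Offer only prompts whose area of usage includes the given area."""
    return [p for p in prompts if area in p.areas_of_usage]


# Illustrative data only.
prompts = [
    Prompt("Formal tone", {AI_TRANSLATION}),
    Prompt("Check terminology", {AI_QUALITY_ESTIMATION}),
    Prompt("Domain glossary", {AI_TRANSLATION, AI_QUALITY_ESTIMATION}),
]

print([p.title for p in prompts_for(AI_QUALITY_ESTIMATION, prompts)])
```

Modelling the property as a set keeps the filter unchanged when further values such as "Translation proposals" are added later.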
New tabs "AI quality estimation"
In the areas "task property tabs" (where the tabs task->users, task->language resources, task->analysis, etc. are located) and "clients tabs" (where the tabs client->general, client->language-resources, etc. are located), new tabs "AI quality estimation" are created.
They have a similar structure and function to the language resources tabs in that area.
Tab task->AI quality estimation
It looks similar to the language resources tab. Differences can be seen in the following screen.

Tab clients->AI quality estimation
The functionality resembles that of the tab clients->language resources for pre-assignments of resources to tasks, with the differences seen in the screenshot.
