-
New Feature
-
Resolution: Fixed
-
None
-
High
- Calculate Levenshtein distances at segment level: the distance between the first version of the segment in the current workflow step and its current version
- Post-editing time has already been calculated in translate5 for 10 years
Save those 2 distances in the LEK_segments and LEK_segment_history tables, together with an aggregated distance between the first version and the current one at task level.
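As an illustration of the metric itself, here is a minimal sketch of a character-level Levenshtein distance. It is not the translate5 implementation: the function name, the sample strings, and the choice to compare plain strings without any special handling of internal tags are assumptions made for this example.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # the distance is symmetric; keep the inner row short
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # drop ch_a
                current[j - 1] + 1,      # add ch_b
                previous[j - 1] + cost,  # keep or substitute
            ))
        previous = current
    return previous[-1]


# Hypothetical segment versions within one workflow step
first_in_step = "The quick brown fox"
current_version = "The quick red fox"
distance_in_step = levenshtein(first_in_step, current_version)
```

A segment that was not changed gets a distance of 0, which matters for the averages discussed in the ToDos further down.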
Provide statistical data as follows:
- Calculate the average by the current filtering in the task management grid and make it available in the same way as "Show KPIs" and "Export meta data".
- If "advanced filters" are applied (a third dimension of criteria that correlate with the task user assignment (job)), the statistics are calculated only from the jobs that match the filter
- For further details on how to calculate the averages, please see sheet "Logic" in attached "Calculation and Tooltip Matrix.ods"
Add new advanced filters that allow filtering the tasks by:
- match rate range min/max (for MT resources, the quality estimation of Modelfront is stored in the same match rate column; LLM-based estimation will probably be used in the same way in the future)
- used language resource(s) (multi-select tag-field)
- language resource type (like TermCollection, TM, MT)
- Please note: If a filter relates to the language resource used for pre-translation, the first language resource used for pre-translation is taken into account, even if the user decides to manually take over something else from the fuzzy match panel
- Show the resulting Levenshtein distance and post-editing time averages for the current filtering in the existing "Show KPIs" window, and also make them available in the xlsx file that can be downloaded by clicking on "Download meta-data"
UI text for KPI window:
EN
Ø Post-editing time within 1 workflow step
Ø Levenshtein distance within 1 workflow step
Ø Post-editing time from the start of workflow
Ø Levenshtein distance from the start of workflow
DE
Ø Nachbearbeitungszeit innerhalb eines Workflowschritts
Ø Levenshtein-Distanz innerhalb eines Workflowschritts
Ø Nachbearbeitungszeit ab Beginn des Workflows
Ø Levenshtein-Distanz ab Beginn des Workflows
The Levenshtein average in the UI should show at least 5 decimal places.
Each of those UI texts should be followed by an info icon, and a tooltip should appear when hovering over the text or the info icon. For the tooltip texts, please see the attached file "Calculation and Tooltip Matrix.ods".
Additional note:
1/ If the ClickHouse connection cannot be established, the KPI window displays "Daten nicht verfügbar" ("Data unavailable") next to the post-editing time label
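The fallback described in this note could look roughly like the sketch below. Everything here is hypothetical: the function name, the callable that fetches the average, and the use of Python's built-in ConnectionError to stand in for a failed ClickHouse connection. It only shows the intended behavior, not how translate5 implements it.

```python
from typing import Callable

# UI strings taken from the note above
UNAVAILABLE = {"en": "Data unavailable", "de": "Daten nicht verfügbar"}


def render_post_editing_kpi(fetch_average: Callable[[], float],
                            label: str, locale: str = "en") -> str:
    """Return the KPI line, or the fallback text if the statistics store is unreachable."""
    try:
        value = fetch_average()
    except ConnectionError:
        # Assumption: a failed connection surfaces as ConnectionError
        return f"{label}: {UNAVAILABLE[locale]}"
    return f"{label}: {value}"


def broken_fetch() -> float:
    """Stand-in for a query against an unreachable statistics database."""
    raise ConnectionError("statistics database not reachable")


line = render_post_editing_kpi(
    broken_fetch, "Ø Nachbearbeitungszeit innerhalb eines Workflowschritts", "de"
)
```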
Actual ToDos:
Information needed for the calculation
1. (overall) Levenshtein distance (= number of changes)
2. number of segments
3. number of workflow steps
It would be good to have this information available somewhere, to get a better feeling for why a calculated result does not equal the expected result.
When segments are not touched by a user, they are not counted at all (they have no entry in the segment-history table). This does not feel right to me, see example "Test 2025-01-17: #1"
The calculation is correct once a workflow step is finished. Within a running workflow step, however, untouched segments are not included and are therefore "missing" (the segment count covers only the segments edited/touched so far, not all segments).
=> The figures for the current workflow step may not be calculated correctly, because the "0 segments" are not present there.
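A small worked example shows why the missing "0 segments" distort the average. The numbers and the function name are made up for illustration; the real averaging rules are defined in the "Logic" sheet of the attached matrix.

```python
from typing import Optional, Sequence


def average_distance(touched: Sequence[int],
                     total_segments: Optional[int] = None) -> float:
    """Average distance; untouched segments count as 0 if total_segments is given."""
    count = len(touched) if total_segments is None else total_segments
    if count == 0:
        return 0.0
    return sum(touched) / count


# Hypothetical task: 10 segments, only 2 edited so far (distances 4 and 6)
touched_distances = [4, 6]

only_touched = average_distance(touched_distances)                   # 5.0
with_zero_segments = average_distance(touched_distances,
                                      total_segments=10)             # 1.0
```

Averaging only over touched segments gives 5.0, while counting the eight untouched segments as distance 0 gives 1.0, so the value seen during a running workflow step can differ a lot from the value after the step is finished.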
What should happen if a user edits a segment "outside" any workflow step? This can be reproduced by creating a task and NOT assigning any user.
=> We decided to have one workflow step before the actual workflow, called "no workflow", and one AFTER the workflow.
The "0 segments" for "no workflow" are added on the event "start workflow", and the ones for "after workflow" are added when the task ends.
Those two workflow steps are always calculated independently, which means they are not part of the "distance per workflow-step" or "distance from the start" calculation.
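The separation of the two pseudo steps from the regular workflow steps can be sketched as follows. The step names "translation" and "reviewing", the internal name "after workflow", and the simple mean are assumptions for illustration only; the actual aggregation rules are those in the attached matrix.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, Tuple

BEFORE_STEP = "no workflow"
AFTER_STEP = "after workflow"  # hypothetical internal name


def average_by_bucket(rows: Iterable[Tuple[str, int]]) -> Dict[str, float]:
    """Average distances separately for before, within, and after the workflow."""
    buckets = defaultdict(list)
    for step, distance in rows:
        if step == BEFORE_STEP:
            key = "before"
        elif step == AFTER_STEP:
            key = "after"
        else:
            key = "workflow"  # regular steps only
        buckets[key].append(distance)
    return {key: mean(values) for key, values in buckets.items()}


# Hypothetical (workflow step, Levenshtein distance) rows for one task
rows = [
    ("no workflow", 2),
    ("translation", 4),
    ("reviewing", 6),
    ("after workflow", 0),
]
averages = average_by_bucket(rows)
```

Edits made before or after the workflow end up in their own buckets and do not change the "workflow" average.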
-> Labels for the 4 new KPIs (Levenshtein distance and post-editing time) for "Before" (no workflow) and "After" (after workflow)
EN
Ø Levenshtein distance before the start of workflow
Ø Post-editing time before the start of workflow
Ø Levenshtein distance after the end of workflow
Ø Post-editing time after the end of workflow
DE
Ø Levenshtein-Distanz vor Beginn des Workflows
Ø Nachbearbeitungszeit vor Beginn des Workflows
Ø Levenshtein-Distanz nach Ende des Workflows
Ø Nachbearbeitungszeit nach Ende des Workflows
The tooltip in the Task-List KPI window must only be shown when hovering over the "i"
In the CLI, add an optional parameter taskId to "t5 statistics:levenshtein/aggregate"
Sample: t5 statistics:levenshtein -t 123 => calculates Levenshtein distances only for the task with ID 123
- blocks: TRANSLATE-4221 Save analysis from xliff (Done)
- is duplicated by: TRANSLATE-2292 Get the levinsthein distance of a post-edited segment from its MT input (Done)
- relates to: TRANSLATE-4181 Add missing workflow select field and workflow steps from all existing workflows to task overview "advanced filters" (Done)
- relates to: TRANSLATE-3916 Make concept for Post-editing and levenshtein statistics (Done)
To calculate a Levenshtein distance online for comparison, this can be used: https://planetcalc.com/1721/