  1. translate5
  2. TRANSLATE-3535

Evaluate postediting time and levenshtein distance

Details

    • New Feature
    • Resolution: Unresolved
    • None
    • None
    • Task Management
    • High
    • If post-editing time and Levenshtein distance KPIs are needed for legacy data, run the following commands:
      t5 statistics:levenshtein (calculates missing Levenshtein values in the segment history)
      t5 statistics:aggregate (aggregates segment history data into the statistics DB)
    • Added aggregation of segment editing history data to calculate and display KPIs related to Levenshtein distance and post-editing time

    Description

      • Calculate the Levenshtein distance at segment level between the first version of the segment in the current workflow step and its current version
      • Post-editing time has already been calculated in translate5 for 10 years

      Save those 2 distances in the LEK_segments and LEK_segment_history tables, and save an aggregated distance between the first version and the current one at task level.
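
The segment-level distance described above could be computed along the following lines. This is a minimal sketch of the classic dynamic-programming edit distance in Python, not translate5's own implementation; the `history` list of segment versions is a made-up example, and whether the distance is counted per character (as here) or on some other unit is an assumption.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


# Hypothetical versions of one segment within a single workflow step,
# oldest first. Only the first and the current version are compared.
history = ["Das ist ein Test", "Das ist ein Text", "Dies ist ein Text"]
distance_in_step = levenshtein(history[0], history[-1])
print(distance_in_step)
```

Comparing only the first and the last version means that intermediate edits which were later reverted do not add to the distance.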

      Provide statistical data as follows:

      • Calculate the average by the current filtering in the task management grid and make it available in the same way as "Show KPIs" and "Export meta data".
      • If "advanced filters" are applied (a third dimension of criteria that correlate with the task user assignment, i.e. the job), the statistics are calculated only from the jobs that match the filter
      • For further details on how to calculate the averages, please see sheet "Logic" in attached "Calculation and Tooltip Matrix.ods"
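
As a rough illustration of averaging over the current filtering, the sketch below restricts a set of job records to the filtered tasks (and optionally to the jobs matched by an advanced filter) before averaging. It assumes a plain arithmetic mean; the actual rules are defined in the sheet "Logic" of the attached file and are not reproduced here. The `JobStat` record and all of its fields are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, Iterable, Optional, Set


@dataclass(frozen=True)
class JobStat:
    """Hypothetical per-job statistics record (not a translate5 structure)."""
    task_id: int
    user: str
    levenshtein: float     # assumed: precomputed distance for this job
    post_editing_ms: int   # assumed: post-editing time in milliseconds


def average_kpis(
    jobs: Iterable[JobStat],
    task_ids: Set[int],
    users: Optional[Set[str]] = None,
) -> Optional[Dict[str, float]]:
    """Average only over jobs of the filtered tasks.

    `users` stands in for an advanced filter on the job level; when it is
    given, jobs of other users are excluded from the average.
    """
    selected = [
        job for job in jobs
        if job.task_id in task_ids and (users is None or job.user in users)
    ]
    if not selected:
        return None  # nothing to average for this filtering
    return {
        "levenshtein": mean(job.levenshtein for job in selected),
        "post_editing_ms": mean(job.post_editing_ms for job in selected),
    }


jobs = [
    JobStat(task_id=1, user="anna", levenshtein=4.0, post_editing_ms=1000),
    JobStat(task_id=1, user="ben", levenshtein=2.0, post_editing_ms=3000),
    JobStat(task_id=2, user="anna", levenshtein=6.0, post_editing_ms=2000),
]
print(average_kpis(jobs, task_ids={1, 2}))
print(average_kpis(jobs, task_ids={1, 2}, users={"anna"}))
```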

      Add new advanced filters, that allow to filter the tasks by

        • match rate range min/max (for MT resources, the quality estimation of Modelfront, and probably of LLMs in the future, is stored in the same match rate column and used in the same way)
        • used language resource(s) (multi-select tag-field)
        • language resource type (like TermCollection, TM, MT)
      • Please note: If a filter relates to the language resource used for pre-translation, only the first language resource used for pre-translation is taken into account, even if the user later manually takes over something else from the fuzzy match panel
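
The filter rules above, including the note about the first pre-translation resource, might be expressed as in the following sketch. All names (`LanguageResource`, `SegmentRecord`, `matches_filter`) are hypothetical, and the per-segment granularity is an assumption: how tasks are selected from matching records is not specified here.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass(frozen=True)
class LanguageResource:
    name: str
    type: str  # e.g. "TermCollection", "TM", "MT"


@dataclass(frozen=True)
class SegmentRecord:
    """Hypothetical record; field names are illustrative only."""
    match_rate: int
    pretranslation_resource: Optional[LanguageResource]  # first one used
    taken_over_resource: Optional[LanguageResource] = None  # manual takeover


def matches_filter(
    record: SegmentRecord,
    min_rate: Optional[int] = None,
    max_rate: Optional[int] = None,
    resource_names: Optional[Set[str]] = None,
    resource_types: Optional[Set[str]] = None,
) -> bool:
    """Apply the three advanced filters; None means 'filter not set'."""
    if min_rate is not None and record.match_rate < min_rate:
        return False
    if max_rate is not None and record.match_rate > max_rate:
        return False
    # Only the resource used for pre-translation counts. A resource taken
    # over manually from the fuzzy match panel is deliberately ignored.
    resource = record.pretranslation_resource
    if resource_names is not None:
        if resource is None or resource.name not in resource_names:
            return False
    if resource_types is not None:
        if resource is None or resource.type not in resource_types:
            return False
    return True


record = SegmentRecord(
    match_rate=85,
    pretranslation_resource=LanguageResource("Engine A", "MT"),
    taken_over_resource=LanguageResource("Customer TM", "TM"),
)
print(matches_filter(record, resource_types={"MT"}))
print(matches_filter(record, resource_types={"TM"}))
```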

      • Show the resulting Levenshtein distance and post-editing time averages for the current filtering in the existing "Show KPIs" window, and also make them available in the xlsx file that can be downloaded by clicking on "Download meta-data"

      UI text for KPI window:
      EN
      Ø Post-editing time within 1 workflow step

      Ø Levenshtein distance within 1 workflow step
      Ø Post-editing time from the start of workflow
      Ø Levenshtein distance from the start of workflow

      DE
      Ø Nachbearbeitungszeit innerhalb eines Workflowschritts
      Ø Levenshtein-Abstand innerhalb eines Workflowschritts
      Ø Nachbearbeitungszeit ab Beginn des Workflows
      Ø Levenshtein-Abstand ab Beginn des Workflows

      The Levenshtein average in the UI should show at least 5 decimal places.
      Each of those UI texts should be followed by an info icon, and a tooltip should appear when hovering over the text or the info icon. For the tooltip texts, please see the attached file "Calculation and Tooltip Matrix.ods".

      Additional note:
      1. If the ClickHouse connection cannot be established, the KPI window displays "Daten nicht verfügbar" ("Data unavailable") next to the post-editing time label.
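
The display rules for the KPI window (at least 5 decimal places for the Levenshtein average, and a fallback text when the statistics cannot be read) can be sketched as follows. This is an illustration only: the `fetch` callable is a hypothetical stand-in for whatever reads the value from ClickHouse, and signalling a failed connection with `ConnectionError` is an assumption.

```python
from typing import Callable

# Fallback texts taken from the note above.
UNAVAILABLE = {"en": "Data unavailable", "de": "Daten nicht verfügbar"}


def format_levenshtein(value: float) -> str:
    """Render the average with exactly 5 decimal places."""
    return f"{value:.5f}"


def post_editing_row(
    label: str,
    fetch: Callable[[], float],
    locale: str = "en",
) -> str:
    """Build the KPI row; show the fallback text if the fetch fails."""
    try:
        value = fetch()
    except ConnectionError:
        return f"{label}: {UNAVAILABLE[locale]}"
    return f"{label}: {value}"


def failing_fetch() -> float:
    # Simulates a statistics backend that cannot be reached.
    raise ConnectionError("statistics backend not reachable")


print(format_levenshtein(1 / 3))
print(post_editing_row("Ø Post-editing time within 1 workflow step", failing_fetch))
```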

      Attachments

        1. Calculation and Tooltip Matrix.ods (44 kB)
        2. image-2024-09-12-17-12-40-878.png (184 kB)
        3. image-2024-09-12-17-15-13-080.png (141 kB)
        4. image-2024-12-01-17-41-51-707.png (55 kB)
        5. image-2024-12-02-20-29-14-891.png (113 kB)
        6. new.txt (8 kB)
        7. newest.txt (11 kB)
        8. Screenshot 2024-10-17 at 17.56.43.png (200 kB)

            People

              volodymyr@mittagqi.com Volodymyr Kyianenko
              marcmittag Marc Mittag [Administrator]
              Thomas Lauria