translate5 / TRANSLATE-4830

AI integration: Ensure content where LLMs tend to break the syntax is still processed correctly in pre-translation/QA estimation


    • Type: New Feature
    • Resolution: Unresolved
    • AI

      Problem

      For tasks that contain tags and many characters that are significant in the JSONL format (such as parentheses and quotation marks), LLMs tend to break the JSONL syntax.

      Solution ideas

      a) JSON:
      Since we cannot determine what exactly failed, we fall back to single-segment processing, as before.
      Batch logging should be enriched with the task name/ID and the resource name/ID.
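
The fallback described in a) could look roughly like the following sketch. It is not translate5 code: the function names, the `id`/`target` record keys, and the `translate_many`/`translate_one` callables are hypothetical and only illustrate the idea of rejecting a batch as a whole when any JSONL line is unparsable, logging task and resource details, and then processing segment by segment.

```python
import json
import logging

logger = logging.getLogger("llm_batch")  # hypothetical logger name


def parse_jsonl_batch(raw, expected_ids):
    """Return {segment_id: target}, or None if the response is not
    valid JSONL covering exactly the expected segment ids."""
    result = {}
    for line in raw.splitlines():
        if not line.strip():
            continue
        try:
            record = json.loads(line)
            result[record["id"]] = record["target"]
        except (ValueError, KeyError, TypeError):
            # We cannot tell what exactly broke, so reject the whole batch.
            return None
    if set(result) != set(expected_ids):
        return None
    return result


def translate_batch(segments, task, resource, translate_many, translate_one):
    """segments: {segment_id: source}; task/resource: dicts with 'name' and 'id'.
    translate_many and translate_one are assumed callables wrapping the LLM."""
    parsed = parse_jsonl_batch(translate_many(segments), segments.keys())
    if parsed is not None:
        return parsed
    # Enriched batch log entry: task name/id and resource name/id.
    logger.warning(
        "Batch response broke JSONL syntax, falling back to single segments "
        "(task=%s/%s, resource=%s/%s, segments=%d)",
        task["name"], task["id"], resource["name"], resource["id"], len(segments),
    )
    return {seg_id: translate_one(source) for seg_id, source in segments.items()}
```

The whole batch is discarded on the first invalid line because a syntax break can shift or merge neighbouring records, so partially parsed results would not be trustworthy.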

      b) XLIFF
      Here we re-request the failed batch, starting from the broken segment, and specify what went wrong (i.e. which tag was missing or had a broken structure). Currently we do not switch to single-segment processing here, but log an error when the batch fails after the second attempt. If it turns out that this happens too often, we have to enable single-segment processing and add tag repair to the XLIFF handler, or make the handler changeable for single-segment processing.
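
The retry flow in b) can be sketched as follows. This is an illustration under assumptions, not the actual XLIFF handler: `tag_problem`, `translate_xliff_batch`, and the `request_batch(pending, hint)` callable are hypothetical names, the tag check is a simple regex/XML comparison, and `request_batch` is assumed to return exactly one target per pending segment, in order.

```python
import logging
import re
import xml.etree.ElementTree as ET
from collections import Counter

logger = logging.getLogger("llm_batch")  # hypothetical logger name
TAG_RE = re.compile(r"<[^<>]+>")


def tag_problem(source, target):
    """Describe what is wrong with the tags in target, or return None."""
    try:
        ET.fromstring(f"<seg>{target}</seg>")
    except ET.ParseError as err:
        return f"broken tag structure ({err})"
    missing = Counter(TAG_RE.findall(source)) - Counter(TAG_RE.findall(target))
    if missing:
        return "missing tags: " + " ".join(sorted(missing))
    return None


def translate_xliff_batch(segments, request_batch, max_attempts=2):
    """segments: list of (segment_id, source).
    Returns (results, error): error is None on success, otherwise the
    description of the problem that was still present after the last attempt."""
    results, start, hint = {}, 0, None
    for _ in range(max_attempts):
        pending = segments[start:]  # resume at the broken segment
        targets = request_batch(pending, hint)  # hint tells the LLM what went wrong
        failed = False
        for offset, ((seg_id, source), target) in enumerate(zip(pending, targets)):
            problem = tag_problem(source, target)
            if problem:
                start += offset
                hint = f"segment {seg_id}: {problem}"
                failed = True
                break
            results[seg_id] = target
        if not failed:
            return results, None
    # No single-segment fallback here: only log the error.
    logger.error("XLIFF batch failed after %d attempts: %s", max_attempts, hint)
    return results, hint
```

Segments validated before the broken one are kept, so the second request only covers the remainder of the batch.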

            axelbecher Axel Becher
            marcmittag Marc Mittag [Administrator]
            Leon Kiz