Details

    • Bug
    • Resolution: Unresolved
    • LanguageResources

    Description

      Problem

      Currently, the system uses multiple translation resources (DeepL, OpenAI, Google, Microsoft, etc.) to translate content. Each resource can process tags, but each handles them differently and offers different options. In addition, there is no uniform way to configure how tags are processed or repaired across resources.

      Tasks:

      Create a configurable system for tag processing and repair that allows:

      1. Defining the type of tags (HTML or XLIFF) sent to each translation resource.
      2. Configuring whether the tag repair functionality is applied post-translation on the backend.
      3. Aligning the tag repair functionality with the type of tags sent to resources.
      4. Documenting in Confluence how each MT/LLM resource currently handles tags with translate5.

      Implementation ideas:

      1. Tag Repair Functionality:
        • Introduce a tag repair mechanism for XLIFF tags.
        • Evaluate whether to:
          • Develop a single, unified tag repair class to handle both HTML and XLIFF tags.
          • Create separate tag repair classes for HTML and XLIFF tags.
        • Evaluate how the current tag repair for DeepL works. In Marc's understanding, it makes sure that:
          • no tag is missing
          • tags are syntactically correct
          • if a tag has to be inserted or moved, it ends up in a position similar to the one it had in the source segment (e.g., after the same number of blocks of word characters and non-word characters). If that logic already exists, keep it.
      2. Resource-Specific Configuration:
        • Allow configuration for each resource to specify:
          • The type of tags it processes (HTML or XLIFF).
          • Whether tag repair should be enabled or disabled.
        • Ensure tag repair type aligns with the tag type sent to the resource (e.g., if XLIFF tags are sent, only XLIFF tag repair should be applied).
      3. Validation Logic:
        • Implement validation to prevent mismatches between tag type and tag repair functionality. For example:
          • If XLIFF tags are sent, ensure HTML repair is not attempted.
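
      The configuration and validation ideas above could be sketched roughly as follows. This is only an illustration in Python, not existing translate5 code: the names `TagType`, `ResourceTagConfig`, `effective_repair_type` and `validate` are all hypothetical, and the resource name in the usage note is just a label.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class TagType(Enum):
    """Tag formats that can be sent to a translation resource."""
    HTML = "html"
    XLIFF = "xliff"


@dataclass(frozen=True)
class ResourceTagConfig:
    """Hypothetical per-resource tag settings (all names are assumptions)."""
    resource: str                                # label of the resource
    sent_tag_type: TagType                       # tag format sent to the resource
    repair_enabled: bool = True                  # run tag repair after translation?
    repair_tag_type: Optional[TagType] = None    # None means "same as sent_tag_type"

    def effective_repair_type(self) -> Optional[TagType]:
        """Return the repair type that would run, or None if repair is off."""
        if not self.repair_enabled:
            return None
        if self.repair_tag_type is not None:
            return self.repair_tag_type
        return self.sent_tag_type


def validate(config: ResourceTagConfig) -> None:
    """Raise ValueError if the repair type does not match the sent tag type."""
    repair_type = config.effective_repair_type()
    if repair_type is not None and repair_type is not config.sent_tag_type:
        raise ValueError(
            f"{config.resource}: {config.sent_tag_type.value} tags are sent, "
            f"but {repair_type.value} repair is configured"
        )
```

      With this sketch, `validate(ResourceTagConfig("DeepL", TagType.XLIFF))` passes, while a configuration that sends XLIFF tags and sets `repair_tag_type=TagType.HTML` raises a `ValueError`. Leaving `repair_tag_type` unset by default keeps the repair type aligned with the sent tag type without extra configuration.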

            People

              aleksandar Aleksandar Mitrev
              Axel Becher, Leon Kiz