  1. translate5
  2. TRANSLATE-4203

DeepL: Switch tag-handling to be able to send tags as xliff tags and whitespace tags as plain whitespace

Details

    • Improvement
    • Resolution: Unresolved
    • None
    • None
    • LanguageResources

    Description

      Problem

      DeepL often places tags at the wrong position. Because DeepL's tag handling used to be poor (tags were omitted completely or their order was mixed up), we currently send all tags as img tags, including whitespace tags. As a result, DeepL cannot tell what a tag represents or where it belongs.

      Solution

      DeepL's tag handling has probably improved considerably since then.

      We should try the following DeepL parameters

      tag_handling=xml, split_sentences=nonewlines

      and send all content the same way we send it to t5memory (tags as xliff tags and whitespace as plain whitespace).
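
      A minimal sketch of such a request, to make the proposal concrete. It only builds the request for DeepL's /v2/translate endpoint and sends nothing. The function name build_deepl_request, the sample segment and its <g> tag are illustrative assumptions, not translate5 code, and the exact xliff tags translate5 would send may differ.

```python
import json

# DeepL Pro endpoint; free accounts use api-free.deepl.com instead.
DEEPL_TRANSLATE_URL = "https://api.deepl.com/v2/translate"


def build_deepl_request(segments, source_lang, target_lang, auth_key):
    """Build URL, headers and JSON body for a DeepL translate call.

    Hypothetical helper: it only assembles the request so the
    parameters can be inspected before anything is sent.
    """
    headers = {
        "Authorization": f"DeepL-Auth-Key {auth_key}",
        "Content-Type": "application/json",
    }
    payload = {
        "text": list(segments),
        "source_lang": source_lang,
        "target_lang": target_lang,
        # Let DeepL parse the inline tags as XML instead of plain text.
        "tag_handling": "xml",
        # Split on punctuation only, never on newlines.
        "split_sentences": "nonewlines",
    }
    return DEEPL_TRANSLATE_URL, headers, json.dumps(payload)


# Illustrative segment: an xliff-style paired tag and a plain newline.
url, headers, body = build_deepl_request(
    ['Press <g id="1">Start</g>\nto continue.'], "EN", "DE", "my-auth-key"
)
```

      Keeping the payload construction separate from the HTTP call would make it easy to compare the old img-tag payload with the new one in tests.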

      If tests show that DeepL does not handle newlines as it should (keep them and place them at the right position), we would still have to convert newlines to ph tags.
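
      The fallback could look roughly like the sketch below. The function names and the id scheme (nl1, nl2, ...) are assumptions for illustration only. Note that restoring normalizes every line break to "\n".

```python
import re

# Hypothetical placeholder format for protected newlines.
NEWLINE_PH = '<ph id="nl{}"/>'
NEWLINE_PH_RE = re.compile(r'<ph id="nl\d+"\s*/>')


def protect_newlines(text):
    """Replace each line break with a numbered ph tag before sending."""
    counter = 0

    def repl(match):
        nonlocal counter
        counter += 1
        return NEWLINE_PH.format(counter)

    # Match \r\n first so a Windows line break becomes one placeholder.
    return re.sub(r"\r\n|\r|\n", repl, text)


def restore_newlines(text):
    """Turn the newline placeholders in the translation back into "\\n"."""
    return NEWLINE_PH_RE.sub("\n", text)


protected = protect_newlines("First line\nSecond line")
restored = restore_newlines(protected)
```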

      Our tag repair should still run at the end and repair whatever it considers necessary.
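
      To judge in tests how often the repair is still needed, a simple comparison of the tags in source and translation may help. The sketch below only detects missing or added tags. It does not repair anything, ignores tag order, and is not translate5's actual tag repair; tag_differences is a hypothetical name.

```python
import re
from collections import Counter

# Matches anything that looks like a single XML tag.
TAG_RE = re.compile(r"<[^<>]+>")


def tag_differences(source, target):
    """Return (missing, added) tags of the target compared to the source."""
    src = Counter(TAG_RE.findall(source))
    tgt = Counter(TAG_RE.findall(target))
    # Counter subtraction keeps only positive counts.
    missing = list((src - tgt).elements())
    added = list((tgt - src).elements())
    return missing, added


missing, added = tag_differences(
    'Press <g id="1">Start</g> now<x id="2"/>',
    'Jetzt <g id="1">Start</g> drücken',
)
```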

      Chances are high that tasks like the one in the linked TS issue will then turn out much better.

       

      Attachments

        Issue Links

          Activity

                      People

                        aleksandar Aleksandar Mitrev
                        marcmittag Marc Mittag [Administrator]
                        Axel Becher, Leon Kiz
                        Votes: 0
                        Watchers: 1

                        Dates

                          Created:
                          Updated:
