  1. translate5
  2. TRANSLATE-2839

Attach to t5memory service

Details

    • High
    • Structural adjustments for t5memory service.
    • -

    Description

      Adapt translate5 to the differences between the OpenTM2 and the t5memory service connection.

      These are mostly differences regarding tag handling.

      Current XLF OpenTM2 communication

      TMX Import (obsolete with t5memory, already implemented):

      • Keep <ph type="lb"/> tags, since they represent line breaks
      • all other it, ph, bpt and ept tags are sanitized, which means that all attributes except i and x are removed.
      • LanguageCode Fix: map unsupported language codes to supported ones
      • datatype unknown and encoding=utf-16 fix
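
The attribute sanitizing described above can be sketched as follows. This is a minimal Python illustration and not the translate5 implementation (which is PHP); the function name `sanitize_tmx_segment` and the regular expressions are assumptions made for this example, only the tag names and the kept attributes `i` and `x` come from the list above.

```python
import re

# Attribute names that survive sanitizing, as stated in the ticket.
KEPT_ATTRIBUTES = ("i", "x")

# Opening or self-closing it/ph/bpt/ept tags; closing tags such as </bpt> are not matched.
TAG_RE = re.compile(r'<(it|ph|bpt|ept)\b([^<>]*?)(/?)>')
ATTR_RE = re.compile(r'([\w:-]+)\s*=\s*"([^"]*)"')


def sanitize_tmx_segment(segment: str) -> str:
    """Strip all attributes except i and x, keeping <ph type="lb"/> untouched."""

    def clean(match):
        name, raw_attrs, self_closing = match.groups()
        attrs = ATTR_RE.findall(raw_attrs)
        # Line-break placeholders are kept exactly as they are.
        if name == "ph" and ("type", "lb") in attrs:
            return match.group(0)
        kept = "".join(f' {key}="{value}"' for key, value in attrs
                       if key in KEPT_ATTRIBUTES)
        return f"<{name}{kept}{self_closing}>"

    return TAG_RE.sub(clean, segment)
```

The sketch assumes double-quoted attribute values that contain no angle brackets; a real importer would more likely work on a parsed XML tree.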

      TMX Export:

      • revert language mappings → TODO also already implemented for t5memory

      OpenTM2 lookup query:

      1. all internal tags representing whitespace ('hardReturn', 'softReturn', 'macReturn', 'space', 'tab', 'char') are restored to the characters they mask
        → this remains for t5memory
      2. all remaining real internal tags are converted to bx, ex and x tags. bx and ex pairs are converted to g tag pairs if the structure is valid. bx and ex are matched by rid
        → in future only <bpt id="" rid=""> and <ept id="" rid="">, and <ph id=""> tags (the exact tag names are still being clarified)
        → The mapping between the queried tags and the tags in the TM (previously done in OpenTM2) is now done in translate5, similar to what we do in the repetition editor
      3. the replaced original tags are stored in a map, keyed by the generated <x id> and <%s id="%s" rid="%s"/> tags (tag names will change)
      4. all other tag types are removed (should be no other tag types)
      5. OBSOLETE with t5memory: str_replace(['<x id="','<ex id="','<bx id="'], ['<x mid="','<ex mid="','<bx mid="'], $queryString) to improve match rates
      6. OBSOLETE with t5memory: str_replace(["\r\n","\n","\r"], '<ph type="lb"/>', $queryString) converts line breaks back to <ph type="lb"/> to improve match rates
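
Steps 1 to 4 of the query preparation can be sketched roughly like this. It is a Python illustration under assumptions, not the translate5 code: the `InternalTag` class, the token list and `build_query` are hypothetical, and the placeholder names bpt/ept/ph follow the "in future" note above, where the ticket says the final names are still open.

```python
from dataclasses import dataclass

# Whitespace tag names as listed in step 1.
WHITESPACE_TAGS = {"hardReturn", "softReturn", "macReturn", "space", "tab", "char"}
# Assumed placeholder names; the ticket states that the tag names may change.
PLACEHOLDER_NAMES = {"open": "bpt", "close": "ept", "single": "ph"}


@dataclass(frozen=True)
class InternalTag:
    """Hypothetical stand-in for a translate5 internal tag."""
    kind: str       # "open", "close", "single" or a whitespace tag name
    original: str   # original markup, restored after the lookup
    rid: int = 0    # pairs an opening tag with its closing tag
    char: str = ""  # character(s) masked by a whitespace tag


def build_query(tokens):
    """Return (query_string, tag_map) for a list of str and InternalTag tokens."""
    parts, tag_map, next_id = [], {}, 1
    for token in tokens:
        if isinstance(token, str):
            parts.append(token)
            continue
        if token.kind in WHITESPACE_TAGS:
            parts.append(token.char)  # step 1: restore the masked character
            continue
        name = PLACEHOLDER_NAMES.get(token.kind)
        if name is None:
            continue  # step 4: drop any other tag type
        if name == "ph":
            placeholder = f'<ph id="{next_id}"/>'
        else:
            placeholder = f'<{name} id="{next_id}" rid="{token.rid}"/>'
        tag_map[placeholder] = token.original  # step 3: remember the original
        parts.append(placeholder)              # step 2: send the placeholder
        next_id += 1
    return "".join(parts), tag_map
```

The returned map is what the result processing needs later to put the original tags back.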

      OpenTM2 lookup result processing:

      1. OBSOLETE: revert the above match rate improvements (TAG mid to TAG id, <ph lb> to \n)
      2. OBSOLETE: it|ph|ept|bpt tags are replaced by additional x/bx/ex tags, which are removed later, see below
        probable reason why content tags are removed: they cannot be reapplied with the 2d map
      3. OBSOLETE: removes all non x/bx/ex/g tags
      4. replaces all non-usable whitespace back with translate5 internal tags (counterpart of step 1 in the query code)
      5. the received x/bx/ex/g tags are replaced back with the original tags, using the map stored in query step 3.
      6. all tags not replaced back by the map are turned into "additional" tags, which are shown in the GUI but removed when the segment is taken over.
        For pre-translation, additional tags are removed too.
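
Steps 5 and 6 of the result processing can be sketched as follows. Again this is a Python illustration under assumptions rather than the actual implementation: `restore_tags`, `strip_additional` and the `[[additional:...]]` marker are hypothetical, the placeholder names bpt/ept/ph are the not yet final names from the query section, and the whitespace handling of step 4 is left out.

```python
import re

# Placeholders as they come back from the TM (assumed self-closing bpt/ept/ph tags).
PLACEHOLDER_RE = re.compile(r'<(?:bpt|ept|ph)\b[^<>]*/>')
# Hypothetical marker for "additional" tags that have no entry in the map.
ADDITIONAL_RE = re.compile(r'\[\[additional:.*?\]\]')


def restore_tags(result: str, tag_map: dict) -> str:
    """Step 5: put original tags back; step 6: mark unmapped tags as additional."""

    def swap(match):
        placeholder = match.group(0)
        if placeholder in tag_map:
            return tag_map[placeholder]
        return f"[[additional:{placeholder}]]"

    return PLACEHOLDER_RE.sub(swap, result)


def strip_additional(segment: str) -> str:
    """Remove additional tags, as done on take-over and for pre-translation."""
    return ADDITIONAL_RE.sub("", segment)
```

The lookup by exact placeholder string only works if the service returns the placeholders unchanged; keying the map by tag name and id would be a more tolerant variant.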

            People

              tlauria Thomas Lauria
              tlauria Thomas Lauria