  1. translate5
  2. TRANSLATE-1481

Improve tag handling with matches coming from OpenTM2

Details

    • High
    • TMX files are modified when they are imported into OpenTM2: the internal tags are changed (the type attribute is removed and tags with content are converted to single placeholder tags) to improve matching when searching for segments.

    Description

      problem

      If a user imports a TMX into OpenTM2 whose segments contain <it> / <ph> tags, these segments currently cannot be used. If such segments are found, an email with the following subject and a lot of debug data is sent:

      "OpenTM2 result contains <it> or <ph> tags! Segment not shown as match result!"

       

      background

      Until now we have handled only contentless tags coming from OpenTM2, namely the (x|ex|bx|g|/g) tags.
      The other tags (it and ph) are not used because they encapsulate other content, which is why we currently ignore them. The list of tags is in fact incomplete, since bpt and ept should also be ignored.
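
      The distinction described above can be illustrated with a small Python sketch. It is not translate5 code: the function name unsupported_tags and the two sets are hypothetical, and only the tag names come from this issue. It reports which content-carrying tags occur in a match result, which is the condition that currently triggers the email.

```python
import re

# Tag names are taken from the issue text; the grouping into two sets
# is an assumption made for this illustration only.
CONTENTLESS = {"x", "ex", "bx", "g"}        # handled today
WITH_CONTENT = {"it", "ph", "bpt", "ept"}   # encapsulate other content

# Matches the name of any opening, closing or self-closing tag.
TAG_NAME = re.compile(r"<\s*/?\s*([A-Za-z][\w.-]*)")


def unsupported_tags(segment: str) -> set[str]:
    """Return the names of content-carrying tags found in a segment."""
    names = {m.group(1).lower() for m in TAG_NAME.finditer(segment)}
    return names & WITH_CONTENT
```

      A segment for which unsupported_tags returns a non-empty set is one that would currently not be shown as a match result.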

       

      solution

      • Investigate how OpenTM2 deals with such tags with content
      • When parsing the OpenTM2 content, tags such as ph should each be handled as a single tag, so that they can be replaced with the real tags from the source content in translate5.

      The investigation resulted in the following todos:

      • Clean up tags with content on TMX import (<ph>foo</ph> to <ph/>)
      • Clean up on TMX import by removing unnecessary attributes, keeping only the i and x attributes (if available) for tag matching
      • On memory lookup, also use the mid instead of the id in bx and ex tags, see TRANSLATE-2353
      • Replace remaining <ph> tags (and other content tags) with empty <ph/> tags (for TMs imported before the above clean up was introduced)
        The application then handles these additional tags correctly (for example, showing them in the GUI as additional tags, which are then not usable)
        This will probably work only with <x> tags instead of <ph>; this must be tested
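
      The first two todos can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the translate5 implementation: the names clean_segment, KEEP_ATTRS and CONTENT_TAGS are hypothetical, the set of content tags (ph, it, bpt, ept) is inferred from the background section, and the regular expressions only cover double-quoted attribute values that do not contain ">".

```python
import re

KEEP_ATTRS = ("i", "x")          # attributes kept for tag matching
CONTENT_TAGS = "ph|it|bpt|ept"   # assumed list of tags with content

# Paired form such as <ph type="fmt" x="1">foo</ph>. The lookbehind
# skips tags that are already self-closing (<ph x="1"/>).
PAIRED = re.compile(
    rf"<({CONTENT_TAGS})\b([^>]*?)(?<!/)>(.*?)</\1\s*>", re.DOTALL
)
ATTR = re.compile(r'([\w:.-]+)\s*=\s*"([^"]*)"')


def _kept_attrs(raw: str) -> str:
    """Rebuild the attribute string, keeping only the i and x attributes."""
    kept = [f'{name}="{value}"' for name, value in ATTR.findall(raw)
            if name in KEEP_ATTRS]
    return " " + " ".join(kept) if kept else ""


def clean_segment(segment: str) -> str:
    """Convert paired content tags into empty placeholder tags."""
    return PAIRED.sub(
        lambda m: f"<{m.group(1)}{_kept_attrs(m.group(2))}/>", segment
    )
```

      For example, clean_segment('Click <ph type="fmt" x="1">&lt;b&gt;</ph>here') drops both the tag content and the type attribute. A real import would more likely operate on the parsed TMX document than on raw strings; the regular expressions are used here only to keep the sketch short.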

            People

              tlauria Thomas Lauria