  1. translate5
  2. TRANSLATE-925

Support XLIFF 1.2 as import format


Details

    • New Feature
    • Resolution: Fixed

    Description

      This is part of the Okapi integration task.

      XLIFF 1.2 should be integrated into translate5 as an import format. This means support for:

      • all internal tags according to the XLIFF 1.2 spec
      • the sub-tag structure (endless recursion of sub-tags may be excluded if it is not easy to support)
      • segmenting inside a target or source tag based on mrk tags
      • alt-trans tags (which also contain source and target tags)
        • currently there is a bug which imports the source and target of the alt-trans instead of the real source / target
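
      The alt-trans bug can be illustrated with a minimal sketch. It is written in Python with the standard library rather than in translate5's own code, and the function name real_source_and_target and the sample document are hypothetical. The point is only that source and target must be looked up as direct children of the trans-unit, because a descendant search also reaches the copies inside alt-trans.

```python
import xml.etree.ElementTree as ET

# XLIFF 1.2 namespace, bound to the arbitrary prefix "x" for lookups.
NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}

# Hypothetical sample file, not taken from the issue.
SAMPLE = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="demo.txt" source-language="en" target-language="de" datatype="plaintext">
    <body>
      <trans-unit id="1">
        <source>Real source</source>
        <target>Real target</target>
        <alt-trans>
          <source>Alt source</source>
          <target>Alt target</target>
        </alt-trans>
      </trans-unit>
    </body>
  </file>
</xliff>"""


def real_source_and_target(trans_unit):
    """Return the text of the trans-unit's own source and target.

    The path "x:source" matches direct children only, so the source and
    target nested inside <alt-trans> are never picked up.
    """
    source = trans_unit.find("x:source", NS)
    target = trans_unit.find("x:target", NS)
    return (
        source.text if source is not None else None,
        target.text if target is not None else None,
    )


root = ET.fromstring(SAMPLE)
unit = root.find(".//x:trans-unit", NS)
pair = real_source_and_target(unit)
```

      A descendant query such as ".//x:source" on the same trans-unit would return both the real source and the alt-trans source, which is the kind of mix-up described above.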

      Special feature: Please check whether the XLIFF 2+ namespace is given. XLIFF 2+ is not supported by this import mechanism, so in this case an error must be thrown.
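
      A minimal sketch of such a namespace check, in Python with the standard library. The names UnsupportedXliffVersion and load_xliff_12 are hypothetical, and matching on the prefix "urn:oasis:names:tc:xliff:document:2" is an assumption about how to recognise XLIFF 2+ documents, not a description of the translate5 implementation.

```python
import xml.etree.ElementTree as ET

# Assumption: every XLIFF 2.x core namespace starts with this prefix.
XLIFF_2_NS_PREFIX = "urn:oasis:names:tc:xliff:document:2"


class UnsupportedXliffVersion(ValueError):
    """Hypothetical error type raised for XLIFF 2+ input."""


def root_namespace(root):
    """Extract the namespace URI from an ElementTree tag like '{uri}xliff'."""
    if root.tag.startswith("{"):
        return root.tag[1:].split("}", 1)[0]
    return ""


def load_xliff_12(xml_text):
    """Parse the document and refuse it if the root uses an XLIFF 2+ namespace."""
    root = ET.fromstring(xml_text)
    namespace = root_namespace(root)
    if namespace.startswith(XLIFF_2_NS_PREFIX):
        raise UnsupportedXliffVersion(
            "XLIFF 2+ is not supported by this import (namespace %s)" % namespace
        )
    return root
```

      Checking the root element only keeps the test cheap, since the namespace is declared on the xliff element before any trans-unit is read.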

      Regarding the internal tags <ph> <bpt> <ept> <it>:

      Different vendors use the above tags differently to encode the tags of the original content.
      For the visualization in translate5 we can either turn the whole <ph> tag into an internal tag (including the surrounding tag), or just the content of the tag (excluding the surrounding tag).

      Across Example

              <bpt id="3" rid="2" ax:s-id="1" ax:s-bold="False" />
              What’s New in Across <ph id="1" ax:element-id="1">v6.3</ph>?
              <ept id="4" rid="2" />
      

      For the bpt and ept tags it makes sense to keep the tags in our internal tag representation; for the <ph> both variants would be possible.

      XLIFF 1.2 Representation Guide for HTML

      <p title='Information about Mount Hood'>This is Mount Hood: <img src="mthood.jpg" alt="Mount Hood with its snow-covered top"></p>
      is represented as:

      <ph id="a_2"><sub ctype="x-html-p-title">Information about Mount Hood</sub></ph>This is Mount Hood:
      <ph id="a_3" ctype="x-html-img" xhtml:src="mthood.jpg">
       <sub ctype="x-html-img-alt">Mount Hood with its snow-covered top</sub>
      </ph>
      

      But another example shows:
      The title says "<span dir="rtl">פעילות הבינאום, W3C</span>" in Hebrew.

      The title says "<g ctype='phrase' id='b1' html:dir='rtl'><bpt id="1">&lt;span dir="rtl"></bpt>text...<ept id="1">&lt;/span></ept></g>" in Hebrew.
      

      For the first example all tags should be shown in the internal tag; for the second example just the content of the bpt and ept tags would be sufficient and more readable.

      OpenTM2 XLIFF

      <ph>&lt;img src=&quot;link.png&quot; alt=&quot;</ph>Link<ph>&quot;/&gt;</ph>
      

      Here it would also be useful to show just the tag content and hide the <ph> tags in the frontend.

      Okapi XLIFF

      As far as I have seen, Okapi uses only <g> and <x> tags, so the problem is not applicable here.

      Conclusion

      • Across XLIFF: include surrounding tags
      • OpenTM2 XLIFF: exclude surrounding tags
      • Native XLIFF: include surrounding tags if a ctype is present or the tag contains only child nodes (<sub>) and no direct text
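
      The three rules of the conclusion can be sketched as a small decision function. This is an illustration in Python under stated assumptions: the function names and the vendor parameter (with the values "across", "opentm2" and "native") are hypothetical, and how the vendor of a file is detected is not covered here.

```python
import xml.etree.ElementTree as ET


def has_direct_text(element):
    """True if the element holds non-whitespace text outside its child nodes."""
    if (element.text or "").strip():
        return True
    # Text that follows a child element is stored in that child's tail.
    return any((child.tail or "").strip() for child in element)


def include_surrounding_tag(element, vendor="native"):
    """Decide whether the <ph>/<bpt>/<ept>/<it> tag itself is shown.

    vendor is a hypothetical switch: "across", "opentm2" or "native".
    """
    if vendor == "across":
        return True
    if vendor == "opentm2":
        return False
    # Native XLIFF: a ctype is present ...
    if element.get("ctype") is not None:
        return True
    # ... or the tag contains only child nodes and no direct text.
    return len(element) > 0 and not has_direct_text(element)


# Snippets modelled on the examples in this description.
mount_hood = ET.fromstring(
    '<ph id="a_2"><sub ctype="x-html-p-title">Information about Mount Hood</sub></ph>'
)
span_open = ET.fromstring('<bpt id="1">&lt;span dir="rtl"></bpt>')
opentm2_ph = ET.fromstring('<ph>&lt;img src=&quot;link.png&quot; alt=&quot;</ph>')
across_ph = ET.fromstring('<ph id="1">v6.3</ph>')
```

      With these inputs the Mount Hood <ph> keeps its surrounding tag because it contains only a <sub> child, while the <bpt> from the Hebrew example is reduced to its content because it has direct text and no ctype.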

      Regarding segmentation of <mrk mtype="seg"> segments and <sub> elements

      The import stops with an error if:

      • A <mrk mtype="seg"> tag has no mid attribute to match the mrk tags
      • The parent tag (<ph>, <bpt>, <ept>, <it>) of a <sub> tag has no id attribute
      • The <mrk> tags of source and target cannot be aligned (the mrk tag count does not match, or tags identified by their mid are missing)
        In all of the above cases the sub-segments cannot be aligned properly, so the import is stopped.
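
      The three stop conditions can be sketched as a validation step that runs before the sub-segments are aligned. The sketch is Python with the standard library, all names in it are hypothetical, and it assumes source and target elements without namespace prefixes to keep the example short.

```python
import xml.etree.ElementTree as ET

SUB_PARENTS = ("ph", "bpt", "ept", "it")


class SegmentAlignmentError(ValueError):
    """Hypothetical error type that stops the import."""


def seg_mids(container):
    """Collect the mid of every <mrk mtype="seg"> in document order."""
    mids = []
    for mrk in container.iter("mrk"):
        if mrk.get("mtype") != "seg":
            continue
        mid = mrk.get("mid")
        if mid is None:
            raise SegmentAlignmentError('<mrk mtype="seg"> without mid attribute')
        mids.append(mid)
    return mids


def check_sub_parents(container):
    """Every tag that directly contains a <sub> needs an id."""
    for element in container.iter():
        if element.tag in SUB_PARENTS and element.find("sub") is not None:
            if element.get("id") is None:
                raise SegmentAlignmentError(
                    "<%s> with <sub> child has no id attribute" % element.tag
                )


def validate_unit(source, target):
    """Raise if the sub-segments of source and target cannot be aligned."""
    for container in (source, target):
        check_sub_parents(container)
    source_mids = seg_mids(source)
    target_mids = seg_mids(target)
    # Comparing sorted lists catches both a differing count and missing mids.
    if sorted(source_mids) != sorted(target_mids):
        raise SegmentAlignmentError(
            "mrk tags differ: source %s, target %s" % (source_mids, target_mids)
        )
    return source_mids
```

      Running all checks up front means an import either aligns every sub-segment or fails as a whole, which matches the requirement that the import is stopped rather than continued with partially matched segments.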

            People

              tlauria Thomas Lauria
              marcmittag Marc Mittag [Administrator]