  1. translate5
  2. TRANSLATE-3300

Terms that contain xml special chars are not tagged

      If a term contains one of the XML special characters > < & ' ", the term is not tagged by TermTagger.

      For example, a term such as "Checks & Balances" would not be tagged.

      Please check whether we can solve this without modifying TermTagger: inspect what is sent to TermTagger in terms of both the terms and the segment content, and whether both can be modified in a way that makes the tagging succeed.

      Please then check whether the same is true for terms that contain a + character.

      Research result:

      This is how 'Cat & Dog' actually looks inside the TBX that is exported and fed to TermTagger:

      Cat&#xA0;&amp;&#xA0;Dog

      The spaces in the original TBX are non-breaking spaces: their character code is 160 (U+00A0).
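
      The finding above can be reproduced with a short check. This is only an illustrative sketch in Python, not translate5 code: the helper names are hypothetical, and `html.unescape` is used as a stand-in for a real XML parser to resolve the character references.

```python
import html

# Term text as it appears in the exported TBX (copied from this issue).
raw_tbx_text = "Cat&#xA0;&amp;&#xA0;Dog"


def decode_term(tbx_text: str) -> str:
    """Resolve character references to get the literal term text."""
    return html.unescape(tbx_text)


def unusual_whitespace(text: str) -> list[tuple[int, int]]:
    """Return (position, code point) for every whitespace character
    that is not a plain space (U+0020)."""
    return [(i, ord(ch)) for i, ch in enumerate(text) if ch.isspace() and ch != " "]


decoded = decode_term(raw_tbx_text)
print(unusual_whitespace(decoded))  # [(3, 160), (5, 160)]
```

      Both separators around the ampersand report code 160, so the term does not contain any plain space.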

      Conclusion: When we create the TBX for TermTagger, we should replace all whitespace in terms with a single plain space.
      Please implement this as part of this issue.
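
      A minimal sketch of the proposed normalization, written in Python for illustration only. The function names are hypothetical and this is not the actual translate5 implementation. It assumes that a run of several whitespace characters should collapse into one space, which the conclusion above does not spell out.

```python
import re
from xml.sax.saxutils import escape

# In Python 3, \s on str patterns matches Unicode whitespace,
# including the non-breaking space U+00A0 (code 160).
_WHITESPACE_RUN = re.compile(r"\s+")


def normalize_term(term: str) -> str:
    """Replace every run of whitespace with one plain space (assumption:
    runs are collapsed rather than replaced character by character)."""
    return _WHITESPACE_RUN.sub(" ", term)


def term_to_tbx_text(term: str) -> str:
    """Normalize whitespace, then escape &, < and > for use as XML text."""
    return escape(normalize_term(term))


print(term_to_tbx_text("Cat\u00a0&\u00a0Dog"))  # Cat &amp; Dog
```

      Normalizing before escaping keeps the two concerns separate: the whitespace rule works on the literal term text, and the XML escaping is applied once at the end.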

            pavelperminov Pavel Perminov
            marcmittag Marc Mittag [Administrator]
            Axel Becher