Prevent huge segments from being sent to the termTagger

       

      Problem

      The time the termTagger needs to process a segment grows exponentially with the length of the segment.

      A very long segment can therefore bring the termTagger down by blocking the whole termTagger instance.

      Solution

      Segments longer than a configurable value are not sent to the termTagger.

      On import, this can be handled with the segment status that the termtag import already uses. For tagging from the GUI, we simply check the length; if the segment is too long, we send a corresponding message to the GUI stating that the segment cannot be tagged.
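
      The check described above can be sketched as follows. This is only an illustration, not the translate5 implementation: it is written in Python, and the names DEFAULT_MAX_WORD_COUNT, count_words, is_taggable, partition_for_import and tag_segment_for_gui are hypothetical. It also assumes that the length is measured in whitespace-separated words after stripping tag-like markup, and that the tagger is any callable taking and returning a string.

```python
import re
from typing import Callable, Dict, List, Tuple

# Hypothetical default; the value 150 comes from the word-count analysis in this issue.
DEFAULT_MAX_WORD_COUNT = 150

# Assumption: internal tags look like <...> and must not count as words.
_TAG_PATTERN = re.compile(r"<[^>]+>")


def count_words(segment: str) -> int:
    """Count whitespace-separated words after replacing tag-like markup with spaces."""
    plain = _TAG_PATTERN.sub(" ", segment)
    return len(plain.split())


def is_taggable(segment: str, max_word_count: int = DEFAULT_MAX_WORD_COUNT) -> bool:
    """Return True if the segment is short enough to be sent to the tagger."""
    return count_words(segment) <= max_word_count


def partition_for_import(
    segments: List[str], max_word_count: int = DEFAULT_MAX_WORD_COUNT
) -> Tuple[List[str], List[str]]:
    """Split segments into (to be tagged, oversized); the caller would mark
    the oversized ones with a status that excludes them from tagging."""
    to_tag = [s for s in segments if is_taggable(s, max_word_count)]
    oversized = [s for s in segments if not is_taggable(s, max_word_count)]
    return to_tag, oversized


def tag_segment_for_gui(
    segment: str,
    tagger: Callable[[str], str],
    max_word_count: int = DEFAULT_MAX_WORD_COUNT,
) -> Dict[str, object]:
    """Tag a single segment on request from the GUI, or return a message
    explaining why it was not tagged. The tagger is never called for
    oversized segments."""
    if not is_taggable(segment, max_word_count):
        return {
            "tagged": False,
            "segment": segment,
            "message": "Segment exceeds %d words and cannot be tagged." % max_word_count,
        }
    return {"tagged": True, "segment": tagger(segment), "message": ""}
```

      The important property is that the length check runs before any request reaches the tagger, so an oversized segment can never block the instance.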

      Segment length analysis

      For some of our customers, the distribution of word counts shows that segments normally have no more than 150 words. See the attached txt file.

      Anything above that can be considered abnormal, so a word count of 150 is assumed as the default limit above which segments are not tagged.
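
      To judge a candidate limit against such a distribution, one can compute the share of segments that stay within it. The sketch below assumes the distribution is available as a mapping from word count to number of segments; the function name share_within_limit is hypothetical, and the format of the attached file is not assumed here.

```python
from typing import Dict


def share_within_limit(histogram: Dict[int, int], max_word_count: int = 150) -> float:
    """Return the fraction of segments whose word count is at most max_word_count.

    histogram maps a word count to the number of segments with that word count.
    An empty histogram yields 1.0, since no segment exceeds the limit.
    """
    total = sum(histogram.values())
    if total == 0:
        return 1.0
    within = sum(n for words, n in histogram.items() if words <= max_word_count)
    return within / total
```

      A limit is reasonable if this share is close to 1, meaning only a small number of unusually long segments are excluded from tagging.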

          [TRANSLATE-2022] Prevent huge segments from being sent to the termTagger

          Marc Mittag [Administrator] made changes -
          Start Date [Gantt] New: 23/Apr/24 5:00 AM
          Marc Mittag [Administrator] made changes -
          Workflow Original: MittagQI Workflow [ 29417 ] New: MittagQI Workflow with Peer [ 35928 ]
          Marc Mittag [Administrator] made changes -
          Status Original: Test Ready [ 10005 ] New: Done [ 10000 ]
          Thomas Lauria made changes -
          Fix Version/s New: translate5 - 3.4.1 [ 11400 ]
          Thomas Lauria made changes -
          Resolution New: Fixed [ 1 ]
          Status Original: In Progress [ 3 ] New: Test Ready [ 10005 ]
          Thomas Lauria made changes -
          Status Original: Selected for Development [ 10100 ] New: In Progress [ 3 ]
          Thomas Lauria made changes -
          Attachment New: segment-wordcount-count.txt [ 18618 ]
          Thomas Lauria made changes -
          Status Original: Open [ 10002 ] New: Selected for Development [ 10100 ]
          Thomas Lauria made changes -
          Assignee Original: Ines-Paul Baumann [ Ines-Paul ] New: Thomas Lauria [ tlauria ]
          Thomas Lauria made changes -
          Link New: This issue causes TS-350 [ TS-350 ]
          Thomas Lauria created issue -

            Assignee: Thomas Lauria (tlauria)
            Reporter: Thomas Lauria (tlauria)