  1. translate5
  2. TRANSLATE-4845

Make t5n tags smaller to save segment space on t5memory side

    • Type: Improvement
    • Resolution: Unresolved
    • None
    • None
    • Content Protection
    • High
    • t5n tags on the t5memory side become smaller, so that longer segments can be saved into the memory

      Problem

      t5memory restricts a segment to 2048 characters per language part.

      While protecting content, translate5 wraps some of it in a dedicated "t5n" tag whose "r" property is currently long:

      <t5:n id="1" r="09ewt1GMNtC1jNXUAFPa2poa9oq60Ym6VY66UYeXHN52eM/hlsPTDs+J1dQHAA==" n="1"/> 

      This inflates the segment size when many small pieces of content (such as integers) are protected. As a result, such a segment is rejected as problematic on import.
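
To illustrate the size effect, here is a minimal Python sketch that compares the tag from the example above with a variant carrying a 3-character "r" value. The helper names (t5n_tag, max_tags) are hypothetical, and the sketch assumes that every character of the serialized tag counts toward the 2048-character limit; how t5memory actually measures tags is not stated in this issue.

```python
SEGMENT_LIMIT = 2048  # per-language-part limit quoted in this issue


def t5n_tag(tag_id: int, r: str, n: str) -> str:
    """Render a t5:n placeholder in the form shown in this issue (hypothetical helper)."""
    return f'<t5:n id="{tag_id}" r="{r}" n="{n}"/>'


# The "r" value from the example above and a short 3-character replacement.
LONG_R = "09ewt1GMNtC1jNXUAFPa2poa9oq60Ym6VY66UYeXHN52eM/hlsPTDs+J1dQHAA=="
SHORT_R = "aB1"

long_tag = t5n_tag(1, LONG_R, "1")
short_tag = t5n_tag(1, SHORT_R, "1")


def max_tags(tag: str, limit: int = SEGMENT_LIMIT) -> int:
    """Upper bound on how many such tags fit into the limit, ignoring all other text."""
    return limit // len(tag)


if __name__ == "__main__":
    print("tag length, long vs. short:", len(long_tag), len(short_tag))
    print("max tags per segment, long vs. short:", max_tags(long_tag), max_tags(short_tag))
```

The saving per tag equals the difference in length between the two "r" values, so it grows linearly with the number of protected items in a segment.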

      Solution

      Replace the regex hash in the "r" property with a generated unique string:

      <t5:n id="1" r="aB1" n="1"/> 

      Three characters drawn from the lower- and upper-case English alphabet plus the digits 0..9 should be enough to cover a realistic number of rules provided by a customer:
      62 possible characters to the power of the length 3 -> 62^3 = 238328 variations

      This unique string will be saved in the DB as part of the recognition table and will be used in the code in place of the current "r" content.
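
The key generation described above could be sketched as follows. This is only an illustration in Python, not the translate5 implementation: the function names are hypothetical, and an in-memory set stands in for the unique-key lookup against the recognition table.

```python
import secrets
import string

# Lower/upper case English letters plus digits 0..9: 62 characters.
ALPHABET = string.ascii_letters + string.digits
KEY_LENGTH = 3


def keyspace_size(length: int = KEY_LENGTH) -> int:
    """Number of distinct keys of the given length: 62 ** length."""
    return len(ALPHABET) ** length


def generate_rule_key(existing: set[str], length: int = KEY_LENGTH) -> str:
    """Return a random key not yet in `existing` and record it there.

    `existing` is an assumption standing in for the keys already
    stored in the recognition table.
    """
    if len(existing) >= keyspace_size(length):
        raise RuntimeError("key space exhausted")
    while True:
        candidate = "".join(secrets.choice(ALPHABET) for _ in range(length))
        if candidate not in existing:
            existing.add(candidate)
            return candidate


if __name__ == "__main__":
    used: set[str] = set()
    print(keyspace_size())          # 62 ** 3
    print(generate_rule_key(used))  # random, e.g. a value shaped like "aB1"
```

Retrying on collision is cheap as long as the number of rules stays far below the size of the key space; a unique constraint on the DB column would be the natural safeguard against concurrent inserts.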

            sanya@mittagqi.com Sanya Mikhliaiev
            Thomas Lauria