Details
-
Bug
-
Resolution: Fixed
Description
problem
The minimum and maximum segment length is defined in the trans-unit. Currently translate5 uses only the content inside the mrk tags for length calculation, so the length of content outside or between the mrk tags (mostly whitespace) is missing.
This must be changed so that the length of these additional characters is included as well.
solution
The length of other content (outside/between mrk mtype seg tags) is also saved for length calculation.
Assume the following <target>, where "bef", "betweenX" and "aft" stand for whitespace (content other than whitespace outside of mrks gives an error).
<target>bef<mrk>text 1</mrk>between1<mrk>text 2</mrk>between2<mrk>text 3</mrk>aft</target>
The length of "bef" is saved as "additionalUnitLength" on each segment; the length of the whitespace after each closing mrk is saved on that mrk's segment as "additionalMrkLength". That means the lengths of "betweenX" and of the final "aft" are each saved on the preceding segment defined by the mrk.
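The split described above can be sketched as follows. This is a minimal illustration in Python, not translate5's own code: the function name other_content_lengths and the returned structure are hypothetical, and the regular expression only handles flat, non-nested mrk tags.

```python
import re

# Matches a flat (non-nested) mrk element and captures its text content.
MRK = re.compile(r"<mrk\b[^>]*>(.*?)</mrk>", re.DOTALL)

def other_content_lengths(target_inner):
    """Return (additionalUnitLength, segments) for the inner content of a <target>.

    Hypothetical helper for illustration only.
    """
    matches = list(MRK.finditer(target_inner))
    if not matches:
        # Assumption: without any mrk there is no per-mrk length to store.
        return len(target_inner), []
    # Content before the first mrk belongs to the unit, not to a single mrk.
    additional_unit_length = matches[0].start()
    segments = []
    for i, m in enumerate(matches):
        # The gap after a closing mrk runs to the next mrk or to the end.
        gap_end = matches[i + 1].start() if i + 1 < len(matches) else len(target_inner)
        segments.append({
            "text": m.group(1),
            "additionalMrkLength": gap_end - m.end(),
        })
    return additional_unit_length, segments

unit_len, segs = other_content_lengths(
    "bef<mrk>text 1</mrk>between1<mrk>text 2</mrk>between2<mrk>text 3</mrk>aft"
)
```

Here the placeholders are counted with their literal lengths, so "bef" and "aft" contribute 3 characters each and every "betweenX" contributes 8; in a real file these would be the whitespace lengths.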
Each additionalMrkLength is added automatically to the segment's content length in siblingData. The additionalUnitLength, in contrast, must be added only once per length calculation (wherever siblingData is used), because it is independent of the MRKs. That means: the length stored in siblingData is the segment text length plus the additionalMrkLength.
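As a rough sketch of that calculation, assuming siblingData can be modelled as a mapping from segment id to a record with a "length" field (this shape and the function name unit_length are assumptions for illustration, not the actual translate5 data structure):

```python
def unit_length(sibling_data, additional_unit_length):
    """Total length of a trans-unit for min/max length checks.

    Each stored length already contains the segment text length plus its
    additionalMrkLength, so additionalUnitLength is added exactly once.
    """
    return additional_unit_length + sum(
        entry["length"] for entry in sibling_data.values()
    )

# Example values for the <target> above, counting the placeholders literally:
# "text N" has 6 characters, "betweenX" has 8, "bef" and "aft" have 3 each.
sibling_data = {
    1: {"length": 6 + 8},  # "text 1" + "between1"
    2: {"length": 6 + 8},  # "text 2" + "between2"
    3: {"length": 6 + 3},  # "text 3" + "aft"
}
total = unit_length(sibling_data, additional_unit_length=3)
```

With these values the total is 40, which equals the length of the complete inner content of the <target>. Adding additionalUnitLength per segment instead of once would count "bef" three times.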
preserveWhitespace influences the otherContent:
if preserveWhitespace is true (whether in the trans-unit or in the application), the length of otherContent is always the real length, since the otherContent is taken over completely.
if preserveWhitespace is false in the config, the default behaviour applies: whitespace in other content is removed and ignored in length counting, except for the content between two mrk tags (<mrk>content</mrk> HERE <mrk>next content</mrk>). There the otherContent between the mrks is condensed to one whitespace (or, if there is another tag in between, to one whitespace before that tag and one after).
Before: <mrk>content</mrk>   <mrk>next content</mrk>
After: <mrk>content</mrk> <mrk>next content</mrk>
Before: <mrk>content</mrk>   <x>   <mrk>next content</mrk>
After: <mrk>content</mrk> <x> <mrk>next content</mrk>
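The condensing rule for content between two mrk tags with preserveWhitespace set to false could look roughly like this. The function condense_between is a hypothetical name, and the sketch assumes the input contains only whitespace and tags, since other content outside of mrks is an error:

```python
import re

def condense_between(other_content):
    """Condense each whitespace run between two mrk tags to a single space.

    Tags found in between are kept as they are, so whitespace before and
    after such a tag is reduced to one space each. Illustrative sketch only.
    """
    # Split into tags (kept through the capture group) and whitespace runs.
    parts = re.split(r"(<[^>]+>)", other_content)
    condensed = []
    for part in parts:
        if part.startswith("<"):
            condensed.append(part)   # keep the tag unchanged
        elif part:
            condensed.append(" ")    # any whitespace run becomes one space
    return "".join(condensed)

only_whitespace = condense_between("  \n ")
with_tag = condense_between("  <x>\t ")
```

The first call yields a single space and the second yields " <x> ". Whitespace before the first and after the last mrk is not handled here, since in that mode it is removed entirely and not counted.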
Source and target MRK padding if the MRKs differ between source and target:
if there is no target content (translation task), MRK padding is no problem, since there is no target to compare against and no missing MRKs have to be added.
if there is target content: just use the target otherContent, since padded target MRKs cannot be edited and are not added as new MRKs in the target. So no otherContent needs to be considered for them. This will change when merging and splitting are implemented!