  1. translate5
  2. TRANSLATE-3436

Integrate GPT-4 with translate5 as translation engine

Details

    • Critical
    • Updating to this version requires PHP 8.1.23.
    • New private plugin "OpenAI" to use OpenAI models as language resources, plus base functionality to fine-tune these models

    Description

      The goal is to get translations from GPT-4 in the same way we get them from other MT services.

      Unlike with other MT resources, however, we should find out how additional information can be passed to GPT-4, what the best way of doing so is, and whether this leads to a better translation.

      The following sources should first be evaluated and then, if possible (which it should be), integrated to provide GPT-4 with additional information. Whether each of these sources is used should be configurable.

      • Send as many segments in one request as possible and tell GPT-4 that the structure of the segments needs to be respected, but that the segments together form a context. This will save costs, because GPT is paid per request and not per character. It will probably also lead to a better translation.
      • For each segment where we have a 100% match or better, provide it to GPT and tell it that we already have a translation for this segment, which it should use as inspiration for the other segments.
      • For each segment, send GPT the best X fuzzy matches in a structured way. We need to experiment here to see what makes sense. My guess is that it makes sense to send all fuzzy matches down to a certain percentage, e.g. 70%, and in addition to send the best 3 fuzzy matches if fewer than 3 exist above 69%.
      • For each segment, we mark the terminology found in the source and tell GPT how it should be translated (please see the TextShuttle plug-in code, which also does this).
      • With lower priority than the previous items: find out whether we can already send images as context information to GPT-4. If so, implement sending images as context.
      • With lower priority than the previous items: find out whether we can provide GPT-4 with the complete available TMs, or at least a certain number of matches, before the translation starts, in order to train it.
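
      The fuzzy-match rule guessed at above could be sketched roughly as follows. This is only one possible reading of that rule, not the plugin's code: `FuzzyMatch`, `select_fuzzy_matches`, and the `threshold` and `minimum` parameters are hypothetical names, and the sketch assumes that 100% matches have already been handled separately.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FuzzyMatch:
    # Hypothetical container for one TM hit; not a translate5 class.
    source: str
    target: str
    match_rate: int  # percent, as reported by the TM


def select_fuzzy_matches(matches, threshold=70, minimum=3):
    """Keep every match at or above `threshold`; if that yields fewer
    than `minimum` matches, fall back to the best `minimum` matches."""
    ranked = sorted(matches, key=lambda m: m.match_rate, reverse=True)
    selected = [m for m in ranked if m.match_rate >= threshold]
    if len(selected) < minimum:
        # The matches above the threshold are a prefix of `ranked`,
        # so this only tops the selection up with the next-best hits.
        selected = ranked[:minimum]
    return selected
```

      Both values would need to be configurable, since the ticket explicitly leaves open which numbers make sense.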

      More thoughts:

      • send combined segments in a discernible structure to reduce costs
      • send terminology markup and define the terms
      • request alternatives for words
      • add a frontend GUI to request changes to segment phrasing, e.g. regarding gender
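
      One way to send combined segments "in a discernible structure" is to give each segment an id, send everything as JSON, and check that the reply contains exactly the same ids. The following is a minimal sketch under that assumption; `build_batch_prompt` and `parse_batch_reply` are hypothetical helpers, the prompt wording is illustrative only, and the actual call to the model is left out.

```python
import json


def build_batch_prompt(segments, terms, source_lang, target_lang):
    """segments: dict of segment id -> source text.
    terms: dict of source term -> required target term."""
    # Only send terminology that actually occurs in this batch.
    used_terms = {
        src: tgt for src, tgt in terms.items()
        if any(src.lower() in text.lower() for text in segments.values())
    }
    payload = {
        "segments": [{"id": sid, "source": text} for sid, text in segments.items()],
        "terminology": [{"source": s, "target": t} for s, t in used_terms.items()],
    }
    instructions = (
        f"Translate each segment from {source_lang} to {target_lang}. "
        "The segments belong to one document, so use them as context for "
        "each other, but keep them separate: return only a JSON object that "
        "maps every segment id to its translation. "
        "Use the listed terminology where it applies."
    )
    return instructions + "\n\n" + json.dumps(payload, ensure_ascii=False)


def parse_batch_reply(reply_text, segments):
    """Reject replies that merged, dropped, or invented segments."""
    translations = json.loads(reply_text)
    expected = {str(sid) for sid in segments}
    if not isinstance(translations, dict) or set(translations) != expected:
        raise ValueError("reply does not match the requested segment ids")
    return translations
```

      Validating the ids on the way back matters because a model may merge or reorder segments; a batch that fails the check could be retried segment by segment.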

            People

              axelbecher Axel Becher
              marcmittag Marc Mittag [Administrator]
              Aleksandar Mitrev