Details
- Type: Improvement
- Resolution: Unresolved
- Priority: Critical
Description
Requirement
Llama should be integrated into translate5 as a trainable language resource, completely analogous to the current OpenAI and Azure GPT integrations.
- Persistent training of models should be possible with prompts and examples.
- It should be possible to use the new training library features developed in https://jira.translate5.net/browse/TRANSLATE-4369.
- Context matters: we should send as many segments as possible to Llama at once. Before sending a translation request to Llama, we should count the tokens to make sure the request contains as many segments as possible.
- We should send terminology to Llama in the same way we do for GPT, and be prepared to implement https://jira.translate5.net/browse/TRANSLATE-4401 and its sub-issues soon, for both GPT and Llama.
- We should evaluate whether a RAG-based approach to enriching GPT with terminology (and soon with matches from the TM) would work better than what we are currently doing.
- We should keep in mind that the training interfaces will probably also be needed for quality estimation.
Llama model to be used until Llama 3.4 is released: LLaMA 3.3:70B (https://www.llama.com/llama-downloads/)
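The token-counting requirement above can be sketched as a greedy packing step: fill each request with consecutive segments until the token budget is reached, then start a new batch. This is a minimal illustration only; `count_tokens` and `MAX_PROMPT_TOKENS` are placeholders, and a real integration would use the model's actual tokenizer and the context window of the deployed LLaMA 3.3:70B model.

```python
# Hypothetical sketch of greedy segment batching under a token budget.
# Assumptions: count_tokens() is a stand-in (whitespace split) for a real
# tokenizer; MAX_PROMPT_TOKENS is an invented budget, not the model's limit.

MAX_PROMPT_TOKENS = 4096  # placeholder budget reserved for source segments


def count_tokens(text: str) -> int:
    # Placeholder: approximate token count by whitespace splitting.
    return len(text.split())


def batch_segments(segments: list[str],
                   budget: int = MAX_PROMPT_TOKENS) -> list[list[str]]:
    """Greedily pack consecutive segments into batches that fit the budget."""
    batches: list[list[str]] = []
    current: list[str] = []
    used = 0
    for seg in segments:
        cost = count_tokens(seg)
        # Flush the current batch if adding this segment would exceed the budget.
        if current and used + cost > budget:
            batches.append(current)
            current, used = [], 0
        current.append(seg)
        used += cost
    if current:
        batches.append(current)
    return batches
```

A segment larger than the budget still gets a batch of its own here; whether to split or reject such segments would be a design decision for the actual integration.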
Integrate Llama 3 as language resource analogous to GPT