Sub-task
Resolution: Unresolved
High
OpenAI: Data-Model for the Custom Instruction Management
OpenAI GPT Internal Prompts: Data Model
The Internal Prompt Management is based on the internal instructions in use, which will be collected in a separate JSON file so that descriptions can be added. The system messages used for e.g. translation are built from smaller instructions, each symbolized by a {key}, which are combined into bigger instructions/sentences that have keys so they can be identified in the completion.
Therefore the instructions.json will consist of a list of key-value pairs with a description. One peculiarity of the keys is that they can contain quotes, which are then added to the sentence/formulation as triple quotes.
The editing frontend must include the possibility to test the modified instructions against a set of given segments with terminology and context data. When a new instruction-set is added, the base is always a copy of the original instructions. When a customized instruction-set is loaded for editing, any instructions not found in "instructions.json" are discarded, and any new instructions not present in the customized instruction-set are automatically added with their default value. This ensures that customized instructions can easily be upgraded when new instructions are added to the codebase; only "instructions.json" must always be kept in sync with the codebase.
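The loading rule described above (drop unknown keys, fill in missing ones with defaults) could look roughly like the following. This is only a sketch in Python for illustration, not the actual implementation; the function name `merge_instruction_set` is hypothetical, and it assumes "instructions.json" is a list of objects with "key" and "instruction" fields while the customized set is a flat key-to-text object.

```python
import json


def merge_instruction_set(defaults_json: str, customized_json: str) -> dict[str, str]:
    """Bring a customized instruction-set in sync with instructions.json.

    Hypothetical helper: the name and signature are assumptions.
    """
    # instructions.json: list of {"key", "instruction", "description"} objects
    defaults = {
        entry["key"]: entry["instruction"] for entry in json.loads(defaults_json)
    }
    # customized instruction-set: flat {"key": "instruction text"} object
    customized = json.loads(customized_json)

    merged = {}
    for key, default_text in defaults.items():
        # Keep the customized text if present, otherwise fall back to the default.
        merged[key] = customized.get(key, default_text)
    # Keys that exist only in the customized set are never copied, so they are dropped.
    return merged
```

Iterating over the defaults rather than over the customized set means the result always contains exactly the keys of "instructions.json", which is what keeps the customized sets upgradable.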
Example for instructions from the code:
'fromTo' => 'from {from} to {to}',
'fullFromTo' => 'from {from} as source language to {to} as target language',
'termpair' => 'use the specific translation delimited by {*delimiter*} {*termpair*}',
The data in "instructions.json":
[
{
"key":"fullFromTo",
"instruction":"from {from} as source language to {to} as target language",
"description":"This is the precise/full instruction that tells the GPT model which languages to expect for source and translate into"
},
{
"key":"termpair",
"instruction":"use the specific translation delimited by {*delimiter*} {*termpair*}",
"description":"This is the instruction that tells the GPT model about a single termpair to be used when translating a segment/text. {*delimiter*} will be resolved to \"triple asterisk\", {*termpair*} will be resolved to \"***sourceterm*** = ***target term***\""
}
]
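To illustrate how the placeholders in the descriptions above might be resolved, here is a minimal Python sketch. It is an assumption about the mechanism, not the code used in the codebase: the `resolve` function, the `DELIMITER_NAMES` table, the spoken name "triple quotes", and passing a term pair as a tuple are all hypothetical.

```python
import re

# Hypothetical mapping from the marker character inside a placeholder
# to the spoken name of the resulting delimiter.
DELIMITER_NAMES = {"*": "triple asterisk", '"': "triple quotes"}

# Matches {name}, {*name*} and {"name"}; group 1 is the optional marker.
PLACEHOLDER = re.compile(r'\{([*"]?)(\w+)\1\}')


def resolve(template: str, values: dict) -> str:
    """Replace placeholders in an instruction with concrete values."""

    def substitute(match: re.Match) -> str:
        marker, name = match.group(1), match.group(2)
        if marker and name == "delimiter":
            # {*delimiter*} names the delimiter instead of inserting a value.
            return DELIMITER_NAMES[marker]
        if name not in values:
            # Leave unknown placeholders untouched.
            return match.group(0)
        value = values[name]
        if not marker:
            return str(value)
        # Marked placeholders wrap each part in the tripled marker.
        fence = marker * 3
        parts = value if isinstance(value, (tuple, list)) else (value,)
        return " = ".join(f"{fence}{part}{fence}" for part in parts)

    return PLACEHOLDER.sub(substitute, template)
```

With this sketch, `resolve("from {from} to {to}", {"from": "German", "to": "English"})` yields `from German to English`, and the "termpair" instruction with the pair `("sourceterm", "target term")` ends in `triple asterisk ***sourceterm*** = ***target term***`.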
The JSON for the customized instruction-set:
{
"fromTo": "from {from} to {to} but customized",
"fullFromTo": "from {from} as source language to {to} as target language but customized",
"termpair": "use the given termpair delimited by {*delimiter*} {*termpair*}"
}
New DB Datamodel
LEK_openai_instructionset
columns: ( id | name | comment | json | created | lastChange )
json: {
"fromTo": "from {from} to {to} but customized",
...
}
LEK_openai_instructionset_assoc
Holds the 1:1 association between the instruction-set and a language-resource.
columns: ( id | instructionSetId | languageResourceId )
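The two tables above could be prototyped as follows. This is a sketch only: it uses Python's built-in SQLite as a stand-in for the real database, and all column types, defaults and constraints are assumptions, since the ticket lists only the column names. The 1:1 association is read here as "unique on both sides".

```python
import sqlite3

# Column types, defaults and constraints are assumptions; only the
# table and column names come from the data model above.
SCHEMA = """
CREATE TABLE LEK_openai_instructionset (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    comment TEXT,
    json TEXT NOT NULL,
    created TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    lastChange TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE LEK_openai_instructionset_assoc (
    id INTEGER PRIMARY KEY,
    -- UNIQUE on both columns models the 1:1 association
    instructionSetId INTEGER NOT NULL UNIQUE
        REFERENCES LEK_openai_instructionset(id) ON DELETE CASCADE,
    languageResourceId INTEGER NOT NULL UNIQUE
);
"""


def create_schema(conn: sqlite3.Connection) -> None:
    """Create both tables on the given connection."""
    conn.executescript(SCHEMA)
```

With these constraints, assigning a second language-resource to the same instruction-set (or a second instruction-set to the same language-resource) is rejected by the database rather than by application code.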