Details
-
Sub-task
-
Resolution: Unresolved
-
High
-
OpenAI: Data-Model for the Custom Instruction Management
-
Description
OpenAI GPT Internal Prompts: Data Model
The basis for the Internal Prompt Management is the set of internal instructions currently in use, which will be collected in a separate JSON file so that descriptions can be added. The system messages used for e.g. translation are built from smaller instructions, each symbolized by a {key}. These are combined into larger instructions/sentences, which have keys so that they can be identified in the completion.
The instructions.json will therefore consist of a list of key-value pairs, each with a description. One special feature of the keys is that they can contain quotes, which are then added to the sentence/formulation as triple quotes.
The editing frontend must include the possibility to test the modified instructions against a set of given segments with terminology and context data. When a new instruction-set is added, the base is always a copy of the original instructions. When a customized instruction-set is loaded for editing, any instructions that are not found in "instructions.json" are dismissed, and any new instructions not present in the customized instruction-set are automatically added with their default value. This ensures that customized instructions can be upgraded easily when new instructions are added to the codebase. The only requirement is that "instructions.json" is always kept in sync with the codebase.
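The load-time behaviour described above can be sketched as follows. This is only an illustration in Python, not the actual implementation: the function names `load_defaults`, `new_instruction_set` and `sync_instruction_set` are hypothetical, and the JSON excerpt is a shortened stand-in for "instructions.json".

```python
import json

# Hypothetical, shortened stand-in for "instructions.json".
DEFAULTS_JSON = """
[
  {"key": "fromTo",
   "instruction": "from {from} to {to}",
   "description": "Short language pair"},
  {"key": "termpair",
   "instruction": "use the specific translation delimited by {*delimiter*} {*termpair*}",
   "description": "Single termpair"}
]
"""


def load_defaults(raw):
    """Map every key in instructions.json to its default instruction text."""
    return {entry["key"]: entry["instruction"] for entry in json.loads(raw)}


def new_instruction_set(defaults):
    """A new instruction-set always starts as a copy of the originals."""
    return dict(defaults)


def sync_instruction_set(customized, defaults):
    """Drop keys unknown to instructions.json, add missing keys with defaults."""
    return {key: customized.get(key, default) for key, default in defaults.items()}


defaults = load_defaults(DEFAULTS_JSON)
stored = {
    "fromTo": "from {from} to {to} but customized",
    "obsolete": "no longer present in the codebase",
}
print(sync_instruction_set(stored, defaults))
```

Iterating over the defaults rather than over the stored set means that both rules (dismiss unknown keys, add missing keys) fall out of a single pass.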
Example of instructions from the code:
'fromTo' => 'from {from} to {to}',
'fullFromTo' => 'from {from} as source language to {to} as target language',
'termpair' => 'use the specific translation delimited by {*delimiter*} {*termpair*}',
The data in "instructions.json":
[
  {
    "key": "fullFromTo",
    "instruction": "from {from} as source language to {to} as target language",
    "description": "This is the precise/full instruction that tells the GPT model which languages to expect for source and translate into"
  },
  {
    "key": "termpair",
    "instruction": "use the specific translation delimited by {*delimiter*} {*termpair*}",
    "description": "This is the instruction that tells the GPT model about a single termpair to be used when translating a segment/text. {*delimiter*} will be resolved to \"triple asterisk\", {*termpair*} will be resolved to \"***sourceterm*** = ***target term***\""
  }
]
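How the placeholders might be resolved can be illustrated with a small Python sketch. It follows the resolution given in the "termpair" description above. The helper names `render`, `triple_quote` and `termpair_values` are hypothetical, and leaving unknown placeholders untouched is an assumption, not documented behaviour.

```python
import re

# Matches {key} as well as {*key*}; the asterisks are kept as part of the key.
PLACEHOLDER = re.compile(r"\{(\*?\w+\*?)\}")


def triple_quote(text, char="*"):
    """Wrap text in three quote characters on each side."""
    return f"{char * 3}{text}{char * 3}"


def render(instruction, values):
    """Replace every known placeholder in a single pass.

    Assumption: placeholders without a value are left as they are.
    """
    def substitute(match):
        return values.get(match.group(1), match.group(0))

    return PLACEHOLDER.sub(substitute, instruction)


def termpair_values(source_term, target_term):
    """Values for the 'termpair' instruction, per its description."""
    return {
        "*delimiter*": "triple asterisk",
        "*termpair*": f"{triple_quote(source_term)} = {triple_quote(target_term)}",
    }


print(render("from {from} to {to}", {"from": "German", "to": "English"}))
print(render(
    "use the specific translation delimited by {*delimiter*} {*termpair*}",
    termpair_values("Haus", "house"),
))
```

Resolving all placeholders in one regular-expression pass avoids re-scanning inserted values, so a term that happens to contain braces is not expanded a second time.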
The JSON for the customized instruction-set:
{
  "fromTo": "from {from} to {to} but customized",
  "fullFromTo": "from {from} as source language to {to} as target language but customized",
  "termpair": "use the given termpair delimited by {*delimiter*} {*termpair*}"
}
New DB Datamodel
LEK_openai_instructionset
columns: ( id | name | comment | json | created | lastChange )
json: { "fromTo": "from {from} to {to} but customized", ... }
LEK_openai_instructionset_assoc
Holds the 1:1 association between the instruction-set and a language-resource.
columns: ( id | instructionSetId | languageResourceId )
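The two tables can be tried out with a small sketch using Python's built-in sqlite3 module. Only the table and column names come from this ticket. The column types, the defaults, the foreign key and the UNIQUE constraints (one way to read "1:1") are assumptions, as are the helper names, and the production database may well use a different SQL dialect.

```python
import json
import sqlite3

# Assumed column types and constraints; only the names are from the ticket.
SCHEMA = """
CREATE TABLE LEK_openai_instructionset (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    comment TEXT,
    json TEXT NOT NULL,
    created TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    lastChange TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE LEK_openai_instructionset_assoc (
    id INTEGER PRIMARY KEY,
    instructionSetId INTEGER NOT NULL UNIQUE
        REFERENCES LEK_openai_instructionset(id),
    languageResourceId INTEGER NOT NULL UNIQUE
);
"""


def create_schema(conn):
    conn.executescript(SCHEMA)


def add_instruction_set(conn, name, instructions, comment=""):
    """Store a customized instruction-set as a JSON object; return its id."""
    cursor = conn.execute(
        "INSERT INTO LEK_openai_instructionset (name, comment, json) VALUES (?, ?, ?)",
        (name, comment, json.dumps(instructions)),
    )
    return cursor.lastrowid


def assign(conn, instruction_set_id, language_resource_id):
    """Associate an instruction-set with a language-resource."""
    conn.execute(
        "INSERT INTO LEK_openai_instructionset_assoc"
        " (instructionSetId, languageResourceId) VALUES (?, ?)",
        (instruction_set_id, language_resource_id),
    )


conn = sqlite3.connect(":memory:")
create_schema(conn)
set_id = add_instruction_set(
    conn, "Customized", {"fromTo": "from {from} to {to} but customized"}
)
assign(conn, set_id, 7)
```

With both association columns declared UNIQUE, assigning the same instruction-set or the same language-resource a second time raises `sqlite3.IntegrityError`.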