Details

    • Sub-task
    • Resolution: Unresolved
    • openai
    • High
    • OpenAI: Data-Model for the Custom Instruction Management

    Description

      OpenAI GPT Internal Prompts: Data Model

      The basis for the Internal Prompt Management is the set of internal instructions used in the code, which will be collected in a separate JSON file so that descriptions can be added. The system messages used, e.g. for translation, are built from smaller instructions, each symbolized by a {key}, which are combined into larger instructions/sentences; these in turn have keys by which they can be identified in the completion.

      The instructions.json will therefore consist of a list of key-value pairs, each with a description. One special feature of the keys used in the instructions (e.g. {*termpair*}) is that they can contain quote characters, which are then added to the sentence/formulation as triple quotes.

      The editing frontend must include the possibility to test the modified instructions against a set of given segments with terminology and context data. When a new instruction set is added, its basis is always a copy of the original instructions. When a customized instruction set is loaded for editing, any instructions not found in "instructions.json" are discarded, and any new instructions not present in the customized instruction set are automatically added with their default value. This ensures that customized instructions can easily be upgraded when new instructions are added to the codebase; only "instructions.json" must always be kept in sync with the codebase.
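
      The load-time reconciliation described above can be sketched as follows. This is a minimal illustration in Python, not the project's implementation: the function names sync_instruction_set and new_instruction_set are hypothetical, and the sketch assumes "instructions.json" has the list-of-objects shape shown in this ticket.

```python
import json


def sync_instruction_set(defaults_json: str, customized: dict[str, str]) -> dict[str, str]:
    """Align a customized instruction set with the current instructions.json.

    Hypothetical helper: keys unknown to instructions.json are dropped,
    keys missing from the customized set receive their default value.
    """
    # Map each known key to its default instruction text.
    defaults = {entry["key"]: entry["instruction"] for entry in json.loads(defaults_json)}
    # Iterating over the defaults (not the customized set) drops obsolete keys
    # and fills in keys the customized set does not contain yet.
    return {key: customized.get(key, default) for key, default in defaults.items()}


def new_instruction_set(defaults_json: str) -> dict[str, str]:
    """A new instruction set starts as a plain copy of the defaults."""
    return sync_instruction_set(defaults_json, {})
```

      Because the defaults drive the iteration, the result always contains exactly the keys of "instructions.json", which is what makes a customized set upgradable when the codebase gains new instructions.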

       

      Example for instructions from the code:

          'fromTo' => 'from {from} to {to}',
          'fullFromTo' => 'from {from} as source language to {to} as target language',
          'termpair' => 'use the specific translation delimited by {*delimiter*} {*termpair*}',
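
      To illustrate how plain placeholders such as {from} and {to} might be filled, here is a small Python sketch. The helper name fill and the language names passed in are assumptions for illustration only; quoted placeholders such as {*termpair*} are deliberately left untouched here.

```python
import re

# Matches plain placeholders such as {from} or {to}; quoted ones like
# {*termpair*} are intentionally not matched by this pattern.
PLACEHOLDER = re.compile(r"\{(\w+)\}")


def fill(instruction: str, values: dict[str, str]) -> str:
    """Replace each {name} with values[name]; unknown names stay untouched."""
    return PLACEHOLDER.sub(lambda m: values.get(m.group(1), m.group(0)), instruction)


instructions = {
    "fromTo": "from {from} to {to}",
    "fullFromTo": "from {from} as source language to {to} as target language",
}

print(fill(instructions["fromTo"], {"from": "German", "to": "English"}))
# prints: from German to English
```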

       

      The data in "instructions.json":

      [
         {
            "key":"fullFromTo",
            "instruction":"from {from} as source language to {to} as target language",
            "description":"This is the precise/full instruction that tells the GPT model which languages to expect for source and translate into"
         },
         {
            "key":"termpair",
            "instruction":"use the specific translation delimited by {*delimiter*} {*termpair*}",
            "description":"This is the instruction that tells the GPT model about a single termpair to be used when translating a segment/text. {*delimiter*} will be resolved to \"triple asterisk\",  {*termpair*} will be resolved to \"***sourceterm*** = ***target term***\""
         }
      ]
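
      The resolution of the quoted placeholders described for "termpair" can be sketched like this. It is only one possible reading of the description above: the function resolve_termpair and the QUOTE_NAMES lookup are hypothetical, and the sketch assumes the quote character inside the braces is the one that gets tripled around each term.

```python
# Hypothetical lookup from a quote character to its spoken name.
QUOTE_NAMES = {"*": "asterisk"}


def resolve_termpair(instruction: str, source_term: str, target_term: str, quote: str = "*") -> str:
    """Resolve {*delimiter*} and {*termpair*} for a single term pair."""
    triple = quote * 3
    replacements = {
        # e.g. {*delimiter*} -> "triple asterisk"
        "{" + quote + "delimiter" + quote + "}": f"triple {QUOTE_NAMES[quote]}",
        # e.g. {*termpair*} -> "***sourceterm*** = ***target term***"
        "{" + quote + "termpair" + quote + "}": (
            f"{triple}{source_term}{triple} = {triple}{target_term}{triple}"
        ),
    }
    for placeholder, value in replacements.items():
        instruction = instruction.replace(placeholder, value)
    return instruction


print(resolve_termpair(
    "use the specific translation delimited by {*delimiter*} {*termpair*}",
    "sourceterm",
    "target term",
))
# prints: use the specific translation delimited by triple asterisk ***sourceterm*** = ***target term***
```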

      The JSON for the customized instruction set:

      {
          "fromTo": "from {from} to {to} but customized",
          "fullFromTo": "from {from} as source language to {to} as target language but customized",
          "termpair": "use the given termpair delimited by {*delimiter*} {*termpair*}"
      }

      New DB Datamodel

      LEK_openai_instructionset

          columns: ( id | name | comment | json | created | lastChange )
          json: {
              "fromTo": "from {from} to {to} but customized",
              ...
          }
      

      LEK_openai_instructionset_assoc

      Holds the 1:1 association between the instruction-set and a language-resource.
       

         columns: ( id | instructionSetId | languageResourceId )
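
      As a rough sketch of the two tables, the following Python snippet creates them in an in-memory SQLite database. Only the table and column names come from this ticket; the column types, the defaults and the use of SQLite are assumptions, and the 1:1 association is read strictly here as a UNIQUE constraint on both foreign-key columns.

```python
import sqlite3

# Column types and constraints are assumptions; only the names come from the ticket.
SCHEMA = """
CREATE TABLE LEK_openai_instructionset (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    comment TEXT,
    json TEXT NOT NULL,
    created TEXT DEFAULT CURRENT_TIMESTAMP,
    lastChange TEXT DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE LEK_openai_instructionset_assoc (
    id INTEGER PRIMARY KEY,
    -- UNIQUE on both columns enforces a strict 1:1 association.
    instructionSetId INTEGER NOT NULL UNIQUE REFERENCES LEK_openai_instructionset(id),
    languageResourceId INTEGER NOT NULL UNIQUE
);
"""


def create_schema() -> sqlite3.Connection:
    """Create both tables in a throwaway in-memory database."""
    connection = sqlite3.connect(":memory:")
    connection.executescript(SCHEMA)
    return connection
```

      If one instruction set should be shareable between several language resources, the UNIQUE constraint on instructionSetId would have to be dropped; the ticket text itself only states a 1:1 association.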
      

          People

            pavelperminov Pavel Perminov
            axelbecher Axel Becher
            Axel Becher