On the other hand, ChatGPT and other large models with broad general knowledge (the open-source model BLOOM, for example) can be used for a variety of tasks without any task-specific training. The user simply provides instructions, along with the data the instructions should be applied to, in natural language. The user can also provide examples. The model can "understand" all of this and produce a useful response (hopefully without any hallucinated pseudo-information).

A hypothetical model that can understand and answer in natural language, has broad general knowledge, and is capable of (simulated) reasoning, all at least on the level of a human being of average intelligence and education, is called an AGI (Artificial General Intelligence). A user could converse with an AGI just as with a human chat partner on the other end.

The AI models available outside of laboratories are not there yet, but it feels like they are getting closer rapidly. And I'm not quite sure what happens inside the labs.

But how do users interact with these systems to perform a specific task? Let's look at two examples.


== Using chat-based LLMs
=== Example 1: Text generation
Let's say you want ChatGPT to assist you in writing a text that promotes a new product on social media. You know the product, want to point out some unique features, and have an idea of the target group and the language to address them in. So you would probably start by giving the chat an outline of what this is about. Next, you might give it a few examples of how the text should be written, perhaps taken from previous, similar campaigns. You would then add further instructions about the expected length and other details, plus more facts to be used as the basis for the text.

So if we take this apart, we have three categories:
* Instructions
* Examples
* Facts

Finally, you ask ChatGPT to generate the text, iteratively give it further instructions to improve the result, and once it is good enough, use it for your intended purpose.
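The three categories can be thought of as parts of a single prompt. The following sketch shows one hypothetical way to assemble them; the product details, example texts, and function name are invented for illustration and are not part of any actual ChatGPT interface.

```python
# Hypothetical sketch: combining instructions, examples, and facts
# into one prompt string. All content here is invented.

def build_prompt(instructions, examples, facts):
    """Combine the three categories into a single prompt."""
    parts = ["INSTRUCTIONS:"] + instructions
    parts += ["", "EXAMPLES:"] + examples
    parts += ["", "FACTS:"] + facts
    parts += ["", "Now write the social media post."]
    return "\n".join(parts)

prompt = build_prompt(
    instructions=["Write a short social media post.",
                  "Address young, tech-savvy readers informally.",
                  "Keep it under 60 words."],
    examples=["Meet the X200: small box, big sound. #audio"],
    facts=["Product: hypothetical X300 speaker",
           "Unique feature: 20-hour battery life"],
)
print(prompt.splitlines()[0])  # → INSTRUCTIONS:
```

In a chat session you would deliver these pieces as successive messages rather than one block, but the information the model receives is the same.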

=== Example 2: Structuring unstructured text
We ran a number of tests using LLMs for standardized text operations. One example is to provide a confusing text like this one:
{{{#!div style="background: #F8C471; margin: 5px; margin-left: 40%; margin-right: 30px; padding: 5px; border-radius: 10px;"
We have a new variant of the MBZ1xxx series, the MBZ1403. The main difference is its axial fan which performs a lot better than the radial one we still use in the MBZ1000 models. To servcie the device, you must first open the casing. To do so, you must reliese the two locks on the sides and slide the casing upward. Oh BTW, need less to say to unplug the power cord before you open the box. The new variant operates at a voltage of 220 volts and consumes 1400 watts of power. To replace the fuse (10 amps, not 8 amps like for MBZ1000), first remove the protective cap. Make sure to use the correct rating. The device has an overload protection that triggers the fuse before it overheats. After replacing the fuse, close the casing (but don't forget to replace the protective cap before that).
}}}

And get this result (to be further refined in subsequent processing steps):
{{{#!div style="background: #EEEEEE; margin: 5px; margin-right: 40%; margin-left: 30px; padding: 5px; border-radius: 10px;"
{{{
OUTPUT
***task***
* Release the two locks on the sides.
* Slide the cover upward.
* Unplug the power cord to reduce the risk of electric shock.
WARNING Danger of death by electric shock.
* Remove the protective cap.
* Remove the fuse.
* Insert a 10A fuse.
* Replace the protective cap.
* Close the cover.
***reference***
Device version  Voltage  Power consumption  Fuse rating  Fan type
MBZ1000         220V     1000W              8A           Radial
MBZ1403         220V     1400W              10A          Axial
***concept***
The axial fan of the MBZ1403 performs much better than the radial fan used in the MBZ1000 model.
***other***
***metadata***
CAT TechDoc
AUDIENCE Tech
TITLE MBZ1403 Device Servicing Instructions
***log***
***end***
}}}
}}}

Yes, this is actually the result we received from the ChatGPT 3.5 API with a single call.


== How does this work?
The secret is that we do not pass only the actual text "We have a new variant..." to the LLM; we prefix it with what we call a **briefing**. LLM APIs let you pass not just a single message, but an entire dialog between human and AI, which the model then continues with its response. This is how chat works: the latest chat history is sent to the model again, together with the new user message.
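This history mechanism can be sketched as follows. The message format shown is the role-tagged list used by OpenAI-style chat APIs; the dialog content is invented, and no API call is made here — the sketch only builds the request payload.

```python
# Sketch of the message format used by OpenAI-style chat APIs:
# the full dialog history is resent with every request, and the
# model continues it. (Payload only - no API call is made here.)

history = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize this text: ..."},
    {"role": "assistant", "content": "Here is the summary: ..."},
]

def next_request(history, user_message):
    """Build the message list for the next turn: old history + new input."""
    return history + [{"role": "user", "content": user_message}]

messages = next_request(history, "Now shorten it to one sentence.")
# The model's reply would then be appended as a new "assistant" message,
# and the grown list is sent again on the following turn.
```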

We use this mechanism to prefix the user input with a preconfigured dialog containing very condensed training data, consisting of instructions, examples, and facts. In it, we also define an input and output syntax. This prefix is stored as a file and automatically sent to the model whenever a user passes data and chooses the specific operation.
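A minimal sketch of this briefing mechanism, assuming the same role-tagged message format: the stored dialog is prepended to the user's text before the request is sent. The briefing content and helper name below are invented for illustration.

```python
# Sketch of the briefing mechanism: a stored, preconfigured dialog
# (instructions, examples, facts, and the I/O syntax) is prepended
# to the user's input. The briefing content here is invented; in
# practice it would be loaded from a file.

briefing = [
    {"role": "system", "content": "You transform service texts into a fixed OUTPUT syntax."},
    {"role": "user", "content": "INPUT: example source text ..."},
    {"role": "assistant", "content": "OUTPUT\n***task***\n..."},
]

def briefed_request(briefing, user_text):
    """Prefix the user's text with the preconfigured briefing dialog."""
    return briefing + [{"role": "user", "content": "INPUT: " + user_text}]

messages = briefed_request(briefing, "We have a new variant of the MBZ1xxx series ...")
```

From the model's point of view this looks like an ongoing chat in which the task has already been demonstrated, so it simply continues the pattern.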

This is only a small fraction of the data normally used for training, but as long as the operation is not too complex in its details, it works quite well and fits conveniently into the 4k-token context limit of many models. It works because of the much better text understanding and reasoning capabilities of today's models: a few examples are sufficient for them to pick up the relevant patterns.
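Whether a briefing plus user input fits such a context limit can be estimated with a rough heuristic. The sketch below uses the common rule of thumb of about four characters per token for English text; this is an approximation, not an exact count, and the reserved-reply budget is an invented example value.

```python
# Rough sanity check that a briefing plus user input fits a 4k-token
# context window. The 4-characters-per-token ratio is a common rule
# of thumb for English text, not an exact tokenizer count.

def rough_token_count(text, chars_per_token=4):
    return len(text) // chars_per_token

def fits_context(briefing_text, user_text, limit=4096, reserved_for_reply=1024):
    """True if briefing + input leave enough room for the model's reply."""
    used = rough_token_count(briefing_text) + rough_token_count(user_text)
    return used <= limit - reserved_for_reply

print(fits_context("x" * 8000, "y" * 2000))  # → True (about 2500 of 3072 tokens used)
```

For production use, an exact tokenizer for the target model should replace the heuristic.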

To make this easily usable in all types of applications, we have written a web-based API that provides such a system with briefing capabilities.

[[Image(Public/ImageContainer:djAI.jpg, align=center, width=50%, margin-bottom=20, link=)]]
