

Chat for one

A white paper about experiences with ChatGPT

Introduction

Over the last few months, I have spent some time chatting with ChatGPT. The first chats were tentative and probing, meant to get an idea of the AI's capabilities in understanding and generating text. As the AI successfully performed more and more tasks, I became more daring and posed tasks for which I didn't really expect a meaningful response. In quite a few cases, I was surprised. In some cases, I still don't have a solid theory of how ChatGPT did it.

Naturally, many of the tests focused on the question of how AI could help technical writers, translators, project managers, service technicians and other stakeholders in the field of Technical Documentation. But likewise, many other tests touched other areas, like music, games, cooking or retrieval of information of any kind. In software development, it proved extremely helpful, because it can not only describe how to proceed, but also generate code that tries to solve the problem - often, it does so quite intelligently. As for music, it was helpful in songwriting by finding words that rhymed with a given ending, or by finding the chord for any given set of notes - so far, I expected it could do such things. However, I also got it to write new verses based on existing ones, tuning in to the mood of the song, and it was even aware of rhythm in relation to the lyrics. Amazing!

Finally, I asked it to lead a simple pen and paper role playing session. At first it seemed stubborn and only explained what a pen&paper RPG is, but then it picked up the task. It seemed to have some trouble remembering how many gold pieces my character owned and exactly what equipment he had in a given situation, but the adventure itself has been quite consistent so far, and the interactions with other characters are quite realistic: they are consistent, and each character has a recognizable personality.

All of the dialogs are based on the standard chat interface of ChatGPT that's available to everyone. To iron out the glitches in the performance of the AI models, the API-based integration offers more possibilities. Knowledge can be injected in a structured form instead of being spelled out verbally in a chat. While problems remain to be solved, including the question of what happens to data when you type it into the chat or upload it through the API, the powers of ChatGPT's NLP model are fascinating and impressive in many different ways.

I encourage you to run your own tests, and remember, this is only the beginning…

Structuring confusing text

Hello, I have a text here, it's a guide, but it's poorly structured. Can you please improve it?

Yes, I can definitely help you with that. Please share the text with me and I'll see what I can do.

Here it is:

"Remove the cover plate after loosening all 4 screws on the sides. Then pull the locking bar forward and fold the insulation layer back. Before starting this work, unplug the device. The device operates on 230 volts and has a power consumption of 1000 watts. To replace the fuse, you must first remove the protective cap. Once the protective cap has been removed, you can replace the fuse. Pull out the fuse, insert the new fuse, put the protective cap back on, and then fold the insulation layer back down. Finally, tighten the screws on the cover plate. Place the cover plate on before doing so."

Here's what you should do:

  • Segment it into simple and clear steps.
  • Order the steps correctly.
  • Extract technical data to a table.

Here's an improved version of the guide:

Step-by-Step Guide:

Step 1: Unplug the device
Before beginning any work, unplug the device from the power outlet to avoid electrical hazards.

Step 2: Remove the cover plate
Loosen all 4 screws on the sides of the device.
Remove the cover plate.

Step 3: Pull the locking bar and open the insulation layer
Pull the locking bar forward.
Fold the insulation layer back.

Step 4: Replace the fuse
Remove the protective cap.
Take out the old fuse.
Insert the new fuse.
Put the protective cap back on.

Step 5: Close the insulation layer
Fold the insulation layer back down.

Step 6: Secure the cover plate
Place the cover plate back on.
Tighten the screws on the sides of the device.

Technical Data:

Voltage      Power Consumption
230 volts    1000 W

So, ChatGPT seems to be able to improve the quality of instructions by ordering them, applying some rules (like writing in clear and simple language), removing redundancy and even adding the reason why it makes sense to unplug the device before opening it.

So let's see what else it can do.

DITA

Knowing that ChatGPT was able to extract information from an unstructured text and to transform it into a step-by-step procedure with warnings, I wanted to see whether it could express this in DITA. I knew ChatGPT was able to create program code, or to fix code that does not work correctly. DITA is nothing entirely different. So I tried.

I started out with simple cooking recipes. First, I retrieved instructions for pancakes. When I was satisfied with the recipe itself, I asked ChatGPT to create a DITA topic. It created a syntactically and grammatically correct DITA topic, but it was a topic, not a task. So I asked it to create a task. At first, it was invalid, but I was able to teach it how to perform the task correctly and ended up with a valid DITA task.

In the next iteration, I asked it to place the ingredients as prereq and the warnings as hazardstatement. At first, the hazardstatement was wrongly structured, but I was also able to teach it how to write correct warning messages.

To keep ChatGPT from just repeating what I had taught it, I asked it to apply what it had learned to a different task: resetting a Google Pixel 7 phone to factory settings. Here's the result:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE task PUBLIC "-//OASIS//DTD DITA Task//EN" "task.dtd">
<task id="factory-reset-pixel-7">
  <title>Performing a Factory Reset on Google Pixel 7</title>
  <taskbody>
    <prereq>Before starting, make sure your Google Pixel 7 is fully charged and backed up. Performing a factory reset will erase all data on your phone.</prereq>
    <steps>
      <step>
        <hazardstatement type="warning">
          <messagepanel>
            <typeofhazard>Data loss.</typeofhazard>
            <consequence>Performing a factory reset will erase all data on your phone.</consequence>
            <howtoavoid>Make sure you have backed up all important data before proceeding.</howtoavoid>
          </messagepanel>
        </hazardstatement>
        <cmd>Go to the Settings app on your phone.</cmd>
      </step>
      <step>
        <cmd>Scroll down and tap on System.</cmd>
      </step>
      <step>
        <cmd>Tap on Advanced.</cmd>
      </step>
      <step>
        <cmd>Tap on Reset options.</cmd>
      </step>
      <step>
        <cmd>Tap on Erase all data (factory reset).</cmd>
      </step>
      <step>
        <cmd>Follow the on-screen instructions to complete the reset process.</cmd>
      </step>
    </steps>
  </taskbody>
</task>
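
Since "valid" was established here by inspection, a small script can at least catch the structural slips described above. The following is a minimal sketch, not DTD validation: it checks well-formedness plus a few rules I am assuming from the DITA task content model (a title, exactly one cmd per step, and typeofhazard and howtoavoid inside each messagepanel). The function name check_task and the shortened sample are illustrative.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the task above (DOCTYPE omitted; this check does not use the DTD).
TASK_XML = """<task id="factory-reset-pixel-7">
  <title>Performing a Factory Reset on Google Pixel 7</title>
  <taskbody>
    <prereq>Make sure the phone is charged and backed up.</prereq>
    <steps>
      <step>
        <hazardstatement type="warning">
          <messagepanel>
            <typeofhazard>Data loss.</typeofhazard>
            <consequence>A factory reset erases all data.</consequence>
            <howtoavoid>Back up all important data first.</howtoavoid>
          </messagepanel>
        </hazardstatement>
        <cmd>Go to the Settings app on your phone.</cmd>
      </step>
      <step>
        <cmd>Scroll down and tap on System.</cmd>
      </step>
    </steps>
  </taskbody>
</task>"""


def check_task(xml_text):
    """Return a list of problems found; an empty list means the checks passed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        return [f"not well-formed: {err}"]

    problems = []
    if root.tag != "task":
        problems.append(f"root element is <{root.tag}>, expected <task>")
    if root.find("title") is None:
        problems.append("missing <title>")
    steps = root.findall("./taskbody/steps/step")
    if not steps:
        problems.append("no <step> elements under <taskbody>/<steps>")
    for number, step in enumerate(steps, start=1):
        if len(step.findall("cmd")) != 1:
            problems.append(f"step {number}: expected exactly one <cmd>")
    # Assumed rule: every message panel names the hazard and how to avoid it.
    for panel in root.iter("messagepanel"):
        for required in ("typeofhazard", "howtoavoid"):
            if panel.find(required) is None:
                problems.append(f"<messagepanel> lacks <{required}>")
    return problems


print(check_task(TASK_XML))
```

A check like this would flag a plain topic returned instead of a task, or a step without a command, before the file goes anywhere near a CCMS. Real validation against the DITA DTD or schema remains a separate step.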

A similar task of explaining how to remove old paint using NaOH solution can be found here.

Terminology correction and extraction

Here's the text to be corrected:
The sintering device serves to vitrify material that is transported via the conveyor belt. The sintering plant is directly connected to the conveying device so that material from the previous processing station is vitrified without delay. When loading the material into the plant, it should be noted that the strands are not inserted twisted into the sintering chamber. Twisted strands can lead to the deactivation of the sintering heating system and render the material unusable.

Here are the rules:
Term: sintering device
Forbidden: sintering plant, sintering chamber, sintering heating system

Term: conveyor belt
Forbidden: conveying device

The sintering device is used to vitrify material that is transported via the conveyor belt. The device is directly connected to the conveyor belt, allowing material from the previous processing station to be vitrified without delay. When loading the material into the device, it should be noted that the strands should not be introduced twisted into the sintering chamber. Twisted strands can cause the deactivation of the sintering heating device and render the material unusable.

No, you should write "sintering device" instead of "sintering heating device", since the former is the term to be used according to the rules.

You are correct. I apologize for my mistake. Here's the corrected text:

The sintering device is used to vitrify material that is transported via the conveyor belt. The device is directly connected to the conveyor belt, allowing material from the previous processing station to be vitrified without delay. When loading the material into the device, it should be noted that the strands should not be introduced twisted into the sintering chamber. Twisted strands can cause the deactivation of the sintering device and render the material unusable.

It performed quite well overall, but not perfectly. This is quite characteristic of using the chat: it tends to forget things that were mentioned a while ago, and they need to be brought back into consideration by repeating them. When using the API, rule sets can be uploaded, which makes the process more reliable. I tested the same text in German before I tried English, and it performed significantly better. I assume this is due to the way nouns can be compounded in German. Apparently, this grammatical property of the German language makes it easier for ChatGPT to recognize terms (and then replace them).
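
Because the chat cannot be fully trusted to apply every rule, a deterministic check after the AI's rewrite is a cheap safety net. The sketch below assumes the rule set is available as a simple mapping from preferred term to forbidden variants. The names RULES and find_forbidden_terms are mine, and the sample text is an excerpt of the corrected version above.

```python
import re

# Rule set taken from the example above: preferred term -> forbidden variants.
RULES = {
    "sintering device": ["sintering plant", "sintering chamber", "sintering heating system"],
    "conveyor belt": ["conveying device"],
}


def find_forbidden_terms(text, rules):
    """Return (forbidden term, preferred term) pairs that still occur in text."""
    hits = []
    for preferred, forbidden_terms in rules.items():
        for forbidden in forbidden_terms:
            # Whole-phrase, case-insensitive match.
            pattern = r"\b" + re.escape(forbidden) + r"\b"
            if re.search(pattern, text, flags=re.IGNORECASE):
                hits.append((forbidden, preferred))
    return hits


# Excerpt of the "corrected" text returned by ChatGPT above.
corrected = (
    "When loading the material into the device, it should be noted that the "
    "strands should not be introduced twisted into the sintering chamber. "
    "Twisted strands can cause the deactivation of the sintering device."
)

for forbidden, preferred in find_forbidden_terms(corrected, RULES):
    print(f"forbidden term '{forbidden}' found; use '{preferred}'")
```

Run on the excerpt, the check reports that "sintering chamber" is still present, a leftover that is easy to miss when reading. Whether a blind replacement with "sintering device" would read well at that point is a separate editorial question.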

Tests with the extraction of technical terms (and then listing them in alphabetical order, grouped by nouns, verbs and adjectives) were successful in the sense that ChatGPT extracted many words it considered technical terms and sorted them correctly into the categories. However, it seemed to have a rather diffuse "understanding" of what a technical term is. This is another problem that can probably be solved through the API, by training it with many examples of text together with the technical terms they contain.
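
The grouping and sorting part of this task is mechanical and does not need the AI at all, so it can be done reliably once the terms are extracted. A minimal sketch, assuming the extraction step returns (term, word class) pairs; the sample list is hypothetical and not actual ChatGPT output.

```python
from collections import defaultdict

# Hypothetical extraction result: (term, word class) pairs.
extracted = [
    ("vitrify", "verb"),
    ("sintering device", "noun"),
    ("twisted", "adjective"),
    ("conveyor belt", "noun"),
    ("transport", "verb"),
    ("strand", "noun"),
]


def group_terms(pairs):
    """Group terms by word class and sort each group alphabetically."""
    groups = defaultdict(set)  # a set removes duplicate terms
    for term, word_class in pairs:
        groups[word_class].add(term.lower())
    return {word_class: sorted(terms) for word_class, terms in sorted(groups.items())}


for word_class, terms in group_terms(extracted).items():
    print(f"{word_class}: {', '.join(terms)}")
```

This leaves only the fuzzy part, deciding what counts as a technical term, to the AI.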

Other markup

Encouraged by these DITA results, I asked it to compile an HTML document with a table of CCMS products I gave it as input. I asked for a column that shows whether each software product fully supports DITA. It did not retrieve this information correctly in all cases; it was too optimistic. But that's only natural: it uses the information sources it has, and among them are surely websites where software vendors claim their products provide full DITA support when they don't.

Again, if the system has access to solid data through the API, it should return reliable results. But I find the HTML and CSS very adequate. And if you want something changed, just ask for it.

<!DOCTYPE html>
<html>
<head>
        <meta charset="UTF-8">
        <title>CCMS DITA Support</title>
        <style>
                table {
                        border-collapse: collapse;
                        width: 100%;
                }
                th, td {
                        padding: 10px;
                        text-align: left;
                        border-bottom: 1px solid #ddd;
                }
                th {
                        background-color: #f2f2f2;
                        font-weight: bold;
                        color: #555;
                        text-transform: uppercase;
                        letter-spacing: 0.1em;
                }
                td:first-child {
                        font-weight: bold;
                        color: #333;
                }
                .yes {
                        background-color: #b3e6b3;
                        color: #008000;
                        font-weight: bold;
                }
                .limited {
                        background-color: #ffe6b3;
                        color: #ff9900;
                        font-weight: bold;
                }
                .no {
                        background-color: #ffc2b3;
                        color: #ff0000;
                        font-weight: bold;
                }
        </style>
</head>
<body>
        <h1>CCMS DITA Support</h1>
        <table>
                <tr>
                        <th>Product</th>
                        <th>DITA Support</th>
                </tr>
                <tr>
                        <td>One product</td>
                        <td class="yes">Yes</td>
                </tr>
                <tr>
                        <td>Another product</td>
                        <td class="yes">Yes</td>
                </tr>
                <tr>
                        <td>Yet another</td>
                        <td class="no">No</td>
                </tr>
                <tr>
                        <td>Cinnamon CMS</td>
                        <td class="yes">Yes</td>
                </tr>
        </table>
</body>
</html>
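
If the support data comes from a verified source rather than from the model's memory, the table rows can be generated deterministically and only the styling left to the AI. The following is a minimal sketch under that assumption. It reuses the CSS class names from the page above; the function name render_rows and the data layout are mine, and the rows are the same placeholders as in the listing.

```python
from html import escape

# Support levels map to the CSS classes used in the page above.
CSS_CLASS = {"Yes": "yes", "Limited": "limited", "No": "no"}

# Placeholder rows as in the page above; verified data would go here.
products = [
    ("One product", "Yes"),
    ("Another product", "Yes"),
    ("Yet another", "No"),
    ("Cinnamon CMS", "Yes"),
]


def render_rows(rows):
    """Render one <tr> per (product, support) pair, escaping all text."""
    lines = []
    for name, support in rows:
        css = CSS_CLASS[support]  # raises KeyError for an unknown support level
        lines.append("<tr>")
        lines.append(f"  <td>{escape(name)}</td>")
        lines.append(f'  <td class="{css}">{escape(support)}</td>')
        lines.append("</tr>")
    return "\n".join(lines)


print(render_rows(products))
```

The generated rows can be pasted between the header row and the closing table tag of the page above.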

ChatGPT can apparently generate many types of files that can be expressed as text. Since SVG is basically XML, I tested whether ChatGPT can draw. I asked for a diagram with a few boxes labeled with the acronyms "ERP", "CCMS" and "PIM", linked by lines. Here's the result, after some iterations to improve the layout (centering text, adjusting box and text sizes, distances etc.). Again, this is something that can be done better with the API by providing static styling rules.

[Figure: SVG diagram created by AI]

I admit, this looks somewhat impressive, but not like a killer app. Drawing this image with Inkscape would probably have taken less time than explaining it.
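
To illustrate what "static styling rules" could mean in practice, here is a minimal sketch that builds a comparable diagram from a fixed style dictionary. The layout (boxes in a row, each linked to its neighbor) and all style values are my assumptions. It does not reproduce the AI-generated figure.

```python
import xml.etree.ElementTree as ET

# Static styling rules (all values are illustrative assumptions).
STYLE = {"box_w": 120, "box_h": 60, "gap": 60, "margin": 20, "font_size": 20}


def build_diagram(labels, style=STYLE):
    """Return an SVG string with one box per label, connected left to right."""
    w, h, gap, m = style["box_w"], style["box_h"], style["gap"], style["margin"]
    total_w = 2 * m + len(labels) * w + (len(labels) - 1) * gap
    svg = ET.Element("svg", {
        "xmlns": "http://www.w3.org/2000/svg",
        "width": str(total_w),
        "height": str(h + 2 * m),
    })
    for i, label in enumerate(labels):
        x = m + i * (w + gap)
        if i > 0:
            # Connector from the previous box's right edge to this box's left edge.
            ET.SubElement(svg, "line", {
                "x1": str(x - gap), "y1": str(m + h // 2),
                "x2": str(x), "y2": str(m + h // 2),
                "stroke": "black",
            })
        ET.SubElement(svg, "rect", {
            "x": str(x), "y": str(m), "width": str(w), "height": str(h),
            "fill": "white", "stroke": "black",
        })
        # Center the label horizontally and vertically in its box.
        text = ET.SubElement(svg, "text", {
            "x": str(x + w // 2), "y": str(m + h // 2),
            "text-anchor": "middle", "dominant-baseline": "middle",
            "font-size": str(style["font_size"]),
        })
        text.text = label
    return ET.tostring(svg, encoding="unicode")


print(build_diagram(["ERP", "CCMS", "PIM"]))
```

With the geometry fixed in code, the iterations over text centering and box sizes are settled once, and only the list of labels changes from diagram to diagram.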

A more realistic use case could be to upload an SVG from a supplier through the API, apply a set of style guide rules (for example, unify line widths, exchange logos and apply CI colors), and then extract all the strings. These would then be translated the traditional way, using a Translation Memory, and finally re-injected into the SVG. Another approach to translation could be to let the AI translate while applying a defined terminology and, as a quality check, translate back to the source language. There shouldn't be any differences, just as there wouldn't be in a TM-based approach. Letting the AI repeat this until the back-translation is identical, learning by trial and error, would be the approach closest to how AIs work.
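
The extract and re-inject steps of this workflow are plain XML processing. The sketch below assumes that the translatable strings sit directly in text and tspan elements. The tiny sample SVG and the German translations are hypothetical stand-ins for a supplier file and a Translation Memory result.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
ET.register_namespace("", SVG_NS)  # serialize without an "ns0:" prefix

TEXT_TAGS = (f"{{{SVG_NS}}}text", f"{{{SVG_NS}}}tspan")

# Tiny stand-in for a supplier SVG.
SOURCE_SVG = (
    f'<svg xmlns="{SVG_NS}" width="200" height="60">'
    '<text x="10" y="20">Conveyor belt</text>'
    '<text x="10" y="40">Sintering device</text>'
    "</svg>"
)


def extract_strings(root):
    """Collect the non-empty text of every <text> and <tspan> element."""
    strings = []
    for elem in root.iter():
        if elem.tag in TEXT_TAGS and elem.text and elem.text.strip():
            strings.append(elem.text.strip())
    return strings


def inject_translations(root, translations):
    """Replace each string that has a translation; leave the others untouched."""
    for elem in root.iter():
        if elem.tag in TEXT_TAGS:
            key = (elem.text or "").strip()
            if key in translations:
                elem.text = translations[key]


root = ET.fromstring(SOURCE_SVG)
print(extract_strings(root))

# Hypothetical translations, e.g. as returned from a Translation Memory.
inject_translations(root, {
    "Conveyor belt": "Förderband",
    "Sintering device": "Sintervorrichtung",
})
print(ET.tostring(root, encoding="unicode"))
```

Keeping extraction and injection deterministic means the AI, or the TM, only ever sees the strings themselves, never the drawing.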

Using and developing software

ChatGPT is a very helpful companion in software development. I was recently porting a web application to .NET Core MVC. I compared solving some of the problems "by hand", using the documentation and forum entries a search engine retrieved, with asking ChatGPT for help. In many cases, the latter is much more efficient, particularly when the field you are trying to get acquainted with is so unfamiliar that it's difficult even to find the right search terms.

With AI support, it's possible to just explain the problem.

Still, writing a program without any programming skills at all won't work. Solutions from ChatGPT are often very helpful and illustrate how to do it, but they are often not quite perfect. A ChatGPT user without any programming knowledge wouldn't get them to run, but an experienced programmer without knowledge of the target system could resolve the remaining issues on their own, or by asking the AI again.

ChatGPT is also very helpful for software users. I wrote a LISP script for the DTP program Interleaf (a.k.a. Quicksilver) some 20 years ago. Recently, I needed to test it on some data, but I couldn't remember how to run the script. I asked ChatGPT and it replied:

To apply the Lisp plugin to a document, you first need to load it into Quicksilver. Here's how to do it:

  1. Click on "Tools" in the Quicksilver menu bar.
  2. Select "Load Lisp" from the dropdown menu.
  3. Choose the .lsp file that you have saved on your desktop.
  4. The plugin is now loaded into Quicksilver.

Once the plugin is loaded, you can apply it to a document by following these steps:

  1. Open the document that you want to apply the plugin to.
  2. Click on "Tools" in the Quicksilver menu bar.
  3. Select the plugin from the list of loaded Lisp plugins.
  4. The plugin is now applied to the document.

Depending on what the plugin does, you may need to take additional steps to achieve the desired result. If you need further assistance, you can also reach out to the Quicksilver community or support.

The outdated version I was using behaved a bit differently, but the task description was close enough that I got it to work. Getting help on nearly anything with extremely low requirements for prior knowledge is a great relief and saves a lot of time.

Music

Having gathered some insight into what NLP AIs can do for technical writers, software users and software developers, I widened the circle a bit. In my spare time, I write and produce songs with a Digital Audio Workstation. For a song I was working on at the time, I needed some help with the lyrics. The chat was rather long, so I will summarize the key steps.

First I needed a word that rhymed with shines. ChatGPT suggested:

  • Lines
  • Spines
  • Divines
  • Combines
  • Designs
  • Declines
  • Refines
  • Undermines
  • Outshines
  • Realigns

I found a good line that ends with lines.

Next, I wanted some help for the first two lines of the second verse of the lyrics. The music for the verse is already more or less defined, so I had a certain rhythm in mind. The first two lines of the first verse were:

I anticipated a future that’s bright and tender
And we negotiated, but it feels like a false surrender

So I asked for suggestions. The result was:

The world is spinning faster, I'm losing my grip
I'm reaching for the future, but I'm slipping
The walls are closing in, I'm suffocating
I'm praying for a miracle, but it's not forthcoming

Well, I told it that this didn't quite capture the spirit of the song. These lines sound desperate, but the song is not meant to be so at all. Perhaps a little serene, at best…

And what's even worse, the lines didn't match the rhythm at all. So I explained:

Also, the verse doesn't quite fit the intended rhythm of the verse…
The first two lines of my verse can be considered to be split behind "anticipated" and "negotiated", like:
I anticipated
a future that’s bright and tender
And we negotiated,
but it feels like a false surrender

Each of these lines is two bars long (4/4, BTW)
The 1st line "I-an-ti-ci-pa-ted" is sung on the 3-4-1-2-3-4 of the first two bars. 3rd line equivalent.

Can you please try again?

Sure, based on the additional information you provided, here are some revised lyrics that might work better:

Verse
I'm walking through the shadows, in search of something true
The light is always shining, just out of my view
I'm wandering in circles, searching for a sign
The road is long and winding, but I know I'll be just fine

Each line of this verse is two bars long, with a similar rhythm to the first two lines of your existing verse. The first line is sung on the 1-2-3-4 of the first two bars, the second line is sung on the 3-4-1-2-3-4 of the first two bars, and so on.

I hope this helps! Let me know if you need any further assistance.

So, isn't that impressive? I expected it to be able to remove the unwanted tristesse from the lyrics, but the rhythmic instructions were a test I expected it to fail. It might be difficult to judge if you don't know the music that goes with it, but rest assured, the rhythm works.

By Boris Horner, April 2023
