Online and offline quality checking

What does "automatic quality checking" mean?

First, we must define quality. For the purposes of this article, we treat quality as the degree of adherence to clearly defined quality standards.

Manual quality checking happens, for example, during review of data, where reviewers check the content for completeness, factual and linguistic correctness, adherence to authoring guidelines, etc.

Automatic quality checking means measuring adherence to quality standards (represented as a set of rules) by means of software and reporting broken rules. Note that software currently cannot, or at least not very well, check whether technical information is correct, since it does not "understand" the technical reality that is the subject of the documentation. Automatic checking is, however, very good at checking formal rules across large volumes of data.

Here are some examples of rules that software can evaluate, and that are common in authoring guidelines:

  • Not more than seven bullet points in a list.
  • Non-inline images must have a caption (even though the XML data model might not enforce it).
  • Forbidden terminology must not be used.
  • Sentences must not exceed a certain number of words.
  • XML elements expecting numbers or dates as content must fulfill formal rules (correct separator characters in numbers, correct date formats, correct currency notation, etc.).
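
To illustrate how rules like those above can be expressed in software, here is a minimal Python sketch. It is not taken from any particular CCMS or checking product. The element names (list, item, figure, caption, para, date), the word limit, the forbidden terms and the date pattern are all assumptions chosen for the example; a real rule set would come from your own authoring guidelines and data model.

```python
import re
import xml.etree.ElementTree as ET

# All names and limits below are assumptions made for this sketch.
MAX_LIST_ITEMS = 7
MAX_SENTENCE_WORDS = 25
FORBIDDEN_TERMS = ("abort", "utilize")
ISO_DATE = re.compile(r"\d{4}-\d{2}-\d{2}")


def check_module(xml_text):
    """Return a list of rule breaches found in one module (empty if none)."""
    root = ET.fromstring(xml_text)
    breaches = []

    # Rule: not more than seven bullet points in a list.
    for lst in root.iter("list"):
        count = len(lst.findall("item"))
        if count > MAX_LIST_ITEMS:
            breaches.append(f"list has {count} items (max {MAX_LIST_ITEMS})")

    # Rule: non-inline images must have a non-empty caption.
    for figure in root.iter("figure"):
        caption = figure.find("caption")
        if caption is None or not "".join(caption.itertext()).strip():
            breaches.append("figure without caption")

    # Rules on running text: forbidden terminology and sentence length.
    for para in root.iter("para"):
        text = "".join(para.itertext())
        for term in FORBIDDEN_TERMS:
            if re.search(rf"\b{re.escape(term)}\b", text, re.IGNORECASE):
                breaches.append(f"forbidden term: {term}")
        # Naive sentence split on ., ! or ? followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
            words = len(sentence.split())
            if words > MAX_SENTENCE_WORDS:
                breaches.append(
                    f"sentence has {words} words (max {MAX_SENTENCE_WORDS})"
                )

    # Rule: date elements must follow one formal pattern (here YYYY-MM-DD).
    for date in root.iter("date"):
        value = (date.text or "").strip()
        if not ISO_DATE.fullmatch(value):
            breaches.append(f"invalid date format: {value}")

    return breaches
```

Calling check_module with the XML text of a module returns a list of messages, one per broken rule; an empty list means the module passed all checks in this sketch.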

The above checks could, for example, be performed in the CCMS whenever an author checks in a module. If there are rule breaches, the author is notified immediately and must correct the content. Only when no rule breaches remain can the module be passed on to (human) review. In this sense, they are online checks, because they are closely integrated into the authoring process. Such automatic checks against so-called business rules are common practice in S1000D-based aerospace and defense projects.

In other situations, offline checks make sense. Offline checks are not tied to a particular step in the authoring process; instead, they are applied to an existing set of data. Here are some real-world examples of offline checks:
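
The gating behaviour of such an online check can be sketched in a few lines of Python. This is only an illustration under assumptions: the names check_in and CheckInResult and the sample rule are hypothetical and do not correspond to the API of any specific CCMS.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# A rule takes the module content and returns a list of breach messages.
Rule = Callable[[str], List[str]]


@dataclass
class CheckInResult:
    accepted: bool                      # True only if no rule was broken
    breaches: List[str] = field(default_factory=list)


def check_in(content: str, rules: List[Rule]) -> CheckInResult:
    """Run all rules; accept the module for review only if none is broken."""
    breaches = [message for rule in rules for message in rule(content)]
    return CheckInResult(accepted=not breaches, breaches=breaches)


def no_double_spaces(content: str) -> List[str]:
    """Hypothetical sample rule: running text must not contain double spaces."""
    return ["double space found"] if "  " in content else []
```

A module that breaks at least one rule is returned to the author together with the list of breaches; only a result with accepted set to True would be forwarded to human review.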

  • Translation Memory systems are often configured to allow multiple, coexisting translations for the same source segment, and translators freely use this option. After some time, it is hard to find any modules in the CCMS where all text segments have unambiguous matches in the TM. Thus, automatic translation from the TM can rarely be applied, causing significant extra cost. Offline ambiguity checks in a TM can identify those ambiguous entries so that they can be cleaned up.
  • During migration, it makes sense to first check whether the source data fulfills the minimal requirements for a reasonable transformation to the new data model. Modules containing significant structural faults are not transformed automatically but rather assigned to an author for manual transformation.
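
The core of an offline ambiguity check on a translation memory can be sketched as follows. This is a simplified illustration, not a description of how any particular tool works: it assumes the TM has already been exported as plain (source, target) pairs for a single language pair, and the function names are hypothetical.

```python
from collections import defaultdict


def normalize(segment):
    """Collapse runs of whitespace so trivially different segments compare equal."""
    return " ".join(segment.split())


def find_ambiguous_segments(units):
    """Map each source segment to its translations if it has more than one.

    units: iterable of (source, target) string pairs for one language pair.
    """
    targets_by_source = defaultdict(set)
    for source, target in units:
        targets_by_source[normalize(source)].add(normalize(target))
    return {
        source: sorted(targets)
        for source, targets in targets_by_source.items()
        if len(targets) > 1
    }
```

The result lists only those source segments that have more than one distinct translation; deciding which translation to keep remains a task for a human.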

What we can do for you

Any rule-based checking (manual or automatic, online or offline) requires a set of rules to check against. Applicable rules are often laid down in authoring guidelines or in graphics style guides. Thus, if you don't have any such rules, we should first work out a reasonable set of rules together with you.

Not all such guideline rules are candidates for automatic checking, but many are. Some checks require specialized software products like terminology databases. We can help you find the checks that make sense and design processes that make optimal use of them.

The example of translation memory cleanup described above can be handled with TMQE (Translation Memory Quality Editor), a software tool developed by texolution and reinisch. Using the tool, we can efficiently clean up your TMs or enable you to do so yourself.

Last modified on Sep 18, 2020, 10:35:10 AM