Changes between Version 1 and Version 2 of Public/WhitePaperAiBriefing


Timestamp: Jun 2, 2023, 10:29:50 AM
Author: Boris Horner
= AI briefing
The classical way of getting Large Language Models (LLMs) to perform certain tasks reliably is training. To train an LLM, a large number of inputs and expected outputs is typically compiled into a dataset. In a training run, this data is integrated into the AI's "knowledge" by computing updated weights from the new data. This step requires very significant computational power: high-end GPUs with sufficient RAM.

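The input/output pairs that make up such a dataset are often stored one JSON object per line (JSON Lines). A minimal sketch follows; the field names and example pairs are illustrative assumptions, since the exact schema varies by training framework.

```python
import json

# Hypothetical input/output pairs for supervised fine-tuning.
# The keys "input" and "output" are assumptions, not a fixed standard.
pairs = [
    {"input": "Summarize: The meeting was moved to Friday.",
     "output": "Meeting rescheduled to Friday."},
    {"input": "Translate to German: Good morning.",
     "output": "Guten Morgen."},
]

def to_jsonl(records):
    """Serialize records as JSON Lines: one training example per line."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)

jsonl = to_jsonl(pairs)
print(jsonl)
```

Real training sets of this kind typically contain thousands of such pairs, which is why preparing them dominates the cost.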
To give an impression of the dimensions: realistic hardware for training could be a server with 192 GB of RAM and 5 to 10 NVIDIA A100 boards, each with 80 GB of video RAM and a price tag of around $15,000. Such servers can be rented as dedicated machines for significant monthly fees, but since training is only performed occasionally and takes a few hours to days, it is more economical to rent cloud hardware on demand for this purpose and to use smaller, cheaper hardware for inference.

Renting such a server for a training run sounds more expensive than it actually is: given the short time it is needed, the cost is typically in the hundreds of US dollars. The more expensive aspect is gathering and preparing the training data.

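The "hundreds of US dollars" figure can be sanity-checked with back-of-the-envelope arithmetic. The hourly GPU rate and run length below are assumptions for illustration, not quoted prices from any provider.

```python
# Rough rental-cost sketch for a short training run.
gpus = 8                 # assumed number of A100 boards rented
rate_per_gpu_hour = 2.5  # assumed US$ per GPU-hour
hours = 24               # assumed length of the training run

total = gpus * rate_per_gpu_hour * hours
print(f"${total:.0f}")
```

Under these assumptions a one-day run on eight GPUs lands in the mid hundreds of dollars, consistent with the estimate above; a multi-day run would scale linearly.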
But is this really the way users have experienced LLMs since ChatGPT was released in November 2022? Who has actually trained an AI to achieve their goals?