How Generative AI Works in 4 Steps: AI as a Function
In simple terms, AI is a function that has numerous inputs and various possible outputs.
Unlike the simple functions you learned in math class, such as f(x) = x + 2, AI is not a straightforward function.
It is an extremely complex function with a myriad of possible inputs and outputs, and it requires immense amounts of data and computation to develop.
For instance, GPT-3, released in 2020, was reportedly trained on about 570GB of text data.
A 300-page book typically takes up about 1MB of data, so 570GB is equivalent to around 580,000 books' worth of content.
Assuming it takes an average of 6 hours to read one book, this amounts to about 397 years of nonstop, around-the-clock reading.
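The arithmetic behind these figures can be checked with a short sketch. The constants below are the article's own estimates; the choice of 1,024MB per GB and the function names are assumptions made for illustration.

```python
# Back-of-the-envelope check of the figures quoted above.
DATASET_GB = 570      # reported size of GPT-3's training text
MB_PER_GB = 1024      # assumption: binary gigabytes
BOOK_MB = 1           # rough size of a 300-page book
HOURS_PER_BOOK = 6    # assumed average reading time per book


def books_in_dataset(dataset_gb=DATASET_GB):
    """Number of 1MB books that fit in the dataset."""
    return dataset_gb * MB_PER_GB / BOOK_MB


def years_of_reading(num_books, hours_per_book=HOURS_PER_BOOK):
    """Years needed to read num_books without ever stopping."""
    return num_books * hours_per_book / (24 * 365)


print(f"{books_in_dataset():,.0f} books")           # 583,680 books
print(f"{years_of_reading(580_000):.0f} years")     # 397 years
```

The 397-year figure assumes reading 24 hours a day, every day, using the rounded count of 580,000 books.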
What Does Training AI Mean?
Training AI involves optimizing the parameters (numerical values that help AI make accurate predictions) that configure this function.
Through training, the optimized function recognizes patterns in the input data and predicts outcomes for new inputs by using the learned parameters.
AI generates numerous predictions and selects the most appropriate result to produce an output.
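To make "optimizing the parameters" concrete, here is a minimal sketch that learns the math-class function f(x) = x + 2 from examples. It assumes a model with only two parameters, `w` and `b`, trained by plain gradient descent. The function name, learning rate, and step count are illustrative choices; real AI models adjust billions of parameters with far more elaborate procedures.

```python
def train(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y = w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0  # parameters start out knowing nothing
    n = len(xs)
    for _ in range(steps):
        # How much the error changes when w or b changes slightly
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Nudge each parameter in the direction that reduces the error
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Training data generated from the target function f(x) = x + 2
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [x + 2 for x in xs]

w, b = train(xs, ys)
print(w, b)        # close to 1 and 2
print(w * 10 + b)  # prediction for an unseen input, close to 12
```

The model never sees the formula itself. It only sees input and output pairs, and the learned parameters let it predict the output for an input such as 10 that was not in the training data.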
A function that takes natural language (language used in daily human communication) as input and generates an appropriate response is called Generative AI.
Generative AI produces content in various forms, such as text, images, and audio, based on its training.
How Does Generative AI Work?
The operation of generative AI can be divided into four main stages.
1. Data Training
The data preprocessing stage is used to create a large-scale dataset, which is then used to train the AI model.
Key Terms
- Data Preprocessing: Preparing data in a format suitable for training an AI model
- Dataset: A collection of data gathered for a specific purpose, used to train or evaluate machine learning models
- AI Model: A computer program that analyzes data to learn patterns and rules, making predictions and decisions on new data
For instance, text generation AI learns from various text-based documents such as books, articles, and web pages, while image generation AI learns from numerous photographs and illustrations.
A computer program that has completed training is called a model, which takes input data to generate new content.
2. Pattern Recognition
AI recognizes patterns and extracts features from the input data based on what it learned during training.
- Text Input: The input text is tokenized to analyze sentence structure, vocabulary patterns, etc. Tokenization divides sentences into units like words, punctuation, and numbers.
Input: "The weather is nice today."
Tokenization: ['The', 'weather', 'is', 'nice', 'today', '.']
- Image Input: The input image's shapes, colors, and key elements are analyzed to extract features, which are then converted into vectors. Vectors represent data such as words, sentences, or image features in numerical form.
Input: "Apple image"
Feature Extraction: Color (red), Shape (round), Object (apple)
Vectorization: [0.9, 0.1, 0.0, ...]
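The text example above can be reproduced with a short sketch. It assumes a simple regular-expression tokenizer and a toy vocabulary that gives each token an integer ID. The `tokenize` and `to_ids` helpers are hypothetical names for illustration; production models typically use subword tokenizers and learned vectors instead.

```python
import re


def tokenize(text):
    """Split text into words/numbers and individual punctuation marks."""
    # \w+ matches a run of letters or digits; [^\w\s] matches one punctuation mark
    return re.findall(r"\w+|[^\w\s]", text)


def to_ids(tokens, vocab):
    """Turn tokens into numbers by giving each new token the next free ID."""
    return [vocab.setdefault(token, len(vocab)) for token in tokens]


tokens = tokenize("The weather is nice today.")
print(tokens)  # ['The', 'weather', 'is', 'nice', 'today', '.']

vocab = {}
print(to_ids(tokens, vocab))  # [0, 1, 2, 3, 4, 5]
```

Once the input is expressed as numbers, the model can compare and combine the pieces mathematically, which is what makes pattern recognition possible.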
3. Context Understanding
Based on the given input data, AI identifies how each element is interrelated to come up with an appropriate output. In text input, for example, it understands the relationships among words within a sentence.
4. Content Generation
The trained AI model generates new data. For text generation, it predicts the next word probabilistically and completes the sentence based on this. For image generation, it creates a new image suitable to the given description.
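Predicting the next word probabilistically can be illustrated with a toy sketch. It assumes a tiny three-sentence corpus and a model that only counts which word follows which (a bigram model). This is a deliberately simplified stand-in, not how large generative models are built, and every name in it is illustrative.

```python
import random
from collections import Counter, defaultdict


def build_bigram_counts(corpus):
    """Count how often each word is followed by each other word."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for current, following in zip(words, words[1:]):
            counts[current][following] += 1
    return counts


def next_word_probabilities(counts, word):
    """Convert follower counts for a word into probabilities."""
    followers = counts.get(word, Counter())
    total = sum(followers.values())
    return {w: c / total for w, c in followers.items()} if total else {}


def generate(counts, start, max_words=5, seed=0):
    """Extend a sentence by repeatedly sampling a likely next word."""
    rng = random.Random(seed)  # fixed seed keeps the sketch repeatable
    words = [start]
    for _ in range(max_words):
        followers = counts.get(words[-1])
        if not followers:
            break  # no known continuation, so stop
        candidates = list(followers)
        weights = list(followers.values())
        words.append(rng.choices(candidates, weights=weights)[0])
    return " ".join(words)


corpus = [
    "the weather is nice today",
    "the weather is cold today",
    "the weather is nice again",
]
counts = build_bigram_counts(corpus)
print(next_word_probabilities(counts, "is"))  # 'nice' about 0.67, 'cold' about 0.33
print(generate(counts, "the"))                # a sentence starting "the weather is"
```

Because the next word is sampled rather than fixed, the same starting word can lead to different sentences, which is one reason generative AI can return different answers to the same prompt.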
More detailed content on how generative AI processes text input can be found in the course How Generative AI Understands Prompts.