
How Generative AI Understands Prompts

As we introduced earlier, AI is a function.

However, unlike a simple function like f(x) = 3x + 2 from math class, AI is a function of an enormous number of variables with a vast space of possible outputs, making it far too complex to grasp at a glance.

Just as human intelligence emerges from the brain, AI intelligence emerges from a model made up of many interconnected functions.

AI models learn from data to form artificial neurons, loosely modeled on the nerve cells of the human brain, and solve problems on that foundation.

Recently released generative AI models organize these neurons using an architecture called the Transformer.


A Transformer analyzes the provided prompt by breaking it into small units called tokens, then generates sentences by probabilistically predicting the next word, one step at a time.

The process of how AI understands prompts can be divided into four main steps.


1. Tokenization

A token is one of the small units into which a sentence is divided: a word, a punctuation mark, a number, and so on. When AI receives the prompt "The cat climbed up the tree," it breaks the sentence down into tokens.

Example of Tokenized Sentence
The / cat / climbed / up / the / tree

Each token helps AI find meaning in its learned data and understand the context of the sentence. Though tokens may be defined slightly differently depending on the AI model, they are generally defined as follows.


English Tokenization

English tokenization usually separates words based on spaces or punctuation (symbols used in a sentence, such as periods).

Example: "The quick brown fox jumps over the lazy dog."

Tokenizing this sentence results in 10 tokens:

  • The
  • quick
  • brown
  • fox
  • jumps
  • over
  • the
  • lazy
  • dog
  • .

Here, each word and punctuation mark becomes a token.
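As a rough illustration of this word-and-punctuation splitting, here is a minimal Python sketch. The simple_tokenize helper is hypothetical and written just for this example; real tokenizers are far more sophisticated.

Python Example: Simple Word Tokenizer

import re

def simple_tokenize(sentence: str) -> list[str]:
    # \w+ matches runs of word characters; [^\w\s] matches each punctuation mark.
    return re.findall(r"\w+|[^\w\s]", sentence)

print(simple_tokenize("The quick brown fox jumps over the lazy dog."))
# ['The', 'quick', 'brown', 'fox', 'jumps', 'over', 'the', 'lazy', 'dog', '.']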

A single word can also be divided into multiple tokens based on prefixes, common letter patterns, and suffixes. For instance, a tokenizer might split the word "unconscious" into sub-elements such as un (a prefix implying negation) and conscious, or into even smaller pieces; the exact split depends on the tokenizer's vocabulary.
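You can inspect how a real subword tokenizer splits a word using OpenAI's tiktoken library. Below is a minimal sketch, assuming tiktoken is installed; the exact pieces depend on the encoding you choose.

Python Example: Inspecting Subword Tokens with tiktoken

import tiktoken  # pip install tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # an encoding used by recent OpenAI models
token_ids = enc.encode("unconscious")

# Decode each token id on its own to reveal the subword pieces.
pieces = [enc.decode([token_id]) for token_id in token_ids]
print(token_ids, pieces)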


Korean Tokenization

Korean tokenization is a bit more complex. Because Korean is rich in particles and verb endings, it relies on morpheme-based tokenization (a morpheme is the smallest unit of meaning in a language) rather than simple word-based tokenization.

Example: "나는 도서관에서 책을 읽고 있었다" ("I was reading a book at the library.")

Tokenizing this sentence breaks it down into 11 tokens:

  • 나: "I" (pronoun)
  • 는: topic particle
  • 도서관: "library" (noun)
  • 에서: locative particle ("at")
  • 책: "book" (noun)
  • 을: object particle
  • 읽: verb stem ("read")
  • 고: connective ending
  • 있: auxiliary verb stem ("be ...ing")
  • 었: past tense ending
  • 다: terminal ending

Generally, Korean requires roughly twice as many tokens as English for text of the same length.
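For a hands-on look at morpheme-level splitting, the KoNLPy library bundles several Korean morpheme analyzers. A minimal sketch, assuming konlpy and its Java dependency are installed; each analyzer splits slightly differently.

Python Example: Korean Morpheme Tokenization with KoNLPy

from konlpy.tag import Okt  # pip install konlpy (requires Java)

okt = Okt()
# Morpheme-level split of "I was reading a book at the library."
print(okt.morphs("나는 도서관에서 책을 읽고 있었다"))
# The exact output varies by analyzer; expect pieces like 나 / 는 / 도서관 / 에서 / ...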

The method of processing tokens varies depending on the AI model and the writing system involved. ChatGPT typically allocates one token for every 1-4 characters of English text and splits Korean into subword units that largely align with morphemes.

Note: Most text-generation AIs, like ChatGPT, charge based on the number of input and output tokens. Thus, minimizing unnecessary tokens is crucial.
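Since billing scales with token counts, it helps to measure a prompt before sending it. A minimal sketch using tiktoken; the count is exact only for models that actually use the chosen encoding.

Python Example: Counting Tokens

import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

def count_tokens(text: str) -> int:
    # Number of input tokens this text would be billed as.
    return len(enc.encode(text))

print(count_tokens("The cat climbed up the tree."))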


2. Embedding

Tokenized words are transformed into numeric vectors. For example, the word cat can be converted into a vector (a numeric representation of words or sentences) as follows:

Vector for Cat
[0.11, 0.34, 0.56, ...]

Words with similar meanings have similar vector values. For instance, the word dog might have a vector close to cat's:

Vector for Dog
[0.13, 0.31, 0.52, ...]

Words with similar vector values are located close to each other in vector space.
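Closeness in vector space is commonly measured with cosine similarity. Here is a minimal Python sketch with toy 3-dimensional vectors; the values are made up for illustration, and real embeddings have hundreds or thousands of dimensions.

Python Example: Cosine Similarity Between Word Vectors

import math

cat = [0.11, 0.34, 0.56]
dog = [0.13, 0.31, 0.52]   # deliberately close to cat
tree = [0.90, 0.05, 0.20]  # deliberately far from both

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity(cat, dog))   # close to 1.0: similar meanings
print(cosine_similarity(cat, tree))  # noticeably lower: different meanings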


3. Context Understanding

AI understands the context of a sentence based on the vector values of the tokenized words. For example, when the words cat and tree appear together, AI understands the relationship between these two words.

It uses an attention mechanism to calculate how strongly each word in a sentence is connected to every other word, assigning higher weight to the words that matter most in the current context.

The Transformer, the neural network architecture introduced earlier, uses attention to examine the relationships among all the words simultaneously. For instance, it works out how cat and tree interact and identifies cat as the subject of the action climbed.

ChatGPT is built on the Transformer architecture and understands the context of prompts through this attention mechanism.
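The core computation is scaled dot-product attention: softmax(QK^T / sqrt(d_k)) V. Here is a minimal NumPy sketch in which random toy vectors stand in for learned word embeddings.

Python Example: Scaled Dot-Product Self-Attention

import numpy as np

def scaled_dot_product_attention(Q, K, V):
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # how strongly each word relates to every other word
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over each row
    return weights @ V, weights

# Toy 4-dimensional vectors for the 6 tokens of "The cat climbed up the tree".
rng = np.random.default_rng(0)
x = rng.normal(size=(6, 4))

# Self-attention: queries, keys, and values all come from the same tokens.
output, weights = scaled_dot_product_attention(x, x, x)
print(weights.round(2))  # row i shows how much token i attends to each token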


4. Response Generation

AI predicts the first word based on the input vectors. In this process, AI uses its pre-trained language model to choose the most appropriate word given the context. For example, the word "The" might be predicted first and become the start of the response.

Once the first word is predicted, AI incorporates this word into the context and predicts the next word. This process repeats until the response is complete, with AI continuously using the newly generated words to predict each subsequent word.

  • After generating "The cat", the AI might choose "is a".

  • After generating "The cat is a", the AI might choose "small animal".

AI continues generating words until a complete and natural sentence satisfying the prompt is formed.

Example of Generated Response
The cat is a small animal, commonly kept as a domestic pet.
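Here is a toy sketch of that word-by-word loop. The next_word_probabilities function is a hypothetical stand-in with made-up probabilities; in a real system, the Transformer computes this distribution from the context at every step.

Python Example: Word-by-Word Response Generation

import random

def next_word_probabilities(context: tuple[str, ...]) -> dict[str, float]:
    # Hypothetical lookup table; a real model predicts these probabilities.
    table = {
        (): {"The": 1.0},
        ("The",): {"cat": 1.0},
        ("The", "cat"): {"is": 0.8, "climbed": 0.2},
        ("The", "cat", "is"): {"a": 1.0},
        ("The", "cat", "is", "a"): {"small": 1.0},
        ("The", "cat", "is", "a", "small"): {"animal": 1.0},
    }
    return table.get(context, {"<end>": 1.0})

words: list[str] = []
while True:
    probs = next_word_probabilities(tuple(words))
    # Sample the next word in proportion to its predicted probability.
    word = random.choices(list(probs), weights=list(probs.values()))[0]
    if word == "<end>":
        break
    words.append(word)

print(" ".join(words))  # e.g. "The cat is a small animal"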

