What Does It Mean for AI to 'Learn'?
Training an AI involves analyzing patterns in vast amounts of example data so that the AI can accurately process new, unseen data.
From a technical standpoint, AI training revolves around a learning algorithm, a step-by-step procedure that adjusts the model until it generates the correct output for the examples and, ideally, for new input data as well.
To better understand how AI learns, let's walk through the process of training a model to classify spam emails.
1. Data Collection and Preprocessing
Before an AI model can learn, it requires a large dataset containing email examples. This raw data must be transformed into a format that the model can understand, a process known as preprocessing.
For example, converting a gender field to 1 for male and 0 for female, or mapping certain words to numbers according to a fixed rule, is part of preprocessing.
Handling missing data and removing duplicate data are also important preprocessing tasks.
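The preprocessing tasks described above can be sketched in a few lines of Python. This is a minimal illustration, not a production pipeline: the function names `clean_emails`, `build_vocabulary`, and `encode`, the sample emails, and the "number words in order of first appearance" rule are all assumptions made for the example.

```python
def clean_emails(emails):
    """Drop missing and duplicate emails, keeping first occurrences in order."""
    seen = set()
    cleaned = []
    for text in emails:
        # Handle missing data: skip entries that are None or blank.
        if text is None or not text.strip():
            continue
        # Normalize case and whitespace so near-identical copies compare equal.
        normalized = " ".join(text.lower().split())
        # Remove duplicate data: keep only the first copy.
        if normalized in seen:
            continue
        seen.add(normalized)
        cleaned.append(normalized)
    return cleaned


def build_vocabulary(emails):
    """Assign each distinct word an integer ID in order of first appearance."""
    vocabulary = {}
    for text in emails:
        for word in text.split():
            vocabulary.setdefault(word, len(vocabulary))
    return vocabulary


def encode(text, vocabulary):
    """Replace each known word with its integer ID; unknown words are skipped."""
    return [vocabulary[word] for word in text.split() if word in vocabulary]


# Hypothetical raw data containing a missing value, a blank, and a duplicate.
raw_emails = ["Win a FREE prize", None, "win a free  prize", "Meeting at noon", ""]
emails = clean_emails(raw_emails)
vocabulary = build_vocabulary(emails)
encoded = [encode(text, vocabulary) for text in emails]
print(emails)
print(encoded)
```

Five raw entries are reduced to two clean emails, and each email becomes a list of integers that a model can consume.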
2. Pattern Analysis (Learning Process)
Once preprocessed data is fed into the AI model, the learning process begins. The model extracts features from the data and identifies patterns relevant to its task.
- Data Input: The email text is fed into the AI model.
- Pattern Recognition: The model analyzes features such as word frequency, sender's email address, and email length. Initially, it makes random connections, but over time, it learns which features are important for distinguishing spam emails.
- Repeated Learning: The model undergoes thousands or even millions of training cycles, continuously refining its ability to classify emails as spam or non-spam.
Through this iterative process, the AI model improves its accuracy in identifying spam emails.
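As a rough sketch of this loop, the example below trains a tiny logistic regression model with gradient descent. It is one simple technique chosen for illustration, not necessarily what a real spam filter uses. The features, the four sample emails, the learning rate, and the number of epochs are assumptions, and the weights start at zero rather than at random values to keep the run reproducible.

```python
import math


def extract_features(text):
    """Turn an email into numbers: two trigger-word counts and a scaled length."""
    words = text.lower().split()
    return [words.count("free"), words.count("click"), len(words) / 10.0]


def predict_probability(weights, bias, features):
    """Logistic model: squash the weighted sum into a spam probability in (0, 1)."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def train(examples, labels, epochs=2000, learning_rate=0.5):
    """Apply many small corrections to the weights and bias (repeated learning)."""
    weights = [0.0] * len(examples[0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in zip(examples, labels):
            # Positive error: predicted too spammy. Negative: not spammy enough.
            error = predict_probability(weights, bias, features) - label
            # Nudge each weight against its contribution to the error.
            weights = [w - learning_rate * error * x for w, x in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias


# Hypothetical labeled data: 1 = spam, 0 = non-spam.
emails = [
    ("free prize click here", 1),
    ("click now for free money free", 1),
    ("meeting moved to noon", 0),
    ("please review the attached report", 0),
]
examples = [extract_features(text) for text, _ in emails]
labels = [label for _, label in emails]
weights, bias = train(examples, labels)

for text, label in emails:
    probability = predict_probability(weights, bias, extract_features(text))
    print(f"{label} {probability:.3f} {text}")
```

After training, the weights for "free" and "click" are positive, which is how this toy model encodes that those words point toward spam.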
3. Storage of Learned Information
Once trained, the model saves the extracted knowledge in files with formats such as .h5, .pkl, or .pb. These files contain numerical values that represent how the model evaluates different features.
Internally, AI models store their knowledge in weights and biases, which determine how much influence each feature has on the final prediction.
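To make the idea of a saved model concrete, here is a minimal sketch that writes weights and a bias to a .pkl file with Python's standard `pickle` module and reads them back. The parameter values and the file name `spam_model.pkl` are hypothetical, and frameworks that produce .h5 or .pb files use their own saving functions rather than this approach.

```python
import os
import pickle
import tempfile

# Hypothetical learned parameters; a real model would obtain these by training.
model_parameters = {"weights": [0.9, 0.7, -0.2], "bias": -0.5}

# Write the parameters to a .pkl file in a temporary directory.
model_path = os.path.join(tempfile.mkdtemp(), "spam_model.pkl")
with open(model_path, "wb") as file:
    pickle.dump(model_parameters, file)

# Later, or in another program, read the same numbers back.
with open(model_path, "rb") as file:
    restored_parameters = pickle.load(file)

print(restored_parameters == model_parameters)
```

Because unpickling can execute arbitrary code, .pkl files should only be loaded from sources you trust.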
Key Terms
- Weights: These determine the importance of specific features in the input data. For instance, if words like "free", "win", or "click" are common in spam emails, they will be assigned higher weights, increasing the likelihood of classification as spam.
- Bias: This is a constant added to the weighted sum of the inputs, shifting the model’s output up or down regardless of which features are present. If spam emails are common overall, the bias lets the model lean toward the spam class even in the absence of specific words.
The relationship between weights and bias is expressed through the following equation:
y = w_1 x_1 + w_2 x_2 + ... + w_n x_n + b
Here, y is the model's output (final result), w represents the weights, x is the input data, and b is the bias.
The bias b helps activate a neuron even if all input values are zero. In other words, it adjusts the result to calibrate the activation threshold of neurons.
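The equation translates directly into code. The sketch below uses illustrative weights and a bias that were not learned from real data, and the helper name `weighted_sum` is an assumption. The second call shows the point about the bias: with all-zero input, the output is exactly b.

```python
def weighted_sum(weights, inputs, bias):
    """Compute y = w1*x1 + w2*x2 + ... + wn*xn + b."""
    if len(weights) != len(inputs):
        raise ValueError("weights and inputs must have the same length")
    return sum(w * x for w, x in zip(weights, inputs)) + bias


weights = [0.5, -0.25, 1.0]  # illustrative values, not learned from real data
bias = 0.75

# 0.5*2.0 + (-0.25)*4.0 + 1.0*1.0 + 0.75
print(weighted_sum(weights, [2.0, 4.0, 1.0], bias))  # 1.75
# All-zero input leaves only the bias.
print(weighted_sum(weights, [0.0, 0.0, 0.0], bias))  # 0.75
```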
Weights and Bias Saved as Files
* Weights Matrix:
[
[0.2, -0.4, 0.6, 0.1],
[-0.3, 0.8, -0.5, 0.2],
[0.1, -0.2, 0.3, -0.6],
[0.7, 0.1, -0.4, 0.5]
]
* Bias Vector:
[0.1, -0.2, 0.3, 0.4]
Here, the weights matrix consists of four rows and four columns, with each weight element indicating the importance of a specific feature that the model has learned.
The bias vector contains four bias values, one per row, each added to that row's weighted sum during the model's prediction.
The signs and magnitudes demonstrate how the AI model evaluates each feature. A positive (+) value may imply a positive influence, while a negative (-) value can imply a negative influence.
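The following sketch applies the weights matrix and bias vector shown above to one input. It assumes each row of the matrix belongs to one output value, paired with the bias at the same position. The function name `layer_output` and the input vector are hypothetical.

```python
WEIGHTS = [
    [0.2, -0.4, 0.6, 0.1],
    [-0.3, 0.8, -0.5, 0.2],
    [0.1, -0.2, 0.3, -0.6],
    [0.7, 0.1, -0.4, 0.5],
]
BIASES = [0.1, -0.2, 0.3, 0.4]


def layer_output(weights, biases, inputs):
    """Return one value per row: the row's weighted sum of the inputs plus its bias."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


# Hypothetical input in which only the first of four features is present.
outputs = layer_output(WEIGHTS, BIASES, [1.0, 0.0, 0.0, 0.0])
print(outputs)
```

With only the first feature switched on, each output is roughly the first weight in its row plus that row's bias. The negative weight in the second row therefore pulls that output down, while the positive weights in the other rows push theirs up.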
4. Model Utilization
Once trained, the model is ready to process new emails.
- New Data Input: Input a new email into the model.
- Pattern Matching: The model analyzes the email’s features using the saved weights and predicts whether it is spam.
- Result Output: The model outputs whether the email is spam or a normal email.
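These three steps can be sketched as a single function. The word weights, the bias, and the 0.5 decision threshold below are hypothetical stand-ins for values that would normally be loaded from a saved model file, and the logistic scoring is one simple choice among many.

```python
import math

# Hypothetical parameters standing in for values loaded from a saved model file.
SAVED_WEIGHTS = {"free": 1.5, "win": 1.2, "click": 1.0}
SAVED_BIAS = -2.0


def classify(email_text, weights=SAVED_WEIGHTS, bias=SAVED_BIAS, threshold=0.5):
    """Return ("spam" or "normal", probability) for one new email."""
    # New data input: split the email into lowercase words.
    words = email_text.lower().split()
    # Pattern matching: weighted sum over counts of known words, plus the bias.
    score = sum(weight * words.count(word) for word, weight in weights.items()) + bias
    probability = 1.0 / (1.0 + math.exp(-score))
    # Result output: compare the probability with the decision threshold.
    label = "spam" if probability >= threshold else "normal"
    return label, probability


print(classify("Click now to win a free prize"))
print(classify("Lunch at noon tomorrow?"))
```

The first email contains all three weighted words and is labeled spam. The second contains none of them, so only the negative bias contributes and it is labeled normal.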