Types of AI and Generative AI

In the previous chapter, we looked at how AI learns and how it performs intellectual tasks. In this chapter, we will explore the major types of AI through real-world examples. We will also examine what makes generative AI different from traditional AI, and why it is advancing so rapidly.

AI can be broadly divided into four types:

  • Perception: recognizing what something is
  • Prediction: forecasting what will happen next
  • Decision-making: choosing what action to take
  • Generation: creating new outputs

These four types are often combined within a single service, although most services lean mainly on one or two of them. For example, autonomous driving requires both "perception" and "decision-making," while YouTube recommendations are driven mainly by "prediction" and chatbots center on "generation."

1. Perception AI: The Ability to "Classify" the World

Perception AI classifies and identifies what an input is. It excels at problems with predetermined correct answers and typically handles unstructured data.

What Is Unstructured Data?

Data that is not organized in a structured format like rows and columns in a spreadsheet is called unstructured data. This includes photos, audio, video, and free-form text: data without a fixed format or structure.

Image Recognition: Identifying Subjects in Photos

The feature in smartphone photo apps that automatically categorizes photos as "cat," "ocean," or "food" is a classic example of perception AI in action. A photo looks like a single image to a human, but inside a computer it is represented as a collection of countless pixel values. Perception AI learns from large numbers of example images to understand the numerical patterns common to specific subjects, then compares a new image against those learned patterns to calculate the probability of each category and selects the most likely result.
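The last step described above, turning scores into a probability for each category and picking the most likely one, can be sketched as follows. This is a minimal illustration, not how any particular photo app works: the raw scores are hypothetical numbers standing in for what a trained model might output, and softmax is one common way to convert such scores into probabilities.

```python
import math

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    highest = max(scores)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify(scores, labels):
    """Return the most likely label and the probability of every label."""
    probs = softmax(scores)
    best = max(range(len(labels)), key=lambda i: probs[i])
    return labels[best], dict(zip(labels, probs))

# Hypothetical raw scores for one photo (assumed values, not real model output).
labels = ["cat", "ocean", "food"]
scores = [2.0, 0.5, 0.1]

label, probs = classify(scores, labels)
print(label, {name: round(p, 2) for name, p in probs.items()})
```

The category with the highest score always receives the highest probability; softmax only rescales the scores so they can be read as probabilities.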

Real-world examples include:

  • Smartphone face unlock: Analyzes facial landmarks (the position and proportions of eyes, nose, and mouth) to verify a match with the registered user.
  • SNS auto-tag suggestions: Identifies people in photos and suggests who they might be.
  • Text OCR (Optical Character Recognition): Recognizes characters in images and converts them into text that computers can process. For example, it can automatically extract the store name, date, and amount from a receipt photo.

Speech Recognition: Converting Spoken Words into Text

As we saw earlier, the spoken word "hello" is, to a computer, a numerical array of vibration values changing over time. A microphone converts air vibrations into electrical signals, which are then translated into a large number of numerical values. Perception AI analyzes these numerical patterns to determine which sounds they correspond to and converts them into text.
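To make "a numerical array of vibration values" concrete, the sketch below samples a simple tone into a list of numbers. It is only an illustration: a pure 440 Hz sine wave stands in for real speech, and the sample rate of 8000 values per second is an assumed setting.

```python
import math

def sample_tone(frequency_hz, duration_s, sample_rate=8000):
    """Return the vibration values of a pure tone, one value per sample."""
    count = round(duration_s * sample_rate)
    return [
        math.sin(2 * math.pi * frequency_hz * i / sample_rate)
        for i in range(count)
    ]

# 0.01 seconds of sound already becomes 80 numbers at this sample rate.
samples = sample_tone(440, 0.01)
print(len(samples), [round(s, 3) for s in samples[:5]])
```

Real recordings are far less regular than this tone, but they are stored in the same way: as a long sequence of numbers.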

Real-world examples include:

  • Automatic call transcription: Converts the content of a phone call from speech to text.
  • Automatic meeting notes: Transcribes a presenter's words in real time to generate meeting minutes.
  • Voice command recognition: Recognizes short commands like "set a 10-minute timer" and executes the corresponding function.

Medical Imaging Analysis: Distinguishing Normal from Abnormal

In medicine, distinguishing "normal" from "abnormal" is critically important. Perception AI analyzes medical images such as X-rays, CT scans, and MRIs to flag areas that may indicate an abnormality, assisting doctors in their assessments. AI does not replace the doctor's diagnosis. It often plays a supporting role by detecting subtle patterns that a human might miss and giving the doctor additional information.

The key benchmark for perception AI is accuracy. Misclassification can lead to safety issues or financial losses, so model performance is validated through rigorous evaluation processes.
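As a rough sketch of what measuring accuracy means, the snippet below compares a model's predictions with the known correct labels and reports the fraction that match. The labels are made-up examples, and real evaluations usually use much larger test sets and additional metrics.

```python
def accuracy(predictions, truths):
    """Fraction of predictions that match the correct labels."""
    if len(predictions) != len(truths) or not truths:
        raise ValueError("need two non-empty lists of equal length")
    correct = sum(p == t for p, t in zip(predictions, truths))
    return correct / len(truths)

# Hypothetical predictions and correct answers for five photos.
predicted = ["cat", "ocean", "food", "cat", "food"]
actual = ["cat", "ocean", "cat", "cat", "food"]

score = accuracy(predicted, actual)
print(f"accuracy = {score:.2f}")  # 4 of the 5 predictions are correct
```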

2. Prediction AI: The Ability to "Anticipate" What Comes Next

Prediction AI handles problems that do not have a single fixed correct answer. Instead, it calculates various possibilities and presents the most probable result. The core task is not to assert what will happen, but to calculate "what is most likely to happen."

Recommendation Systems: Calculating What You Will Choose Next

Services like YouTube, Netflix, and Spotify predict what a user will watch (or listen to) next. On the surface this looks like understanding a user's taste, but in practice it is closer to analyzing patterns in behavioral data.

For example, a recommendation system uses information such as:

  • Which videos were clicked
  • How long they were watched (completion rate)
  • Whether content was skipped
  • What users with similar behavioral patterns chose

Based on this data, the system calculates "this user is likely to choose this content next" and assembles a recommendation list. Platforms seem to know our taste after a while, but what they are actually doing is predicting from statistical similarities in behavioral patterns, not understanding taste itself.
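One simple way to turn "what users with similar behavioral patterns chose" into a recommendation is sketched below. It is an assumption-laden toy, not how YouTube, Netflix, or Spotify actually work: the user names, video names, and completion rates are invented, and cosine similarity is just one possible similarity measure.

```python
import math

# Hypothetical completion rates (0.0 to 1.0) for each user and video.
watch_history = {
    "alice": {"cooking_1": 0.9, "cooking_2": 0.8, "travel_1": 0.1},
    "bob": {"cooking_1": 0.95, "cooking_2": 0.7, "cooking_3": 0.9},
    "carol": {"travel_1": 0.9, "travel_2": 0.85, "cooking_1": 0.05},
}

def cosine_similarity(a, b):
    """Similarity of two users based on the videos they both watched."""
    shared = set(a) & set(b)
    dot = sum(a[v] * b[v] for v in shared)
    norm_a = math.sqrt(sum(x * x for x in a.values()))
    norm_b = math.sqrt(sum(x * x for x in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def recommend(user, history, top_n=2):
    """Rank unseen videos by how much similar users watched them."""
    scores = {}
    for other, watched in history.items():
        if other == user:
            continue
        similarity = cosine_similarity(history[user], watched)
        for video, completion in watched.items():
            if video not in history[user]:
                scores[video] = scores.get(video, 0.0) + similarity * completion
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

recommendations = recommend("alice", watch_history)
print(recommendations)
```

Here "alice" behaves much more like "bob" than like "carol", so the video that "bob" watched almost to the end ranks first. Nothing in the code knows what cooking or travel is; it only compares numbers.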

Demand Forecasting: Calculating How Much Will Be Needed

Prediction AI is also used to anticipate future demand. For example, if a convenience store has too much prepared food left over, it loses money; too little, and it misses sales opportunities. Prediction AI combines information about the date, day of the week, weather, local characteristics, and events like exam periods or holidays to forecast sales volumes.

Real-world examples include:

  • Delivery apps: Predict likely spikes in orders during specific time slots and adjust pricing or driver allocation strategies accordingly.
  • Public transportation systems: Predict congestion levels by time of day and adjust service frequency.
  • School cafeterias: Use preference predictions for specific menus to decide how much to prepare.
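The convenience store case can be sketched with a very simple forecaster that averages past sales recorded under similar conditions. The sales records are invented, and averaging is only a stand-in for the statistical or machine learning models a real system would use.

```python
from statistics import mean

# Hypothetical past sales of prepared meals: (day of week, weather, units sold).
history = [
    ("Mon", "sunny", 40), ("Mon", "rainy", 28), ("Mon", "sunny", 44),
    ("Sat", "sunny", 70), ("Sat", "rainy", 50), ("Sat", "sunny", 66),
]

def forecast(day, weather, records):
    """Estimate sales from past days with the most similar conditions."""
    exact = [units for d, w, units in records if d == day and w == weather]
    if exact:
        return mean(exact)
    same_day = [units for d, _, units in records if d == day]
    if same_day:
        return mean(same_day)  # fall back to the day of the week only
    return mean(units for _, _, units in records)  # fall back to everything

print(forecast("Sat", "sunny", history))
print(forecast("Mon", "cloudy", history))
```

The fallbacks reflect the point made in this section: the forecaster always returns its best estimate from the information it has, even when no past day matches exactly.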

Anomaly Detection: Identifying Warning Signs Early

Prediction AI is also used to detect signals before a problem occurs, rather than after the fact. This is closely tied to risk management.

  • Financial fraud detection: Analyzes unusual payment times, locations, or amount patterns to warn of potential fraud.
  • Server failure prediction: Analyzes trends in traffic increases or changes in error log patterns to predict potential failures in advance.
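A minimal sketch of the fraud detection idea is shown below: a new payment is flagged when it lies far outside the range of a customer's past payments. The amounts and the threshold of three standard deviations are assumptions for illustration; production systems combine many more signals, such as time and location.

```python
from statistics import mean, stdev

def find_anomalies(past_amounts, new_amounts, threshold=3.0):
    """Flag new amounts that are far from the customer's usual spending."""
    average = mean(past_amounts)
    spread = stdev(past_amounts)
    flagged = []
    for amount in new_amounts:
        # z-score: how many standard deviations the amount is from the average
        z = (amount - average) / spread if spread else 0.0
        if abs(z) > threshold:
            flagged.append(amount)
    return flagged

# Hypothetical payment amounts for one customer.
past = [12.0, 15.5, 9.9, 14.2, 11.0, 13.3, 10.8, 12.7]
new = [13.0, 480.0, 11.5]

flagged = find_anomalies(past, new)
print(flagged)
```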

The core of prediction AI is probability calculation and risk management. The goal is not to produce a 100% accurate answer, but to present the most likely option given limited information and to detect potential risks early.

3. Decision-Making AI: The Ability to "Choose" What to Do

Decision-making AI goes beyond classifying correct answers or predicting the future. It decides what action to take in order to achieve a goal. This typically involves the concept of a reward. A reward is a numerical evaluation of how good the outcome of a particular action was. Actions that produce good outcomes receive high rewards; actions with undesirable outcomes receive low rewards. AI learns to act in ways that maximize these rewards.
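The idea of learning to maximize rewards can be sketched with a small trial-and-error loop. This is a simplified illustration under stated assumptions: the three actions and their average rewards are invented, and the "epsilon-greedy" strategy used here (mostly pick the best-known action, occasionally try a random one) is just one basic approach.

```python
import random

def learn_action_values(true_rewards, steps=1000, epsilon=0.1, seed=0):
    """Estimate the average reward of each action by trying them repeatedly."""
    rng = random.Random(seed)
    actions = list(true_rewards)
    estimates = {action: 0.0 for action in actions}
    counts = {action: 0 for action in actions}
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(actions)  # explore: try a random action
        else:
            action = max(actions, key=estimates.get)  # exploit the best so far
        # The observed reward is the action's average reward plus random noise.
        reward = true_rewards[action] + rng.gauss(0, 1)
        counts[action] += 1
        # Update the running average of rewards seen for this action.
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates

# Hypothetical average rewards; the learner never sees these values directly.
true_rewards = {"action_a": 1.0, "action_b": 5.0, "action_c": 10.0}

estimates = learn_action_values(true_rewards)
best_action = max(estimates, key=estimates.get)
print(best_action)
```

The learner is never told which action is best. It finds out only from the rewards it receives, which is the core of the approach described above.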

Autonomous Driving: A Combination of Perception and Decision-Making

An autonomous driving system is a prime example of perception AI and decision-making AI working together. First, cameras and sensors identify pedestrians, traffic lights, lane markings, and surrounding vehicles. Then, in the next step, actions are chosen: "should I stop, slow down, or change lanes?" The goal is not simply to move quickly but to make choices that satisfy both safety and traffic rules at once.

Real-world examples include:

  • Slowing down or stopping when a pedestrian approaches
  • Braking immediately when the car ahead makes a sudden stop
  • Adjusting the departure timing in anticipation of a changing traffic light

This is the role of decision-making AI: considering various circumstances holistically and selecting the most appropriate action.

Game AI: Choosing Strategies Toward a Goal

Decision-making AI is also used in games. A classic example is the Go AI AlphaGo. It analyzes the current state of the board, then calculates "which move has the highest probability of winning" and selects that move. This is not simply following predetermined rules. It involves learning from vast amounts of game data, simulating countless possible sequences, and finding the most advantageous choice.
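The idea of looking ahead through possible sequences of moves can be shown on a game far smaller than Go. The sketch below is not AlphaGo's method, which combines learned neural networks with tree search; it uses plain exhaustive search on an assumed toy game in which two players take turns removing 1 to 3 stones and whoever takes the last stone wins.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def can_win(stones):
    """True if the player about to move can force a win."""
    # A move is good if it leaves the opponent in a position they cannot win.
    return any(
        not can_win(stones - take) for take in (1, 2, 3) if take <= stones
    )

def best_move(stones):
    """Choose how many stones to take by looking ahead to the end of the game."""
    for take in (1, 2, 3):
        if take <= stones and not can_win(stones - take):
            return take
    return 1  # every move loses against perfect play, so just take one

print(best_move(10))
```

With 10 stones, taking 2 leaves 8, a position from which the opponent cannot force a win. Go has far too many positions to search this way, which is why learning from data is needed to guide the search.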

The core of decision-making AI lies in setting goals and evaluation criteria. What counts as success, and how highly each outcome is scored, completely changes how the AI behaves. Therefore, in decision-making AI, not only the algorithm itself, but also how the objective is defined and how rewards are designed, is critically important.
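How much the reward design matters can be seen in a small sketch where the same situation leads to different choices under different reward definitions. All numbers are invented for illustration: each action has an assumed amount of progress and an assumed collision risk, and `risk_penalty` is a hypothetical parameter that controls how heavily risk is punished.

```python
# Hypothetical outcomes for each action: (progress gained, collision risk).
outcomes = {
    "maintain_speed": (1.0, 0.30),
    "slow_down": (0.6, 0.05),
    "stop": (0.0, 0.00),
}

def choose_action(outcomes, risk_penalty):
    """Pick the action with the highest reward under the given penalty."""
    def reward(action):
        progress, risk = outcomes[action]
        return progress - risk_penalty * risk
    return max(outcomes, key=reward)

print(choose_action(outcomes, risk_penalty=1.0))    # maintain_speed
print(choose_action(outcomes, risk_penalty=10.0))   # slow_down
print(choose_action(outcomes, risk_penalty=100.0))  # stop
```

The selection logic is identical in all three calls. Only the definition of a good outcome changes, and the chosen behavior changes with it.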

4. Generative AI: AI That Creates New Outputs

Generative AI goes beyond the traditional AI approach of classifying or selecting from existing answers. It creates new outputs: sentences, images, sounds, and video. This is what has led so many people to reconsider the possibilities of artificial intelligence.

How Is Generative AI Used?

Generative AI is used in a wide variety of contexts to produce drafts and organize ideas.

  • Students: summarizing text, reviewing concepts, answering questions, drafting presentations
  • Teachers: drafting lesson plans, suggesting exam questions, designing rubrics for assessments
  • Office workers: organizing meeting notes, drafting emails, structuring report outlines, assisting with coding
  • Designers and content creators: generating image mockups, writing ad copy, generating background music, drafting video storyboards

Note that generative AI does not "create from nothing." It learns patterns from vast training data and generates new combinations based on those patterns.

Generation Is Ultimately a Continuous Series of Predictions

Generative AI does not produce a complete output all at once. It quickly repeats small-scale predictions, progressively assembling a result. Understanding what unit of prediction is used in each domain makes the underlying principle clearer.

  • Text generation: Predicts the next word (or token) one at a time. These predicted words chain together sequentially to form paragraphs and complete texts.
  • Image generation: Progressively predicts and refines pixel colors, brightness values, or noise patterns to gradually sharpen the overall image.
  • Speech generation: Continuously predicts the waveform values or voice characteristics of the sound that follows over time to produce natural-sounding speech.
  • Video generation: Predicts changes in the image for the next frame in a sequence of scenes to ensure smooth, natural movement.
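The text generation case can be sketched with a toy model that counts which word follows which in a tiny sample text, then repeatedly appends the most frequent next word. This is a deliberately simplified stand-in: the sample sentence is invented, and real systems predict tokens with large neural networks and usually sample among likely candidates instead of always taking the top one.

```python
from collections import Counter, defaultdict

def train_bigram(text):
    """Count, for every word, which words follow it and how often."""
    words = text.split()
    counts = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts

def generate(counts, start, max_words=8):
    """Build a text by predicting one next word at a time."""
    output = [start]
    for _ in range(max_words - 1):
        followers = counts.get(output[-1])
        if not followers:
            break  # the last word was never followed by anything
        output.append(followers.most_common(1)[0][0])
    return " ".join(output)

corpus = "the cat sat on the mat and the cat sat on the sofa"
model = train_bigram(corpus)
print(generate(model, "the", max_words=5))  # the cat sat on the
```

Each loop iteration makes one small prediction, and the output is nothing more than those predictions chained together.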

Generative AI may look like it is "creating," but in reality it is closer to a process of continuously appending the most statistically natural next step. Small predictions, repeated hundreds or thousands of times and chained together, produce a complete document, image, audio file, or video. Generative AI can therefore be understood as AI that assembles outputs by performing predictions in continuous succession.

5. How Is Generative AI Different from Other Types of AI?

Generative AI operates differently from perception, prediction, and decision-making AI. While those three types focus primarily on "classifying" and "selecting," generative AI focuses on "expressing and producing." The table below compares what question each type of AI answers, what form its output takes, and what representative examples look like.

| Category | Perception AI | Prediction AI | Decision-Making AI | Generative AI |
|---|---|---|---|---|
| Typical question | "What is this?" | "What will happen next?" | "What should I do?" | "Create something new" |
| Output form | Classification result | Probability / score | Action selection | Text / image / code |
| Representative examples | Face recognition, speech recognition | Recommendation systems, demand forecasting | Autonomous driving, game AI | Chatbots, image generation |

In the next chapter, we will take a deeper look at the process by which AI learns.