
Merging and Joining DataFrames

In real-world data tasks, information is often spread across multiple tables. For example, one DataFrame might hold customer info, while another has their orders. To analyze them together, you’ll need to merge or join the datasets.


Merge and Join Basics

Pandas provides flexible tools for combining data:

  • pd.merge() combines rows from two DataFrames based on matching column values (like SQL joins).
  • .join() is a DataFrame method that adds columns from another DataFrame, matching on the index by default or on a key column of the calling DataFrame (passed as on=).
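
To make the difference between the two tools concrete, here is a minimal sketch. The customers and orders tables, their column names, and their values are invented for illustration; only pd.merge(), DataFrame.join(), and DataFrame.set_index() come from pandas itself.

```python
import pandas as pd

# Hypothetical sample data, invented for this illustration.
customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "name": ["Ana", "Ben", "Chloe"],
})
orders = pd.DataFrame({
    "customer_id": [1, 1, 3],
    "amount": [25.0, 40.0, 15.5],
})

# pd.merge() matches rows on the values of a shared key column.
# With no `how` argument it performs an inner join.
merged = pd.merge(customers, orders, on="customer_id")

# .join() matches against the index of the other DataFrame, so the
# customers table is indexed by its key first. `on` names the column
# in the calling DataFrame (orders) to look up in that index.
joined = orders.join(customers.set_index("customer_id"), on="customer_id")

print(merged)
print(joined)
```

Both results pair each order with its customer's name. The merge version works directly on columns, while the join version relies on the right-hand table being indexed by the key.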

Common Join Types

Join Type    Description
Inner        Only rows with matching keys on both sides are kept (the default for pd.merge()).
Left         All rows from the left DataFrame, plus matches from the right (the default for .join()).
Right        All rows from the right DataFrame, plus matches from the left.
Outer        All rows from both sides; values with no match are filled with NaN.

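The join type controls which rows end up in the result: only strict matches (inner), every row from one side (left or right), or the full combination of both (outer). Wherever a kept row has no match on the other side, the columns from that side are filled with NaN.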
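
The four join types can be compared side by side with the how parameter of pd.merge(). This is a sketch on two hypothetical toy tables (the key, left_val, and right_val names are made up) in which keys "b" and "c" exist on both sides, "a" only on the left, and "d" only on the right.

```python
import pandas as pd

# Hypothetical toy tables, invented for this illustration.
left = pd.DataFrame({"key": ["a", "b", "c"], "left_val": [1, 2, 3]})
right = pd.DataFrame({"key": ["b", "c", "d"], "right_val": [20, 30, 40]})

# Run the same merge once per join type and keep each result.
results = {}
for how in ["inner", "left", "right", "outer"]:
    results[how] = pd.merge(left, right, on="key", how=how)
    print(f"--- how={how!r}: {len(results[how])} rows ---")
    print(results[how])
```

The inner join keeps 2 rows, the left and right joins keep 3 each, and the outer join keeps all 4 keys. In the left, right, and outer results, the unmatched rows show NaN in the columns that come from the other table.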


A Simple Example

Imagine two tables:

  • One has employee names and IDs.
  • Another has IDs and department names.

You can merge them using the ID column as the key.
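
A minimal sketch of that merge might look like the following. The employee names, IDs, and departments are hypothetical, as are the column names id, name, and department.

```python
import pandas as pd

# Hypothetical employee and department tables sharing an "id" column.
employees = pd.DataFrame({
    "id": [101, 102, 103],
    "name": ["Alice", "Bob", "Carol"],
})
departments = pd.DataFrame({
    "id": [101, 102, 104],
    "department": ["Engineering", "Sales", "Finance"],
})

# Default (inner) merge: only IDs present in both tables survive.
matched = pd.merge(employees, departments, on="id")

# Left merge: every employee is kept; Carol (id 103) has no
# department row, so her department value is NaN.
all_staff = pd.merge(employees, departments, on="id", how="left")

print(matched)
print(all_staff)
```

Here the inner merge returns Alice and Bob only, while the left merge also keeps Carol with a missing department. ID 104 appears in neither result because no employee has it.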


We’ll try this in Jupyter next to see how these operations work in practice.