Handling HTML and JSON Responses
When using web APIs or scraping web pages, the responses from the server can come in various formats.
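One way to tell which format a response uses is to look at its Content-Type header. The sketch below is only an illustration under that assumption: the helper name detect_format is hypothetical (it is not part of requests), and it expects the header value as a string, for example response.headers.get('Content-Type', '').

```python
def detect_format(content_type: str) -> str:
    """Classify a Content-Type header value as 'json', 'html', or 'other'.

    Hypothetical helper for illustration; not part of requests.
    """
    # Drop parameters such as "; charset=utf-8" and normalize case
    media_type = content_type.split(";")[0].strip().lower()
    if media_type == "application/json" or media_type.endswith("+json"):
        return "json"
    if media_type in ("text/html", "application/xhtml+xml"):
        return "html"
    return "other"


print(detect_format("text/html; charset=UTF-8"))  # html
print(detect_format("application/json"))          # json
```

Servers do not always set this header accurately, so treat the result as a hint rather than a guarantee.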
1. HTML Responses
For web pages, responses usually come in HTML format. After requesting an HTML page with requests, you can parse the static HTML and extract the desired data using a library like BeautifulSoup.
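import requests
from bs4 import BeautifulSoup
# Request the page; raise an exception on an HTTP error status (4xx/5xx)
response = requests.get('https://example.com', timeout=10)
response.raise_for_status()
# Parse the HTML with Python's built-in parser
soup = BeautifulSoup(response.text, 'html.parser')
# Extract webpage title (soup.title is None when the page has no <title> tag)
title = soup.title.get_text() if soup.title else None
print(title)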
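Beyond the title lookup above, the same parsed object can be searched for other tags and attributes. The following is a minimal sketch that works offline: the HTML string is made up for this example, and it assumes the bs4 package is installed.

```python
from bs4 import BeautifulSoup

# Made-up HTML used only for this example
html = """
<html>
  <head><title>Sample Page</title></head>
  <body>
    <h1>Welcome</h1>
    <a href="/about">About</a>
    <a href="/contact">Contact</a>
  </body>
</html>
"""

soup = BeautifulSoup(html, "html.parser")

# find() returns the first match, or None when nothing matches
heading = soup.find("h1").get_text()

# find_all() returns every matching tag; get("href") reads an attribute
links = [a.get("href") for a in soup.find_all("a")]

# Guard against a missing element instead of calling .get_text() on None
footer = soup.find("footer")
footer_text = footer.get_text() if footer else None

print(heading)      # Welcome
print(links)        # ['/about', '/contact']
print(footer_text)  # None
```

Checking for None before reading text is worth the extra line when scraping real pages, since a page may not contain the tag you expect.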
2. JSON Responses
Most web APIs return their responses in JSON format.
When you receive a JSON response, you can convert it into a Python object using the response.json() method.
A JSON object becomes a Python dictionary (and a JSON array becomes a list), making it easy to use the data in Python code.
You can also use Python's json module, for example json.loads(response.text), to convert JSON data into a Python object.
import requests
response = requests.get('https://httpbin.org/get', timeout=10)
response.raise_for_status()
# Parse JSON data into a Python dictionary
data = response.json()
url = data['url']
# Print JSON data
print(data)
print('-' * 20)
# Utilize JSON data
print("URL:", url)
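The json module mentioned earlier can do the same conversion on any JSON string. Here is a small offline sketch: the JSON text is made up to stand in for response.text, and only the "url" key mirrors the example above.

```python
import json

# Made-up JSON text standing in for response.text
raw = '{"url": "https://httpbin.org/get", "args": {}, "tags": ["a", "b"]}'

data = json.loads(raw)          # JSON object -> dict

print(type(data).__name__)      # dict
print(data["url"])              # https://httpbin.org/get
print(data["tags"][1])          # b

# .get() returns a default instead of raising KeyError for a missing key
print(data.get("origin", "unknown"))  # unknown

# json.dumps() goes the other way: Python object -> JSON text
print(json.dumps(data["tags"]))  # ["a", "b"]
```

Using .get() with a default is a reasonable choice when an API may omit a field, while indexing with square brackets fails loudly if the key is missing.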