
How to Crawl Static Stock Data

If stock data is provided in a static format, web crawling can be performed using just the requests and BeautifulSoup libraries, as shown in the practice example.

In this lesson, we will learn how to crawl the data in a mock stock table like the one found at this link.

Company Name   Current Price   Change   Rate of Change
Company A      1064            26       2.44%
Company B      1458            -35      -2.40%
Company C      1991            49       2.46%
Company D      2595            22       0.85%
Company E      3074            -36      -1.17%
Company F      598             2        0.33%

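The example table data is static: the values are part of the HTML that the server returns, so they do not change unless the page is refreshed, and no JavaScript needs to run before they can be read.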


Code Explanation

As in the earlier lessons, the crawl uses the requests and BeautifulSoup libraries. One extra step matters here: setting response.encoding = "utf-8" before reading response.text, so that non-ASCII characters are decoded correctly even if the server does not declare a character set.

We will also explore more advanced usage of the find method like find("td", {"class": "company-cell"}) and find_all("tr").
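The difference between the two calls can be sketched on a small hand-written HTML string. The markup and values below are assumptions for illustration, not the lesson page's actual HTML; only the class name company-cell comes from the lesson.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration; not the lesson page's actual HTML.
sample_html = """
<table>
  <tr><td class="company-cell">Company A</td><td>1064</td></tr>
  <tr><td class="company-cell">Company B</td><td>1458</td></tr>
</table>
"""

soup = BeautifulSoup(sample_html, "html.parser")

# find() returns only the first matching element, or None if nothing matches.
first_cell = soup.find("td", {"class": "company-cell"})

# find_all() returns a list-like collection of every matching element.
rows = soup.find_all("tr")

print(first_cell.text)  # Company A
print(len(rows))        # 2
```

Passing a dictionary as the second argument filters on attributes, which is why the same pattern works for both class and id.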

Step 1

Fetching HTML from a Web Page
import requests

response = requests.get(url)
response.encoding = "utf-8"
html_content = response.text
  • requests.get(url): Sends a request to the specified URL and retrieves the webpage data.
  • response.encoding = "utf-8": Sets the response encoding to UTF-8 to prevent character issues like broken Unicode text.
  • html_content = response.text: Stores the received HTML content in text format.
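To see why the encoding line matters, the following sketch decodes the same bytes two ways without touching the network. It assumes a server that sends UTF-8 bytes but declares no charset for a text response, in which case requests falls back to ISO-8859-1; the sample string is made up for illustration.

```python
# Bytes as a server might send them: UTF-8 encoded text (sample value is made up).
raw = "Café ₩1,064".encode("utf-8")

# Decoding with the wrong charset yields garbled text instead of an error.
wrong = raw.decode("iso-8859-1")

# Decoding as UTF-8, which response.encoding = "utf-8" asks requests to do.
right = raw.decode("utf-8")

print(repr(wrong))
print(repr(right))
```

Only the second result matches the original text, which is the effect the explicit encoding assignment is meant to guarantee.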

Step 2

Parsing HTML
from bs4 import BeautifulSoup
soup = BeautifulSoup(html_content, "html.parser")
  • Creates a BeautifulSoup object that parses the HTML content, enabling easy access to HTML elements.

Step 3

Locating the Stock Table
stock_table = soup.find("table", {"id": "stock-table"})
  • Uses the soup.find() method to locate the table element containing stock data (<table id="stock-table">) within the HTML.
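Because soup.find() returns None when nothing matches, a later call such as stock_table.find("tbody") would raise an AttributeError if the table were missing. A minimal sketch of checking for that case, using a hypothetical helper find_stock_table and made-up HTML strings:

```python
from bs4 import BeautifulSoup


def find_stock_table(html):
    """Return the <table id="stock-table"> element, or None if it is absent."""
    soup = BeautifulSoup(html, "html.parser")
    return soup.find("table", {"id": "stock-table"})


# Hypothetical inputs for illustration.
with_table = '<div><table id="stock-table"><tbody></tbody></table></div>'
without_table = "<div><p>No table here</p></div>"

print(find_stock_table(with_table) is not None)  # True
print(find_stock_table(without_table) is None)   # True
```

Testing the result against None before using it gives a clear failure point when the page layout changes.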

Step 4

Extracting Data from the Table
for row in stock_table.find("tbody").find_all("tr"):
  • stock_table.find("tbody").find_all("tr"): Iterates over every row (<tr>) in the table’s <tbody> section.

Extracts the following data from each row:

  • Company name: Text from the <td> element with class="company-cell".
  • Current price: Text from the <td> element with class="current-price-cell".
  • Price change: Text from the <td> element with class="diff-cell".
  • Rate of change: Text from the <td> element with class="fluct-cell".
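The loop and the four lookups above could be put together roughly as follows. The HTML string is a hand-written stand-in that assumes the page uses the id and class names listed in this lesson; the records list is a hypothetical addition for collecting the results.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the lesson page; values taken from the sample table.
sample_html = """
<table id="stock-table">
  <tbody>
    <tr>
      <td class="company-cell">Company A</td>
      <td class="current-price-cell">1064</td>
      <td class="diff-cell">26</td>
      <td class="fluct-cell">2.44%</td>
    </tr>
    <tr>
      <td class="company-cell">Company B</td>
      <td class="current-price-cell">1458</td>
      <td class="diff-cell">-35</td>
      <td class="fluct-cell">-2.40%</td>
    </tr>
  </tbody>
</table>
"""

soup = BeautifulSoup(sample_html, "html.parser")
stock_table = soup.find("table", {"id": "stock-table"})

records = []
for row in stock_table.find("tbody").find_all("tr"):
    # Each lookup targets one cell in the current row by its class name.
    company_name = row.find("td", {"class": "company-cell"}).text.strip()
    current_price = row.find("td", {"class": "current-price-cell"}).text.strip()
    price_change = row.find("td", {"class": "diff-cell"}).text.strip()
    change_percentage = row.find("td", {"class": "fluct-cell"}).text.strip()
    records.append((company_name, current_price, price_change, change_percentage))

print(records)
```

Calling find() on row rather than on soup keeps each lookup scoped to a single table row, so values from different companies cannot be mixed up.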

Step 5

Output
print(f"{company_name}: Current Price {current_price}, Change {price_change}, Rate of Change {change_percentage}")
  • Formats and prints the extracted data.
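The extracted values are strings. If numbers are needed for sorting or arithmetic, one possible approach is sketched below; parse_row is a hypothetical helper, and it assumes the cells hold plain integers and a percentage with a trailing % sign, as in the sample table.

```python
def parse_row(current_price, price_change, change_percentage):
    """Convert the three text cells into an int, an int, and a float."""
    return (
        int(current_price),
        int(price_change),
        float(change_percentage.rstrip("%")),
    )


# Values for Company B from the sample table.
price, change, pct = parse_row("1458", "-35", "-2.40%")

line = f"Company B: Current Price {price:,}, Change {change:+d}, Rate of Change {pct:+.2f}%"
print(line)  # Company B: Current Price 1,458, Change -35, Rate of Change -2.40%
```

A real page may format its numbers differently (for example with thousands separators), in which case the conversion would need adjusting.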

