Web Scraping U.S. Stock Indices with Selenium

In the previous lesson, we introduced the requests and BeautifulSoup libraries, which can be used to extract desired data by fetching the HTML code of a specific web page.

However, if a web page is dynamically generated, meaning its content is filled in or updated by JavaScript after the initial page load (often in response to user interactions), requests and BeautifulSoup alone are not sufficient to extract the desired data.

Modern websites often receive constantly changing data from the server and display it to the user; such web pages are called dynamic web pages.

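Because requests only downloads the HTML the server initially sends, and neither requests nor BeautifulSoup executes JavaScript, data that the page loads afterward never appears in what they parse. We therefore need another method to extract dynamic data.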

In such cases, we use the Selenium library to scrape data from dynamic web pages.
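
To make the limitation concrete, here is a minimal sketch of what a static parser sees. It uses Python's built-in html.parser as a stand-in for BeautifulSoup so that it runs without extra packages; BeautifulSoup would find the same empty element, because neither tool runs JavaScript. The HTML, the `index-price` id, and the value `1234.56` are made up for illustration.

```python
from html.parser import HTMLParser

# Hypothetical HTML as a server might send it: the price is filled in
# later by JavaScript running in the browser.
RAW_HTML = """
<html><body>
  <span id="index-price"></span>
  <script>
    document.getElementById("index-price").textContent = "1234.56";
  </script>
</body></html>
"""


class SpanTextParser(HTMLParser):
    """Collect the text found inside <span id="index-price">."""

    def __init__(self):
        super().__init__()
        self.inside_target = False
        self.text = ""

    def handle_starttag(self, tag, attrs):
        if tag == "span" and dict(attrs).get("id") == "index-price":
            self.inside_target = True

    def handle_endtag(self, tag):
        if tag == "span":
            self.inside_target = False

    def handle_data(self, data):
        if self.inside_target:
            self.text += data


parser = SpanTextParser()
parser.feed(RAW_HTML)

# The span is empty because the script was never executed.
print(repr(parser.text))
```

A real browser would run the script and show the value, which is why a browser-driving tool is needed for this kind of page.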


Introduction to the Selenium Library

Selenium is a library for automating web browsers, and it is widely used to test web pages.

Since it can directly control a web browser, it can perform tasks such as scraping dynamic data, clicking specific elements on a web page, and entering data into them.


Practical Example: Scraping U.S. Stock Indices with Selenium

In this practical example, we will use Selenium to scrape real-time U.S. stock indices from the Yahoo Finance website.

U.S. Stock Indices Scraping Code
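# Import the Selenium pieces used below
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait

# Launch the Chrome web driver to open a browser window
driver = webdriver.Chrome()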

# Navigate to the 'Markets' page on Yahoo Finance
driver.get('https://finance.yahoo.com/markets/')

# Create an explicit wait with a 10-second timeout, used to wait for page elements to load
wait = WebDriverWait(driver, 10)

...(truncated)...
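
The `WebDriverWait(driver, 10)` line above only creates a wait object; the actual waiting happens when a condition is passed to its `until` method, which checks the condition repeatedly until it returns a truthy value or the timeout elapses. The sketch below illustrates that polling idea in plain Python. It is not Selenium's implementation, and `wait_until`, `index_value_ready`, and the value `1234.56` are hypothetical names and data used only for illustration.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


# Stand-in for a page element that only appears on the third check.
attempts = {"count": 0}


def index_value_ready():
    attempts["count"] += 1
    return "1234.56" if attempts["count"] >= 3 else None


value = wait_until(index_value_ready, timeout=2.0, poll_interval=0.01)
print(value, attempts["count"])  # prints: 1234.56 3
```

In Selenium itself, the condition is usually one of the helpers in `selenium.webdriver.support.expected_conditions`, such as `presence_of_element_located`, passed to `wait.until(...)`.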

For more details about Selenium, see Chapter 3 of the course Essential Knowledge for Work Automation, which is designed to expedite your workflow.

Click the green ▶︎ Run button in the code editor to check real-time U.S. stock indices!