
HTML Fundamentals for Web Crawling

HTML (HyperText Markup Language) defines the backbone of a web page and serves as a markup language for structuring content.

A markup language is a system that annotates text with tags to describe the structure and content of a document.


Basic Structure of HTML

HTML is written with tags, which are names enclosed in angle brackets (< >).

Most content is wrapped in a pair of tags: a start tag (<tag>) and an end tag (</tag>), with the content placed between them.

For example, <h1> starts a heading, and </h1>, the same name preceded by a slash, ends it.

An HTML document is made up of elements. An element is the whole unit: the start tag, the content, and the end tag.

For instance, <h1>Title</h1> is an element where the text "Title" is enclosed within an <h1> tag.
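
To see how a crawler perceives this element, here is a minimal sketch using Python's standard-library html.parser module. The class name EventRecorder is made up for illustration; the sketch only records the order in which the parser reports the start tag, the text, and the end tag.

```python
from html.parser import HTMLParser


class EventRecorder(HTMLParser):
    """Records each start tag, text run, and end tag the parser reports."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_data(self, data):
        self.events.append(("text", data))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))


recorder = EventRecorder()
recorder.feed("<h1>Title</h1>")
recorder.close()
print(recorder.events)
# [('start', 'h1'), ('text', 'Title'), ('end', 'h1')]
```

The three events correspond to the three parts of the element: start tag, content, end tag.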

The basic structure of an HTML document is as follows:

Basic HTML Structure
<!DOCTYPE html>
<html>
  <!-- The section containing the document's metadata -->
  <head>
    <title>Page Title</title>
  </head>
  <!-- The section containing the content of the web page -->
  <body>
    <h1>Heading Element</h1>
    <p>Paragraph Element</p>
  </body>
</html>
  • <!DOCTYPE html>: Declares the document type. This form tells the browser that the page is an HTML5 document.

  • <html>: The root element of the HTML document, which contains all other elements.

  • <head>: Contains the document's metadata (title, description, styles, etc.).

  • <title>: Defines the page title displayed on the browser tab.

  • <body>: Encloses the visible content of the web page, such as headings and paragraphs.

  • <h1>, <p>, etc.: Various HTML elements represent different types of content such as headings, paragraphs, etc.
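
As a rough illustration of why this structure matters for crawling, the sketch below feeds the sample document to Python's standard-library html.parser and collects the text found inside <title>, <h1>, and <p>. The names PAGE and TextByTag are hypothetical, and the sketch assumes the wanted elements contain plain text only, with no tags nested inside them.

```python
from html.parser import HTMLParser

# The sample document from this section, held in a string.
PAGE = """<!DOCTYPE html>
<html>
  <head>
    <title>Page Title</title>
  </head>
  <body>
    <h1>Heading Element</h1>
    <p>Paragraph Element</p>
  </body>
</html>"""


class TextByTag(HTMLParser):
    """Collects the text found directly inside each wanted tag.

    Assumption: wanted elements hold plain text, with no nested tags.
    """

    def __init__(self, wanted):
        super().__init__()
        self.wanted = set(wanted)
        self.current = None   # name of the most recently opened tag
        self.found = {}       # tag name -> list of text strings

    def handle_starttag(self, tag, attrs):
        self.current = tag

    def handle_endtag(self, tag):
        self.current = None

    def handle_data(self, data):
        text = data.strip()
        if text and self.current in self.wanted:
            self.found.setdefault(self.current, []).append(text)


collector = TextByTag(["title", "h1", "p"])
collector.feed(PAGE)
collector.close()
print(collector.found)
# {'title': ['Page Title'], 'h1': ['Heading Element'], 'p': ['Paragraph Element']}
```

The page title comes from the <head> section, while the heading and paragraph come from <body>, which mirrors the split between metadata and content described above.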


Key HTML Tags

HTML includes a variety of tags, each representing specific types of content:

  • <h1> to <h6>: Heading tags, where <h1> is the highest-level heading and <h6> is the lowest. Browsers display them from largest to smallest by default.

  • <p>: Represents a paragraph.

  • <a>: Creates a hyperlink to another web page. The destination is given in the tag's href attribute.

  • <img>: Embeds an image in the document.

  • <ul>, <ol>, <li>: Define unordered lists (<ul>) and ordered lists (<ol>), with <li> representing list items.
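
These tags are typically what a crawler looks for. The sketch below, again using Python's standard-library html.parser, gathers link targets, image sources, and list item text from a small snippet. The snippet, its paths (/docs, /logo.png), and the class name TagCollector are invented for illustration. The href and src attributes are standard HTML attributes for <a> and <img>.

```python
from html.parser import HTMLParser

# Hypothetical snippet; the paths and text are made up for this example.
SNIPPET = """
<h2>Resources</h2>
<p>See the <a href="/docs">documentation</a>.</p>
<img src="/logo.png" alt="Logo">
<ul>
  <li>First item</li>
  <li>Second item</li>
</ul>
"""


class TagCollector(HTMLParser):
    """Gathers link targets, image sources, and list item text."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.images = []
        self.items = []
        self._in_li = False   # True while the parser is inside an <li>

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)  # attrs arrives as (name, value) pairs
        if tag == "a" and "href" in attributes:
            self.links.append(attributes["href"])
        elif tag == "img" and "src" in attributes:
            self.images.append(attributes["src"])
        elif tag == "li":
            self._in_li = True

    def handle_endtag(self, tag):
        if tag == "li":
            self._in_li = False

    def handle_data(self, data):
        text = data.strip()
        if self._in_li and text:
            self.items.append(text)


collector = TagCollector()
collector.feed(SNIPPET)
collector.close()
print(collector.links)   # ['/docs']
print(collector.images)  # ['/logo.png']
print(collector.items)   # ['First item', 'Second item']
```

Note that <img> has no end tag, so the parser reports only a start tag for it.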


For more detailed information about HTML, check out the Introduction to HTML Course.

