Web scraping is a technique used to extract data from websites. In this tutorial, we will learn how to perform web scraping using Python.
We will use the requests and BeautifulSoup libraries for web scraping. Install them using pip:
pip install requests beautifulsoup4
First, we need to fetch the content of a web page. We can do this using the requests library:
import requests

# Send an HTTP GET request and print the raw HTML returned by the server
url = 'https://example.com'
response = requests.get(url)
print(response.text)
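The snippet above assumes the request always succeeds. As a minimal sketch of a more defensive fetch, the helper below adds a timeout and checks the HTTP status. The function name fetch_html and the 10-second default timeout are illustrative assumptions, not part of the requests library.

```python
import requests


def fetch_html(url, timeout=10):
    """Return the page body as text, or None if the request fails.

    `fetch_html` and the 10-second default are hypothetical choices
    made for this sketch.
    """
    try:
        # Without a timeout, requests can wait indefinitely for a response
        response = requests.get(url, timeout=timeout)
        # Raise an HTTPError for 4xx and 5xx status codes
        response.raise_for_status()
    except requests.RequestException as exc:
        # RequestException is the base class for errors raised by requests
        print(f"Request failed: {exc}")
        return None
    return response.text
```

Calling fetch_html('https://example.com') would then return the HTML as a string, or None when the server cannot be reached or responds with an error status.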
Next, we will parse the HTML content using BeautifulSoup:
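from bs4 import BeautifulSoup

# Build a parse tree using Python's built-in HTML parser
soup = BeautifulSoup(response.text, 'html.parser')
# Print the parsed document with indentation
print(soup.prettify())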
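To see what the parsed tree offers without making a network request, the following sketch parses a small inline document. The SAMPLE_HTML string and its contents are made up for illustration. The calls shown (soup.title, find, find_all, get_text) are standard BeautifulSoup methods.

```python
from bs4 import BeautifulSoup

# Hypothetical sample document used only for this illustration
SAMPLE_HTML = """
<html>
  <head><title>Sample Page</title></head>
  <body>
    <h1>Welcome</h1>
    <p class="intro">First paragraph.</p>
    <p>Second paragraph.</p>
  </body>
</html>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Text of the <title> element
page_title = soup.title.string

# find() returns the first matching element
heading = soup.find("h1").get_text()

# class_ (with a trailing underscore) filters by CSS class
intro = soup.find("p", class_="intro").get_text(strip=True)

# find_all() returns every matching element
paragraphs = [p.get_text(strip=True) for p in soup.find_all("p")]

print(page_title)
print(heading)
print(intro)
print(paragraphs)
```

Working against a fixed string like this can make it easier to check selectors before pointing the same code at a live page.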
We can now extract specific data from the HTML. For example, to extract all the links:
# Find every <a> element and print its href attribute
links = soup.find_all('a')
for link in links:
    print(link.get('href'))
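The loop above prints None for anchors without an href attribute and leaves relative links such as /about unresolved. One possible refinement, sketched below, skips anchors that lack an href and converts relative links to absolute URLs with urljoin from the standard library. The helper name extract_links and its base_url parameter are assumptions introduced for this example.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def extract_links(html, base_url):
    """Return absolute URLs for every <a> element that has an href.

    `extract_links` and `base_url` are hypothetical names for this sketch.
    """
    soup = BeautifulSoup(html, "html.parser")
    links = []
    # href=True matches only anchors that actually carry an href attribute
    for anchor in soup.find_all("a", href=True):
        # urljoin resolves relative paths against the page's URL and
        # leaves links that are already absolute unchanged
        links.append(urljoin(base_url, anchor["href"]))
    return links
```

For a page fetched from https://example.com, calling extract_links(response.text, 'https://example.com') would return a list of absolute URLs that can be passed directly to another request.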
In this tutorial, we learned the basics of web scraping using Python. We covered fetching a web page, parsing HTML content, and extracting data.