Search for required tags in a website using Python requests and Beautiful Soup

Using Python, we will check whether each URL in a list contains an HTML table

Beeze Aal

Assume we have a urls.txt file that contains a list of URLs, one per line:

https://www.w3schools.com/
https://www.w3schools.com/html/html_tables.asp
https://www.thepoorcoder.com
https://www.thepoorcoder.com/generating-random-marks-and-plotting-to-graph-in-python/

For each URL in the file, we will check whether the page contains an HTML table.
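
The core check can be tried on an in-memory HTML string before any request is made. This is a minimal sketch, assuming Beautiful Soup is installed. The helper name has_tag and the two sample HTML strings are made up for illustration and are not part of the original solution.

```python
from bs4 import BeautifulSoup


def has_tag(html, tag_name):
    """Return True if the HTML contains at least one <tag_name> element."""
    # has_tag is a hypothetical helper name used only for illustration
    soup = BeautifulSoup(html, "html.parser")
    # soup.find returns the first matching element, or None if there is none
    return soup.find(tag_name) is not None


# Sample HTML strings made up for this example
page_with_table = "<html><body><table><tr><td>1</td></tr></table></body></html>"
page_without_table = "<html><body><p>No tables here</p></body></html>"

print(has_tag(page_with_table, "table"))
print(has_tag(page_without_table, "table"))
```

The first call prints True and the second prints False. Keeping the check in a small function like this makes it easy to test without touching the network.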

Solution

from bs4 import BeautifulSoup
import requests

# Open urls.txt, which holds one URL per line, and skip blank lines
with open("urls.txt", "r") as f:
    urls = [line.strip() for line in f if line.strip()]

# Browser-like User-Agent, since some sites reject the default requests one
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36"
}
# Print the URLs found in our text file
print("List of urls are\n"+"\n".join(urls)+"\n")

# Loop through each URL
for url in urls:
    # Use requests to get the page's HTML source; skip URLs that fail
    try:
        res = requests.get(url, headers=headers, timeout=10)
        res.raise_for_status()
    except requests.RequestException as e:
        print("Could not fetch", url, "-", e)
        continue
    # Use Beautiful Soup to parse the HTML page
    soup = BeautifulSoup(res.text, "html.parser")
    # Print the page title (some pages have no <title> tag)
    title = soup.title.string if soup.title else "(no title)"
    print("Page title:", title)
    # Search for a table using soup.find("required-tag-here")
    if soup.find("table"):
        print("Table found in", url)
    else:
        print("No table found in", url)

Output

List of urls are
https://www.w3schools.com/
https://www.w3schools.com/html/html_tables.asp
https://www.thepoorcoder.com
https://www.thepoorcoder.com/generating-random-marks-and-plotting-to-graph-in-python/

Page title: W3Schools Online Web Tutorials
No table found in https://www.w3schools.com/
Page title: HTML Tables
Table found in https://www.w3schools.com/html/html_tables.asp
Page title: The Poor Coder
No table found in https://www.thepoorcoder.com
Page title: Generating random student marks and plotting to graph in Python
Table found in https://www.thepoorcoder.com/generating-random-marks-and-plotting-to-graph-in-python/
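
The same approach works for any tag name, as the soup.find("required-tag-here") comment in the solution suggests. Below is a minimal sketch, assuming we want to count several tags in one page instead of only checking for a table. The function name count_tags and the sample HTML are made up for illustration and are not part of the original solution.

```python
from bs4 import BeautifulSoup


def count_tags(html, tag_names):
    """Return a dict mapping each tag name to how many times it appears."""
    # count_tags is a hypothetical helper name used only for illustration
    soup = BeautifulSoup(html, "html.parser")
    # find_all returns a list of every matching element (empty if none)
    return {name: len(soup.find_all(name)) for name in tag_names}


# Sample HTML made up for this example
sample_html = """
<html><body>
  <table><tr><td>1</td></tr></table>
  <img src="a.png"><img src="b.png">
</body></html>
"""

print(count_tags(sample_html, ["table", "img", "form"]))
```

For the sample page this reports one table, two img tags and no form tags. Inside the loop from the solution, res.text could be passed in place of sample_html.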