While working through Mosh’s lecture on web scraping, I found that the classes he used to extract the latest questions and their vote counts from “Newest Questions - Stack Overflow” no longer match the page, which has been updated since the lectures were recorded in 2018.
I was able to replace some of the classes from the lecture with the ones the site currently uses, but I could not get everything working. First, I was able to extract the list of questions from the first page, but I could not find the right class for extracting the corresponding vote count for each question. Second, I could not loop over all the pages to extract questions beyond the first page.
Any help would be appreciated, as I need to make progress in my learning. Thanks in advance.
Below is my code in VS Code:
import requests
from bs4 import BeautifulSoup
response = requests.get("Newest Questions - Stack Overflow")
soup = BeautifulSoup(response.text, "html.parser")
questions = soup.select(".s-post-summary--content")
for question in questions:
    print(question.select_one(".s-link").getText())
I could not find the right class for the number of votes. I tried the following line inside the loop, but it returned “None”:
    print(question.select_one(".s-post-summary--stats-item-unit").getText())
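One possible reason for the “None” is that the vote count may not sit inside the s-post-summary--content element at all. The sketch below assumes (this is not confirmed by the lecture, and the site's markup can change at any time) that each question is wrapped in an s-post-summary container holding both a stats block and the content block, and that the first s-post-summary--stats-item-number inside it is the vote count. The names extract_questions and SAMPLE_HTML are hypothetical, and the sample markup is made up to mirror that assumed structure so the code runs without a network request.

```python
from bs4 import BeautifulSoup


def extract_questions(html):
    """Return a list of (title, votes) pairs found in one listing page."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    # Assumption: ".s-post-summary" wraps both the stats and the content
    # of a single question, so both can be looked up from the same element.
    for summary in soup.select(".s-post-summary"):
        title_tag = summary.select_one(".s-post-summary--content .s-link")
        # Assumption: the first stats number in the summary is the vote count.
        votes_tag = summary.select_one(".s-post-summary--stats-item-number")
        if title_tag is None:
            continue  # skip anything that does not look like a question
        title = title_tag.get_text(strip=True)
        votes = votes_tag.get_text(strip=True) if votes_tag else None
        results.append((title, votes))
    return results


# Made-up markup that mirrors the assumed page structure.
SAMPLE_HTML = """
<div class="s-post-summary">
  <div class="s-post-summary--stats">
    <div class="s-post-summary--stats-item">
      <span class="s-post-summary--stats-item-number">3</span>
      <span class="s-post-summary--stats-item-unit">votes</span>
    </div>
    <div class="s-post-summary--stats-item">
      <span class="s-post-summary--stats-item-number">1</span>
      <span class="s-post-summary--stats-item-unit">answer</span>
    </div>
  </div>
  <div class="s-post-summary--content">
    <h3><a class="s-link" href="#">How do I parse HTML?</a></h3>
  </div>
</div>
<div class="s-post-summary">
  <div class="s-post-summary--stats">
    <div class="s-post-summary--stats-item">
      <span class="s-post-summary--stats-item-number">0</span>
      <span class="s-post-summary--stats-item-unit">votes</span>
    </div>
  </div>
  <div class="s-post-summary--content">
    <h3><a class="s-link" href="#">Why is my loop slow?</a></h3>
  </div>
</div>
"""

for title, votes in extract_questions(SAMPLE_HTML):
    print(votes, title)
```

Against the live page, the same function would be called with response.text instead of SAMPLE_HTML. Checking each tag for None before calling get_text avoids an AttributeError when a class is missing.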
Mosh’s code for the same program is as follows:
import requests
from bs4 import BeautifulSoup
response = requests.get("Newest Questions - Stack Overflow")
soup = BeautifulSoup(response.text, "html.parser")
questions = soup.select(".question-summary")
for question in questions:
    print(question.select_one(".question-hyperlink").getText())
    print(question.select_one(".vote-count-post").getText())
Mosh also did not cover looping over all the pages to extract questions beyond page one.
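For the paging part, a minimal sketch of one way it might be done is to request each page number in turn and stop when a page yields no questions. Everything specific here is an assumption rather than something from the lecture: the URL https://stackoverflow.com/questions, the "tab" and "page" query parameters, the s-post-summary--content and s-link classes, and the hypothetical helper names fetch_page, titles_on_page, scrape_pages and fake_fetch. The demonstration at the bottom uses made-up pages so it runs offline.

```python
import time

from bs4 import BeautifulSoup

# Assumed listing URL; the "tab" and "page" parameters are also assumptions.
BASE_URL = "https://stackoverflow.com/questions"


def fetch_page(page):
    """Download one listing page and return its HTML (needs network access)."""
    import requests  # imported here so the helpers below also work offline

    response = requests.get(
        BASE_URL, params={"tab": "newest", "page": page}, timeout=10
    )
    response.raise_for_status()  # fail loudly on HTTP errors
    return response.text


def titles_on_page(html):
    """Return the question titles found in one page of HTML."""
    soup = BeautifulSoup(html, "html.parser")
    titles = []
    for content in soup.select(".s-post-summary--content"):
        link = content.select_one(".s-link")
        if link is not None:
            titles.append(link.get_text(strip=True))
    return titles


def scrape_pages(fetch, last_page, delay=0.0):
    """Collect titles from pages 1..last_page, stopping at the first empty page."""
    all_titles = []
    for page in range(1, last_page + 1):
        page_titles = titles_on_page(fetch(page))
        if not page_titles:
            break  # no questions found, so assume the listing has ended
        all_titles.extend(page_titles)
        if delay:
            time.sleep(delay)  # pause between requests to be polite
    return all_titles


# Offline demonstration with made-up pages instead of real requests.
FAKE_PAGES = {
    1: '<div class="s-post-summary--content"><a class="s-link">First question</a></div>'
       '<div class="s-post-summary--content"><a class="s-link">Second question</a></div>',
    2: '<div class="s-post-summary--content"><a class="s-link">Third question</a></div>',
}


def fake_fetch(page):
    """Stand-in for fetch_page that returns canned HTML."""
    return FAKE_PAGES.get(page, "")


print(scrape_pages(fake_fetch, last_page=5))
```

Against the live site the call would look like scrape_pages(fetch_page, last_page=3, delay=1.0). Passing the fetch function in as an argument keeps the loop easy to try out without hitting the site on every run.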
Please help!