I need help with web scraping

While working through Mosh’s lecture on web scraping, I found that the classes he used to extract the latest questions and their vote counts from “Newest Questions - Stack Overflow” have changed, because the page has been updated since Mosh’s video lectures were made in 2018.

Although I was able to replace some of the classes from the lecture with the classes currently used on the site, I could not get everything working. For instance, I was able to extract the list of questions from the first page only, but could not find the appropriate classes for extracting the corresponding votes for each question. Secondly, I could not loop over all pages to extract questions beyond the first page.

Any help would be appreciated, as I need to progress in my learning. Thanks in anticipation.

Below is my code on VS Code:

import requests
from bs4 import BeautifulSoup

# Fetch the "Newest Questions" listing page
response = requests.get("https://stackoverflow.com/questions")

soup = BeautifulSoup(response.text, "html.parser")

# Each match is the content block of one question summary
questions = soup.select(".s-post-summary--content")
for question in questions:
    print(question.select_one(".s-link").getText())

I could not get the class to use for the number of votes. I tried the following line but it returned “None”.

print(question.select_one(".s-post-summary--stats-item-unit").getText())

Mosh’s code for the same program is as follows:

import requests
from bs4 import BeautifulSoup

response = requests.get("https://stackoverflow.com/questions")

soup = BeautifulSoup(response.text, "html.parser")

# Class names as used in the 2018 lecture; the site has since changed them
questions = soup.select(".question-summary")
for question in questions:
    print(question.select_one(".question-hyperlink").getText())
    print(question.select_one(".vote-count-post").getText())

Mosh also did not cover looping over all the pages to extract questions beyond page one.
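For the paging part, here is a minimal sketch of one way to loop over several listing pages. It assumes the listing accepts `tab` and `page` query parameters (for example `?tab=newest&page=2`), and `page_url`, `scrape_pages`, `fetch` and `parse` are hypothetical names made up for this example, not anything from the lecture. The fetching and parsing steps are passed in as functions so the loop itself stays independent of the class names, which may change again.

```python
import time
from urllib.parse import urlencode

# Assumed listing URL; the page number goes in the query string.
BASE_URL = "https://stackoverflow.com/questions"


def page_url(page, tab="newest"):
    """Build the URL for one listing page (assumes a `page` query parameter)."""
    return f"{BASE_URL}?{urlencode({'tab': tab, 'page': page})}"


def scrape_pages(fetch, parse, first=1, last=3, delay=1.0):
    """Fetch pages first..last and collect whatever `parse` returns for each.

    fetch(url) -> HTML string and parse(html) -> list of items are
    hypothetical helpers supplied by the caller.
    """
    items = []
    for page in range(first, last + 1):
        items.extend(parse(fetch(page_url(page))))
        if page < last:
            time.sleep(delay)  # pause between requests to be polite
    return items


# Possible wiring with requests and BeautifulSoup (not run here):
#
#   import requests
#   from bs4 import BeautifulSoup
#
#   def fetch(url):
#       return requests.get(url, timeout=10).text
#
#   def parse(html):
#       soup = BeautifulSoup(html, "html.parser")
#       return [q.select_one(".s-link").getText()
#               for q in soup.select(".s-post-summary--content")]
#
#   titles = scrape_pages(fetch, parse, first=1, last=3)
```

Keeping `last` small and leaving a delay between requests is a deliberate choice here, since fetching many pages quickly may get the requests throttled or blocked.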

Please help!

All,

Frustrating!

What we are looking for is a “class” to select on. Since the lecture was recorded, the class for “questions” on (Newest Questions - Stack Overflow) has become:

.... That class attribute above contains two (2) class names that are space-delimited! A CSS class list is whitespace-separated, so simply select on EITHER of the two class names: 's-post-summary' or 'js-post-summary'

like:
selector = '.js-post-summary'
questions = soup.select(selector)
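To get the votes as well, a rough sketch like the one below may help. It runs against a small hand-written HTML sample rather than the live site, and the sample's structure is an assumption: the stats block is a sibling of the content block inside each '.js-post-summary' element. The class '.s-post-summary--stats-item-number' and the helper name `extract_questions` are also assumptions for illustration, so check them against the page source in your browser. If the real page looks like this sample, that would explain why searching for the votes inside '.s-post-summary--content' finds nothing: the stats are not inside that element.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified markup modelled on one question summary.
SAMPLE_HTML = """
<div class="s-post-summary js-post-summary">
  <div class="s-post-summary--stats">
    <div class="s-post-summary--stats-item">
      <span class="s-post-summary--stats-item-number">3</span>
      <span class="s-post-summary--stats-item-unit">votes</span>
    </div>
    <div class="s-post-summary--stats-item">
      <span class="s-post-summary--stats-item-number">1</span>
      <span class="s-post-summary--stats-item-unit">answer</span>
    </div>
  </div>
  <div class="s-post-summary--content">
    <h3><a class="s-link" href="/questions/1">How do I parse HTML?</a></h3>
  </div>
</div>
"""


def extract_questions(html):
    """Return a list of (title, votes) tuples, one per question summary."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    # Select the outer summary so both the stats and the title are descendants.
    for summary in soup.select(".js-post-summary"):
        title = summary.select_one(".s-post-summary--content .s-link")
        # The first stats number in the sample is the vote count (assumption).
        votes = summary.select_one(".s-post-summary--stats-item-number")
        results.append((
            title.get_text(strip=True) if title else None,
            votes.get_text(strip=True) if votes else None,
        ))
    return results


print(extract_questions(SAMPLE_HTML))
```

The `if title else None` checks are there because select_one returns None when nothing matches, and calling a method on None raises an AttributeError.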