Suggestion: Make it easier for students to complete all classes. In other words, do not leave out details, because the gaps cause students to get bogged down and lose momentum. Let students move smoothly through each class and the whole course.
Problem: I was unable to get “from bs4 import BeautifulSoup” to work for the web scraping class on my Windows PC.
Effort: I spent 30 minutes trying various JSON settings for “python.pythonPath” and the “python” entry of “code-runner.executorMap”, as well as various permutations of pipenv for the packages. The next day I spent another 30 minutes on the same things.
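A check like the following might help narrow down this kind of import problem: it reports which interpreter is actually running and whether bs4 can be found from it. This is only a minimal sketch; describe_environment is a hypothetical helper name, not something from the course. Run it the same way the editor runs the assignment.

```python
import importlib.util
import sys


def describe_environment(package="bs4"):
    """Report the running interpreter and whether `package` is importable from it.

    `describe_environment` is a hypothetical helper used for illustration only.
    """
    spec = importlib.util.find_spec(package)
    return {
        "interpreter": sys.executable,   # path of the Python that is running this code
        "package": package,
        "installed": spec is not None,   # None means this interpreter cannot find it
        "location": spec.origin if spec is not None else None,
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

If "installed" is False, the package was probably installed for a different interpreter than the one printed on the first line.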
Resolution: I spun up a “droplet” on DigitalOcean (a cloud service provider) running Ubuntu 20.04, then ran the assignment as follows and got it working in the Python command-line interpreter.
login as root
sudo apt update
sudo apt upgrade
sudo apt install software-properties-common
sudo apt update
sudo apt install python3.8
python3 --version
apt install python3-pip
apt-get install python3-venv
sudo -H pip3 install pandas
pip3 install requests bs4
sudo apt install openssh-server
sudo service ssh start
sudo apt install net-tools
ifconfig
Mosh tutorial here
mkdir pycrawler
cd pycrawler
python3 -m venv env
source env/bin/activate
pip install requests bs4
python
import requests
from bs4 import BeautifulSoup
response = requests.get("https://stackoverflow.com/questions")
soup = BeautifulSoup(response.text, "html.parser")
questions = soup.select(".question-summary")
print(type(questions[0]))
print(questions[0].select_one(".question-hyperlink").getText())
for question in questions:
    print(question.select_one(".question-hyperlink").getText())
    print(question.select_one(".vote-count-post").getText())
Hint: after typing the last line of the for loop, press Enter at the “...” prompt in the Python interpreter to complete the loop and run it.
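For anyone who would rather run this as a script than type it into the interpreter, here is a rough sketch of the same idea. It assumes the class names from the session above (.question-summary, .question-hyperlink, .vote-count-post), which may not match the site's current markup. The extract_questions function and the SAMPLE_HTML string are made up for illustration, so the sketch runs without a network connection.

```python
from bs4 import BeautifulSoup


def extract_questions(html):
    """Return (title, votes) pairs for each .question-summary element in `html`.

    Hypothetical helper; the selectors are the ones used in the session above.
    """
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for question in soup.select(".question-summary"):
        title = question.select_one(".question-hyperlink")
        votes = question.select_one(".vote-count-post")
        # select_one returns None when nothing matches, so guard before reading text.
        results.append((
            title.get_text(strip=True) if title is not None else None,
            votes.get_text(strip=True) if votes is not None else None,
        ))
    return results


# Made-up sample markup, not copied from any real page.
SAMPLE_HTML = """
<div class="question-summary">
  <span class="vote-count-post">3</span>
  <a class="question-hyperlink">How do I parse HTML?</a>
</div>
<div class="question-summary">
  <a class="question-hyperlink">Question without a vote count</a>
</div>
"""

if __name__ == "__main__":
    for title, votes in extract_questions(SAMPLE_HTML):
        print(votes, title)
```

To try it on a live page, pass the text of a requests response to extract_questions instead of SAMPLE_HTML. An empty result would suggest the selectors no longer match the page.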
Credits: A while back I hired a Fiverr contractor to teach me a simple web scraping assignment in Python, and he provided the above code. I am not sure every line is needed, only that it worked well enough to get me through the Mosh tutorial on web scraping in Python.