I have a web scraper written in Python with the Selenium library that works on my local machine. When I push it to my droplet, I cannot run the app. This is one of my web scraping methods:
def defense_dash_lt10(pbp_stats, season):
    # Less than 10 foot
    url = 'https://www.nba.com/stats/players/defense-dash-lt10?Season=' + season
    options = Options()
    options.add_argument('--no-sandbox')
    driver = webdriver.Chrome(service=ChromeService(
        ChromeDriverManager().install()), options=options)
    driver.get(url)
    selects = driver.find_elements(By.CLASS_NAME, "DropDown_select__4pIg9 ")
    for select in selects:
        options = Select(select).options
        for option in options:
            if option.text == 'All':
                option.click()  # select() in earlier versions of webdriver
                break
    # Find the table element
    table = driver.find_element(By.CLASS_NAME, 'Crom_table__p1iZz')
    # Find all rows in the table
    rows = table.find_elements(By.TAG_NAME, 'tr')
    defense_dash_lt10 = []
    # Loop through each row and extract the data from each cell
    for row in rows:
        player_dd_lt10 = []
        # Find all cells in the row
        cells = row.find_elements(By.TAG_NAME, 'td')
        for cell in cells:
            player_dd_lt10.append(cell.text)
        # Add pbp stats to defense dash if not empty
        if player_dd_lt10:
            if player_dd_lt10[0] in pbp_stats:
                defense_dash_lt10.append(player_dd_lt10 + pbp_stats[player_dd_lt10[0]][-2:])
            else:
                defense_dash_lt10.append(player_dd_lt10 + ['NaN', 'NaN'])
    header = ['Player', 'Team', 'Age', 'Position', 'GP', 'Games', 'FREQ%', 'DFGM', 'DFGA', 'DFG%', 'FG%', 'DIFF%', "MP", "BLKR"]
    return header, defense_dash_lt10
Heya,
To run a Selenium-based web scraper on a server, such as a DigitalOcean droplet, you need to configure it to work in a headless environment. Servers typically don’t have a GUI, so you can’t run browsers in the regular, graphical mode. Here’s how to modify your existing Selenium setup to work on a server:
Ensure Python is installed on the server.
Install the necessary drivers and browser. For Chrome, you’ll need ChromeDriver and the Chrome browser itself. You can install them using your server’s package manager (for example, apt on Ubuntu).
Configure your script to initialize Chrome in headless mode, which allows it to run without a GUI.
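A minimal sketch of that headless configuration, assuming Selenium 4 and a recent Chrome (the `--headless=new` flag needs Chrome 109 or later; on older versions plain `--headless` would be used instead). The helper name `build_headless_options`, the extra flags beyond `--no-sandbox`, and the window size are illustrative assumptions, not part of the original script:

```python
# Chrome flags commonly used on GUI-less servers (an assumed set; adjust as needed).
HEADLESS_ARGS = [
    "--headless=new",           # run without a display; use "--headless" on older Chrome
    "--no-sandbox",             # already present in the original script
    "--disable-dev-shm-usage",  # avoid crashes when /dev/shm is small
    "--window-size=1920,1080",  # give the page a desktop-sized viewport
]


def build_headless_options():
    """Return Chrome Options with the headless flags applied."""
    # Imported here so the flag list can be inspected without Selenium installed.
    from selenium.webdriver.chrome.options import Options

    options = Options()
    for arg in HEADLESS_ARGS:
        options.add_argument(arg)
    return options
```

In the function from the question, `options = build_headless_options()` would take the place of the two lines that create `Options()` and add `--no-sandbox`.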
Remember, running a web scraper on a server is essentially the same as running it locally, with the key difference being the headless setup and ensuring all dependencies are correctly installed on the server.
Hi there!
Running a Selenium-based web scraper on a DigitalOcean Droplet involves several considerations that differ from running the script on your local machine. You will have to set up a headless browser environment, manage web driver installations, and ensure your script can run in a non-GUI server environment.
Here’s how you could do that:
1. Install Required Packages
Ensure your Droplet is up to date and has Python installed. You’ll also need to install Selenium and a web driver manager, such as webdriver-manager, which simplifies the management of binary drivers for different web browsers.
You can install these using pip. If you haven’t installed pip, you can install it using your package manager (e.g., apt for Ubuntu/Debian).
2. Install a Web Browser and WebDriver
For headless operation, you can use Chrome or Firefox. This example uses Chrome, but the process is similar for Firefox.
Install Google Chrome:
Install ChromeDriver: The webdriver-manager package you installed earlier will handle the ChromeDriver installation in your Python script, so you don’t need to manually install ChromeDriver.
3. Modify Your Script for Headless Operation
To run your browser in headless mode (without a GUI), you need to modify your Selenium script to specify headless options.
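One way this modification could look, as a sketch assuming Selenium 4 and the webdriver-manager package from step 1. The name `headless_chrome` and its `extra_args` parameter are hypothetical. The context manager makes sure the browser is closed even if scraping fails, so stray Chrome processes don't pile up on the Droplet:

```python
from contextlib import contextmanager


@contextmanager
def headless_chrome(extra_args=()):
    """Yield a headless Chrome driver and always quit it afterwards."""
    # Imported lazily so this module can be loaded before Selenium is installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.chrome.service import Service as ChromeService
    from webdriver_manager.chrome import ChromeDriverManager

    options = Options()
    options.add_argument("--headless")    # no GUI on the server
    options.add_argument("--no-sandbox")  # kept from the original script
    for arg in extra_args:
        options.add_argument(arg)

    driver = webdriver.Chrome(
        service=ChromeService(ChromeDriverManager().install()),
        options=options,
    )
    try:
        yield driver
    finally:
        driver.quit()  # release the browser process even on errors
```

Inside the scraping function, `with headless_chrome() as driver:` would wrap the existing `driver.get(url)` call and the table-parsing code.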
4. Running Your Script
Now, you should be able to run your script on the server just like you would on your local machine. Ensure you’re using the correct Python command (python3 or python) based on your server’s configuration.
If you encounter any issues, reviewing the error messages and logs can provide insights into what might be going wrong!
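Before digging through logs, a quick check like the following can help rule out the most common causes, a missing browser or missing Python packages. It is only a sketch: the function names and the list of browser executable names are assumptions and may not match how Chrome is installed on your server.

```python
import importlib.util
import shutil

# Executable names Chrome/Chromium are commonly installed under on Linux (assumed list).
BROWSER_NAMES = ("google-chrome", "google-chrome-stable", "chromium", "chromium-browser")
# Import names of the Python packages the scraper relies on.
REQUIRED_MODULES = ("selenium", "webdriver_manager")


def find_browser(names=BROWSER_NAMES):
    """Return the path of the first browser executable found on PATH, or None."""
    for name in names:
        path = shutil.which(name)
        if path:
            return path
    return None


def missing_modules(modules=REQUIRED_MODULES):
    """Return the required modules that cannot be imported in this environment."""
    return [m for m in modules if importlib.util.find_spec(m) is None]


if __name__ == "__main__":
    print("Browser:", find_browser() or "not found on PATH")
    print("Missing Python packages:", missing_modules() or "none")
```

Running this with the same Python command you use for the scraper (python3 or python) shows whether that interpreter can see the packages and whether a browser is on the PATH.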
Hope that this helps.
Best,
Bobby