Turtle

Activity

Posted in Address already in use for headless browser

October 11, 2020 11:54am

I use Watir and headless Firefox/Chrome to scrape a website deployed with Hatchbox:

def perform(uid_number)
  pool = RandomPort::Pool.new # random_port gem to generate a free port
  port_number = pool.acquire # random_port gem
  @company = Company.find_by!(uid: uid_number)
  Webdrivers.configure do |config|
    config.proxy_addr = "some_url"
    config.proxy_port = "80"
    config.proxy_user = "user_name"
    config.proxy_pass = "password"
  end
  args = ["--headless", "--marionette-host=#{port_number}"]
  browser = Watir::Browser.new :firefox, headless: true, options: {args: args}
  # code for scraping goes here
  browser.close
  pool.release(port_number) # random_port gem
end

When I run the above code from the command line, it works without problems. When I run several headless browsers (Firefox/Chrome) as background jobs, I get Errno::EADDRINUSE: Address already in use - bind(2) for [::]:port_number (hence the use of the random_port gem). I read that the webdrivers should look for a free port themselves, so I don't know how to fix this. Even when I kill the process and restart the server, the problem comes back after a short while. How can I solve this?
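One thing that may be worth ruling out, as an assumption rather than a confirmed diagnosis: if the scraping code raises, browser.close and pool.release are never reached, so a leftover driver process could keep its port bound. The sketch below uses only Ruby's standard library and shows two ideas: asking the OS for a free port by binding to port 0, and running cleanup in an ensure block. The names acquire_free_port and with_cleanup are hypothetical helpers, not part of Watir, webdrivers, or random_port.

```ruby
require "socket"

# Hypothetical helper: bind to port 0 so the OS picks a currently free TCP port.
# The port is only guaranteed free at the moment of the call; another process
# could still claim it before the browser driver binds to it.
def acquire_free_port
  server = TCPServer.new("127.0.0.1", 0)
  server.addr[1] # addr is [family, port, hostname, ip]
ensure
  server&.close
end

# Hypothetical wrapper: runs the block and always calls the cleanup lambda,
# even when the block raises (for example, when a scrape fails midway).
def with_cleanup(cleanup)
  yield
ensure
  cleanup.call
end
```

In the job above, the cleanup lambda would be the place for browser.close and pool.release(port_number), so they run whether or not the scrape succeeds.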

I would also be interested in what setup you use to scrape JavaScript-heavy websites, as I'm just starting out. Thanks!
