James Reece
I have written a rake task that monitors and fetches data from a website that publishes this data in JSON format. The actual source of the data is:
https://www.thegazette.co.uk/company/07877158/filings/data.json
The rake task monitors the "total_count" value in the above JSON and, when it changes, fetches and saves any new records.
The issue I have is that after the first run the task never sees an update. As a current real-world example, the above JSON source was updated overnight with two new records, so "total_count" increased from 40 to 42, but my rake task still reports 40 (and then does nothing because it thinks nothing has changed).
I think it is a cache issue, but I have cleared my Rails cache with no success. It is strange because I don't have this issue with similar rake tasks I have created for other sites.
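The check described above can be sketched in isolation as follows. This is only an illustration: `new_items` is a hypothetical helper, the item fields in the sample are made up, and it assumes that "items" lists the newest records first and that records are only ever added.

```ruby
require 'json'

# Hypothetical helper: returns the records added since the last check.
# Assumes payload["items"] is ordered newest first (an assumption about the feed).
def new_items(payload, stored_count)
  current = payload.fetch('total_count', nil)
  # Nothing to do if the count is missing or has not grown.
  return [] if current.nil? || current.to_i <= stored_count.to_i

  added = current.to_i - stored_count.to_i
  payload.fetch('items', []).first(added)
end

# Made-up sample shaped like the feed: only "total_count" and "items" come from the post.
sample = JSON.parse('{"total_count": 42, "items": [{"title": "newer"}, {"title": "older"}]}')

p new_items(sample, 40) # the stored count is behind by two
p new_items(sample, 42) # the stored count is up to date
```

The helper only compares counts, so it can only report a change if the fetched "total_count" is itself fresh.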
I've tried adding response["Cache-Control: no-cache"], but had no success there either.
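For comparison, here is a minimal sketch of asking for an uncached copy on the outgoing request rather than through the response object. `build_uncached_request` and the `_` query parameter are hypothetical names. Whether the server, or any cache in front of it, honours either hint is an assumption and is not confirmed here.

```ruby
require 'net/http'
require 'uri'
require 'securerandom'

# Hypothetical helper: builds a GET request that asks caches to revalidate.
def build_uncached_request(url)
  uri = URI.parse(url)

  # Cache-busting query parameter. The name "_" is arbitrary, and this assumes
  # the server ignores query parameters it does not recognise.
  params = URI.decode_www_form(uri.query || '') << ['_', SecureRandom.hex(8)]
  uri.query = URI.encode_www_form(params)

  request = Net::HTTP::Get.new(uri)
  # Headers are set with separate name and value, not as one "Name: value" string.
  request['Cache-Control'] = 'no-cache'
  request['Pragma'] = 'no-cache'
  request
end

req = build_uncached_request('https://www.thegazette.co.uk/company/07877158/filings/data.json')
puts req['Cache-Control']
puts req.path
```

The request is only built here, not sent. In the rake task it would take the place of the Net::HTTP::Get.new(uri) call inside g_api.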
My rake code is as follows:
desc "Monitor"
task :S_01 => :environment do
  require 'net/http'
  require 'json'
  require 'openssl'

  # Fetch the given URL and return the parsed JSON body.
  def g_api(url)
    uri = URI.parse(url)
    request = Net::HTTP::Get.new(uri)
    request.content_type = "application/json"
    req_options = {
      use_ssl: uri.scheme == "https",
    }
    response = Net::HTTP.start(uri.hostname, uri.port, req_options) do |http|
      http.request(request)
    end
    JSON.parse(response.body)
  end

  company = CompanyBorrower.where(id: 43)
  company.each do |f|
    begin
      # scrape source
      tg_fh_url = "https://www.thegazette.co.uk/company/" + f.ch + "/filings/data.json"
      gf_scrape = g_api(tg_fh_url)
      ch_s = gf_scrape.fetch('total_count', nil) # scraped count
      puts ch_s
      if f.filing_count != ch_s # has the count changed? if not, skip
        # ch_fh3 is not defined in the code shown here
        f.update_attributes(cwdetail1: ch_s, filing_update: ch_fh3)
        gf_scrape['items'].first(3).each_with_index do |f1, index|
          # fetch & save data here
        end
      end
    rescue
      # any error is silently skipped and the loop moves on to the next company
      next
    end
  end
end
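One way to check the cache theory would be to print the caching-related headers of the response inside g_api before the body is parsed. This is a minimal sketch: `cache_headers` is a hypothetical helper, and the header values in the sample are made up, not taken from the real site.

```ruby
# Response headers that usually reveal whether a cached copy was served.
CACHE_HEADERS = %w[Cache-Control Age Expires ETag Last-Modified].freeze

# Hypothetical helper: collects the caching-related headers that are present.
# Works with anything that supports response["Header-Name"], as Net::HTTPResponse does.
def cache_headers(response)
  CACHE_HEADERS.each_with_object({}) do |name, found|
    value = response[name]
    found[name] = value unless value.nil?
  end
end

# Stand-in for a real response object; the values are invented for illustration.
fake_response = { 'Cache-Control' => 'max-age=86400', 'Age' => '3600' }
p cache_headers(fake_response)
```

In the rake task this could be called as puts cache_headers(response) just before JSON.parse. An Age header in the output would mean the body came from an intermediate cache rather than directly from the origin server.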