How to bypass a string that has '???????' when scraping for dates and turning the date into an integer?
I'm scraping an anime information website. Right now I'm trying to get the dates working, but for some anime the date comes out as ???,??,????. I'm trying to figure out how to skip those, or substitute some text like "No date", or just use 00,00,0000.
This is what I have so far. It works until it hits "???????", which cannot be turned into an integer.
page.css('span.remain-time').each do |line|
  n = 3
  d = line.text[/(\S+\s+){#{n}}/].strip
  date = Date.parse(d)
  puts date
end
Hey Wesley,
Is the line.text for those actually a string of question marks? I was going to suggest that you could put an if statement in there to check for it, and then use your default "No date" (or whatever you prefer) when it detects that.
Does that help any?
line.text looks for the date, and for entries that don't have a date it gets the question marks, because that's what the site shows in place of the date.
How would you go about setting up an if statement to check? I haven't had any luck.
For example, something like this (assuming that's the string):
page.css('span.remain-time').each do |line|
  if line.text == "???,??,???"
    date = "No date"
  else
    n = 3
    d = line.text[/(\S+\s+){#{n}}/].strip
    date = Date.parse(d)
  end
  puts date
end
That worked perfectly, but now I've run into another issue. Rather than being only question marks, some are displayed as "??? ??, 2001" or "03 ??, ????".
I'll have to look for a workaround.
You could change it to just check to see if there are question marks in the text at all instead.
if line.text.include?("?")
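Putting the two ideas together, here is a minimal sketch of how that check might look once it's pulled into a small method. The method name parse_air_date and the sample strings are made up for illustration, and it assumes the text you pass in is the date portion you already extracted with your regex. It also rescues ArgumentError, which is what Date.parse raises for text it can't read, so anything unparseable falls back to "No date" as well.

```ruby
require 'date'

# Hypothetical helper (not from the original code): returns a Date, or the
# fallback when the text contains a "?" placeholder or is not a valid date.
def parse_air_date(text, fallback = "No date")
  # Any question mark means the site has no complete date for this entry.
  return fallback if text.include?("?")

  Date.parse(text)
rescue ArgumentError
  # Date.parse raises ArgumentError (Date::Error on newer Rubies, which is a
  # subclass of ArgumentError) when the text is not a recognizable date.
  fallback
end

# Made-up sample strings standing in for the extracted date text.
samples = ["2001-03-03", "??? ??, 2001", "03 ??, ????", "???,??,????"]
samples.each { |s| puts parse_air_date(s) }
```

Inside your loop you would then call it on d, the string you already extract, instead of calling Date.parse directly.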