How to bypass a string that has '???????' when scraping for dates and turning the date into an integer?

Wes asked in General

I'm scraping an anime information website. Right now I'm trying to get the dates working, but for some anime the date comes out as ???,??,????. I'm trying to figure out how to skip those, substitute some text like "No date", or just use 00,00,0000.

This is what I have so far. It works until it hits "???????", which can't be turned into a date.

page.css('span.remain-time').each do |line|
  n = 3
  d = line.text[/(\S+\s+){#{n}}/].strip
  date = Date.parse(d)
  puts date
end
Reply

Hey Wesley,

Is the line.text for those actually a string of question marks? I was going to suggest that you could put an if statement in there to check, and then you could have your default "No date" or whatever when it detects that.

Does that help any?

Reply

line.text picks up the date, and for entries that don't have a date it gets the question marks, because that's what the site shows in place of the date.

How would you go about setting up an if statement to check? I haven't had any luck.

Reply

For example, something like this (assuming that's the string):

page.css('span.remain-time').each do |line|
  if line.text == "???,??,???"
    date = "No date"
  else
    n = 3
    d = line.text[/(\S+\s+){#{n}}/].strip
    date = Date.parse(d)
  end

  puts date
end
Reply

That worked perfectly, but now I've run into another issue. Rather than being only question marks, some dates are displayed as "??? ??, 2001" or "03 ??, ????".
I'll have to look for a workaround.

Reply

You could change it to just check to see if there are question marks in the text at all instead.

if line.text.include?("?")
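
Here's a rough sketch of how that check could fit together, in case it helps. The helper name parse_scraped_date and the sample strings are made up for illustration, and it assumes the text passed in is already the extracted date portion of line.text. The rescue is an extra safety net that wasn't part of the suggestion above: Date.parse raises an ArgumentError (Date::Error on newer Rubies) when it can't make sense of the input.

```ruby
require 'date'

# Hypothetical helper (not from the thread): returns a Date, or the string
# "No date" when the text contains a "?" placeholder or cannot be parsed.
def parse_scraped_date(text)
  return "No date" if text.include?("?")

  Date.parse(text)
rescue ArgumentError
  # Covers Date::Error too, since it is a subclass of ArgumentError
  "No date"
end

# Sample strings standing in for the scraped date text
["Apr 03, 2001", "??? ??, 2001", "03 ??, ????", "???,??,????"].each do |text|
  puts parse_scraped_date(text)
end
```

Inside the scraping loop you'd call it on the stripped date string instead of calling Date.parse directly.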
Reply

Thank you so much. This worked perfectly.

Reply