Cron is a great tool to use for basic concepts like this. However, it scales poorly, as you've surmised! Look into job processing tools, like the open-source (and multi-language) Gearman.
I would schedule a script to run daily and let it query the 10,000 websites one after another: a single script that loops over all the websites, sends a request to each, and processes the results one by one. For numbers like these there's no need to make it any more difficult, imho.
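A rough sketch of that single-script approach might look like the following. It assumes plain HTTP(S) GET requests and uses only Python's standard library. The names `fetch_status` and `check_sites` are made up for illustration, and "processing" is reduced to recording the status code, so treat it as a starting point rather than a finished tool.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url, or None if the site is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (4xx/5xx).
        return err.code
    except (urllib.error.URLError, OSError, ValueError):
        # DNS failure, refused connection, timeout, or malformed URL.
        return None


def check_sites(urls, fetch=fetch_status):
    """Query each site in turn and collect the results, one by one."""
    results = {}
    for url in urls:
        # One failing site must not stop the loop, so fetch never raises.
        results[url] = fetch(url)
    return results
```

The `fetch` parameter is only there so the loop can be tried without touching the network. To run it daily, a crontab entry along the lines of `0 3 * * * python3 /path/to/check_sites.py` would do, where the path and the 03:00 start time are placeholders. The per-request timeout matters here: with 10,000 sites in sequence, a few hanging connections would otherwise stall the whole run.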
I can't really give you an answer, but what I can give you is a way to a solution: you have to find the angle that you relate to or that piques your interest. A good paper is one that people get drawn into because it reaches them in some way. As for me, when I think of WWII, I think of the Holocaust and the effect it had on the survivors, their families, and those who stood by and did nothing until it was too late.