Since it's existing code, I would look for a way to split that list of 900k words. Everything else would require far more extensive changes.
The set of words may be changed every week. – ARZ Aug 8 at 11:30
@ARZ: that shouldn't hinder splitting it. You could split it by a range (start letters) or by size (1 to 200, 201 to 400, etc.) or by a combination thereof. – Christian.K Aug 8 at 11:42
The number of available worker systems may also change weekly. – ARZ Aug 8 at 11:45
Depending on your needs, I'd go with this approach first too. Since you're using SQL as a central datastore, it should be relatively easy to split the list of words into batches and have the systems pull down a portion to process. Make enough batches (say # of machines * 10) so you don't get machines running idle because they got an easy batch; see the sketch below. – gjvdkamp Aug 8 at 11:47
@Henk: Can you explain your idea in more detail? – ARZ Aug 8 at 11:56
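A minimal sketch of that batching idea, assuming the word list is already loaded in memory and using the machines-times-ten rule of thumb from the comment above (the class and method names are just illustrative):

    using System;
    using System.Collections.Generic;
    using System.Linq;

    static class WordBatcher
    {
        // Split the full word list into (machines * 10) batches so no worker
        // sits idle just because it happened to draw an "easy" batch.
        public static List<List<string>> SplitIntoBatches(IList<string> words, int machineCount)
        {
            int batchCount = machineCount * 10;   // rule of thumb from the comment above
            int batchSize  = Math.Max(1, (int)Math.Ceiling(words.Count / (double)batchCount));

            return words
                .Select((word, index) => new { word, index })
                .GroupBy(x => x.index / batchSize)                // contiguous index ranges
                .Select(g => g.Select(x => x.word).ToList())
                .ToList();
        }
    }

Each worker would then claim an unprocessed batch (for example by flagging a batch row in SQL), process it, and write the results back, so changing the number of machines from week to week only changes how quickly the batches drain.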
I think this is addressed by DryadLINQ. I only know of it, no hands-on experience myself, but it sounds like it fits the bill. GJ.
Answer in the form Requirement -- Tool:

Scheduled Runs -- Quartz.NET

Quartz.NET allows you to run "jobs" on any given schedule. It also maintains state between runs, so if for some reason the server goes down, when it comes back up it knows to begin running the job again. Pretty cool stuff.
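A minimal sketch of a scheduled run, assuming the Quartz.NET 2.x-style synchronous API (newer versions expose the same builders behind async methods); the job name and cron schedule here are just examples:

    using Quartz;
    using Quartz.Impl;

    // The job that each scheduled run executes.
    public class ProcessWordsJob : IJob
    {
        public void Execute(IJobExecutionContext context)
        {
            // e.g. claim the next pending batch of words and process it
        }
    }

    public static class SchedulerSetup
    {
        public static void Start()
        {
            var scheduler = StdSchedulerFactory.GetDefaultScheduler();
            scheduler.Start();

            var job = JobBuilder.Create<ProcessWordsJob>()
                .WithIdentity("processWords")
                .Build();

            var trigger = TriggerBuilder.Create()
                .WithIdentity("weeklyTrigger")
                .WithCronSchedule("0 0 3 ? * MON")   // every Monday at 03:00
                .Build();

            scheduler.ScheduleJob(job, trigger);
        }
    }

For the "picks up again after a restart" behaviour you'd configure Quartz.NET's persistent ADO.NET job store rather than the default in-memory one.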
Distributed Queue -- NServiceBus

A good service bus is worth its weight in gold. Basically, what you want to do is ensure that your workers only perform a given operation while there are operations queued for them. If you ensure your operations are idempotent, NServiceBus is a great way to accomplish this.
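A minimal sketch of a worker handler, assuming an older synchronous NServiceBus handler signature (current versions use an async Handle with an IMessageHandlerContext); the ProcessWordBatch message and its BatchId property are hypothetical:

    using NServiceBus;

    // A command telling a worker which batch of words to process.
    public class ProcessWordBatch : ICommand
    {
        public int BatchId { get; set; }
    }

    // Each worker endpoint runs this handler.
    public class ProcessWordBatchHandler : IHandleMessages<ProcessWordBatch>
    {
        public void Handle(ProcessWordBatch message)
        {
            // Look up the words for message.BatchId, run the expensive operation,
            // and store the results locally (see the caching section below).
            // Keep this idempotent: re-processing the same BatchId must be safe.
        }
    }

Each queued ProcessWordBatch message is consumed by a single worker, which is what spreads the batches across machines.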
Queue -> Worker 1 + Worker 2 + Worker 3 -> Local Data Storage
Local Data Storage -> Data Queue + Workers -> Remote Data Storage

Data Cache -- RavenDB or SQLite

Basically, in order to keep the return values of the given operations sufficiently isolated from the SQL Server, you want to cache each value somewhere in a local storage system. This could be something fast and non-relational like RavenDB, or something structured like SQLite.
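A minimal sketch of the local cache using SQLite, assuming the System.Data.SQLite ADO.NET provider; the Results table and connection string are made up for illustration:

    using System.Data.SQLite;   // assumes the System.Data.SQLite provider

    public static class LocalResultCache
    {
        private const string ConnectionString = "Data Source=results.db";

        public static void Save(string word, string result)
        {
            using (var connection = new SQLiteConnection(ConnectionString))
            {
                connection.Open();

                // Hypothetical single-table schema for cached results.
                using (var create = new SQLiteCommand(
                    "CREATE TABLE IF NOT EXISTS Results (Word TEXT PRIMARY KEY, Result TEXT)", connection))
                {
                    create.ExecuteNonQuery();
                }

                // INSERT OR REPLACE keeps the write idempotent if a batch is re-processed.
                using (var insert = new SQLiteCommand(
                    "INSERT OR REPLACE INTO Results (Word, Result) VALUES (@word, @result)", connection))
                {
                    insert.Parameters.AddWithValue("@word", word);
                    insert.Parameters.AddWithValue("@result", result);
                    insert.ExecuteNonQuery();
                }
            }
        }
    }

RavenDB would look similar in shape: open a session, Store the result document, SaveChanges.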
You'd then throw some identifier into another queue via NServiceBus and sync it to the SQL Server later; queues are your friend! :-)

Async Operations -- Task Parallel Library and TPL DataFlow

You essentially want to ensure that all of your operations are non-blocking and sufficiently atomic. If you don't know about the TPL already, you should; it's some really powerful stuff!
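A minimal sketch of the non-blocking part with plain TPL, reusing the hypothetical LocalResultCache from above; RunExpensiveOperation is a stand-in for the real per-word work:

    using System.Collections.Generic;
    using System.Threading.Tasks;

    public static class BatchProcessor
    {
        // Process the words of one batch in parallel without blocking the calling thread.
        public static Task ProcessBatchAsync(IEnumerable<string> words)
        {
            return Task.Factory.StartNew(() =>
                Parallel.ForEach(words, word =>
                {
                    var result = RunExpensiveOperation(word);   // hypothetical per-word operation
                    LocalResultCache.Save(word, result);        // local cache from the sketch above
                }));
        }

        private static string RunExpensiveOperation(string word)
        {
            // Placeholder for the real CPU- or network-heavy work done per word.
            return word.ToUpperInvariant();
        }
    }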
I hear this a lot from Java folks, but it's worth mentioning: C# is becoming a really great language for async and parallel workflows! Also, one cool thing coming out of the new Async CTP is TPL DataFlow. I haven't used it, but it seems to be right up your alley!
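And a minimal TPL DataFlow sketch of the same idea as a two-stage pipeline, assuming the System.Threading.Tasks.Dataflow package and again reusing the hypothetical LocalResultCache:

    using System.Collections.Generic;
    using System.Threading.Tasks.Dataflow;

    public static class WordPipeline
    {
        public static void Run(IEnumerable<string> words)
        {
            // Stage 1: run the expensive per-word operation, several words at a time.
            var process = new TransformBlock<string, KeyValuePair<string, string>>(
                word => new KeyValuePair<string, string>(word, word.ToUpperInvariant()),  // placeholder work
                new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 4 });

            // Stage 2: persist each result to the local cache.
            var store = new ActionBlock<KeyValuePair<string, string>>(
                pair => LocalResultCache.Save(pair.Key, pair.Value));

            process.LinkTo(store, new DataflowLinkOptions { PropagateCompletion = true });

            foreach (var word in words)
                process.Post(word);

            process.Complete();
            store.Completion.Wait();
        }
    }

The TransformBlock bounds the parallelism of the expensive stage while the ActionBlock serializes the writes by default, which fits the "non-blocking and atomic" goal nicely.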
You could create an application that acts as server software. It would manage the list of words and distribute them to the clients. Your client software would be installed on the distributed PCs.
You could then use MSMQ for a quick way to communicate back and forth.
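A minimal sketch of that server/client exchange over MSMQ, using the classic System.Messaging API; the queue name and the "batch as a delimited string" message format are just placeholders:

    using System.Messaging;   // classic .NET Framework MSMQ API

    public static class WordQueue
    {
        private const string Path = @".\Private$\WordBatches";   // hypothetical private queue

        private static MessageQueue Open()
        {
            if (!MessageQueue.Exists(Path))
                MessageQueue.Create(Path);

            var queue = new MessageQueue(Path);
            queue.Formatter = new XmlMessageFormatter(new[] { typeof(string) });
            return queue;
        }

        // Server side: push a batch of words (here just a delimited string) to the clients.
        public static void SendBatch(string batch)
        {
            using (var queue = Open())
                queue.Send(batch, "word-batch");
        }

        // Client side: block until a batch arrives, then return it for processing.
        public static string ReceiveBatch()
        {
            using (var queue = Open())
                return (string)queue.Receive().Body;
        }
    }

The server would call SendBatch for each chunk of the word list, and each client would loop on ReceiveBatch, process the words, and post its results back on a second queue.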