What sort of workloads would be appropriate for use on Amazon EC2 Spot Instances?

Obviously this is for any workload that doesn't need to be real-time. On a smaller scale, how could this apply to Stack Overflow? For example, many badges on this site are not calculated in real time.

There is a periodic process that evaluates eligibility, and it doesn't matter whether it runs at 4am or 4pm every day, as long as it runs. Doing it at 4am could be 5 cents cheaper. (Obviously they don't use EC2 at all for this.) And on a larger scale?

A search engine over a large data set might need huge computing capacity to build its indexes. If you index new data once a day and it takes 2 hours to index it on hundreds of servers, you can do it overnight and save perhaps thousands of dollars every day. Spreading workloads around the clock helps Amazon maximize utilization of its resources and therefore provide the cheapest prices on the market.
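To make the arithmetic behind the indexing example concrete, here is a minimal sketch. The fleet size and both hourly prices are made-up placeholders, not Amazon's actual rates; only the 2-hour run on "hundreds of servers" comes from the example above. Prices are kept in whole cents to avoid floating-point rounding.

```python
def savings_per_run_cents(servers, hours, regular_cents, spot_cents):
    """Cents saved on one run by paying the spot price instead of the regular one.

    Both prices are in cents per instance-hour (hypothetical values).
    """
    instance_hours = servers * hours
    return instance_hours * (regular_cents - spot_cents)


# Hypothetical inputs: 300 servers for 2 hours, 50 cents regular vs 20 cents spot.
saved = savings_per_run_cents(servers=300, hours=2, regular_cents=50, spot_cents=20)
print(f"saved ${saved / 100:.2f} on this run")
```

How large the saving really is depends entirely on the gap between the two prices at the time of the run, which is the part you can't know in advance.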

+1: all good points IMO. – jldupont Dec 15 '09 at 3:16

Amazon could only think of these workloads: image and video processing, conversion and rendering; scientific research data processing; and financial modeling and analysis.

Spot Instances remind me of "double tariff electricity meters", where you pay less for energy when demand is lower. I think it is a very interesting concept, and quite an unexpected introduction to the cloud, but it will probably be difficult to apply to conventional business problems.

Hmmm... just copying text from the page I already referenced isn't creative IMO. – jldupont Dec 15 '09 at 2:18

Yes, I know :) ... However, the fact that Amazon could only think of those 3 scenarios probably means something... Since they are instances that can be terminated at any time, it makes them very difficult to apply to business problems. – Daniel Vassallo Dec 15 '09 at 2:24

+1 for Daniel's observation that it's hard to see how spot instances could be used to solve business problems. AFAICT there's no guarantee that making a given bid will get you an instance at all, so you can't even rely on it being launched in off-peak hours - it all depends on the demand at the time. Thus, +1 for your original question as well. – gareth_bowles Dec 15 '09 at 4:29

I am considering setting up a flexible cluster (say, Hadoop) with a backbone that runs on regular instances, plus a few sets of additional instances bid at decreasing spot prices. As the spot price drops, more of those instances become available to process work units. If the price rises, those nodes are shut down.
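The tiered setup described above can be sketched as follows. This is only an illustration: the tier names, bids and node counts are invented, and it assumes a tier's instances run whenever its bid is at or above the current spot price.

```python
def active_tiers(bid_tiers, spot_price):
    """Return {tier_name: node_count} for tiers whose bid covers the spot price.

    bid_tiers maps a hypothetical tier name to (max_bid, node_count).
    """
    return {
        name: nodes
        for name, (max_bid, nodes) in bid_tiers.items()
        if max_bid >= spot_price
    }


def cluster_size(backbone_nodes, bid_tiers, spot_price):
    """Backbone nodes always run; spot tiers come and go with the price."""
    return backbone_nodes + sum(active_tiers(bid_tiers, spot_price).values())


# Hypothetical tiers: bids in cents per instance-hour, decreasing per tier.
TIERS = {"high": (10, 4), "mid": (7, 8), "low": (4, 16)}

for price in (12, 8, 5, 3):
    print(price, cluster_size(backbone_nodes=5, bid_tiers=TIERS, spot_price=price))
```

With these numbers the cluster never drops below its 5 backbone nodes, and each price increase removes at most one tier at a time rather than every spot node at once.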

The cluster handles a shutdown by re-issuing the work units to other nodes, just as it would for a node failure. Obviously this is a rather hostile environment, so some adjustments need to be made. If you work with standard 3-fold replication for the global filesystem and the three nodes holding a block's replicas are shut down at the same time, you lose that block.

Spreading the bids across several spot prices decreases the likelihood of losing many nodes in one fell swoop. Increasing the replication factor will reduce the impact, and disk space is free with the instance anyway, so that won't be a factor. Is this enough?

We'll see.
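For a rough feel of the replication risk mentioned above, here is a back-of-the-envelope sketch. It assumes each block's replicas sit on distinct nodes chosen uniformly at random, which is a simplification (real placement policies are not uniform), and the cluster sizes used are made up.

```python
from math import comb


def block_loss_probability(total_nodes, terminated_nodes, replication):
    """Chance that every replica of one block sits on a terminated node.

    Assumes replicas are placed on `replication` distinct nodes picked
    uniformly at random from `total_nodes` (a simplifying assumption).
    """
    if not 0 < replication <= total_nodes:
        raise ValueError("replication must be between 1 and total_nodes")
    if not 0 <= terminated_nodes <= total_nodes:
        raise ValueError("terminated_nodes must be between 0 and total_nodes")
    # comb() returns 0 when fewer nodes are terminated than there are replicas.
    return comb(terminated_nodes, replication) / comb(total_nodes, replication)


# Hypothetical cluster: 20 storage nodes, 10 of them lost in one price spike.
for r in (3, 4, 5):
    print(r, block_loss_probability(total_nodes=20, terminated_nodes=10, replication=r))
```

Under this model, raising the replication factor or shrinking the number of nodes that can disappear together (for instance by spreading bids) both reduce the per-block loss probability. Multiplying the result by the number of blocks gives the expected number of lost blocks.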

+1: creative thinking. – jldupont Dec 21 '09 at 11:56

Oh BTW my main business is computing, not writing software to run clusters. So if any of you guys know other people/projects doing anything like this, let me know :P – Ranieri Dec 21 '09 at 11:59
