This is a perfect use of EC2. Memory bandwidth in my tests has been 
pretty stable most of the time, though you might get unlucky once in a 
while; run multiple tests and draw your conclusions from the aggregate. 
If you keep the traffic within EC2, all you're paying for is EC2 
compute time (an instance up for even a minute is billed as a full 
hour), which means you can get high-CPU-count instances that match 
your real systems.
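To spot that kind of run-to-run variance, here is a minimal sketch of 
a repeated memory-bandwidth probe (a crude pure-Python stand-in for a 
real tool like STREAM; the buffer size and trial count are arbitrary 
choices of mine):

```python
import statistics
import time

def measure_copy_bandwidth(size_mb=64, trials=5):
    """Time copying a large buffer, repeated, as a rough bandwidth probe.

    Repeating the measurement lets you see whether a noisy neighbor
    on the same physical host is eating your memory bandwidth.
    """
    buf = bytearray(size_mb * 1024 * 1024)
    rates = []
    for _ in range(trials):
        start = time.perf_counter()
        bytes(buf)  # forces a full read + write pass over the buffer
        elapsed = time.perf_counter() - start
        rates.append(size_mb / elapsed)  # MB/s (copy rate)
    return min(rates), statistics.median(rates), max(rates)

low, mid, high = measure_copy_bandwidth()
print(f"copy MB/s: min={low:.0f} median={mid:.0f} max={high:.0f}")
```

A wide spread between min and max across several runs is the symptom 
you'd be looking for.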

For disk I/O, the initial reads are slow (around 40 MB/s), but after 
that my disk I/O benchmark results range from 150 to 220 MB/s.
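A minimal sketch of the kind of sequential throughput test I mean (a 
crude stand-in for dd or fio; the file name and sizes are arbitrary 
choices of mine). Running it twice against the same device shows the 
slow-first-read effect:

```python
import os
import time

def disk_throughput(path="scratch.bin", size_mb=256, block_kb=1024):
    """Sequential write-then-read throughput in MB/s on a scratch file."""
    block = os.urandom(block_kb * 1024)
    blocks = size_mb * 1024 // block_kb

    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(blocks):
            f.write(block)
        f.flush()
        os.fsync(f.fileno())  # make sure the data actually hit the disk
    write_mbs = size_mb / (time.perf_counter() - start)

    start = time.perf_counter()
    with open(path, "rb") as f:
        while f.read(block_kb * 1024):  # sequential read to EOF
            pass
    read_mbs = size_mb / (time.perf_counter() - start)

    os.remove(path)
    return write_mbs, read_mbs
```

Note the read pass may be served from the page cache rather than the 
disk; dropping caches (or using a file larger than RAM) gives more 
honest numbers.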

Wayne Johnson wrote:
> We have a need to stress test our product.  We have a few multi core 
> machines to run as a DB server and App server but they are pretty 
> heavily used and hiding under a desk. 
>
> Now the question.  Would it be reasonable to try and run stress 
> testing on EC2 (or other) farms?  Since we only will need them 
> occasionally, but beat them to death when we do?  Would it be cost 
> effective to run this on a farm?  If there is no control over how much 
> memory bandwidth you get, you may not be able to get a consistent 
> load.  Is there a similar issue with disk I/O?
>
>
>  
> ---
> Wayne Johnson,             | There are two kinds of people: Those
> 3943 Penn Ave. N.          | who say to God, "Thy will be done,"
> Minneapolis, MN 55412-1908 | and those to whom God says, "All right,
> (612) 522-7003             | then, have it your way." --C.S. Lewis
>
>
> ------------------------------------------------------------------------
> *From:* Elvedin Trnjanin <trnja001 at umn.edu>
> *To:* steve ulrich <sulrich at botwerks.org>
> *Cc:* Mike Miller <mbmiller+l at gmail.com>; TCLUG List 
> <tclug-list at mn-linux.org>
> *Sent:* Tuesday, July 7, 2009 12:15:04 PM
> *Subject:* Re: [tclug-list] cheapest "farm"?
>
> I've spent a lot of time working with EC2 and I would not really 
> recommend it for this purpose without putting a lot of effort into 
> planning and considering all the options. First of all, EC2 can be 
> more expensive than purchasing your own hardware unless you do it 
> right. There are two billing types of Amazon Machine Image (AMI) 
> instances: on-demand and reserved. On-demand instances are intended to 
> be up for the short term - from a few hours to days. Their pricing per 
> hour reflects this. Reserved instances are cheaper to run per hour (3 
> cents compared to 10 cents for certain instances) since you pay a 
> chunk of money up front. Throwing your infrastructure in the cloud is 
> not always cost effective unless you plan it correctly. (There are 
> companies that do this - I work for one) Keep in mind that after a 
> year or two of hardcore EC2 usage, you might have spent enough to have 
> purchased your own cluster; all expenses after that point are wasted 
> money.
>
> The other issue is designing your infrastructure around non-persistent 
> storage. You might need to set up your own AMIs to ease some of the 
> initial configuration (application installation and cluster management 
> software). While you can use the many gigabytes EC2 instances come 
> with for scratch space, you will need a combination of Simple Storage 
> Service (S3) and Elastic Block Storage (EBS) for persistent storage. 
> Each of these services has its own limitations. S3 can store an 
> unlimited amount of files but maximum file size is around 5GB. An EBS 
> volume can only be mounted by one instance at a time (for now). An EBS 
> volume is also only available to EC2 instances in the same 
> availability zone. You can think of availability zones as data centers 
> in the same geographic region (although this isn't necessarily correct).
>
> While data transfers are free between EC2 instances (over local IP 
> addresses), they are not when you are using the public IP, even 
> between EC2 instances, from what I've heard. If you're transferring 
> gigabytes or even terabytes of data to be computed or resulting from 
> computation, this can be an expensive and slow process. Amazon 
> provides a service (AWS Import/Export) where you can send in storage 
> devices and they'll copy the data over to S3. If you have a lot of 
> devices, it can be very expensive. Amazon does provide a nice and 
> simple calculator for this - 
> http://awsimportexport.s3.amazonaws.com/aws-import-export-calculator.html 
> - so that you can pick which option works best.
>
> They also have another calculator for their other services like EC2 
> and S3 - http://calculator.s3.amazonaws.com/calc5.html
>
> The biggest flaw with EC2 is that while you do have guaranteed CPU and 
> memory resources, there is no guarantee of memory bandwidth. This 
> means if there is a separate instance from a different AWS account 
> sharing the same physical machine as your compute job, the other 
> instance could be taking up all or most of the memory bandwidth thus 
> making your job run slower. Not only does your job take longer to 
> finish, it is actually more expensive.
>
> Since the infrastructure for power, space, and cooling already exists 
> for you, it might be a better route to go with purchasing your own 
> hardware. The biggest issue I see with deciding how many cores to put 
> in a system is the network architecture you choose to purchase. If you 
> choose to go with gigabit Ethernet, it doesn't make a huge difference. 
> If you're thinking of using high speed interconnects like Infiniband, 
> the number of systems you have is crucial since the switches and 
> adapters can cost quite a bit of money. While a 24 port switch can be 
> reasonably cheap (around $5000), a 48 port switch may not be 
> ($20k-50k - 
> http://www.provantage.com/scripts/search.dll?QUERY=Infiniband+switch ) 
> so you would need to buy multiple smaller switches to get the right 
> number of ports, and then add the right number of switches so that 
> you can have good enough bisection bandwidth.
>
> For the current Intel Xeon (non-Nehalem) processors, you shouldn't 
> really get more than 8 cores in a system; if you go over that 
> count, there isn't enough memory bandwidth to keep them all well fed 
> with work. Dell and sometimes Sun offer good deals to academic groups, 
> so you might benefit from that. Both companies also offer free trials 
> of hardware so you can benchmark your applications on each and pick 
> which is best. While you could get multiple AMD nodes with equal 
> power for about the same price as a single Intel node, keep in 
> mind that the cost of having many less powerful systems, as opposed 
> to a few very powerful ones, can be a financial hit in the future.
>
> steve ulrich wrote:
>> mike -
>>
>> building out your own compute infrastructure is so 2002. ;)
>>
>> i've used amazon EC2 for a very similar application where i've been
>> running large simulations on their infrastructure with my own VM image
>> that i use for my purposes.  you can simply dial up the number of
>> processors that you purchase and use.  you're charged by the hour for
>> the number of CPU instances you use.
>>
>> instead of buying hardware yourself that you have to power up, 
>> replace HDDs in, and manage connectivity for, you can let someone 
>> else handle that and simply use their resources on demand.
>>
>> On Tue, Jul 7, 2009 at 9:29 AM, Mike Miller<mbmiller+l at gmail.com> wrote:
>>   
>>> We want to put together a few computers to make a little "farm" for doing
>>> our statistical analyses.  It would be good to have 50-100 cores.  What is
>>> the cheapest way to go?  About 4GB RAM per core should be more than
>>> enough.  I'm thinking quad-core chips are going to be cheaper.  How many
>>> sockets per mobo? I guess 1-, 2- and 4-socket mobos are available.  We
>>> don't need SMP, but we'll take it if it is cheap (which I doubt).  We'll
>>> use cloned HDDs in these boxes. My first thought is "blade" but maybe
>>> blades are more expensive than somewhat less convenient ways of housing
>>> the mobos.
>>>
>>> We have people here to house it and manage it and to pay for
>>> electricity(!). They also will have ideas about what we should buy.
>>>
>>> Any ideas?
>>>
>>> Which CPU gives the most flops/dollar these days?
>>>
>>> Mike
>>>
>>> _______________________________________________
>>> TCLUG Mailing List -
>>>  Minneapolis/St. Paul, Minnesota
>>> tclug-list at mn-linux.org
>>> http://mailman.mn-linux.org/mailman/listinfo/tclug-list
>>>
>>>     
>>   
>
>
