Present Perfect



Low cost per core

Filed under: Question,sysadmin,Work — Thomas @ 12:16 pm


For work, I’m re-reviewing servers and systems with the simple-but-not-easy goal of lowering the basic monthly cost per core. The world of racks, servers, CPUs and cores is a more complicated place than it was a few years ago, since in a few rack units you can put anything from a bunch of small cheap servers up to monster boards with four CPU sockets and 12-core CPUs, for a total of 48 cores in a 2U space. And a look at a Blade system still makes me drool, although I’m still not sure in what case a Blade really makes sense.

In any case, I tried to do a little comparison (which is hard, because you end up comparing apples and oranges) using Dell’s online configurator.

On the one hand, filling racks with PowerEdge R810 machines (4 x 8 core 1.86 GHz Intel Xeon L7555) gets the price down to 26 euro per core per month. Doing the same with Opterons, which surely aren’t as powerful as the Intel ones, I can get a PowerEdge R815 with quad 12-core 2.2 GHz Opteron 6174 CPUs, for 48 cores total, at 9.53 euro per core per month.
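As a rough sketch of the arithmetic behind these two figures, the Python below derives the implied monthly cost per machine and the per-core speedup the Xeon would need to match the Opteron on cost per unit of work. Only the per-core prices and core counts come from the comparison above; the function names are made up for illustration, and the break-even ratio assumes that work scales linearly with per-core speed, which may not hold for a real workload.

```python
# Figures quoted in the post (euro per core per month, cores per machine).
R810_COST_PER_CORE = 26.0
R810_CORES = 4 * 8          # 4 sockets x 8 cores
R815_COST_PER_CORE = 9.53
R815_CORES = 48             # 48 cores total


def monthly_machine_cost(cost_per_core, cores):
    """Monthly cost of one machine implied by its per-core price."""
    return cost_per_core * cores


def break_even_speedup(expensive_per_core, cheap_per_core):
    """How many times faster the pricier core must be to match the
    cheaper core on cost per unit of work (assumes linear scaling)."""
    return expensive_per_core / cheap_per_core


r810_monthly = monthly_machine_cost(R810_COST_PER_CORE, R810_CORES)
r815_monthly = monthly_machine_cost(R815_COST_PER_CORE, R815_CORES)
speedup_needed = break_even_speedup(R810_COST_PER_CORE, R815_COST_PER_CORE)

print(f"R810: {r810_monthly:.2f} euro/month per machine")
print(f"R815: {r815_monthly:.2f} euro/month per machine")
print(f"Xeon core must be about {speedup_needed:.2f}x faster to break even")
```

In other words, on these numbers a Xeon core would have to do well over twice the work of an Opteron core before the R810 comes out ahead on cost alone.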

And then I thought a Blade would be an even better deal, but it turns out that it isn’t really. The cost per core, with similar CPUs, really did come out pretty much the same as the R810-based solution. Probably not that surprising in the end, since if you fill a machine with cores, the CPU cost will start dominating. But somehow I thought that Blades would end up being cheaper for maximum core power.

Maybe I’m approaching this the wrong way? If the main concern is cost per core in a datacenter, how would you go about selecting systems?


  1. As you mention yourself, there can be a considerable gap in performance between cores, so it begs the question: will optimizing for cost/core set you up with the best deal?

    Wouldn’t it be better to use a metric such as (#cores * perf per core) / (acquisition cost + power cost), which factors in the performance?

    Unless of course your goal is just to get as many cores as possible. Go for that 512 core Atom server then ;-)

    Comment by RubenV — 2010-6-21 @ 12:56 pm
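The metric suggested in the comment above can be sketched in a few lines of Python. The function name, the two candidate configurations and all of their numbers are hypothetical placeholders, not measurements or vendor prices; "perf per core" stands for whatever benchmark score matters for your workload.

```python
def perf_per_cost(cores, perf_per_core, acquisition_cost, power_cost):
    """(#cores * perf per core) / (acquisition cost + power cost).

    Both costs must cover the same period (for example, per month)."""
    total_cost = acquisition_cost + power_cost
    if total_cost <= 0:
        raise ValueError("total cost must be positive")
    return (cores * perf_per_core) / total_cost


# Hypothetical inputs, for illustration only.
candidates = {
    "fewer fast cores": perf_per_cost(32, 1.5, 800.0, 100.0),
    "more slow cores": perf_per_cost(48, 1.0, 450.0, 120.0),
}

# Pick the configuration with the most performance per euro.
best = max(candidates, key=candidates.get)
for name, score in candidates.items():
    print(f"{name}: {score:.4f} perf units per euro")
print("best:", best)
```

The ranking depends entirely on the per-core performance figure, which is why measuring it on the real application matters more than the formula itself.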

  2. I have no idea what I am talking about, but I also wanted to mention the 512 core Atom server: http://www.anandtech.com/show/3768/seamicro-announces-sm10000-server-with-512-atom-cpus-and-low-power-consumption

    I think the advantage of blades is probably the bandwidth to the outside and maybe local memory / disk also helps. They might also scale better, because you won’t run one OS instance on all of those cores.

    Comment by Christof Damian — 2010-6-21 @ 1:21 pm

  3. Over the last 15 years I’ve personally gone from a cost per core analysis to a hybrid ROI model. You need to take a deep look at what the actual abstracted resources are in a simplistic I/O model (what you put into the “black box” and what you get out of the “black box”). Once you’re able to really determine the high level goals, actually sit down and profile the applications that will be used on the systems. If your apps aren’t multi-threaded, running multiple child processes or forking out tons and tons of long-lived or highly latent individual processes, why in the world would you go for a 32 core system when you’ll only be utilizing a few cores at a single time?

    You could argue for dedicating each one of those cores to system processes, but consider: do you really need a full core dedicated to a process that does a while(1) { select(); }? That seems like a major waste of resources, which is why I preach a focus on ROI, not raw cost. If cost is your bottleneck, provide the raw numbers that show the higher ups that the budget they’ve given you is not sufficient to support the requirements, and explain what exactly will happen if they do not provide the resources to meet the bare minimum requirements.

    One other thing to keep in mind is any overhead in terms of latency. By latency, I do not simply mean CPU wait time. Think of the entire process and all the tasks required. You could be doing only a while(1) { select(); }, but when select returns a set of fds to process and you’re making system calls that require a huge amount of overhead in terms of CPU system time (not user/wait time), that will add to your context switches and to a backlog of blocking processes (if I remember/understand correctly; someone feel free to validate this). If the apps aren’t written in a failsafe manner, this can easily send your run queue skyrocketing (if I understand the underlying implementation correctly; it’s been a while since I’ve looked at how this is computed).

    Anyway, with that being said, while you can apply a metric formula or two as mentioned above by RubenV, if you really are trying to maximize your ROI, don’t go simply on the idea of “the more the better”: you may overspend much more than you need to, or you may fail in your requirements altogether and not achieve any state of usable functionality.

    Have a good one :)

    P.S. If you have any questions, monetary, job, resource, beer, etc. and want a pretty quick response shoot an e-mail to the name I used to comment (lowercased) at orbitalsecurity dot com.

    Comment by Benjamin — 2010-6-21 @ 1:45 pm
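To make the while(1) { select(); } point above concrete, here is a minimal Python sketch that runs a few iterations of such a loop on an idle socket and compares wall-clock time with CPU time. The function name, iteration count and timeout are arbitrary choices for illustration; a real daemon would block indefinitely rather than use a short timeout.

```python
import select
import socket
import time


def idle_select_loop(iterations=3, timeout=0.05):
    """Run a select() loop on an idle socket pair.

    Returns (wall_seconds, cpu_seconds) spent in the loop."""
    reader, writer = socket.socketpair()
    try:
        wall_start = time.perf_counter()
        cpu_start = time.process_time()
        for _ in range(iterations):
            # Nothing is ever written to `writer`, so each call simply
            # sleeps in the kernel until the timeout expires.
            readable, _w, _x = select.select([reader], [], [], timeout)
            assert readable == []
        wall = time.perf_counter() - wall_start
        cpu = time.process_time() - cpu_start
    finally:
        reader.close()
        writer.close()
    return wall, cpu


wall, cpu = idle_select_loop()
print(f"wall time: {wall:.3f}s, CPU time: {cpu:.3f}s")
```

A process that spends its life blocked like this uses almost no CPU time, which is the reason a dedicated core for it is hard to justify.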

  4. Does cost-per-core really matter? My particular experience has been that storage tends to dominate the total cost of a deployment because storage is still pretty spendy on the enterprise scale (for example, $10k USD buys a lot more server than it does top- or second-tier SAN). This of course ignores the fact that you might be doing something super nifty like Gluster, etc…

    Also, wouldn’t available space in the DC, cooling, and power come into play before hitting CPU walls? I’ve seen a number of people hit limits in their deployment because they planned for a higher density than what the DC could support, so ended up needing an extra cage just for additional power or cooling capacity.

    The way that I’ve always approached this problem is to divide my workload into very broad “classes” and then build a configuration of a system for each class with an appropriate amount of power for the class workload at an appropriate price point (where price == power, cooling, maintenance, ico). Web heads don’t need 8 disk slots, and db servers need more than 4 cores, etc… This allows me to buy a pile of boxes for each class at once to take advantage of larger purchasing power, then deploy from the free pool when needed and still have a consistent hardware platform.



    Comment by nathan hruby — 2010-6-21 @ 1:55 pm

  5. You should also include the power draw in the cost calculation, as the different configs will draw different total amounts, which will affect the annual cost of running the server. Blades will take less power, but the overall draw could also affect the total rack density of the chassis and offset the advantage.

    Comment by Peter Robinson — 2010-6-21 @ 2:02 pm

  6. You also want to look at TCO from a cabling perspective. A normal rackmount server in my data center takes 5 ethernet cables, 2 power cables and 2 fiber runs. A blade chassis, by contrast, takes 12 network connections, 4 power cables and 2 fiber runs, and holds 14 servers. It takes far fewer cables, and thus far fewer expensive ports on big network core switches and SAN directors. Cost per core only takes into account the cost of server hardware acquisition, and of course that is not the whole picture.

    Comment by stahnma — 2010-6-21 @ 2:51 pm
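The cabling comparison above works out as follows in a short Python sketch. The per-server and per-chassis counts are the ones quoted in the comment; treating the chassis's "12 network connections" as ethernet, and the names used here, are assumptions made for illustration.

```python
# Connection counts quoted in the comment above.
RACKMOUNT_SERVER = {"ethernet": 5, "power": 2, "fiber": 2}
BLADE_CHASSIS = {"ethernet": 12, "power": 4, "fiber": 2}
SERVERS_PER_CHASSIS = 14


def scale(connections, count):
    """Total connections of each kind for `count` identical units."""
    return {kind: n * count for kind, n in connections.items()}


# 14 rackmount servers versus one blade chassis holding 14 servers.
rackmount_total = scale(RACKMOUNT_SERVER, SERVERS_PER_CHASSIS)
blade_total = scale(BLADE_CHASSIS, 1)

for kind in RACKMOUNT_SERVER:
    print(f"{kind}: {rackmount_total[kind]} rackmount vs {blade_total[kind]} blade")
print("all cables:", sum(rackmount_total.values()), "vs", sum(blade_total.values()))
```

For the same 14 servers that is 126 cables against 18, and each cable saved is also a switch or SAN director port saved.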

  7. Another hidden cost you might consider – how many cores you can actually fit into square footage. Space costs money – the more CPU power you can cram into an area, the better off you are, particularly if you’re experiencing rapid expansion.

    Comment by robyn bergeron — 2010-6-21 @ 4:17 pm

  8. Power & cooling are really big “cost centers”, and many massive hosters are doing funky things in that area – OVH comes to mind, as does the “just open the windows” cooling strategy.

    Given my lack of experience in really big deployments, I would try to approach it with the end goal in mind: reducing the actual monthly bill, month by month. Measure total cost for a given rack, experiment, repeat. That probably doesn’t work well for experiments that require big upfront investment (total (anal ?) airflow management for instance), but certainly a measure of “continuous improvement” is always good.

    Comment by Stefan P — 2010-6-21 @ 4:24 pm

  9. I think the aforementioned comments are worthy. I will add the following thoughts:
    * It depends on your needs, current as well as projected.
    * Consider how upgrades and expandability of the systems procured will fit into your decision process.
    * If virtualization and computing power are in your equation for management, review accordingly.
    * Consider the disadvantages as well as advantages as side notes.

    Happy computing to you. :)

    Comment by David Ramsey — 2010-6-22 @ 5:28 am

  10. I’ve found this blog very insightful concerning the balance between cores, power, cost, scale, performance, etc.

    Comment by Floris — 2010-6-22 @ 10:02 am

  11. Having done a cost comparison about a year ago I came to understand three benefits of the systems:

    1. Higher rack density, though this can be a negligible benefit with some recent server designs;
    2. built-in redundancy; and
    3. low latency, high bandwidth interconnects (typically 10Gig-E)

    Redundancy was the draw for me, since it gives you nice, self-contained groups of server hardware that have 10Gig-E and Fibre-Channel redundancy (through the optional modules).

    Outside of those considerations though, it’s probably not getting you anything you can’t do with good management software.

    As for core count, and as others here have been saying, it’s usually a more complicated problem than core count alone. That said, I’m guessing you already know you need lots of cores for some kind of single purpose cluster. I would recommend asking for some evaluation hardware or even buying some proof of concept hardware so you can actually do performance tests. If the Intel processors don’t give you a marked improvement per core for your particular application then obviously the AMD processors are going to be a better value when you scale up.

    Comment by Thub — 2010-6-22 @ 9:02 pm
