In the last post I wrote about disk latency and how to detect it. Today I want to build on that.
Imagine an urgently called meeting where somebody says one of these:
- »We have to run this on dedicated hardware, we need more performance«
- »Customers are complaining about occasional slow response times; this app does not seem to work with insert-your-favorite-hypervisor-here«
Having a déjà vu reading this? Many people still think that critical or heavy workloads are not suited for virtual environments. But why is that so? Probably bad experiences while trying to virtualize something. While I am a huge fan of virtualization, I have to admit that I have had similar experiences over the last few years: slow databases, unresponsive user interfaces, even application errors that did not happen on dedicated hardware.
Those experiences stand in direct contrast to vendor benchmarks like these, virtualization overhead analyses like this one or a study like this. In most of the performance issues I analyzed, the gap between near-native speed and the unusable applications I witnessed was caused by issues in the storage environment. The reason for that is simple: it is the weakest link in the virtual chain (i.e. the slowest component). In a dedicated setup like a vBlock you can easily detect issues, but how about large shared SAN infrastructures? Before I go into details, some background on VMware and today's storage arrays. If you are running another hypervisor, hang on: most of what I write still applies, but the solution is different.
VMware added a pretty neat feature to vSphere 4.1 called Storage IO Control, or SIOC for short. It distributes the available IO capacity fairly in case of congestion (i.e. increased IO latency). I will not go into the gory details here; if you are interested, I recommend this paper. In short, SIOC distributes the available IO capacity to the VMs according to their shares: a VM with 10% of the total shares will get at least 10% of the total capacity. Shares were available before 4.1, but with SIOC the mechanism works across all ESX hosts sharing a datastore (i.e. running in the same cluster). In addition, SIOC does nothing as long as the storage array (or the path to it) is not saturated, so lower-priority VMs can consume more IOs as long as other VMs do not need them.
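To make the share mechanism a bit more tangible, here is a toy sketch in plain Python (not a VMware API; the VM names, share values and the 15,000 IOPS budget are made up) of how a congested datastore's capacity gets split proportionally to shares:

```python
# Toy illustration of proportional, share-based IO distribution under congestion.
# The VM names, share values and the 15,000 IOPS budget are made-up examples.

def distribute_iops(total_iops, vm_shares):
    """Split an IOPS budget proportionally to each VM's share value."""
    total_shares = sum(vm_shares.values())
    return {vm: total_iops * shares / total_shares
            for vm, shares in vm_shares.items()}

vm_shares = {"db-vm": 2000, "app-vm": 1000, "batch-vm": 1000}  # hypothetical shares
allocation = distribute_iops(15000, vm_shares)

for vm, iops in allocation.items():
    print(f"{vm}: at least {iops:.0f} IOPS under congestion")
# db-vm gets at least 7,500 IOPS, app-vm and batch-vm at least 3,750 IOPS each.
```

Keep in mind that this split only kicks in once the latency threshold is crossed; below it every VM can consume whatever it wants.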
Now, to the core part of this post. If you take a look at today's storage arrays you will notice that almost all vendors offer pooling functionality that aggregates multiple RAID sets into one big chunk of storage. NetApp calls this an aggregate; EMC, HDS and HP call it (disk/storage) pooling.
Block-based arrays often stripe provisioned LUNs across all RAID sets in a pool, achieving throughput not possible with a single RAID set. In addition, this approach allows thin provisioning, easy dynamic LUN resizing and overall less work for the storage engineer.
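To picture what this wide striping does, here is a minimal sketch (plain Python; the extent size and the number of RAID sets are invented for illustration) of a round-robin mapping of a LUN's extents across the RAID sets of a pool:

```python
# Toy round-robin striping of a LUN's extents across the RAID sets of a disk pool.
# The extent size and the number of RAID sets are invented for illustration.

RAID_SETS = 8        # e.g. a pool built from 8 RAID groups
EXTENT_MB = 1024     # hypothetical extent/chunk size

def extent_to_raid_set(lun_offset_mb):
    """Map a LUN offset to the RAID set holding that extent (simple round-robin)."""
    return (lun_offset_mb // EXTENT_MB) % RAID_SETS

for offset in range(0, 10 * EXTENT_MB, EXTENT_MB):
    print(f"LUN offset {offset:>5} MB -> RAID set {extent_to_raid_set(offset)}")
# Sequential extents land on different RAID sets, so a single LUN can use the
# spindles of the whole pool instead of just one RAID group.
```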
[Figure] No QoS: per-datastore latency on a shared disk pool without SIOC
Sounds great? Imagine a mid-range array with 200 2.5" 10k disks in a single pool, offering approx. 15,000 IOPS in a standard RAID 5 7+1 setup with a 70/30 read/write ratio. What happens if you provision multiple datastores from this single pool and all VMs combined want to consume more than the offered 15k IOPS?
In a classical setup you cannot guarantee any quality of service, as the latency graph above shows: as soon as a single datastore has issues, all other datastores in the pool are affected as well.
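In case you are wondering where the 15,000 IOPS figure comes from, here is the usual back-of-the-envelope math (a rough sketch; the per-disk IOPS value and the RAID 5 write penalty of 4 are rule-of-thumb numbers, not measured values):

```python
# Rough IOPS estimate for the example pool: 200 x 10k disks, RAID 5 (7+1), 70/30 read/write.
# IOPS_PER_DISK and the write penalty are rule-of-thumb values, not measurements.

DISKS = 200
IOPS_PER_DISK = 140            # typical rule of thumb for a 10k rpm disk
READ_RATIO, WRITE_RATIO = 0.7, 0.3
RAID5_WRITE_PENALTY = 4        # each host write costs roughly 4 backend IOs in RAID 5

raw_iops = DISKS * IOPS_PER_DISK
# Effective host IOPS: every read costs 1 backend IO, every write costs 4.
effective_iops = raw_iops / (READ_RATIO + WRITE_RATIO * RAID5_WRITE_PENALTY)
print(f"~{effective_iops:,.0f} host IOPS")   # roughly 14,700, i.e. approx. 15k
```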
Luckily, as you can read in a knowledge base entry from VMware, SIOC can handle this situation as long as two requirements are met. First, all datastores provisioned from the disk pool must be managed by a single vCenter. Second, the disk pool should not be shared with other, non-virtual workloads. While SIOC can detect such external workloads, it can only prevent starvation and ensure that the remaining IO capacity is fairly distributed across all virtual machines. SIOC simply can't offer any real quality of service in this setup as long as there is no array/SAN-based QoS mechanism in place. But more on that in a later post.
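To illustrate that last point, here is the toy share logic again, this time with a hypothetical non-virtual workload sitting on the same pool. SIOC can only divide up whatever that workload leaves over (all numbers are invented):

```python
# Toy sketch: an external (non-VM) workload eats into the pool's capacity first;
# SIOC can only distribute what is left across the VMs. All numbers are invented.

POOL_IOPS = 15000
external_workload_iops = 6000                    # hypothetical non-virtual consumer
remaining = POOL_IOPS - external_workload_iops   # what the VMs have to share

vm_shares = {"db-vm": 2000, "app-vm": 1000, "batch-vm": 1000}
total_shares = sum(vm_shares.values())

for vm, shares in vm_shares.items():
    print(f"{vm}: ~{remaining * shares / total_shares:.0f} IOPS left")
# No matter how the shares are set, the VMs together never get more than the
# 9,000 IOPS the external workload leaves over: fair distribution, but no real QoS.
```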
All this sounds reasonable, but you are running another hypervisor or can't afford the pricey Enterprise Plus licenses required for Storage IO Control? Right now you probably have to take the traditional approach and split the disk pool into as many parts as required (e.g. one pool per service and/or customer) to ensure a decent quality of service.
/jr