InfoSecurity India's First Magazine on Comprehensive IT Security
InfoStore Feb 2010
Tech Focus
Storage Virtualization: Understanding the Right Approach

As businesses grow ever more reliant on the exploitation of digitized content, the issue of how we handle data has never been more important. Virtualization is just one of the elements making information management tougher than ever: it brings a number of benefits, but it needs to be approached in the right way.

Storage, together with processing and networking, is one of the central pillars of information technology infrastructure. More than anything else, storage is expected simply to work, much as a dog buries a bone and expects it still to be there when he comes back. To everyone except storage managers, storage seems straightforward to the point of being quite boring.

The impact of server virtualization

However, while disk capacities may not have kept up with processor speeds, there have nonetheless been some fantastic innovations in how data is stored and managed. Such developments have not been confined to the drives themselves, though today we are seeing solid-state disks, capacities of over 2TB, perpendicular recording and so on. What we know as ‘storage’ also encompasses many advances in how information flows are managed from server to disk.

As well as ploughing their own furrow when it comes to innovation, storage technologies have to keep up with new developments in servers and networking, and the application architectures that all three infrastructure pillars need to support. Today for example, we are seeing x86 server virtualization offering new opportunities to balance and manage application workloads across physical resources. While the principle of virtualization may not be new, outside of the mainframe world the practical existence of an abstraction layer between physical resources and their logical application workloads certainly is (Figure 1).

Figure 1

From our research we know there are three reasons why organizations are using server virtualization:

  • First and foremost, organizations need to reduce costs while maintaining service levels – server virtualization is seen as a cost-effective option in terms of requiring less physical hardware, easier provisioning and reduced licensing overhead.

  • Second, there is a drive to improve the responsiveness and flexibility of IT. Virtualization simplifies operational management – e.g. a virtual machine can be moved between physical servers such that the underlying hardware can be upgraded.

  • Third, many organizations are looking to reduce their power consumption. Many physical servers are under-utilized: the workloads of roughly 5-10 such servers can be migrated, as virtual machines, onto a single physical server, and the rest switched off with immediate benefit.
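To put rough numbers on the consolidation argument, the arithmetic can be sketched as below. This is a minimal illustration rather than a sizing method: the estate of 40 servers is a hypothetical figure, the function names are ours, and it assumes every existing server becomes exactly one virtual machine and that any host can take any workload.

```python
import math


def hosts_needed(physical_servers: int, vms_per_host: int) -> int:
    """Hosts required if every existing server becomes one virtual machine."""
    if physical_servers < 0 or vms_per_host <= 0:
        raise ValueError("server count must be >= 0 and ratio must be > 0")
    return math.ceil(physical_servers / vms_per_host)


def servers_switched_off(physical_servers: int, vms_per_host: int) -> int:
    """Servers that could be powered down after consolidation."""
    return physical_servers - hosts_needed(physical_servers, vms_per_host)


if __name__ == "__main__":
    estate = 40  # hypothetical number of under-utilized servers
    for ratio in (5, 10):  # the consolidation range quoted in the text
        print(ratio, hosts_needed(estate, ratio), servers_switched_off(estate, ratio))
```

In practice the achievable ratio depends on how busy each workload really is, so a calculation like this is only a starting point for a conversation, not a promise.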

Server virtualization has its benefits, to be sure, but these are still early days for the technology. Many organizations are still only piloting virtualization, or they use it specifically for discrete, non-mission-critical workloads such as test, research and development environments. Unsurprisingly, less than a quarter of respondents to a Freeform Dynamics study earlier this year said they had anything approaching ‘expert’ level experience of virtualization.

Reviewing core requirements on storage

Innovations such as server virtualization do impose new constraints, but these are not the only forces on the storage layer. Despite the economic pressures that need to be taken into account as much as ever today, information growth remains a primary concern for many organizations – indeed, it was the number one criterion reported in a recent study concerning the deployment of new server infrastructure, so one can only imagine where it sits with respect to storage infrastructure!

Before we go on then, let’s revisit the fundamentals of what we need of our storage. First and foremost, we absolutely need to know that when something is stored, it can be retrieved consistently and reliably. Building on this, we can consider the following criteria:

Availability and performance: Two perspectives may be taken: user and workload. From a user perspective, storage availability and performance come down to service level criteria, which can be measured in seconds, though remember that a user’s ‘storage experience’ may also depend on the time taken for information to cross the network. Applications, meanwhile, may be even less tolerant than users when it comes to performance. Latency and throughput thresholds (the latter commonly expressed in Input/Output Operations Per Second, or IOPS) need to be met, otherwise the storage layer can become a bottleneck to applications. Finally, information may need to be kept available, and performance maintained at an acceptable level, in the event of a major failure, whether caused by a systems crash, fire or flood.
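Latency and IOPS are related but distinct measures. One common rule of thumb, Little's Law, says that the number of outstanding requests equals throughput multiplied by average latency. The sketch below applies it to a storage queue; the function names, the queue depth of 32 and the 5 ms limit are illustrative assumptions, not figures from any particular product.

```python
def max_iops(queue_depth: int, latency_ms: float) -> float:
    """IOPS sustainable with a given number of outstanding I/Os and latency."""
    if queue_depth <= 0 or latency_ms <= 0:
        raise ValueError("queue depth and latency must be positive")
    return queue_depth * 1000 / latency_ms


def latency_ms_for(queue_depth: int, iops: float) -> float:
    """Average latency implied by a measured IOPS figure (Little's Law)."""
    if queue_depth <= 0 or iops <= 0:
        raise ValueError("queue depth and IOPS must be positive")
    return queue_depth * 1000 / iops


def meets_threshold(queue_depth: int, measured_iops: float, limit_ms: float) -> bool:
    """True if the implied average latency is within the agreed limit."""
    return latency_ms_for(queue_depth, measured_iops) <= limit_ms


if __name__ == "__main__":
    # Hypothetical workload: 32 outstanding I/Os, 5 ms latency limit.
    print(max_iops(32, 5.0))
    print(meets_threshold(32, 3200, 5.0))
```

The point of the exercise is that an IOPS figure on its own says little: the same array can hit a high IOPS number while missing a latency threshold if requests are allowed to queue.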

Accessibility and security: Information must be accessible; this much goes without saying. Accessibility boils down to standards: data formats, the file systems supported, and the interfaces and protocols used to access storage devices, either directly or over a network (such as SATA, Fibre Channel, iSCSI and so on). It also imposes requirements on how information is shared between workloads. As a counter to accessibility, security criteria need to ensure that only authorized applications and users can access a given piece of information.

Adaptability and manageability: The storage layer needs to be able to adapt to changing conditions and requirements: the faster it can do this, the less costly it will be. In concrete terms, adding capacity to a storage ‘pool’ might require devices to be turned off or reconfigured, and the resulting delay is an overhead in both operational and business terms. Also, while it would be good (in principle) for the storage layer to respond automatically to new needs or to thresholds being reached, in practice this will require human intervention, so some level of management reporting and configuration capability will always be required.

Storage infrastructures have evolved through the years to support the needs of specific workloads. Two main paths exist, depending on the type of information to be stored:

  • File-based information (unstructured content – documents, media files and so on) has tended to be stored either on storage directly attached to servers or using a storage device attached to the local network, known as Network-Attached Storage (NAS).

  • Block-based information (generally structured content – that is, databases, data warehouses etc) has also moved from the server to be stored in specialized storage environments known as Storage Area Networks (SANs).

Server virtualization impacts each of these approaches. Let’s take a look.

What impact does server virtualization have on storage?

To understand how server virtualization impacts the storage layer, we need first to consider how larger IT environments have evolved. Apart from once-a-decade consolidation activities, IT in general is added to each time a new application is deployed – with a new set of servers and a new set of storage devices, in a so-called ‘silo’.

While this approach causes problems of its own (physical space for a start, as well as the manageability overhead of trying to deal with multiple, disparate hardware and software platforms), it nonetheless has one major benefit: that each application’s storage requirements have been considered more or less individually. The paths between application processing and storage, and the bandwidth required, can be calculated relatively straightforwardly for individual applications. And while bottlenecks have always existed, the siloed nature makes them relatively easy to spot.

From the storage perspective there is an additional downside, however. As each storage system was acquired fully specified up front to cope with expected loads, a lot of spare capacity exists today that has never held valuable data, but which is unavailable for anything else to use. Storage is today one of the highest costs on the hardware and software acquisition budget, and even higher on operational maintenance.

Enter server virtualization – which, as we have said, offers a great way (in principle) to consolidate multiple workloads onto a single physical server. The benefits of server consolidation via virtualization are generally well appreciated. In principle again, these can also result in more efficient use of storage, by breaking the silos of storage that were bought to serve the needs of particular applications.

What’s not to like? The downside of server virtualization (the clue is in the name) is that it doesn’t currently pay sufficient attention to the physical environment beyond the server. It is all very well in principle to shift multiple workloads onto a single server. But rather than having their own routes to the physical storage layer, information flows now take place using the same physical interface (Figure 2).

Figure 2

From talking to early adopters of server virtualization, we know this can easily cause a bottleneck if the workloads are data intensive and/or they require concurrent access to the same physical resources. As mentioned earlier, at least silos made bottlenecks easier to spot. Not so with virtualization: a number of IT managers have told us of difficulties understanding what physical resources a virtual machine is actually using. Management tools are not yet sufficiently mature to provide such a view, and IT staff do not yet have sufficient experience of the new environments to make any such mapping straightforward.
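The shared-interface problem can be illustrated with a back-of-the-envelope check. The sketch below is hypothetical throughout: the workload names, their peak throughput figures, the 100 MB/s interface and the 80% headroom threshold are all assumptions chosen for illustration, and it pessimistically assumes all peaks coincide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    name: str
    peak_mb_per_s: float  # peak storage throughput the workload demands


def interface_utilization(workloads: list[Workload], interface_mb_per_s: float) -> float:
    """Combined peak demand as a fraction of the shared interface's capacity."""
    if interface_mb_per_s <= 0:
        raise ValueError("interface capacity must be positive")
    return sum(w.peak_mb_per_s for w in workloads) / interface_mb_per_s


def is_bottleneck_risk(workloads: list[Workload], interface_mb_per_s: float,
                       headroom: float = 0.8) -> bool:
    """Flag the host if combined demand exceeds the chosen headroom."""
    return interface_utilization(workloads, interface_mb_per_s) > headroom


if __name__ == "__main__":
    # Three hypothetical workloads consolidated onto one host.
    consolidated = [
        Workload("mail", 40.0),
        Workload("database", 60.0),
        Workload("file-share", 25.0),
    ]
    print(interface_utilization(consolidated, 100.0))
    print(is_bottleneck_risk(consolidated, 100.0))
```

Each of these workloads would sit comfortably on its own 100 MB/s path in a siloed world; it is only when they share one interface that the combined demand exceeds what the path can carry.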

All is not lost however – and we need to remember these are still early days for virtualization in the distributed systems environment. Management tools vendors are stepping up to the plate and working with virtualization vendors to ensure the physical and virtual worlds are both taken into account – and we are starting to see the fruits of their labours. Meanwhile, while traditional storage infrastructure (NAS, SAN and the like) evolved without any forethought of virtualization, new capabilities from a number of vendors are starting to ‘bake in’ recognition of server virtualization.

An important element in all of this is storage virtualization, which we consider below.

Bringing storage virtualization into the mix

The mechanisms to manage and deliver disk capacity to applications and users – that is, all the innovations we consider under the banner of ‘storage’ – have not been standing still. Indeed, storage virtualization has been around as a mainstream technology for longer than server virtualization.

This is not the place to go into technical detail about storage virtualization. But let’s recall the principle – once again, it is about implementing an abstraction layer between physical resources (in this case, disks and storage networking) and logical resources which can be served up to applications. As we have discussed however, the abstraction layer needs to have a pretty good grasp of what’s going on in the physical world, such that bottlenecks can be avoided, or at least spotted and dealt with.

Storage virtualization certainly has its place, not least because it makes the overall pool of storage easier to manage and enables storage to be allocated as and when it is needed (so-called thin provisioning), which has knock-on benefits in terms of reduced power utilisation. It also offers more opportunity to support disaster recovery scenarios, as data can be replicated and moved within the overall storage pool with minimum disruption to users and applications.
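The principle of thin provisioning can be shown with a toy model. This is a sketch under stated assumptions, not how any vendor implements it: the ThinPool class, its method names and the sizes used are all hypothetical. The idea it demonstrates is simply that volumes are promised a logical size up front, while physical capacity is consumed only as data is written.

```python
class ThinPool:
    """Toy model of a thin-provisioned storage pool (all sizes in GB)."""

    def __init__(self, physical_gb: int) -> None:
        self.physical_gb = physical_gb
        self.logical_gb = {}  # size promised to each volume
        self.written_gb = {}  # physical space each volume really uses

    def create_volume(self, name: str, logical_gb: int) -> None:
        # Creating a volume consumes no physical space up front.
        if name in self.logical_gb:
            raise ValueError(f"volume {name!r} already exists")
        self.logical_gb[name] = logical_gb
        self.written_gb[name] = 0

    def write(self, name: str, gb: int) -> None:
        if self.written_gb[name] + gb > self.logical_gb[name]:
            raise ValueError("write exceeds the volume's logical size")
        if self.used_gb + gb > self.physical_gb:
            raise RuntimeError("pool is out of physical capacity")
        self.written_gb[name] += gb

    @property
    def used_gb(self) -> int:
        return sum(self.written_gb.values())

    @property
    def provisioned_gb(self) -> int:
        return sum(self.logical_gb.values())

    @property
    def overcommit_ratio(self) -> float:
        return self.provisioned_gb / self.physical_gb


if __name__ == "__main__":
    pool = ThinPool(physical_gb=1000)
    for volume in ("app", "db", "files"):
        pool.create_volume(volume, logical_gb=500)
    pool.write("app", 100)
    pool.write("db", 200)
    pool.write("files", 50)
    print(pool.provisioned_gb, pool.used_gb, pool.overcommit_ratio)
```

The flip side is visible in the RuntimeError: an over-committed pool can run out of physical capacity before any single volume reaches its logical size, which is one reason monitoring and management matter so much in a virtualized storage environment.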

In practical terms however, it is not enough to consider either server virtualization, or storage virtualization, in isolation from each other. What both server and storage virtualization give us, is a number of options, tools in the tool chest as it were, that make it possible to revisit how things are done. This is particularly true with storage virtualization, as there is more than one way to implement the technology. With this in mind, let’s look at some practical examples.

Adopting a joined up approach

For better or worse, server virtualization appears to be inevitable in many organizations. But it cannot be considered in a vacuum: we know from research that it will have an impact both on hardware acquisition strategy and on management processes (Figure 3).

Figure 3

What does this mean from a storage perspective? Traditionally storage has been an “afterthought”, acquired as part of a bigger system. But it is becoming clear that, for virtualized server environments to run efficiently, the interface with storage needs to be considered in a more joined up manner. The question to ask is, “What about the storage?”, not from a siloed, application-specific perspective, but from the point of view of the environment as a whole.

Meeting the needs of the virtualized server environment may well require some investment in storage infrastructure – and we know from our research feedback that this can be difficult to justify. The good news is that a number of capabilities that were previously available only to enterprise customers are now an option for companies with smaller IT budgets. However, it is important to think about your needs first in business, rather than purely technical, terms.

You will no doubt already have a good grasp of where the challenges lie in your existing environment, but what may be lacking is the big-picture view – knowing what workloads are seen as a priority from the business perspective, and expectations in terms of service levels. It’s by building such a picture that you can understand what needs to be prioritized, as well as giving the starting point for a business case that takes into account not just the up-front capital costs but also the impacts on the existing environment, including the inevitable re-training and ongoing management overheads.

From a technical perspective as well, there’s a caveat: given the state of flux of both server and storage virtualization, you cannot assume it will all ‘just work’. Some virtual machine documentation will casually mention that, in order to take advantage of a particular capability, you will need a particular type of networked storage solution.

As a final thought, good management processes are perhaps the most important element of all. The fact that virtualization, by its very nature, helps to hide infrastructure complexity is an advantage on which many benefits are built; without good management approaches and solutions, however, that hidden complexity can also contribute to its downfall. It is likely that ongoing management capabilities, and your ability to apply them and to modify your operational procedures to take advantage of virtualization, will ultimately be the key to success or failure.

-By: Jon Collins, Managing Director & CEO, Freeform Dynamics Ltd.



© 2006-07 'InfoStore' magazine. All rights reserved.