An Introduction to Storage Virtualization
The following is the second article in a series on virtualization.
In the introductory article to this series, I noted that virtualization technology has been available for many years. Recently, however, we have arrived at an inflection point where a holistic approach to virtualization of storage, networks, and servers can produce dramatic benefits that were previously unrealistic.
Rapid advances in technology and communication have shrunk the available time for decision making, and having access to the right data at the right time has become an integral part of most corporate strategies. As every aspect of business is recorded and analyzed, we continue to gather large volumes of data, which requires increasing efforts to deduplicate, manage, and back up data. Common approaches to data management are no longer sustainable. As organizations grow larger and more complex, enterprise storage is one of the first areas to which virtualization should be applied.
What Is Storage Virtualization?
Storage virtualization is the pooling of physical storage from multiple devices so that it is viewed as a single storage entity, which greatly simplifies its management. Implicit in storage virtualization is the idea of centralizing much of a corporation’s data to the new platform. But, before I continue, I want to present three important preparatory steps:
Document Data Architecture: Well-documented data architecture will serve as a logical map to be overlaid on the physical infrastructure. It also provides an opportunity to review data security and privacy requirements.
Streamline Data: Optimize by eliminating any data that is not required by your business or records management policy.
Eliminate Duplicate Data: Most corporations keep many copies of the same information, resulting in significant wasted space.
Storage virtualization provides a good opportunity to complete housekeeping tasks that have long been on your company’s to-do list. The steps listed above will likely produce as much efficiency as the virtualization effort itself and will make your project more successful.
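As an illustration of the duplicate-elimination step, identical files can be found by comparing content hashes. The following is only a minimal sketch, assuming the data sits on a locally mounted directory tree; the names file_digest and find_duplicates and the 1 MiB chunk size are hypothetical choices made for this example, not part of any particular storage product.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Group files under root by content and keep only groups of two or more."""
    by_digest = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            by_digest[file_digest(path)].append(path)
    # Each remaining entry maps one digest to every path holding that content.
    return {d: sorted(paths) for d, paths in by_digest.items() if len(paths) > 1}
```

Each group returned by find_duplicates(root) lists paths whose contents are byte-for-byte identical. Deciding which copy to keep is a policy question that belongs to the data architecture review rather than to the script.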
There are three approaches to storage:
Network Attached Storage
Sun Microsystems pioneered work with Network Attached Storage (NAS). NAS enables files to be shared over an Ethernet network, a change from when files were stored within a single server. Sun developed the Network File System (NFS) protocol, which allows users to access a file on a remote computer as if it were on their own. Later, Microsoft developed the Common Internet File System (CIFS) protocol, which allows NAS to be used with Windows-based machines.
Storage Area Network
Whereas older storage technology required disks to be connected directly to servers over short-range cables, a Storage Area Network (SAN) allows multiple servers to connect to multiple shared disks over a dedicated network. Sharing the disks increases efficiency and makes it more realistic to implement high-availability features and a common backup solution. SAN technology requires highly reliable, low-latency, high-bandwidth networks, which often run specialized protocols such as Fibre Channel over fiber optic connections.
Internet-Based Storage
Internet-based storage, also known as “cloud” storage, is an emerging alternative that takes advantage of the prevalence of the Internet to offer users online storage, sometimes even free of charge (usually with a cap on maximum upload). Some vendors go beyond ordinary storage by including file management and collaboration tools. While the potential for Internet-based storage is tremendous, companies have to address key issues like security more comprehensively. With major companies like Google announcing a foray into online storage, Internet-based storage is likely to undergo rapid development.
Because NAS and SAN have very different architectures, companies must consider which applications the storage environment will need to support. NAS uses TCP/IP on an Ethernet network (using either the CIFS or NFS protocol) and is primarily suited for the following:
Personal e-mail repositories such as enterprise Personal Storage Tables (.pst)
Peer-to-peer data sharing
SAN typically uses specialized protocols to access block storage over an optical network and is well suited for the following:
Messaging applications like Lotus Notes or Microsoft Exchange
Low-latency, high-bandwidth applications
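The architectural difference between the two comes down to what the client asks for: a NAS client requests a named file, while a SAN client reads and writes numbered blocks and runs its own file system on top of them. The following is a minimal in-memory sketch of that contrast, not an implementation of NFS, CIFS, or any SAN protocol; the class names FileServer and BlockDevice and the 512-byte block size are assumptions made for illustration.

```python
BLOCK_SIZE = 512  # bytes per block; an illustrative value


class FileServer:
    """NAS-style target: clients name files; the block layout stays hidden."""

    def __init__(self):
        self._files = {}

    def write_file(self, path, content):
        self._files[path] = bytes(content)

    def read_file(self, path):
        return self._files[path]


class BlockDevice:
    """SAN-style target: clients address fixed-size blocks; it knows nothing about files."""

    def __init__(self, num_blocks):
        self._num_blocks = num_blocks
        self._data = bytearray(num_blocks * BLOCK_SIZE)

    def _offset(self, lba):
        # lba = logical block address, counted from zero.
        if not 0 <= lba < self._num_blocks:
            raise IndexError("block address outside device")
        return lba * BLOCK_SIZE

    def write_block(self, lba, payload):
        if len(payload) != BLOCK_SIZE:
            raise ValueError("payload must be exactly one block")
        start = self._offset(lba)
        self._data[start:start + BLOCK_SIZE] = payload

    def read_block(self, lba):
        start = self._offset(lba)
        return bytes(self._data[start:start + BLOCK_SIZE])
```

A caller would use FileServer with paths, for example write_file("/shared/report.txt", data), and BlockDevice with addresses, for example write_block(3, one_block). The block interface leaves more work to the client but imposes less overhead per request, which is one reason block storage tends to suit latency-sensitive applications.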
Advantages of Storage Virtualization
Easier Manageability: Storage is viewed and managed as a single “virtual” resource, allowing for easier configuration, policy management, and interoperability within the computing environment.
Better Utilization: Available space is used more effectively, so capital expenditures on additional hardware are made only when necessary. Virtualization also makes it easier to allocate physical resources flexibly to applications or tasks.
More Automation: Centralization to a single platform simplifies automation, which reduces manual support requirements.
Hidden Complexity: Because virtualization hides the actual complexity of the storage system, backup, recovery, and archiving policies and strategies become easier to improve. System administrators can view and manage the system as a single resource.
Complements Data Architecture and Security Efforts: Storage virtualization provides an opportunity to update data architecture as well as information management policies. It also provides an opportunity to design a level of data protection that would have been difficult or cost prohibitive in a distributed environment.
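To make the idea of a single “virtual” resource more concrete, the sketch below pools several devices into one logical address space and translates each logical block to a physical location. It is a simplified illustration under stated assumptions: the class name StoragePool, the device names, and the simple concatenation mapping are hypothetical, and real products use more elaborate mappings such as striping or thin provisioning.

```python
class StoragePool:
    """Presents several devices as one contiguous logical address space."""

    def __init__(self, devices):
        # devices: list of (name, capacity_in_blocks) tuples.
        self._devices = list(devices)
        if not self._devices:
            raise ValueError("a pool needs at least one device")
        self._allocated = 0

    @property
    def capacity(self):
        """Total number of blocks across all devices."""
        return sum(cap for _name, cap in self._devices)

    def locate(self, logical_block):
        """Translate a logical block number to (device name, block on that device)."""
        if not 0 <= logical_block < self.capacity:
            raise IndexError("logical block outside pool")
        remaining = logical_block
        for name, cap in self._devices:
            if remaining < cap:
                return name, remaining
            remaining -= cap

    def allocate(self, blocks):
        """Reserve blocks from the pool and return the first logical block reserved."""
        if blocks <= 0:
            raise ValueError("blocks must be positive")
        if self._allocated + blocks > self.capacity:
            raise ValueError("pool exhausted")
        start = self._allocated
        self._allocated += blocks
        return start

    def utilization(self):
        """Fraction of the pooled capacity that has been allocated."""
        return self._allocated / self.capacity
```

With StoragePool([("array-a", 100), ("array-b", 50)]), a single allocate(120) call succeeds even though neither device could hold 120 blocks on its own, and locate(120) resolves to block 20 on "array-b". The administrator works with one capacity figure and one utilization figure instead of one per device.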
Challenges of Storage Virtualization
Potential Interoperability Issues: Because virtualization is a concept, and not yet a standard, there can be interoperability issues between devices. While some vendors (like EMC) publish comprehensive interoperability data to reduce these issues, companies should stay with certified configurations of storage systems, software, and peripherals because of the lack of cross-vendor standards.
Backup: Consolidation of data will typically result in a vast amount of data residing in a single location. Depending on the data size, backing up and replicating this data could be a major time drain. Bandwidth availability and storage device backplane bottlenecks are other key issues.
Increased Hardware Dependence: Virtualization places increased demand on the performance of the physical hardware and also requires greater reliability. Increases in performance and reliability translate to more expensive components, but the system will be more efficient overall.
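The backup concern above can be sized with simple arithmetic: transfer time is the data volume divided by the effective throughput. The following is a rough estimate only; the function name transfer_hours and the default efficiency of 0.7 (the share of nominal bandwidth actually achieved) are assumptions for illustration, and real throughput also depends on the storage backplane and the backup software.

```python
def transfer_hours(data_terabytes, link_gigabits_per_sec, efficiency=0.7):
    """Estimate the hours needed to move a data set over a network link.

    efficiency is the assumed fraction of nominal bandwidth that is achieved.
    Decimal units are used: 1 TB = 1e12 bytes, 1 Gb/s = 1e9 bits per second.
    """
    if data_terabytes < 0 or link_gigabits_per_sec <= 0:
        raise ValueError("data size must be non-negative and link speed positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in the range (0, 1]")
    data_bits = data_terabytes * 1e12 * 8
    effective_bits_per_sec = link_gigabits_per_sec * 1e9 * efficiency
    return data_bits / effective_bits_per_sec / 3600
```

Under these assumptions, transfer_hours(50, 1) comes to roughly 159 hours, or more than six days, for 50 TB over a 1 Gb/s link. The same data over a 10 Gb/s link takes about 16 hours, which shows why bandwidth has to be planned alongside consolidation.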
Trends in Storage Virtualization
The industry is heading towards consolidation, and vendors are looking to provide end-to-end solutions rather than niche offerings. To accomplish this, major technology vendors are increasingly acquiring smaller, niche companies:
HP recently purchased LeftHand Networks, the last independent iSCSI SAN manufacturer. This will allow HP to use LeftHand’s intelligent cloning techniques.
IBM acquired online storage company Arsenal Digital Solutions.
Dell purchased EqualLogic, giving Dell a complete line of iSCSI storage systems. iSCSI is SAN over TCP/IP, which can be viewed as a hybrid of SAN and NAS.
HP also acquired PolyServe, which will speed up its clustered file system roadmap.
As mentioned, Internet-based storage is also an important trend. Google is working to make online storage easier by using web interfaces and simplifying the process of file upload and access.
Storage virtualization is well established, yet it is still one of the most exciting opportunities and the foundation for a holistic approach to a virtual computing environment.