SAN/NAS TECHNICAL TIPS
Easing I/O bottlenecks with Microsoft Scalable Networking Pack
On the surface, Microsoft's Scalable Networking Pack (SNP) may appear to have nothing to do with storage; a closer look reveals a different story. SNP addresses the performance of TCP, which carries network and storage I/O for file transfers, network-based data backup, access to network attached storage (NAS) or file servers (NFS and CIFS/Server Message Block), and iSCSI-based block storage access.
Microsoft made SNP available for Windows Server 2003 in late 2006 as part of what it refers to as the Windows Scalable Networking Initiative. The initiative aims to enable server and application performance scaling by eliminating operating system TCP network bottlenecks. SNP is also intended to be transparent to Windows-based applications and network management tools. Windows SNP is free; however, it requires Windows Server 2003 Service Pack 1 and a compatible network offload adapter.
TCP offload is key to removing operating system bottlenecks associated with TCP network processing and to reducing server CPU utilization. SNP addresses the offload of TCP network protocol processing from Windows running on a server CPU to a TCP acceleration adapter, similar to leveraging a graphics or video adapter to improve graphics and video application performance. TCP accelerator adapters are commonly called TCP offload engines, or TOEs, with some optimized for iSCSI, some for general TCP handling and some for both iSCSI and general TCP offload processing. The amount of performance or server CPU utilization improvement will vary depending on the processing power of the server, the type of applications, the size of TCP network I/O operations and the application workload.
Currently, 90% to 95% of iSCSI deployments, particularly on Windows-based servers, rely on software initiators instead of iSCSI HBAs or TCP offload adapters. It should be no surprise that the high number of software-based initiators is due in large part to the recent success and adoption of iSCSI in its target market sweet spot. Put another way, the key value propositions for iSCSI have been its low cost and the ability to leverage common or existing infrastructure to support general purpose applications at good performance levels. Thus, the adoption of iSCSI has been, for the most part, outside of the more I/O demanding applications and environments that require iSCSI offload processing.
For most iSCSI deployments, performance has been good enough, with adequate CPU bandwidth available for TCP processing. However, as more I/O intensive applications are migrated to iSCSI or NAS, both of which require TCP processing, more CPU cycles will be needed to handle the network interrupts unless an offload or TCP accelerator card is added. SNP should produce the largest improvements for applications performing medium to large-sized I/Os, where more CPU time is spent processing the network protocol stack.
Microsoft Windows SNP consists of:
TCP Chimney Offload provides a clear path to TOEs to perform network packet processing, including segmentation and reassembly. By leveraging this clear path, the operating system passes common network processing to a TOE, freeing up the server's CPUs for applications and for network exception handling. The benefit is that TOE devices that support SNP integrate more seamlessly with Windows, which can then use their capabilities to improve network I/O performance.
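To make "segmentation and reassembly" concrete, the following is a minimal Python sketch of the work a TOE takes over from the host CPU. It is an illustration only, not Microsoft's or any adapter vendor's implementation: the function names `segment` and `reassemble` and the 1460-byte maximum segment size are assumptions chosen for the example, and real TCP also handles sequence number wraparound, retransmission and overlapping segments, all of which are omitted here.

```python
from typing import List, Tuple

# (sequence number of the first payload byte, payload bytes)
Segment = Tuple[int, bytes]


def segment(data: bytes, mss: int, initial_seq: int = 0) -> List[Segment]:
    """Split an application buffer into segments of at most `mss` bytes.

    Hypothetical helper: each segment is tagged with the sequence number
    of its first byte, as TCP does.
    """
    if mss <= 0:
        raise ValueError("mss must be positive")
    return [
        (initial_seq + offset, data[offset:offset + mss])
        for offset in range(0, len(data), mss)
    ]


def reassemble(segments: List[Segment], initial_seq: int = 0) -> bytes:
    """Rebuild the original buffer from segments that may arrive out of order.

    Simplified: raises on any gap, duplicate or overlap instead of waiting
    for a retransmission.
    """
    buffer = bytearray()
    expected_seq = initial_seq
    for seq, payload in sorted(segments):
        if seq != expected_seq:
            raise ValueError(f"gap or overlap at sequence {seq}")
        buffer += payload
        expected_seq += len(payload)
    return bytes(buffer)


if __name__ == "__main__":
    payload = bytes(range(256)) * 20           # 5,120 bytes of sample data
    parts = segment(payload, mss=1460)         # 1460 is an assumed, typical Ethernet MSS
    restored = reassemble(list(reversed(parts)))
    print(len(parts), restored == payload)
```

Without offload, the host CPU performs this per-segment bookkeeping for every connection; with TCP Chimney Offload, the adapter does it and the host handles only the exceptions.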
To help accelerate performance while preserving in-order TCP packet delivery, Receive Side Scaling (RSS) enables multiple CPUs to be involved in network protocol stack processing instead of bottlenecking on a single CPU. With SNP, a network adapter is no longer tied to a single processor in a multi-CPU server. To support scaling, RSS can dynamically share inbound network traffic across multiple adapters and CPUs to meet workload and service requirements.
For network I/O intensive environments, RSS removes the bottleneck of relying on a single CPU in a multi-CPU system to process the network protocol stack, enabling more connections and more throughput per second.
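The way RSS can spread load while keeping each connection's packets in order can be sketched as follows. RSS-capable adapters typically compute a Toeplitz hash over a connection's addresses and ports and use the low bits of the result to index an indirection table of CPUs, so every packet of a given connection lands on the same CPU. The Python below is a simplified illustration under stated assumptions: the 40-byte key is an arbitrary placeholder rather than any vendor's or Microsoft's key, the 128-entry round-robin table is an assumed layout, and the helper names are hypothetical.

```python
import socket
import struct
from typing import List

# Arbitrary placeholder key for illustration only (not a real adapter key).
HYPOTHETICAL_KEY = bytes(range(1, 41))


def toeplitz_hash(key: bytes, data: bytes) -> int:
    """For every set bit i of the input (most significant bit first),
    XOR in the 32-bit window of the key that starts at bit i."""
    key_bits = len(key) * 8
    if len(data) * 8 + 31 > key_bits:
        raise ValueError("key too short for this input")
    key_int = int.from_bytes(key, "big")
    result = 0
    for i in range(len(data) * 8):
        if data[i // 8] & (0x80 >> (i % 8)):
            result ^= (key_int >> (key_bits - 32 - i)) & 0xFFFFFFFF
    return result


def flow_tuple(src_ip: str, src_port: int, dst_ip: str, dst_port: int) -> bytes:
    """Pack an IPv4/TCP 4-tuple as source address, destination address,
    source port, destination port, all in network byte order."""
    return (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
            + struct.pack("!HH", src_port, dst_port))


def build_indirection_table(num_cpus: int, size: int = 128) -> List[int]:
    """Assumed layout: a power-of-two table filled round-robin with CPU numbers."""
    return [slot % num_cpus for slot in range(size)]


def select_cpu(hash_value: int, table: List[int]) -> int:
    """Use the low bits of the hash as the table index (size must be a power of two)."""
    return table[hash_value & (len(table) - 1)]


if __name__ == "__main__":
    table = build_indirection_table(num_cpus=4)
    for port in (49152, 49153, 49154, 49155):
        flow = flow_tuple("10.0.0.1", port, "10.0.0.2", 3260)  # 3260 is the iSCSI port
        print(port, "-> CPU", select_cpu(toeplitz_hash(HYPOTHETICAL_KEY, flow), table))
```

Because the CPU is a pure function of the connection's 4-tuple, different connections can be processed in parallel while packets within one connection are never reordered across CPUs. Rebalancing then amounts to rewriting entries in the indirection table rather than changing the hash.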
Network Direct Memory Access (NetDMA) allows for a DMA engine on the peripheral component interconnect (PCI) bus. The TCP/IP stack can use the DMA engine to copy data instead of interrupting the CPU to handle the copy operation. Placing the DMA engine on the PCI bus can also offload the processing of memory-to-memory data transfers. For example, NetDMA can leverage systems enabled with Intel I/O Acceleration Technology (IOAT) to minimize CPU processing overhead when moving data packets between application and adapter memory buffers. In addition to NetDMA, Microsoft previously released remote DMA (RDMA) capability via an interface called Winsock Direct (WSD), with Sockets Direct Protocol (SDP) based on WSD.
In addition to iSCSI adapter vendors SilverBack Technologies Inc., Emulex Corp. and QLogic Corp., the Microsoft Windows Scalable Networking Initiative partners include Acer Inc., Alacritech Inc., Ample Communications Inc., Broadcom Corp., Chelsio Communications, Dell Inc., Fujitsu, Hewlett-Packard Co., IBM, Intel Corp., NEC Corp., Neterion Inc., NetXen Inc. and Nvidia Corp.
Improvements such as Microsoft SNP for Windows Server 2003 help to remove network and I/O bottlenecks so that servers can fully utilize existing 1 Gbit networks and, moving forward, more effectively leverage 10 Gbit and faster network interfaces for I/O and storage applications.
About the author: Greg Schulz is founder and senior analyst with the IT infrastructure analyst and consulting firm StorageIO. Greg is also the author and illustrator of "Resilient Storage Networks" (Elsevier) and has contributed material to "Storage" magazine and other TechTarget venues.
All Rights Reserved, Copyright 2000 - 2007, TechTarget
Copyright © 2008 Art Beckman. All rights reserved.
Last Modified: March 9, 2008