Tuesday, November 27, 2012

Slow Network Access Within Virtual Machines - Broadcom and Hyper-V

Time to update this dusty old thing! I figured this is worthy of a blog entry, as it's not exactly something you can Google quickly at the time of this writing.

Let's talk about network speeds with Hyper-V. I have been testing Server 2008 R2 with Hyper-V enabled, running a virtual Win7 guest, and for the life of me I couldn't figure out why a member server hosting file shares was returning ping response times of anywhere from 30ms to over 200ms. My research pulled up a dozen or so similar reports of network slowness, and everyone seemed to attribute the problem to TCP chimney offloading, claiming that disabling it in the operating system as well as on the physical and virtual network adapters would correct the problem instantly.

Well, it didn't. It turns out, after playing with several of the settings on the Broadcom network adapter and recreating the virtual network for the VMs half a dozen times, that the real culprit is a feature called 'Virtual Machine Queues' (VMQ). If any of you out there are running into issues with your VMs' network performance, try disabling this (along with TCP chimney offloading) and see how that works in your case. It made an immediate difference in mine, and my response time is now <1ms, as usual and as seen on any physical box on the network.

I'll flesh this out a bit later if I have time; I just wanted to get this down before I forget, and hopefully it will help a few others out. Turns out I may need to dig further, as that feature should actually increase performance according to this Dell whitepaper: http://www.dell.com/downloads/global/power/ps1q10-20100101-Chaudhary.pdf
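For anyone who prefers the command line: on Server 2012 and later, the same knob Device Manager exposes can be checked and toggled with the built-in NetAdapter cmdlets. A minimal sketch, assuming an adapter named "Ethernet" (substitute your NIC's actual name, and verify the `*VMQ` advanced-property keyword with `Get-NetAdapterAdvancedProperty` first, since drivers vary):

```powershell
# List adapters and their current VMQ state
Get-NetAdapterVmq

# Disable VMQ on the adapter bound to the Hyper-V virtual switch
# ("Ethernet" is a placeholder name)
Disable-NetAdapterVmq -Name "Ethernet"

# Equivalent via the driver's advanced property (the setting Device
# Manager shows under the Advanced tab); 0 = disabled, 1 = enabled
Set-NetAdapterAdvancedProperty -Name "Ethernet" -RegistryKeyword "*VMQ" -RegistryValue 0

# Re-enable later if a driver update fixes the underlying issue
Enable-NetAdapterVmq -Name "Ethernet"
```

On Server 2008 R2 these cmdlets aren't available, so Device Manager (or Broadcom's own management suite) is the way to go there.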

27 comments:

Greg Bogumil said...

Thanks. That was exactly my problem. The setting immediately helped!

AnonymousDog said...

Dude! Nice piece of research and good, concise write up. Thank you for standing out among the posts about TCP Offloading, Receive Side Scaling, etc.

I think the show stopper on VMQ is that, according to MS's article on the feature (http://technet.microsoft.com/en-us/library/gg162681(v=ws.10).aspx), "To use this feature, VMQ must be supported by the network hardware." So, it sounds like your switches and possibly other network equipment would have to support VMQ in order for it to work.

I know that in my case, as soon as I disabled VMQ on the host NIC (dedicated) to which guest VM NICs were connected, throughput increased by a factor of 10 and latency fell from as high as 30ms (and quite variable) to <1ms and consistent.

Clay said...

Really glad to hear folks are getting some use out of this! AnonDog, yep sounds about right. Hadn't thought about the switches and/or any other network equipment. Good food for thought!

Robert Pearman said...

Thanks, used this article to help someone in the Technet forum!

JohnvanJ said...

Thanks for this post. I was facing the same problem with my new Dell R620s, and disabling VMQ on the Broadcom adapters did the trick.

Anonymous said...

Just followed your advice, and I don't think I'm going to have to shoot myself now. As soon as I disabled VMQ, my network sped up about 10 times and latency decreased tremendously. I have two NICs, one dedicated to the host and the other to my VMs. I disabled VMQ on both. Any thoughts about whether it was necessary on the NIC dedicated to the host?

Matthew Studer said...

OMG!!!! I LOVE YOU!!! That fixed my issue with slow VM guest, broadcom NIC...

GRRRRRR

Anonymous said...

Was having the problem on a new Dell T520 with four Broadcom NICs. Network performance was horrible and ping results were wildly unpredictable. Disabled VMQ on the NICs and saw an immediate improvement: ping results <1ms. Thanks.

Anonymous said...

Amazing! Thank you for this. I was in the middle of a data migration and pulling my hair out. As soon as I disabled VMQ, my transfer speeds went up 1000%!

Anonymous said...

you are a life saver, 500+ people surfing with good speeds now and thank you as well!

Anonymous said...

Wow! I never would have found this without your help. Huge improvement! I was beginning to think that our new host machine was a lemon. Thank you, thank you, thank you!

Anonymous said...

I searched forever and only found the usual stuff about TCP offloading, etc. None of it did a thing. Found your post and instantly fixed the problem. Thanks so much, you saved my arse!

Buutteef said...

Thank you! After all that Googling and testing... this fixed my VM network performance problems. Thanx!

Anonymous said...

Thank you! If only I would have found this page earlier! Brand new Dell server.

Richard Duffy said...

Let me echo everyone else's comments and say....THANK YOU....solved my problems as well on a Dell R620.
The users were about to lynch me!!!

Richard Duffy
SAP Business One Evangelist
SAP

Ves said...

thanks worked for me and my voipswitch virtual machine

Will buy you a pint if you come by Paris, France.

Clay said...

Haha thanks Ves, glad to hear this is helping people. I rarely post unless it's something I can't find elsewhere, so sorry about the lack of content here :p

Network Admin said...

Thank you for posting this issue !!!

I am setting up a new Dell R520 server running Hyper-V and Windows Server 2012. The server's Broadcom gigabit adapters (2 LOM and 2 on a PCI card) are connected to a gigabit LAN, but network throughput to the VMs via the Hyper-V virtual switch was dismal, averaging less than 10 Kbps. Virtually unusable, pun intended. (I felt like it was 1980 all over again, and I was downloading a file from Compuserve with my brand new 2400 baud modem.)

Disabling "Virtual Machine Queues" under the ADVANCED tab of the network adapter properties in Device Manager solved the problem immediately. I even went back and toggled the VMQ setting from disabled to enabled several times while monitoring the network throughput to one of the VMs, and it was simply amazing to watch how profoundly it was affecting throughput. Makes me think that Broadcom should be changing the default setting for VMQ in their NIC driver to DISABLED.

I am so glad I Googled the issue and found this solution. Being new to Hyper-V, I was completely lost as to why the virtual machines were experiencing such poor network performance !

Anonymous said...

I just resolved this same issue using your instructions on a hyper-v machine with many guests.
I switched my vSwitch from External to Internal, changed my adapter settings as described, and then switched the vSwitch back to External on the same NIC. Worked like a charm, even on live systems, with only a three-minute outage.
Thanks!

Anonymous said...

Worked for me as well. Just to clarify, you're talking about disabling Virtual Machine Queues for the NIC used for the VMs, but in Device Manager on the host.

Anonymous said...

Awesome post. My Server 2012 VMs were really slow running on a PowerEdge; it took six minutes to log into them via Remote Desktop. I turned off VMQ for the physical NICs as well as the VMs, and login time was cut down to about six seconds.

Anonymous said...

The real issue here is a duplication of every packet running through the VM switch. Wiresharking, or pinging from a Unix host, will show DUP packets.

Funny thing was, I was unable to reproduce the issue with 1Gb NICs, but the issue was constant with 10Gb Broadcom NICs. It's clearly a bug, and disabling VMQ on VMs doing a lot of network I/O will spike the CPU. The feature is needed, but a fix is needed more.

Charlie Wiener said...

VMQ was the issue in my case as well; my host OS is 2012 R2 with a 2008 R2 guest. Turned off VMQ on the virtual switch interface and my network came alive.

Thanks

Anonymous said...

No one has yet asked why Broadcom or Microsoft hasn't addressed this issue. Obviously, many people have wasted their time on it. I spent a solid few hours trying to adjust settings, then another few hours searching online, before finally finding a related post. It's sad that nobody actually FIXES the issue. Broadcom's driver should obviously default to disabling that feature. Shame on them for wasting all our time.

Mark B said...

Add another happy reader of your solution -- much thanks. It solved my issue.

I have multiple WS 2012 Hyper-V hosts, and most of the guests have run OK because I used the built-in NIC driver, not Broadcom's. The one with the Broadcom driver had VMQ enabled, and even though it was disabled in Hyper-V, it still caused slowness. With VMQ enabled in both Hyper-V and on the NIC, it ran better. But it ran best with VMQ disabled on both.

Timothy J Robinson said...

I can't believe they haven't fixed their driver after all this time. This fix still works!