MacOS X 10.10.3 Update is *TOXIC*
As a PSA (public service announcement), I'm reporting here that updating your Yosemite system to 10.10.3 is incredibly toxic if you use WiFi.
I've seen other reports of this, and I've experienced it myself. What happened is that the update for 10.10.3 seems to have done something tragically bad to the WiFi drivers, such that it completely hammers the network to the point of making it unusable for everyone else on the network.
I have a late 2013 iMac 27", and after I updated, I found that other systems started badly, badly misbehaving. I blamed my ISP, and the router, because I was seeing ping times of tens of seconds!
(No, not milliseconds, seconds!!! In one case I saw responses over 64 seconds.) This was on other systems that were not upgraded. Needless to say, that basically left the network unusable.
(The behavior was cyclical -- I'd get a few tens of seconds where pings to 8.8.8.8 would be in the 20 msec range, and then it would start to jump up very quickly until maxing out at around a minute or so. It would stay there for a minute or two, then reset or drop back to sane times. But only very briefly.)
This was most severe when using a 5GHz network. Switching down to 2.4GHz reduced some of the symptoms -- although still over 10 seconds to get traffic through and thoroughly unusable for a wide variety of applications.
There are reports that disabling Bluetooth may alleviate this, and also some people reported some success with clearing certain settings files. I've not tried either of these yet. Google around for the answer if you want to. For now, my iMac 27" is powered off, until I can take the chance to disrupt the network again to try these "fixes".
Apple, I'm seriously, seriously disappointed here. I'm not sure at all how this got past your testing, but you need to fix this. It's absolutely criminal that applying a recommended update with security-critical fixes in it should turn my computer into a DoS device for my local network. I'm shocked that several days later I've not seen a release update from Apple to fix this critical problem.
Anyway, my advice is, if possible, hold off on the update to 10.10.3. It's tragically, horribly toxic not just to the upgraded device, but probably to the entire network it sits on. I'm a little astounded that a bug in the code could hose an entire WiFi network as badly as this does -- I would have previously thought this impossible (and this was part of the reason why it took a while to diagnose this down to the computer -- I thought the ridiculous ping responses had to be a problem with my upstream provider!).
I'll post an update here if one becomes available.
Comments
Perhaps this is a problem.
Dylan: The switch to a different router makes sense. I think that this is a physical layer problem, and that it's a bad interaction between the driver/chipset/firmware in OS X 10.10.3 and the firmware on my router. TBH, I think the router is just as complicit, but I'm not an expert in WiFi radio specifics, so this is just gut instinct on my part. (It *should* be a shared medium; my gut instinct is that perhaps the new firmware isn't properly spreading its load across the spectrum, or is somehow saturating the spectrum. Somehow, though, this is triggering bad behavior on the router. I don't fully understand it.)
brob: The proof is simple. Turn off the Mac, or turn off the WiFi on the Mac, and all other stations on the network immediately begin working as normal. Turn the Mac on (or enable its WiFi), and boom, all dead. Originally I misdiagnosed this as a router fault; it took replacing the router (with a different brand!) and then doing some further troubleshooting to determine it wasn't the router -- then I discovered it was the Mac. (Part of the reason for the misdiagnosis was that turning off the WiFi on the router immediately made life better for *wired* stations. I thought this was WiFi on the router being broken; only later did I understand my misdiagnosis.) (Frankly, if you want further proof, you can try this out yourself.) I think for real, deeper proof you need to use a spectrum analyzer, which I lack (and which I lack the skill to use).
In my case it was caused by AWDL and Airdrop.
I came across an incredibly detailed article on Medium by Mario Ciabarra about the issue, along with a helper app called WiFriedX to mitigate it.
See https://medium.com/@mariociabarra/wifriedx-in-depth-look-at-yosemite-wifi-and-awdl-airdrop-41a93eb22e48
sudo ifconfig awdl0 down
You're welcome.
I've been working on installing CeroWRT but have not had the bandwidth to fully configure and test it.
So open up OS X Activity Monitor and see which process is consuming your bandwidth. Could be Photos, iCloud, a backup tool, another cloud solution, etc.
Note that some routers use QoS and, when overloaded by data, will put all ICMP traffic (such as ping) in low priority, and there could be many devices on the network (including at your ISP or any gateway between you and the ping target) which behave the same way. So ping can give you an estimate of the RTT (2x latency), but it can also provide completely irrelevant values if it is aggressively buffered. So it is possible that ping reports an RTT of several seconds whereas opening an HTTP connection would still have an RTT of 20-60 ms, depending on the QoS applied by all network devices between your browser and the web site.
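One rough way to check whether ping is telling the truth is to time a TCP handshake instead, since that is closer to what a browser experiences. The sketch below is only an illustration: the function name `tcp_connect_ms` is made up here, and the demo talks to a throwaway listener on the loopback interface so that it runs without Internet access. To compare against your ping numbers you would point it at a real host and port of your choosing.

```python
import socket
import time


def tcp_connect_ms(host, port, timeout=5.0):
    """Return the time taken to complete a TCP handshake, in milliseconds."""
    start = time.perf_counter()
    # create_connection returns once the three-way handshake has completed.
    with socket.create_connection((host, port), timeout=timeout):
        elapsed = time.perf_counter() - start
    return elapsed * 1000.0


# Demo only: a local listener stands in for a real web server.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    rtt_ms = tcp_connect_ms("127.0.0.1", port)
finally:
    listener.close()

print(f"TCP connect took {rtt_ms:.2f} ms")
```

If the handshake time to a remote host stays in the tens of milliseconds while ping to the same host reports seconds, that would point to ICMP being deprioritized rather than to the whole link being stalled.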
Note2: deactivating Bluetooth could make your WiFi in the 2.4GHz range better, but not in the 5GHz range, unless you can point me to a Bluetooth specification/implementation which uses the 5GHz band!
For the reason why it's related, please see this article: http://lartc.org/howto/lartc.cookbook.ultimate-tc.html
If you enabled iCloud Photo Library, try to limit the upload bandwidth of your Mac (or iPhone or iPad) during the uploading process, by changing your router settings. An example of how to do it: http://www.tp-link.com/en/faq-557.html
Good luck!
But Wondershaper is missing out on the last 15 years of network research. It doesn't handle IPv6. And it is severely out of date with modern network traffic control. That's why Dave Täht wrote the article: Wondershaper Must Die http://www.bufferbloat.net/projects/cerowrt/wiki/Wondershaper_Must_Die
In direct response to the OP:
1) If turning off the Airport on the 10.10.3 computer makes everything work fine, then there's a bug in the wireless. It's not your duty to debug further.
2) You just got an OS upgrade, so you deserve to get support from Apple. Call the AppleCare folks at 800-275-2273.
3) Here's how to find out if your router is bufferbloated: go to the DSLReports Speed Test (http://dslreports.com/speedtest). It tests latency *during* the download and upload. (Other speed test sites only test a couple of pings before starting the test, so that's no test at all...) If the latency/lag figures get high during the download, then your router is bufferbloated.
4) For more info about bufferbloat, read http://richb-hanover.com/bufferbloat-and-the-ski-shop/ -- it contains recommendations for making your router less bloated.
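The idea in point 3, comparing latency while the link is idle against latency while it is busy, can be sketched in a few lines. This is not how the DSLReports test is implemented; the function names, the 100 ms threshold, and the sample numbers are all assumptions made up for illustration.

```python
from statistics import median


def added_latency_ms(idle_rtts_ms, loaded_rtts_ms):
    """Return how much the median RTT grows under load, in milliseconds."""
    return median(loaded_rtts_ms) - median(idle_rtts_ms)


def looks_bloated(idle_rtts_ms, loaded_rtts_ms, threshold_ms=100.0):
    """Flag bufferbloat when load adds more than threshold_ms (assumed value)."""
    return added_latency_ms(idle_rtts_ms, loaded_rtts_ms) > threshold_ms


# Hypothetical ping samples: first with a quiet link, then during an upload.
idle = [18.0, 21.0, 20.0, 19.0, 22.0]
loaded = [20.0, 450.0, 1900.0, 2400.0, 2100.0]

print(added_latency_ms(idle, loaded))  # 1880.0
print(looks_bloated(idle, loaded))     # True
```

The median is used rather than the mean so that one lost or wildly delayed ping does not decide the result on its own.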
Thanks Rich Brown for very valuable information!
I guess my router firmware does indeed suffer badly from buffer bloat or mishandling of QoS. (Rescheduling packets for *seconds* is tragically stupid IMO -- better to have a fair-share sharing scheme than to blindly starve one stream in favor of another.)
What's more, current-gen home routers seem to have the same problem. I tried the latest ASUS AC2400 router, and my home AC1900 router (Nighthawk) also suffers from the same problem.
Still, I also blame Apple -- their approach here to flood the network also seems incredibly poor.
First, network equipment like cable modems and routers can end up buffering a lot of packets (enough that it may take the queue seconds to drain).
Second, when a buffer or a QoS queue fills, they end up dropping the last packets in the queue.
TCP/IP has congestion management built in. It sends packets and waits for those packets to be acknowledged by the far end. Ordinary network latency is such that it can take a while (1/10th - 1/100 of a second) before the acknowledgements come back, so there is a window where it will keep sending packets while waiting for acknowledgements. Typically, this window starts out relatively small and then increases until packet loss occurs, at which point it backs off dramatically and inches back up again.
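The grow-until-loss, back-off, inch-back-up pattern is easy to see in a toy simulation. The sketch below is a deliberate simplification and not real TCP: it leaves out slow start, timeouts and everything else, and `capacity` is a made-up stand-in for how many packets the path can hold per round trip before something gets dropped.

```python
def window_trace(rounds, capacity, start=1, backoff=0.5):
    """Simulate a congestion window over a number of round trips.

    The window grows by one packet per round trip while everything is
    acknowledged, and is cut by `backoff` when it exceeds `capacity`
    (a hypothetical loss threshold for the path).
    """
    window = start
    trace = []
    for _ in range(rounds):
        trace.append(window)
        if window > capacity:
            # Loss detected: back off dramatically.
            window = max(1, int(window * backoff))
        else:
            # Everything acknowledged: inch back up.
            window += 1
    return trace


print(window_trace(rounds=20, capacity=8))
# [1, 2, 3, 4, 5, 6, 7, 8, 9, 4, 5, 6, 7, 8, 9, 4, 5, 6, 7, 8]
```

The sawtooth only works well if the sender learns about the loss quickly. A deep buffer in front of the bottleneck delays that signal, which is why the window overshoots for so long on a bloated link.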
The problem with dropping the most recently received packets is that it takes a long time for the sender to realize what's happened, and it takes a long time to adjust.
So, given these problems, perhaps Apple should be more conservative in picking an upload rate. On the other hand, poorly behaving network equipment, while not uncommon, is still broken, and how much should Apple do to accommodate that? I don't have a good answer. Ultimately, Apple gets the blame, but if they worked around this, there would be less pressure on router makers to finally fix this.
Now, all that said, I have already implemented fixes for buffer bloat on my network, and I'm not using iCloud Photo Library yet, and yet I seem to be having more WiFi-related problems since I started using 10.10.3. Or it seems that way. It's hard to track down.
It's regrettable that Apple hasn't leapt out front and implemented something like SQM/fq_codel in OS X (and in iOS). They could lord it over Microsoft, who also hasn't implemented any SQM...
So really, really, buffering in NICs and OSs needs to be smaller (a lot).
For example a 1Mbps link can only deliver 83 full MTU frames per *second*. That says, at this really slow link speed, you really want almost *no* buffering -- at most a frame or two. You can scale up as the link speeds increase.
So I have a 50 Mbps down, and 5 Mbps up link. At 5 Mbps, if I queue up 1024 frames (a common count), and they are full MTU, we are looking at about 2.5 seconds of buffer bloat.
Conversely, at 1Gbps, I can handle 83000 such frames in just one second. So this is why you see larger buffers -- because the link speeds being high suggests that you need to have deeper buffers to avoid starving the link between interrupts and context switches.
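The arithmetic behind these figures can be checked with a short sketch. It assumes a 1500-byte MTU and ignores framing overhead; the function names are made up for illustration.

```python
MTU_BYTES = 1500  # assumed Ethernet MTU; framing overhead is ignored


def frames_per_second(link_bps, frame_bytes=MTU_BYTES):
    """How many full-size frames a link can deliver in one second."""
    return link_bps / (frame_bytes * 8)


def queue_drain_seconds(frames, link_bps, frame_bytes=MTU_BYTES):
    """How long a queue of full-size frames takes to drain onto the link."""
    return frames * frame_bytes * 8 / link_bps


print(int(frames_per_second(1_000_000)))               # 83 frames/s at 1 Mbps
print(round(queue_drain_seconds(1024, 5_000_000), 2))  # 2.46 s at 5 Mbps
print(int(frames_per_second(1_000_000_000)))           # 83333 frames/s at 1 Gbps
```

The same 1024-frame queue that costs roughly two and a half seconds on a 5 Mbps uplink drains in about 12 milliseconds at 1 Gbps, which is why a queue depth that is harmless on a fast link is a disaster on a slow one.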
I've not looked extensively at CoDel, but it seems that the way to do this would be to figure out your effective link bandwidth (asymmetric doesn't help here if you have to choose just one), and then try to configure something like 1 msec of buffer at each point in the stack. You may have to go a little bit higher for slower links.
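To get a feel for what 1 msec of buffer means at different speeds, here is a rough calculation. Note that this is just the fixed-size-buffer idea from the paragraph above, not what CoDel actually does, and the function names and the 1500-byte frame size are assumptions.

```python
import math


def buffer_bytes(link_bps, delay_ms=1):
    """Bytes the link can send in delay_ms milliseconds."""
    return link_bps * delay_ms / 8000


def buffer_frames(link_bps, delay_ms=1, frame_bytes=1500):
    """Same budget in whole frames, never less than one frame."""
    return max(1, math.ceil(buffer_bytes(link_bps, delay_ms) / frame_bytes))


for name, bps in [("5 Mbps", 5_000_000),
                  ("50 Mbps", 50_000_000),
                  ("1 Gbps", 1_000_000_000)]:
    print(name, buffer_bytes(bps), buffer_frames(bps))
# 5 Mbps 625.0 1
# 50 Mbps 6250.0 5
# 1 Gbps 125000.0 84
```

At 5 Mbps, 1 msec is only 625 bytes, which is less than a single full-size frame. That is the reason for going a little higher on slow links: the buffer has to hold at least one frame to be of any use.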
Now I really wish I had a fast high end symmetric link (FiOS? Too bad I can't get it!)
But, us poor slobs with slow uplinks wind up with a ton of data buffered if we don't do anything. That's where fq_codel comes in.
Read my Bufferbloat and the Ski Shop essay (http://richb-hanover.com/bufferbloat-and-the-ski-shop/) to get a quick overview of fq_codel's machinery.
It's really clever - give it your link speeds (up and down) and it'll manage the bottleneck so that all the connections share the bandwidth fairly. (And it's a piece of cake to install if you have an OpenWrt-capable router...)
Upgraded to 10.10.3. I started seeing awful ping times. (Out of 10 pings, 5 < 70ms, 4 around 400ms, 1 timeout.)
Then I signed out of iCloud.
10/10 pings < 70ms.
10/10, would sign out of iCloud again.
(I'm an Android user and don't even use iCloud.)