Mediamarkt – Doch blöd? Or how Artists have gone over the edge


At least the German audience of this blog will probably recognize the rather blunt punchline. Mediamarkt, for those who don’t know, is Europe’s biggest electronics retail chain with something like 650 stores. For different reasons, most of them not really relevant here, they did not have a webshop up until last week. Can you believe that?

Well, now they do. Last week they launched the site and made, as they often do, a pretty bold statement: none other than Amazon is the opponent they would like to attack. So I was curious how they perform from a web performance perspective… To make a long story short: pretty badly.

The dry numbers, pulled from WebPageTest:


Amazon:

  • Time to Render: 0.7 seconds
  • Visually Complete: 2.3 seconds

Mediamarkt:

  • Time to Render: 3.4 seconds
  • Visually Complete: 18.2 seconds

Yep, you are reading that right: Amazon is already visually complete before Mediamarkt starts to draw a single pixel.

Both tests were done from WPT’s Paris node with 1.5 MBit/s bandwidth and IE8. The reason I picked Paris and not Frankfurt is that Paris apparently has more CPU headroom. With Frankfurt I often encounter measurements that are clearly bottlenecked by CPU power. Paris comes at a latency price of +10 ms, which normally doesn’t worsen the results significantly.

What is the reason, I hear you asking… Well, actually there are two things Mediamarkt is paying for. First and foremost, they are loading WAY too many CSS and JS files (green and orange bars) at the beginning of the page. There are ~15 CSS and ~15 JS files placed in the HEAD section of the page.

As you can see in the picture above, it takes almost 3.5 seconds before all necessary CSS and JS files are received. To make things worse, the browser is already pulling images before all CSS and JS files have even been requested. The “final” necessary object is actually number 44 on the wire…

They could reduce the Time to Render by quite a lot if they carefully concatenated those files or inlined part of them.
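Just to illustrate the idea, here is a trivial build-step sketch of my own (made-up file names; any real build tool does this better) that merges many small CSS files into one bundle, so the browser needs 1 request instead of ~15:

// Trivial build-step sketch: concatenate several CSS files into one bundle.
// File names are made up for illustration.
const fs = require('fs');

const cssFiles = ['reset.css', 'grid.css', 'header.css', 'teaser.css', 'footer.css'];
const bundle = cssFiles
  .map(file => fs.readFileSync('css/' + file, 'utf8'))
  .join('\n');

fs.writeFileSync('css/bundle.css', bundle);
console.log('Wrote css/bundle.css (' + bundle.length + ' bytes) from ' + cssFiles.length + ' files');

The same applies to the JS files: the fewer blocking requests sit in the HEAD, the earlier the browser can start to render.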

But as you recall, it is not only the Time to Render that is pretty bad. Visually Complete doesn’t look much better, either. The simple root cause, and the reason for the headline, is weight. Or rather: overweight. The page clocks in at 2.5 MByte! Simple math tells you that at 1.5 MBit/s it won’t get much faster than ~15 seconds to download the page (2.5 MByte is roughly 20 MBit, and 20 MBit / 1.5 MBit/s is about 13 seconds, before any protocol overhead).

And, as you can see, it is the images. All of them are PNGs, apparently at true color depth. Four of them come in at roughly 360 KB each, so just these four images already account for half of the weight of the page. I converted them to JPG with a quality setting of 100 and then used JPEGMini to crunch them. As a result, each of them shrank from 360 KB to 60 KB, reduced to 1/6th of its former size. For the quality degradation, look for yourself:
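If you want to script that kind of conversion rather than doing it by hand, a minimal sketch with Node and the sharp package (purely my assumption for illustration, not what they or I actually used) could look like this:

// Hypothetical batch conversion of true-color PNG banners to JPEG.
// Assumes Node with the "sharp" package; file names are made up.
const sharp = require('sharp');

const banners = ['banner-01.png', 'banner-02.png', 'banner-03.png', 'banner-04.png'];

for (const file of banners) {
  sharp(file)
    .jpeg({ quality: 85 })                      // photographic content rarely needs more
    .toFile(file.replace('.png', '.jpg'))
    .then(info => console.log(file + ' -> ' + info.size + ' bytes as JPEG'));
}

Whatever tool you use, the point stands: photographic banners belong in JPEG, not in true-color PNG.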



Additionally, another flaw: they implemented a rotating banner in a rather bad way. These 4 HUGE images are loaded IN PARALLEL. Which means that even though you only see ONE of them at the beginning, all of them have to be pushed down the pipe simultaneously. So instead of loading the visible one fast and immediately and lazy loading the other ones once the page has finished rendering, they are loading them all at once. Which means that the VISIBLE one loads roughly 4 times slower than it could. Look at the video below to see how slowly that main banner progresses:

But it doesn’t end here… BELOW THE FOLD, they have another banner rotation with 5 images at 30 KB each, resulting in an additional 150 KB of page weight. And another 100 KB image, which actually isn’t visible AT ALL.
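A minimal lazy-loading sketch for such a banner rotation could look like this (hypothetical markup: only the first slide has a real src, the hidden slides carry a data-src attribute, so nothing but the visible image is fetched up front):

// Lazy-loading sketch (hypothetical markup with data-src on the hidden slides).
// Once the page has loaded, swap data-src into src to start the downloads.
window.onload = function () {
  var slides = document.querySelectorAll('.banner img[data-src]');
  for (var i = 0; i < slides.length; i++) {
    slides[i].src = slides[i].getAttribute('data-src'); // download starts only now
    slides[i].removeAttribute('data-src');
  }
};

That way the visible slide gets the full pipe to itself, and the hidden slides (and the below-the-fold rotation) arrive after the page is usable.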

So, to sum it up, I assume they would be able to bring the page weight below 1 MByte and, with proper lazy loading, reach Visually Complete at around 800 KB.

Three other minor things stood out, but they aren’t as problematic as the issues mentioned above:

  • Caching is rather on the short side. The good thing is that almost every object DOES have a Cache-Control header. The bad news is that the Time-to-Live is rather short at 1 hour or 1 day, respectively (see the sketch after this list for what longer TTLs on static assets could look like).
  • Cookies. Normally I would count this as a micro-optimization. Their cookies, though, are actually so big (they don’t fit in a single frame) and are sent even with static objects, that they really should have a look at that.
  • They are serving 1 CSS file and 1 image from a different domain they own. No cookies there and no extra functionality. IMHO, they could put these two on the main domain and save a DNS lookup as well as 2 TCP handshakes (this specific domain sends a Connection: Close header).
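Regarding the caching point, here is a minimal sketch of what longer TTLs could look like for fingerprinted static assets (a plain Node server with made-up paths, purely illustrative; their stack is of course something else):

// Illustrative only: long-lived Cache-Control for fingerprinted static assets,
// short-lived for the HTML document. No error handling or path sanitization.
const http = require('http');
const fs = require('fs');
const path = require('path');

http.createServer((req, res) => {
  if (req.url.startsWith('/static/')) {
    // e.g. /static/app.3f2a1c.css: the name changes when the content changes,
    // so a one-year TTL is safe and saves revalidation round trips.
    res.setHeader('Cache-Control', 'public, max-age=31536000');
    fs.createReadStream(path.join(__dirname, req.url)).pipe(res);
  } else {
    // The HTML document itself stays short-lived / always revalidated.
    res.setHeader('Cache-Control', 'no-cache');
    res.end('<html>…</html>');
  }
}).listen(8080);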

To conclude: even though Mediamarkt achieves a decent PageSpeed score of 81, we shouldn’t jump to the conclusion that the user experience mirrors that score. That would be an invalid causation, as Catchpoint has excellently written about.

From a performance point of view (and therefore a user experience point of view), if they keep the site like it currently is, it will be quite an uphill battle for Mediamarkt to fight.


Transfer-Encoding: Chunked Debugging w/ WPT – 101

Hi there!

Just recently an interesting performance issue was raised in the WPT forum that serves really well as a showcase for the rich debugging capabilities of WPT. In this blog post I will focus on Transfer-Encoding: Chunked.

Transfer-Encoding: Chunked was defined in the HTTP/1.1 RFC and can be a nice performance improvement regarding Time-to-Render.

In HTTP/1.0 the default behaviour of browser-server interaction was to establish a TCP connection to a webserver using TCP’s well-known 3-way handshake. After the connection was established, the browser fired its GET request and waited for the response. The requested object was sent down the wire from the server to the browser, and as soon as all bytes were transmitted, the server closed the TCP connection. The browser would then establish a new TCP connection to request the next object, and so on.

With HTTP/1.1, so-called persistent connections became the default, meaning that a browser could request multiple objects sequentially, one after another, on a single TCP connection (though it would have to wait for each request to be fulfilled before it could send the next one; otherwise it would be pipelining).

Now, looking at performance, HTTP/1.1 also defined a transfer method called Transfer-Encoding: Chunked. The reason for this is that sometimes the webserver already has a couple of bytes ready to be sent, but does not yet know the total length (size) of the whole answer. Without chunked encoding the webserver would have to wait until the last byte has been handed to it, so that it could write the correct Content-Length header for the full answer into the HTTP header.

Otherwise, how would the browser be able to know that the received answer is complete and that it is allowed to send the next request on this very same TCP connection?

Transfer-Encoding: Chunked solves this problem. It basically tells the browser: “Here are the first bytes of your answer, but more will come. I do not yet know how many, but I will mark the end of the byte stream with a defined marker.” How exactly this is marked can be seen in the RFC linked above. The performance improvement is that the browser can already “work” with this partial answer to start rendering or to request other objects. How to use this technique best is described here by Steve Souders.
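As a small illustration (a Node sketch of my own, not related to the site in question): if the server writes the response in pieces and never sets a Content-Length, the response goes out chunked, and the browser can already work on the first part while the slow rest is still being generated.

// Node sketch: writing the response in pieces without a Content-Length makes
// Node send it with Transfer-Encoding: chunked automatically.
const http = require('http');

http.createServer((req, res) => {
  res.setHeader('Content-Type', 'text/html');
  // First chunk goes out immediately; the browser can start fetching CSS/JS.
  res.write('<html><head>…</head><body>');
  // Simulate a backend that needs 15 seconds for the rest of the page.
  setTimeout(() => {
    res.end('<p>slowly generated rest of the page</p></body></html>'); // last chunk + end marker
  }, 15000);
}).listen(8080);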

So, now that we have some basic understanding of the why and how of Transfer-Encoding: Chunked, let’s get to the issue of this specific site:

In the waterfall below you can see how this website was loading. You see that the base page, the HTML document, takes a very slow 15 seconds to load.

What might be the reason? I repeated the test a couple of times to make sure this wasn’t just a single lost packet, but the pattern remained stable.

As you can see in the bandwidth utilization below the waterfall, it is most definitely not a bandwidth issue. Most of the time the bandwidth consumption is close to zero. And you can also see by the blue bar that the content started to come down very early, but it took a REALLY long time to complete.

So, clicking on the object tells us a little bit more about it:

First you see the size of the object: a small 3.3 KByte. So the reason for the long transfer time is not its size. And again you see that the first bytes of the object arrived in under 1 second, but it took 15 seconds for the full answer.

Additionally you see that the browser allowed the content to be gzipped, and the server indeed not only gzipped it, but also applied Transfer-Encoding: Chunked.

So, how can you dig deeper into this? A good idea is to use WPT’s tcpdump feature, which allows you to see each and every byte and its timing on the wire. Doing this, you get the following picture:

Now you get some more information about what the issue is. You can see in the middle pane that the answer consisted of 3 chunks, of which the first 2 arrived almost instantly (frames 8 and 28), and then, after 15 seconds, the remaining 444 bytes arrived in frame 242. So it seems that the beginning of the object could be sent very fast, but then the webserver had to wait 15 seconds for the last part to be generated. Unfortunately, the content is gzipped, so you can’t see what the content of these last 444 bytes was. You can only guess that somewhere in the last 15% of the document is the code causing the delay.

But fortunately there is Pat. Pat Meenan. He gave me the advice: you might want to look at the setHeader command from WPT’s scripting capabilities. Bingo! setHeader allows you to modify/override any of the browser’s request headers. And that’s what you can do. Simply use the script:

setHeader Accept-Encoding: None

and you’re done! (In a full WPT script you would pair this with a navigate command pointing at the page under test.)

Now, when repeating this test, you indeed see in the headers that no gzip was applied:

And in Wireshark you can see the content of the last chunk in plain text!

So now you CAN (or rather COULD) see what code was in the delayed frame(s). Could, because in this “real-life example” the site owner fixed the issue before I was able to apply the Accept-Encoding: None header. 🙂

Anyway, there were two possible scenarios to see:

a) Issue still present -> you can see the code piece causing the delay.

b) Issue gone -> you have a problem with flushing gzip buffers.

Or something totally wild, like server-side malware that tried and failed to attach its code to the bottom of the object… 😉


Objects in the LAN may appear SLOWER than they are…


Many of you are well aware of the fact that you should never judge the performance / load times of your site by testing from the local LAN. This is pretty common knowledge, as the results may be skewed by the fact that your LAN is often connected via a big pipe to the site you are working on.

But there is actually more to it. The results might be skewed in the opposite direction as well, and here I would like to point out what the reasons for that can be. And also why you should care anyway, even though you already know that testing from your LAN is not recommended.

So let’s answer the second question first. For the second time in my career I am responsible for a rather large portal. And for the second time, it was much slower from the local LAN compared to what normal customers see. The reason we did and do care is simply the doubt of our INTERNAL customers. Being in a tech department, our internal customers are Marketing and Customer Services. And these employees (as well as the rest of our company, the CEO for example) might of course be thinking (and some indeed are): “WTF, they are celebrating how fast our portal is, and even though I am almost directly connected to it, it is f*king slow!”

If you are lucky, they confront you with that. Then you might have some good videos under your belt, “proving” that the customer experience is much better. But I can assure you, doubts will remain (“They came back with some lame techie excuses”). And sometimes they don’t confront you with it at all, so you don’t even have the chance to defend yourself. We had exactly that recently, when we relaunched our portal announcing big performance improvements, and got some pretty harsh responses from our colleagues. So this is why you maybe SHOULD care that the site is at least not SLOWER from the LAN than what customers perceive.

Now that we have covered the motivation, let’s have a look at the root causes:

Debugging this was difficult, as workstations in the LAN a) rarely have admin privileges, so some of your tools might be difficult to get running, and b) are under the protection of data privacy laws, so tools like Wireshark might be forbidden. In our case most of the analysis was done using Fiddler.

Things we found, sorted by priority:

  1. Internet Explorer: This thing actually has a couple of issues. In my company IE8 is the mandatory browser, and it is directed to a corporate proxy. The impact on performance is massive:
    • IE 6 to IE 8 limit the number of TCP connections down to 2 when connecting through a proxy! As we shard our portal across three domains, this means for IE 8 a difference of 18 vs. 2 connections.
    • IE 6 to IE 8 downgrade by default from HTTP/1.1 to HTTP/1.0 when connecting through a proxy! This is massive. Not only do you lose persistent connections, which is extremely painful with SSL (which is the case with our portal), you also lose the ability to use your carefully crafted Cache-Control headers! The first issue can be solved via a registry key, the second one via a browser setting. Especially regarding persistent connections, be aware that you have to check the whole chain (browser, proxy, webserver) to make sure none of them is configured to downgrade to HTTP/1.0! Eric Law from Microsoft, for example, has written an excellent blog post on that.
  2. Security: Within our LAN we actually have two kinds of proxies: one for unknown domains, and one for “known secure” domains, which means some kind of whitelist. Of course our portal is on it 🙂 The proxy for unknown domains checks each and every object for viruses. Now, when we introduced 2 sharded domains with our portal relaunch, we forgot to put them on the whitelist. The result: all objects fetched from the sharded domains (~90%) went through a time-consuming virus scan!
  3. DNS: As we found out, in our corporate setup one device in front of our local DNS servers was configured to drop traffic on TCP port 53. Unfortunately the workstations in our LAN were trying to resolve our portal domains using TCP first, and only after a timeout fell back to using UDP. So we had a nice lag in Time-to-Render right at the beginning. This behaviour has apparently been so common in the past that an RFC was published to stop people from thinking that UDP port 53 alone is enough to support the DNS system.

So… well, we fixed the issues, and now they (our colleagues and the CEO) live happily ever after. Testing, though, we still don’t do from our local LAN 🙂

A big “Thanks” goes out to Diemo S., Lars W. and Holger T., who actually DID the research and the fixes I was just blogging about 🙂


Quiz: Guess the impact of 50 KByte on Page load via DSL

As you might recall, our portal is one of the web properties we are responsible for. And due to this we had a rather busy week; the reason for that can be found here. But I do not want to comment on that, rather on the technical outcome 🙂 Because due to “the story” we had to redo quite a few of the graphics on our website.

And THAT was actually one thing I was eagerly waiting for.

Before “the story” we had a page header with a rather difficult image. Our current design language pretty often challenges us (or rather, our agencies) quite a bit, as the graphics often consist of a photo-realistic part in front of a colour-gradient background. So the different compression methods fail one way or the other. If we compress using PNG8, the quality of the photo-realistic part degrades rather badly. If we compress using JPG, the colour gradients and sharp edges become really ugly, making it necessary to compress with high quality settings, resulting in rather large files. If we worked with 2 files using transparency and different compression methods, well, we would have 2 HTTP requests instead of 1.

So, to make a long story short, this header image was formerly 72 KByte in size. I asked my colleagues to make sure that the new one would be much smaller, and to really push for that. What we got back was an 18 KByte image. I wasn’t totally satisfied with the result, as the agency used JPG again even though the new image wasn’t really suitable for it. With PNG8 I was able to further reduce it to 8 KByte instead of 18 KByte. Nevertheless I was happy enough with the reduction of ~50 KByte.

Now back to the quiz question: how much faster do you think our site reaches its Time to Visually Complete due to that reduction? (Using the Frankfurt node of WPT @ 1.5 MBit/s, 50 ms RTT and IE8.)

The answer might surprise some people (at least it did within our company). Normally you might do a napkin calculation like this: 50 KByte = 400 KBit. 400 KBit on a 1.5 MBit/s line should be transmitted in ~1/4th of a second = 250 ms. So the page load time might decrease by 250 ms.

But this omits TCP slow start from the equation! If you are unfamiliar with TCP slow start, the VERY, VERY simplified and brief explanation is: when a TCP connection is established, it doesn’t utilize all available bandwidth from the beginning, but instead “slowly” increases its bandwidth utilization to probe what is available. A TCP connection “can’t know” the available bandwidth, so in order not to overload the network, it starts slowly and ramps up over time.

A much better and longer explanation by Steve Souders is here, an excellent video from Velocity 2010 can be found here, and a really great animation visualizing it can be found here.
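To make the effect a bit more tangible, here is the napkin calculation in code form. It is a deliberately over-simplified sketch: it assumes an initial congestion window of 4 segments that doubles every round trip, and it ignores the handshake, the request itself and the bandwidth cap, so it illustrates the mechanism rather than reproducing the exact measured numbers.

// Napkin math: how many round trips does slow start need for an object of a given size?
// Assumptions (simplified): initial cwnd of 4 segments, cwnd doubles every RTT,
// 1460-byte segments, 50 ms RTT. Handshake, request and serialization time are ignored.
function slowStartEstimate(bytes, initCwnd, mss, rttMs) {
  initCwnd = initCwnd || 4;
  mss = mss || 1460;
  rttMs = rttMs || 50;

  var totalSegments = Math.ceil(bytes / mss);
  var cwnd = initCwnd;
  var sent = 0;
  var rounds = 0;
  while (sent < totalSegments) {
    sent += cwnd;   // one round trip: the server may send a full window
    cwnd *= 2;      // classic slow start: the window doubles every RTT
    rounds += 1;
  }
  return { rounds: rounds, timeMs: rounds * rttMs };
}

console.log(slowStartEstimate(72 * 1024)); // the old ~72 KByte header image
console.log(slowStartEstimate(18 * 1024)); // the new ~18 KByte image

The point it makes: for small objects on a fresh connection the time is dominated by round trips, not by raw bandwidth, and on top of that come handshake, request and serialization time, which is why the measured numbers below are higher still.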

Sooo… What was the question again? Oh, right! The benefit of the image size reduction! Getting back to our napkin calculation: the former image was ~70 KByte in size, which is roughly 0.5 MBit and should load in ~333 ms over a 1.5 MBit/s line. Right?

Again I used WPT with its tcpdump feature and loaded just the image. The result, without DNS resolution…: ~666 ms! 🙂 So it is roughly double! Why? You guessed it, the reason is TCP slow start.

As you can see in the image above, using Wireshark’s TCP bandwidth statistics, it takes close to 400 ms before this TCP connection reaches its bandwidth limit!

Now the problem is that the header image is referenced quite near the beginning of the HTML base page, and therefore it gets loaded on a rather “cold” TCP connection. With IE8 opening up to 6 connections per server, you will start close to the beginning of the page load with 5 “cold” TCP connections.

Just recently, a lot of smart people have started working on circumventing the limitations of TCP slow start in different areas. SPDY, for example, multiplexes requests onto a single TCP connection, therefore going through TCP slow start only once. Firefox now prefers to reuse the persistent connection with the highest CWND. And starting with Linux kernel 2.6.33 the initial CWND has been increased from 3 to 10.

But, as you can’t force your visitors to use a specific browser, and you might not be able to choose your Linux kernel, your best bet is still:

Reduce Bytes!
And don’t be fooled into thinking that 50 KByte on a 1.5 MBit/s line is negligible.

While rambling, I almost forgot the initial question: the impact of these 50 KByte saved on Time to Visually Complete. See for yourself! 🙂


Going Async!

A lot of things have changed since my last post. In the meantime my company (Hansenet aka Alice) was sold and is now part of Telefonica, which sails under the “O2” brand in Germany. My responsibilities have also changed quite a bit, and they now include the O2 portal.

We had a big Relaunch of the site something like 6 weeks ago, and are now doing some more (performance) tweaks here and there.

One of the tweaks we had on our agenda was going async for the newly integrated social media widgets of Facebook, Twitter and Google+. The idea actually came up via a tweet from Steve Souders announcing that the Google Plus button had gone async.

This was then further fueled by a post from Stoyan Stefanov, where he showed a way to go async for practically all of the most prominent widgets.

So maybe you are asking yourself: what is so important about async that some of the brightest guys are working on and evangelizing it so much? The reason is the way browsers behave. They are single threaded, and quite a few browsers will simply stop doing anything else (even downloading other assets in parallel) while they are downloading and executing JavaScript. So a) it comes at a performance price for some browsers, as no other asset is downloaded and the JavaScript therefore interferes with / blocks / halts page load and rendering, and b) due to the above mentioned behaviour it becomes a Single Point of Failure (SPOF). If an image loads slowly (or not at all), it is not a big issue. If a JavaScript file loads slowly (or not at all), it becomes really ugly for the above mentioned reasons.
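The general shape of such an async loader looks roughly like this (a generic sketch of what the widget snippets do, not our exact implementation):

// Generic async-loading sketch: create a script element, mark it async and insert it.
// The download no longer blocks parsing/rendering, and a slow or dead third-party
// host can no longer hold up the rest of the page.
(function () {
  var s = document.createElement('script');
  s.src = 'https://apis.google.com/js/plusone.js'; // example widget URL
  s.async = true;
  var first = document.getElementsByTagName('script')[0];
  first.parentNode.insertBefore(s, first);
})();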

And with the social widgets it is even more difficult: if your image server is unreachable, you can politely call your engineer. If Twitter is unavailable… well… you can’t even tweet that.

And just recently Pat Meenan wrote an excellent blog post on what the result can look like, visible in this video. You see how the Business Insider page remains blank for a full 20 seconds (white page! Nothing!) just because Twitter is unavailable. And being the nice chap that he is, Pat already implemented the possibility to check your site for these SPOFs in WebPageTest.

So I tried it with our site and the result was this (left: Twitter unavailable, right: Twitter available):

Video Before

What you see here is not as bad as Business Insider. Nevertheless the browser is spinning its wheels for more than 25 seconds, indicating that the page hasn’t completed. Even more problematic: the slider (arrow >) at the right of the screen does not show up until 24 seconds into the page load. And THAT is a problem, as the slider functionality of the main teaser is quite an important feature of our site.

So overnight 🙂 we (actually others, see below) did some magic, and today the widgets have gone async! The result? (Left: Twitter unavailable, right: Twitter available)

Video After

As you can see, the page doesn’t take any longer if one of the social widget providers is unavailable! (Well, actually it is even a bit faster due to less data to download.) And as a goodie, our PageSpeed score has improved by +1 due to ~150 KB of JavaScript having been deferred.

Thanks to Alex K., Sasa S. and Siegfried H. for the quick fix, and as always, to the community (and this time specifically to Steve, Stoyan and Pat) for continuously developing tools and sharing research results and best practices.

So I recommend to test-drive your site with WebPageTest and make sure it doesn’t break!


Domain-Sharding and SSL

Hi there!

Some time ago I noticed some weird behaviour on our website, so I thought it would be a good idea to revisit this issue. Part of the above mentioned website is a self-care area where our customers can view their bills, change their product options, etc. This area requires a login, and so the login itself, as well as the subsequent pages, is secured by SSL.

SSL and performance optimization was for quite some time a very difficult beast to debug, but fortunately Pat Meenan’s WebPageTest solved this, at least for IE. In the old days you could use HTTPWatch, for example, to see the HTTP interaction within the SSL tunnel, but you were unable to view the TCP connection flows. With Microsoft’s Visual Round Trip Analyzer you were able to see the TCP connection flows, but not the HTTP conversation within the SSL tunnel. Pat’s tool was the first (and maybe still is the only one) that allows you to see both the TCP connection flow and what happens within the SSL tunnel.

So, back to our page. In one of our earlier optimization iterations we decided to do some domain sharding. With this we wanted to improve performance especially for IE6 and IE7 users, as these browsers only open 2 TCP connections per domain, resulting in a performance limit that is often lower than what your bandwidth would allow.

We implemented it in a way that also lets us follow another performance rule, “Make static content cookie-free”. So we set up another domain, which we used for the above mentioned domain sharding as well as for keeping most of the content cookie-free.

There are some reasonable concerns regarding domain sharding in conjunction with SSL, as you not only have an additional DNS lookup and TCP handshake, but also an additional SSL handshake, which can be quite time-consuming. But we were pretty confident, as our customers are located solely in small Germany with an RTT of ~50 ms, that the benefit of 4 connections would outweigh the impact of that additional DNS lookup and TCP/SSL handshake.

So, when I did a WebPageTest run, the result in IE7 looked like this:

Hmm… that was weird… As you can see, you get the expected 2-connection waterfall behaviour, but for the first 7 objects it seems like the TCP/SSL connection is closed after each object and re-established (with a full handshake) for the next one. Now THAT is much more painful than what we had expected…

I was almost picking up the phone to call the webserver admin, thinking that the Connection: Keep-Alive headers were probably not set correctly in Apache, but I decided to check the HTTP headers first. And they looked like this:

Request Headers:

GET /selfcare/content/staticcode/tls/kundencenter/1025300/2010-06-23-09-49-15/ui.all.css HTTP/1.1
Accept: */*
Accept-Language: en-us
UA-CPU: x86
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.2; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; PTST 2.257)
Connection: Keep-Alive

Response Headers:

HTTP/1.1 200 OK
Date: Thu, 13 Jan 2011 07:46:20 GMT
Server: Apache
Expires: Fri, 14 Jan 2011 07:45:33 GMT
Cache-Control: public, max-age=3628800
Content-Language: de-DE
Vary: Accept-Encoding
Content-Encoding: gzip
Age: 47
Content-Length: 5122
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Content-Type: text/css;charset=UTF-8

As you can see, BOTH ends send a “Connection: Keep-Alive” header and speak HTTP/1.1. So why the hell was the connection closed???

So I fired up Wireshark to look at the raw packet flow. Wireshark normally does not allow me to see the content within the SSL tunnel, but in this case it didn’t matter, as I only wanted to see WHO was closing the connection.

So it was ME (or rather my machine) that was closing the connection, not the server. As you can see in the trace, I am the one sending the TCP FIN.

So far, so bad.

Thinking about a solution / workaround, I was wondering: only the CSS files are affected. The headers are correct and the same as for the other assets, which are downloaded later on from the same domain. And with those assets I see a perfectly persistent connection… Maybe it is simply due to the CSS being hosted on another domain?

Setting up another test page, I dropped the idea of domain sharding and used just one domain: the domain where the base page resides. And the result was:

And the issue is gone… As a side effect, the Time-to-Render has improved from 2.2 to 1.6 seconds.

So it seems the buggy TCP connection behaviour shows up when you are delivering CSS files VIA SSL from a different domain than the base page.

I did a further test (waterfall coming soon), where I shifted only the CSS files back to the domain where the base page resides. The result is:

As you can see, the connection was kept alive during the CSS downloads. BUT later on, when the CSS images were downloaded from the sharded domain, suddenly these CSS images showed the connection-close behaviour!

Now, which IE versions are affected? IE 7, as you can see above. IE 6 as well. IE 8 and IE 9 are also affected (though I still have to check with IE 8 and IE 9 whether the issue only becomes visible once I add a seventh CSS file in the HEAD; but then you have another issue anyway). With IE 8 and IE 9, though, the impact is a lot smaller, as you will be hit ONLY if you have more than 6 stylesheets referenced in the HEAD, something you should avoid anyway.

Firefox is not affected.

Finally: who should care (if my observations/assumptions are true; feel free to comment)?

– If it is not SSL, you have no problem.

– If you have inlined CSS, you have no problem.

– If your CSS files and CSS Images are on the same domain as the base page, you have no problem.

– If you have just a single CSS file via SSL in the HEAD, it is cacheable, and your customers have rather low latency, you shouldn’t worry too much.

– BUT if you have multiple CSS files, or different CSS files on each page, hosted on a different domain, then it might be an issue. More so if they are not cacheable, and even more so when your customers DO have high latency. Remember: NO RENDERING until all CSS files from the HEAD are received! Blank page!

So, even though there seem to be quite a lot of conditions that have to be met, my assumption is that quite a few pages might be hit by this issue. I coincidentally saw it on Gateway’s site, for example (CSS images via SSL on a sharded domain; look at the bottom of the waterfall and the corresponding HTTP headers). And it sounds plausible: when you follow some of the performance best practices (domain sharding, make static assets cookie-free, use a CDN) and are using SSL, you probably run into this scenario automatically.

Finally a big thank you to Thomas G., Daniel G. and Michael S. who supported me in this analysis!




I got in contact with Microsoft’s IE team via their blog and, indeed, they confirmed the above mentioned behaviour. So this is not a unique failure case with our domain, but actually a bug in IE, present in versions 6 to 9. In the same response they said they are looking to fix this in an upcoming IE version.

See the last 2 comments:


Taming IE 8 Javascript download order

Hi there!

Having started to analyze how IE 8 interacts with our site in my recent post here, I observed another behaviour that was strange to me. We have of course read Steve Souders’ book High Performance Web Sites, and therefore decided to place as many external JavaScript files as possible at THE BOTTOM of the page. For various reasons, though, we were forced to keep some external JavaScript files in the HEAD, to be precise: 2 of them. Everything else went to the bottom.

So, when I looked at the IE 7 waterfall diagram, again plugging Pat’s great WebPageTest, everything seemed to be fine.

You see the 2 JS assets in the HEAD being downloaded sequentially, as this is IE 7, and then the browser goes on to download the images as defined in the HTML document.

So, just as a comparison, I did the same thing using IE 8. And boy, was I surprised (which is a frequent feeling when doing web performance optimization). The waterfall looked like this:

It seems all our optimization efforts never took place. Again we see all JS being downloaded first! This left me a little bit puzzled… So I looked around at WebPageTest and saw other waterfalls showing the same behaviour, but also others that didn’t.

So I built a really simple page and played around a little bit. The first attempt was to reproduce the results with a page that is as simple as possible but still triggers this behaviour. I succeeded with this page, and the waterfall indeed showed the same behaviour:

A small JS script 01 is placed in the HEAD (splitting the initial payload 🙂 ). This JS script is actually empty, nothing in there. But even though in the HTML document all images are placed right at the TOP of the BODY, and the JS files 02-09 are at the BOTTOM of the BODY, IE 8 fetches these JS resources before it starts to download the images.
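The skeleton of that test page looked roughly like this (reconstructed for illustration, the real page simply had more images):

<!-- Reconstructed test-page skeleton, illustrative, not the original file -->
<html>
  <head>
    <script src="js/01-empty.js"></script> <!-- the tiny, effectively empty script -->
  </head>
  <body>
    <!-- images right at the TOP of the BODY -->
    <img src="img/photo-1.jpg">
    <img src="img/photo-2.jpg">
    <!-- ... more images ... -->

    <!-- external scripts at the BOTTOM of the BODY -->
    <script src="js/02.js"></script>
    <script src="js/03.js"></script>
    <!-- ... up to 09.js ... -->
  </body>
</html>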

Rendering, though, is not blocked by these “elevated” JS files. But the images of this rather… image-centric 🙂 page are not visible until ~4.75 seconds.

So while it does NOT block rendering, it CLOGS UP your available TCP connections.

By fiddling around, I found that the issue is the small JS in the HEAD of the document. When I remove this JS 01 from the HEAD, the waterfall in IE 8 suddenly changes dramatically to this:

In comparison to the first example, this page displays the first image right after 1.5 seconds instead of 4.75 seconds. That is three times as fast! And the difference is a ~0 byte external JavaScript file placed in the HEAD! (Well, okay, my example might be a little bit stretched and rather rare in the wild 🙂 )

Also being an avid reader of Steve’s “Even Faster Web Sites”, I remembered that even though IE 8 does not block downloads of images while fetching JS, this might be a different story for CSS and IFRAMEs.

I did further tests, and to make a long story short: while IE 8 “elevates” the JS from the bottom to the top of the download order, these “elevated” JS files DO NOT block IFRAMEs or CSS files, as can be seen here:

(As you can see, even though I placed the CSS file at the TOP of the BODY and not in the HEAD, IE 8 pulls it first, and only afterwards the “elevated” JS files.)

and here:

So, now that we have checked out the behaviour of IE 8 rather thoroughly, what can you do about this if you simply NEED an external JS in your HEAD?

Well, you can, for instance, use the method explained here by Nicholas Zakas. Using this method, my test page waterfall in IE 8 now looks like this:

Apart from finally being able to tame IE 8’s JavaScript load order, this also gives you the benefit of “unblocking” JS downloads in IE 7.
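For reference, the dynamic-insertion pattern looks roughly like this (reproduced from memory as a sketch; see Zakas’ post for the authoritative version):

// Dynamic script insertion (sketch): loading the HEAD script this way keeps IE 8
// from pulling the other external scripts forward in the download order, and in
// IE 7 it no longer blocks other downloads.
function loadScript(url, callback) {
  var script = document.createElement("script");
  script.type = "text/javascript";
  if (script.readyState) {                        // IE
    script.onreadystatechange = function () {
      if (script.readyState === "loaded" || script.readyState === "complete") {
        script.onreadystatechange = null;
        callback();
      }
    };
  } else {                                        // other browsers
    script.onload = function () { callback(); };
  }
  script.src = url;
  document.getElementsByTagName("head")[0].appendChild(script);
}

loadScript("js/01.js", function () {
  // run whatever depends on that script here
});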

The result of these tests, to me, is the following:

With IE 8, if an external JS file is placed in the HEAD via a conventional <script> tag, all other external JS assets in the BODY referenced via <script> will be elevated in the download order. These assets do not block rendering, but they do clog up your connections.

This can be avoided by the above mentioned method (probably there are other methods as well).

Neither IE 7 nor FF 3.6.3 shows this behaviour.


