Peter Levine from Andreessen Horowitz wrote an article on The Renaissance of Enterprise Computing yesterday that finally sprouted the seed of an idea that has been dormant at the back of my brain for a few months. While the ideas of enterprise computing and web/mobile performance seem disconnected, they’re not.
When companies begin to rely on outside services (Levine mentions Box, Google Docs, and others in his article), they hand part of their infrastructure over to an outside organization. Once they do that, any performance hiccups that affect us as consumers can also have a major effect on us as employees.
Even if your company decides to purchase and deploy an enterprise application within your own infrastructure or datacenters, the performance and experience your employees get when using it on their desktops or mobile devices can affect productivity and effectiveness in the workplace. An unmanaged (read: unmonitored) solution can shut down groups in the company for minutes or hours.
Think of the call-center. No matter the industry you’re in, what increases customer calls? Slow performance or a poor experience with the web/mobile application. Now, if your employees rely on a variant of the same web application to answer questions in the call-center, have you actually improved the customer experience and increased employee productivity?
Some considerations when managing, designing, or buying an enterprise application in the coming year:
What do your peers tell you about their experience implementing the solution or using an outside service – has it made employees more effective and efficient?
Are employees already using a “workaround” that makes them more effective and efficient? Why aren’t they using the internal or mandated solution?
Are performance and experience driving factors in the lack of adoption of the mandated solution?
Do you have clear and insightful performance information that shows when employees are experiencing issues performing critical tasks? Can you clearly understand what the root cause is?
Are employees experiencing issues using the application in certain browsers or on certain mobile devices? How quickly can your design or your outside service respond to these issues?
Are you reviewing the chosen solution regularly to understand how usage is changing and how this could affect the performance of the application in the future?
Performance issues are not simply affecting the customers you serve. Your own employees use many of the same systems and applications in their day-to-day tasks, so a primary goal of managing these applications should be to ensure that they deliver performance and experience that encourage employees to use them, no matter whether they are developed in-house or purchased as software or SaaS.
Every site has them. Whether they’re for analytics, advertising, customer support, or CDN services, third-party services are here to stay. However, for 2013, I believe that these services will face a level of scrutiny that many have avoided up until now.
Recent performance trends indicate that while web site content has been tested and scaled to meet even the highest levels of traffic, the third-party services that these sites have come to rely on (with a few exceptions) are not yet prepared to handle the largest volumes of traffic that occur when many of their customers experience a peak on the same day.
In 2013, I see web site owners asking their third-party service providers to verify that their systems are able to handle the highest volumes of traffic on their busiest days, with an additional amount of overhead – I suggest 20% – available for growth and to absorb “super-spikes”. Customer experience is built on the performance of the entire site, so leaving one component of site delivery untested (and definitely unmonitored!) leaves companies exposed to brand and reputation degradation as well as performance degradation.
In your own organizations, make 2013 the year you:
Implement tight controls over how outside content is deployed and managed
Implement tight change control policies that clearly describe the process for adding third-party content to your site, including the measurement of performance impacts
Define clear SLAs and SLOs for your third-party content providers, including the performance levels at which their content will be disabled or removed from the site.
When you speak to your third-party content and service providers about their plans for 2013, ask them to:
Explicitly detail how they handled traffic on their busiest days in 2012, and what they plan to do to effectively handle growth in 2013
Clearly demonstrate how they are invested in helping their customers deliver successful mobile sites and apps in 2013
Lay out how they will provide more transparent access to system performance metrics and what the goals of their performance strategy for 2013 are.
Take control of your third-party content. Don’t let it control you.
As we approach the end of 2012, I will be looking at a few trends that will become important in 2013. In a previous post, I identified optimization as an important performance trend to watch. It is one of the items on a performance checklist that companies can directly influence through the design and implementation of their web and mobile sites.
The key to optimization in any organization is to think of objects transmitted to customers, regardless of where they originate, as having a cost to you and to the customer. So, a site that makes $100,000 in a day and transfers 10 million objects to customers has an object-to-revenue ratio of 100, or 100 objects per dollar of revenue. But, if the site is optimized and only 7.5 million objects are transferred to make $100,000, that ratio goes down to 75; and if the reduction in objects causes revenue to go up to $150,000, the ratio drops to 50.
This approach is simplistic and does not include the actual cost to deliver each object, which includes costs for bandwidth, CDN services, customer service providers, etc. as well as revenue generated by third-party ads and services you present to customers. The act of balancing the cost of the site (to develop and manage), the performance you measure, the revenue you generate, the experience your customers have, and the reputation of your brand is an ongoing process that must be closely considered every time someone asks, “And if we add this to the site/app…”.
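The object-to-revenue arithmetic is easy to reproduce. Here is a minimal Python sketch of the simple ratio only; the function name and scenario labels are mine, and it deliberately ignores the delivery costs and third-party revenue mentioned above.

```python
def objects_per_dollar(objects_transferred, revenue_dollars):
    """Objects delivered for every dollar of revenue (lower is better)."""
    if revenue_dollars <= 0:
        raise ValueError("revenue must be positive")
    return objects_transferred / revenue_dollars


# The three scenarios described in the text: (objects, revenue in dollars).
scenarios = {
    "baseline": (10_000_000, 100_000),
    "optimized": (7_500_000, 100_000),
    "optimized, revenue up": (7_500_000, 150_000),
}

for name, (objects, revenue) in scenarios.items():
    ratio = objects_per_dollar(objects, revenue)
    print(f"{name}: {ratio:.0f} objects per dollar")
```

Tracking this number per release is one rough way to notice when a new feature adds objects faster than it adds revenue.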
There is no optimal figure for site optimization. But there are some simple rules:
Control your third-party services. This means having a sane method for managing these services, and shutting them off if necessary. Have every team that is responsible for the site meet to approve (or deny) the addition of new third-party services. And those who want it better come with a strong cost/benefit analysis.
Optimization is the act of making the sites you create as effective and efficient as the business you run. No matter how “low” the cost to operate a web site is, each object on a site can cost the company more money than it is worth in revenue. And if that object slows the site down, it could turn a profitable transaction into a lost customer.
As we moved through the traditional start of the holiday shopping season (Thanksgiving / Black Friday / Cyber Monday), it is clear that most sites were prepared for what was coming. No big names went down, no performance slowdowns rose to the headlines, and online revenue – both web and mobile – appears to have increased over 2011.
But when these companies do their year-end review, they need to take a step back and ask: “Could we have done it better?”
While performance events were few and far between (if they occurred at all), companies will need to examine the cost of scaling their sites for performance. When planning for the peak performance period, companies will need to assess whether simply scaling up to handle increased traffic and sales could have been managed more effectively, by implementing sites that were not only fast, but also efficient. Joshua Bixby (here) noted that web page size has increased 20% in the last 6 months, an indication that efficiency is not always at the top of mind when new web content is presented to visitors. In order to deliver ever more complex web content, companies are spending more on services such as CDNs and cloud services to deliver their own content, while incorporating ever increasing numbers of third-party items into their pages to supply additional content and services (analytics, performance, customer service, Help Desk, and many more) that they have outsourced.
Increasing page size, outside acceleration and cloud services, and third-party services – a potent mix that companies need to assess critically, with an eye to understanding what all of these mean for the performance experienced by their visitors and customers. Add in the increasing importance of the mobile internet, with its variable connection speeds and service quality, and things become even more interesting.
In 2013, I see companies assessing these three trends with a focus on making sites perform the same (or better!) at the same (or lower!) cost than they did in 2012.
Over the next 12 months, I will be watching the performance industry news to see if those companies that have been successful at making their sites perform under the heaviest loads increasingly focus not just on speed and availability, but on efficient delivery of their entire site at a lower cost with the best user experience possible.
The key strategic questions that online businesses will be asking in 2013 will be:
Have we optimized our content? This does not mean make it faster, this means make it better and more efficient. It is almost absurdly easy to make a big, inefficient site fast, but it is harder to step back and “edit” the site in a way that you deliver the same content with less work – think Chevy Volt, not Cadillac Escalade.
Are we in control of our third-party services? Managing what services get placed on your site is only the first step. Understanding where the content you have added comes from and whether it is optimized for the heaviest shared loads will also become important checklist items for companies.
Can we deliver the design and functionality our customers want at a lower cost? This is the hardest one to be successful at, as each company is different. But Devops teams should be prepared to be accountable for not just cool, but also for the cost of creating, deploying, and managing a site.
image courtesy of Corey Seeman – http://www.flickr.com/photos/cseeman/
The latest trend in web performance measurement is the drive to implement Real User Measurement (RUM) as a component of a web performance measurement strategy. As someone who cut their teeth on synthetic measurements using distributed robots and repeatable scripts, it took me a long time to see the light of RUM, but I am now a complete convert – I understand that the richness and completeness of RUM provides data that I was blocked from seeing with synthetic data.
The key for organizations now is to realize that RUM is not a replacement for Synthetic Measurements. In fact, the two are integral to each other for identifying and solving tricky external web performance issues that can be missed by using a single measurement perspective.
My view is that the best way to drive RUM collection is to shape the metrics in a manner similar to the way you already segment and analyze your visitors using traditional web analytics. The time and effort invested in that segmentation can inform RUM configuration by determining:
Unique customer populations – registered users, loyalty program levels, etc
Browser and Device
Pages and site categories visited
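To make the segmentation idea concrete, here is a rough Python sketch that groups RUM measurements by the dimensions in the list above and reports a median load time per segment. The beacon records and field names (`population`, `browser`, `device`, `page_category`, `load_time_s`) are hypothetical; any real RUM product will have its own schema.

```python
from collections import defaultdict
from statistics import median

# Hypothetical beacon records, invented for illustration only.
beacons = [
    {"population": "registered", "browser": "Firefox", "device": "desktop",
     "page_category": "checkout", "load_time_s": 2.0},
    {"population": "registered", "browser": "Firefox", "device": "desktop",
     "page_category": "checkout", "load_time_s": 4.0},
    {"population": "guest", "browser": "IE8", "device": "desktop",
     "page_category": "home", "load_time_s": 1.5},
]


def median_load_time_by_segment(
    records,
    dimensions=("population", "browser", "device", "page_category"),
):
    """Group records by the chosen dimensions and return the median load time of each group."""
    groups = defaultdict(list)
    for record in records:
        key = tuple(record[name] for name in dimensions)
        groups[key].append(record["load_time_s"])
    return {key: median(times) for key, times in groups.items()}


for segment, value in median_load_time_by_segment(beacons).items():
    print(segment, value)
```

Reusing the same dimension names your web analytics team already reports on is what lets the two data sets be lined up later.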
This information needs to bleed through so that it can be linked directly to the components of the infrastructure and codebase that were used when the customer made their visit. But to limit this vast new data pool to the identification and solving of infrastructure, application, and operations issues isolates the information from a potentially huge population of hungry RUM consumers – the business side of any organization.
This side of the company, the side that fed their web analytics data into the setup of RUM, needs to now see the benefit of their efforts. By sharing RUM with the teams that use web analytics and aligning the two strategies, companies can directly tie detailed performance data to existing customer analytics. With this combination, they can begin to truly understand the effects of A/B testing, marketing campaigns, and performance changes on business success and health. But business users need a different language to understand the data that web performance professionals consume so naturally.
I don’t know what the language is, but developing it means taking the data into business teams and seeing how it works for them. What companies will likely find is that the data used by one group won’t be the same as for the other, but there will be enough shared characteristics to allow the group to share a dialectic of performance when speaking to each other.
This new audience presents the challenge of clearly presenting the data in a form that is easily consumed by business teams alongside existing analytics data. Providing yet another tool or interface will not drive adoption. Adoption will be driven by attaching RUM to the multi-billion dollar analytics industry so that the value of these critical metrics is easily understood by and made actionable to the business side of any organization.
So, as the proponents of RUM in web performance, the question we need to ask is not “Should we do this?”, but rather “Why aren’t we doing this already?”.
Recently, there has been a big push for the Dev/Ops culture, an integrated blending of development and operations who work closely together to ensure that poor performing web and mobile applications don’t make it out the door. They have become the rockstars of the conference circuit and the employment boards.
I fit into neither of these categories. I have never run anything more than a couple of Linux servers with Apache and MySQL. I write code because I’m curious, not because I’m good at it – in fact, I write the worst code in the world and I am willing to prove it!
I am a member of a web and mobile performance culture that is language and platform independent, to use some buzzwords.
I am a web and mobile performance consultant and analyst.
I can take apart reams of data to find statistical patterns and anomalies. I believe that averages are evil, and have believed this for more than a decade. I have been using frequency and percentile distributions for almost as long and watched as the industry finally caught up.
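If you want to see why I distrust averages, here is a small Python sketch with made-up response times (in seconds). Two slow outliers drag the mean well away from what most visitors experienced, while the percentiles keep both stories visible. The sample values and the nearest-rank method are my choices for illustration, not a standard any tool is obliged to follow.

```python
import math
import statistics


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Invented data: eight ordinary page loads and two very slow ones.
response_times = [1.1, 1.2, 1.2, 1.3, 1.3, 1.4, 1.5, 1.6, 9.8, 14.6]

print("mean  :", round(statistics.mean(response_times), 2))
print("median:", percentile(response_times, 50))
print("90th  :", percentile(response_times, 90))
```

The mean lands somewhere no visitor actually was: slower than the typical experience and far faster than the bad one.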
I can link the business issue that faces your company with the technical concerns you are facing and help guide you to the middle ground where performance and the balance sheet are in careful equilibrium.
I don’t care what you write your code in. I don’t care what you run it on. Now, don’t get me wrong: I respect and admire the Dev/Ops folks I have met and know. I am just not in their tribe.
Apache has been my web server of choice for more than a decade. It was one of the first things I learned to compile and manage properly on Linux, so I have a great affinity for it. However, there are still a few gotchas out there that make me grateful that I still know my way around the httpd.conf file.
HTTP compression is something I have advocated for a long time (just Googled my name and compression – I wrote some of that stuff?) as just basic common sense.
Make Stuff Smaller. Go Faster. Cost Less Bandwidth. Lower CDN Charges. [Ok, I can’t be sure of the last one.]
But browsers haven’t always played nice. At least up until about 2008. After that, I can be pretty safe in saying that even the most brain-damaged web and mobile browsers could handle pretty much any compressed content we threw at them.
Oh, Apache! But where were you? There is an old rule that is still out there, buried deep in the httpd.conf file that can shoot you. I actually caught it yesterday when looking at a site using IE8 and Firefox 8 measurement agents at work. Firefox was about 570K while IE was nearly 980K. Turns out that server was not compressing CSS and JS files sent to IE due to this little gem:
BrowserMatch \bMSIE !no-gzip gzip-only-text/html
This was in response to some issues with HTTP Compression in IE 5 and early versions of IE6 – remember them? – and was appropriate then. Guess what? If you still have this buried in your Apache configuration (or any web server or hardware device that does compression for you), break out the chisels: it’s likely your httpd.conf file hasn’t been touched since the stone age.
Take. It. Out.
Your site shouldn’t see traffic from any browsers that don’t support compression (unless they’re robots and then, oh well!) so having rules that might accidentally deny compression might cause troubles. Turn the old security ACL rule around for HTTP compression:
Allow everything, then explicitly disable compression.
That should help prevent any accidents. Or higher bandwidth bills due to IE traffic.
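If you would rather not read every line of an old configuration by eye, a small script can flag the suspects. The Python sketch below is a rough, assumption-laden helper of my own: it looks for BrowserMatch-style lines that set `no-gzip` or `gzip-only-text/html` (the two environment variables mod_deflate pays attention to) and ignores settings prefixed with `!`, which unset a variable. It splits on whitespace, so it will misread quoted patterns that contain spaces, and it knows nothing about includes or other servers.

```python
# Environment variables that restrict or disable compression in mod_deflate.
RESTRICTING_FLAGS = ("no-gzip", "gzip-only-text/html")


def find_legacy_compression_rules(config_text):
    """Return (line_number, line) pairs for BrowserMatch rules that restrict compression."""
    hits = []
    for number, raw in enumerate(config_text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        if not parts[0].lower().startswith("browsermatch"):
            continue
        # parts[1] is the User-Agent pattern; everything after it sets or unsets variables.
        for setting in parts[2:]:
            if setting.startswith("!"):
                continue  # "!name" removes the variable, which lifts a restriction
            name = setting.split("=", 1)[0]
            if name in RESTRICTING_FLAGS:
                hits.append((number, line))
                break
    return hits


# Sample text for illustration; the second line is the rule quoted above.
sample = "\n".join([
    r"# compression settings",
    r"BrowserMatch \bMSIE !no-gzip gzip-only-text/html",
    r"BrowserMatch \bMSIE !no-gzip !gzip-only-text/html",
])

for number, line in find_legacy_compression_rules(sample):
    print(f"line {number}: {line}")
```

Anything it reports is a candidate for review, not proof of a problem; check each rule against the browsers you actually serve before deleting it.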
The GoDaddy DNS event (which I wrote about here) has been the subject of many a post-mortem and water-cooler conversation in the web performance world for the last week. In addition to the many well-publicized issues that have been discussed, there was one more, hidden effect that most folks may not have noticed – unless you use Firefox.
Firefox uses OCSP lookups to validate SSL certificates. If you go to a new site and connect using SSL, Firefox has a process to check the validity of the SSL cert. The results of the lookup are cached and stored for some time (I have heard 3 days, this could be incorrect) before checking again.
Before the security wonks in the audience get upset, realize I’m not an OCSP or SSL expert, and would love some comments and feedback that help the rest of us understand exactly how this works. What I do know is that anyone who came to a site that relied on an SSL cert provided and/or signed by GoDaddy at some point in its cert validation path discovered a nasty side-effect of this really great idea when the GoDaddy DNS outage occurred: If you can’t reach the cert signer, the performance of your site will be significantly delayed.
Remember this: It was GoDaddy this time; next time, it could be your cert signing authority.
How did this happen? Performing an OCSP lookup requires opening a new TCP connection so that an HTTP request can be made to the OCSP provider. A new TCP connection requires a DNS lookup. If you can’t perform a successful DNS lookup to find the IP address of the OCSP host…well, I think you can guess the rest.
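The dependency chain can be sketched in a few lines of Python. This is only a model of the first step, not how Firefox implements anything: it times a DNS lookup and reports whether the later steps could even begin. The hostname `ocsp.example.test`, the `simulated_outage` resolver, and the 0.05 second stall are all invented so the example runs without touching the network.

```python
import socket
import time


def time_dns_lookup(hostname, resolver=socket.getaddrinfo):
    """Time the DNS step that must succeed before an OCSP request can be sent.

    Returns (succeeded, elapsed_seconds). The resolver is injectable so an
    outage can be simulated without using the network.
    """
    start = time.monotonic()
    try:
        resolver(hostname, 80)
        succeeded = True
    except OSError:  # socket.gaierror is a subclass of OSError
        succeeded = False
    return succeeded, time.monotonic() - start


def simulated_outage(hostname, port, stall_seconds=0.05):
    """Hypothetical stand-in for a resolver that stalls and then fails."""
    time.sleep(stall_seconds)
    raise socket.gaierror(f"simulated outage: cannot resolve {hostname}")


# A made-up hostname, used only for illustration.
ok, elapsed = time_dns_lookup("ocsp.example.test", resolver=simulated_outage)
if not ok:
    print(f"DNS failed after {elapsed:.2f}s; the TCP connect and HTTP request never start")
```

In a real outage the stall is whatever the resolver timeout happens to be, and the visitor waits through all of it before the page can continue.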
Unlike other third-party outages, these are not ones that can be shrugged off. These are ones that will affect page rendering by blocking the download of the mobile or web application content you present to customers.
I am not someone who can comment on the effectiveness of OCSP lookups in increasing web and mobile security. OCSP lookups in Firefox are simply one more indication of how complex the design and management of modern online applications is.
Learning from the near-disaster state and preventing it from happening again is more important than a disaster post-mortem. The signs of potential complexity collapse exist throughout your applications, if you take the time to look. And while something like OCSP may look like a minor inconvenience, when it affects a discernible portion of your Firefox users, it becomes a very large mouse scaring a very jumpy elephant.
Context is everything. Where you stand when reading or watching something shapes the way you experience it. Just as Einstein explained to us in the Train/Platform Thought Experiment, the position of the observer dictates how the event is described and recorded.
There is no difference with web performance. When a company develops an online application and presents it to customers (it doesn’t matter if they are outside/retail or inside/partner/employee), the perspective of the team that approved, created, tested, and released the application becomes, as a VP at a previous company explained to me, “interesting, but irrelevant”.
Step away from the world of online application performance for a minute, and put yourself in the shoes of the customer; become a consumer. How do you feel when a site, application, or mobile app is slow to give you what you want? I’ll give you some idea:
The stress levels of volunteers who took part in the study rose significantly when they were confronted with a poor online shopping experience, proving the existence of ‘Web Stress’. Brain wave analysis from the experiment revealed that participants had to concentrate up to 50% more when using badly performing websites, while EOG technology* and behavioural analysis of the subjects also revealed greater agitation and stress in these periods. (“Web Stress: A Wake Up Call for European Business”, emphasis mine)
I know it comes from a competitor, but it is true. It applies to me; it applies to you. And web performance professionals need to step away from the screens for a minute and put themselves in the shoes of the people standing on the platform.
Every day, your online applications change, grow, fail, falter, and evolve – the train is always moving. To the people on the platform, all they see is your train and how it’s moving compared to the other trains they have watched go by. You worked hard on your train, polishing the brass, adding new cars, even upgrading the engine. To you, the train is a magnificent achievement that everyone should admire, especially now that the new engine makes it so much faster!
The customer on the platform is measuring how your updated train is moving compared to the MAGLEV bullet train on the super-conducting rail next to you and asking “How come this train is so slow?”
The complexity of a modern web site is astounding, and improving performance by 0.4 seconds is often a feat worthy of applause…among web performance professionals. From the perspective of your customers, that 0.4 second improvement is still not enough.
Web performance is a numbers game. As an industry, we have been focused on one set of numbers for too long. The customer experience, not the stopwatch, has to drive your company to the next level of performance maturity. To do that, you have to step off your online application train and take a cold hard look at what you deliver to your customers, alongside them down on the platform.
Had a great conversation with a colleague today. She and I were bouncing around some ideas, and I listed my top 3 topics in Web performance as “Speed, Revenue, and Experience”. She was quick to correct me.
“No, not revenue, conversions”.
She was right. Just last week, I talked about how critical it is to convert visitors into customers. Doing this in some businesses doesn’t mean that there is any revenue, but the goal remains the same.
Speed is the one everyone thinks is the same as Web Performance. It’s not. It’s the “don’t be that guy” measure of Web Performance, the one that can be easily quantified and put on display. But performance for an online application is so much more than raw speed.
Experience is the hardest of the three to measure, because what it is depends on who you ask. Is it design, flow, ease of use, clarity, or none of these things? But a fast application can still make people cranky. There are online applications that are clearly designed to make the customer do things the way the vendor demands and these are the ones that make you go “Why am I here?”.
Now, can all the metrics that measure Web Performance be distilled to Speed, Conversions, and Experience? If you stepped away from the very product specific terms the Web Performance industry uses every day, what would describe the final, bottled, and served essence of Web Performance?