Speaker: Alan Novitskiy, Performance Lead at OpenTable
I’m a performance lead here at OpenTable. Part of my job is to ensure high performance and throughput. I work to detect bottlenecks, communicate those to engineers, and make sure that we optimize the sections of our apps that are responsible for a good dining experience.
So we realized that our apps really need to be fast because our major competitor is the phone. People who come online for reservations kind of expect a faster, more convenient experience, and I feel like if we don’t deliver that for them they’re going to go back to using their phone, possibly for a long time. We’d really be losing out on that business.
We had done thorough code analysis in pre-production environments, we had done rigorous benchmarking, but when we release things into the wild all kinds of new variables are introduced, and we didn’t have full visibility into what those different variables were.
So AppDynamics wasn’t the first vendor that we looked at for an APM solution, and we already had a little bit of experience with the evaluation process. Because we had seen the overhead with the other solution, the first thing we did was put AppDynamics through our benchmarking environment to make sure that the overhead was within the promised numbers, and we found that to be acceptable. So we moved further: we installed it on a single node in our production environment and actually started looking at what kind of data it would produce for us.
I really liked how AppDynamics detected issues automatically. It’s really able to tell you about problems before you even know that they exist. With other solutions you have to point them specifically at the areas that you’re interested in monitoring, whereas AppDynamics is really good about just going out there, detecting key areas of your application, and starting to collect statistics about them.
Another thing that really impressed me is the way it decides what counts as slow. With other solutions, what I’ve seen is that you set static thresholds, a fixed number of milliseconds that a transaction is supposed to take. AppDynamics instead takes an average and computes a standard deviation around it, and if a single request lands far enough above that average that it looks like it’s misbehaving, it lets you know. I think that’s really important, because that’s a lot better for detection than having a set number.
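To make the contrast concrete, here is a minimal sketch of the two detection styles described above: a fixed millisecond threshold versus a baseline built from the average and standard deviation of recent response times. This is an illustration only and not AppDynamics’ actual algorithm; the function names, the 500 ms threshold, the two-deviation cutoff, and the sample history are all assumptions made up for the example.

```python
from statistics import mean, stdev

def is_slow_static(response_ms, threshold_ms=500.0):
    """Static rule: flag a request only if it exceeds a fixed number of milliseconds."""
    return response_ms > threshold_ms

def is_slow_baseline(response_ms, history_ms, num_deviations=2.0):
    """Baseline rule: flag a request that lands more than `num_deviations`
    standard deviations above the average of recent response times."""
    if len(history_ms) < 2:
        # A standard deviation needs at least two samples; with less data, flag nothing.
        return False
    baseline = mean(history_ms)
    spread = stdev(history_ms)
    return response_ms > baseline + num_deviations * spread

# Hypothetical recent response times (ms) for one transaction type.
history = [100, 110, 90, 105, 95]

# A 180 ms request is far above this transaction's usual ~100 ms,
# yet it stays well under the fixed 500 ms threshold.
print(is_slow_static(180))             # False
print(is_slow_baseline(180, history))  # True
```

The point of the sketch is that the baseline rule adapts to each transaction’s normal behavior, so a request that is unusually slow for that transaction gets flagged even when it would pass a one-size-fits-all limit.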
In the first few months of having it running in our production environment we’ve already gotten a much better and clearer picture of our systems and how everything interacts, and we’ve already spotted a few specific issues that we were able to address pretty quickly.
I think you should let your application monitoring solution tell you about problems that you haven’t thought about. We often tend to focus on those key business transactions that directly contribute to the bottom line and sometimes we fail to see the other things in the system that could be affecting performance. If your solution is able to give you a better, clearer picture of your whole system, that’s really powerful knowledge.