Performance is one of the main keys to getting and keeping website traffic over time, as it strongly impacts both SEO and visitor behavior. It’s easy to get distracted from the website you’re visiting when every interaction makes you wait a few seconds, so don’t underestimate the importance of speed in your code if you want to keep your audience.
The easiest way to identify the bottlenecks of an application is to use a profiler, which details the list of functions called while processing a request. You can then immediately spot the most time-consuming components of the page. I personally use the xDebug profiler, as I like the fine graphical representations you can get from the generated report files, but other tools exist for the same purpose, one of the most popular being xhprof, an extension developed by Facebook.
Installation and configuration
The installation is pretty straightforward on any Linux-based OS, using your package manager:

```shell
# debian
sudo aptitude install php5-xdebug

# centos
sudo yum install php5-xdebug
```
On a Mac you will probably also need the GraphViz package installed, with the “dot” binary located in your /usr/bin directory.
You can then enable the profiler by editing the module file in the PHP configuration directory, whose location depends on your setup (for example /etc/php5/apache2/conf.d/xdebug.ini or /etc/php.d/xdebug.ini). In this file, you can modify xDebug’s configuration and, more specifically, activate the profiler through the “profiler_enable” parameter:
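The relevant part of xdebug.ini would look like this (the path to the extension file varies by system, so treat it as a placeholder):

```ini
; load the extension (the exact path depends on your distribution)
zend_extension=/usr/lib/php5/20090626/xdebug.so

; enable the profiler for every request
xdebug.profiler_enable=1
```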
After changing this file, you just need to reload Apache for it to be taken into account.
I personally use another configuration key to enable the profiler: “xdebug.profiler_enable_trigger=1”. The difference with the previous one is that the profiler is only triggered when a XDEBUG_PROFILE=1 parameter is present in the request (GET, POST or cookie data), and it stays idle the rest of the time. As the overhead of the profiler is really significant, this lets me use it when I need it without changing the Apache configuration every time.
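With this approach, the xdebug.ini file would contain something like this instead (again, the extension path is system-dependent):

```ini
zend_extension=/usr/lib/php5/20090626/xdebug.so

; do not profile every request...
xdebug.profiler_enable=0
; ...only those carrying a XDEBUG_PROFILE parameter
xdebug.profiler_enable_trigger=1
```

A profiled request then looks like http://example.com/some-page?XDEBUG_PROFILE=1 (a hypothetical URL), while every other request runs at normal speed.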
We will also need a tool to visualize the data from our page. You can use kCacheGrind (available through your package manager) if you’re working on Linux, or qCacheGrind (which is the same software) on Windows or Mac OS X.
Generating the profile data
Once everything has been configured, all that is needed is to refresh the page you want to profile (using the XDEBUG_PROFILE=1 parameter if you went for this option). You’ll notice the page will take longer to display, maybe 3 or 4 times the normal duration, because xDebug is tracing all function calls that are happening.
This process will generate a file in the /tmp folder of your server called cachegrind.out.*, containing the collected data for this request. Depending on the page complexity, its size may vary between a few megabytes and several hundred megabytes if the page is really slow. It’s possible to change the pattern of the generated file name and the destination directory by tweaking the configuration (see the documentation for more details). I advise you to rename the file to something more relevant to the analysis (like cachegrind.out.homepage) if you plan to profile multiple pages successively.
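As an example, the destination directory and the file name pattern can be controlled with these two settings (the output directory shown here is an arbitrary choice; %R is one of the documented specifiers and appends the request URI):

```ini
; where the cachegrind files are written (defaults to /tmp)
xdebug.profiler_output_dir=/var/log/xdebug

; file name pattern; %R expands to the request URI,
; e.g. cachegrind.out._products_list
xdebug.profiler_output_name=cachegrind.out.%R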
Once the data has been generated, just open the cachegrind file inside kCacheGrind.
You should see a three-section window with various information in it. In the left column you have the most costly functions called during the rendering of your page. On the right, the top pane provides details about the callers when you focus on a given function, whereas the bottom one displays data about the callees of the same function.
For instance, if you use the “callee map” tab when selecting a function, you can get an overview of the time share of each sub-function call (the surface of a rectangle not occupied by another rectangle — each nested rectangle being a sub-function call — corresponds to the “self” part of the function):
The left-hand column will be your first point of focus. As I said, the most time-consuming functions are listed here, with the amount of time they took to process. You’ll notice that two times are displayed: “self” indicates the time spent in the body of the function itself, while “incl” also counts the time spent in sub-functions. Knowing this, it’s logical that the entry point of the application has an “incl” of 100%. The other helpful piece of information is the “called” column, showing the number of times each function was called during the execution. You can therefore quickly spot an anomaly when some function is called thousands of times when you wouldn’t expect it to be.
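To make the self/incl distinction concrete, here is a minimal, hypothetical PHP sketch (the function names are invented for illustration). Profiling it would show child() with a large “self” time, while parent_func() would have a small “self” time but a large “incl” time, since the latter includes the cost of its calls to child():

```php
<?php
// Hypothetical example: "self" vs "incl" time in a profile.

function child() {
    // Heavy work: all of this loop counts as child()'s "self" time.
    $s = 0;
    for ($i = 0; $i < 1000000; $i++) {
        $s += $i;
    }
    return $s;
}

function parent_func() {
    // The loop bookkeeping is parent_func()'s "self" time...
    $total = 0;
    for ($i = 0; $i < 10; $i++) {
        // ...but the time spent inside child() is only added
        // to parent_func()'s "incl" time.
        $total += child();
    }
    return $total;
}

parent_func();
```

The “called” column would also report child() as called 10 times here, which is exactly the kind of count that makes unexpected call explosions easy to spot.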
My sample data
I was recently asked why a particular page was so terribly slow (the response time was about 5 seconds) while it only displayed a few elements and didn’t even do heavy DB access. The application was a symfony 1.4 website running on a Vagrant machine, itself hosted on a decently configured host, so the reason for this slowness wasn’t obvious.
When taking a look at the functions list, a few lines immediately attracted attention due to their high “self” time :
We can immediately notice that the number of calls is very high for each of these functions (except for the file() function), so there must be something going on here. Let’s take a look at the function sfYamlInline::dump and examine its call graph in the bottom-right section of kCacheGrind:
We can see quite easily that the function calls we spotted earlier are actually related as cascading calls, so by eliminating the root cause, all of them should drop significantly. This pattern is actually quite common: several functions appear to be very time-consuming in the overall page generation, but they are all related to the same cause. The key point is to find the function call where things go wrong; you can then diagnose the issue by looking directly at your code.
So what was the root cause in my case? Well, you can see that the calls to the Yaml classes are initiated mostly by the debug panel, so the debug mode was one of the reasons the problem was spotted. Moreover, the design of the application made it access a very heavy configuration file many times through the sfYamlInline class, making the development machine very slow on this particular page.

We also know from the function list that the calls to php::file() have a heavy cost, so we needed to figure out why. If you remember that the website was running in a Vagrant environment, the answer may have come to you by now: the system can perform quite badly when asked to do many disk I/O operations, due to the poor shared-directory system used by VirtualBox (it can be improved with NFS but remains heavy). So the solution here would be to rework the processing of the page into something much cleaner, with a smaller configuration file, to avoid the I/O and the file parsing. Either way, we got our diagnosis with little effort just by profiling the page!
xDebug offers other features for developers, like the step-by-step debugger, but since I care about keeping good performance, the profiler is the one I use most, in combination with a tool like kCacheGrind. It’s easy to set up, so there’s nothing holding you back from using it. The one thing missing is a way to observe memory consumption the way xhprof offers it, but unfortunately xDebug’s profiler seems to be focused on time measurements only.