Calvin French-Owen on February 4th 2013
It’s been said that “constraints drive creativity.” If that’s true, then PHP is a language which is ripe for creative solutions. I just spent the past week building our PHP library for Segment, and discovered a variety of approaches used to get good performance making server-side requests.
When designing client libraries to send data to our API, one of our top priorities is to make sure that none of our code affects the performance of your core application. That is tricky when you have a single-threaded, “shared-nothing” language like PHP.
To make matters more complicated, hosted PHP installations come in many flavors. If you’re lucky, your hosting provider will let you fork processes, write files, and install your own extensions. If you’re not, you’re sharing an installation with some noisy neighbors and can only upload
Ideally, we like to keep the setup process minimal and address a wide variety of use cases. As long as it runs with PHP (and possibly a common script or two), you should be ready to dive right in.
We ended up experimenting with three main approaches to make requests in PHP. Here’s what we learned.
The top search results for PHP async requests all use the same method: write to a socket and then close it before waiting for a response.
The idea here is that you open a connection to the server and then write to it as soon as it is ready. Since the socket write is fast and you don’t need the response at all, you close the connection immediately post-write. This saves you from waiting on a single round-trip time.
But as you can see from some of the comments on StackOverflow, there’s some debate about what’s actually going on here. It left me wondering: “How asynchronous is the socket approach?”
Here’s what our own code using sockets looks like:
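A minimal sketch of the write-then-close technique, not the library's actual code. The host `api.example.com`, the path, and both function names are placeholders chosen for illustration:

```php
<?php
// Build a raw HTTP/1.1 POST request carrying a JSON body.
// $host and $path are placeholders, not the real API endpoint.
function buildPostRequest(string $host, string $path, array $payload): string
{
    $body = json_encode($payload);
    return "POST {$path} HTTP/1.1\r\n"
        . "Host: {$host}\r\n"
        . "Content-Type: application/json\r\n"
        . "Content-Length: " . strlen($body) . "\r\n"
        . "Connection: close\r\n"
        . "\r\n"
        . $body;
}

// Open a socket, write the request, and close without reading the response.
// Returns true if the whole request was handed to the socket.
function fireAndForget(string $host, string $path, array $payload, float $timeout = 1.0): bool
{
    // fsockopen blocks here until the connection (and SSL handshake) completes.
    $socket = @fsockopen("ssl://{$host}", 443, $errno, $errstr, $timeout);
    if ($socket === false) {
        return false;
    }
    $request = buildPostRequest($host, $path, $payload);
    $written = fwrite($socket, $request);
    fclose($socket); // do not wait for the server's reply
    return $written === strlen($request);
}
```

A call would look like `fireAndForget('api.example.com', '/v1/track', ['event' => 'Signup'])`. Only the read is skipped; the connection setup still happens inside `fsockopen`.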
The initial results weren’t promising. A single fsockopen call was taking upwards of 300 milliseconds, and occasionally much longer.
As it turns out, fsockopen is blocking - not very asynchronous at all! To see what’s really going on here, you have to dive into the internals of how fsockopen actually works.
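One way to check this kind of claim on your own host is to time the call directly. This is a rough sketch; `timeCallMs` is a hypothetical helper and `api.example.com` is a placeholder host:

```php
<?php
// Run $fn once and return how long it took, in milliseconds (wall-clock).
function timeCallMs(callable $fn): float
{
    $start = microtime(true);
    $fn();
    return (microtime(true) - $start) * 1000.0;
}

// Example use (commented out because it needs network access):
// $elapsed = timeCallMs(function () {
//     $s = @fsockopen('ssl://api.example.com', 443, $errno, $errstr, 2.0);
//     if ($s !== false) {
//         fclose($s);
//     }
// });
// printf("fsockopen took %.1f ms\n", $elapsed);
```

If the measured time is close to a network round trip or more, the call is waiting on the connection rather than returning immediately.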
As a refresher, the basic protocol on which the internet runs is called TCP. It ensures that messages between computers are transmitted reliably and arrive in order. Since nearly all HTTP runs over TCP, we use it for our API to make writing custom clients simple.
Here’s the gist of how a TCP socket gets started:
The client sends a syn message to the server.
The server responds with a syn-ack message.
The client sends a final ack message and starts sending data.
For those of you counting, that’s a full round trip before we can send data to the server, and before fsockopen will even return. Once the connection is open, we can write our data to the socket. Typically it takes anywhere from 30-100ms to establish a connection to our servers.
While TCP connections are relatively fast, the chief culprit here is the extra handshake required for SSL.
The SSL implementation works on top of TCP. After the TCP handshake happens, the client then begins a TLS handshake.
It ends up being 3 round trips to establish an SSL connection, not to mention the time required to set up the public key encryption.
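As a back-of-the-envelope check, the round-trip counts can be turned into a latency estimate. This sketch assumes the 30-100ms connection time quoted earlier corresponds to one round trip, and it ignores the CPU cost of the key exchange, so it is a lower bound at best. `estimateSetupMs` is a hypothetical helper:

```php
<?php
// Estimate connection setup time as (round trips) x (round-trip time).
// Ignores time spent on encryption setup, so treat it as a lower bound.
function estimateSetupMs(int $roundTrips, float $roundTripMs): float
{
    return $roundTrips * $roundTripMs;
}

// One round trip for plain TCP, three in total once SSL is added.
printf("TCP only:  %d-%d ms\n", estimateSetupMs(1, 30), estimateSetupMs(1, 100));
printf("TCP + SSL: %d-%d ms\n", estimateSetupMs(3, 30), estimateSetupMs(3, 100));
// prints "TCP only:  30-100 ms" and "TCP + SSL: 90-300 ms"
```

The upper end of that range lines up with the roughly 300 milliseconds observed for a single fsockopen call over SSL.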
SSL connections in the browser can avoid some of these round-trips by reusing a shared secret which has been agreed upon by client and server. Since normal sockets aren’t shared between PHP executions, we have to use a fresh socket each time and can’t re-use the secret!
It’s possible to use socket_set_nonblock to create a “non-blocking” socket. This won’t block on the open call but you’ll still have to wait before writing to it. Unless you’re able to schedule time-intensive work in between opening the socket and writing data, your page load will still be slowed by
A better approach is to open a persistent socket using pfsockopen. This will re-use earlier socket connections made by the PHP process, so a TCP handshake isn’t required each time. Though the initial latency is higher the first time a request is made, I was able to send over 1000 events/sec from my development machine. Additionally, we can decide to read from the response buffer when debugging, or choose to ignore it in production.
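A minimal sketch of the persistent-socket idea, assuming a hypothetical `sendPersistent` helper and a `$debug` flag that are not part of the original library:

```php
<?php
// Write $request over a persistent socket.
// Returns null if the connection or write failed, '' in production mode,
// or the first line of the response when $debug is true.
function sendPersistent(string $host, int $port, string $request, bool $debug = false): ?string
{
    // pfsockopen re-uses an open connection to the same host and port if this
    // PHP process already has one; otherwise it connects (and blocks) once.
    $socket = @pfsockopen($host, $port, $errno, $errstr, 1.0);
    if ($socket === false) {
        return null;
    }
    if (fwrite($socket, $request) === false) {
        return null;
    }
    $response = '';
    if ($debug) {
        stream_set_timeout($socket, 1);       // avoid hanging on a silent server
        $response = (string) fgets($socket);  // read just the status line
    }
    // Deliberately no fclose(): closing would drop the connection we want to re-use.
    return $response;
}
```

For an SSL endpoint the host would be passed with the `ssl://` prefix and port 443. A production version would also need to detect a stale connection that the server has closed and reconnect.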
To sum it up:
Sockets can still be used when the daemon running PHP has limited privileges.
fsockopen is blocking and even non-blocking sockets must wait before writing.
Using SSL creates significant slowdown due to extra round-trips and crypto setup.
Opening a connection sets every page request back
pfsockopen will block the first time, but can re-use earlier connections without a handshake.
Sockets are great if you don’t have access to other parts of the machine, but an approach which will give you better performance is to log all of the events to a file. This log file can then be processed “out of band” by a worker process or a cron job.
The file-based approach has the advantage of minimizing outbound requests to the API. Instead of making a request whenever we call identify from our PHP code, our worker process can make requests for 100 events at a time.
Another advantage is that a PHP process can log to a file relatively quickly, processing a write in only a few milliseconds. Once PHP has opened the file handle, appending to it with fwrite is a simple task. The log file essentially acts as the “shared memory queue” which is difficult to achieve in pure PHP.
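A rough sketch of the logging side, assuming one JSON object per line. The `FileLogger` class and its field names are hypothetical, not the library's actual format:

```php
<?php
// Appends events to a log file, one JSON object per line.
final class FileLogger
{
    private $handle;

    public function __construct(string $path)
    {
        // 'a' opens for appending and creates the file if it does not exist.
        $handle = @fopen($path, 'a');
        if ($handle === false) {
            throw new RuntimeException("Cannot open log file: {$path}");
        }
        $this->handle = $handle;
    }

    // Returns true if the whole line was written.
    public function log(string $type, array $data): bool
    {
        $line = json_encode(['type' => $type, 'data' => $data, 'ts' => time()]) . "\n";
        return fwrite($this->handle, $line) === strlen($line);
    }

    public function __destruct()
    {
        fclose($this->handle);
    }
}
```

A call such as `$logger->log('identify', ['userId' => 'u1'])` then costs one local write instead of a network round trip.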
To read from the analytics log file, we wrote a simple Python uploader which uses our batching analytics-python library. To ensure that the log files don’t get too large, the uploading script renames the file atomically. PHP processes that are actively writing can still write to their existing file handles in memory, and new requests create a new log file where the old one used to be.
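The uploader itself is a Python script; the sketch below only illustrates the rename step, written in PHP to match the rest of this post. It assumes a POSIX filesystem, where a rename within one filesystem is atomic, and `rotateLog` is a hypothetical name:

```php
<?php
// Move the live log aside so a worker can upload it.
// Writers holding an open handle keep writing to the renamed file;
// the next fopen($logFile, 'a') creates a fresh file at the old path.
// Returns the new path, or null if there was nothing to rotate.
function rotateLog(string $logFile): ?string
{
    $rotated = $logFile . '.' . uniqid('', true);
    return @rename($logFile, $rotated) ? $rotated : null;
}
```

The worker would then read and upload the rotated file, and delete it once the upload succeeds.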
There’s not too much magic to this approach. It does require a bit more work on the side of the developer to set up the cron job and separately install our Python library through PyPI. The key takeaways are:
Writing to a file is fast and takes few system resources.
Logging requires some drive space, and the daemon must have capabilities to write to the file.
You must run a worker process to process the logged messages out of band.
As a last alternative, your server can run exec to make requests using a forked curl process. The curl request can complete as part of a separate process, allowing your PHP code to render without blocking on the socket connection.
In terms of performance, forking a process sits between our two earlier approaches. It is much quicker than opening a socket, but more resource intensive than opening a handle to a file.
To execute the forked curl process, our condensed code looks like this:
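A minimal sketch of the idea rather than the library's actual code. The URL is a placeholder, and `buildCurlCommand`, `forkCurl`, and the `$debug` flag are hypothetical names:

```php
<?php
// Build a shell command that POSTs $payload as JSON with curl.
// escapeshellarg() keeps the payload and URL from being interpreted by the shell.
function buildCurlCommand(string $url, array $payload, bool $debug = false): string
{
    $cmd = 'curl -X POST'
        . ' -H ' . escapeshellarg('Content-Type: application/json')
        . ' -d ' . escapeshellarg(json_encode($payload))
        . ' ' . escapeshellarg($url);
    if (!$debug) {
        // Discard all output and background the process so exec() returns at once.
        $cmd .= ' > /dev/null 2>&1 &';
    }
    return $cmd;
}

// Fork curl. In debug mode exec() waits for curl and collects its output.
function forkCurl(string $url, array $payload, bool $debug = false): array
{
    $output = [];
    exec(buildCurlCommand($url, $payload, $debug), $output);
    return $output;
}
```

Calling `forkCurl('https://api.example.com/v1/import', ['event' => 'Signup'])` returns as soon as the shell has started curl in the background.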
If we’re running in production mode, we want to make sure that we aren’t waiting on the forked process for output. That’s why we add "> /dev/null 2>&1 &" to our command, to ensure the process gets properly forked and doesn’t log anywhere.
The equivalent shell command looks like this:
It takes a little over 1ms to fork the process, which then uses around 4k of resident memory. While the curl process takes the standard SSL 300ms to make the request, the exec call can return to the PHP script right away! This lets us serve up the page to our clients much more quickly.
On my moderately sized machine, I can fork around
curl requests per second without them stacking up in memory. Without SSL, it can do significantly more:
Forking a process without waiting for the output is fast.
curl takes the same time to make a request as a socket, but it is processed out of band.
Forking curl requires only normal Unix primitives.
Forking sets a single request back only a few milliseconds, but many concurrent forks will start to slow your servers.
While not an approach to making async requests, we found that destructor functions help us batch API requests.
To reduce the number of requests we make to our API, we want to queue these requests in memory and then batch them to the API. Without using runtime extensions, this can only happen on a single script execution of PHP.
To do this we create a queue on initialization. When the script ends its execution, we send all the queued requests in batch:
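A minimal sketch of this pattern, assuming a hypothetical `BatchQueue` class that hands the batch to whatever callable does the actual sending:

```php
<?php
// Collects items in memory and sends them as one batch when the object is destroyed.
final class BatchQueue
{
    private $queue = [];
    private $sender;

    // $sender is any callable that accepts the array of queued items,
    // e.g. a function that writes them to a socket.
    public function __construct(callable $sender)
    {
        $this->sender = $sender;
    }

    public function enqueue(array $item): void
    {
        $this->queue[] = $item;
    }

    // Send everything queued so far; a second call with nothing queued is a no-op.
    public function flush(): void
    {
        if ($this->queue === []) {
            return;
        }
        $batch = $this->queue;
        $this->queue = [];
        ($this->sender)($batch);
    }

    // PHP calls this when the object is released or the script ends.
    public function __destruct()
    {
        $this->flush();
    }
}
```

Because the queue is emptied before the sender runs, an explicit `flush()` followed by the destructor will not send the same items twice.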
We establish the queue when the object is created, and then flush the queue when the object is ready to be destroyed. This guarantees that our queue is only flushed once per request.
Additionally, we can create the socket itself in a non-blocking way in the constructor, then attempt to write to it in the destructor. This gives the connection more time to be established while the PHP interpreter is busy trying to render the page - but we will still have to wait before actually writing to the socket.
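The connect-early, write-late idea can be sketched as follows. This version swaps in `stream_socket_client` with the `STREAM_CLIENT_ASYNC_CONNECT` flag rather than `socket_set_nonblock`, and uses plain TCP for brevity, since an SSL connection would need an extra step to enable encryption after connecting. `DeferredSocketWriter` is a hypothetical class name:

```php
<?php
// Starts connecting in the constructor and writes in the destructor,
// so the handshake overlaps with the rest of the page's work.
final class DeferredSocketWriter
{
    private $socket = null;
    private $buffer = '';

    // $address looks like "tcp://host:port".
    public function __construct(string $address)
    {
        // With the async flag this call returns before the connection is established.
        $socket = @stream_socket_client(
            $address, $errno, $errstr, 1.0,
            STREAM_CLIENT_CONNECT | STREAM_CLIENT_ASYNC_CONNECT
        );
        if ($socket !== false) {
            $this->socket = $socket;
        }
    }

    public function queue(string $data): void
    {
        $this->buffer .= $data;
    }

    public function __destruct()
    {
        if ($this->socket === null) {
            return;
        }
        if ($this->buffer !== '') {
            $read = null;
            $write = [$this->socket];
            $except = null;
            // Wait up to one second for the socket to become writable,
            // which signals that the connection has completed.
            if (stream_select($read, $write, $except, 1) > 0) {
                fwrite($this->socket, $this->buffer);
            }
        }
        fclose($this->socket);
    }
}
```

If the page takes longer to render than the connection takes to establish, the wait in `stream_select` is close to zero; otherwise the destructor still blocks for the remainder.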
Our holy grail is a pure-PHP implementation which doesn’t interface with other processes, yet still is conservative when it comes to making requests. We’d like to make developer setup as easy as possible without requiring a dedicated queue on a separate host.
In practice, this is extremely hard to achieve. Each one of our methods has caveats and restrictions depending on how much traffic you’re dealing with and what your system allows you to do. Since no single approach can cover every use case, we built different adapters to support different users with different needs.
Originally we used the curl forking approach as our default. Forking a process doesn’t cause a significant performance hit for page load, and is still able to scale out to many requests per second per host. However, this is limited to the configuration of the host, and can have scary consequences if your PHP program starts forking too many processes at once.
After switching to persistent sockets, we decided to make the socket approach our default. Without a TCP handshake for every request, the sockets can deal with thousands of requests per second. This approach also has significantly better portability than the
For really high-traffic clients who have a bit more control over their own hardware, we still support the log file system. If the process which actually serves the PHP can’t re-use socket connections, then this is the best option from a performance perspective.
Ultimately, it comes down to knowing a little bit about the limitations of your system and its load profile. It’s all about determining which trade-offs you’re comfortable making.
Edit 2/6/13: Originally I had stated that we used the curl forking approach for our default and had ignored persistent sockets altogether. After switching to persistent sockets, the performance of the socket approach increased enough to make it our default approach. It also has better portability across PHP installations.
PS. If you’re running WordPress, we also released a WordPress plugin that handles everything for you!