Performance of non-blocking writes to files via PHP

von

This is not too easy. At Log4PHP we have exactly that problem right now. Somebody is using the FileAppender and figured out, one Apache process was waiting long time before it could write. Reason: the logger locked the Logfile for the whole time of the request. If you have lots of requests, you can think what it means. Performance is gone, in the case.

Time for me to think about the different options to write to log files.

I figured out, that I have to compare the following options:

1) Not closing the file while the whole request is running. This is not an option in a live system, but will give me a good idea whats currently the case

$fp = fopen($file, 'a+');
while($count > $loop) {
   fwrite($fp, $text);
}
fclose($fp);

2) Closing the file directly after fwrite is called

while($count > $loop) {
  $fp = fopen($file, 'a+');
  fwrite($fp, $text);
  fclose($fp);
}

3) Use file_put_contents, which is known as an alias to fopen, fwrite and fclose

while($count > $loop) {
  file_put_contents($file, $text, FILE_APPEND);
}

4) Leave the file open while the whole request, but unlock it with flock and flock it again, when the next log event occurs

$fp = fopen($file, 'a+');
flock($fp, LOCK_UN);
while($count > $loop) {
  if (flock($fp, LOCK_EX)) {
    fwrite($fp, $text);
  }
  flock($fp, LOCK_UN);
}
fclose($fp);

5) Use a nonblocking stream for this and flock

  $fp = fopen($file, 'a+');
stream_set_blocking($fp, 0);

while($count > $loop) {
  if (flock($fp, LOCK_EX)) {
    fwrite($fp, $text);
  }
  flock($fp, LOCK_UN);
}
fclose($fp);

6) Use the error_log method, which my friend Kevin Horst brought up

while($count > $loop) {
  error_log($text, 3, $file);
}

Please note: As huangxuan (see comments) let me knew fwrite doesn’t need an exclusive lock as it is an atomic operation. I am pretty sure this information was not available in 2009 when I read the docs. If you are just doing a simple write operation, you might not even need to lock the file. If you are doing some more things, like cutting the logfile and creating a new one, you might still want to use the exclusive lock approach.

For each of this options I wrote a simple function which wrote 10000 times 100 characters in a freshly created log file. I measured before opening and after closing. Additionally I tried out with 2 separated threads if the write access is non-blocking. Good thing is, option 2 to 6 are actually non-blocking. And here are the timing results:

  1. Execution with NOT closing the log file took 0.0668561458588 seconds
  2. Execution with CLOSING the log after each write file took 30.1630220413 seconds
  3. Execution with file_put_content took 30.153963089 seconds
  4. Execution with leaving the file open, but LOCKING and UNLOCKING it took 0.148998975754 seconds
  5. Execution with nonblocking stream took 0.149605989456 seconds
  6. Execution with the error_log method took 30.069578886 seconds

Let’s see what it means.

Not closing the file until the 10000 fwrite calls are handled as actually the fastest. No surprise. It just took 0.0668 seconds but this one is not really an option, cause other threads have to wait until this request has been finished.

If you close the file after each fwrite and make it available to other threads and then reopen it, is hell. For 10000 calls of this kind I needed 30 seconds! It’s insane to have this in a productive system. The same goes with file_put_contents. Well, I allready knew (its in the php docs) that this method is nothing else then a wrapper for fopen, fwrite and fclose. Times are so similar that I say it’s exactly the same. Sometimes the one is some millis faster, sometimes the other.

If you open the file, unlock it with flock and flock it again it works very well. Just 0.14 ms for the 10000 fwrite calls. That’s the double amount of option 1, but yeah, here we do some more stuff. The interpreter cares about who is allowed to write, together with the OS. flock works that way, that a call to this function blocks until the requester gets the actual lock. You can be sure that only one thread is actually writing.

Same goes to the non-blocking stream. This works with stream_set_blocking($fp, 0);. The file stream is non-blocking, means each thread could write at it the same time. That no mess happens, we need an flock here too. That brings us to the nearly same results as fopen, flock, fwrite, flock, fclose option above. But looking at the logfiles of this one and option 4, this one looks more nice to me. This is just subjective, but it looks like the lock is shared more nicely between both threads.

Last one is the error_log method. It didn’t had any idea what to expect, but… 30 seconds! This one behaviours like a wrapper for fopen, fwrite and fclose, like file_put_content. No guys, this is not really a method enabled for logging! If one would use this in a framework like Log4PHP, that would be hell to performance. I would think that this should better removed out. The name suggests a good logging method, but this is not the case.

Having that all said, Log4PHP will get the lock and unlock option number 4. I feel good with it, since it’s quite straightforward. I don’t have too much experience with the non blocking stream and don’t want to have this in a framework like log4php is.

However, Logging must be used carefully at all. I think on a system with 10000 request a second. Enabling logging into one file could bring the system down. I think a live system should have the option to log exactly one request. Maybe triggered by an url param. Think carefully what you log and how you configure your live system.

The complete script can be found here.

Tags: #Apache Log4PHP #Performance #PHP