One of the most popular logging libraries for Node.js is Winston, a great library that lets you easily send your logs to any number of services directly from your application. For lots of cases Winston is all you need. However, there are some problems with this technique when you’re dealing with mission-critical nodes in a distributed system, and to solve them we wrote a simple solution called Yet-Another-Logger.

The Problem

The typical multicast logging setup looks something like this:
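Concretely, each application node creates one logger with a transport per service, so every log call fans out directly from the app. Here’s a minimal sketch using classic Winston transports; the particular transports and the filename are just placeholders:

var winston = require('winston');

// one logger, multicasting every log line to several
// services straight from the application process
var log = new (winston.Logger)({
  transports: [
    new (winston.transports.Console)(),
    new (winston.transports.File)({ filename: '/var/log/app.log' })
    // ...plus a transport for each third-party service
  ]
});

log.info('viewed page', { user: 'tobi' });

This is convenient, but it couples every service integration to the application process itself.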

The biggest issue with this technique for us was that many of these logging plugins are only enabled in production, or cause problems that are only visible under heavy load. For example, it’s not uncommon for such libraries to use CPU-intensive methods of retrieving stack traces, to leak memory, or, even worse, to throw uncaught exceptions!

Another major drawback is that if your application is network-bound like ours is, then sending millions of log requests out to multiple services can quickly take its toll on the network, slowing down everything else.

Finally, the use of a logging intermediary lets you add or remove services at will, without re-deploying most of your cluster or making code changes to the applications themselves.

The Solution

Our solution was to build a simple client/server system of nodes, isolating any problems to a small set of servers whose sole job is to fan out the logs. We call it Yet-Another-Logger, or YAL.

It’s made up of two parts: the YAL Client, which sends data to a cluster of YAL Servers, which in turn fan your logs out to the target services. Together they give you an architecture like this:

YAL Client

The Yet-Another-Logger client is pretty much what you would expect from a standard logging client. It has some log-level methods and accepts type and message arguments; standard stuff. The only difference is that you instantiate the client with an array of YAL Server addresses, which it uses to round-robin:

var Logger = require('yal');

// connect to every YAL Server; log events are
// round-robined across these addresses
var log = new Logger([
  'tcp://10.0.0.95:5000',
  'tcp://10.0.0.96:5000',
  'tcp://10.0.0.97:5000'
]);

// each call sends a type plus arbitrary metadata
setInterval(function(){
  log.info('viewed page', { user: 'tobi' });
}, 300);

setInterval(function(){
  log.info('signed in', { user: 'jane' });
}, 1000);

setInterval(function(){
  log.error('oh no boom', { something: 'here' });
}, 3000);

YAL is backed by Axon, a ZeroMQ-inspired messaging library. The great thing about this is that when a node goes down, messages are routed to the remaining stable nodes, and delivery to that node resumes when it comes back online.
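That failover behavior comes from Axon’s push/pull pattern. Here’s a stripped-down sketch of the same mechanics, independent of YAL (the addresses and message shape are placeholders):

var axon = require('axon');

// client side: a "push" socket round-robins messages across all
// connected peers, and queues them while a peer is unreachable
var sock = axon.socket('push');

sock.connect('tcp://10.0.0.95:5000');
sock.connect('tcp://10.0.0.96:5000');

setInterval(function(){
  sock.send({ type: 'info', msg: 'hello' });
}, 1000);

// server side (normally a separate process): a "pull" socket
// receives its share of the messages
var pull = axon.socket('pull');

pull.bind('tcp://0.0.0.0:5000');

pull.on('message', function(msg){
  console.log(msg);
});

In practice the two halves live in separate processes; the point is that the push socket handles the queueing and re-routing for you.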

YAL Server

The YAL server is also extremely simple. It accepts log events from the clients and distributes them to any number of configured services, taking the load off of mission-critical applications.

At the time of writing YAL Server is a library and does not provide an executable, though one may be provided in the near future. Until then, a typical setup involves writing a small executable specific to your system.

Server plugins are simply functions that accept a server instance and listen on the 'message' event, which makes writing them really easy. It also makes it trivial to re-use an existing Winston setup: just plunk your Winston code right into a YAL Server plugin.

var Server = require('yal-server');
var server = new Server();

// accept log events from YAL clients
server.bind('tcp://0.0.0.0:5000');

// register a plugin
server.use(stdout);

// a minimal plugin: write every message to stdout
function stdout(server){
  server.on('message', function(msg){
    console.log(msg);
  });
}
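And since plugins are just 'message' listeners, dropping an existing Winston setup in is a matter of wrapping it in a function. A hypothetical sketch; the transports, and the msg.level and msg.type fields, are assumptions rather than YAL’s documented message shape:

var winston = require('winston');

// forward every YAL message to Winston, which multicasts
// to its own configured transports
function winstonPlugin(server){
  var logger = new (winston.Logger)({
    transports: [
      new (winston.transports.Console)(),
      new (winston.transports.File)({ filename: '/var/log/yal.log' })
    ]
  });

  server.on('message', function(msg){
    logger.log(msg.level || 'info', msg.type, msg);
  });
}

server.use(winstonPlugin);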

I’d recommend always running at least three YAL Servers in a cluster for redundancy, so that losing a single node won’t cost you any log data.

That’s it!

That’s all for now! The two pieces themselves are very simple, but combined they give your distributed system a nice layer of added protection against logging-related outages.

Coming up soon I’ll be blogging about some Elasticsearch tooling that we’ve built exclusively for digging through all of those logs we’re sending through YAL!