Fanout

Fanout is a real-time API development kit that helps you push data to connected devices easily. Fanout is a cross between a reverse proxy and a message broker. This unique design lets you delegate away the complexity and load of realtime data push, while leveraging your API stack for business logic.

The Case for a Push CDN

It is true that there are a bunch of software solutions that make pushing data in realtime easier, and if you’re enthusiastic about maintaining your own servers then a cloud service may not be that interesting. It’s important to recognize, though, that Fanout Cloud is about more than just making push easy. It’s about making it scalable.

The key to scaling is delegating work among many machines. Fanout Cloud achieves this by load balancing message deliveries across a set of servers. This delegation is even more important for push than it is for traditional request/response traffic. Aside from cases like a single tweet from Ashton Kutcher driving thousands of people to pounce on your website simultaneously, requests tend to be evenly distributed over a given period of time, with a rise during peak hours. This is because there is generally no coordination between clients, and in the case of web traffic people click links with a degree of random timing. Push traffic, on the other hand, is bursty. Suppose your single-server website handles 200 hits per second at maximum, but every once in a while you need to push a message to 5000 recipients. If we say the work needed to handle a hit is roughly the same as the work needed to push a message, then it would take 25 seconds to make all of the deliveries. That’s a long time. Ideally, pushes would be near instantaneous, but is this practical? According to the math, getting 5000 deliveries out in 1 second would require 25 servers!
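The arithmetic above can be checked with a short sketch. The function names are illustrative only, and the code keeps the text's simplifying assumption that pushing one message costs about the same as serving one hit.

```python
import math

def seconds_to_deliver(recipients: int, hits_per_second: int, servers: int = 1) -> float:
    """Time to push one message to every recipient, spread evenly over the servers."""
    return recipients / (hits_per_second * servers)

def servers_needed(recipients: int, hits_per_second: int, deadline_seconds: float) -> int:
    """Smallest number of servers that finishes all deliveries within the deadline."""
    return math.ceil(recipients / (hits_per_second * deadline_seconds))

print(seconds_to_deliver(5000, 200))   # 25.0 seconds on a single server
print(servers_needed(5000, 200, 1.0))  # 25 servers for a one-second push
```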

This is where the power of a shared service can come into play. If you need to push a message to 5000 recipients instantly, once per hour, it is likely wasteful to invest in a heap of distributed server infrastructure that will spend 3599 of every 3600 seconds idle. On the other hand, what if 3600 organizations with identical requirements split the cost? Suddenly a state-of-the-art infrastructure becomes not only affordable, but a steal. This is the reason traditional content delivery networks (CDNs) are popular. Just look at Akamai or Amazon CloudFront. These are powerful services that you would have little hope of replicating on your own unless you are in the business of making CDNs.

I think cost alone makes the case here, but some people may point out that dependence on an external service comes with drawbacks. Certainly this is true. Often, the choice to use the cloud is not a pure win but a trade-off: simpler administration in exchange for some loss of control or increased latency. However, keep in mind that the trade-offs vary by service type. If you maintain your documents in Google Docs, you lose control, increase latency, and even limit accessibility (offline situations). If you use the Tumblr service instead of a WordPress install on your own server, you lose control, but latency and accessibility should remain about the same.

With Fanout Cloud, you lose a little bit of control in the way you route network transmissions, but that’s about it. You don’t lose control of your data, as Fanout Cloud is not a database and does not store your data. You might think Fanout Cloud would introduce latency, and while the truth is that it does, it’s the kind of latency that is unavoidable as you grow. In other words, if your web service today consists of a single server in a single location, then Fanout Cloud will indeed add latency. If you’ve grown to the point where you need multiple servers in multiple locations (for example, one server in California and one server in Virginia), then you may already be introducing a small but necessary amount of latency for the sake of scalability. To achieve a Fanout level of scale or throughput on your own would require making the very same trade-offs, with your own infrastructure.

Fanout Overview

In a nutshell, clients connect to Fanout Cloud to listen for data, and API calls can be made to Fanout Cloud to send data to one or more connected clients. It’s like a publish-subscribe service, but with a twist: incoming client requests are proxied to a configured origin server (e.g. your API backend server), and Fanout Cloud’s behavior is determined by the responses it receives.

The network architecture looks like this:

_images/fanout-diagram-small.png

Clients connect to Fanout Cloud, and Fanout Cloud communicates with the origin server using regular, short-lived HTTP requests. The origin server application can be written in any language and use any webserver. There are two main integration points:

  1. The origin server must handle proxied requests from Fanout Cloud. For HTTP, each incoming request is proxied to the origin server. For WebSockets, the activity of each connection is translated into a series of HTTP requests sent to the origin server. The responses from the origin server are used to control which publish-subscribe channels to associate with each connection, among other things.
  2. Your application must send data to Fanout Cloud whenever there is data to push out to listeners. This is done by making an HTTP POST request to Fanout Cloud’s Publish endpoint. The data will then be injected into any client connections as necessary.

Additionally, Fanout Cloud supports pushing data using Webhooks, in which case the receivers are not clients but servers able to accept HTTP requests.

Product Site

Documentation

Pushpin

Pushpin’s primary value proposition is that it is an open source solution (the open source version of Fanout) that enables real-time push, a requisite of evented APIs (GitHub Repo). At its core, it is a reverse proxy server that makes it easy to implement WebSocket, HTTP streaming, and HTTP long-polling services. Structurally, Pushpin communicates with backend web applications using regular, short-lived HTTP requests.

This architecture provides a few core benefits:

  • Backend applications can be written in any language and use any webserver.
  • Data can be pushed via a simple HTTP POST request to Pushpin’s private control API.
  • It is invisible to connected clients.
  • It manages stateful elements by acting as the responsible party when requiring data from your backend server.
  • It is horizontally scalable, because Pushpin instances do not need to communicate with each other.
  • It harnesses a publish-subscribe model for data transmission.
  • It can act as both a proxy server and a publish-subscribe broker.

Integrating Pushpin

From a more systemic perspective, there are a few ways you can integrate Pushpin into your stack. The most basic setup is to put Pushpin in front of a typical web service backend, where the backend publishes data directly to Pushpin. The web service itself might publish data in reaction to incoming requests, or there might be some kind of background process/job that publishes data.

PushPin real-time reverse proxy

Because Pushpin is a proxy server, it works with most API management systems, allowing you to perform actual API development. For instance, you can chain proxies together, placing Pushpin in the front so your API management system isn’t subjected to long-lived connections. More importantly, Pushpin can translate the WebSocket protocol to HTTP, allowing the API management system to operate on the translated data.

PushPin real-time reverse proxy with API management system

Pushpin Technical Details

Pushpin makes it easy to create HTTP long-polling, HTTP streaming, and WebSocket services using any web stack as the backend. It’s compatible with any framework, whether Django, Rails, ASP, PHP, Node, etc. Pushpin works as a reverse proxy, sitting in front of your server application and managing all of the open client connections.

pushpin-diagram2

Communication between Pushpin and the backend server is done using conventional short-lived HTTP requests and responses. There is also a ZeroMQ interface for advanced users.

The approach is powerful for several reasons:

  • The application logic can be written in the most natural way, using existing web frameworks.
  • Scaling is easy and also natural. If your bottleneck is the number of recipients you can push realtime updates to, then add more Pushpin instances.
  • It’s highly versatile. You define the HTTP/WebSocket exchanges between the client and server. This makes it ideal for building APIs.

How it works

Like any reverse proxy, Pushpin relays HTTP requests and responses between clients and a backend server. Unless and until the backend invokes any of Pushpin’s special realtime features, this proxying is purely a pass-through. The magic happens when the backend server decides to respond with special instructions to a request. For example, if the backend server wants to long-poll a request, it can respond to the request with instructions saying that the connection be held open and bound to a channel. Pushpin will act on these instructions rather than forwarding them down to the requesting client. Later on, when the backend wants to respond to a request being held open, it makes a publish call to Pushpin’s local REST API containing the HTTP response data to be delivered.

Below is a sequence diagram showing the network interactions:

pushpin-diagram3

As you can see, the backend web application can either respond to an HTTP request normally, or it can respond with holding instructions and send data down the connection at a later time. Either way, the backend never maintains long-lived connections on its own. Instead, it is Pushpin’s job to maintain long-lived connections to clients.
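To make the flow concrete, here is a toy in-memory model of the hold-and-publish behavior described above. It is only a sketch of the idea, not Pushpin's actual implementation: the ToyProxy class and its method names are hypothetical, and client connections are reduced to plain string ids.

```python
from collections import defaultdict

class ToyProxy:
    """In-memory stand-in for the proxy's hold/publish behavior (not real Pushpin code)."""

    def __init__(self):
        self.held = defaultdict(list)  # channel -> ids of client requests held open
        self.delivered = {}            # client id -> body finally sent to that client

    def handle_backend_response(self, client_id, headers, body):
        """Pass a normal response through, or hold the request if instructed to."""
        if headers.get('Grip-Hold') == 'response':
            # Act on the instruction instead of forwarding it to the client.
            self.held[headers['Grip-Channel']].append(client_id)
        else:
            self.delivered[client_id] = body

    def publish(self, channel, body):
        """Deliver body to every request held on the channel, then release them."""
        for client_id in self.held.pop(channel, []):
            self.delivered[client_id] = body

proxy = ToyProxy()
proxy.handle_backend_response('a', {}, '120\n')  # normal response, passed through
proxy.handle_backend_response(
    'b', {'Grip-Hold': 'response', 'Grip-Channel': 'counter'}, '{}\n')  # held open
print(sorted(proxy.delivered))  # ['a']
proxy.publish('counter', '121\n')
print(sorted(proxy.delivered))  # ['a', 'b']
```

Note that the backend's part in both steps is an ordinary short-lived call; only the proxy object keeps track of who is waiting.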

The interfacing protocol between Pushpin and the backend server is called “GRIP”. You can read more about GRIP here.

An example

Let’s say you want to build an “incrementing counter” service that supports live updates. You could design a REST API as follows:

  • Single integer counter exists at resource /counter/value/.
  • POST /counter/value/ to increment and return the counter value (the value after incrementing).
  • GET /counter/value/ to retrieve the current counter value. Optionally, pass parameter last=N to specify the last value known by the client. If the server recognizes this value as the current value, then long-poll until the value changes.

Before we discuss how to implement this API with Pushpin, let’s go over the counter API design in more detail so it’s clear what we are trying to accomplish.

The POST action is straightforward. It’s the GET action that’s more complex, because it needs to long-poll or not, depending on the state of things. Suppose the current counter value is 120. Below, different GET requests are shown with the expected server behavior.

Client requests counter value, without specifying last known value:

GET /counter/value/ HTTP/1.1

Server immediately responds:

HTTP/1.1 200 OK
Content-Type: application/json

120

Client requests counter value, specifying a last known value that is not the current value:

GET /counter/value/?last=119 HTTP/1.1

Server immediately responds:

HTTP/1.1 200 OK
Content-Type: application/json

120

Client requests counter value, specifying last known value that is the current value:

GET /counter/value/?last=120 HTTP/1.1

The server will now wait (long-poll) before responding. The server will either respond with the next value eventually:

HTTP/1.1 200 OK
Content-Type: application/json

121

Or, the server will time out the request, because the counter has not changed within some timeout window. In this case we’ll say the server should respond with an empty JSON object:

HTTP/1.1 200 OK
Content-Type: application/json

{}
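The three GET cases above boil down to one small decision, sketched here without any web framework. The function name plan_get and the ('respond' / 'hold', body) return shape are made up for illustration. The sketch also assumes that any last value other than the current one gets an immediate response, which the design above only spells out for values that are behind.

```python
from typing import Optional, Tuple

def plan_get(current: int, last: Optional[int] = None) -> Tuple[str, str]:
    """Decide how to answer GET /counter/value/ for a given counter state.

    Returns ('respond', body) for an immediate reply, or ('hold', timeout_body)
    when the request should long-poll until the value changes or times out.
    """
    if last is None or last != current:
        return ('respond', f'{current}\n')
    return ('hold', '{}\n')

print(plan_get(120))            # ('respond', '120\n')
print(plan_get(120, last=119))  # ('respond', '120\n')
print(plan_get(120, last=120))  # ('hold', '{}\n')
```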

At this point we haven’t even gotten to the Pushpin part. We’re just designing and describing a counter API, and there is nothing necessarily Pushpin-specific about the above design. You might come to this same design regardless of how you actually planned to implement it. This helps showcase Pushpin’s versatility in being able to drive any API. In fact, if a counter service already existed with this API, it could be migrated to Pushpin and clients wouldn’t even notice the switch.

Normally, implementing any kind of custom long-polling interface would require using an event-driven framework such as Node.js, Twisted, Tornado, etc. With Pushpin, however, one can implement this interface using any web framework, even those that are not event-driven. Below we’ll go over how one might implement the counter API using Django.

First, here’s the model code, which creates a database table with two columns, name (string) and value (integer):

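from django.db import models
from django.db.models import F

class Counter(models.Model):
    name = models.CharField(max_length=32)
    value = models.IntegerField(default=0)

    @classmethod
    def inc(cls, name):
        # F() makes the database do the addition, so the increment is atomic.
        cls.objects.filter(name=name).update(value=F('value') + 1)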

Just a basic model with an increment method. Our service will use a counter called “main”. Now for the view, where things get interesting:

import json

from django.http import HttpResponse, HttpResponseNotAllowed
from gripcontrol import GripPubControl, create_grip_channel_header

from .models import Counter  # assumes the model lives in this app's models.py

pub = GripPubControl({'uri': 'http://localhost:5561'})

def value(request):
    if request.method == 'GET':
        c = Counter.objects.get(name='main')
        last = request.GET.get('last')
        if last is None or int(last) < c.value:
            # Client is behind (or gave no value): respond immediately.
            resp = HttpResponse(json.dumps(c.value) + '\n')
        else:
            # Client is up to date: ask Pushpin to hold the request open.
            resp = HttpResponse('{}\n')
            resp['Grip-Hold'] = 'response'
            resp['Grip-Channel'] = create_grip_channel_header('counter')
        return resp
    elif request.method == 'POST':
        Counter.inc('main') # DB-level atomic increment
        c = Counter.objects.get(name='main')
        # Wake up any held requests with the new value.
        pub.publish_http_response('counter', str(c.value) + '\n')
        return HttpResponse(json.dumps(c.value) + '\n')
    else:
        return HttpResponseNotAllowed(['GET', 'POST'])

Here we’re using the Python gripcontrol library to interface with Pushpin. It’s not necessary to use a special library to speak GRIP (it’s just headers/JSON over HTTP), but the library is a nice convenience. We’ll go over the key lines:

pub = GripPubControl({'uri': 'http://localhost:5561'})

The above line sets up the library to point at Pushpin’s local REST API. No remote accesses are performed on this line, but whenever we attempt to interact with Pushpin later on in the code, calls will be made against this base URI.

resp = HttpResponse('{}\n')
resp['Grip-Hold'] = 'response'
resp['Grip-Channel'] = create_grip_channel_header('counter')

The above code generates a hold instruction, sent as an HTTP response to a proxied request. Essentially this tells Pushpin to hold the HTTP request (to the client) open until we publish data on a channel named “counter”. If enough time passes without a publish occurring, then Pushpin should time out the connection by responding to the client with an empty JSON object. Once we respond with these instructions, the HTTP request between Pushpin and the Django application is finished, even though the HTTP request between Pushpin and the client remains open.
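Since GRIP instructions are just headers on an ordinary HTTP response, the same hold can be expressed without the library. The helper below is a hypothetical, framework-free sketch. It assumes a single channel, in which case the Grip-Channel header value is simply the channel name.

```python
def build_hold_response(channel, timeout_body='{}\n'):
    """Return (headers, body) for a response that asks the proxy to hold the request.

    The body is what the client receives if nothing is published before the timeout.
    """
    headers = {
        'Grip-Hold': 'response',   # hold the client request open (long-poll)
        'Grip-Channel': channel,   # bind the held request to this channel
    }
    return headers, timeout_body

headers, body = build_hold_response('counter')
for name, header_value in headers.items():
    print(f'{name}: {header_value}')
```

Any web stack that can set response headers can emit these two lines, which is why no event-driven framework is needed on the backend.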

pub.publish_http_response('counter', str(c.value) + '\n')

The above call publishes an “HTTP response” to Pushpin, with the body of the response set to the value of the counter. This payload is published on the “counter” channel, causing Pushpin to deliver it to any requests that are currently open and bound to this channel.
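For readers who prefer not to use the library, the publish call can be approximated by hand. This sketch only builds the request and does not send it. The /publish/ path and the items / formats / http-response JSON layout reflect our understanding of Pushpin's publish API and should be verified against the Pushpin documentation; the helper name is hypothetical.

```python
import json
import urllib.request

def build_publish_request(base_uri, channel, body):
    """Build (but do not send) a POST that publishes an HTTP response body to a channel."""
    payload = {
        'items': [{
            'channel': channel,
            # Format used for requests held open as long-polls (assumed layout).
            'formats': {'http-response': {'body': body}},
        }]
    }
    return urllib.request.Request(
        base_uri.rstrip('/') + '/publish/',
        data=json.dumps(payload).encode('utf-8'),
        headers={'Content-Type': 'application/json'},
        method='POST',
    )

req = build_publish_request('http://localhost:5561', 'counter', '121\n')
print(req.get_method(), req.full_url)
# Passing req to urllib.request.urlopen() would actually send it to a running Pushpin.
```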

That’s all there is to it!

Realtime is no longer special

The great part about being able to use existing web frameworks is that you don’t need separate codebases for realtime and non-realtime. It’s not uncommon for projects to implement the non-realtime parts of their API using a traditional web framework, and the realtime parts in a more customized way using a specialized server. Pushpin eliminates the need for multiple worlds here. Instead, your entire API, realtime or not, can be implemented using the same framework (e.g. entirely in Django). Any HTTP resource can be made to stream or long-poll on a whim. All facilities of your traditional web framework, such as authentication or debugging, will work within a realtime context.

Ideal for everyone

Finally, lest Pushpin be misunderstood solely as a way to shoehorn realtime capabilities onto non-event-driven web frameworks, it’s worth emphasizing that the proxying approach makes a lot of sense even if your backend is Node.js. The decoupling of application logic from connection management will make your overall application much easier to manage and maintain. Additionally, introducing proxying layers is the inevitable endgame for high scale data delivery (just look at the topology of a CDN).

Pushpin is open source and available on GitHub. For more information about the motivation and thought process behind Pushpin, see this article. And if you find yourself wishing there was a cloud service that worked like Pushpin, there is.