Realtime API

How To Power Your App Using a Realtime Data CDN

Combining Fastly (high scale pull) and Fanout (high scale push) to power realtime messaging at the edge

CDN — Content Delivery Network

Let’s start with defining a CDN. A content delivery network (CDN) is a system of distributed servers that traditionally delivers web content to a user, based on the geographic locations of the user, the origin of the webpage and the content delivery server. I use the term traditionally because we’re entering an era where CDNs are doing more than just delivering web content.

An example would be Cloudflare Workers, which lets you use their CDN to run code at the edge, rather than just serve web pages / cached content. You are basically able to deploy and run JavaScript away from the origin server — allowing you to decouple code from a user’s device. According to Cloudflare, “these Workers also enable programmatic functionality for routing, filtering and responding to HTTP requests that would otherwise need to be run on a customer’s server at the origin.”

The main point is that CDNs and edge computing are continuously evolving — whereby the two are starting to meld together in an era where high scalability is paramount.

Melding Realtime Data Push with Realtime Data Pull

Many realtime applications need to work with data that is both pushed and pulled (e.g., live sports scores, auctions, chat). Handled separately, data push and data pull are fairly straightforward. At initialization time, past content could be retrieved from a pull CDN, and new or future updates could be pushed from a separate service.

But, what if you could chain these mechanisms together?

Proxy Chaining with Fastly and Fanout

Fastly is an edge cloud platform that enables applications to process, serve, and secure data at the edge of a network. It is essentially highly scalable data pull and response, using a platform that can listen and respond to users’ needs in realtime. Similar to a traditional CDN, Fastly does allow you to cache content, but it also lets you deliver application logic at the edge.

On the other hand, Fanout is highly scalable data push, serving as a reverse proxy that handles long-lived client connections and pushes data as it becomes available.

Both Fastly and Fanout work as reverse proxies, so it is possible to have Fanout proxy traffic through Fastly — rather than sending that traffic directly to your origin server. Together, this coupled system has some interesting benefits:

  1. High availability — If your origin server goes down, Fastly can serve cached data and instructions to Fanout. This means clients could connect to your API endpoint, receive historical data, and activate a streaming connection, all without needing access to the origin server.
  2. Cached initial data — Fanout lets you build API endpoints that serve both historical and future content, for example an HTTP streaming connection that returns some initial data before switching into push mode. Fastly can provide that initial data, reducing load on your origin server.
  3. Cached Fanout instructions — Fanout’s behavior (e.g. transport mode, channels to subscribe to, etc.) is determined by instructions provided in origin server responses (using a system of special headers called Grip). Fastly can subsequently cache these instructions and headers.
  4. High scalability — By caching Fanout instructions and headers, Fastly can further reduce the load on your origin server — bringing that processing logic closer to the edge.

Mapping the Network Flow

Using Fanout and Fastly, let’s map the network flow to see how these push and pull mechanisms could work together.

Let’s suppose there’s an API endpoint /stream that returns some initial data and then stays open until there is a new update to push. With Fanout, this can be implemented by having the origin server respond with instructions:

HTTP/1.1 200 OK
Content-Type: text/plain
Content-Length: 29
Grip-Hold: stream
Grip-Channel: updates

{"data": "current value"}
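As a rough sketch, an origin handler could produce a response like the one above using only Node.js built-ins. The helper name `buildStreamResponse` and the server wiring are illustrative assumptions; only the `Grip-Hold` and `Grip-Channel` headers come from the example itself.

```javascript
// Sketch only: builds an origin response carrying Grip instructions.
// buildStreamResponse is a hypothetical helper, not part of any Fanout library.
function buildStreamResponse(currentValue, channel) {
    const body = JSON.stringify({ data: currentValue }) + '\n';
    return {
        statusCode: 200,
        headers: {
            'Content-Type': 'text/plain',
            'Content-Length': Buffer.byteLength(body),
            'Grip-Hold': 'stream',   // ask the proxy to keep the client connection open
            'Grip-Channel': channel  // channel the held connection is subscribed to
        },
        body: body
    };
}

// Example wiring with Node's built-in http module (the server is not started here).
function createOriginServer() {
    const http = require('http');
    return http.createServer(function (req, res) {
        const response = buildStreamResponse('current value', 'updates');
        res.writeHead(response.statusCode, response.headers);
        res.end(response.body);
    });
}
```

Because the response is an ordinary short-lived HTTP response, nothing in the handler needs to know about the long-lived client connection.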

When Fanout receives this response from the origin server, it converts it into a streaming response to the client:

HTTP/1.1 200 OK
Content-Type: text/plain
Transfer-Encoding: chunked
Connection: Transfer-Encoding

{"data": "current value"}

The request between Fanout and the origin server is now finished, but the request between the client and Fanout remains open. Here’s a sequence diagram of the process:

Since the request to the origin server is just a normal short-lived request/response interaction, it can alternatively be served through a caching server such as Fastly.

Here’s what the process looks like with Fastly in the mix:

Now, when the next client makes a request to the /stream endpoint, the origin server isn’t involved at all:

In other words, Fastly serves the same response to Fanout, with those special HTTP headers and initial data, and Fanout sets up a streaming connection with the client.

Of course, this is only the connection setup. To send updates to connected clients, the data must be published to Fanout.
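A publish might be sketched as follows. The payload follows the GRIP publish item format for HTTP streaming; the publish URL and authorization header are placeholders that depend on your Fanout setup and are assumptions here, as is the use of the global `fetch` (Node 18 or later).

```javascript
// Sketch only: builds a GRIP-style publish payload for HTTP streaming subscribers.
function buildPublishBody(channel, value) {
    return JSON.stringify({
        items: [{
            channel: channel,
            formats: {
                // content is appended to every open stream subscribed to the channel
                'http-stream': { content: JSON.stringify({ data: value }) + '\n' }
            }
        }]
    });
}

// publishUrl and authHeader are supplied by the caller (hypothetical placeholders).
function publish(publishUrl, authHeader, channel, value) {
    return fetch(publishUrl, {
        method: 'POST',
        headers: {
            'Content-Type': 'application/json',
            'Authorization': authHeader
        },
        body: buildPublishBody(channel, value)
    });
}
```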

Purging the Fastly Cache

If an event that triggers a publish causes the origin server response to change, then we may also need to purge the Fastly cache.

For example, suppose the “value” that the /stream endpoint serves has been changed. The new value could be published to all current connections, but we’d also want any new connections that arrive afterwards to receive this latest value as well, rather than the older cached value. This can be solved by purging from Fastly and publishing to Fanout at the same time.

This sequence diagram illustrates a client connecting, receiving an update, and then another client connecting:

Effectively Handling Rate-Limiting

If your publishing data rate is relatively high, then this can negate the caching benefit of using Fastly.

The ideal data rate to effectively harness Fastly’s cache would be data that is:

  • Accessed frequently (many new visitors per second)
  • Changed infrequently (updates every few seconds or minutes)
  • Delivered instantly (in milliseconds)

An example of this would be a live blog, whereby most requests can be served and handled from cache.

However, if your data changes multiple times per second (or has the potential to change that fast during peak moments), and you expect frequent access, you really don’t want to be purging your cache multiple times per second.

The workaround is to rate-limit your purges. For example, during periods of high throughput, you might purge and publish at a maximum rate of once per second or so. This way, the majority of new visitors can be served from cache, and the data will be updated shortly after.
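One way to sketch this rate limit is a small wrapper that runs the first update immediately and coalesces any updates that arrive too soon into a single trailing run. The name `createRateLimitedUpdater` is hypothetical, and the `purgeAndPublish` callback (which would purge Fastly and publish to Fanout) is left to the caller; this is an illustration, not the demo's actual implementation.

```javascript
// Sketch only: limits how often purgeAndPublish runs, always keeping the newest value.
function createRateLimitedUpdater(purgeAndPublish, minIntervalMs) {
    let lastRun = -Infinity; // time of the most recent purge/publish
    let pending;             // newest value that arrived during the quiet period
    let timer = null;        // trailing flush, if one is scheduled

    function run(value) {
        lastRun = Date.now();
        purgeAndPublish(value);
    }

    return function update(value) {
        const wait = lastRun + minIntervalMs - Date.now();
        if (wait <= 0 && timer === null) {
            run(value); // enough time has passed: purge and publish right away
            return;
        }
        // Too soon: remember only the latest value and flush it once the interval ends.
        pending = value;
        if (timer === null) {
            timer = setTimeout(function () {
                timer = null;
                run(pending);
            }, Math.max(wait, 0));
        }
    };
}
```

With a one-second interval, a burst of increments results in one immediate purge and publish, followed by at most one more per second carrying the latest value.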


You can reference the GitHub source code for the Fastly/Fanout high scale Live Counter demo. Requests first go to Fanout, then to Fastly, then to a Django backend server which manages the counter API logic. Whenever a counter is incremented, the Fastly cache is purged and the data is published through Fanout. The purge and publish process is also rate-limited to maximize caching benefit.

Final Thoughts: The Emergence of a Messaging CDN?

Broadly speaking, we could define a messaging content delivery network as a geographically distributed group of servers which work together to provide near realtime delivery of dynamic data and web content.

This new genre of CDN could allow data processing to take place at the edge, away from an app’s origin — thereby ushering in a new era of realtime computing that is both affordable and scalable.

Payments Are Moving To Real-Time In Countries Around The World

In this Forbes article, Tom Groenfeldt discusses the emergence of the realtime payment ecosystem and its demand on the realtime API environment.

Faster payment systems are being adopted in countries all around the globe even though there is no compelling ROI argument for them, according to the fourth annual Flavors of Fast payments study from FIS.

In fact, many of the innovations accompanying faster payments are not about pure speed but other attributes such as 24x7x365 operating hours and standards like ISO 20022 that support data like invoices moving with payments or requests for payments.

Full Article

Serverless WebSockets with AWS Lambda & Fanout

The basics of adding realtime data push to your serverless backend



Serverless is one of the developer world’s most popular misnomers. Contrary to its name, serverless computing does in fact use servers, but the benefit is that you can worry less about maintenance, scale, and configuration. This is because serverless is a cloud computing execution model where a cloud provider dynamically manages the allocation of machine and computational resources. You are basically deploying code to an environment without visible processes, operating systems, servers, or virtual machines. From a pricing perspective, you are typically charged for the actual amount of resources consumed and not by pre-purchased capacity.


Benefits of Serverless

  • Reduced architectural complexity
  • Simplified packaging and deployment
  • Reduced cost to scale
  • Eliminates the need for system admins
  • Works well with microservice architectures
  • Reduced operational costs
  • Typically decreased time to market with faster releases


Drawbacks of Serverless

  • Performance issues: typically higher latency due to how compute resources are allocated
  • Vendor lock-in (hard to move to a new provider)
  • Not efficient for long-running applications
  • Multi-tenancy issues where service providers may run software for several different customers on the same server
  • Difficult to test functions locally
  • Different FaaS implementations provide different methods for logging in functions

AWS Lambda

Amazon’s take on serverless comes in the form of AWS Lambda. AWS Lambda lets you run code without provisioning or managing servers, and you pay only for your actual usage. With Lambda, you can run code for virtually any type of application or backend service; Lambda automatically runs and scales your application code. Moreover, you can set up your code to automatically trigger from other AWS services or call it directly from any web or mobile app.


WebSockets

A WebSocket provides a long-lived connection for exchanging messages between client and server. Messages may flow in either direction for full-duplex communication. A client creates a WebSocket connection to a server, using a WebSocket client library. WebSocket libraries are generally available in every language, and of course browsers support it natively using the WebSocket JavaScript object. The connection negotiation uses an HTTP-like exchange, and a successful negotiation is indicated with status code 101. After the negotiation response is sent, the connection remains open to be used for exchanging message frames in either binary or Unicode string format. Peers may also exchange close frames to perform a clean close.

Building AWS IoT WebSockets

Function-as-a-service backends, such as AWS Lambda, are not designed to handle long-lived connections on their own. This is because the function invocations are meant to be short-lived. Lambda is designed to integrate with services such as AWS IoT to handle these types of connections. AWS IoT Core supports MQTT (either natively or over WebSockets), a lightweight communication protocol specifically designed to tolerate intermittent connections.

AWS IoT Core Site

However, this approach alone will not give you access to the raw protocol elements — and will not allow you to build a pure Lambda-powered API (if that is your intended use case). If you want this access, then you need to take a different approach.

Building Lambda-Powered WebSockets with Fanout

You can also build custom Lambda-powered WebSockets by integrating a service like Fanout — a cross between a message broker and a reverse proxy that enables realtime data push for apps and APIs. With these services together, we can build a Lambda-powered API that supports plain WebSockets.

This approach uses GRIP, the Generic Realtime Intermediary Protocol — making it possible for a web service to delegate realtime push behavior to a proxy component.

This FaaS GRIP library makes it easy to delegate long-lived connection management to Fanout, so that backend functions only need to be invoked when there is connection activity. The other benefit is that backend functions do not have to run for the duration of each connection.

The following step-by-step breakdown is meant as a quick configuration reference. You can check out the GitHub libraries for the Node and Python integrations.

1. Initial Configuration

You will first configure your Fanout Cloud domain/environment and set up an API and resource in AWS API Gateway to point to your Lambda function, using a Lambda Proxy Integration.

2. Using WebSockets

Whenever an HTTP request or WebSocket connection is made to your Fanout Cloud domain, your Lambda function will be able to control it. To do this, Fanout converts incoming WebSocket connection activity into a series of HTTP requests to your backend.
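To make that conversion more concrete, here is a minimal sketch of decoding the request body the backend receives. It assumes the WebSocket-over-HTTP event encoding, in which each event is a type, an optional hexadecimal content length, and CRLF-delimited content (for example `TEXT 5\r\nhello\r\n`). The function name `parseWebSocketEvents` is hypothetical; in practice a GRIP library would normally do this for you.

```javascript
// Sketch only: decodes a WebSocket-over-HTTP request body into a list of events.
function parseWebSocketEvents(body) {
    const buf = Buffer.isBuffer(body) ? body : Buffer.from(body, 'utf8');
    const events = [];
    let pos = 0;
    while (pos < buf.length) {
        const lineEnd = buf.indexOf('\r\n', pos);
        if (lineEnd === -1) {
            throw new Error('malformed event header');
        }
        const header = buf.toString('utf8', pos, lineEnd);
        pos = lineEnd + 2;
        const space = header.indexOf(' ');
        if (space === -1) {
            // Events such as OPEN carry no content.
            events.push({ type: header, content: null });
            continue;
        }
        const type = header.slice(0, space);
        const size = parseInt(header.slice(space + 1), 16); // length is hexadecimal
        if (Number.isNaN(size) || pos + size + 2 > buf.length) {
            throw new Error('malformed event content');
        }
        events.push({ type: type, content: buf.subarray(pos, pos + size) });
        pos += size + 2; // skip the content and its trailing CRLF
    }
    return events;
}
```

A new connection that immediately sends "hello" would arrive as a single HTTP request whose body decodes to an OPEN event followed by a TEXT event.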

3. You’ve Got Realtime

You now have realtime WebSockets driven by a Lambda function!

An Example

This Node.js code implements a WebSocket echo service. I recommend checking out the full FaaS GRIP library for a step-by-step breakdown, and for instructions on implementing HTTP long polling and HTTP streaming.

var grip = require('grip');
var faas_grip = require('faas-grip');

exports.handler = function (event, context, callback) {
    var ws;
    try {
        // parse the incoming WebSocket-over-HTTP request
        ws = faas_grip.lambdaGetWebSocket(event);
    } catch (err) {
        callback(null, {
            statusCode: 400,
            headers: {'Content-Type': 'text/plain'},
            body: 'Not a WebSocket-over-HTTP request\n'
        });
        return;
    }

    // if this is a new connection, accept it
    if (ws.isOpening()) {
        ws.accept();
    }

    // here we loop over any messages
    while (ws.canRecv()) {
        var message = ws.recv();

        // if return value is null, then the connection is closed
        if (message == null) {
            ws.close();
            break;
        }

        // echo the message
        ws.send(message);
    }

    callback(null, ws.toResponse());
};

Overall, if you’re not looking for full control over your raw protocol elements, then you may find it easier to try a Lambda/AWS IoT configuration. If you need more WebSocket visibility and control, then the Lambda+Fanout integration is probably your best bet.

The Edge is Nothing Without the Fog

Edge computing is hot right now. The growing maturity of IoT networks ranging from industrial to VR applications means that there’s an enormous amount of discussion around moving from the cloud to the edge (from us as well). But edge computing is only the first step.

We first want to make sure we define the terms we’ll use.

  • The edge refers to the devices, sensors, or other sources of data at the edge of the network.
  • The cloud is the datacenter at the “center” of the network.
  • The fog is a management layer in-between the two (we know this is vague, read on)

More data, more problems

As more and more devices become connected to networks, we’re going to see an enormous uptick in the amount of data generated. Andy Daecher and Robert Schmid of Deloitte believe that “globally, the data created by IoT devices in 2019 will be 269 times greater than the data being transmitted to data centers from end-user devices and 49 times higher than total data center traffic.” Calling this big data is an understatement.

These volumes of data mean big problems:

  1. Moving this amount of data means latency issues for networks
  2. Privacy and security concerns increase as more data is moved
  3. Devices sending more data require more hardware and power to run

Prioritization is the answer, but it’s not solved at the edge

The answer to increasing data volume is the fog: the prioritization and management layer on the continuum between the edge and the cloud. The fog needs to answer the crucial question: what to analyze at the edge, and what to push back to the cloud?

It’s unreasonable to expect an IoT sensor at the edge (like a drone that requires sub-millisecond reaction times) to process all the data it collects in realtime or push that data all the way to the cloud for processing. The fog reduces latency and takes the processing load off the drone, acting as a management layer and allowing for efficient distribution of resources across the network.

So, what does architecture incorporating the cloud, the edge, and the fog look like?

Justin Baker of RealtimeAPIHub has an excellent guide, including this graphic from Ergomonitor:


Intelligently separating data analysis tasks across the network continuum will be crucial as we move forward into the next era of IoT.

Spotlight Article: API Eventing Is The Next Big Opportunity For API Providers by James Higginbotham

In this article, James Higginbotham outlines 5 reasons why your product’s API should support events. He discusses this in the context of ‘API Eventing’, whereby APIs become event-driven.

For the last decade, modern web APIs have grown from solutions like Flickr, to robust platforms that generate new business models. Throughout this period of growth, most APIs have been limited to request-response over HTTP. We are now seeing a move back to eventing with the popularity of webhooks to connect SaaS solutions, the introduction of technologies such as Kafka to drive internal messaging, and the need for integrating IoT devices.

API eventing completely changes the way API consumers interact with our APIs, creating new possibilities that request-response cannot. Let’s examine the driving factors contributing to the rise of API eventing in greater detail, along with the opportunities that may inspire you to consider adding API event support to your API.

Full Article

Getting Started with Building Realtime API Infrastructure

How companies are adding realtime capabilities to their products and building realtime APIs

Mirroring the rise of API-driven applications, realtime is becoming an emerging, omnipresent force in modern application development. It is powering instant messaging, live sports feeds, geolocation, big data, and social feeds. But, what is realtime and what does it really mean? What types of software and technology are powering this industry? Let’s dive into it.

What Is Realtime?

For the more technical audience, realtime traditionally describes realtime computing, whereby “hardware and software systems are subject to a realtime constraint, for example from event to system response” (Source). For this article, we’re framing realtime from the perspective of an end-user: the perception that an event or action happens sufficiently quickly to be perceived as nearly instantaneous.

Moreover, realtime could be defined in a more relative temporal sense. It could mean that a change in A synchronizes with a change in B. Or, it could mean that a change in A immediately triggers a change in B. Or… it could mean that A tells B that something changed, yet B does nothing. Or… does it mean that A tells everyone something changed, but doesn’t care who listens?

Let’s dig a bit deeper. Realtime does not necessarily mean that something is updated instantly (in fact, there’s no singular definition of “instantly”). So, let’s not focus on the effect, but rather the mechanism. Realtime is about pushing data as fast as possible — it is automated, synchronous, and bi-directional communication between endpoints at a speed within a few hundred milliseconds. 

  • Synchronous means that both endpoints have access to data at the same time.
  • Bi-directional means that data can be sent in either direction.
  • Endpoints are senders or receivers of data (phone, tablet, server).
  • A few hundred milliseconds is a somewhat arbitrary metric since data cannot be delivered instantly, but it most closely aligns with what humans perceive as realtime (Robert Miller described this in 1968).

With this definition and its caveats in mind, let’s explore the concept of pushing data.

Data Push

We’ll start by contrasting data push with “request-response.” Request-response is the most fundamental way that computer systems communicate. Computer A sends a request for something from Computer B, and Computer B responds with an answer. For example, you can open up a browser and type in Reddit’s address. The browser sends a request to Reddit’s servers and they respond with the web page.

Request-Response vs Evented APIs

In a data push model, data is pushed to a user’s device rather than pulled (requested) by the user. For example, modern push email allows users to receive email messages without having to check manually. Similarly, we can examine data push in a more continuous sense, whereby data is continuously broadcasted. Anyone who has access to a particular channel or frequency can receive that data and decide what to do with it.

Moreover, there are a few ways that data push/streaming is currently achieved:

HTTP Streaming

HTTP streaming provides a long-lived connection for instant and continuous data push. You get the familiarity of HTTP with the performance of WebSockets. The client sends a request to the server and the server holds the response open indefinitely. This connection will stay open until the client closes it or a server-side event occurs. If there is no new data to push, the application will send a series of keep-alive ticks so the connection doesn’t close.
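A minimal sketch of the server side, assuming a response object with the same `writeHead`, `write`, and `end` methods as Node’s built-in `http` module. The helper name `openStream` and the newline-delimited JSON framing are illustrative choices rather than part of any standard.

```javascript
// Sketch only: holds an HTTP response open and pushes newline-delimited JSON.
function openStream(res, keepAliveMs) {
    // With no Content-Length set, Node's http module normally sends the body
    // using chunked transfer encoding, so the response can stay open.
    res.writeHead(200, { 'Content-Type': 'text/plain' });

    // Send a keep-alive tick periodically so idle connections are not dropped.
    const timer = setInterval(function () {
        res.write('\n');
    }, keepAliveMs);

    return {
        // Push one update to the client as soon as it is available.
        push: function (data) {
            res.write(JSON.stringify(data) + '\n');
        },
        // Stop the keep-alive ticks and finish the response.
        close: function () {
            clearInterval(timer);
            res.end();
        }
    };
}
```

Inside an `http.createServer` handler, you would call `openStream(res, 20000)` once per client and then call `push` whenever new data arrives.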


WebSockets

WebSockets provide a long-lived connection for exchanging messages between client and server. Messages may flow in either direction for full-duplex communication. This bi-directional connection is established through a WebSocket handshake. Just like in HTTP Streaming and HTTP Long-Polling, the client sends a regular HTTP request to the server first. If the server agrees to the connection, the HTTP connection is replaced with a WebSocket connection.



Webhooks

Webhooks are a simple way of sending data between servers. No long-lived connections are needed. The sender makes an HTTP request to the receiver when there is data to push. A webhook registers or “hooks” to a callback URL and will notify you anytime an event has occurred. You register this URL in advance and when an event happens, the server sends an HTTP POST request with an event object to the callback URL. This event object contains the new data that will be pushed to the callback URL. You might use a webhook if you want to receive notifications about certain topics. It could also be used to notify you whenever a user changes or updates their profile.
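On the receiving side, a webhook endpoint can be sketched as a small function that validates the request and hands the event object to application code. The name `handleWebhook`, the JSON body, and the status codes chosen are assumptions for illustration; real providers define their own payloads and usually add signature checks.

```javascript
// Sketch only: processes one incoming webhook delivery and returns an HTTP status code.
function handleWebhook(method, rawBody, onEvent) {
    if (method !== 'POST') {
        return 405; // webhook deliveries arrive as HTTP POST requests
    }
    let event;
    try {
        event = JSON.parse(rawBody);
    } catch (err) {
        return 400; // the body was not valid JSON
    }
    onEvent(event); // hand the event object to application code
    return 200;     // acknowledge receipt to the sender
}
```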

HTTP Long-Polling

HTTP long-polling provides a long-lived connection for instant data push. It is the easiest mechanism to consume and also the easiest to make reliable. The server holds the request open until new data arrives or a timeout occurs. Most servers time out after 30 to 120 seconds, depending on how the API was set up. After the client receives a response (whether from new data or a timeout), it sends another request, and this cycle repeats continuously.
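The hold-until-data-or-timeout behavior can be sketched with a small in-memory registry of waiting requests. The name `createLongPollChannel` and the shape of the result object are hypothetical; a real server would call `wait` from its request handler and write the result to the HTTP response.

```javascript
// Sketch only: holds waiting requests until data is published or a timeout fires.
function createLongPollChannel() {
    const waiters = new Set();
    return {
        // Register a held request; respond is called exactly once.
        wait: function (respond, timeoutMs) {
            const waiter = { respond: respond, timer: null };
            waiter.timer = setTimeout(function () {
                waiters.delete(waiter);
                respond({ timedOut: true, data: null }); // client should poll again
            }, timeoutMs);
            waiters.add(waiter);
        },
        // Release every held request with the new data.
        publish: function (data) {
            const current = Array.from(waiters);
            waiters.clear();
            current.forEach(function (waiter) {
                clearTimeout(waiter.timer);
                waiter.respond({ timedOut: false, data: data });
            });
        }
    };
}
```

Either outcome ends the request, after which the client immediately issues a new one to keep receiving updates.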

Is pushing data hard? Yes, it is, especially at scale (e.g., pushing updates to millions of phones simultaneously). To meet this demand, an entire realtime industry has emerged, which we’ll define as Realtime Infrastructure as a Service (Realtime IaaS).

Realtime Libraries

Here is a compilation of resources that are available for developers to build realtime applications based on specific languages / frameworks:

Realtime Infrastructure as a Service

According to Gartner, “Infrastructure as a service (IaaS) is a standardized, highly automated offering, where compute resources, complemented by storage and networking capabilities are owned and hosted by a service provider and offered to customers on-demand. Customers are able to self-provision this infrastructure, using a Web-based graphical user interface that serves as an IT operations management console for the overall environment. API access to the infrastructure may also be offered as an option.”

We often hear about PaaS (Platform as a Service) and SaaS (Software as a Service), so how are they different from IaaS?

  • Infrastructure as a Service (IaaS): hardware is provided by an external provider and managed for you.
  • Platform as a Service (PaaS): both hardware and your operating system layer are managed for you.
  • Software as a Service (SaaS): an application layer is provided for the platform and infrastructure (which is managed for you).

To power realtime, applications require a carefully architected system of servers, APIs, load balancers, etc. Instead of building these systems in-house, organizations are finding it more cost-effective and resource-efficient to purchase much of this systemic infrastructure and then drive it in-house. These systems, therefore, are not just IaaS, but typically provide both a platform and software layer to help with management. Foundationally speaking, their core benefit is that they provide realtime infrastructure, whether you host it internally or rely on managed instances.

It all comes down to the simple truth that realtime is hard for a number of reasons:

  • Customer Uptime Demand – Customers that depend on realtime updates will immediately notice when your network is not performant.
  • Horizontal Scalability – You must be able to handle volatile and massive loads on your system or risk downtime. This is typically achieved through clever horizontal scalability and systems that are able to manage millions of simultaneous connections.
  • Architectural Complexity – Maintaining a performant realtime system is not only complex, but it requires extensive experience and expertise. This is expensive to buy, especially in today’s high demand engineering market.
  • Contingencies – Inevitably, your system will experience some downtime, whether due to an unanticipated load spike or a newly released feature. It is important, therefore, to have multiple uptime contingencies in place to make sure that the system knows what to do, should your primary realtime mechanism fail to perform.
  • Queuing – When you’re sending a lot of data, then you likely need an intermediate queuing mechanism to ensure that your backend processes are not overburdened with increased message loads.

Realtime Application IaaS

Realtime app infrastructure sends data to browsers and clients. It typically uses pub/sub messaging, webhooks, and/or websockets — and is separate from an application or service’s main API. These solutions are best for organizations that are looking for realtime messaging without the need to build their own realtime APIs.

Pub-Subscribe PubSub Pattern for Realtime API

These systems also have more well-built platform/software management tools on top of their infrastructure offerings. For instance, the leading providers have built-in configuration tools like access controls, event delegation, debugging tools, and channel configuration.

Benefits of Realtime App IaaS

  • Speed – typically explicitly designed to deliver data with low latencies to end-user devices, including smartphones, tablets, browsers, and laptops.
  • Multiple SDKs for easier integration.
  • Uses globally distributed realtime data delivery platforms.
  • Multiple protocol adapters.
  • Well-tested in production environments.
  • Keeps internal configuration to a minimum.

Use Cases

While some of the platforms out there function differently, here are some of the most typical use cases:

  • Realtime Chat – Push new messages to every participant in a conversation as soon as they are sent.
  • IoT Device Control – Securely monitor, control, provision and stream data between Internet-connected devices.
  • Geotracking / Mapping Realtime Updates – Integrates with other realtime APIs (like Google Maps) to construct rich realtime updates.
  • Multiplayer Game Synchronization – Synchronize communications amongst multiple simultaneous players to keep play fluid.


Here are some realtime application IaaS providers (managed) to check out for further learning: PubNub, Pusher, and Ably.

Realtime API IaaS for API Development

Realtime API infrastructure specifically allows developers to build realtime data push into their existing APIs. Typically, you would not need to modify your existing API contracts, as the streaming server would serve as a proxy. The proxy design allows these services to fit nicely within an API stack. This means it can inherit other facilities from your REST API, such as authentication, logging, and throttling, and, consequently, it can be easily combined with an API management system. In the case of WebSocket messages being proxied out as HTTP requests, the messages may be handled statelessly by the backend. Messages from a single connection can even be load balanced across a set of backend instances.

Realtime API Infrastructure as a service IaaS Proxy

All in all, realtime API IaaS is used for API development, specifically geared for organizations that need to build highly-performant realtime APIs like Slack, Instagram, Google, etc. All of these orgs build and manage their infrastructure internally, so the IaaS offering can be thought of as a way to extend these capabilities to organizations that lack the resources and technical expertise to build a realtime API from scratch.

Benefits of Realtime API IaaS

  • Custom build an internal API.
  • Works with existing API management systems.
  • Does not lock you into a particular tech stack.
  • Provides realtime capabilities throughout entire stack.
  • Usually proxy-based, with pub/sub or polling.
  • Add realtime to any API, no matter what backend language or database.
  • Cloud or self-hosted API infrastructure.
  • It can inherit facilities from your REST API, such as authentication, logging, throttling.

Use Cases

While some of the platforms out there function differently, here are some of the most typical use cases:

  • API development – As we’ve discussed, you can build custom realtime APIs on top of your existing API infrastructure.
  • Microservices – In a microservice environment, a realtime API proxy makes it easy to listen for instant updates from other microservices without the need for a centralized message broker. Each microservice gets its own proxy instance, and microservices communicate with each other via your organization’s own API contracts rather than a vendor-specific mechanism.
  • Message queue – If you have a lot of data to push, you may want to introduce an intermediate message queue. This way, backend processes can publish data once to the message queue, and the queue can relay the data via an adapter to one or more proxy instances. The realtime proxy is able to forward subscription information to such adapters, so that messages can be sent only to the proxy instances that have subscribers for a given channel.
  • API management – It’s possible to combine an API management system with a realtime proxy. Most API management systems work as proxy servers as well, which means all you need to do is chain the proxies together. Place the realtime proxy in the front, so that the API management system isn’t subjected to long-lived connections. Also, the realtime proxy can typically translate WebSocket protocol to HTTP, allowing the API management system to operate on the translated data.
  • Large scale CDN performance – Since realtime proxy instances don’t talk to each other, and message delivery can be tiered, this means the realtime proxy instances can be geographically distributed to create a realtime push CDN. Clients can connect to the nearest regional edge server, and events can radiate out from a data source to the edges.


Here are some realtime API IaaS providers (managed/open source) to check out for further learning: Fanout and LiveResource.


Realtime is an emerging, increasingly omnipresent force in modern application development. It is not only a product differentiator, but is often sufficient for product success. It has accelerated the proliferation of widely-used apps like Google Maps, Lyft, and Slack. Whether you’re looking to build your own API from scratch or build on top of an IaaS platform, realtime capabilities are increasingly becoming a requirement of the modern tech ecosystem.

Resource Spotlight: Spec by Sam Curren and Phillip J. Windley

This online resource is a unique way to frame a conceptual model for evented APIs. Sam Curren and Phillip J. Windley discuss the fundamentals of evented APIs, how evented systems work, and a proposed protocol.

Events indicate something has happened. In this they differ from the request-response interaction style popular on the Web. Event-based systems are declarative whereas request-response systems are interrogatory. The difference between events (“this happened”) and requests (“will you do this?”) offers benefits in looser coupling of components as well as semantic encapsulation (see On Hierarchies and Networks for more detail).

APIs have become an economic imperative for many companies. But APIs based solely on request-response style interactions limit integrations to those where one system always knows what it wants from the other. The calling service must script the interaction and the APIs simply follow along.

We envision a world where applications integrate multiple products and services as equals based on event-driven interactions. Evented APIs, following the form described in this document, enable building such applications.


Realtime data for smart notifications

From the Fanout Blog

It’s becoming the new normal that messaging and collaboration apps and platforms are available across multiple devices.

Business tools like Slack and JIRA offer feature-rich mobile apps, and users increasingly consume content from social networks like Facebook on their mobile devices instead of a desktop or laptop.

This isn’t a surprise – and we’re here to share our perspective on how developers can use realtime data to provide cross-platform users with the best notification experience.

Mary Meeker’s 2017 Internet Trends Report tracks the trend towards increasing mobile adoption:


What’s not stated explicitly in this slide is that much of this engagement occurs simultaneously – it’s not uncommon for users to have an app open on their desktop and phone at the same time.

‘Dumb’ notifications produce a poor user experience

Simultaneous use of cross-platform apps has created a user experience issue that many of us are familiar with. When a new Trello card is assigned to me, I get a push notification on my phone, a ping in the Trello interface, and an email in my inbox. Due to my Trello integration with Slack, things quickly get worse – I get a notification on Slack on each of my devices. I can get up to 6 notifications tied to a single event.

This isn’t ideal – and as more devices become connected, the problem will only be compounded. Imagine a future where your phone, smartwatch, smart TV, and smart thermostat are all buzzing simultaneously. It doesn’t need to be this way.

Collaboration and messaging app developers can get smart

We didn’t come up with the idea of ‘smart’ notifications (entire companies like Intercom and OpenBack are built to enable them) – but we do have a perspective on how app developers can use realtime data to enable them.

Realtime data is already present in many chat or collaboration apps – typing indicators, read receipts, and live editing are all features that we take for granted. The next step for developers is taking a wider variety of realtime data into account when building notifications into their user experiences.

Luckily, mobile devices offer a wealth of realtime data to developers who want to do this:

Presence and attention-awareness (knowing which device a user is active on) allows a single ping to that device, instead of a ping to all devices. Results-driven logic can drive a secondary notification to another device or channel if the first notification gets no response. This can lead to some pretty complex logic, as in the case of Slack’s notification tree.


Slack’s blog post on how they built a lightweight desktop client to handle the complex interactions between team, channel, and user preferences and states when sending notifications is worth reading.
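
The presence-aware routing described above can be sketched roughly as follows. This is a simplified illustration, not Slack's actual logic: the Device type, the 120-second activity window, and both function names are assumptions made for the example. The idea is to ping only the device the user is active on, and to treat the remaining devices as a fallback wave that is sent only if the first ping gets no response.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Device:
    name: str
    last_active: float  # timestamp (seconds) of the last user interaction


def pick_active_device(devices: List[Device], now: float,
                       active_window: float = 120.0) -> Optional[Device]:
    """Return the most recently used device if it was used within the window."""
    if not devices:
        return None
    newest = max(devices, key=lambda d: d.last_active)
    return newest if now - newest.last_active <= active_window else None


def notification_waves(devices: List[Device], now: float,
                       active_window: float = 120.0) -> List[List[str]]:
    """Group device names into waves; a later wave is sent only if the
    earlier one gets no response."""
    if not devices:
        return []
    primary = pick_active_device(devices, now, active_window)
    if primary is None:
        # No device looks active, so there is no basis for a single ping.
        return [[d.name for d in devices]]
    fallbacks = [d.name for d in devices if d is not primary]
    return [[primary.name]] + ([fallbacks] if fallbacks else [])
```

With a phone last used 100 seconds ago and a laptop last used 10 seconds ago, the first wave contains only the laptop and the phone becomes the fallback.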

Time and location data is crucial – work notifications don’t need to be sent on the weekend, and pop-up notifications for events or sales are only relevant in bounded areas. Slack enables manual setting of ‘Do Not Disturb’ hours in order to keep notifications from taking over users’ lives. Context can be user-generated (like in the Slack example) or learned based on prior interactions with notifications.
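
A time-based rule like the one above might look something like this. It is a sketch under assumptions: the function names and the skip_weekends flag are invented for illustration, and the times are taken to be in the user's local time zone. The one subtle part is a quiet window that wraps past midnight, such as 22:00 to 08:00.

```python
from datetime import datetime, time


def in_quiet_hours(current: time, start: time, end: time) -> bool:
    """True if `current` falls inside the Do Not Disturb window."""
    if start <= end:
        return start <= current < end
    # The window wraps past midnight, e.g. 22:00 to 08:00.
    return current >= start or current < end


def should_deliver(now: datetime, quiet_start: time, quiet_end: time,
                   skip_weekends: bool = True) -> bool:
    """Decide whether a work notification should be sent at `now`."""
    if skip_weekends and now.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        return False
    return not in_quiet_hours(now.time(), quiet_start, quiet_end)
```

A learned context, as opposed to a user-set one, would replace the fixed quiet_start and quiet_end values with ones inferred from when the user has responded to notifications in the past.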

Device and connection state information is underutilized. Know a user has low battery life? Maybe the notification to download the latest game update can wait. Users on Wi-Fi are more likely to interact with rich notifications than those on cellular connections. If a user loses connectivity and many notifications are queued, they may no longer all be relevant when the user is back in range.
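
Two of these ideas can be sketched in a few lines: dropping queued notifications that have gone stale by the time the user reconnects, and holding non-urgent notifications while the battery is low. The QueuedNotification type, the per-notification ttl field, and the 15% battery threshold are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class QueuedNotification:
    text: str
    created_at: float  # timestamp in seconds
    ttl: float         # how long (seconds) the notification stays relevant
    urgent: bool = False


def still_relevant(queue: List[QueuedNotification],
                   now: float) -> List[QueuedNotification]:
    """On reconnect, keep only the notifications that have not expired."""
    return [n for n in queue if now - n.created_at <= n.ttl]


def should_send_now(n: QueuedNotification, battery_level: float,
                    low_battery: float = 0.15) -> bool:
    """Hold non-urgent notifications while the battery is low."""
    return n.urgent or battery_level > low_battery
```

For example, a "your ride is arriving" notification with a short ttl would be dropped after an hour offline, while a direct message with a long ttl would still be delivered.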

Realtime is a crucial component for smart notifications

As users constantly switch devices and platforms, realtime knowledge of their status is key to providing intelligent notifications. Developers who do this well will continue to retain user interest, and those who don’t will have a hard time keeping their attention.


What is a realtime API?

Many software developers are familiar with realtime, but we believe that realtime concepts and user experiences are becoming increasingly important for less technical individuals to understand.

At Fanout, we power realtime APIs to instantly push data to endpoints – which can range from the actual endpoints of an API (the technical term) to external businesses or end users. We use the word ‘endpoint’ loosely in this post to refer to any destination for data.

We’re here to share our experience with realtime: we’ll provide a definition and current examples, peer into the future of realtime, and try to shed some light on the eternal realtime vs. real-time vs. real time semantic debate.

The simple definition

Realtime refers to a synchronous, bi-directional communication channel between endpoints with a latency of less than 100ms.

We’ll break that down in plain[er] English:

  • Synchronous means that both endpoints have access to data at the same time (not to be confused with sync/async programming).
  • Bi-directional means that endpoints can send data in either direction.
  • Endpoints are senders or receivers of data: they could be anything from an API endpoint that makes data available to a user chatting on their phone.
  • 100ms is somewhat arbitrary: data cannot be delivered instantly – but under 100ms is pretty close, especially with respect to human perception. Robert Miller described this threshold in 1968.

An example of a realtime user experience

A simple example of a realtime user experience is that of a chat app. In a chat app, you ‘immediately’ (sub 100ms) see messages from the person (endpoint) you’re chatting with, and can receive information about when they read your messages (synchronous, bi-directional).

Realtime vs. request-response

Web experiences are beginning to move from request-response experiences to live, realtime ones. Social feeds don’t require a refresh (a request) to update, and you don’t need to email documents as attachments that need to be downloaded (request) and sent back with edits (response) – you just use collaboration software that works in realtime.
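
The difference between the two styles can be illustrated with a toy in-memory feed. The Feed class and its method names are hypothetical, and a real system would deliver updates over a network connection rather than a local callback. With fetch, the client has to keep asking; with subscribe, it registers once and is handed each new item as it is posted.

```python
class Feed:
    """Toy feed that supports both pull (fetch) and push (subscribe)."""

    def __init__(self):
        self._items = []
        self._subscribers = []

    def fetch(self, since=0):
        """Request-response: the caller asks for items it has not seen yet."""
        return self._items[since:]

    def subscribe(self, callback):
        """Realtime: the caller registers once and is called for each new item."""
        self._subscribers.append(callback)

    def post(self, item):
        """Store the item and push it to every subscriber immediately."""
        self._items.append(item)
        for callback in self._subscribers:
            callback(item)


feed = Feed()
received = []
feed.subscribe(received.append)  # push: no refresh needed
feed.post("first update")
feed.post("second update")
```

After the two posts, the subscriber has already received both items, while a polling client would see them only on its next call to fetch.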

More realtime experiences

Realtime user experiences are everywhere you look – especially where near-instant access to information is valuable. You’ll find realtime in:

  • Collaboration: realtime access to internal and external information from your team is becoming the norm. It’s accepted that a sales inquiry (data) can be instantaneously relayed from live chat on your website, into your customer service portal and then into Slack.
  • Finance: stock tracking and bitcoin wallets require immediate access to information. Applications like high-frequency trading exist specifically because of the ability of certain parties to access and act on data faster than others.
  • Events: second-screen experiences for sports, including live betting with realtime odds updates, are becoming increasingly common.
  • Crowdsourcing: distributed collection, analysis, and dissemination of data from distributed endpoints (think reports from WeatherUnderground stations or from the traffic app Waze) is only valuable when it occurs in realtime.

Realtime in the future

As we see it (and admittedly, we are a little biased), realtime is quickly becoming the new normal. Up-to-date information is expected by businesses and end users. Realtime is the natural complement to trends like:

Big Data: as the number of digitally connected businesses, experiences, and devices rises, so does the amount of data generated. Data becomes more valuable as the three V’s of a dataset (velocity, volume, variety) increase – and realtime transmission is central to the velocity component.

In the past, companies benefitted from hoarding data, but increasingly data is becoming most valuable when shared (and monetized). The companies that can aggregate and share the most data, as quickly as possible, will be successful.

Proliferation of APIs: businesses sharing data are increasingly going to do so through APIs. Entire businesses are being built on APIs by platform providers like Twilio (they only have an API) or they are coming to comprise substantial portions of existing businesses (like Salesforce’s API).

An elegant end-user experience is increasingly the product of data that’s being moved through multiple APIs – and the number of APIs is only going to increase as they trend towards becoming less technical and more accessible and interoperable. The APIs that provide access to data or move it through their system as quickly as possible will rise over those that cannot.

Realtime vs. real-time vs. real time

The endless debate – what’s the correct way to write what we’ve been discussing? We use realtime, because we believe that “real time” refers to something experienced at normal speed and not condensed or sped up. For example, watching grass grow in ‘real time’ is not very exciting – but a time lapse is.

We also don’t like hyphenating – so we went with realtime instead of real-time (and it looks like most of the industry agrees with us).