Author Archive

Supercharging Kafka: Enable Realtime Web Streaming by Adding Pushpin

Exposing Kafka messages via a public HTTP streaming API

Matt Butler

Apache Kafka is the new hotness when it comes to adding realtime messaging capabilities to your system. At its core, it is an open source distributed messaging system that uses a publish-subscribe model for building realtime data pipelines. More broadly, it is a distributed, horizontally scalable commit log.

In a Kafka cluster, you will have topics, producers, consumers, and brokers:

  • Topics: A categorization for a group of messages
  • Producers: Push messages into a Kafka topic
  • Consumers: Pull messages off of a Kafka topic
  • Kafka Broker: A Kafka node
  • Kafka Cluster: A collection of Kafka brokers

Take a deep dive into Kafka here.

Overall, Kafka provides fast, highly scalable and redundant messaging through a publish-subscribe model.

A pub-sub model is a messaging pattern where publishers categorize published messages into topics without knowledge of which subscribers would receive those messages (if any). Likewise, subscribers express interest in one or more topics and only receive messages that are of interest, without knowing anything about the publishers (source).
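To make the decoupling concrete, here is a minimal in-memory sketch of the pattern. The `Broker` class and its method names are hypothetical and exist only for illustration; this is not how Kafka is implemented (Kafka persists messages in a partitioned log and consumers pull from it), it only shows that publishers and subscribers never reference each other.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List


class Broker:
    """Toy in-memory broker: routes each message to the topic's subscribers."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[str], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: str) -> int:
        # The publisher never learns who (if anyone) receives the message;
        # it only gets back a delivery count for demonstration purposes.
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(message)
        return len(handlers)


broker = Broker()
scores: List[str] = []
broker.subscribe("scores", scores.append)

broker.publish("scores", "home 1, away 0")  # delivered to one subscriber
broker.publish("weather", "sunny")          # no subscribers, so nothing is delivered
```

The subscriber only registers interest in a topic name, so either side can be added or removed without changing the other.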

Kafka Strengths

As a messaging system, Kafka has some transformative strengths that have catalyzed its rising popularity:

  1. Realtime Data Pipeline: Can handle realtime messaging throughput with high concurrency
  2. High-throughput: Ability to support high-velocity and high-volume data (thousands of messages per second)
  3. Fault-tolerant: Due to its distributed nature, it is relatively resistant to node failure within a cluster
  4. Low Latency: Handles thousands of messages with millisecond latency
  5. Scalability: Kafka’s distributed nature allows you to add nodes without downtime, facilitating partitioning and replication

Kafka Limits

Due to its intrinsic architecture, Kafka is not optimized to provide API consumers with friendly access to realtime data. As such, many orgs are hesitant to expose their Kafka endpoints publicly.

In other words, it is difficult to expose Kafka across a public API boundary if you want to use traditional protocols (like WebSockets or HTTP).

To overcome this limit, we can integrate Pushpin into our Kafka ecosystem to handle more traditional protocols and expose our public API in a more accessible and standardized way.

Pushpin + Kafka

Server-sent events (SSE) is a technology where a browser receives automatic updates from a server over an HTTP connection (standardized as part of HTML5). Kafka doesn’t natively support this protocol, so we need to add an additional service to make this happen.

Pushpin’s primary value prop is that it is an open source solution that enables realtime push — a requisite of evented APIs (GitHub Repo). At its core, it is a reverse proxy server that makes it easy to implement WebSocket, HTTP streaming, and HTTP long-polling services. Structurally, Pushpin communicates with backend web applications using regular, short-lived HTTP requests.

Integrating Pushpin and Kafka provides you with some notable benefits:

  • Resource-Oriented API: Provides a more logical resource-oriented API to consumers that fits in with an existing REST API. In other words, you can expose data over standardized, more-secure protocols.
  • Authentication: Reuses existing authentication tokens and data formats.
  • API Management: Harnesses your existing API management system or load balancers.
  • Web Tier Scalability: If the number of your web consumers grows substantially, then it may be more economical and performant to scale out your web tier, rather than your Kafka cluster.

In this next example, we will expose Kafka messages via an HTTP streaming API.

Building Kafka Server-Sent Events

This example project reads messages from a Kafka service and exposes the data over a streaming API using the Server-Sent Events (SSE) protocol over HTTP. It is written using Python and Django, and relies on Pushpin for managing the streaming connections.

How it Works

In this demo, we drop a Pushpin instance on top of our Kafka broker. A relay process acts as a Kafka consumer, subscribes to all topics, and re-publishes received messages to Pushpin, which delivers them to connected clients. Clients listen to events via Pushpin.

More granularly, we use a Django view to set up an SSE endpoint, while the relay command handles the messaging input and output.

1. First, we need to set up a virtualenv and install dependencies:
virtualenv --python=python3 venv
. venv/bin/activate
pip install -r requirements.txt

2. Create a suitable .env with Kafka and Pushpin settings:


3. Run the Django server:

python manage.py runserver

4. Run Pushpin:

pushpin --route="* localhost:8000"

5. Run the relay command:

python manage.py relay

The relay command sets up a Kafka consumer according to KAFKA_CONSUMER_CONFIG, subscribes to all topics, and re-publishes received messages to Pushpin, wrapped in SSE format.
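The "wrapped in SSE format" step can be sketched as follows. This is a minimal illustration of the SSE wire format, not the example project's actual code; the function name `sse_event` is hypothetical.

```python
def sse_event(data: str, event: str = "message") -> str:
    """Encode one Server-Sent Event; a blank line terminates the event."""
    lines = [f"event: {event}"]
    # Each line of a multi-line payload needs its own "data:" field.
    lines.extend(f"data: {part}" for part in data.split("\n"))
    return "\n".join(lines) + "\n\n"


print(sse_event("hello"), end="")
```

The trailing blank line matters: it is what tells the client that the event is complete and can be dispatched.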

Clients can listen to events by making a request (through Pushpin) to /events/{topic}/:

curl -i http://localhost:7999/events/test/

The output stream might look like this:

HTTP/1.1 200 OK
Content-Type: text/event-stream
Transfer-Encoding: chunked
Connection: Transfer-Encoding

event: message
data: hello

event: message
data: world

Repo on GitHub

How To Power Your App Using a Realtime Data CDN

Combining Fastly (high scale pull) and Fanout (high scale push) to power realtime messaging at the edge

CDN — Content Delivery Network

Let’s start with defining a CDN. A content delivery network (CDN) is a system of distributed servers that traditionally delivers web content to a user, based on the geographic locations of the user, the origin of the webpage and the content delivery server. I use the term traditionally because we’re entering an era where CDNs are doing more than just delivering web content.

An example would be Cloudflare Workers, which lets you use their CDN to run code at the edge, rather than just serve web pages / cached content. You are basically able to deploy and run JavaScript away from the origin server — allowing you to decouple code from a user’s device. According to Cloudflare, “these Workers also enable programmatic functionality for routing, filtering and responding to HTTP requests that would otherwise need to be run on a customer’s server at the origin.”

The main point is that CDNs and edge computing are continuously evolving — whereby the two are starting to meld together in an era where high scalability is paramount.

Melding Realtime Data Push with Realtime Data Pull

Many realtime applications need to work with data that is both pushed and pulled (e.g., live sports scores, auctions, chat). On their own, data push and data pull are fairly straightforward. At initialization time, past content could be retrieved from a pull CDN and new/future updates could be pushed from a separate service.

But, what if you could chain these mechanisms together?

Proxy Chaining with Fastly and Fanout

Fastly is an edge cloud platform that enables applications to process, serve, and secure data at the edge of a network. It is essentially highly scalable data pull and response, using a platform that can listen and respond to users’ needs in realtime. Similar to a traditional CDN, Fastly allows you to cache content, but it also lets you deliver application logic at the edge.

On the other hand, Fanout is highly scalable data push, serving as a reverse proxy that handles long-lived client connections and pushes data as it becomes available.

Both Fastly and Fanout work as reverse proxies, so it is possible to have Fanout proxy traffic through Fastly — rather than sending that traffic directly to your origin server. Together, this coupled system has some interesting benefits:

  1. High availability — If your origin server goes down, Fastly can serve cached data and instructions to Fanout. This means clients could connect to your API endpoint, receive historical data, and activate a streaming connection, all without needing access to the origin server.
  2. Cached initial data — Fanout lets you build API endpoints that serve both historical and future content, for example an HTTP streaming connection that returns some initial data before switching into push mode. Fastly can provide that initial data, reducing load on your origin server.
  3. Cached Fanout instructions — Fanout’s behavior (e.g. transport mode, channels to subscribe to, etc.) is determined by instructions provided in origin server responses (using a system of special headers called Grip). Fastly can subsequently cache these instructions and headers.
  4. High scalability — By caching Fanout instructions and headers, Fastly can further reduce the load on your origin server — bringing that processing logic closer to the edge.

Mapping the Network Flow

Using Fanout and Fastly, let’s map the network flow to see how these push and pull mechanisms could work together.

Let’s suppose there’s an API endpoint /stream that returns some initial data and then stays open until there is a new update to push. With Fanout, this can be implemented by having the origin server respond with instructions:

HTTP/1.1 200 OK
Content-Type: text/plain
Content-Length: 29
Grip-Hold: stream
Grip-Channel: updates

{"data": "current value"}

When Fanout receives this response from the origin server, it converts it into a streaming response to the client:

HTTP/1.1 200 OK
Content-Type: text/plain
Transfer-Encoding: chunked
Connection: Transfer-Encoding

{"data": "current value"}
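The origin-side half of this exchange can be sketched in a few lines. This is a framework-neutral illustration that assumes only the two Grip headers shown above; the function name `grip_stream_response` is hypothetical, and a real handler would return these values through its web framework's response object.

```python
import json
from typing import Dict, Tuple


def grip_stream_response(initial: dict, channel: str) -> Tuple[int, Dict[str, str], bytes]:
    """Build an origin response that asks the proxy to hold the request open as a stream."""
    body = (json.dumps(initial) + "\n").encode("utf-8")
    headers = {
        "Content-Type": "text/plain",
        "Content-Length": str(len(body)),  # computed from the body, never hard-coded
        "Grip-Hold": "stream",             # keep the client connection open
        "Grip-Channel": channel,           # channel the connection is subscribed to
    }
    return 200, headers, body


status, headers, body = grip_stream_response({"data": "current value"}, "updates")
```

Because the response is an ordinary, complete HTTP response, nothing about it prevents an intermediate cache from storing it.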

The request between Fanout and the origin server is now finished, but the request between the client and Fanout remains open. Here’s a sequence diagram of the process:

Since the request to the origin server is just a normal short-lived request/response interaction, it can alternatively be served through a caching server such as Fastly.

Here’s what the process looks like with Fastly in the mix:

Now, when the next client makes a request to the /stream endpoint, the origin server isn’t involved at all:

In other words, Fastly serves the same response to Fanout, with those special HTTP headers and initial data, and Fanout sets up a streaming connection with the client.

Of course, this is only the connection setup. To send updates to connected clients, the data must be published to Fanout.

Purging the Fastly Cache

If an event that triggers a publish causes the origin server response to change, then we may also need to purge the Fastly cache.

For example, suppose the “value” that the /stream endpoint serves has been changed. The new value could be published to all current connections, but we’d also want any new connections that arrive afterwards to receive this latest value as well, rather than the older cached value. This can be solved by purging from Fastly and publishing to Fanout at the same time.

This sequence diagram illustrates a client connecting, receiving an update, and then another client connecting:

Effectively Handling Rate-Limiting

If your publishing data rate is relatively high, then this can negate the caching benefit of using Fastly.

The ideal data rate to effectively harness Fastly’s cache would be data that is:

  • Accessed frequently: many new visitors per second
  • Changed frequently: updates every few seconds or minutes
  • Delivered instantly: in milliseconds

An example of this would be a live blog, whereby most requests can be served and handled from cache.

However, if your data changes multiple times per second (or has the potential to change that fast during peak moments), and you expect frequent access, you really don’t want to be purging your cache multiple times per second.

The workaround is to rate-limit your purges. For example, during periods of high throughput, you might purge and publish at a maximum rate of once per second or so. This way, the majority of new visitors can be served from cache, and the data will be updated shortly after.
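A rate-limited purge-and-publish step might look like the sketch below. It is an illustration under stated assumptions, not the demo's code: `ThrottledUpdater` is a hypothetical name, and the `purge` and `publish` callables stand in for whatever calls your application makes to Fastly and Fanout. The clock is injectable so the behavior can be exercised without waiting.

```python
import time
from typing import Callable, Optional


class ThrottledUpdater:
    """Purge the cache and publish at most once per `min_interval` seconds."""

    def __init__(self, purge: Callable[[], None], publish: Callable[[str], None],
                 min_interval: float = 1.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._purge = purge
        self._publish = publish
        self._min_interval = min_interval
        self._clock = clock
        self._last_flush: Optional[float] = None
        self._pending: Optional[str] = None

    def update(self, value: str) -> bool:
        """Record a new value; flush right away only if the interval has elapsed."""
        self._pending = value
        return self.flush_if_due()

    def flush_if_due(self) -> bool:
        """Call periodically (e.g. from a timer) so a held-back value still goes out."""
        if self._pending is None:
            return False
        now = self._clock()
        if self._last_flush is not None and now - self._last_flush < self._min_interval:
            return False
        value, self._pending = self._pending, None
        self._last_flush = now
        self._purge()         # new connections must not see the stale cached value
        self._publish(value)  # existing connections receive the latest value
        return True


calls = []
fake_now = [0.0]  # fake clock so the example runs instantly
updater = ThrottledUpdater(
    purge=lambda: calls.append("purge"),
    publish=lambda value: calls.append(f"publish {value}"),
    clock=lambda: fake_now[0],
)
updater.update("1")     # flushed immediately
updater.update("2")     # held back: less than one second since the last flush
updater.update("3")     # replaces "2" as the pending value
fake_now[0] = 1.0
updater.flush_if_due()  # flushes "3"
```

Only the most recent value is kept while throttled, so intermediate values are skipped rather than queued.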


You can reference the GitHub source code for the Fastly/Fanout high scale Live Counter demo. Requests first go to Fanout, then to Fastly, then to a Django backend server which manages the counter API logic. Whenever a counter is incremented, the Fastly cache is purged and the data is published through Fanout. The purge and publish process is also rate-limited to maximize caching benefit.

Final Thoughts: The Emergence of a Messaging CDN?

Broadly speaking, we could define a messaging content delivery network as a geographically distributed group of servers which work together to provide near realtime delivery of dynamic data and web content.

This new genre of CDN could allow data processing to take place at the edge, away from an app’s origin — thereby ushering in a new era of realtime computing that is both affordable and scalable.

How Blockchain and Realtime APIs are Totally Changing Healthcare

In his article, Philip Levinson discusses how realtime data has become essential for advances in healthcare software — specifically as applied to new blockchain technology.

As with other healthcare companies and organizations, one of the keys to Oscar’s model depends on “using real-time data to get actionable insights in front of members and physicians,” says Schlosser.

As a result, blockchain “has the power to revive the healthcare industry by reorganizing operations, generating new business models and integrating patients’ medical records,” according to Zacks.

The latter two of these are the ways blockchain is most likely to change healthcare in the short run.

Full Article 

Payments Are Moving To Real-Time In Countries Around The World

In this Forbes article, Tom Groenfeldt discusses the emergence of the realtime payment ecosystem and its demand on the realtime API environment.

Faster payment systems are being adopted in countries all around the globe even though there is no compelling ROI argument for them, according to the fourth annual Flavors of Fast payments study from FIS.

In fact, many of the innovations accompanying faster payments are not about pure speed but other attributes such as 24x7x365 operating hours and standards like ISO 20022 that support data like invoices moving with payments or requests for payments.

Full Article

Serverless WebSockets with AWS Lambda & Fanout

The basics of adding realtime data push to your serverless backend



Serverless is one of the developer world’s most popular misnomers. Contrary to its name, serverless computing does in fact use servers, but the benefit is that you can worry less about maintenance, scale, and configuration. This is because serverless is a cloud computing execution model where a cloud provider dynamically manages the allocation of machine and computational resources. You are basically deploying code to an environment without visible processes, operating systems, servers, or virtual machines. From a pricing perspective, you are typically charged for the actual amount of resources consumed and not by pre-purchased capacity.


Serverless Benefits

  • Reduced architectural complexity
  • Simplified packaging and deployment
  • Reduced cost to scale
  • Eliminates the need for system admins
  • Works well with microservice architectures
  • Reduced operational costs
  • Typically decreased time to market with faster releases


Serverless Drawbacks

  • Performance issues: typically higher latency due to how compute resources are allocated
  • Vendor lock-in (hard to move to a new provider)
  • Not efficient for long-running applications
  • Multi-tenancy issues where service providers may run software for several different customers on the same server
  • Difficult to test functions locally
  • Different FaaS implementations provide different methods for logging in functions

AWS Lambda

Amazon’s take on serverless comes in the form of AWS Lambda. AWS Lambda lets you run code without provisioning or managing servers, and you only pay for your actual usage. With Lambda, you can run code for virtually any type of application or backend service, and Lambda automatically runs and scales your application code. Moreover, you can set up your code to automatically trigger from other AWS services or call it directly from any web or mobile app.


WebSockets

A WebSocket provides a long-lived connection for exchanging messages between client and server. Messages may flow in either direction for full-duplex communication. A client creates a WebSocket connection to a server, using a WebSocket client library. WebSocket libraries are generally available in every language, and of course browsers support it natively using the WebSocket JavaScript object. The connection negotiation uses an HTTP-like exchange, and a successful negotiation is indicated with status code 101. After the negotiation response is sent, the connection remains open to be used for exchanging message frames in either binary or Unicode string format. Peers may also exchange close frames to perform a clean close.
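As a rough illustration of that negotiation, the sketch below computes the `Sec-WebSocket-Accept` header a server returns in its 101 response, as defined by RFC 6455. The function names are hypothetical, and in practice a WebSocket library performs this step for you.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def websocket_accept(sec_websocket_key: str) -> str:
    """Compute the Sec-WebSocket-Accept value for a client's Sec-WebSocket-Key."""
    digest = hashlib.sha1((sec_websocket_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def handshake_response(sec_websocket_key: str) -> str:
    """Build the 101 response that completes a successful negotiation."""
    return (
        "HTTP/1.1 101 Switching Protocols\r\n"
        "Upgrade: websocket\r\n"
        "Connection: Upgrade\r\n"
        f"Sec-WebSocket-Accept: {websocket_accept(sec_websocket_key)}\r\n"
        "\r\n"
    )


# Sample key taken from the example handshake in RFC 6455.
print(handshake_response("dGhlIHNhbXBsZSBub25jZQ=="))
```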

Building AWS IoT WebSockets

Function-as-a-service backends, such as AWS Lambda, are not designed to handle long-lived connections on their own. This is because the function invocations are meant to be short-lived. Lambda is designed to integrate with services such as AWS IoT to handle these types of connections. AWS IoT Core supports MQTT (either natively or over WebSockets), a lightweight communication protocol specifically designed to tolerate intermittent connections.

AWS IoT Core Site

However, this approach alone will not give you access to the raw protocol elements — and will not allow you to build a pure Lambda-powered API (if that is your intended use case). If you want this access, then you need to take a different approach.

Building Lambda-Powered WebSockets with Fanout

You can also build custom Lambda-powered WebSockets by integrating a service like Fanout — a cross between a message broker and a reverse proxy that enables realtime data push for apps and APIs. With these services together, we can build a Lambda-powered API that supports plain WebSockets.

This approach uses GRIP, the Generic Realtime Intermediary Protocol — making it possible for a web service to delegate realtime push behavior to a proxy component.

This FaaS GRIP library makes it easy to delegate long-lived connection management to Fanout, so that backend functions only need to be invoked when there is connection activity. The other benefit is that backend functions do not have to run for the duration of each connection.

The following step-by-step breakdown is meant as a quick configuration reference. You can check out the GitHub libraries for Node and Python integrations.

1. Initial Configuration

You will first configure your Fanout Cloud domain/environment and set up an API and resource in AWS API Gateway to point to your Lambda function, using a Lambda Proxy Integration.

2. Using WebSockets

Whenever an HTTP request or WebSocket connection is made to your Fanout Cloud domain, your Lambda function will be able to control it. To do this, Fanout converts incoming WebSocket connection activity into a series of HTTP requests to your backend.

3. You’ve Got Realtime

You now have a realtime WebSocket service driven by a Lambda function!

An Example

This Node.js code implements a WebSocket echo service. I recommend checking out the full FaaS GRIP library for a step-by-step breakdown, and for instructions on implementing HTTP long polling and HTTP streaming.

var grip = require('grip');
var faas_grip = require('faas-grip');

exports.handler = function (event, context, callback) {
    var ws;
    try {
        // parse the incoming WebSocket-over-HTTP request
        ws = faas_grip.lambdaGetWebSocket(event);
    } catch (err) {
        callback(null, {
            statusCode: 400,
            headers: {'Content-Type': 'text/plain'},
            body: 'Not a WebSocket-over-HTTP request\n'
        });
        return;
    }

    // if this is a new connection, accept it
    if (ws.isOpening()) {
        ws.accept();
    }

    // here we loop over any messages
    while (ws.canRecv()) {
        var message = ws.recv();

        // if return value is null, then the connection is closed
        if (message == null) {
            ws.close();
            break;
        }

        // echo the message
        ws.send(message);
    }

    // send the accumulated WebSocket events back to Fanout
    callback(null, ws.toResponse());
};

Overall, if you’re not looking for full control over your raw protocol elements, then you may find it easier to try a Lambda/AWS IoT configuration. If you need more WebSocket visibility and control, then the Lambda+Fanout integration is probably your best bet.

Edge Computing: A Beginner’s Guide

Learn the basics of edge computing and how it is transforming the realtime landscape

Machine Pulse

The Edge

The ‘edge’ refers to computing infrastructure that exists close to the origin sources of data. It is distributed IT architecture and infrastructure where data is processed at the periphery of the network, as close to the originating source as possible.

Edge computing is a method of optimizing cloud computing systems by performing data processing at the edge of the network, near the source of the data.


Living on the Edge

A series of gateway servers sit outside your primary cloud environment, allowing for more localized data processing.

Examples of edge computing can be found throughout our everyday lives — we just may not notice them.

Industrial Internet of Things (IIoT)

  • Wind turbines
  • Magnetic resonance (MR) scanner
  • Undersea blowout preventers
  • Industrial controllers such as SCADA systems
  • Automated industrial machines
  • Smart power grid technology
  • Smart streetlights

Internet of Things (IoT)

  • Motor vehicles (Cars and trucks)
  • Mobile devices
  • Traffic lights
  • Thermostats
  • Home appliances

Edge Computing Benefits

Edge computing allows for the clear scoping of computing resources for optimal processing.

  1. Time-sensitive data can be processed at the point of origin by a localized processor (a device that has its own computing ability).
  2. Intermediary servers can be used to process data in close geographical proximity to the source (this assumes that intermediate latency is okay, though realtime decisions should be made as close to the origin as possible).
  3. Cloud servers can be used to process less time-sensitive data or to store data for the long term. With IoT, you’ll see this manifest in analytics dashboards.
  4. Edge application services significantly decrease the volumes of data that must be moved, the consequent traffic, and the distance the data must travel, thereby reducing transmission costs, shrinking latency, and improving quality of service (QoS) (source).
  5. Edge computing removes a major bottleneck and potential point of failure by de-emphasizing the dependency on the core computing environment.
  6. Security improves as encrypted data is checked as it passes through protected firewalls and other security points, where viruses, compromised data, and active hackers can be caught early on (source).
  7. The edge augments scalability by logically grouping CPU capabilities as needed, saving costs on realtime data transmission.

Why the Edge

Transmitting massive amounts of data is expensive and taxing on network resources. Edge computing allows you to process data near the source and only send relevant data over the network to an intermediate data processor.

For example, a smart refrigerator does not need to continually send internal temperature data back to a cloud analytics dashboard. Rather, it can be configured to only send data when the temperature has changed beyond a particular point; or, it could be polled to send data only when the dashboard is loaded. Similarly, an IoT security camera could only need to send data back to your device when it detects motion or when you explicitly toggle a live data feed.
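The refrigerator case can be sketched as a simple threshold filter running on the device. The class name, the threshold value, and the sample readings are all hypothetical, chosen only to show the idea of sending data when it has moved beyond a set point.

```python
from typing import List, Optional


class ThresholdReporter:
    """Forward a reading only when it moves more than `threshold` from the last one sent."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold
        self.last_sent: Optional[float] = None

    def should_send(self, reading: float) -> bool:
        if self.last_sent is None or abs(reading - self.last_sent) > self.threshold:
            self.last_sent = reading
            return True
        return False


reporter = ThresholdReporter(threshold=1.0)
readings = [4.0, 4.2, 4.5, 5.6, 5.7, 3.9]  # degrees Celsius, sampled locally
sent: List[float] = [r for r in readings if reporter.should_send(r)]
print(sent)  # only three of the six readings leave the device
```

The comparison is against the last value actually sent, not the previous sample, so a slow drift is still reported once it accumulates past the threshold.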

Device Relationship Management (DRM)

Device relationship management (DRM) refers to the monitoring and maintenance of complex, intelligent, and interconnected equipment, such as edge devices, over the internet. DRM is specifically designed to interface with the microprocessors and local software in IoT devices.

Device relationship management (DRM) is enterprise software that enables the monitoring, managing, and servicing of intelligent devices over the Internet.


The Fog

Between the edge and cloud is the fog layer, which helps bridge the connections between edge devices and cloud data centers. According to Matt Newton of Opto 22:

Fog computing pushes intelligence down to the local area network level of network architecture, processing data in a fog node or IoT gateway.

Edge computing pushes the intelligence, processing power and communication capabilities of an edge gateway or appliance directly into devices like programmable automation controllers (PACs).


Edge and Realtime

Sensors and remotely deployed devices demand realtime processing. A centralized cloud system is often too slow for this, especially when decisions need to be made in microseconds. This is especially true for IoT devices in regions or locations with poor connectivity.

However, sometimes realtime capabilities demand cloud processing. For example, let’s say data consumed by remote tornado weather monitors needs to be sent in realtime to massive supercomputers.

This is where realtime infrastructure comes into play to help enable those data transactions.

By 2020, 50% of Managed APIs Projected to be Event-Driven

The proliferation of event-driven, realtime APIs fueled by big data, IoT, and consumer expectations

According to Mark O’Neill and Paolo Malinverno of Gartner, 50% of managed APIs will support event-driven IT by 2020 (2017 Report). These event-driven APIs will not necessarily replace RESTful request-response architectures, but will become necessary supplements to expand an organization’s functional offerings and overall performance.

In another 2017 IoT report, Gartner projects “8.4 billion connected devices, up 31% from 2016, and will reach 20.4 billion by 2020. Total spending on endpoint infrastructure and services will reach almost $2 trillion in 2017.”

So, what’s driving this evolution? “Realtime” is becoming an omnipresent force in the modern tech stack. As consumers demand faster experiences and more instantaneous data transactions, companies are increasingly investing in product infrastructure that accelerates these transactions. Though we’ve seen APIs become an economic and technological imperative, they are typically based on request-response style interactions, which limits their scope and effectiveness in the realtime arena.

Request-Response vs Event-Driven APIs

At its core, request–response is a message exchange pattern in which a requestor sends a request message to a replier system. The replier system receives and processes the request, and if all goes well, it returns a message in response. While this exchange format works well for more structured requests, it limits integrations to those where the expectant system has a clear idea what it wants from the other. These request-response style APIs, therefore, must follow the interaction script from the calling service.

Request-Response vs Event-Driven Realtime APIs

In an event-driven architecture, applications integrate multiple services and products as equals based on event-driven interactions. These interactions are driven by event emitters, event consumers, and event channels, whereby the events, themselves, are typically significant ‘changes in state’ that are produced, published, propagated, detected, or consumed. This architectural pattern supports loose coupling amongst software components and services. The advantage is that an event emitter does not need to know the state of the consumer, who the consumer is, or how the event will be processed (if at all). It is a mechanism of pushing data through a persistent stream.

The $195 Billion IoT Market

The proliferation and ‘smartening’ of IoT-driven devices is projected to reach a market cap exceeding $195 billion in 2023, according to analysts at ReportsnReports. From a market of $16 billion in 2016, this growth is mainly fueled by the increasingly ubiquitous manufacturing of smarter in-home, mobile, and transportation devices — and the need to capture that data and enhance communication infrastructure.

The smarter devices become, the more data they need to make complex, realtime decisions. Sensors and external data gathering implements are becoming an essential catalyst for IoT industry growth. The accuracy of sensors and actuators that measure geospatial proximity, acceleration, temperature, and motion will separate the industry leaders from the laggards.



Taking a deeper dive into the actual core components, like semiconductors, Gartner forecasts a $45 billion IoT-driven semiconductor market by 2020, with consumer IoT taking the lion’s share and the automotive industry (including self-driving vehicles) taking second.

Data & Business Intelligence

The goal of a truly interconnected tech ecosystem will also mirror equal growth in data and business intelligence. The more things are interconnected, the more companies will need to gather data, push remote updates, and control devices in the field. Hence, remote communication needs to be reliable, data needs to be accurate, and the ability to extract meaningful information from big data becomes paramount.

According to a 2015 report by Seagate, 25% of all data, out of a total of 160 zettabytes, will need to be generated and processed in realtime by 2025.


Event-Driven API Mechanisms

If you’re looking to understand the web infrastructure behind realtime, then let’s explore some of its basic components. A more thorough analysis can be found in Getting Started with Realtime API Infrastructure.

Realtime is all about pushing data. In a data push model, data is pushed to a user’s device rather than pulled (requested) by the user. For example, modern push email allows users to receive email messages without having to check manually. Similarly, we can examine data push in a more continuous sense, whereby data is continuously broadcast. Anyone who has access to a particular channel or frequency can receive that data and decide what to do with it.

HTTP Streaming

HTTP streaming provides a long-lived connection for instant and continuous data push. You get the familiarity of HTTP with the performance of WebSockets. The client sends a request to the server and the server holds the response open for an indefinite length of time. This connection will stay open until the client closes it or a server-side event occurs. If there is no new data to push, the application will send a series of keep-alive ticks so the connection doesn’t close.
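The keep-alive behavior can be sketched as a generator that produces the chunks of a streaming response body. This is a simplified model rather than a real server: the function name and the `max_chunks` cap are hypothetical, and the cap exists only so the example terminates.

```python
import queue
from typing import Iterator


def stream_body(updates: "queue.Queue[str]", keep_alive_interval: float,
                max_chunks: int) -> Iterator[str]:
    """Yield response chunks: queued updates, or a keep-alive tick when idle."""
    for _ in range(max_chunks):  # a real server would loop until the client disconnects
        try:
            yield updates.get(timeout=keep_alive_interval) + "\n"
        except queue.Empty:
            yield "\n"  # keep-alive tick so the idle connection is not closed


updates: "queue.Queue[str]" = queue.Queue()
updates.put('{"price": 101}')
chunks = list(stream_body(updates, keep_alive_interval=0.05, max_chunks=2))
```

The first chunk carries the queued update, and the second is an empty tick sent because nothing arrived within the interval.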


WebSockets

WebSockets provide a long-lived connection for exchanging messages between client and server. Messages may flow in either direction for full-duplex communication. This bi-directional connection is established through a WebSocket handshake. Just like in HTTP Streaming and HTTP Long-Polling, the client sends a regular HTTP request to the server first. If the server agrees to the connection, the HTTP connection is replaced with a WebSocket connection.


Webhooks

Webhooks are a simple way of sending data between servers. No long-lived connections are needed. The sender makes an HTTP request to the receiver when there is data to push. A webhook registers or “hooks” to a callback URL and will notify you anytime an event has occurred. You register this URL in advance and when an event happens, the server sends an HTTP POST request with an event object to the callback URL. This event object contains the new data that will be pushed to the callback URL. You might use a webhook if you want to receive notifications about certain topics. It could also be used to notify you whenever a user changes or updates their profile.
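From the sender's side, delivering a webhook is just building and sending one POST request. The sketch below uses only the Python standard library; the callback URL, the event fields, and the function name are hypothetical, and the request is constructed but not sent so the example has no network dependency.

```python
import json
import urllib.request
from typing import Any, Dict


def build_webhook_request(callback_url: str, event: Dict[str, Any]) -> urllib.request.Request:
    """Prepare the HTTP POST a sender would make to a registered callback URL."""
    payload = json.dumps(event).encode("utf-8")
    return urllib.request.Request(
        callback_url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_webhook_request(
    "https://receiver.example/hooks/profile",    # hypothetical callback URL
    {"type": "profile.updated", "user_id": 42},  # hypothetical event object
)
# Sending is a single call: urllib.request.urlopen(request, timeout=5)
```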

HTTP Long-Polling

HTTP long-polling provides a long-lived connection for instant data push. It is the easiest mechanism to consume and also the easiest to make reliable. The server holds the request open until new data arrives or a timeout occurs. Most servers time out after 30 to 120 seconds, depending on how the API was set up. After the client receives a response (whether from new data or a timeout), it sends another request, and this cycle repeats continuously.
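The server-side hold can be modeled with a blocking queue read. This is a sketch of the idea only, with a hypothetical function name and very short timeouts so it runs quickly; a production API would use timeouts in the range described above.

```python
import queue
from typing import Optional


def long_poll(updates: "queue.Queue[str]", timeout: float) -> Optional[str]:
    """Hold the request until an update arrives or the timeout expires."""
    try:
        return updates.get(timeout=timeout)
    except queue.Empty:
        return None  # timeout: the client is expected to re-issue the request


updates: "queue.Queue[str]" = queue.Queue()
updates.put("score: 1-0")

first = long_poll(updates, timeout=0.05)   # data is waiting, so this returns at once
second = long_poll(updates, timeout=0.05)  # nothing arrives, so this times out
```

In both cases the client's next step is the same: issue a new request and wait again.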

And, of course, there is the infrastructure behind it all.

Realtime API Infrastructure – Realtime API infrastructure specifically allows developers to build realtime data push into their existing APIs.  Typically, you would not need to modify your existing API contracts, as the streaming server would serve as a proxy. The proxy design allows these services to fit nicely within an API stack. This means it can inherit other facilities from your REST API, such as authentication, logging, throttling, etc. It can be combined with an API management system.  In the case of WebSocket messages being proxied out as HTTP requests, the messages may be handled statelessly by the backend. Messages from a single connection can even be load balanced across a set of backend instances.
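
The last two points, stateless handling and load balancing, can be illustrated with a small Python sketch. It assumes each WebSocket message is wrapped in its own self-describing request that carries the connection ID, so any backend instance can process it. The request shape and the round-robin policy are assumptions made for illustration, not the wire format of any specific proxy.

```python
from itertools import cycle
from typing import Dict, List, Sequence, Tuple


def proxy_messages(connection_id: str,
                   messages: Sequence[str],
                   backends: Sequence[str]) -> List[Tuple[str, Dict[str, str]]]:
    """Turn each message from one connection into an independent request.

    Because every request carries the connection ID, no backend needs to
    remember earlier messages, so requests can be spread round-robin.
    """
    next_backend = cycle(backends)
    routed = []
    for message in messages:
        request = {"connection_id": connection_id, "body": message}
        routed.append((next(next_backend), request))
    return routed


if __name__ == "__main__":
    for backend, request in proxy_messages("conn-1", ["hi", "subscribe", "bye"],
                                           ["backend-a", "backend-b"]):
        print(backend, request)
```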

Realtime Application Infrastructure – Realtime app infrastructure sends data to browsers and clients. It typically uses pub/sub messaging, webhooks, and/or websockets — and is separate from an application or service’s main API.

Main Take-Aways

IoT, big data, and consumer expectations are fueling the proliferation of event-driven / realtime APIs. One of the greatest challenges facing engineers over the next few years will be constructing scalable, fault-tolerant event-driven architectures. This is why we are seeing companies spend more than $2 trillion in 2017 to support event-driven endpoints and infrastructure.

While RESTful architectures will remain a necessity, it is important for organizations to understand and plan for event-driven systems — which add a new dimension of realtime API infrastructure complexity.

The 6 Step Build vs Buy Model for Developers

Defining a process for objectively selecting homegrown or purchased solutions

For almost every functional or architectural application component, there is a plethora of ‘as a service’ offerings. We see infrastructure as a service (IaaS), backend as a service (BaaS), SaaS, PaaS... and a new ‘aaS’ seems to be added daily.

What do all these services have in common? Well, they aspirationally promise to give you, the engineer, (1) more freedom to focus on your core product, (2) faster time to market, and (3) production-ready solutions for complex and repeatable engineering operations.

Sometimes this is the case. Sometimes it isn’t. The purpose of this guide is to provide a rational set of objective criteria to assess whether you should build or buy a particular service.

What is build? What is buy?

Build does not necessarily mean that you are making something from scratch. It means that you are combining custom code, open source libraries, and individual/community expertise to construct a solution for your use case. This solution is something that you will design, build, run, maintain, and scale internally.

On the other hand, buy does not necessarily mean that you are purchasing an end-to-end, out-of-the-box solution for your use case. It more accurately represents the purchase of a defined service that adds near-immediate value to your use case. Typically, the viability of the service itself will be guaranteed by the seller and you will not need to design and build the service itself. However, depending on the type of service purchased, you may choose to run and scale it internally. Generally, you will offload the running, maintenance, and scalability to the seller.

The Developer Mind

Before we continue, let’s reset our frame of mind.

Many developers have strong egos, and that’s generally an empowering attribute. Strong egos give devs the confidence to power through complex obstacles, focus for days and weeks at a time, and cultivate entirely new industries. However, there’s a fine line between reasonable and unreasonable confidence.

“I can build ____ in ____ days!”
“Ha! I can build a better ____ in a weekend!”
“This is so expensive. I’m just going to build it.”

We frequently see and hear these comments on dev forums, aggregators like Reddit and HackerNews, and in our day-to-day interactions. If we don’t say it, then some of us probably think it from time to time. Hey, sometimes we’re probably right, but oftentimes our initial ego-driven reaction distances us from the objective criteria we apply to our general practice of programming.

When assessing what to build vs buy, or which ratio we choose, it is critical that we reset our frame of mind and approach our solutioning as open-mindedly and objectively as possible. Excluding the purists, no one cares if we were able to build our product from scratch or if we cleverly integrated a series of purchased solutions together. What people care about is if our product works and delivers exceptional value to customers.

With the build vs buy decision-making process, we will answer the question: “How do we deliver exceptional value to our customers quickly, efficiently, and prudently?”

Build vs Buy Decision-Making Model

[Figure: build versus buy guide and process for developers to choose software]

Step 1 – Identify and categorize your product’s functional scope

Your team has been tasked with building an ecommerce platform that allows users to upvote and downvote products. So, what are your product’s functional and architectural features?

Functional


  • Marketplace service
  • Voting service
  • Product display service
  • Inventory management service
  • Transaction service
  • Buyer, seller, and admin account management service
  • Search, filter, refine service

Architectural and Process

  • Databases
  • Servers
  • Load Balancers
  • Dev Environment / Version Control
  • Continuous Integration / Delivery Pipeline
  • REST / Realtime APIs
  • Frontend Framework
  • Deployment Controls / AB Testing

While these are not comprehensive feature sets, the important point is that there is a clear distinction between core product features (marketplace, voting), and necessary system & process architecture (server environment, CI/CD pipeline). There are features that are proprietary and unique to your product, and there are architectural features that are found in almost every modern application system.

Your job is to identify which of these features are proprietary to your platform and which are replicable proven solutions. To do this, ask the following questions:

  • What are the proprietary, core features that make my application unique?
  • What architectural services do I need for my platform scaffolding?
  • What is my ideal development pipeline going to look like?

Keep in mind, we are not solutioning yet or deciding what to build vs buy. We are identifying and categorizing our product’s functionality.

Step 2 – Define the scope of work and reconcile against constraints

Based on your feature categorization in step 1, it is time to define the scope of work to build each feature.

First, itemize and prioritize the detailed functionality for each feature:

  • What is the minimum functional scope for the feature to be viable?
  • What is the ideal functional scope for the feature?
  • Is this a feature I need now? Or can it wait?

Second, for each feature, answer the following build questions for the minimum and ideal functional scope:

  • How many developer resources do I have available to build this feature? Maintain this feature?
  • Can I harness any domain experts to help design this feature?
  • Has anyone on my team built this before?
  • How much time to design (A), build (B), test (C), deploy (D), maintain (E) this feature?
  • Will building this divert resources from something else?
  • Do I need to hire additional resources? If so, what is the cost breakdown?
  • What is the infrastructure cost to run this internally?
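
One way to keep the A through E time estimates honest is to write them down and total them over a planning horizon. The Python sketch below does that; the field names and the sample numbers are purely illustrative assumptions, not benchmarks.

```python
from dataclasses import dataclass


@dataclass
class BuildEstimate:
    """Time estimate for one feature, in developer-days."""
    design_days: float              # A
    build_days: float               # B
    test_days: float                # C
    deploy_days: float              # D
    maintain_days_per_month: float  # E

    def upfront_days(self) -> float:
        """One-time effort: A + B + C + D."""
        return self.design_days + self.build_days + self.test_days + self.deploy_days

    def total_days(self, months: int) -> float:
        """Upfront effort plus maintenance (E) over the given horizon."""
        return self.upfront_days() + self.maintain_days_per_month * months


if __name__ == "__main__":
    # Hypothetical minimum-scope estimate for the voting service.
    minimum = BuildEstimate(design_days=3, build_days=10, test_days=4,
                            deploy_days=1, maintain_days_per_month=2)
    print(minimum.upfront_days(), minimum.total_days(months=12))
```

Running the same calculation for the minimum and the ideal scope makes the maintenance term visible, which is the part most often left out of "I can build it in a weekend" estimates.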

Third, for each core feature, answer the following buy questions for the minimum and ideal functional scope:

  • What is my monthly budget for this service?
  • How do I anticipate my budget changing over time?
  • Can I harness any domain experts to help me assess the best solution?
  • What developer resources do I have available to integrate and configure the solution?
  • If applicable, will I have the resources to self-host, run, maintain, and scale the service?

Step 3 – Solution divergence

Now we can get to the good stuff! In this step, we are not deciding what to build or buy; rather, we are aggregating an inventory of choices.

First, scour the interwebs, get referrals, and assess the solution ecosystem. Have other teams built this successfully? Have they bought it successfully? What are the horror and success stories?

Second, create a build vs buy comparison matrix. Make sure to note the monthly, infrastructure, and long-term maintenance costs. Note the total upfront and ongoing time needed for each build or buy solution (build/buy hybrids are great too!).
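
A comparison matrix can start as something as small as the Python sketch below, which ranks options by total cost over a chosen horizon. The option names and every figure are made up for illustration, and `monthly_cost` is assumed to bundle subscription, infrastructure, and maintenance.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Option:
    name: str
    kind: str            # "build", "buy", or "hybrid"
    upfront_cost: float  # one-time cost (integration, initial development)
    monthly_cost: float  # subscription + infrastructure + maintenance

    def total_cost(self, months: int) -> float:
        return self.upfront_cost + self.monthly_cost * months


def comparison_matrix(options: List[Option], months: int) -> List[Tuple[str, str, float]]:
    """Return (name, kind, total cost) rows, cheapest first, for the horizon."""
    ranked = sorted(options, key=lambda o: o.total_cost(months))
    return [(o.name, o.kind, o.total_cost(months)) for o in ranked]


if __name__ == "__main__":
    options = [
        Option("in-house search", "build", upfront_cost=40_000, monthly_cost=1_500),
        Option("hosted search", "buy", upfront_cost=5_000, monthly_cost=3_000),
    ]
    for horizon in (12, 36):
        print(horizon, comparison_matrix(options, horizon))
```

With these invented numbers the hosted option is cheaper over 12 months and the in-house option over 36, which is why the horizon belongs in the matrix alongside the costs.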

Step 4 – Solution convergence

Start narrowing down your options.

Remember that buying does not mean out-of-the-box instant magic. There are always build costs associated with buying:

  • Sandboxing and initial technical vetting
  • Integration and setup
  • Configuration and fine tuning
  • Operational training and staff onboarding

Similarly, building does not necessarily mean that everything is made from scratch, but it does mean that you will assume the costs of ongoing maintenance, scaling, and debugging. You will also need to train staff and develop new operational processes.

Step 5 – Build or buy or both

Choose a primary and secondary solution option for each feature. This way, you will have a backup plan if the primary solution does not pan out. It is absolutely critical that you involve your team during the selection process and make the selection criteria transparent.

Step 6 – Develop guidelines for reassessment

The solution that you’ve selected for day 1 of your product will likely not fit your product at day 600. This is okay, but we must be able to anticipate and preempt any future scaling issues. To do this, set both quantitative and qualitative benchmarks for triggering a build vs buy scaling reassessment. For example, we’re confident that our current architectural solution allows us to handle up to 500k concurrent connections with ease, but our current growth model forecasts 2m connections in 8 months. When we start to near the 300k mark, then this will trigger another build vs buy assessment so we can preempt any issues at scale. This reassessment should include:

  • What have we learned about the needs of our product in the past X months?
  • What has been more difficult than anticipated? What has been easier?
  • How has our resource and knowledge pool shifted?
  • Have our product’s core competencies shifted?
  • Is there anything new and better out there?
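
The quantitative benchmark from the example above (comfortable up to 500k concurrent connections, reassess when nearing 300k) can be encoded as a simple check that runs against your metrics. This is a minimal Python sketch; the function names and default values mirror the example and are assumptions, not recommendations.

```python
def needs_reassessment(current_connections: int, trigger_at: int = 300_000) -> bool:
    """True once usage reaches the agreed reassessment trigger."""
    return current_connections >= trigger_at


def headroom(current_connections: int, comfortable_capacity: int = 500_000) -> int:
    """Connections left before the current solution's comfortable limit."""
    return comfortable_capacity - current_connections


if __name__ == "__main__":
    for observed in (120_000, 310_000):
        print(observed, needs_reassessment(observed), headroom(observed))
```

Setting the trigger well below the capacity limit leaves time to run the full assessment before the limit is actually reached.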

Final Thoughts – Try It Your Way

Well, this looks like a lot of work. It may even take a day or multiple days to assess a feature. But realistically, when we take into account the full lifecycle of your product, a few upfront days can save you months and lots of money down the road. Those few days may also make or break your product.

Customize your build vs buy assessment process to meet your organization’s needs. Though a large enterprise is way different than a startup, the assessment metrics remain very similar. Add or remove metrics, codify a more refined process, or make your own from scratch.

Either way, it is important to remember that building a successful product is very hard, so don’t make it harder on yourself than necessary. Let your decision be driven by choosing the right solution for your product, rather than the right solution for you.

Spotlight Article: What do you mean by “Event-Driven”? by Martin Fowler

In his article, Martin Fowler discusses the meaning of ‘event-driven’ and all its nuances.  He tries to make sense of the various patterns that make up the event-driven landscape.

Towards the end of last year I attended a workshop with my colleagues in ThoughtWorks to discuss the nature of “event-driven” applications. Over the last few years we’ve been building lots of systems that make a lot of use of events, and they’ve been often praised, and often damned. Our North American office organized a summit, and ThoughtWorks senior developers from all over the world showed up to share ideas.

The biggest outcome of the summit was recognizing that when people talk about “events”, they actually mean some quite different things. So we spent a lot of time trying to tease out what some useful patterns might be. This note is a brief summary of the main ones we identified.

Full Article

.NET/C# Realtime Resources

This section highlights the realtime resources available for .NET / C# developers.

Realtime .NET/C# Libraries

SignalR: Incredibly simple real-time web for .NET  – ASP.NET SignalR is a library for ASP.NET developers that makes it incredibly simple to add real-time web functionality to your applications. What is “real-time web” functionality? It’s the ability to have your server-side code push content to the connected clients as it happens, in real-time.

ASP.NET Core SignalR: Incredibly simple real-time web for ASP.NET Core  – ASP.NET Core SignalR is a new library for ASP.NET Core developers that makes it incredibly simple to add real-time web functionality to your applications. What is “real-time web” functionality? It’s the ability to have your server-side code push content to the connected clients as it happens, in real-time. You can watch an introductory presentation here – Introducing ASP.NET Core Sockets.  This project is part of ASP.NET Core. You can find samples, documentation and getting started instructions for ASP.NET Core at the Home repo.

Awesome Dotnet: A collection of awesome .NET libraries, tools, frameworks and software – A collection of awesome .NET libraries, tools, frameworks, and software.

.NET Websocket-Manager: Real-Time library for ASP .NET Core – This is an Asp .Net Core middleware that provides real-time functionality to .NET Core applications. To the core, it is a WebSocket middleware for Asp .Net Core with TypeScript / JavaScript client and .Net Core client that supports the client and the server invoking each others’ methods.

C# library for Firebase Realtime Database – Simple wrapper on top of the Firebase Realtime Database REST API. Among other things, it supports the streaming API, which you can use for realtime notifications. For authenticating with Firebase, check out the Firebase Authentication library and related blog post.

Rocket.Chat: A Rocket.Chat realtime managed .Net driver and bot – A Rocket.Chat real-time, managed, .Net driver, and bot.

GTFS Realtime Bindings: .NET GTFS-realtime Language Bindings  – Provides .NET classes generated from the GTFS-realtime Protocol Buffer specification. These classes will allow you to parse a binary Protocol Buffer GTFS-realtime data feed into C# objects.

Spreads: Series and Panels for Real-time and Exploratory Analysis of Data Streams – Spreads is an ultra-fast library for complex event processing and time series manipulation. It can process tens of millions of items per second per thread, handling historical and real-time data in the same fashion, which allows you to build and test analytical systems on historical data and use the same code for processing real-time data.

OpenRA: Open Source real-time strategy game engine  – A Libre/Free Real Time Strategy game engine supporting early Westwood classics, such as Command & Conquer: Red Alert written in C# using SDL and OpenGL. Runs on Windows, Linux, *BSD and Mac OS X.

CompBench: Benchmark for native C# realtime compression libraries  – This is a tiny benchmark program I wrote a couple years ago for my personal use. It generates a few data files and feeds them to different compression libraries to measure compression ratio and speed.

How to add Real-time Data to your .NET Application


  • Author: Bitovi, Brian Moschel
  • April 2017



Web applications have increasingly turned to real-time data to provide more dynamic and useful features – for example chat, collaborative editing, and real-time analytics. This trend is evident in the .NET world. While .NET is great, real-time .NET is even better.

Similar to the popularity of AJAX leading to more single-page applications and fewer page refreshes, the recent addition of WebSockets and similar real-time protocols in mainstream browsers has led to more real-time data connections and fewer “request data on page load and force the user to refresh if they want updated data” applications.

In this article, you’ll learn a simple way to add real-time functionality to your .NET application. The article will introduce two technologies — SignalR on the server and can-connect-signalr on the client — which make setting up real-time connections both simple and quick. We’ll show how to use both of these libraries by making a simple chat application.

Real-Time Web Apps and .NET. What are your options?


  • Author: Nexmo, Phil Leggetter
  • May 2016



So many applications now offer some form of real-time UX and real-time functionality is becoming increasingly essential as technology trends evolve. Notifications and activity streams in Facebook, Twitter, news and sports apps; real-time location tracking in Uber and most other taxi (logistics) apps; real-time collaboration in Google Docs and Microsoft Office 365 online. What sort of experience would chat apps like Slack, HipChat, WhatsApp, Viber or WeChat offer if messaging weren’t instantaneous? And you can be sure that bots will be powered by real-time technologies.

So, in order to meet user expectations and deliver innovative solutions that align with technology trends, you’re going to need to make use of real-time technologies.

If you build your apps using a .NET stack and you want to add real-time communications functionality to a .NET web app, what considerations should you take into account when choosing a real-time solution? What .NET frameworks or solutions exist? Should you restrict yourself to .NET? If not, how do you integrate with another technology?

Real-time applications using ASP.NET Core, SignalR & Angular


  • Author: Christos Sakell
  • October 2016



SignalR has been out for a long time but ASP.NET Core and Angular 2 haven’t. In this post we’ll see what it takes to bind all those frameworks and libraries together and build a real-time application. This is not an Angular tutorial nor a SignalR one. Because the final project associated with this post contains code that we have already seen in previous posts, I will only explain the parts that you actually need to know in order to build a real-time application. This is why I strongly recommend that you download the Live-Game-Feed app and study the code along with me without typing it. Here’s what we’ll see in more detail...

Realtime Infrastructure Services

  • Realtime API Infrastructure – Realtime API infrastructure specifically allows developers to build realtime data push into their existing APIs.  Typically, you would not need to modify your existing API contracts, as the streaming server would serve as a proxy. The proxy design allows these services to fit nicely within an API stack. This means it can inherit other facilities from your REST API, such as authentication, logging, throttling, etc. It can be combined with an API management system.  In the case of WebSocket messages being proxied out as HTTP requests, the messages may be handled statelessly by the backend. Messages from a single connection can even be load balanced across a set of backend instances.
    • Fanout/Pushpin – Fanout is a real-time API development kit that helps you push data to connected devices easily. Fanout is a cross between a reverse proxy and a message broker. Pushpin is the open source version.
    • – a SaaS API proxy tool that converts standard API requests into a streaming API. In other words, it provides a proxy as a service for any HTTP API by polling and acting as a streaming API.
    • LiveResource – LiveResource is a protocol specification and JavaScript reference library for receiving live updates of web resources. It has the following principles:
  • Realtime Application Infrastructure – Realtime app infrastructure sends data to browsers and clients. It typically uses pub/sub messaging, webhooks, and/or websockets — and is separate from an application or service’s main API.
    • Firebase – Firebase is a BaaS (Backend-as-a-Service) that allows developers to create web applications with no server-side programming required.
    • Pubnub – PubNub is a programmable Data Stream Network (DSN) and realtime infrastructure-as-a-service (IaaS) company. Primarily, they are a messaging solution hosted on a cloud service that allows developers to publish data instantly to one or multiple devices.
    • Pusher – Pusher is a hosted service that allows developers to add realtime bi-directional functionality via WebSockets (with HTTP-based fallbacks) to web and mobile apps.
    • Ably – Ably is a realtime data delivery platform that provides creators the tools to create, deliver, and manage projects. Their main realtime functionality consists of pub/sub, presence, authentication, encryption, and connection state recovery.