
Supercharging Kafka: Enable Realtime Web Streaming by Adding Pushpin

Exposing Kafka messages via a public HTTP streaming API

Matt Butler

Apache Kafka is the new hotness when it comes to adding realtime messaging capabilities to your system. At its core, it is an open source, distributed messaging system that uses a publish-subscribe model for building realtime data pipelines. But, more broadly speaking, it is a distributed, horizontally scalable commit log.

In a Kafka cluster, you will have topics, producers, consumers, and brokers:

  • Topics — A categorization for a group of messages
  • Producers — Push messages into a Kafka topic
  • Consumers — Pull messages off of a Kafka topic
  • Kafka Broker — A single Kafka node
  • Kafka Cluster — A collection of Kafka brokers

Take a deep dive into Kafka here.

Overall, Kafka provides fast, highly scalable and redundant messaging through a publish-subscribe model.

A pub-sub model is a messaging pattern where publishers categorize published messages into topics without knowledge of which subscribers would receive those messages (if any). Likewise, subscribers express interest in one or more topics and only receive messages that are of interest, without knowing anything about the publishers (source).
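
To make the pattern concrete, here is a minimal, hypothetical sketch of a producer and consumer using the confluent-kafka Python client (the client library, broker address, and topic name are illustrative assumptions, not part of the original example):

from confluent_kafka import Producer, Consumer

# Producer: push a message into the "test" topic
producer = Producer({'bootstrap.servers': 'localhost:9092'})
producer.produce('test', b'hello')
producer.flush()

# Consumer: pull messages off of the "test" topic
consumer = Consumer({
    'bootstrap.servers': 'localhost:9092',
    'group.id': 'mygroup',
    'auto.offset.reset': 'earliest',
})
consumer.subscribe(['test'])
while True:
    msg = consumer.poll(1.0)
    if msg is None or msg.error():
        continue
    print(msg.topic(), msg.value().decode('utf-8'))

Note that the broker never pushes to consumers; each consumer polls the topic at its own pace, which is part of why Kafka alone is awkward to expose directly to web clients.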

Kafka Strengths

As a messaging system, Kafka has some transformative strengths that have catalyzed its rising popularity:

  1. Realtime Data Pipeline — Can handle realtime messaging throughput with high concurrency
  2. High Throughput — Supports high-velocity, high-volume data (thousands of messages per second)
  3. Fault Tolerance — Due to its distributed nature, it is relatively resistant to node failure within a cluster
  4. Low Latency — Handles thousands of messages with millisecond latency
  5. Scalability — Kafka’s distributed design lets you add nodes without downtime, facilitating partitioning and replication

Kafka Limits

Due to its intrinsic architecture, Kafka is not optimized to provide API consumers with friendly access to realtime data. As such, many orgs are hesitant to expose their Kafka endpoints publicly.

In other words, it is difficult to expose Kafka across a public API boundary if you want to use traditional web protocols (like WebSockets or HTTP).

To overcome this limit, we can integrate Pushpin into our Kafka ecosystem to handle more traditional protocols and expose our public API in a more accessible and standardized way.

Pushpin + Kafka

Server-sent events (SSE) is a technology where a browser receives automatic updates from a server via an HTTP connection (standardized as part of HTML5). Kafka doesn’t natively support this protocol, so we need to add an additional service to make this happen.

Pushpin’s primary value prop is that it is an open source solution that enables realtime push — a requisite of evented APIs (GitHub Repo). At its core, it is a reverse proxy server that makes it easy to implement WebSocket, HTTP streaming, and HTTP long-polling services. Structurally, Pushpin communicates with backend web applications using regular, short-lived HTTP requests.

Integrating Pushpin and Kafka provides you with some notable benefits:

  • Resource-Oriented API — Provides a more logical resource-oriented API to consumers that fits in with an existing REST API. In other words, you can expose data over standardized, more-secure protocols.
  • Authentication — Reuses existing authentication tokens and data formats.
  • API Management — Harnesses your existing API management system or load balancers.
  • Web Tier Scalability — If the number of your web consumers grows substantially, it may be more economical and performant to scale out your web tier rather than your Kafka cluster.

In this next example, we will expose Kafka messages via an HTTP streaming API.

Building Kafka Server-Sent Events

This example project reads messages from a Kafka service and exposes the data over a streaming API using the Server-Sent Events (SSE) protocol over HTTP. It is written in Python & Django, and relies on Pushpin for managing the streaming connections.

How it Works

In this demo, we drop a Pushpin instance in front of our Django backend. A relay process acts as a Kafka consumer: it subscribes to all topics and re-publishes received messages to Pushpin, which pushes them out to connected clients. Clients listen for events via Pushpin.

More granularly, we use views.py to set up an SSE endpoint, while relay.py handles the messaging input and output.
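
As a rough sketch (not necessarily the project’s exact code), the SSE endpoint in views.py can be as simple as a normal Django view that returns GRIP instruction headers, telling Pushpin to hold the connection open as a stream subscribed to a channel named after the topic:

# views.py: hypothetical sketch of the SSE endpoint
from django.http import HttpResponse

def events(request, topic):
    # Tell Pushpin (via GRIP headers) to hold this connection open
    # as a stream and subscribe it to the channel for this topic.
    response = HttpResponse(content_type='text/event-stream')
    response['Grip-Hold'] = 'stream'
    response['Grip-Channel'] = topic
    return response

The backend request itself finishes immediately; Pushpin keeps the client connection open and injects any data later published to that channel.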

1. First, we need to set up a virtualenv and install dependencies:

virtualenv --python=python3 venv
. venv/bin/activate
pip install -r requirements.txt

2. Create a suitable .env with Kafka and Pushpin settings:

KAFKA_CONSUMER_CONFIG={"bootstrap.servers":"localhost:9092","group.id":"mygroup"}
GRIP_URL=http://localhost:5561

3. Run the Django server:

python manage.py runserver

4. Run Pushpin:

pushpin --route="* localhost:8000"

5. Run the relay command:

python manage.py relay

The relay command sets up a Kafka consumer according to KAFKA_CONSUMER_CONFIG, subscribes to all topics, and re-publishes received messages to Pushpin, wrapped in SSE format.
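
A simplified sketch of that relay loop might look like the following, assuming the confluent-kafka client and django-grip’s publish helper (the real relay.py may differ in its details):

# relay.py: hypothetical sketch of the relay loop
import json
import os

from confluent_kafka import Consumer
from django_grip import publish
from gripcontrol import HttpStreamFormat

config = json.loads(os.environ['KAFKA_CONSUMER_CONFIG'])
consumer = Consumer(config)
consumer.subscribe(['^.*'])  # regex subscription: all topics

while True:
    msg = consumer.poll(1.0)
    if msg is None or msg.error():
        continue
    # Wrap the Kafka message in SSE format and publish it to Pushpin,
    # using the topic name as the GRIP channel.
    sse = 'event: message\ndata: %s\n\n' % msg.value().decode('utf-8')
    publish(msg.topic(), HttpStreamFormat(sse))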

Clients can listen to events by making a request (through Pushpin) to /events/{topic}/:

curl -i http://localhost:7999/events/test/

The output stream might look like this:

HTTP/1.1 200 OK
Content-Type: text/event-stream
Transfer-Encoding: chunked
Connection: Transfer-Encoding

event: message
data: hello

event: message
data: world
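
Beyond curl, any HTTP client that supports streaming responses will work. For example, a rough Python sketch using the requests library (same URL and topic as above):

import requests

# Stream SSE events for the "test" topic through Pushpin
resp = requests.get('http://localhost:7999/events/test/', stream=True)
for line in resp.iter_lines(decode_unicode=True):
    if line and line.startswith('data:'):
        print(line[len('data:'):].strip())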

Repo on GitHub

How Blockchain and Realtime APIs are Totally Changing Healthcare

In his article, Philip Levinson discusses how realtime data has become essential for advances in healthcare software — specifically as applied to new blockchain technology.

As with other healthcare companies and organizations, one of the keys to Oscar’s model depends on “using real-time data to get actionable insights in front of members and physicians,” says Schlosser.

As a result, blockchain “has the power to revive the healthcare industry by reorganizing operations, generating new business models and integrating patients’ medical records,” according to Zacks.

The latter two of these represent the ways blockchain is most likely to change healthcare in the short run.

Full Article 

High scalability with Fanout and Fastly

Fanout Cloud is for high scale data push. Fastly is for high scale data pull. Many realtime applications need to work with data that is both pushed and pulled, and thus can benefit from using both of these systems in the same application. Fanout and Fastly can even be connected together!

[Diagram: Fanout + Fastly architecture]

Using Fanout and Fastly in the same application, independently, is pretty straightforward. For example, at initialization time, past content could be retrieved from Fastly, and Fanout Cloud could provide future pushed updates. What does it mean to connect the two systems together though? Read on to find out.

Proxy chaining

Since Fanout and Fastly both work as reverse proxies, it is possible to have Fanout proxy traffic through Fastly rather than sending it directly to your origin server. This provides some unique benefits:

  1. Cached initial data. Fanout lets you build API endpoints that serve both historical and future content, for example an HTTP streaming connection that returns some initial data before switching into push mode. Fastly can provide that initial data, reducing load on your origin server.
  2. Cached Fanout instructions. Fanout’s behavior (e.g. transport mode, channels to subscribe to, etc.) is determined by instructions provided in origin server responses, usually in the form of special headers such as Grip-Hold and Grip-Channel. Fastly can cache these instructions/headers, again reducing load on your origin server.
  3. High availability. If your origin server goes down, Fastly can serve cached data and instructions to Fanout. This means clients could connect to your API endpoint, receive historical data, and activate a streaming connection, all without needing access to the origin server.

Network flow

Suppose there’s an API endpoint /stream that returns some initial data and then stays open until there is an update to push. With Fanout, this can be implemented by having the origin server respond with instructions:

HTTP/1.1 200 OK
Content-Type: text/plain
Content-Length: 26
Grip-Hold: stream
Grip-Channel: updates

{"data": "current value"}

When Fanout Cloud receives this response from the origin server, it converts it into a streaming response to the client:

HTTP/1.1 200 OK
Content-Type: text/plain
Transfer-Encoding: chunked
Connection: Transfer-Encoding

{"data": "current value"}

The request between Fanout Cloud and the origin server is now finished, but the request between the client and Fanout Cloud remains open. Here’s a sequence diagram of the process:

[Sequence diagram: request flow through Fanout Cloud to the origin server]

Since the request to the origin server is just a normal short-lived request/response interaction, it can alternatively be served through a caching server such as Fastly. Here’s what the process looks like with Fastly in the mix:

[Sequence diagram: request flow with Fastly between Fanout Cloud and the origin]

Now, guess what happens when the next client makes a request to the /stream endpoint?

[Sequence diagram: second client request served from the Fastly cache]

That’s right, the origin server isn’t involved at all! Fastly serves the same response to Fanout Cloud, with those special HTTP headers and initial data, and Fanout Cloud sets up a streaming connection with the client.

Of course, this is only the connection setup. To send updates to connected clients, the data must be published to Fanout Cloud.

We may also need to purge the Fastly cache, if an event that triggers a publish causes the origin server response to change as well. For example, suppose the “value” that the /stream endpoint serves has been changed. The new value could be published to all current connections, but we’d also want any new connections that arrive afterwards to receive this latest value as well, rather than the older cached value. This can be solved by purging from Fastly and publishing to Fanout Cloud at the same time.
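
In Python, that combined purge-and-publish step might look roughly like this, assuming a Fastly API token with surrogate-key purging and the gripcontrol library pointed at a Fanout Cloud realm (all IDs and keys below are placeholders):

import requests
from gripcontrol import GripPubControl, HttpStreamFormat

# Placeholders: substitute your own service ID, token, realm, and key
FASTLY_SERVICE_ID = 'your-service-id'
FASTLY_API_TOKEN = 'your-api-token'

pub = GripPubControl({
    'control_uri': 'https://api.fanout.io/realm/your-realm',
    'control_iss': 'your-realm',
    'key': b'your-realm-key',
})

def purge_and_publish(surrogate_key, data):
    # Purge the cached /stream response from Fastly...
    requests.post(
        'https://api.fastly.com/service/%s/purge/%s'
        % (FASTLY_SERVICE_ID, surrogate_key),
        headers={'Fastly-Key': FASTLY_API_TOKEN})
    # ...and push the new value to all open connections via Fanout Cloud
    pub.publish_http_stream('updates', HttpStreamFormat(data + '\n'))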

Here’s a (long) sequence diagram of a client connecting, receiving an update, and then another client connecting:

[Sequence diagram: client connects, receives an update, then a second client connects]

At the end of this sequence, the first and second clients have both received the latest data.

Rate-limiting

One gotcha with purging at the same time as publishing: if your data rate is high, the constant purging can negate the caching benefit of using Fastly.

The sweet spot is data that is accessed frequently (many new visitors per second), changes infrequently (on the order of minutes), and needs changes delivered instantly (sub-second). An example could be a live blog. In that case, most requests can be served from cache.

If your data changes multiple times per second (or has the potential to change that fast during peak moments), and you expect frequent access, you really don’t want to be purging your cache multiple times per second. The workaround is to rate-limit your purges. For example, during periods of high throughput, you might purge and publish at a maximum rate of once per second or so. This way the majority of new visitors can be served from cache, and the data will be updated shortly after.
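
A minimal sketch of such a rate limiter in Python, coalescing purge-and-publish calls to at most one per interval (purge_and_publish here stands for whatever combined purge/publish routine you use):

import threading
import time

class PurgePublishLimiter:
    # Coalesce calls to an action to at most one per interval; if calls
    # arrive too fast, only the latest pending args are flushed once
    # the interval has elapsed.
    def __init__(self, action, interval=1.0):
        self.action = action
        self.interval = interval
        self.lock = threading.Lock()
        self.last_run = 0.0
        self.pending = None

    def request(self, *args):
        with self.lock:
            now = time.time()
            if now - self.last_run >= self.interval:
                self.last_run = now
                self.action(*args)
            else:
                self.pending = args
                delay = self.interval - (now - self.last_run)
                threading.Timer(delay, self._flush).start()

    def _flush(self):
        with self.lock:
            if self.pending is not None:
                self.last_run = time.time()
                self.action(*self.pending)
                self.pending = None

# Usage: limiter = PurgePublishLimiter(purge_and_publish)
#        limiter.request('stream', 'new value')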

An example

We created a Live Counter Demo to show off this combined Fanout + Fastly architecture. Requests first go to Fanout Cloud, then to Fastly, then to a Django backend server which manages the counter API logic. Whenever a counter is incremented, the Fastly cache is purged and the data is published through Fanout Cloud. The purge and publish process is also rate-limited to maximize caching benefit.

The code for the demo is on GitHub.

Examining Mature APIs (Slack, Stripe, Box)

In our previous blog post, we discussed the disconnect between API pricing plans where you pay monthly for a set number of calls and regular developer use cases. We think competition will drive new pricing models that are more developer friendly – and a potential approach could be charging for calls based on their business value. Examining webhook events available via API from Stripe, Slack, and Box gives us a forward look into how this could work.

What’s a mature API?

Forbes nicely summarizes where they see API development going in this graphic (ignore the “customer-driven platform revolution” portion):

[Forbes graphic: the evolution of API development]

They make a valid point that APIs become more valuable as the data that flows from them becomes bi-directional – APIs are not only returning data based on calls, but actively pushing out data based on API activity.

This data push generally starts around activity with high business value – so we’re going to examine APIs from Stripe, Slack, and Box to get an idea of what events they make available.

Slack has a separate “Events API”

Slack has chosen to implement a separate Events API for developers who want to build apps that respond to events within Slack. Here’s the full list of event types that they can push in realtime as they happen.

Looking at this list in more detail, it’s focused around key messaging and collaboration activities:

  • Creating and updating channels
  • Uploading, sharing, and commenting on files
  • Messages being posted to various channels

Box uses event triggers

Box uses webhooks with event triggers attached to Box files and folders, which monitor activity and notify you when events occur. Here’s their full list of events for files and folders.

As expected for Box, events are focused around file management and collaboration:

  • Uploading, previewing, and downloading files
  • Comment and task assignment creation and updating

Stripe sends a variety of events

Stripe sends a wide variety of events around payments, keyed to both internal and external usage:

  • Account creation and updating
  • Product or plan creation
  • Card charges and updates
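
Receiving any of these events is typically just a webhook endpoint on your side. A hypothetical Django sketch reacting to a charge event might look like this (real deployments should also verify Stripe’s webhook signatures, omitted here):

# Hypothetical Django webhook receiver for Stripe events
import json

from django.http import HttpResponse
from django.views.decorators.csrf import csrf_exempt

@csrf_exempt
def stripe_webhook(request):
    event = json.loads(request.body)
    # Each event carries a type, e.g. "charge.succeeded" or "account.updated"
    if event.get('type') == 'charge.succeeded':
        charge = event['data']['object']
        print('charge succeeded:', charge.get('id'))
    return HttpResponse(status=200)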

What does it mean?

The events that these mature APIs have chosen to make available for realtime push have substantial business value for developers building apps on their functionality. As more APIs begin to offer data push, they may move to a blended pricing model that charges more for these high-value events. We’re interested to see what happens!

Spotlight Article: 2017 Is Quickly Becoming The Year Of The API Economy by Louis Columbus

In his article, Louis Columbus discusses how the urgency to create new business models has catalyzed the proliferation of public facing / monetizable APIs.

This year more CIOs will have their bonuses tied to how many new business models they help create with existing and planned IT platforms than ever before. This trend will accelerate over the next three years. CIOs and IT staffs need to start thinking about how they can become business strategists first, technicians and enablers of IT second. CIOs must create and launch new business models faster to keep their companies competitive. APIs are the fuel helping to make this happen.

Full Article

Spotlight Article: Bringing The API Deployment Landscape Into Focus by Kin Lane

In this article, Kin Lane (API Evangelist) dives into the current landscape of APIs and the host of definitions that drive the industry.

I am finally getting the time to invest more into the rest of my API industry guides, which involves deep dives into core areas of my research like API definitions, design, and now deployment. The outline for my API deployment research has begun to come into focus and looks like it will rival my API management research in size.

With this release, I am looking to help onboard some of my less technical readers with API deployment. Not the technical details, but the big picture, so I wanted to start with some simple questions, to help prime the discussion around API development.

Where? – Where are APIs being deployed? On-premise and in the clouds, traditional website hosting, and even containerized and serverless API deployment.

How? – What technologies are being used to deploy APIs? From using spreadsheets, document and file stores, or the central database, to thinking smaller with microservices, containers, and serverless.

Who? – Who will be doing the deployment? Of course, IT and developers groups will be leading the charge, but increasingly business users are leveraging new solutions to play a significant role in how APIs are deployed.

Full Source