Yeah, ok, it's a touch bold to talk about something being
the thing of the year as early as January, but the
potential of the web streams API has gotten me all excited.
Promises are a great way to represent async delivery of a single
value, but what about representing multiple values? Or multiple
parts of a larger value that arrives gradually?
Say we wanted to fetch and display an image. That involves:
Fetching some data from the network
Processing it, turning it from compressed data into raw
pixel data
Rendering it
We could do this one step at a time, or we could stream it:
If we handle & process the response bit by bit, we get to
render some of the image way sooner. We even get to render
the whole image sooner, because the processing can happen in
parallel with the fetching. This is streaming! We're
reading a stream from the network, transforming
it from compressed data to pixel data, then writing it to
the screen.
You could achieve something similar with events, but streams
come with benefits:
Start/end aware - although streams can be
infinite
Buffering of values that haven't been read
- whereas events that happen before listeners are attached are
lost
Chaining via piping - you can pipe streams
together to form an async sequence
Built-in error handling - errors will be
propagated down the pipe
Cancellation support - and that
cancellation message is passed back up the pipe
Flow control - you can react to the speed
of the reader
That last one is really important. Imagine we were using streams
to download and display a video. If we can download and decode 200
frames of video per second, but only want to display 24 frames a
second, we could end up with a huge backlog of decoded frames and
run out of memory.
This is where flow control comes in. The stream that's handling
the rendering is pulling frames from the decoder stream 24 times a
second. The decoder notices that it's producing frames faster than
they're being read, and slows down. The network stream notices that
it's fetching data faster than it's being read by the decoder, and
slows the download.
Because of the tight relationship between stream & reader, a
stream can only have one reader. However, an unread stream can be
"teed", meaning it's split into two streams that receive the same
data. In this case, the tee manages the buffer across both
readers.
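As a minimal sketch (assuming tee() is available on fetch's response stream, which may vary by browser version), splitting a response looks like this:

fetch(url).then(response => {
  // tee() gives back two streams that receive the same chunks
  var streams = response.body.tee();
  var streamA = streams[0];
  var streamB = streams[1];

  // Each stream needs its own reader
  var readerA = streamA.getReader();
  var readerB = streamB.getReader();

  // readerA and readerB can now be read independently, at
  // different speeds - the tee buffers the difference
});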
Ok, that's the theory, and I can see you're not ready to hand
over that 2016 trophy just yet, but stay with me.
The browser streams loads of things by default. Whenever you see
the browser displaying parts of a page/image/video as it's
downloading, that's thanks to streaming. However, it's only
recently, thanks to a standardisation effort, that
streams are becoming exposed to script.
Streams + the fetch API
Response
objects, as defined by the fetch spec, let
you read the response as a variety of formats, but
response.body gives you access to the underlying
stream. response.body is supported in the current
stable version of Chrome.
Say I wanted to get the content-length of a response, without
relying on headers, and without keeping the whole response in
memory. I could do it with streams:
// fetch() resolves once headers have been received
fetch(url).then(response => {
// response.body is a readable stream.
// Calling getReader() gives us exclusive access to
// the stream's content
var reader = response.body.getReader();
var bytesReceived = 0;
// read() resolves when content has been received
reader.read().then(function processResult(result) {
// Result objects contain two properties:
// done - true if the stream has already given
// you all its data.
// value - some data. Always undefined when
// done is true.
if (result.done) {
console.log("Fetch complete");
return;
}
// result.value for fetch streams is a Uint8Array
bytesReceived += result.value.length;
console.log('Received', bytesReceived, 'bytes of data so far');
// Read some more, and call this function again
return reader.read().then(processResult);
});
});
The demo fetches 1.3mb of gzipped HTML from the server, which
decompresses to 7.7mb. However, the result isn't held in memory.
Each chunk's size is recorded, but the chunks themselves are
garbage collected.
result.value is whatever the creator of the stream
provides, which can be anything: a string, number, date, ImageData,
DOM element… but in the case of a fetch stream it's always a
Uint8Array of binary data. The whole response is all of
those Uint8Arrays joined together. If you want the
response as text, you can use TextDecoder:
var decoder = new TextDecoder();
var reader = response.body.getReader();
// read() resolves when content has been received
reader.read().then(function processResult(result) {
if (result.done) return;
console.log(
decoder.decode(result.value, {stream: true})
);
// Read some more, and recall this function
return reader.read().then(processResult);
});
{stream: true} means the decoder will keep a buffer
if result.value ends mid-way through a UTF-8 code
point, since a character like ♥ is represented as 3 bytes:
[0xE2, 0x99, 0xA5].
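For example, if those three bytes arrive split across two reads:

var decoder = new TextDecoder();

// The first chunk ends mid-character, so nothing comes out yet…
decoder.decode(new Uint8Array([0xE2, 0x99]), {stream: true}); // ""
// …the final byte completes the character
decoder.decode(new Uint8Array([0xA5]), {stream: true}); // "♥"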
TextDecoder is currently a little clumsy, but it's
likely to become a transform stream in the future (once transform
streams are defined). A transform stream is an object with a
writable stream on .writable and a readable stream on
.readable. It takes chunks into the writable,
processes them, and passes something out through the readable.
Using transform streams will look like this:
var reader = response.body
.pipeThrough(new TextDecoder()).getReader();
reader.read().then(result => {
// result.value will be a string
});
The browser should be able to optimise the above, since both the
response stream and TextDecoder transform stream are
owned by the browser.
Cancelling a fetch
A stream can be cancelled using stream.cancel() (so
response.body.cancel() in the case of fetch) or
reader.cancel(). Fetch reacts to this by stopping the
download.
View
demo (also, note the amazing random URL JSBin gave
me).
This demo searches a large document for a term, only keeps a
small portion in memory, and stops fetching once a match is
found.
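The pattern looks roughly like this - foundMatch and the URL are stand-ins for whatever the demo actually does:

fetch('/big-document.html').then(response => {
  var reader = response.body.getReader();
  var decoder = new TextDecoder();

  return reader.read().then(function process(result) {
    if (result.done) return;

    var chunk = decoder.decode(result.value, {stream: true});

    // foundMatch is a stand-in for the demo's search logic
    if (foundMatch(chunk)) {
      // We've got what we came for - stop the download
      return reader.cancel();
    }
    // Keep reading
    return reader.read().then(process);
  });
});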
Anyway, this is all so 2015. Here's the fun new stuff…
Creating your own
readable stream
In Chrome Canary with the "Experimental web platform features"
flag enabled, you can now create your own streams.
var stream = new ReadableStream({
start(controller) {},
pull(controller) {},
cancel(reason) {}
}, queuingStrategy);
start is called straight away. Use this to set
up any underlying data sources (meaning, wherever you get your
data from, which could be events, another stream, or just a
variable like a string). If you return a promise from this and
it rejects, it will signal an error through the stream.
pull is called when your stream's buffer isn't
full, and is called repeatedly until it's full. Again, if you
return a promise from this and it rejects, it will signal an
error through the stream. Also, pull will not be
called again until the returned promise fulfills.
cancel is called if the stream is cancelled.
Use this to cancel any underlying data sources.
queuingStrategy defines how much this stream
should ideally buffer, defaulting to one item - I'm not
going to go into depth on this here, the spec has
more details.
As for controller:
controller.enqueue(whatever) - queue data in
the stream's buffer.
controller.close() - signal the end of the
stream.
controller.error(e) - signal a terminal
error.
controller.desiredSize - the amount of buffer
remaining, which may be negative if the buffer is over-full.
This number is calculated using the
queuingStrategy.
So if I wanted to create a stream that produced a random number
every second, until it produced a number > 0.9, I'd
do it like this:
var interval;
var stream = new ReadableStream({
start(controller) {
interval = setInterval(() => {
var num = Math.random();
// Add the number to the stream
controller.enqueue(num);
if (num > 0.9) {
// Signal the end of the stream
controller.close();
clearInterval(interval);
}
}, 1000);
},
cancel() {
// This is called if the reader cancels,
// so we should stop generating numbers
clearInterval(interval);
}
});
See
it running. Note: You'll need Chrome
Canary with
chrome://flags/#enable-experimental-web-platform-features
enabled.
It's up to you when to pass data to
controller.enqueue. You could just call it whenever
you have data to send, making your stream a "push source", as
above. Alternatively you could wait until pull is
called, then use that as a signal to collect data from the
underlying source and then enqueue it, making your
stream a "pull source". Or you could do some combination of the
two, whatever you want.
Obeying controller.desiredSize means the stream is
passing data along at the most efficient rate. This is known as
having "backpressure support", meaning your stream reacts to the
read-rate of the reader (like the video decoding example earlier).
However, ignoring desiredSize won't break anything
unless you run out of device memory. The spec has a good example of
creating
a stream with backpressure support.
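Here's a rough sketch of a pull source - getNextChunk is a made-up stand-in for however you'd get data from your underlying source:

var stream = new ReadableStream({
  pull(controller) {
    // pull is only called when the stream's buffer has room,
    // and isn't called again until this promise fulfills,
    // so data is only produced as fast as it's being read
    return getNextChunk().then(chunk => {
      if (chunk === null) {
        // No more data - signal the end of the stream
        controller.close();
        return;
      }
      controller.enqueue(chunk);
    });
  }
});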
Creating a stream on its own isn't particularly fun, and since
they're new, there aren't a lot of APIs that support them, but
there is one:
new Response(readableStream);
You can create an HTTP response object where the body is a
stream, and you can use these as responses from a service
worker!
Serving a string, slowly
View demo. Note: You'll need Chrome
Canary with
chrome://flags/#enable-experimental-web-platform-features
enabled.
You'll see a page of HTML rendering (deliberately) slowly. This
response is entirely generated within a service worker. Here's the
code:
// In the service worker:
self.addEventListener('fetch', event => {
var html = '…html to serve…';
var stream = new ReadableStream({
start(controller) {
var encoder = new TextEncoder();
// Our current position in `html`
var pos = 0;
// How much to serve on each push
var chunkSize = 1;
function push() {
// Are we done?
if (pos >= html.length) {
controller.close();
return;
}
// Push some of the html,
// converting it into a Uint8Array of UTF-8 data
controller.enqueue(
encoder.encode(html.slice(pos, pos + chunkSize))
);
// Advance the position
pos += chunkSize;
// push again in ~5ms
setTimeout(push, 5);
}
// Let's go!
push();
}
});
event.respondWith(new Response(stream, {
  headers: {'Content-Type': 'text/html'}
}));
});
When the browser reads a response body it expects to get chunks
of Uint8Array, and it fails if passed something else, such as
a plain string. Thankfully TextEncoder takes a
string and returns a Uint8Array of bytes representing
that string.
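For example:

var encoder = new TextEncoder();
encoder.encode('♥');     // Uint8Array [0xE2, 0x99, 0xA5]
encoder.encode('Hello'); // Uint8Array [0x48, 0x65, 0x6C, 0x6C, 0x6F]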
Like TextDecoder, TextEncoder should
become a transform stream in future.
Serving a transformed
stream
Like I said, transform streams haven't been defined yet, but you
can achieve the same result by creating a readable stream that
produces data sourced from another stream.
"Cloud" to "butt"
View demo. Note: You'll need Chrome
Canary with
chrome://flags/#enable-experimental-web-platform-features
enabled.
What you'll see is
this page (taken from the cloud computing article on Wikipedia)
but with every instance of "cloud" replaced with "butt". The
benefit of doing this as a stream is you can get transformed
content on the screen while you're still downloading the
original.
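Since transform streams aren't here yet, a rough sketch of the readable-stream approach looks like this - it glosses over the case where "cloud" is split across two chunks, and the URL is a stand-in:

// In the service worker:
self.addEventListener('fetch', event => {
  event.respondWith(
    fetch('cloud-article.html').then(response => {
      var reader = response.body.getReader();
      var decoder = new TextDecoder();
      var encoder = new TextEncoder();

      var stream = new ReadableStream({
        pull(controller) {
          return reader.read().then(result => {
            if (result.done) {
              controller.close();
              return;
            }
            // Decode the chunk, do the replacement, re-encode it
            var text = decoder.decode(result.value, {stream: true});
            controller.enqueue(
              encoder.encode(text.replace(/cloud/g, 'butt'))
            );
          });
        },
        cancel(reason) {
          // If the page stops reading, stop downloading
          reader.cancel(reason);
        }
      });

      return new Response(stream, {
        headers: {'Content-Type': 'text/html'}
      });
    })
  );
});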
Video codecs are really efficient, but videos don't autoplay on
mobile. GIFs autoplay, but they're huge. Well, here's a really
stupid solution:
View demo. Note: You'll need Chrome
Canary with
chrome://flags/#enable-experimental-web-platform-features
enabled.
Streaming is useful here as the first frame of the GIF can be
displayed while we're still decoding MPEG frames.
So there you go! A 26mb GIF delivered using only 0.9mb of MPEG!
Perfect! Except it isn't real-time, and uses a lot of CPU. Browsers
should really allow autoplaying of videos on mobile, especially if
muted, and it's something Chrome is working towards right now.
Full disclosure: I cheated somewhat in the demo. It downloads
the whole MPEG before it begins. I wanted to get it streaming from
the network, but I ran into an OutOfSkillError. Also,
the GIF really shouldn't loop while it's downloading, that's a bug
we're looking into.
Creating one stream from multiple sources to supercharge page
render times
This is probably the most practical application of service
worker + streams. The benefit is huge in terms of
performance.
A few months ago I built a demo of an offline-first
wikipedia. I wanted to create a truly progressive web-app that
worked fast, and added modern features as enhancements.
In terms of performance, the numbers I'm going to talk about are
based on a lossy 3g connection simulated using OSX's Network Link
Conditioner.
Without the service worker it displays content sent to it by the
server. I put a lot of effort into performance here, and it paid
off. With the service worker, first render is faster, but there's
a massive regression when it comes to rendering content.
The fastest way would be to serve the entire page from
the cache, but that involves caching all of Wikipedia. Instead, I
served a page that contained the CSS, JavaScript and header,
getting a fast initial render, then let the page's JavaScript set
about fetching the article content. And that's where I lost all the
performance - client-side rendering.
HTML renders as it downloads, whether it's served straight from
a server or via a service worker. But here I'm fetching the content
with JavaScript running on the page, then writing it to
innerHTML, bypassing the streaming parser. Because of
this, the content has to be fully downloaded before it can be
displayed, and that's where the two second regression comes from.
The more content you're downloading, the more the lack of streaming
hurts performance, and unfortunately for me, Wikipedia articles are
pretty big (the Google article is 100k).
This is why you'll see me whining about JavaScript-driven
web-apps and frameworks - they tend to throw away streaming as step
zero, and performance suffers as a result.
I tried to claw some performance back using prefetching and
pseudo-streaming. The pseudo-streaming is particularly hacky. The
page fetches the article content and reads it as a stream. Once it
receives 9k of content, it's written to innerHTML,
then it's written to innerHTML again once the rest of
the content arrives. This is horrible as it creates some elements
twice, but hey, it's worth it:
Server render: 0.73 seconds until initial render, 1.8 seconds until content render
Service worker client render: 0.1 seconds until initial render, 3.8 seconds until content render
…with hacks: 0.1 seconds until initial render, 2.5 seconds until content render
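For reference, the pseudo-streaming hack looks roughly like this - contentURL and articleContainer are stand-ins for the real thing:

fetch(contentURL).then(response => {
  var reader = response.body.getReader();
  var decoder = new TextDecoder();
  var html = '';
  var renderedEarly = false;

  return reader.read().then(function process(result) {
    if (result.done) {
      // Write everything (again) once the full content has arrived
      articleContainer.innerHTML = html;
      return;
    }
    html += decoder.decode(result.value, {stream: true});

    // Once ~9k has arrived, render what we have so far
    if (!renderedEarly && html.length > 9000) {
      renderedEarly = true;
      articleContainer.innerHTML = html;
    }
    return reader.read().then(process);
  });
});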
So the hacks improve things but it still lags behind server
render, which isn't really acceptable. Furthermore, content that's
added to the page using innerHTML doesn't quite behave
the same as regular parsed content. Notably, inline <script>s aren't executed.
View
demo. Note: You'll need Chrome Canary
with
chrome://flags/#enable-experimental-web-platform-features
enabled.
Using service worker + streams means you can get an
almost-instant first render, then beat a regular server render by
piping a smaller amount of content from the network. Content goes
through the regular HTML parser, so you get streaming, and none of
the behavioural differences you get with adding content to the DOM
manually.
Render time comparison
Crossing the streams
Because piping isn't supported yet, combining the streams has to
be done manually, making things a little messy:
var stream = new ReadableStream({
start(controller) {
// Get promises for response objects for each page part
// The start and end come from a cache
var startFetch = caches.match('/page-start.inc');
var endFetch = caches.match('/page-end.inc');
// The middle comes from the network, with a fallback
var middleFetch = fetch('/page-middle.inc')
.catch(() => caches.match('/page-offline-middle.inc'));
function pushStream(stream) {
// Get a lock on the stream
var reader = stream.getReader();
return reader.read().then(function process(result) {
if (result.done) return;
// Push the value to the combined stream
controller.enqueue(result.value);
// Read more & process
return reader.read().then(process);
});
}
// Get the start response
startFetch
// Push its contents to the combined stream
.then(response => pushStream(response.body))
// Get the middle response
.then(() => middleFetch)
// Push its contents to the combined stream
.then(response => pushStream(response.body))
// Get the end response
.then(() => endFetch)
// Push its contents to the combined stream
.then(response => pushStream(response.body))
// Close our stream, we're done!
.then(() => controller.close());
}
});
There are some templating languages, such as Dust.js, which stream
their output and can also handle streams as values within the
template, piping that content into the output and even
HTML-escaping it on the fly. All that's missing is support for web
streams.
The future for streams
Aside from readable streams, the streams spec is still being
developed, but what you can already do is pretty incredible. If
you're wanting to improve the performance of a content-heavy site
and provide an offline-first experience without rearchitecting,
constructing streams within a service worker will become the
easiest way to do it. It's how I intend to make this blog work
offline-first anyway!
Having a stream primitive on the web means we'll start to get
script access to all the streaming capabilities the browser already
has. Things like:
Gzip/deflate
Audio/video codecs
Image codecs
The streaming HTML/XML parser
It's still early days, but if you want to start preparing your
own APIs for streams, there's a
reference implementation that can be used as a polyfill in some
cases.
Streaming is one of the browser's biggest assets, and 2016 is
the year it's unlocked to JavaScript.