PHOBOSLAB

Decode It Like It's 1999

A few years ago I started to work on an MPEG1 Video decoder written entirely in JavaScript. Now I've finally found the time to clean up the library, improve its performance, make it more error resilient and modular, and add an MP2 Audio decoder and an MPEG-TS demuxer. This turns the library from a mere MPEG decoder into a full video player.

In this blog post I want to talk a bit about the challenges and the various interesting bits I discovered during the development of this library. You'll find a demo, the source, documentation and reasons to use JSMpeg over on the official website:

jsmpeg.com - Decode it like it's 1999

Refactoring

Recently, I needed to implement audio streaming into JSMpeg for a client and only then realized what a sorry state the library was in. It has grown quite a bit since its first release. A WebGL renderer, WebSocket client, progressive loading, benchmarking facilities and much more have been tacked on over the last few years. All kept in a single, monolithic class with conditionals bursting at the seams.

I decided to clean up this mess first by separating its logical components. I also sketched out what would be needed for the sound implementation: a Demuxer, MP2 decoder and Audio Output:

Plus some auxiliary classes:

Each of the components (apart from the Sources) has a .write(buffer) method to feed it with data. These components can then "connect" to a destination that receives the processed result. The complete flow through the library looks like this:

                 / -> MPEG1 Video Decoder -> Renderer
Source -> Demuxer  
                 \ -> MP2 Audio Decoder -> Audio Output

JSMpeg currently has 3 different implementations for the Source (AJAX, AJAX progressive and WebSocket) and there are 2 different Renderers (Canvas2D and WebGL). The rest of the library is agnostic to these – i.e. the Video Decoder doesn't care about the Renderer's internals. With this approach it's easy to add new components: further Sources, Demuxers, Decoders or Outputs.
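
To illustrate the write()/connect pattern, here's a rough sketch of how a pipeline gets wired up. The class names, stream constants and the source callback are illustrative here, not necessarily JSMpeg's exact API – see the documentation for the real thing:

var demuxer = new TSDemuxer();
var videoDecoder = new MPEG1VideoDecoder();
var audioDecoder = new MP2AudioDecoder();

// The Demuxer has one destination per stream
demuxer.connect(TSDemuxer.STREAM_VIDEO, videoDecoder);
demuxer.connect(TSDemuxer.STREAM_AUDIO, audioDecoder);

// The decoders connect to a Renderer and an Audio Output respectively
videoDecoder.connect(new WebGLRenderer(canvas));
audioDecoder.connect(new WebAudioOut());

// A Source simply pushes incoming data into the start of the chain
source.onData = function(chunk) {
    demuxer.write(chunk);
};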

I'm not completely happy with how these connections work in the library. Each component can only have one destination (apart from the Demuxer, which has one destination per stream). It's a tradeoff. In the end, I felt that anything else would be over-engineering and would complicate the library for no good reason.

WebGL Rendering

One of the most computationally intensive tasks for an MPEG1 decoder is the color conversion from MPEG's internal YUV format (Y'Cr'Cb to be precise) into RGBA so that the browser can display it. Somewhat simplified, the conversion looks like this:

for (var i = 0; i < pixels.length; i+=4 ) {
    var y, cb, cr = /* fetch this from the YUV buffers */;

    pixels[i + 0 /* R */] = y + (cb + ((cb * 103) >> 8)) - 179;
    pixels[i + 1 /* G */] = y - ((cr * 88) >> 8) - 44 + ((cb * 183) >> 8) - 91;
    pixels[i + 2 /* B */] = y + (cr + ((cr * 198) >> 8)) - 227;
    pixels[i + 3 /* A */] = 255;
}

For a single 1280x720 video frame that loop has to run 921600 times to convert all pixels from YUV to RGBA. Each pixel needs 3 writes to the destination RGB array (we can pre-populate the alpha component since it's always 255). That's 2.7 million writes per frame, each needing 5-8 adds, subtracts, multiplies and bit shifts. For a 60fps video, we end up with more than 1 billion operations per second. Plus the overhead for JavaScript. The fact that JavaScript can do this, that a computer can do this, still boggles my mind.
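
Spelled out, the back-of-the-envelope math for a 720p video at 60fps looks like this:

var pixelsPerFrame = 1280 * 720;            // 921,600 loop iterations
var writesPerFrame = pixelsPerFrame * 3;    // ~2.76 million RGB writes
var opsPerWrite = 6;                        // roughly 5-8 adds, subtracts, shifts
var opsPerSecond = writesPerFrame * opsPerWrite * 60; // ~1 billion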

With WebGL, this color conversion (and the subsequent display on screen) can be sped up tremendously. A few operations per pixel are the bread and butter of GPUs, and GPUs can process many pixels in parallel because each pixel is independent of the others. The WebGL shader that's run on the GPU doesn't even need the pesky bit shifts – GPUs like floating point numbers:

void main() {
    float y = texture2D(textureY, texCoord).r;
    float cb = texture2D(textureCb, texCoord).r - 0.5;
    float cr = texture2D(textureCr, texCoord).r - 0.5;

    gl_FragColor = vec4(
        y + 1.4 * cb,
        y + -0.343 * cr - 0.711 * cb,
        y + 1.765 * cr,
        1.0
    );
}

With WebGL, the time needed for the color conversion dropped from 50% of the total JS time to just about 1% for the YUV texture upload.

There was one minor issue I stumbled over with the WebGL renderer. JSMpeg's video decoder does not produce three Uint8Arrays for the color planes, but Uint8ClampedArrays. It does this because the MPEG1 standard mandates that decoded color values must be clamped, not wrap around. Letting the browser do the clamping through the clamped array type works out faster than doing it in JavaScript.

A bug that still stands in some browsers (Chrome and Safari) prevents WebGL from using the Uint8ClampedArray directly. Instead, for these browsers we have to create a Uint8Array view over each plane, for every frame. This operation is pretty fast since nothing needs to be copied, but I'd still like to do without it.
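
The workaround itself is tiny. A minimal sketch of what it boils down to, where supportsClamped is the result of the feature detection described next (variable names are illustrative):

// Wrap the clamped plane in a plain Uint8Array view over the same
// ArrayBuffer. No pixel data is copied, only a small view object is
// created per plane, per frame.
var y = supportsClamped
    ? clampedY
    : new Uint8Array(clampedY.buffer, clampedY.byteOffset, clampedY.length);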

JSMpeg detects this bug and only uses the workaround if needed. We simply try to upload a clamped array and catch the error. Sadly, this detection triggers an un-silenceable warning in the console, but it's better than nothing.

WebGLRenderer.prototype.allowsClampedTextureData = function() {
    var gl = this.gl;
    var texture = gl.createTexture();

    gl.bindTexture(gl.TEXTURE_2D, texture);

    // Try to upload a 1x1 LUMINANCE texture from a Uint8ClampedArray;
    // affected browsers will flag an error here
    gl.texImage2D(
        gl.TEXTURE_2D, 0, gl.LUMINANCE, 1, 1, 0,
        gl.LUMINANCE, gl.UNSIGNED_BYTE, new Uint8ClampedArray([0])
    );
    return (gl.getError() === 0);
};

WebAudio for Live Streaming

For the longest time I assumed that in order to feed WebAudio with raw PCM sample data without much latency or pops and cracks, you'd have to use a ScriptProcessorNode. You'd copy your decoded sample data just in time whenever you get the callback from the script processor. It works. I tried it. It needs quite a bit of code to function properly and of course it's computationally intensive and inelegant.

Luckily, my initial assumption was wrong.

The WebAudio Context maintains its own timer that's separate from JavaScript's Date.now() or performance.now(). Further, you can instruct your WebAudio sources to start() at a precise time in the future based on the context's time. With this, you can string very short PCM buffers together without any artefacts.

You only have to calculate the start time for the next buffer by continuously adding the duration of all previous ones. It's important to always use the WebAudio Context's own time for this.

var currentStartTime = 0;

function playBuffer(buffer) {
    var source = context.createBufferSource();
    /* load buffer, set destination etc. */

    var now = context.currentTime;
    if (currentStartTime < now) {
        currentStartTime = now;
    }

    source.start(currentStartTime);
    currentStartTime += buffer.duration;
}

There's a caveat though: I needed to get the precise remaining duration of the enqueued audio. I implemented it simply as the difference between the current time and the next start time:

// Don't do that!
var enqueuedTime = (currentStartTime - context.currentTime);

It took me a while to figure it out, but this doesn't work. You see, the context's currentTime is only updated every so often. It's not a precise real time value.

var t1 = context.currentTime;
doSomethingForAWhile();
var t2 = context.currentTime;

t1 === t2; // true

So, if you need the precise audio play position (or anything based on it), you have to fall back to JavaScript's performance.now().
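
Here's a minimal sketch of what that looks like – a simplification, not JSMpeg's exact implementation:

// In addition to scheduling buffers as shown above, track the wall-clock
// time at which the queued audio will run out
var wallclockEndTime = 0;

function trackBuffer(buffer) {
    var now = performance.now() / 1000;
    if (wallclockEndTime < now) {
        wallclockEndTime = now;
    }
    wallclockEndTime += buffer.duration;
}

// Precise remaining duration of the enqueued audio, independent of the
// coarse-grained context.currentTime
function enqueuedTime() {
    return Math.max(wallclockEndTime - performance.now() / 1000, 0);
}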

Audio Unlocking on iOS

You gotta love the shit that Apple throws into Web devs faces from time to time. One of those things is the need to unlock audio on a page before you can play anything. Basically, audio playback can only be started as a response to a user action. You click on a button, audio plays.

This makes sense. I won't argue against it. You don't want to have audio blaring at you unannounced when you visit a page.

What makes it shitty is that Apple neither provided a way to cleanly unlock audio nor a way to ask the WebAudio Context if it's unlocked already. What you do instead is play an audio source and continually check if it's progressing. You can't check immediately after playing, though. No, no. You have to wait a bit!

WebAudioOut.prototype.unlock = function(callback) {
    // This needs to be called in an onclick or ontouchstart handler!
    this.unlockCallback = callback;
    
    // Create empty buffer and play it
    var buffer = this.context.createBuffer(1, 1, 22050);
    var source = this.context.createBufferSource();
    source.buffer = buffer;
    source.connect(this.destination);
    source.start(0);

    setTimeout(this.checkIfUnlocked.bind(this, source, 0), 0);
};

WebAudioOut.prototype.checkIfUnlocked = function(source, attempt) {
    if (
        source.playbackState === source.PLAYING_STATE || 
        source.playbackState === source.FINISHED_STATE
    ) {
        this.unlocked = true;
        this.unlockCallback();
    }
    else if (attempt < 10) {
        // Jeez, what a shit show. Thanks iOS!
        setTimeout(this.checkIfUnlocked.bind(this, source, attempt+1), 100);
    }
};

Progressive Loading via AJAX

Say you have a 50mb video file that you load via AJAX. The video starts loading no problem. You can even check the current progress (downloaded vs. total bytes) and display a nice loading animation. What you cannot do is access the already downloaded data while the rest of the file is still loading.

There have been some proposals for adding chunked ArrayBuffers into XMLHttpRequest, but nothing has been implemented across browsers. The newer fetch API (that I still don't understand the purpose of) proposed some similar features, but again: no cross browser support. However, we can still do the chunked downloading in JavaScript using Range-Requests.

The HTTP standard defines a Range header that allows you to grab only part of a resource. If you just need the first 1024 bytes of a big file, you set the header Range: bytes=0-1023 in your request. Before we can start though, we have to figure out how large the file is. We can do this with a HEAD request instead of a GET. This returns only the HTTP headers for the resource, but none of the body bytes. Range-Requests are supported by almost all HTTP servers. The one exception I know of is PHP's built-in development server.

JSMpeg's default chunk size for downloading via AJAX is 1mb. JSMpeg also appends a custom GET parameter to the URL (e.g. video.ts?0-1024) for each request, so that each chunk essentially gets its own URL and plays nice with bad caching proxies.
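
Stripped down, one such chunked request looks roughly like this. The helper names are illustrative, not JSMpeg's actual source:

// Find out how big the file is without downloading its body
function getFileSize(url, callback) {
    var request = new XMLHttpRequest();
    request.open('HEAD', url);
    request.onload = function() {
        callback(parseInt(request.getResponseHeader('Content-Length'), 10));
    };
    request.send();
}

// Download a single chunk via a Range-Request. The byte range is also
// appended as a GET parameter, so every chunk gets its own URL.
function loadChunk(url, start, end, onComplete) {
    var request = new XMLHttpRequest();
    request.open('GET', url + '?' + start + '-' + end);
    request.responseType = 'arraybuffer';
    request.setRequestHeader('Range', 'bytes=' + start + '-' + end);
    request.onload = function() {
        onComplete(new Uint8Array(request.response));
    };
    request.send();
}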

With this in place, you can start playing the file as soon as the first chunk has arrived. Also, further chunks will only be downloaded when they're needed. If someone only watches the first few seconds of a video, only those first few seconds will get downloaded. JSMpeg does this by measuring the time it took to load a chunk, adding a lot of safety margin and comparing this to the remaining duration of the already loaded chunks.

In JSMpeg, the Demuxer splits streams as fast as it can. It also decodes the presentation time stamp (PTS) for each packet. The video and audio decoders, however, only advance their play position in real-time increments. The difference between the last demuxed PTS and the decoder's current PTS is the remaining play time for the downloaded chunks. The Player periodically calls the Source's resume() method with this headroom time:

// It's a silly estimate, but it works
var worstCaseLoadingTime = lastChunkLoadingTime * 8 + 2;
if (worstCaseLoadingTime > secondsHeadroom) {
    loadNextChunk();
}

Audio & Video Sync

JSMpeg tries to play audio as smoothly as possible. It doesn't introduce any gaps or compressions when queuing up samples. Video playback instead orients itself on the audio playback position. It's done this way because even the tiniest gaps or discontinuities are far more perceptible in audio than in video. It's far less jarring if a video frame is a few milliseconds late or dropped.

For the most part, JSMpeg relies on the presentation time stamp (PTS) of the MPEG-TS container for playback, instead of calculating the playback time itself. This means the PTS in the MPEG-TS file have to be consistent and accurate. From what I gathered from the internet, this is not always the case. But modern encoders seem to have figured this out.

One complication was that the PTS doesn't always start at 0. For instance, if you have a WebCam connected and running for a while, the PTS may be relative to the time the WebCam was turned on, not to when recording started. Therefore, JSMpeg searches for the first PTS it can find and uses that as the global start time for all streams.

The MPEG1 and MP2 decoders also keep track of all the PTS they received, alongside the buffer position of each one. With this, we can seek through the audio and video streams to a specific time.
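
In essence, seeking then becomes a lookup in that list. A simplified sketch of the idea (names are illustrative):

// ptsList is collected while decoding: one entry per PTS, together with
// the byte position in the stream buffer where that PTS was found
function findBufferPosition(ptsList, seekTime, startTime) {
    var targetPts = startTime + seekTime;

    // Walk backwards and return the position of the last PTS at or before
    // the target time
    for (var i = ptsList.length - 1; i >= 0; i--) {
        if (ptsList[i].pts <= targetPts) {
            return ptsList[i].bufferPosition;
        }
    }
    return 0;
}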

Currently, JSMpeg will happily seek to an inter-frame and decode it on top of the previously decoded frame. The correct way to handle this, would be to rewind to the last intra-frame before the one we seek to and decode all frames in between. This is something I still need to fix.

Build Tools & The JavaScript Ecosystem

I avoid build tools wherever I can. Chances are, your shiny toolset that automates everything for you will stop working in a year or two. Setting up your build environment is never as easy as "just call webpack", or use grunt or whatever task runner is the hot shit today. It always ends up like:

(...) Where do I get webpack from? Oh, I need npm.
Where do I get npm from? Oh, I need nodejs.
Where do I get nodejs from? Oh, I need homebrew.
What's that? gyp build error? Oh, sure, I need to install XCode.
Oh, webpack needs the babel plugin?
What? The left-pad dependency could not be resolved?
...

And suddenly you've spent two hours of your life and downloaded several GB of tools. All to build a 20kb library, for a language that doesn't even need compiling. How do I build this library 2 years from now? 5 years?

I had a thorough look at webpack and hated it. It's way too complex for my taste. I like to understand what's going on. That's part of the reason I wrote this library instead of diving into WebRTC.

So, the build step for JSMpeg is a shell script with a single call to uglifyjs that can be altered to use cat (or copy on Windows) in 2 seconds. Or you simply load the source files separately in your HTML while you're working on it. Done.

Quality, Bitrates And The Future

The quality of MPEG1 at reasonable bitrates is, much to my surprise, not bad at all. Have a look at the demo video on jsmpeg.com – granted, it's a favorable case for compression. Slow movement and not too many cuts. Still, this video weighs in at 50mb for its 4 minutes, and provides a quality comparable to most Youtube videos that are "only" 30% smaller.

In my tests, I could always get video that I'd consider "high quality" at a maximum of 2Mbit/s. Depending on your use case (want a coffee cam?), you can go down to 100Kbit/s or even lower. There's no bottom limit for the bitrate/framerate.

You could get a cheap cell phone contract with a 1GB/month data limit, put a 3G dongle and a webcam on a Raspberry Pi, attach it to a 12 V automotive battery, throw it into your crop field and get a live weather cam that doesn't need any infrastructure or maintenance for a few years and is viewable in your smartphone's browser without installing anything.

The simplicity of MPEG1, compared to modern codecs, makes it very attractive in my opinion. It's well understood and there's a ton of tools that can work with it. All patents relating to MPEG1/MP2 have expired now. It's a free format.

Do you remember the GIF revival after its patents expired?

Thursday, February 2nd 2017 / Comments (6)

Quake for Oculus Rift

My Oculus Rift CV1 finally arrived last week, so of course I had to update my fork of the excellent Quakespasm engine for the new Oculus 1.4 API.

Quake VR

While I was at it, I also fixed a number of bugs, included support for the Rift's headphones and the XBox One controller – I'd still recommend you play with mouse & keyboard, though. Controls are set up with reasonable defaults, but you can of course change everything in the options menu. There's also a new VR/HMD options menu where you can configure your preferred aim mode (e.g. decouple it from the view) and crosshair style.

The game starts directly in VR Mode, disables view bobbing and sets the texture mode to NEAREST for the proper, pixelated oldschool vibe.

Once you're in the game, press END or the start button on your gamepad to recenter the view.

Short playthrough of the first few levels of Quake in VR

Download and Source

Friday, May 27th 2016 / Comments (21)

The Absolute Worst Way To Read Typed Array Data with JavaScriptCore

Edit: as of September 13, 2016, Apple introduced a native API to manipulate Typed Arrays as part of iOS 10. This makes this whole article obsolete, though it remains technically interesting.

Oh man, where do I even begin.

Back when the HTML5 Canvas element was introduced by Apple to power some Dashboard widgets, the 2D Drawing Context supported a method to read the raw pixel data. This method was called getImageData() and it returned an ImageData object with a data array. This array however was not a plain JavaScript Array, but a new type called CanvasPixelArray.

This CanvasPixelArray behaved like a normal array for the most part, with some important exceptions: each element was clamped to values between 0-255 and the array was not resizable. This was done for performance reasons. With these restrictions, the array could be allocated as a single contiguous memory region in the JavaScript engine.

I.e. the JavaScript engine's C++ source for getImageData() looked somewhat like this:

JSObjectRef GetImageData(int width, int height) {
    // Allocate memory for the ImageData, 4 bytes for each pixel (RGBA)
    uint8_t *backingStore = malloc(width * height * 4);

    // Read the pixel data from the canvas and store it in the backingStore,
    // then wrap it in a JSObjectRef and return it
    // ...
}

In JavaScript you could then read, manipulate and write image data like this:

// Read a portion of the canvas
var imageData = ctx.getImageData(0, 0, 320, 240);
var pixels = imageData.data;

// Set the Red channel for each pixel to a random value
for (var i = 0; i < pixels.length; i += 4) {
    pixels[i] = Math.random() * 255;
}

// Write it back to the canvas
ctx.putImageData(imageData, 0, 0);

Manipulating pixels on the CPU is generally a bad idea, but using a plain old JavaScript Array would have made it prohibitively slow and memory inefficient. With the CanvasPixelArray it was at least feasible – this new array type made a lot of sense.

Later, with the arrival of WebGL, the W3C realized that we needed better facilities to deal with low-level binary data. CanvasPixelArray was a good start at making raw byte buffers faster, but other data types would be needed. So the WebGL draft proposed some new types in the same vein. Collectively, these were called Typed Arrays, because, unlike plain JavaScript arrays, they have a fixed value type.

Modern JavaScript engines now support Typed Arrays in signed and unsigned flavors, with 8, 16 or 32 bit Integer values as well as 32 bit float and 64 bit float versions. E.g. Int32Array, Uint16Array and Float32Array.

At the same time, the good old CanvasPixelArray was renamed to Uint8ClampedArray. It now supports all the same methods as all the other Typed Arrays.

A very important addition introduced with these Typed Arrays is the fact that the actual data is not stored in the array but in a separate ArrayBuffer. The Typed Array itself is then only a View of this buffer. Furthermore, this buffer can be shared between different Views.

// Create a Uint32Array with 64 values. This allocates an ArrayBuffer with
// 256 bytes (64 values × 4 bytes) behind the scenes.
var u32 = new Uint32Array(64);
u32.length; // 64
u32.byteLength; // 256

// Create a Uint8Array, sharing the same buffer
var u8 = new Uint8Array(u32.buffer);
u8.length; // 256
u8.byteLength; // 256

// Modifying data on either View will modify it in all other views sharing
// this buffer
u8[1] = 1;
u32[0]; // 256 (on a little-endian system)

Every browser supporting WebGL or the Canvas2D context needs to be able to read and write the data of these Typed Arrays in native (C++) code as fast as possible. Say for instance you create a Float32Array with the vertex data of a 3D model in JavaScript. The WebGL implementation of your Browser needs to hand this data over to the GPU (via OpenGL or Direct3D) and - if possible - do it without copying.

Fortunately, as we established earlier, JavaScript engines only need to allocate a single lump of data for each array. A WebGL implementation can simply hand over the address of this data to the GPU and be done with it. E.g. an implementation of WebGL's gl.bufferData() might look like this:

void WebGLBufferData(JSObjectRef array) {
    void *ptr = JSTypedArrayGetDataPtr(array);
    size_t length = JSTypedArrayGetLength(array);

    // Call the OpenGL API with the data pointer we got
    glBufferData(GL_ARRAY_BUFFER, length, ptr, GL_STATIC_DRAW);
}

There. Problem solved. Fast access to binary data without copying or conversion of some sort.

So, if all is good and well, what's with the title of this post then? Well, this is where the fun part starts. Buckle up.

Read complete post »

Tuesday, November 24th 2015 / Comments (13)

Quake for Oculus DK2, Direct Mode

Almost exactly two years ago, I published my first release of Quake with support for the Oculus Rift. Since then, an amazing community has fixed a number of bugs and added some nice features. In particular Luke Groeninger did a lot of work to get the kinks out.

However, Quake was still stuck in the Oculus DK1 world, with no positional tracking and no support for the DK2's Direct Mode. Today, I finally fixed that!

The game looks absolutely gorgeous in the DK2, even after all those years. The run speed is still ridiculous, but somehow I don't get sick playing it in the Rift. Your mileage may vary though, so take it slow.

The game now starts in VR Mode by default, disables view bobbing and sets the texture mode to NEAREST for the proper, pixelated oldschool vibe.

Additional vars you can use in the console in the game (bring it up using the ~ key):

Download and Source

Wednesday, July 29th 2015 / Comments (6)

Play GTA V In Your Browser – Sort Of

Inspired by a blog post about running your own cloud gaming service, which uses a VPN and Steam's In-Home Streaming, I thought I could do this, too, but in the browser.

With the tech I developed for Instant Webcam I had all the major building blocks ready and just needed to glue them together into a nice, usable Windows App.

Demo Video for jsmpeg-vnc

The result is jsmpeg-vnc – a tiny Windows application written in C. It captures the screen at 60 frames per second, encodes it into MPEG1 using FFmpeg, sends it over WebSockets to the browser where it is finally decoded again in JavaScript using jsmpeg. All the mouse and keyboard input in the browser is captured as well and sent back to the server again.

This works way better than it should. Latency is minimal and only affected by the network conditions. It's certainly good enough to play most types of games.

Update: I did some latency measurement now. For my system it hovers around 50-70ms (video, photo). Tested with an 800x600 window at ~7% CPU utilization on a Core i5. My desktop monitor is a Dell U2711, which seems to add about 15ms latency itself.

Full source code and binary releases are on github: jsmpeg-vnc

Monday, July 27th 2015 / Comments (61)

What makes an On-Screen Keyboard Fun?

We recently released our typing game ZType for the iPhone and learned a whole lot in the process.

ZType

The original Web Version of ZType was hastily created for Mozilla's Game On competition a few years ago and won the Community Choice Award back then. For that version, it was a conscious decision not to cater to mobile devices. After all, what fun would a typing game be if you don't have a real keyboard?

The Web Version of ZType remains quite popular and we got more and more requests for a mobile version of the game. So a few months back we started investigating the possibilities. There are already a few typing games for iOS out there. Many of them can hardly be called a "game" (to be fair, some don't aim to be one), most use the default iOS keyboard and gameplay is cumbersome – sort of what we expected from typing games with an on-screen keyboard.

The default iOS on-screen keyboard needs a lot of screen real estate for things we don't care about: language settings, microphone input, a space bar and word suggestions. Most modern on-screen keyboards are predictive in that they try to figure out what it is you want to write. This works okay-ish for writing a quick SMS, but quickly becomes obstructive for a typing game. It was clear that if we wanted to bring ZType to mobile devices and make it fun to play, we'd have to come up with a custom keyboard solution.

In ZType we carefully control which words appear on the screen. We prevent having two words with the same first letter in the game at the same time, so that targeting words remains unambiguous. With a layout map of a QWERTY keyboard, we can further ensure that these letters spread out over the keyboard, instead of clustering to nearby keys.

Now, here's the crux: if we control which words are on the screen and we ensure keys are spread out over the keyboard, we can enlarge the touch area for each valid key, making it easier to hit the right key. After having targeted a word, there's only one valid key to hit and we can similarly enlarge the touch area for it as well.

We carefully designed the keyboard so that even with those enlarged touch areas, it's still possible to hit all keys. For example, if S is a valid key and thus has a larger touch area, the touch areas for all surrounding keys (W, A, D, X and Y) simply get a bit smaller. This took some fiddling around and hours of play testing to get right, but in the end it was worth it.
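
One way to implement such weighted touch areas is a simple nearest-key search where valid keys get a larger effective radius. This is just a sketch of the idea, not the actual ZType source:

function keyForTouch(keys, validLetters, x, y) {
    var best = null;
    var bestScore = Infinity;

    for (var i = 0; i < keys.length; i++) {
        var key = keys[i];
        var dx = x - key.centerX;
        var dy = y - key.centerY;
        var distance = Math.sqrt(dx * dx + dy * dy);

        // Valid keys get a larger effective touch area, which automatically
        // shrinks the area of their neighbors
        var weight = (validLetters.indexOf(key.letter) !== -1) ? 1.5 : 1.0;
        var score = distance / weight;

        if (score < bestScore) {
            bestScore = score;
            best = key;
        }
    }
    return best;
}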

Another thing we noticed is that sometimes you hit a key while performing a downward motion with your finger. Typically an on-screen keyboard only emits a letter when you lift your finger from a key, not when you touch it. This allows you to swipe over the keyboard until you see the right key highlighted, but it also creates another problem: Say you want to type the letter T – there's one smooth motion by your thumb, starting from somewhat above the keyboard, then hitting the surface of the touch screen at that T key and continuing to move your thumb further down on the glass before lifting your thumb again. That moment, when you lift your thumb, it may well be over the G key.

The iOS default keyboard seems to recognize this motion and correctly determines that the letter you meant to type was the T, even though you ended up lifting your thumb over the G. In our tests this happened often enough that we had to take care of it too. We ended up detecting whether the key where you lifted your finger was held for a period shorter than 30msec and the key where you started the touch was one row above. If both of these conditions are met, we determine that the key you actually wanted to hit was the key where you started the touch.
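
The heuristic itself fits in a few lines – again a sketch with made-up names, not the shipped code (rows are numbered top to bottom):

var SLIDE_THRESHOLD = 30; // ms

function resolveKey(startKey, endKey, heldForMs) {
    // If the touch ended quickly, one row below where it started, assume
    // the finger slid downward and the intended key was the starting one
    if (
        endKey !== startKey &&
        heldForMs < SLIDE_THRESHOLD &&
        endKey.row === startKey.row + 1
    ) {
        return startKey;
    }
    return endKey;
}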

The end result is a game where you can almost blindly type on an on-screen keyboard, without being patronised. I'm proud to say that it's actually a fun typing game.

Try it yourself: Download ZType from the AppStore

Thursday, July 9th 2015 / Comments (6)

Reverse Engineering WipEout (PSX)

In 1995 one of my all time favorite video games was released: the original WipEout for PlayStation. The brand new PlayStation produced 3D graphics previously unseen on living room TVs and WipEout exploited its capabilities like no other game at the time. It was one of the pioneering titles of the fifth generation console era.

WipEout's art style was distinctively different from other games too. With the help of the UK based design studio The Designers Republic the game achieved a mature look that was in stark contrast to the comic style found in most other games.

I remember poking around on the CD of the PC Version of WipEout back in the day, looking for ways to modify the game. I was thrilled to find .pcx images of all textures and tried to change one of the in-game billboard graphics to show my name. I wasn't able to get it working.

Now, almost 20 years later, I thought I'd give it another shot.

WipEout Model Viewer – A WebGL Experiment

Extracting 3D Models

I didn't have much experience with reverse engineering data formats, but given the game's age, I expected the 3D format to be quite simple and straightforward. What surprised me were the many different variations in which scene models were stored. The track itself was stored separately, in yet another format spanning multiple files.

Knowing the PlayStation didn't have a Floating Point Unit (FPU), I assumed vertex data had to be stored as Integers. Using JavaScript and the three.js 3D library I quickly whipped up a page that would load one of the game's .PRM files that I thought would contain the 3D objects. I loaded all data in the file as coordinates of 3 Integers (x, y, z) and rendered them as a point cloud. This allowed me to quickly shift the stride (the spacing between distinct coordinates in the file) and look for patterns.
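
The point cloud experiment boils down to something like this – a sketch only; the actual stride, offsets, integer size and endianness had to be found by trial and error:

// Interpret the raw file data as consecutive (x, y, z) integer triplets,
// spaced `stride` bytes apart, and collect them as three.js points
function readPointCloud(arrayBuffer, offset, stride, count) {
    var view = new DataView(arrayBuffer);
    var points = [];

    for (var i = 0; i < count; i++) {
        var pos = offset + i * stride;
        points.push(new THREE.Vector3(
            view.getInt32(pos + 0, true), // little-endian is an assumption here
            view.getInt32(pos + 4, true),
            view.getInt32(pos + 8, true)
        ));
    }
    return points;
}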

What I found was indeed chunks of raw vertex data, along with an object header specifying the object's internal name and the number of vertices and polygons in that object. Each polygon is preceded by a polygon header and a chunk of index pointers into the object's vertices. These polygon headers are where I probably spent most of my time. I identified 11 different polygon types used by the game – triangles, quads, sprites, textured or flat, having vertex colors or face colors etc. Two types are still ignored because I couldn't figure out what they were.

The track itself is stored in several distinct files. The most interesting of these are TRACK.TRV, containing the raw vertices, and TRACK.TRF, containing the track faces as quads of 4 index pointers into those vertices. Pretty straightforward.

Reading textures

Most textures in early PSX games were stored in the TIM image format, containing data straight in the PSX' native frame buffer format. A TIM file stores pixels either directly as 16 BPP colors, or as 8 BPP or 4 BPP indices into a color palette. And indeed, you can find a number of TIM files on the original WipEout CD: the start splash screen, background images, loading images etc. However, the images I was most interested in – the textures used for 3D models - were missing.

Some googling revealed that the textures are stored in the compressed SCENE.CMP files. These files contain a very simple header that specifies the number of TIM images in the file, along with the uncompressed size of each image. The data itself is compressed using LZ77. I even found some C code in a reverse engineering wiki that would uncompress these files. With this code quickly ported to JavaScript and some poking around in the polygon data to find the texture index and coordinates, I was finally able to draw textured models.

The textures for the track however were an entirely different beast. Along with the vertex and face data (TRACK.TRV and TRACK.TRF) each track comes with an additional TRACK.TRS, containing Track Sections of some sort, and a TRACK.VEW, containing visibility lists (I think). I spent a lot of time trying to find the texture index for each track face in any of these files.

In fact, the unhelpfully named TRACK.CMP did not contain the track textures. Instead, track textures were stored in a file called LIBRARY.CMP, which seemed to be compiled from a set of source images.

As I painstakingly found out, the tracks in WipEout had a Level of Detail (LOD) system, subdividing each track face into up to 4x4 quads when it's near the camera. This was probably done to lessen the impact of the PSX' missing perspective correction when drawing textures. These subdivided faces also use higher resolution textures.

Now, here's the kicker: the LIBRARY.CMP file contains about 300 images, but only 19 distinct textures – each in 3 different LOD levels: 4x4, 2x2 and 1x1 tiles, each tile being 32x32 pixels in size. And, as if that wasn't complicated enough, these 4x4 and 2x2 tiles are not stored in the right order. Instead, yet another file, LIBRARY.TTF, stores indices into the LIBRARY.CMP to compose these 4x4 and 2x2 versions.

Knowing that the texture index has to be between 0 and 18, I found it stored as a single byte along with each track face. I also found that another single byte for each track face specifies a few flags for that face. The most important one for drawing: whether to draw with flipped X texture coordinates.

Wrapping up

Finally, I had a complete, textured, vertex-lit scene and track. Using three.js, drawing everything was a no-brainer, as it happily lets you create models from all kinds of different polygon data. I also wanted to have a simple fly-through animation for each track. three.js was tremendously helpful for creating a smooth camera spline along the track. The optional orbit controls for three.js are top notch, too. They just work without any modifications on mobile and touch devices.
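
The fly-through itself only takes a few lines with three.js. A minimal sketch – THREE.CatmullRomCurve3 is the current API name, older three.js versions called it SplineCurve3:

// trackPoints: an array of THREE.Vector3 positions along the track center
var curve = new THREE.CatmullRomCurve3(trackPoints);

function updateCamera(camera, elapsed, loopDuration) {
    var t = (elapsed % loopDuration) / loopDuration;

    // Place the camera on the spline and look slightly ahead of it
    camera.position.copy(curve.getPointAt(t));
    camera.lookAt(curve.getPointAt((t + 0.01) % 1));
}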

With all the work that went into this project, handling the WebGL drawing with three.js turned out to be one of the simplest parts.

The finished WipEout Model Viewer loads all the original data files and does all the binary file reading, unpacking and scene creation in JavaScript.

All in all, I wrote about 800 lines of JavaScript to load and draw 3D scenes of a 20 year old game. I wonder how big the original WipEout source is. It's quite sad that probably nobody will ever see it again, considering that it now belongs to Sony.

Full source on github: github.com/phoboslab/wipeout

Tuesday, April 14th 2015 / Comments (83)

Xibalba & WebVR

WebVR is an effort of Mozilla and Google to enable Virtual Reality content on the web. Experimental builds of Chromium and Firefox that provide a JavaScript API for WebVR have been available for a while now.

My game Xibalba already provided a stereo rendering mode and supported an external tracking server to get orientation updates from the VR headset, but it was clumsy and only worked for the old Oculus Rift DK1. So I decided to ditch the old stereo rendering mode and implement WebVR proper.

Xibalba VR

WebVR handles the distortion for the Headset by itself. All you have to do is render your game into a side-by-side stereo format and request fullscreen mode on your Headset. Orientation updates are provided by a nice, clean API that's easily implemented in the game.

I also added support for the JavaScript Gamepad API with a plugin for Impact – the game plays great with an XBox360 Gamepad, even if you don't have an Oculus Rift.

Play the game here: phoboslab.org/xibalba/

If you have a DK2 but no WebVR enabled build of Chrome or Firefox, you can also try a standalone Windows version of the game. It's essentially a Chromium build bundled with the game and a batch file to start it. Makes it a bit easier to get going.

Download: xibalba-vr.zip ~80mb

Monday, February 2nd 2015 / Comments (8)