OIC
See it and say it!


Current status:
(21 Jan 2003) I am still busy doing other work, but it looks like I may have found a way to be paid for work that is related to OIC. So now I can work on my code and pretend it's work! I don't have much to report, but things are happening behind the scenes. Rewriting everything from scratch was worth it. It is so much easier to add stuff now.


O.I.C. - A new video codec

OIC is a video compressor and decompressor that I have been working on for some months now. I aim to release it as an open source, patent-free codec, and hope that it will be good enough that everyone chooses to use it. I have been working on it long enough that I think a web page is in order for it. I will probably be dual licensing it in a way that lets me sell it to people who want to use it as part of a proprietary, closed product.

Why O.I.C?

The name or the codec? :-) I decided that I wanted a format that allows me to archive video on my computer, or on CD. MPEG 1 and 2 aren't good enough, and all of the MPEGs look to be a mess of patents and licences. DivX is not free and open either, and could be liable to be sued out of existence one day as it is sort of an MPEG-4 implementation. I don't really know, but I decided to take no chances.

I was looking at a bunch of low bitrate video I downloaded and got very annoyed with the blocky artifacts and things. Yech. I decreed that There Had To Be A Better Way, and in an entirely in-character display of hubris, declared That I Had Found It. The idea is to use much higher resolution motion vectors than normal, and literally warp one frame into the next. Coupled with wavelet compression, which is a lot better than block-based DCT methods when it comes to avoiding the blockies, I figured I had a decent chance of making it work. It was my duty to humanity and the world to set things right and do it The Right Way.

The fact that I got all excited about the idea of using much more detailed motion fields as a form of image representation and needed an excuse to try it has nothing to do with anything. :-)

It was pretty easy coming up with code that could warp one frame into another. Of course, in the real world, it becomes obvious that you need to encode your vectors pretty cleanly to keep the current frame looking like the previous frame after it has been eaten and vomited by your pet mutt. I have not yet found The Perfect Way, but I have something good enough at the moment. I have high hopes that I will significantly improve it over time.
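To make the warping idea concrete, here is a minimal sketch in Python. It is not OIC's code (that is Pascal and not yet released) and it makes several assumptions that are not stated above: one (dx, dy) vector per pixel, backward mapping (each output pixel fetches from the previous frame), bilinear interpolation for fractional vectors, and clamping at the frame edges. The name warp_frame and the field layout are invented for the example.

```python
import math

def warp_frame(prev, field):
    """Warp `prev` (a list of rows of grey values) with a per-pixel motion field.

    field[y][x] = (dx, dy) means output pixel (x, y) is fetched from
    prev at (x + dx, y + dy). Fractional positions are bilinearly
    interpolated; source coordinates are clamped to the frame edges.
    """
    h, w = len(prev), len(prev[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx, dy = field[y][x]
            # Clamp the source position so it stays inside the frame.
            sx = min(max(x + dx, 0.0), w - 1.0)
            sy = min(max(y + dy, 0.0), h - 1.0)
            x0, y0 = int(math.floor(sx)), int(math.floor(sy))
            x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
            fx, fy = sx - x0, sy - y0
            # Blend the four neighbouring source pixels.
            top = prev[y0][x0] * (1 - fx) + prev[y0][x1] * fx
            bot = prev[y1][x0] * (1 - fx) + prev[y1][x1] * fx
            out[y][x] = top * (1 - fy) + bot * fy
    return out
```

With a field of all-zero vectors the output equals the input, and a uniform (0.5, 0) field averages each pixel with its right-hand neighbour. A real encoder would of course store a much coarser or more cleverly coded field than one raw vector per pixel.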

Why the name? Well, I needed one and this one appeals to me. It really really beats the other ones my friends and I came up with on IRC. It works as a distinctive three letter extension too. You can pronounce it "Oh I see", or "oyk!" if you are in a hurry. Either way I like it. Hopefully it's what you'll say when you see the results of the final product. OIC - See it and say it!

Technical stuff

OIC is a wavelet-based codec. I use Scalar Quantisation at the moment, cos it's easy and was a good stepping stone. I'll use Vector Quantisation eventually, when I figure out which way is up on this bloody great big textbook I just bought. I have my own little algorithm that takes the quadtree of wavelet coefficients and spews out bits to define it. Each frame consists of several sections of data, each encoded in an embedded manner. I dunno why I bothered to make it embedded, but I figured I would if I could, and maybe I'll be able to make use of it for bitrate peeling one day.

The quadtree encoder is designed so I can literally drop in a VQ when I figure out exactly which type I'll be using. VQ should improve its performance a lot, as I waste bits at the moment that I haven't bothered cleaning up cos the encoder isn't final anyway.
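For readers who haven't met the terms, here is a rough Python sketch of a wavelet transform followed by scalar quantisation. It is only an illustration: the Haar wavelet, the single decomposition level and the plain uniform quantiser are all assumptions made for brevity, not a description of what OIC actually uses, and the quadtree bit coder is not shown at all. The function names are made up.

```python
def haar_2d(img):
    """One level of a 2D Haar transform on an even-sized greyscale image.

    Returns four quarter-size bands: the LL band (local averages) and
    three detail bands (column, row and diagonal differences).
    """
    h, w = len(img), len(img[0])
    ll, dcol, drow, ddiag = (
        [[0.0] * (w // 2) for _ in range(h // 2)] for _ in range(4)
    )
    for y in range(0, h, 2):
        for x in range(0, w, 2):
            a, b = img[y][x], img[y][x + 1]
            c, d = img[y + 1][x], img[y + 1][x + 1]
            i, j = y // 2, x // 2
            ll[i][j] = (a + b + c + d) / 4.0     # average of the 2x2 block
            dcol[i][j] = (a - b + c - d) / 4.0   # left column minus right
            drow[i][j] = (a + b - c - d) / 4.0   # top row minus bottom
            ddiag[i][j] = (a - b - c + d) / 4.0  # diagonal difference
    return ll, dcol, drow, ddiag

def quantise(band, step):
    """Uniform scalar quantisation: each coefficient becomes an integer index."""
    return [[int(round(c / step)) for c in row] for row in band]

def dequantise(indices, step):
    """Map integer indices back to reconstructed coefficient values."""
    return [[q * step for q in row] for row in indices]
```

Each coefficient is quantised on its own, so the reconstruction error per coefficient is at most half a step. A vector quantiser would instead map groups of coefficients to entries in a trained codebook, which is where the hoped-for savings come from.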

OIC is a VBR codec. That means "Variable Bit Rate". It is primarily designed so that you the user specify a quality for the video, and it makes the smallest stream it can at that quality. When you want to stream video over the net or something, you generally want a Constant Bit Rate encoder, to make sure you don't overflow buffers. OIC can be made CBR to a useful degree with a little massaging, but I'm not going to bother now.
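One simple way to picture "smallest stream at a given quality" is a search for the coarsest quantiser step that still meets the quality target. The Python sketch below is an assumption about how such a loop could look, not a description of OIC's rate control. The callback quality_at is hypothetical (in practice it would encode a frame with the given step and measure the result), and the search relies on quality never rising as the step gets coarser.

```python
def coarsest_step(quality_at, target, lo=1, hi=256):
    """Return the largest integer step in [lo, hi] whose quality meets `target`.

    `quality_at(step)` is a caller-supplied (hypothetical) function that
    returns the quality obtained with that quantiser step. It is assumed
    to be non-increasing in `step`. Returns None if even the finest step
    misses the target.
    """
    if quality_at(lo) < target:
        return None
    # Binary search for the last step that still meets the target.
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if quality_at(mid) >= target:
            lo = mid       # mid is good enough, try coarser
        else:
            hi = mid - 1   # mid is too coarse
    return lo
```

A CBR encoder would turn this around: fix the bit budget per frame (or per buffer window) and accept whatever quality fits inside it.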

Oh yes. This is all written in Pascal, by the way. Hah! Bet that scares you. Seriously, I needed a language that compiles all my code in milliseconds so I can experiment a lot. Of course, things like range checking help a lot as well. I don't have the time to waste looking for unforced programmer errors. I've been using Kylix, and haven't done anything particularly esoteric, so it should compile in the free version of Kylix, and, I hope, Free Pascal as well. If you are wondering, yes, I am a Bondage And Discipline programmer. I know I make mistakes, and I choose my preferred language accordingly.

I may produce a C version if there is any demand for it.

Screenshot

Well, we all know you have to have a screenshot now don't we? :-) The different buttons execute different functions within the code, allowing me to play with different ideas just by clicking around. This also allows me to examine the state of just about everything at any point in the compression. This is important so I can tweak my algorithms.

Current Status

Well, I have code that will take streams of JPEGs, GIFs, or PNGs and output a stream of PPMs after compressing and decompressing them. It works at 320x256, greyscale only. This is because I want my code to be readable and easy to modify. It really needs to be cleaned up, and bolting on resolution independence and colour will just confuse the issue at the moment. I know how to do it, and roughly how many extra bits colour will eat per frame, and will leave it at that till later. Colour will be a relatively low bitrate addition.

If I fix the quality at around 40dB PSNR in each frame (which is pretty good quality for my input sequences) a talking head sequence (at 30fps) is around 200kbps. Talking heads with a busy background (like people talking at a party) are around 500kbps. Two people beating the crap out of each other with the camera doing its nut panning around them is around 700kbps. This is using SQ, and I think I can do much better with VQ. How much better is going to be the killer question. I have my own theories, and they make me warm and fuzzy, but I'll not say anything till I have it working.
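To put those figures in perspective, here is a small Python sketch of the standard PSNR definition plus the per-frame arithmetic. The assumptions are 8-bit greyscale (peak value 255) and 1 kbps = 1000 bits per second; the function names are invented for the example.

```python
import math

def psnr(original, decoded, peak=255.0):
    """PSNR in dB between two equal-sized greyscale frames (lists of rows)."""
    n = 0
    sq_err = 0.0
    for row_a, row_b in zip(original, decoded):
        for a, b in zip(row_a, row_b):
            sq_err += (a - b) ** 2
            n += 1
    mse = sq_err / n
    if mse == 0:
        return float("inf")  # identical frames
    return 10.0 * math.log10(peak * peak / mse)

def bits_per_frame(kbps, fps):
    """Average bit budget per frame, assuming 1 kbps = 1000 bits/s."""
    return kbps * 1000.0 / fps

def bits_per_pixel(kbps, fps, width, height):
    """Average bits spent per pixel at the given rate and frame size."""
    return bits_per_frame(kbps, fps) / (width * height)
```

Under those assumptions, 40dB corresponds to a mean squared error of about 6.5, or an RMS error of roughly 2.55 grey levels out of 255. At 320x256 and 30fps, 200kbps works out to about 6,667 bits per frame, or around 0.08 bits per pixel.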

Known Bugs/Issues

A problem is I haven't figured out a nice way to get real streams into the encoder yet, without going via a bunch of still frames. I'm looking at the ffmpeg stuff for this. It looks really simple to use, and well done too.

I send the LL band of my wavelet transform as raw data at the moment. Absolutely zero compression. As I improve other things, this becomes more and more important. I'll do something after the VQ stuff is done. Dunno what yet.

Error is strictly held below a certain threshold at the moment. This is bad, as I can probably throw all sorts of stuff away if I have a better idea of what won't be noticed. I have a cunning plan on how to do this, but have not implemented it yet. I could probably squeeze another 20-40% out of the current code (SQ based) if I added this.

Current Plans

My current plan is to add a Vector Quantiser to my wavelet codec. This should dramatically improve the visual quality, and give me a lot more leeway for picking lower bitrates. I intend to use a hybrid classification/finite state/gain shape Pruned-TSVQ eventually.

I need to get a good supply of varied video streams to train my VQ codebook. High quality interlace-artifact free stuff. At the moment I am using some Buffy episodes encoded with MPEG-1 and a few video clips I extracted from a DVD before Broadcast 2000 weirded out on me. I may add an ffmpeg section to my code just to get training sequences.

At some point, I will get around to adding the smarter error thresholding I mentioned in the issues section. It's not a priority because it's relatively simple and I have more difficult things to do.

Finally, I need to spend some time cleaning things up before I plug in the new quantiser. The code is monolithic and ugly at the moment, as if someone just hacked it together as a prototype. Gee. I wonder why? :-)

Anyone who would like to help with donations of short high-quality video clips, please feel free to mail me and we'll make a plan. I need lots of different samples, of maybe 30 frames apiece.

Conclusion

I am writing my code in between other things at the moment, and will start releasing source when I have stabilised the format. I'll be using Ogg as the stream encapsulation format, and thus probably for the audio too.