Categories: Blog, Kinect, Machine Vision, Posts, Programming, Robotics

Kinect Development – Day 1

Head over to this page if you want some tutorials on getting started with the Kinect and libfreenect. I’ll update it more as time goes on and I have free time.

I’ve been meaning to grab myself an Xbox 360 Kinect for a while, not because I’m a big motion controlled game fan but for machine vision development. Within the first month of the open source Kinect drivers being released, people were building the coolest things, from motion controlled media centres to 3D modelling. I’ll admit I’m a little late to the game, mostly due to the amount of work in my final year at university and general busyness. Over the summer I’ll have plenty of time to do a couple of projects and hopefully come up with something cool to contribute to the scene.

Anyway, enough of the small talk. I’ve decided to blog the journey through development in as much detail as I can, from the installation of the libraries to writing the first and last bits of code, as a sort of set of tutorials for anyone else who wants to get into it.

There are currently two main sets of drivers/libraries out there, libfreenect and OpenNI, both sporting hip, cool, open source names. So which one do you choose? Well, here’s a brief description of both.

Let’s start with OpenNI. These are the official PrimeSense drivers (PrimeSense being the company Microsoft worked with to actually create the Kinect); they allow access to audio, video and depth, with the addition of PrimeSense’s NITE middleware. NITE is the software library used for skeletal tracking, hand gesture recognition and scene analysis (separating figures in the foreground from the background).

Alternatively, there are the libfreenect libraries, from the community over at openkinect.org. While these admittedly lack features such as skeletal tracking and hand gesture recognition, they more than make up for it in their dedication to open source and to creating the best suite available. They give access to the video, microphones, motors and LED, with speaker support currently being worked on. Wrappers exist for a variety of languages on most OSes, and libfreenect will of course be my personal library of choice.

Fortunately, you won’t have to decide which one you’d prefer, because you can run them both on the same machine. You will, however, have to look into the licensing terms before releasing projects built with OpenNI, so it’s unlikely you’ll want to combine the two (or even be allowed to).

libfreenect Installation:

OpenKinect’s getting started page provides a well documented installation guide that should let anyone get up and running under Windows, Linux or OS X, with Ubuntu being the distro of choice for the guide – http://openkinect.org/wiki/Getting_Started

If you’re running Arch, there are a few AUR packages available; however, they all seem to have lacked updates for a few months. Fortunately, the manual build described on the getting started page is pretty simple, and I’ve put together a quick list of commands to get you there:

Grab the git copy of the libraries:

git clone https://github.com/OpenKinect/libfreenect.git
cd libfreenect/

Make, install:

mkdir build
cd build/
cmake ..
make
sudo make install
sudo ldconfig /usr/local/lib64/

Allow your user access to the Kinect by creating a udev rule and adding your user to a group called video (create the group first if it doesn’t already exist):
note: this can be skipped if you don’t mind running as root/sudo

sudo nano /etc/udev/rules.d/66-kinect.rules
sudo usermod -aG video username
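For reference, the contents of that rules file can look something like the snippet below (adapted from the sort of rules shown on the OpenKinect wiki; the vendor/product IDs are for the original Xbox 360 Kinect, so it’s worth double-checking yours with lsusb):

```
# /etc/udev/rules.d/66-kinect.rules
# Xbox NUI Motor
SUBSYSTEM=="usb", ATTR{idVendor}=="045e", ATTR{idProduct}=="02b0", MODE="0660", GROUP="video"
# Xbox NUI Audio
SUBSYSTEM=="usb", ATTR{idVendor}=="045e", ATTR{idProduct}=="02ad", MODE="0660", GROUP="video"
# Xbox NUI Camera
SUBSYSTEM=="usb", ATTR{idVendor}=="045e", ATTR{idProduct}=="02ae", MODE="0660", GROUP="video"
```

MODE 0660 with GROUP video matches the group-based approach above; you may need to replug the Kinect (or reload udev rules) and log out and back in for the group change to take effect.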

Test the Kinect with the example program:

bin/glview

If all went well you should see something similar to the screenshot above. If not, check out the OpenKinect page for more information and see whether the problems you’re having haven’t already been resolved.
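Once glview is up, it’s worth knowing that the depth stream hands you raw 11-bit readings per pixel rather than actual distances. A minimal sketch of turning a raw reading into metres, using the tangent-based approximation commonly attributed to Stéphane Magnenat on the OpenKinect mailing list (an empirical fit, not an official calibration):

```python
import math

def raw_depth_to_metres(raw):
    """Approximate conversion of an 11-bit Kinect depth reading to metres.

    Uses the tangent-based empirical fit from the OpenKinect community;
    readings at or above 2047 mean the sensor got no depth for that pixel.
    """
    if raw >= 2047:
        return float("inf")  # no depth information at this pixel
    return 0.1236 * math.tan(raw / 2842.5 + 1.1863)

# Larger raw readings correspond to objects further away:
print(f"raw 600  -> {raw_depth_to_metres(600):.3f} m")
print(f"raw 1000 -> {raw_depth_to_metres(1000):.3f} m")
```

Handy if you want to do anything metric with the depth image later, like thresholding out everything beyond a couple of metres.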

Categories: Blog, Final Year Project, Posts

FYP – Pedestrian Detection – Test 2

The latest version of my final year project, which is almost complete. It’s designed to be a pedestrian detection system for use in reversing cameras on cars. Possibly more information to come – see here – projects/finalyearproject/

Categories: Linux, OpenCV, Posts, Ubuntu

How To: Fix “No accelerated colorspace conversion found from yuv420p to bgr24.” | OpenCV-2.2.0 & Ubuntu 10.10

[swscaler @ 0xbf2130]No accelerated colorspace conversion found from yuv420p to bgr24.

This problem is basically an issue converting YUV to RGB using ffmpeg; for the conversion to work, ffmpeg needs to be recompiled with x264 support. To get around it, use the following to recompile ffmpeg and OpenCV 2.1/2.2:

  1. Follow steps 1-to-4 of FakeOutdoorsman’s guide on ubuntuforums.org – here
  2. OpenCV 2.1/2.2 Install Guide by Sebastian Montabone – here
Categories: Final Year Project, Posts

FYP | mbed + C328 First Image

After I spent some time writing mbed drivers for the C328 camera and then took a break, it looks like someone swooped in and did a cracking job – http://mbed.org/users/shintamainjp/notebook/CameraC328/. The test program takes an uncompressed snapshot (80x60px), an uncompressed preview (80x60px), a JPEG snapshot and a JPEG preview image and stores them on the mbed filesystem, which lets you grab them via USB.

A huge issue it looks like I’m going to face is that it takes on average about 6.9 seconds (an incredible 0.145fps) to take an 80x60px uncompressed image and around 11.6 seconds (0.08621fps) for a 640x480px JPEG image, which I think is more down to the camera than to the mbed or the software. If that is the case, there’s a pretty high chance I’m going to have to switch cameras.
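A quick back-of-envelope calculation supports that suspicion. Assuming the link to the C328 runs at 115200 baud with standard 8N1 framing (10 bits on the wire per byte) and 2 bytes per pixel for the uncompressed format – my assumptions here, not measured values – the serial transfer alone accounts for well under a second of those 6.9 seconds:

```python
def transfer_time_s(width, height, bytes_per_pixel, baud, bits_per_byte=10):
    """Time in seconds to push one raw frame over a UART link.

    bits_per_byte=10 accounts for the start and stop bits wrapped
    around each 8-bit payload byte under 8N1 framing.
    """
    payload_bits = width * height * bytes_per_pixel * bits_per_byte
    return payload_bits / baud

# 80x60 at 2 bytes/pixel over an assumed 115200 baud link:
uart_time = transfer_time_s(80, 60, 2, 115200)
observed_time = 6.9  # measured average for one 80x60 uncompressed frame

print(f"UART transfer alone: {uart_time:.2f} s")
print(f"Unaccounted for:     {observed_time - uart_time:.2f} s")
```

Since the UART would only explain around 0.8 s of the 6.9 s, the remaining ~6 s would have to come from the camera itself, which points the finger at the C328 rather than the mbed.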

I’ll have to explore the code a little more tomorrow (in the hours when it’s not so late/early) to see how it works, but this should be perfect since it gives me more time for vision processing.

mbed + C328 – JPEG Preview (640x480px)