Thread: VR Headtracking

  1. #1
    Senior Member
    Join Date
    Nov 2003
    Location
    Las Vegas
    Posts
    770

    VR Headtracking

    I've been trying a few methods to do headtracking with a webcam, and I'm more and more convinced it's possible in Flash with items you have at home.

    Attached is the beginning of my experiments, using the same type of on-screen demo environment shown in the YouTube videos for WiiFlash and PS3Eye. It doesn't have the same amount of freedom yet: there is no rotation around the y-axis, and rotation around the x-axis is auto-focused on the horizon. A single light source develops z-depth issues when the detection area fluctuates, but that can be smoothed out with multi-sampling. Once I mount a couple of lights on a hat, I'm sure the increased width will be easier to work with.
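The multi-sampling idea can be sketched as a moving average over the last few measured widths. This is a minimal Python sketch rather than the ActionScript from the post; `WidthSmoother` and the window size are illustrative assumptions, not part of the attached demo.

```python
from collections import deque


class WidthSmoother:
    """Moving average over the last `window` bounding-box widths (hypothetical helper)."""

    def __init__(self, window=5):
        # deque with maxlen silently drops the oldest sample when full
        self.samples = deque(maxlen=window)

    def add(self, width):
        """Record a new measurement and return the smoothed value."""
        self.samples.append(width)
        return sum(self.samples) / len(self.samples)


smoother = WidthSmoother(window=3)
# The third sample (30) is a one-frame flicker in the detected width.
smoothed = [smoother.add(w) for w in [10, 12, 30, 11, 10]]
```

A larger window damps flicker more but makes the camera lag behind real head movement, so the window size is a trade-off to tune by feel.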

    I think for full freedom (or close to it), I'll need a third light source either at the brim or the top of the hat. This will allow z-depth determination by height, and y-axis rotation by width and the distance between the points. I don't know how to do rotation with two points without constraining right->left and left->right (i.e. you couldn't look left if you are standing on the left), which is fine for windowing but isn't full freedom.
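One way to read the three-light idea is through a simple pinhole camera model: the vertical spacing of the lights does not change when the head turns left or right, so it can give depth, and the horizontal spacing then shrinks with the cosine of the yaw angle. The Python sketch below assumes that model; the focal length in pixels and the real-world light spacings are made-up calibration values, and the function names are hypothetical.

```python
import math


def depth_from_height(focal_px, real_height, pixel_height):
    """Pinhole model: apparent size = focal * real size / depth."""
    return focal_px * real_height / pixel_height


def yaw_from_width(focal_px, real_width, pixel_width, depth):
    """Horizontal spacing shrinks by cos(yaw); returns the yaw magnitude in degrees."""
    cos_yaw = pixel_width * depth / (focal_px * real_width)
    cos_yaw = max(-1.0, min(1.0, cos_yaw))  # guard against measurement noise
    return math.degrees(math.acos(cos_yaw))


# Assumed setup: lights 0.10 m apart vertically and 0.20 m apart horizontally.
z = depth_from_height(focal_px=500, real_height=0.10, pixel_height=50)
yaw = yaw_from_width(focal_px=500, real_width=0.20, pixel_width=50, depth=z)
```

The cosine only gives the size of the turn, not its direction, which is the same left/right ambiguity described above; the third light's horizontal offset relative to the other two would be one way to recover the sign.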

    Any ideas on better methods?

    Here's some ugly code I'm working on:
    PHP Code:
    import flash.display.BitmapData;
    import flash.geom.Rectangle;
    import flash.geom.Point;
    import flash.geom.Matrix;

    // set up rendering variables
    var Ox = Stage.width/2;
    var Oy = Stage.height/2;
    var focalLength = 100;
    var cam = new Object();
    cam.x = 0;
    cam.y = 0;
    cam.z = 100;

    // light source width multiplier
    // for a single point of light try values between 5 and 20
    // for two points of light, or a light bar, try values between 0.1 and 5
    var scaler = 10;

    // attach the webcam video to a video object
    my_cam = Camera.get();
    webcam_video.attachVideo(my_cam);

    // create a mirror image of the webcam video, scaled 2* for help with precision later
    createEmptyMovieClip('holder', getNextHighestDepth());
    now = new BitmapData(webcam_video._width*2, webcam_video._height*2);
    holder.attachBitmap(now, holder.getNextHighestDepth());
    with(holder){
        _x = webcam_video._width;
        _y = webcam_video._height+10;
        _xscale = -50;
        _yscale = 50;
    }

    // create a box to place around the isolated light source for debugging
    createEmptyMovieClip('box', getNextHighestDepth());
    with(box){
        lineStyle(1, 0xFFFFFF); lineTo(100, 0); lineTo(100, 100); lineTo(0, 100); lineTo(0, 0);
    }

    // create some target objects
    createEmptyMovieClip('targets', getNextHighestDepth());

    MovieClip.prototype.makeTargets = function(targetObj){
        for(var n = 0; n < targetObj.names.length; n++){
            this.createEmptyMovieClip(targetObj.names[n], 100+n);
            // concentric rings drawn as very thick one-pixel lines
            with(this[targetObj.names[n]]){
                lineStyle(100, targetObj.colors[n]); lineTo(1, 0);
                lineStyle(50, 0xffffff); moveTo(0, 0); lineTo(1, 0);
                lineStyle(25, targetObj.colors[n]); moveTo(0, 0); lineTo(1, 0);
                lineStyle(12, 0xffffff); moveTo(0, 0); lineTo(1, 0);
            }
            // world-space position of the target
            this[targetObj.names[n]].x = targetObj.x[n];
            this[targetObj.names[n]].y = targetObj.y[n];
            this[targetObj.names[n]].z = targetObj.z[n];
        }
    };

    targetsObject = new Object();
    targetsObject.names = new Array('target1','target2','target3','target4','target5','target6');
    targetsObject.x = new Array(0,200,-300,-100,-300,300);
    targetsObject.y = new Array(0,200,-100,-100,-300,-300);
    targetsObject.z = new Array(100,200,300,500,700,700);
    targetsObject.colors = new Array(0xff0000,0x0000ff,0x00b000,0xff00ff,0xffff00,0xff6666);

    targets.makeTargets(targetsObject);

    // work the magic
    this.onEnterFrame = function(){

        // draw the current webcam image to a bitmapdata object
        // scaling up will help with accuracy a bit
        matrix = new Matrix();
        matrix.scale(2,2);
        now.draw(webcam_video, matrix);

        // eliminate all but the brightest colors
        now.threshold(now, now.rectangle, new Point(0, 0), '<=', 0xFF666666, 0xFF000000, 0xFF0000FF, false);

        // find the bounding box of the brightest color
        redBox = now.getColorBoundsRect(0x00FF0000,0x00FFFFFF,false);

        // TO-DO -- use multiple lights, subdivide the bounding box and repeat to isolate each light

        // if a light source is detected, track it
        if(redBox.width > 0){

            // align and resize box indicator for debugging
            box._x = redBox.x/2-redBox.width/4+webcam_video._x;
            box._y = redBox.y/2-redBox.height/4+webcam_video._y;
            box._width = redBox.width/2;
            box._height = redBox.height/2;

            // set z based on bounding box width and scaler value
            cam.z = redBox.width*scaler;

            // determine the head position in 3d space (remember that holder has been scaled to 2*)
            var ratio = focalLength / (focalLength + cam.z);
            cam.x = (holder._width - redBox.x) * ratio * 4;
            cam.y = (redBox.y - holder._height / 2) * ratio * 4;
            cam.rotY = 1-(redBox.y/holder._height); // keeps the camera pointed at horizon

            // TO-DO -- decide which freedoms are important, rotation vs position in x/y planes
            // perhaps 3 points of light can be used to triangulate all freedoms using x,y,width,height ratios

            // render targets and perspective lines
            for(var n = 0; n < targetsObject.names.length; n++){
                adjust3d(targets[targetsObject.names[n]], cam);
            }
            renderLines(cam);
        }
    };

    // 3d perspective rendering
    function adjust3d(obj, cam){
        // depth of the object relative to the camera
        var TminusC = obj.z-cam.z;
        if(focalLength + TminusC == 0){
            var ratio = 0.00000001;
        }else{
            var ratio = focalLength / (focalLength + TminusC);
        }
        obj._x = Ox + (obj.x-cam.x) * ratio;
        obj._y = Oy + (obj.y-cam.y) * ratio;
        obj._xscale = obj._yscale = ratio * 100;
        // hide anything that has ended up behind the camera
        if(obj.z-cam.z < -focalLength){ obj._visible = false; }else{ obj._visible = true; }
        obj.swapDepths(Math.round(10000-obj.z-cam.z));
    }

    // project a world-space point to screen space relative to the camera
    function project(x, y, z, cam){
        var ratio = focalLength / (focalLength + z-cam.z);
        return {x: Ox + (x-cam.x) * ratio, y: Oy + (y-cam.y) * ratio};
    }

    // draw one line on _root between two world-space points
    function drawLine3d(x1, y1, z1, x2, y2, z2, cam){
        var s = project(x1, y1, z1, cam);
        var e = project(x2, y2, z2, cam);
        _root.moveTo(s.x, s.y);
        _root.lineTo(e.x, e.y);
    }

    // render perspective lines
    function renderLines(cam){
        var w = Stage.width;
        var h = Stage.height;
        var near = 200;
        var far = 100000;
        _root.clear();
        _root.lineStyle(1, 0xffffff);

        // lines running into the distance along the ceiling and floor
        for(var x = -w; x <= w; x += 200){
            drawLine3d(x, -h, near, x, -h, far, cam);
            drawLine3d(x, h, near, x, h, far, cam);
        }
        // lines running into the distance along the left and right walls
        for(var v = -h; v <= h; v += 200){
            drawLine3d(-w, v, near, -w, v, far, cam);
            drawLine3d(w, v, near, w, v, far, cam);
        }
        // rectangular frames at increasing depth
        for(var z = near; z <= far; z *= 1.5){
            drawLine3d(w, h, z, w, -h, z, cam);
            drawLine3d(-w, h, z, -w, -h, z, cam);
            drawLine3d(w, h, z, -w, h, z, cam);
            drawLine3d(w, -h, z, -w, -h, z, cam);
        }
    }

    stop();
    Note: for this to work, you need either a dark room or a piece of film over the webcam lens to filter out all light other than your light source.
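For readers without Flash, the core of the listing (keep only the brightest pixels, then take their bounding box) can be sketched in plain Python. This only illustrates the idea behind the `threshold` and `getColorBoundsRect` pair; it works on a grid of 0-255 brightness values instead of a BitmapData, and `bright_bounds` is a hypothetical name.

```python
def bright_bounds(gray, threshold):
    """Return (x, y, width, height) of all pixels brighter than threshold, or None."""
    xs, ys = [], []
    for y, row in enumerate(gray):
        for x, value in enumerate(row):
            if value > threshold:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None  # no light source in view
    return (min(xs), min(ys), max(xs) - min(xs) + 1, max(ys) - min(ys) + 1)


# A tiny 5x4 "frame" with one bright spot, as seen in a darkened room.
frame = [
    [0, 0, 0, 0, 0],
    [0, 0, 250, 255, 0],
    [0, 0, 240, 0, 0],
    [0, 0, 0, 0, 0],
]
box = bright_bounds(frame, threshold=0x66)
```

The width of the returned box is what the listing multiplies by `scaler` to get the camera depth, so any stray bright pixel widens the box and throws the depth off, which is why the dark room or the film filter matters.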
    Last edited by JerryScript; 02-20-2008 at 07:19 PM. Reason: forgot to add attachment

  2. #2
    do your smiles love u? slicer4ever's Avatar
    Join Date
    Dec 2005
    Location
    in a random occurance
    Posts
    475
    Nice to see you're asking for help rather than just linking to a vid =-)

    Unfortunately I've never worked with a webcam, so I don't think I'll be of much help.

  3. #3
    M.D. mr_malee's Avatar
    Join Date
    Dec 2002
    Location
    Shelter
    Posts
    4,139
    Note: for this to work, you need either a dark room or a piece of film over the webcam lens to filter out all light other than your light source.

    That's why I'm hesitant about this webcam stuff; it seems that to get something working you need to be in really specific conditions. Would it work better if you chose a color that is not seen much in the real world or in an office (magenta?) and used that for the points?

    You would have to go to an arts and crafts store, buy those little sticky dots and slap them on your head, then isolate that color in the bitmap. It might work better, and in any light.
    lather yourself up with soap - soap arcade

  4. #4
    Senior Member
    Join Date
    Nov 2003
    Location
    Las Vegas
    Posts
    770
    A piece of film from some bad negatives taped to your webcam lens is all you need for daytime, and that's not too difficult (OK, if you're of the digital age, ask your mom/grandma). Turning off the lights after dark isn't too hard either, and I've used everything from a lighter, to my mouse, to my cellphone as the targeting light source.

    I've done some testing using specific color values, but the issue is with reflected light. Even if magenta doesn't exist in the real scene, it does show up in reflected light on a webcam, since the true colors are resolved to a limited subset of the spectrum (256 levels per channel). A single point of magenta-colored light away from your targeting dot will make the getColorBoundsRect results worthless.

  5. #5
    When you know are. Son of Bryce's Avatar
    Join Date
    Aug 2002
    Location
    Los Angeles
    Posts
    838
    I was really interested in this post and did a quick search to see if there were any resources on how tracking is accomplished with Sony's EyeToy for PlayStation, and I stumbled upon this interesting video.

    http://www.gametrailers.com/player/u...es/169289.html

    In the video the guy recreates that Wii VR tracking demo on the PlayStation 3. He wears a pair of glasses with infrared lights on them, and the camera picks these up. The camera uses film to filter out everything but the infrared light, similar to how Jerry described.

    I was wondering if you tried this, Jerry. I'm guessing you've already seen it since you mentioned the film and PS3Eye. The conditions may be a bit specific, but if this is something you can set up for less than $50 it's definitely worth investigating, if only as an artistic experiment.

    I'm starting to think I should invest in a webcam just for new interactive development.

  6. #6
    Senior Member
    Join Date
    Nov 2003
    Location
    Las Vegas
    Posts
    770
    Yes, I'm inspired by both the WiiFlash and PS3Eye demonstrations. I checked out the source, but it's dependent upon libraries I don't have. One of the source files for the PS3Eye demo uses lookups for a lot of values, which helps with speed I'm sure.

    You can buy a cheap webcam for $20, or a used one for $10. I have a cheap one that is over 6 years old, and it works fine for this. A piece of scrap negative can't even be figured into the cost; you could get it for free at any photo shop if you don't have some old photo envelopes lying around. So for 10 bucks, you could be set up.

    Being able to move around objects is so immersive it shouldn't be overlooked as a fun (dare I say it after a recent thread here, even addicting) factor for games!

  7. #7
    ....he's amazing!!! lesli_felix's Avatar
    Join Date
    Nov 2000
    Location
    London UK
    Posts
    1,506
    You could do some clever OCR to get the colourbounds thing working.

    One or two odd pixels here and there are bound to be discernible from a small area of more solid colour.

    You could also do some pre-calibration to get it working, so if you're sitting in front of a magenta wall, it tells you to move your ass somewhere else, or use a different colour.
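Telling "one or two odd pixels" apart from "a small area of more solid colour" can be done by grouping matching pixels into connected blobs and keeping only the largest one. The Python sketch below is one possible way to do that, not something from the thread; `largest_blob` and the `min_area` cut-off are assumptions for illustration.

```python
def largest_blob(mask, min_area=4):
    """Return the pixels of the largest 4-connected blob, or [] if it is too small."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    best = []
    for y0 in range(h):
        for x0 in range(w):
            if not mask[y0][x0] or seen[y0][x0]:
                continue
            # flood fill outward from this unvisited matching pixel
            stack, blob = [(x0, y0)], []
            seen[y0][x0] = True
            while stack:
                x, y = stack.pop()
                blob.append((x, y))
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if 0 <= nx < w and 0 <= ny < h and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        stack.append((nx, ny))
            if len(blob) > len(best):
                best = blob
    return best if len(best) >= min_area else []


# 1 marks a pixel that matched the target colour; (0, 0) is a stray reflection.
mask = [
    [1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 0],
    [0, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
blob = largest_blob(mask)
```

The same pass could serve the pre-calibration idea: if the largest blob already covers a big share of the frame before the user puts the dot on, the background probably contains the target colour.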

  8. #8
    Will moderate for beer
    Join Date
    Apr 2007
    Location
    Austin, TX
    Posts
    6,801
    I've been experimenting with face-recognition based tracking lately. Although currently too slow to be useable (1-3 fps), I do get x/y tracking with NO special equipment. Z shouldn't be too hard to get via scale comparison. Rotation is far more problematic for this method.

    You can see a demo of the tracking here:
    http://suckatmath.com/personal/faced...acedetect.html

    I threw this together with papervision yesterday (my first pv3d experiment), and got a very natural interface. You move your head, you move the camera.

    If you'd like to see the code, I just open-sourced it. http://code.google.com/p/deface

  9. #9
    Elvis...who tha f**k is Elvis? phreax's Avatar
    Join Date
    Feb 2001
    Posts
    1,836
    Nice examples, both Jerry and 5tons.
    5tons, your example is pretty slow and can take several seconds to update if I move my head. Nevertheless it tracks my head as it should! Good job.

    I would love to try some of this stuff some day, so thanks for sharing your progress!
    Streets Of Poker - Heads-Up Texas Hold'em Poker Game

  10. #10
    Senior Member
    Join Date
    Nov 2003
    Location
    Las Vegas
    Posts
    770
    Nice work 5tons! Thanks for posting your source as well!

    As you said, it's too slow currently to be used, but there is potential there.

  11. #11
    Will moderate for beer
    Join Date
    Apr 2007
    Location
    Austin, TX
    Posts
    6,801
    The tracking stuff was actually just a side goal of the main face-detection project, but now I'm finding it quite interesting. How's this for an idea:
    Use face-detection as a calibration step to determine which colors are "face", then use those colors to filter the pixels acquired through the standard difference-between-frames motion detection, or just in a threshold operation. A colorbounds call should then give you a fairly good idea of where in the frame the user's face is, and from there you can use a little bit of edge detection for orientation calculation and location refinement.

    It's a rough idea, but I think that it might be made to work.

  12. #12
    ....he's amazing!!! lesli_felix's Avatar
    Join Date
    Nov 2000
    Location
    London UK
    Posts
    1,506
    Without looking at the source, 5tons, and going on what you've just said, I'm curious...

    Does the whole face tracking code re-execute from fresh every frame, or does it use the last position and previous frame information to get a better grip on where the face is?

  13. #13
    Will moderate for beer
    Join Date
    Apr 2007
    Location
    Austin, TX
    Posts
    6,801
    It does use previous information to narrow the search window for the face (basically it scales the previous found rectangle up and down by .2 as the new limits), but each frame DOES require a fairly expensive preprocessing step to calculate what is called an Integral Image on which the face classifiers work.
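For anyone unfamiliar with the term, an integral image (summed-area table) stores, for every position, the sum of all pixels above and to the left of it, so the sum of any rectangle takes four lookups. The Python sketch below shows the standard construction; it is not taken from the deface source, and the function names are illustrative.

```python
def integral_image(gray):
    """Build a summed-area table with one extra row and column of zeros."""
    h, w = len(gray), len(gray[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += gray[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of the w*h rectangle whose top-left corner is (x, y), in four lookups."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


gray = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
ii = integral_image(gray)
total = rect_sum(ii, 0, 0, 3, 3)   # whole image
corner = rect_sum(ii, 1, 1, 2, 2)  # bottom-right 2x2 block
```

Building the table touches every pixel once per frame, which is the expensive preprocessing step mentioned above; after that, each rectangle feature costs the same no matter how large it is.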

  14. #14
    Senior Member AzraelKans's Avatar
    Join Date
    May 2002
    Location
    Hell... with frequent access to heaven ;)
    Posts
    409
    mr_malee:
    Actually, you could use IR LEDs instead of regular lights, so you could use it in regular lighting conditions.

    Great Work jerryscript!

    By the way, I remember you mentioned this method to me when I was working with my OSR engine, but I found it is faster to:

    -Store the image in a byte array (approx. 12 ms for a 320*240 image)
    -Analyze the byte array against a lighting threshold and store the result in a binary array.
    -Analyze the binary array instead.

    It is a LOT faster to analyze an array in memory than to use getPixel or threshold, about 10 times faster actually (a lot more coding is required, though).
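The three steps above can be sketched in Python as follows. This only illustrates the data flow (raw bytes in, a 0/1 array out, then analysis on that array); it assumes one brightness byte per pixel in row-major order, the names are hypothetical, and it says nothing about the speed-up, which is specific to the Flash player.

```python
def binary_mask(pixels, width, height, threshold):
    """Turn a flat buffer of 8-bit brightness values into a flat 0/1 array."""
    mask = bytearray(width * height)
    for i in range(width * height):
        if pixels[i] > threshold:
            mask[i] = 1
    return mask


def bright_columns(mask, width, height):
    """Analyse the binary array only: count lit pixels in each column."""
    counts = [0] * width
    for i, bit in enumerate(mask):
        counts[i % width] += bit
    return counts


# A 4x2 image stored as raw bytes, the way a byte array dump would hold it.
pixels = bytes([0, 200, 10, 0,
                0, 255, 0, 0])
mask = binary_mask(pixels, width=4, height=2, threshold=128)
columns = bright_columns(mask, width=4, height=2)
```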
    Last edited by AzraelKans; 02-21-2008 at 02:13 PM.

  15. #15
    Senior Member
    Join Date
    Nov 2003
    Location
    Las Vegas
    Posts
    770
    Hmmm, 5tons' demo, and mr_malee's and lesli_felix's posts made me think...

    1- click on several points on the face to determine a color average

    2- click on the eyes and determine their color average

    3 - use threshold and getColorBoundsRect to determine the face and eye screen positions

    4 - use the spatial relation between the eyes and the face to determine orientation and positioning including depth

    With this method, you increase your points of reference from 1 to 6 making positioning and orientation much easier to calculate, and more accurate. The farther the eyes are from the top of the face, the greater the cam.rotX. The farther from the bottom of the face, the lesser the cam.rotX. The farther from the left side, the greater the cam.rotY (converse for right side). The greater the size of the face and/or the distance between the eyes, the lesser the cam.z.
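The relationships described above can be written down as a rough mapping from the face rectangle and eye positions to the camera values. This Python sketch is an assumption-heavy illustration, not the eventual implementation: it uses the midpoint between the eyes, normalises offsets to the range -1..1, and treats `reference_width` as a made-up calibration value for the face width at z = 1.

```python
def head_pose(face, left_eye, right_eye, reference_width):
    """Return (rot_x, rot_y, z) from a face box (x, y, w, h) and two eye points."""
    fx, fy, fw, fh = face
    eye_x = (left_eye[0] + right_eye[0]) / 2
    eye_y = (left_eye[1] + right_eye[1]) / 2
    # farther from the left edge of the face -> greater rot_y
    rot_y = ((eye_x - fx) / fw - 0.5) * 2
    # farther from the top edge of the face -> greater rot_x
    rot_x = ((eye_y - fy) / fh - 0.5) * 2
    # larger face on screen -> smaller z (closer to the camera)
    z = reference_width / fw
    return rot_x, rot_y, z


face = (100, 50, 200, 300)  # x, y, width, height in pixels
rot_x, rot_y, z = head_pose(face, (160, 170), (240, 170), reference_width=200)
```

Because eyes sit above the centre of a face even when looking straight ahead, the neutral `rot_x` will not be zero; subtracting the value measured during the initial clicks would be one way to calibrate that out.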

    hmmm, just thought of something else... If I end up posting an example using an image of my face, you are all going to regret it!

  16. #16
    M.D. mr_malee's Avatar
    Join Date
    Dec 2002
    Location
    Shelter
    Posts
    4,139
    That sounds just crazy enough to work. Get a demo up so I can rotate stuff with me noggin.
    lather yourself up with soap - soap arcade

  17. #17
    Senior Member
    Join Date
    Nov 2003
    Location
    Las Vegas
    Posts
    770
    Just an update here. I've only been able to work on this in my spare time, but at least there is good progress, and no need for special lights or other equipment. Current status:

    1- head recognition : buggy. I first attempted colorBounds reduction via motion detection, but this doesn't work well for those with long full voluptuous hair. I may have to resort to an initialization matrix via a histogram of a scaled-down bitmapData (down to 10px), but I really want to avoid histograms

    2- facial orientation : buggy, but improving fast. Using a modified version of GSkinner's ColorMatrix class in combination with paletteMap, I can easily find the eye height, and with a bit of twiddling of contrast/brightness I can connect the dots between the eyes

    3- noise reduction : not-implemented yet, I'm experimenting with a couple of algorithms to try to make different test sets cancel each other's errors, and to make each test set ignore out-of-bounds data. For those interested in the algorithm I'm trying to adapt, you can read about it here : http://www.mii.lt/informatica/pdf/INFO537.pdf

    My current process is as follows:

    1- grab webcam image and store in two bitmapData's via draw

    2- use difference blend mode and threshold to determine motion area

    3- copyPixels of motion area to new bitmapData (this cuts out as much of the image as possible resulting in a smaller image area for further processing)

    4- adjust the brightness and contrast for initial color reduction (GSkinner's)

    5- paletteMap the results to as few colors as possible (down to 9 or even 6)

    6- copyPixels from a narrow band in the center of the motion detection bitmapData at what is assumed to be the forehead height, then use threshold to determine the eye height

    7- floodFill the paletteMap bitmapData in the assumed forehead region, then use threshold to determine the facial width

    8- copyPixels using the facial width and eye height as a guide to grab the eye region, then use threshold to determine the eye positions

    9- draw an ellipse using bitmapFill in a separate sprite with 2:1 (h:w) proportions based on the floodFill/threshold results to show only the face, then draw a box from eye to eye.
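The first few steps of this process (compare two frames, threshold the difference, crop to the area that moved) can be sketched in Python on plain brightness grids. It stands in for the difference blend mode, `threshold` and `copyPixels` calls rather than reproducing them; the function names and the threshold of 30 are assumptions.

```python
def motion_bounds(prev, curr, threshold=30):
    """Bounding box (x, y, w, h) of pixels that changed by more than threshold."""
    xs, ys = [], []
    for y, (row_a, row_b) in enumerate(zip(prev, curr)):
        for x, (a, b) in enumerate(zip(row_a, row_b)):
            if abs(a - b) > threshold:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None  # nothing moved between the two frames
    return (min(xs), min(ys), max(xs) - min(xs) + 1, max(ys) - min(ys) + 1)


def crop(frame, box):
    """Copy only the motion area so later steps process fewer pixels."""
    x, y, w, h = box
    return [row[x:x + w] for row in frame[y:y + h]]


prev = [[0] * 4 for _ in range(4)]
curr = [[0] * 4 for _ in range(4)]
curr[1][1] = 100  # two pixels changed between frames
curr[2][2] = 90
box = motion_bounds(prev, curr)
face_region = crop(curr, box)
```

A head that stays still produces no difference at all, so a real tracker would probably need to hold on to the last known box when `motion_bounds` returns `None`.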

    The result is that you can now track the pitch and yaw of the head. I doubt I'll include roll unless it's absolutely necessary or I find a cheap way. Once I work out a good head detection method (regardless of hair), it will be possible to make both fulcrum adjustments, and pitch and yaw, which should be enough for VR with nothing more than a webcam! Mr Malee's Amazing Noggin Turning Control System is not far away!

    Attached is an example swf. The text boxes in the lower left are for brightness (left text box) and contrast (right text box). For my system and lighting conditions, I get the best results with brightness adjusted based on conditions (0 to -50), and contrast set as high as possible (up to 100).

    Note- this only works with head motion, not full body motion yet! If anyone has some good ideas for how to find the head regardless of how much of the body is showing (and regardless of hair style), I would appreciate your suggestions!

  18. #18
    Will moderate for beer
    Join Date
    Apr 2007
    Location
    Austin, TX
    Posts
    6,801
    Well, since posting that demo back in the thread, I've got some optimization help from FlashGuru on the face detection code. It's still not quite fast enough to use alone in a realtime situation, but it's somewhere between 10 and 20 fps. We used it as an intermittent position corrector for a mean-shift head tracking application, and that worked very well.
    I haven't updated the code in the svn repository yet, but will probably do that in the coming days or weeks.

    You could take a similar approach, using face detection for initialization and correction, but letting faster/looser algorithms do the actual tracking.
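The split described above (a slow but reliable detector for initialization and correction, a fast but loose tracker in between) can be sketched as a simple loop. In this Python sketch `detect` and `fast_update` are placeholders passed in by the caller; neither the face detector nor the mean-shift tracker is implemented here, and `detect_every` is an arbitrary assumed interval.

```python
def track(frames, detect, fast_update, detect_every=10):
    """Run the slow detector every `detect_every` frames and the fast tracker otherwise."""
    position = None
    path = []
    for i, frame in enumerate(frames):
        found = None
        if position is None or i % detect_every == 0:
            found = detect(frame)  # slow, accurate, may return None
        if found is not None:
            position = found  # initialise or correct accumulated drift
        elif position is not None:
            position = fast_update(frame, position)  # cheap per-frame step
        path.append(position)
    return path


# Toy stand-ins: each "frame" is just the true head position, the detector
# reads it exactly, and the fast tracker lags by moving only 0.5 per frame.
frames = [0, 1, 2, 3, 4, 5]
path = track(frames,
             detect=lambda frame: frame,
             fast_update=lambda frame, pos: pos + 0.5,
             detect_every=3)
```

In the toy run the tracker falls behind between detections and snaps back on frame 3, which shows the trade-off: a shorter interval costs frame rate, a longer one lets more drift build up.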
