Real-Time Stroke-based Stylization of Image, Video, 3D Models

Jingwan Lu (jingwanl@princeton.edu), Pedro Sander, Adam Finkelstein

Introduction

This program takes an image, a video, or a 3D model (or any combination of them) as input and produces a stylized rendering in real time on a high-end commercial graphics card. The user controls the rendering style by adjusting a set of parameters.

To run the executable, you need the following hardware, operating system, and software.

  • NVIDIA Graphics Card (for example GeForce 8800)
  • Windows Vista (earlier versions of Windows do not support DirectX 10)
  • Microsoft DirectX SDK (the November 2007 or 2008 releases are guaranteed to work; the others are not tested)
    Make sure the environment variable DXSDK_DIR is set correctly, to something like C:\Program Files\Microsoft DirectX SDK (November 2008)

The program is developed in C++ and DirectX 10. The source code is provided with Visual Studio 2008 project files.

File Structures

The program accepts three kinds of input.

  • Image : The input images should be put into the folder Image\
  • Video : The input video should be provided as successive frames in jpg format (for example 000000.jpg) and be put into the folder Video\[VideoName]\
    You can decompose a video into images by using MPlayer to run a simple AviSynth (.avs) script.

    For example, the avs script may look like this :
    DirectShowSource("c:\Butterfly.wmv", fps=15).ConvertToRGB24
    ImageWriter("C:\ButterFly\",0,0,"jpg")

  • Model : The input model meshes (sdkmesh, obj), together with textures (dds, jpg) and animations (sdkmesh_anim), should be put into the folder Model\[ModelName]\

    The program currently supports sdkmesh models (Windows specific). You can convert other model types to .x format using software such as Maya, then use a Windows tool to convert .x to .sdkmesh. An sdkmesh model may also come with an animation sequence defined in a .sdkmesh_anim file. For sdkmesh, the program supports rendering both static and animated models. Rendering a static scene expects several .sdkmesh models (sky and objects). Rendering an animated model expects a .sdkmesh model and a .sdkmesh_anim file (which specifies the animation sequence).

    The program also supports textured obj models. It expects a .obj file, a .mtl file (which specifies the materials), and textures, if any. The obj loader is not robust: it supports only diffuse, specular, and ambient lighting. Bump mapping, reflection mapping, and texture atlases (specified in .mtl) are not supported.
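
The frame naming for video input described above can be sketched as follows. This is only an illustration, not code from the program: the helper name framePath is hypothetical, and the six-digit zero padding is inferred from the example file name 000000.jpg.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper (not part of the program's source): builds the path of
// frame `index` for a video stored as Video\<videoName>\NNNNNN.jpg.
std::string framePath(const std::string& videoName, unsigned index) {
    char fileName[32];
    // %06u pads the index with zeros to six digits, matching 000000.jpg.
    std::snprintf(fileName, sizeof(fileName), "%06u.jpg", index);
    return "Video\\" + videoName + "\\" + fileName;
}
```

For example, framePath("Butterfly", 15) yields Video\Butterfly\000015.jpg.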

The look of individual brush strokes is defined by two types of textures.

  • Surface Texture : The textures that define the stroke surface details should be put into the folder StrokeTex\
  • Shape Texture : The textures that define the 2D shape of the stroke should be put into the folder AlpMask\

Screenshots taken while the program is running are saved to the folders ScreenShot\Original\ and ScreenShot\Rendering\.

The rendering style files (txt format) should be put into the folder Style\. The file Style Format.txt describes the contents of the style files, which the program can save and load. The other style files are associated with different inputs and are preloaded into the program when those inputs are rendered.

UI and Features

The program starts with no UI showing in the window.

Pressing the space key shows the main menu and the help text. The UI is separated into two menus: Stroke Properties and Shared Selection. Stroke Properties controls the stroke textures, stroke size, probability, alpha, etc. The remaining controls, which are shared by the different rendering sessions (image, video, model, etc.), are in Shared Selection.

Help Text

  • Frame Per Second
  • Rendering Session
    Image
    Video
    Static Model (Optical Flow) : render the scene using 'perfect' optical flow
    Static Model : render the static scene frame by frame, construct new strokes for every frame
    Animated Model : models (sdkmesh) with an animation sequence (sdkmesh_anim)
    Model On Image : animated model rendered on top of still image
    Model On Video : animated model rendered on top of video frames
    Obj Model : render the static obj model using 'perfect' optical flow
  • Total Brush Stroke Count : the total number of strokes in all layers
  • Stroke count breakdown for the three layers : S (small strokes), M (medium strokes), L (large strokes)
    Pre : the number of strokes in the layer in the previous frame
    Cur : the number of strokes in the layer in the current frame
    Append : the number of strokes appended in the append step
    Delete : the number of strokes (a negative value) deleted in the update and delete steps
    Cur = Pre + Append + Delete
  • Animation Paused : shown when the user has paused the animation of the animated model
  • Video is loaded : shown when the video frames have been selected and loaded onto the graphics card
  • Camera position : where the camera is in 3D
  • lookat position : the point in 3D that the camera is aimed at
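
The stroke count bookkeeping shown in the help text can be sketched as below. The struct and function names are hypothetical and are not taken from the program; the only facts used are that Delete is reported as a negative number, that Cur = Pre + Append + Delete, and that the total covers all three layers.

```cpp
// Hypothetical per-layer counters mirroring the help text fields.
struct LayerStrokeCount {
    int pre;       // strokes in the layer in the previous frame
    int appended;  // strokes added in the append step (>= 0)
    int deleted;   // strokes removed, stored as a negative number (<= 0)
};

// Cur = Pre + Append + Delete (Delete is already negative).
int currentCount(const LayerStrokeCount& c) {
    return c.pre + c.appended + c.deleted;
}

// Total Brush Stroke Count: sum of the current counts of layers S, M and L.
int totalCount(const LayerStrokeCount& s,
               const LayerStrokeCount& m,
               const LayerStrokeCount& l) {
    return currentCount(s) + currentCount(m) + currentCount(l);
}
```

For example, a layer with Pre = 1000, Append = 120 and Delete = -80 has Cur = 1040.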

Mouse Interaction

Changing the camera and lookat positions is enabled only for model-related rendering.
  • Right Mouse Button : rotate the camera around the model, which changes the camera position
  • Middle Mouse Button : pan the camera
  • Middle Mouse Wheel : zoom the camera in/out
  • Left Mouse Button :
    • with Change LookAt Pos checked : change the lookat position
    • with Enable Focus checked : change the focus point position
    • default : do nothing

In Stroke Properties

  • Use Red for Orientation : use the red color channel or hue as the stroke orientation
  • Clip Strokes at edges : enable or disable stroke clipping
  • Stroke Texture : selects the surface detail of the strokes
  • Alpha Texture : crops the strokes into different shapes
  • Gradient Threshold : only strokes with a gradient magnitude larger than this threshold are drawn
  • Gradient Threshold 1 : separates the coarse and medium levels
  • Gradient Threshold 2 : separates the medium and fine levels; should be larger than Gradient Threshold 1
  • Stroke Length, Stroke Width : specify the size of the coarse level strokes
  • Relative Size : specifies the stroke sizes of the other levels as ratios of the coarse level stroke size

The remaining controls are self-explanatory.
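
One way to read the gradient thresholds and relative sizes is sketched below. It is a guess at the logic, not the program's implementation. Assumptions: lower gradient magnitudes map to the coarse level and higher ones to the fine level (suggested by Gradient Threshold 2 being the larger value), and Relative Size is stored as one ratio per non-coarse level. All type, field, and function names are hypothetical.

```cpp
enum class StrokeLevel { Coarse, Medium, Fine };

// Hypothetical parameter block mirroring the Stroke Properties controls.
struct StrokeParams {
    float gradientThreshold;   // strokes are drawn only above this magnitude
    float gradientThreshold1;  // coarse / medium boundary
    float gradientThreshold2;  // medium / fine boundary (> gradientThreshold1)
    float strokeLength;        // length of a coarse level stroke
    float relativeSizeMedium;  // assumed: medium size as a ratio of coarse
    float relativeSizeFine;    // assumed: fine size as a ratio of coarse
};

// A stroke is drawn only if its gradient magnitude exceeds the threshold.
bool isDrawn(const StrokeParams& p, float gradientMagnitude) {
    return gradientMagnitude > p.gradientThreshold;
}

// Assumed mapping: small gradients -> coarse, large gradients -> fine.
StrokeLevel levelFor(const StrokeParams& p, float gradientMagnitude) {
    if (gradientMagnitude < p.gradientThreshold1) return StrokeLevel::Coarse;
    if (gradientMagnitude < p.gradientThreshold2) return StrokeLevel::Medium;
    return StrokeLevel::Fine;
}

// Stroke length per level, derived from the coarse length and the ratios.
float lengthFor(const StrokeParams& p, StrokeLevel level) {
    switch (level) {
        case StrokeLevel::Medium: return p.strokeLength * p.relativeSizeMedium;
        case StrokeLevel::Fine:   return p.strokeLength * p.relativeSizeFine;
        default:                  return p.strokeLength;
    }
}
```

The stroke width would follow the same pattern as the length.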

In Shared Selection

  • Style
    • Load Style : open a file dialog to load a specific style file (.txt) which includes most of the parameter values
    • Save Style : save all the parameter values to a specific .txt file
  • Screen Capture

    There are three types of screen capture. The rendering results are saved into \ScreenShot\Rendering as bmp files (for example 0.bmp). The renderings of the original input are saved into \ScreenShot\Original. Output file names are numbered starting from 0, and the number keeps incrementing for as long as the program runs.

    • Image Capture : captures a single frame.
    • Clip Capture : captures the result of continuous camera movement or mesh animation
    • Shot : used to capture rendered video frames, check it to start and uncheck it to stop
    • Reset Index : reset the output file name to start from 0 again
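
The output file numbering described above can be sketched as a simple counter. The class name and its methods are hypothetical, and the sketch covers only the \ScreenShot\Rendering folder; how the program shares the index with \ScreenShot\Original is not specified here.

```cpp
#include <string>

// Hypothetical sketch of the screenshot numbering: names start at 0.bmp and
// keep incrementing until the index is reset.
class ScreenshotNamer {
public:
    // Returns the next output path and advances the index.
    std::string nextRenderingPath() {
        return "ScreenShot\\Rendering\\" + std::to_string(index_++) + ".bmp";
    }

    // Mirrors the Reset Index control: numbering starts from 0 again.
    void reset() { index_ = 0; }

private:
    unsigned index_ = 0;
};
```

Note that after a reset, new captures would reuse earlier file names under this scheme.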

  • Input Selection

    There are five types of input for stylization: image, video, static model, animated model, and obj model. All inputs except video are loaded at run time when you select one from the dropdown list. For video, you need to check "Load Video" to load the frames, which takes several seconds. An obj model may also take several seconds to load, depending on the complexity of its geometry. Each type of input is loaded together with its own default style (imageStyle.txt, videoStyle.txt, sodierStyle.txt, etc.). You can load another style file for rendering or change the parameters interactively. The camera position and lookat position are also adjusted for each input.

    • Input Image
    • Input Video
    • Input Animated Model
    • Input Static Model
    • Input Obj Model
    • Load Video : Load the selected video frames

  • Rendering

    To switch from rendering one type of input to another, select a different rendering session from the dropdown list. You can also select a different rendering stage to inspect the intermediate results.

    • session : After the rendering session is changed, any parameter changes made at run time for the previous rendering session are lost. Rendering a new type of input starts with predefined parameters.
      Image
      Video
      Static Model (optical flow)
      Static Model
      Animated Model
      Model on Image
      Model on Video
      Obj Model
    • Rendering Stage : render intermediate results such as
      Original : original input image
      Blurred : blurred input image
      Gradient : gradient image
      Stroke Position : stochastically determined stroke positions
      Delete Position : the program probabilistically deletes strokes from the red area
      Append Position : the program probabilistically appends strokes to the red area
      Final Rendering : final rendering result
      Inter Rendering Result : not used

  • Focus

    To draw attention to a specific area, the user can specify a focus point. The gradients are attenuated gradually around that point. r1 specifies the radius of a smaller circle and r2 the radius of a larger circle. Within the smaller circle, the gradients are not attenuated at all. Between the two circles, the degree of attenuation changes smoothly. Outside the larger circle, the gradients are attenuated by a constant scale f. Nothing is attenuated when f is 1. When r1 is larger than r2, the program produces the opposite effect.

    • Enable Focus : Enable the left mouse button to specify the focus point position in the image space
    • Focus (f) : f from 0 to 1
    • Focus (r1) : r1 from 0 to 1
    • Focus (r2) : r2 from 0 to 1
  • Others
    • Optical Flow : Render the optical flow orientations (triangles with the tip pointing in the positive direction) at some sample points. Probability (Medium) changes the number of sample points.
    • Num Frame : The number of video frames to load onto the graphics card. For example, 50 means the program will load 50 frames to the GPU as textures.
    • Fade In Speed : The speed at which new strokes fade in.
    • Fade Out Speed : The speed at which old strokes fade out.
    • Start : Start playing the video
    • Stop : Stop playing the video

Known Bugs

  • The window is not resizable. Dragging to change the window size may crash the program.
  • The input image always appears washed out: the rendering of the original input image differs from how the image actually looks.
  • When you close the program, a dialog will show up saying "The Direct3D device has a non-zero reference count, meaning some objects were not released". This is harmless. Press "Ok" and ignore it.
  • The precision of video optical flow is very low.

Limitations

  • High frequency strokes introduce more temporal noise. Moving images work better with a small number of high frequency strokes.
  • The stroke clipping tends to clip more than it should, which leaves some blank areas near the edges.
  • The program does not let the user select an arbitrary input. The simplest way to load a new image without changing the code is to rename it to the name of an existing image.
  • The program works best with a high-end NVIDIA graphics card. Real-time performance is not guaranteed on ATI cards or low-end NVIDIA cards.
  • The rendering algorithm does not scale well with resolution. Performance drops quickly as the resolution increases.
  • The program currently does not support compositing an image (or video frame) with static models.