Sunday, March 11, 2012

Figures and FG/BG-selective filtering

Some figures for the write-up:


The next two figures show the results of flash/no-flash integration (still very hacked, since hardware support for timing the flash isn't available in the current FCam release). The alternating flash/no-flash shots are used to separate the foreground (FG) from the background (BG); the filter is then applied only to the FG or the BG.
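The FG/BG decision itself is simple: pixels that brighten a lot under the flash are close to the camera, hence foreground. A minimal CPU-side sketch of the idea (grayscale luma buffers, a fixed threshold, and names that are mine rather than the actual shader code):

#include <vector>
#include <cstdint>
#include <cstddef>

// Foreground mask from a flash/no-flash pair: pixels that brighten a lot under
// the flash are close to the camera, hence foreground. Luma in [0,1]; the
// threshold value is a placeholder.
std::vector<uint8_t> foregroundMask(const std::vector<float>& flashY,
                                    const std::vector<float>& noFlashY,
                                    float threshold /* e.g. 0.15f */)
{
    std::vector<uint8_t> mask(flashY.size());
    for (size_t i = 0; i < flashY.size(); ++i) {
        float delta = flashY[i] - noFlashY[i];   // flash mostly brightens nearby objects
        mask[i] = (delta > threshold) ? 1 : 0;   // 1 = FG, 0 = BG
    }
    return mask;
}

// FG-selective filtering: keep the stylized value only where the mask says FG.
inline float selectFG(float stylized, float original, uint8_t fg)
{
    return fg ? stylized : original;
}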



Some more figures:

The remaining figure will be the perf measurements.

Monday, March 5, 2012

Live flash/no-flash integration

We implemented a very hacked version of flash/no-flash integration. The difficulty is in maintaining proper synchronization between successive flash and no-flash shots.

Nevertheless, with the hacked flash-toggle implementation, a joint bilateral filter was set up. It would also be interesting to do foreground/background detection from flash/no-flash difference images. This technique enables a lot of neat effects, e.g. cartoonization applied only to the foreground (or background), synthetic blurring of the background, etc.
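A quick note on what "joint bilateral" means here. One standard configuration (not necessarily exactly what's in the shader) takes the range weights from the flash image, which has much better SNR and edges, while averaging the no-flash image. A minimal CPU-side sketch on grayscale data, with placeholder names and parameters:

#include <vector>
#include <cmath>
#include <algorithm>

// Joint bilateral filter: smooth noFlashY, but take the edge-stopping (range)
// weights from flashY, which is less noisy.
std::vector<float> jointBilateral(const std::vector<float>& noFlashY,
                                  const std::vector<float>& flashY,
                                  int width, int height,
                                  int radius, float sigmaSpace, float sigmaRange)
{
    std::vector<float> out(noFlashY.size());
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            float center = flashY[y * width + x];
            float sum = 0.0f, wsum = 0.0f;
            for (int dy = -radius; dy <= radius; ++dy) {
                for (int dx = -radius; dx <= radius; ++dx) {
                    int nx = std::min(std::max(x + dx, 0), width - 1);
                    int ny = std::min(std::max(y + dy, 0), height - 1);
                    float ws = std::exp(-(dx * dx + dy * dy) / (2.0f * sigmaSpace * sigmaSpace));
                    float d  = flashY[ny * width + nx] - center;   // range term from the flash image
                    float wr = std::exp(-(d * d) / (2.0f * sigmaRange * sigmaRange));
                    sum  += ws * wr * noFlashY[ny * width + nx];   // values from the no-flash image
                    wsum += ws * wr;
                }
            }
            out[y * width + x] = sum / wsum;
        }
    }
    return out;
}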

[March 6th]: Alas, Nvidia just confirmed that both methods for flash control are lacking: (1) FireAction does not support timing in the current FCam implementation; (2) TorchAction is completely asynchronous, so it is not really a way to get an alternating stream of flash/no-flash shots.

If/when Nvidia releases the stereo camera, we might try routing the two stereo cameras into the two SharedBuffers.

Friday, March 2, 2012

More pictures (3)

My walk to Stanford campus this morning. Will take more shots later when the sun is in a more favorable spot relative to the Memorial Church:








From the Hoover tower:





Thursday, March 1, 2012

More pictures (2)

Was inspired to take a short walk outside today:












Random stuff inside:







Wednesday, February 29, 2012

More pictures

Pictures with just the bilateral filter (two passes) -- no edges:




The edge detection algorithm really needs to be improved. The bilateral filter result is beautiful.

Made a slight tweak to the edge-detection so that we don't use hard thresholding:
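Concretely, the tweak amounts to replacing the hard step on the edge magnitude with a smooth ramp. A minimal sketch (the thresholds t0/t1 are placeholders, not the shader's actual constants):

#include <algorithm>

// Hard thresholding gives binary, ragged edges: edge = (mag > t) ? 1 : 0.
// A soft threshold ramps from 0 to 1 between t0 and t1 (a smoothstep), so weak
// edges fade instead of popping.
float softEdge(float magnitude, float t0, float t1)
{
    float x = std::min(std::max((magnitude - t0) / (t1 - t0), 0.0f), 1.0f);
    return x * x * (3.0f - 2.0f * x);   // equivalent to GLSL smoothstep(t0, t1, magnitude)
}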


Comparison of stylized and non-stylized images:





Continued agenda

From my email:

0) Obviously we can tweak the current shader algorithm. Before we do this, I would like for us to do item #1 below.

1) I would like to run a detailed benchmark on all shader operations. I run a recursive approximation to the bilateral filter, for instance, and do basic Laplacian edge detection. There are many hard-coded parameters for the filters (such as the spatial extent), and I would like to know how the shader runtime depends on these parameters. I think this information will be generally useful to the class and to Nvidia.

2) We can work further on flash integration. I have all of the necessary framework to copy flash and non-flash images into toggled destinations, as we discussed on Sunday. However, we need to work out the details of the flash timing. Basically, I find that the non-flash image has lingering flash from the flash shot.

3) We can also think about stereo integration, even independent of flash integration. Here, we need some idea of what it is that we can achieve with stereo shots. Do you have any estimated results on depth calculations from stereo images? (Say, just by taking the difference of two images?)

4) I have been interested in the NPR application to augmented reality. I may code up a virtual object that you can put on top of the viewfinder stream.

Basic perf measurement

Copying CPU-->GPU:

Destination 640x480: 18+/-7 ms

Copying GPU-->CPU: 

Destination 640x480: 31+/-11 ms
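For the record, a sketch of a minimal timing harness for these copies (my own helper, not necessarily the code that produced the numbers above; the important detail is the glFinish before stopping the clock, so the GPU work is actually complete):

#include <ctime>

// Wall-clock timer in milliseconds (CLOCK_MONOTONIC is available through the NDK).
static double nowMs()
{
    timespec t;
    clock_gettime(CLOCK_MONOTONIC, &t);
    return t.tv_sec * 1000.0 + t.tv_nsec / 1.0e6;
}

// Usage around the copy under test (repeat ~100x and report mean +/- spread):
//   double t0 = nowMs();
//   ... copy frame.image into the SharedBuffer, or read the result back ...
//   glFinish();               // don't stop the clock before the GPU has finished
//   double elapsed = nowMs() - t0;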

Bilateral filter perf:

Tuesday, February 28, 2012

Shader workflow -- The "underlying reasons"

Recap of last week's work.

After a few days of serious thinking, I decided to implement the GPU flow in the underlying C loop of the FCam application, rather than in the onDrawFrame rendering thread on the Java side. Here was my pro/con list:

Java:
  • The rendering thread already makes use of the GPU. Having multiple GPU access points, i.e. in the Java thread and the C loop, seemed redundant. I was annoyed, for instance, at the prospect of having to do multiple RGB<-->YUV conversions.
  • We had already worked on a shader on the Java side.
  • On the Java thread, after the "TripleBuffer", I didn't know how to handle multiple frames for flash/no-flash fusion.
  • I also did not figure out how to do multiple shader passes on the same image.
C:
  • Full control. I knew how to do multiple SharedBuffers, as well as how to do multiple shader passes.
  • I realized that I did not have to do the RGB conversion if I didn't want to, even if the buffer itself was set up as 4-byte RGBA (8888): I would just use the R byte for Y, the G byte for U, etc. (see the sketch after this list).
  • I personally prefer C to Java.
  • Had to do an extra copy of the image from 'frame.image' to the SharedBuffers.
  • Multiple places in code access GPU (C and Java). Maybe a bit less elegant.
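Here is the packing idea from the list above in code form -- a sketch with made-up names, assuming the destination is a mapped 4-byte-per-pixel RGBA(8888) region:

#include <cstdint>
#include <cstddef>

// Pack one pixel's Y, U, V straight into the R, G, B bytes of an RGBA(8888)
// buffer -- no color conversion, the shader simply treats "R" as luma and so on.
// 'dst' stands for the pointer obtained by mapping the SharedBuffer.
inline void packYUVtoRGBA(uint8_t* dst, int width, int x, int y,
                          uint8_t Y, uint8_t U, uint8_t V)
{
    uint8_t* p = dst + 4 * (static_cast<size_t>(y) * width + x);
    p[0] = Y;     // R byte carries Y
    p[1] = U;     // G byte carries U
    p[2] = V;     // B byte carries V
    p[3] = 255;   // alpha unused
}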

Ultimately, I had to choose the C implementation because I had spent at least one full day and could not figure out how to do multi-pass shading in Java. I had not even attempted to figure out how to do multiple frame-buffers (for flash/no-flash fusion).

The shader pipeline in the C thread looks like the following:
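In rough outline (a sketch with placeholder names; the pass order reflects what's described elsewhere in this log -- two bilateral passes, Laplacian edge detection, then compositing):

#include <GLES2/gl2.h>

struct SharedBuffer;                                                // from Timo's example
void copyFrameToBuffer(SharedBuffer* dst);                          // frame.image --> GPU-visible input buffer
void runPass(GLuint shader, SharedBuffer* src, SharedBuffer* dst);  // one full-screen shader pass

void processFrame(GLuint bilateral, GLuint edges, GLuint composite,
                  SharedBuffer* in, SharedBuffer* a, SharedBuffer* b,
                  SharedBuffer* edgeBuf, SharedBuffer* out)
{
    copyFrameToBuffer(in);            // the extra CPU-side copy noted above
    runPass(bilateral, in, a);        // bilateral pass 1
    runPass(bilateral, a, b);         // bilateral pass 2
    runPass(edges, in, edgeBuf);      // Laplacian edge detection on the unfiltered image
    runPass(composite, b, out);       // overlay edges on the smoothed image (edgeBuf bound as a second texture)
    glFinish();                       // wait for the GPU before touching the buffers again
}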


NPR update

For the last week or so, I have fallen into the terrible habit of not documenting my progress. Details to come.

Results first! The first demonstration of NPR, at "pretty okay" frame rates (~10 FPS).


Learned that you can just take screenshots straight off the device, using Eclipse:




Tuesday, February 21, 2012

Multiple passes with GLSL

This is what I want to do!

How do I do it? Perhaps helpful. And perhaps this guy does exactly what I want to do.

Wednesday, February 15, 2012

Parametrized shader and gestures

I implemented some basic swipe actions on the viewfinder and hooked them up with the blurring shader. Left/right swipes control the size of the Gaussian filter; a vertical swipe resets it to the non-blurred state.
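The mapping from gesture to filter parameter is just bookkeeping; a sketch (the divisor and the clamp range are placeholders for whatever feels right on the device):

#include <algorithm>

// Map a horizontal swipe to the blur radius; a vertical swipe resets to no blur.
int radiusFromSwipe(int currentRadius, float swipeDx, bool verticalSwipe)
{
    if (verticalSwipe) return 0;                          // vertical swipe: back to the unblurred state
    int r = currentRadius + static_cast<int>(swipeDx / 40.0f);
    return std::min(std::max(r, 0), 15);                  // keep the kernel within what the shader handles
}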

Seems like getting multi-touch gestures might be trickier: "Making sense of multitouch"

At this point, it might also be worth cleaning up and optimizing the shader, and implementing a cooler kind of filter.

The fact that we're in RGB in the shader is a bit inconvenient. Where does the FCam::Frame get converted into RGB? Can I delay it?

Some info on bilateral filtering implementations:

Tuesday, February 14, 2012

Gaussian blur shader

Yes, it turned out to be so much simpler to programmatically manipulate the viewfinder shader from the Java side, specifically from CameraView::onDrawFrame.

Let's implement a Gaussian shader and programmatically set the filter dimensions.

Ok, I now have a programmatically parameterized Gaussian-like filter. Remarks:

  • Some thought should go into the logical organization of the code.
  • Beginning to see a perceptible frame-rate drop compared to the simple viewfinder.
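For reference, the "programmatically parameterized" part boils down to recomputing the kernel weights on the CPU and handing them to the shader as uniforms. A sketch with a 1D kernel and placeholder uniform names ("radius", "weights"):

#include <vector>
#include <cmath>
#include <GLES2/gl2.h>

// Recompute normalized 1D Gaussian weights for the current radius/sigma and push
// them to the shader. The uniform names are placeholders, not necessarily the
// ones in the actual viewfinder shader.
void updateGaussianUniforms(GLuint program, int radius, float sigma)
{
    std::vector<GLfloat> w(radius + 1);
    float sum = 0.0f;
    for (int i = 0; i <= radius; ++i) {
        w[i] = std::exp(-(i * i) / (2.0f * sigma * sigma));
        sum += (i == 0) ? w[i] : 2.0f * w[i];      // offsets +i and -i both use w[i]
    }
    for (int i = 0; i <= radius; ++i) w[i] /= sum;

    glUseProgram(program);
    glUniform1i(glGetUniformLocation(program, "radius"), radius);
    glUniform1fv(glGetUniformLocation(program, "weights"), radius + 1, w.data());
}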

Sunday, February 12, 2012

Shader workflow

Observations on the apparent shader workflow, in Timo's example:
  1. Push the fragment shader to a known location in the tablet. 
  2. Declare shader via "setup_shader" on the push path; nv_set_attrib_by_name (?)
  3. Declare SharedBuffers sbuf_in, sbuf_out
  4. Initialize sbuf_in by sbuf_in->map
  5. glUseProgram(shader)
  6. sbuf_in->bindAsTexture2D()
  7. sbuf_out->makeCurrentSurface()
  8. glDrawArrays
  9. glFinish
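Spelled out as code, here is my reading of the steps above (the SharedBuffer method names are the ones from Timo's example; the signatures, return types, and everything else are guesses/placeholders):

#include <GLES2/gl2.h>
#include <cstring>
#include <cstddef>

// Minimal stand-in for Timo's SharedBuffer: only the calls named in the steps
// above are declared, and the signatures are my guesses, not the real header.
struct SharedBuffer {
    void* map();
    void  bindAsTexture2D();
    void  makeCurrentSurface();
};

// One pass of the "manual" shader (steps 4-9). Assumes the shader was built via
// setup_shader() and the quad attributes were already set with nv_set_attrib_by_name.
void runManualPass(GLuint shader, SharedBuffer* sbuf_in, SharedBuffer* sbuf_out,
                   const void* srcPixels, size_t srcBytes)
{
    memcpy(sbuf_in->map(), srcPixels, srcBytes);   // step 4: fill the input buffer
    glUseProgram(shader);                          // step 5
    sbuf_in->bindAsTexture2D();                    // step 6: input buffer as source texture
    sbuf_out->makeCurrentSurface();                // step 7: output buffer as render target
    glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);         // step 8: draw the full-screen quad
    glFinish();                                    // step 9: block until the GPU is done
}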
I've now integrated the above framework into the FCam main loop. However, I haven't yet worked out the conversion of the FCam frame data into the SharedBuffer memory. Nevertheless, I've learned a few things:

  • Running the "manual" shader will compete with the shader that is already in place for the Viewfinder. Thankfully, there's nothing really jarring, because the viewfinder simply doesn't update for the period that you've hijacked the GPU, which is a fraction of a second in the test shader that I'm using ("feature.frag" from Nvidia).
  • At the same time, I found that CameraView::onDrawFrame is doing basically all the steps in Java that I've now integrated into the C++ loop. Given that it is feasible to parametrize the shader at runtime, doing the image processing in this "Viewfinder shader" directly seems like the way to go.
  • Importantly, Timo's example shows that it is possible to hook up two texture buffers to a single shader, in order to do fusion-based image processing (see the sketch after this list). We should aim to implement this in the Viewfinder shader.
  • Yes, that is the most natural way to do things. Let's set up a cascade (if needed) of programmable shaders in the CameraView::onDrawFrame routine.
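The two-input hookup mentioned in the list above is just texture units plus sampler uniforms. Roughly, reusing the minimal SharedBuffer interface from the previous sketch, with "tex0"/"tex1" as placeholder uniform names:

#include <GLES2/gl2.h>

// Bind two buffers to texture units 0 and 1 and point the shader's two sampler
// uniforms at them. (SharedBuffer is the minimal stand-in declared in the sketch
// after the numbered steps above.)
void bindTwoInputs(GLuint shader, SharedBuffer* a, SharedBuffer* b)
{
    glUseProgram(shader);

    glActiveTexture(GL_TEXTURE0);
    a->bindAsTexture2D();
    glUniform1i(glGetUniformLocation(shader, "tex0"), 0);

    glActiveTexture(GL_TEXTURE1);
    b->bindAsTexture2D();
    glUniform1i(glGetUniformLocation(shader, "tex1"), 1);
}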


What do the GLSL keywords mean?
  • Uniform: read-only in the shader. Set from the CPU source as follows:
    • int location = glGetUniformLocation(shaderIdx, "attributeName");
    • glUniform4fv(location, 1, value);
  • Attribute: read-only in vertex shader
  • Varying: data is transferred from vertex to fragment shader. Read-only in fragment.
Maybe useful links:

Saturday, February 11, 2012

SharedBuffer

First things first. I want to find out how to share a buffer between the CPU and GPU, so that I can control the GPU filter parameters "live". The Nvidia guys supposedly provided an example that demonstrates this. I was able to run the program. Let's see if I can dissect it.

NB: The second shader "ridge.frag" uses two source textures. Great for image fusion!

Here are the input and output images (feature.frag and ridge.frag) provided:

Timo's fixed vertex shader ("FULLSCREEN_QUAD_VERTEX_SHADER"), in readable form:

attribute mediump vec2 pos_attr;
attribute mediump vec2 uv_attr;
varying mediump vec2 outUV0;

void main()
{
    gl_Position = vec4(pos_attr, 0.0, 1.0);
    outUV0 = uv_attr;
}

* * *

What is EGL? "EGL is an API for giving direct control over creation of OpenGL contexts that render to on-screen windows, offscreen pixmaps, or additional graphics-card memory."

To get the example (written by Timo Stich at Nvidia) ported to the FCam environment:
  1. Copy over the libraries from the tablet to the NDK as per Timo's instructions. (Slight typo in his filenames.)
  2. Merge Timo's makefile configuration 'Android.mk' into the FCam project you are working on. In particular, link 'cutils.so'.
With these steps, it should now be possible to declare SharedBuffer and the shader-related functions from the FCam source.

Also might be an interesting read: