July 2020 – OU-Tulsa Lab of Image and Information Processing

3D inpainting

July 23, 2020 by phsamuel - No comments

The latest 3D inpainting result is amazing. See the paper and video.

Underactuated robotics

July 15, 2020 by samuel.cheng@ou.edu - No comments

Watched the first lecture of Underactuated Robotics by Prof. Tedrake. It was great. His lecture notes/book are available online, and the example code can be run directly on Colab.

So what is underactuated robotics? Consider the standard manipulator equation with configuration $latex q$

$latex M(q) \ddot{q}+C(q,\dot{q}) \dot{q} = \tau_g(q) + B(q) u,$

where the L.H.S. collects the “Ma” terms, the R.H.S. collects the force terms, $latex M(q)$ is the mass/inertia matrix and is positive definite, $latex u$ is the control input, and $latex B(q)$ maps the control input into generalized forces acting on $latex q$.

We can rearrange the above to

$latex \ddot{q}= M(q)^{-1} [ \tau_g(q) + B(q) u - C(q,\dot{q} )\dot{q}] =\underset{f_1(q,\dot{q})}{\underbrace{M(q)^{-1}[ \tau_g(q) - C(q,\dot{q} )\dot{q}]}} +\underset{f_2(q,\dot{q})}{\underbrace{M(q)^{-1} B(q) }}u .$

Note that if $latex f_2(q,\dot{q})$ has full row rank (or simply $latex B(q)$ has full row rank since $latex M(q)$ is positive definite and hence full-rank), then for any desired $latex \ddot{q}^d$, we can achieve that by picking $latex u$ as

$latex u = f_2^{\dagger} (q,\dot{q}) (\ddot{q}^{d} - f_1(q,\dot{q})),$ where $latex f_2^{\dagger}$ is the pseudo-inverse of $latex f_2$. We say such a robotic system is fully actuated.

On the other hand, if $latex f_2(q,\dot{q})$ does not have full row rank, the above trivial controller will not work. We then have a much more challenging and interesting scenario. And we say the robotic system is underactuated.
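To make the rank condition concrete, here is a minimal NumPy sketch. The 2-DOF numbers for $latex M$, $latex C$, $latex \tau_g$ and the two choices of $latex B$ are made up for illustration (they do not come from the lecture), and the helper names are hypothetical. It checks whether $latex f_2 = M^{-1}B$ has full row rank and applies the pseudo-inverse controller from above.

```python
import numpy as np

def is_fully_actuated(M, B):
    """Return True if f2 = M^{-1} B has full row rank."""
    f2 = np.linalg.solve(M, B)
    return np.linalg.matrix_rank(f2) == f2.shape[0]

def pinv_controller(M, C, tau_g, B, qdot, qdd_des):
    """u = pinv(f2) (qdd_des - f1), using the f1, f2 split from the post."""
    f1 = np.linalg.solve(M, tau_g - C @ qdot)
    f2 = np.linalg.solve(M, B)
    return np.linalg.pinv(f2) @ (qdd_des - f1)

def accel(M, C, tau_g, B, qdot, u):
    """Resulting acceleration: qdd = M^{-1} (tau_g + B u - C qdot)."""
    return np.linalg.solve(M, tau_g + B @ u - C @ qdot)

# Illustrative (made-up) values for a 2-DOF system at one state.
M = np.array([[2.0, 0.5], [0.5, 1.0]])    # positive definite
C = np.array([[0.0, -0.1], [0.1, 0.0]])
tau_g = np.array([-1.0, -0.5])
qdot = np.array([0.3, -0.2])
qdd_des = np.array([1.0, -1.0])

B_full = np.eye(2)                  # one actuator per joint
B_under = np.array([[0.0], [1.0]])  # only the second joint is actuated

u_full = pinv_controller(M, C, tau_g, B_full, qdot, qdd_des)
qdd_full = accel(M, C, tau_g, B_full, qdot, u_full)

u_under = pinv_controller(M, C, tau_g, B_under, qdot, qdd_des)
qdd_under = accel(M, C, tau_g, B_under, qdot, u_under)

print(is_fully_actuated(M, B_full), np.allclose(qdd_full, qdd_des))
print(is_fully_actuated(M, B_under), np.allclose(qdd_under, qdd_des))
```

With the full-rank $latex B$ the commanded acceleration is reproduced exactly. With a single actuator the pseudo-inverse only gives a least-squares fit, so the desired acceleration is missed.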

Uncategorized

NVAE

July 12, 2020 by samuel.cheng@ou.edu - No comments

Paper, video

This one generates high-resolution images with a hierarchical variational autoencoder

Uncategorized

NLP scholar

July 9, 2020 by samuel.cheng@ou.edu - No comments

A very nice visualization to explore NLP papers

Deep Learning

BYOL

July 8, 2020 by phsamuel - No comments

It is kind of mysterious that this works without using negative samples for self-supervised learning. See the video and paper

The main idea is to train a representation network and a predictor so that the latter predicts the representation of an augmented version of the input.
The representation network for the augmented data uses a moving average of the current representation network's parameters. Similar tricks have been used in deep reinforcement learning
It is indeed quite surprising that this works without negative samples, because nothing in the above model prevents convergence to a trivial solution (everything maps to a constant)
Experimental results look good, but they should not be given too much weight. Their implementations of some older approaches have much higher prediction performance, and they pulled numbers from papers (reasonable, though) for comparison. The approach is probably on par, and without negative samples they can train with a smaller batch size
They are using 512 TPUs for training for 7 hours…
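The two ingredients in the remarks above can be sketched in a few lines of NumPy. This is only an illustration under assumptions, not the authors' code: the function names, the decay rate `tau=0.99`, and the normalized-MSE form of the loss are my choices for the sketch. The last lines show why collapse is a concern: a constant representation already gets zero loss.

```python
import numpy as np

def ema_update(target_params, online_params, tau=0.99):
    """Move each target parameter toward the online one (moving average)."""
    return [tau * t + (1.0 - tau) * o
            for t, o in zip(target_params, online_params)]

def byol_style_loss(pred, target):
    """Mean squared error between L2-normalized rows (equals 2 - 2 * cosine)."""
    pred = pred / np.linalg.norm(pred, axis=-1, keepdims=True)
    target = target / np.linalg.norm(target, axis=-1, keepdims=True)
    return np.mean(2.0 - 2.0 * np.sum(pred * target, axis=-1))

# Hypothetical parameters: one weight vector per network.
online = [np.ones(2)]
target = [np.zeros(2)]
target = ema_update(target, online, tau=0.9)

# Trivial solution: every input maps to the same constant vector.
constant = np.ones((4, 3))
collapsed_loss = byol_style_loss(constant, constant)
print(target[0], collapsed_loss)
```

Nothing in this loss penalizes the collapsed case, which is what makes the reported results surprising.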

Deep Learning

Linformer

July 6, 2020 by samuel.cheng@ou.edu - No comments

Video and paper.

Remarks:

Project the embedding to a lower dimension to reduce computational complexity and memory
Some gain in speed, but it doesn’t look too significant. The tradeoff in performance seems larger than claimed
Theorem 1, based on the JL lemma, does not use properties of attention itself. It seems that the same argument could be applied anywhere (besides attention). The theorem itself seems to be a bit of a stretch
With the same goal of speeding up the transformer, the “kernelized transformer” appears to be better work
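As a rough sketch of the projection idea in the first remark: as I understand the paper, keys and values are projected along the sequence-length axis from n to k before attention, so the score matrix is n×k instead of n×n. The random projection matrices `E` and `F`, the sizes, and the function names below are assumptions for illustration (in the paper the projections are learned).

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def linformer_style_attention(Q, K, V, E, F):
    """Single-head attention with keys/values projected from length n to k."""
    d = Q.shape[-1]
    K_proj = E @ K                        # (k, d)
    V_proj = F @ V                        # (k, d)
    scores = Q @ K_proj.T / np.sqrt(d)    # (n, k) instead of (n, n)
    return softmax(scores) @ V_proj, scores

rng = np.random.default_rng(0)
n, k, d = 64, 8, 16                       # illustrative sizes
Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))
E = rng.standard_normal((k, n)) / np.sqrt(k)
F = rng.standard_normal((k, n)) / np.sqrt(k)

out, scores = linformer_style_attention(Q, K, V, E, F)
print(out.shape, scores.shape)
```

The cost of the score computation drops from O(n²d) to O(nkd), which only pays off when k is much smaller than n.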

Deep Learning

Kernelizing transformer

July 4, 2020 by samuel.cheng@ou.edu - No comments

Transformers are RNNs (paper and video):

Remarks:

Kernelized transformer is not on par with the original transformer but can be much faster (1000x for some applications)
Kernelized transformer can be modelled as an RNN, which can help further speed up inference.
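The RNN view in the second remark can be checked numerically. The sketch below assumes the feature map φ(x) = elu(x) + 1 used in the paper. The function names and sizes are mine, and normalization details are simplified. It computes causal kernelized attention once with an explicit masked n×n matrix and once as a recurrence over a running state, and the two agree.

```python
import numpy as np

def phi(x):
    """Feature map elu(x) + 1, which is always positive."""
    return np.where(x > 0, x + 1.0, np.exp(np.minimum(x, 0.0)))

def causal_linear_attention(Q, K, V):
    """Parallel form: masked (n, n) similarity matrix, then normalize rows."""
    A = np.tril(phi(Q) @ phi(K).T)
    return (A @ V) / A.sum(axis=1, keepdims=True)

def recurrent_linear_attention(Q, K, V):
    """RNN form: constant-size state (S, z) updated once per position."""
    n, d = Q.shape
    S = np.zeros((d, V.shape[1]))   # running sum of phi(k_j) v_j^T
    z = np.zeros(d)                 # running sum of phi(k_j)
    out = np.empty_like(V)
    for i in range(n):
        k_i, q_i = phi(K[i]), phi(Q[i])
        S += np.outer(k_i, V[i])
        z += k_i
        out[i] = (q_i @ S) / (q_i @ z)
    return out

rng = np.random.default_rng(0)
Q = rng.standard_normal((6, 4))
K = rng.standard_normal((6, 4))
V = rng.standard_normal((6, 3))

parallel = causal_linear_attention(Q, K, V)
recurrent = recurrent_linear_attention(Q, K, V)
print(np.allclose(parallel, recurrent))
```

The recurrent form needs only the state `(S, z)` per step, which is why autoregressive inference gets cheaper.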

Deep Learning

Object-Centric Learning with Slot Attention

July 2, 2020 by samuel.cheng@ou.edu - No comments

Video and paper.

tl;dr, but the routing mechanism here seems to be quite similar to the capsule one. The authors emphasize that a slot should learn not just one type of object. It seems that the main trick is to first route image features to slots. Slots are then trained to fit more than one type of object, rather than a single type as in a capsule network.
