Kernelizing transformer

Source: "Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention" (paper and video).

Remarks:

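  • The kernelized transformer does not quite match the accuracy of the original transformer, but it can be much faster (up to about 1000x for some applications), because replacing the softmax with a kernel feature map makes attention cost grow linearly with sequence length instead of quadratically.
  • With causal masking, the kernelized transformer can be written as an RNN that carries a fixed-size state from one token to the next, which further speeds up autoregressive inference.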
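
To make the RNN view concrete, here is a minimal NumPy sketch of causal kernelized attention for a single head. It is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, and the feature map elu(x) + 1 is assumed to be the kernel choice. The first function computes attention in the masked, quadratic form; the second computes the same output one token at a time using only a running state (S, z).

```python
import numpy as np

def feature_map(x):
    # Assumed kernel feature map: elu(x) + 1, which is strictly positive.
    return np.where(x > 0, x + 1.0, np.exp(np.minimum(x, 0.0)))

def causal_linear_attention(Q, K, V):
    # Reference form: builds the full (n, n) matrix of kernel scores,
    # masks out future positions, then normalizes each row.
    phi_q, phi_k = feature_map(Q), feature_map(K)
    scores = np.tril(phi_q @ phi_k.T)
    return (scores @ V) / scores.sum(axis=1, keepdims=True)

def causal_linear_attention_rnn(Q, K, V):
    # Recurrent form: the state (S, z) has a fixed size that does not
    # depend on the sequence length, so each step costs O(d * d_v).
    n, d = Q.shape
    S = np.zeros((d, V.shape[1]))  # running sum of outer(phi(k_j), v_j)
    z = np.zeros(d)                # running sum of phi(k_j)
    out = np.zeros((n, V.shape[1]))
    for i in range(n):
        q, k = feature_map(Q[i]), feature_map(K[i])
        S += np.outer(k, V[i])
        z += k
        out[i] = (q @ S) / (q @ z)
    return out

# Small usage example with random inputs (sizes chosen arbitrarily).
rng = np.random.default_rng(0)
Q = rng.normal(size=(6, 4))
K = rng.normal(size=(6, 4))
V = rng.normal(size=(6, 3))
same = np.allclose(causal_linear_attention(Q, K, V),
                   causal_linear_attention_rnn(Q, K, V))
```

Both functions return the same values; the recurrent one never stores the n-by-n score matrix, which is where the inference speed-up for long sequences comes from.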


Copyright OU-Tulsa Lab of Image and Information Processing 2020