Kernelizing transformer

Transformers are RNNs (paper and video):


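  • A kernelized transformer does not quite match the accuracy of the original (softmax) transformer, but it can be much faster (up to about 1000x in some applications), because its attention cost grows linearly rather than quadratically with sequence length.
  • A kernelized transformer with causal masking can be written as an RNN: the attention state is a running sum over past tokens, so each new token costs constant time, which further speeds up inference.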
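
To make the RNN view concrete, here is a minimal NumPy sketch, not the authors' implementation. It assumes single-head attention and the feature map elu(x) + 1; the function names (feature_map, causal_linear_attention, recurrent_linear_attention) are made up for illustration. The first function computes causal kernelized attention over the whole sequence at once, and the second computes the same thing one token at a time while keeping only a small state (S, z).

```python
import numpy as np

def feature_map(x):
    # elu(x) + 1: equals x + 1 for x > 0 and exp(x) otherwise, so it is always positive.
    return np.where(x > 0, x + 1.0, np.exp(np.minimum(x, 0.0)))

def causal_linear_attention(Q, K, V):
    """Parallel form: position i attends to positions j <= i."""
    phi_q, phi_k = feature_map(Q), feature_map(K)
    # Kernel similarities replace softmax scores; tril applies the causal mask.
    scores = np.tril(phi_q @ phi_k.T)                  # shape (n, n)
    return (scores @ V) / scores.sum(axis=1, keepdims=True)

def recurrent_linear_attention(Q, K, V):
    """RNN form: one token per step, constant work and memory per step."""
    n, d = Q.shape
    m = V.shape[1]
    S = np.zeros((d, m))   # running sum of outer(phi(k_j), v_j)
    z = np.zeros(d)        # running sum of phi(k_j), used as the normalizer
    out = np.empty((n, m))
    for i in range(n):
        q, k = feature_map(Q[i]), feature_map(K[i])
        S += np.outer(k, V[i])
        z += k
        out[i] = (q @ S) / (q @ z)
    return out

# Usage: both forms should agree on random inputs.
rng = np.random.default_rng(0)
Q = rng.normal(size=(6, 4))
K = rng.normal(size=(6, 4))
V = rng.normal(size=(6, 3))
print(np.allclose(causal_linear_attention(Q, K, V),
                  recurrent_linear_attention(Q, K, V)))
```

The parallel form here still builds an n x n matrix for clarity; the recurrent form never does, which is where the inference speedup comes from.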



Copyright OU-Tulsa Lab of Image and Information Processing 2021