Transformer architectures have recently shown success on many Computer Vision and Natural Language benchmarks. This is largely due to their lower inductive bias, which can be understood as the set of assumptions a model makes about the solution to a task. A large inductive bias means a model may converge quickly to a solution, but the quality of that solution may be limited by the bias itself. A lower inductive bias allows Transformers to achieve better performance when provided with much more data and compute. We discuss and explore the balance between this architecture and inductive biases, evaluating potential approaches to further decrease inductive bias without increasing inference costs, as well as approaches to increase inductive bias without decreasing model capacity.