Data reductions for graph attention variants

Kaustubh Dholé



There are a lot of data reduction techniques that have been applied over general transformers like Linformer (JL-Lemma), Reformer (using LSH), Nymstromformer (using Nymstrom method), etc. However, many of these approaches which have sped up attention computation have not been explored for speeding up graph attention variants. I am vouching for performing a baseline set of experiments to test some of these data reduction approaches to approximate GAT/GATv2's attention and measure how much drop on some downstream task is seen.