
Efficient Sequence Pooling Model for Sparse Attention with Mixture of Experts

EasyChair Preprint 15262

2 pages
Date: October 18, 2024

Abstract

This paper introduces the Efficient Attention-based Model (EAM), a novel architecture designed to reduce the computational and memory overhead of Transformer-based models in sequence modeling tasks. By incorporating sparse attention, token pooling, and a mixture of experts (MoE), EAM reduces memory usage and training time without sacrificing accuracy. We evaluate the EAM architecture on sequence classification tasks, comparing it against a standard Transformer. The results show that EAM achieves competitive accuracy while using significantly less memory and computational power, making it a suitable alternative for resource-constrained environments.
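The abstract describes the architecture only at a high level. The following is a minimal sketch, assuming a PyTorch setting, of how the three named ingredients (sparse attention, token pooling, and a mixture of experts) could be combined in one block; the windowed attention pattern, average-pool token pooling, soft expert routing, and all class names, window sizes, and expert counts below are illustrative assumptions, not details taken from the paper.

# Illustrative sketch of an EAM-style block (hypothetical implementation,
# not the authors' code). Window size, pooling factor, and expert count
# are arbitrary choices for demonstration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LocalSparseAttention(nn.Module):
    """Attention restricted to a fixed local window (one common sparse pattern)."""
    def __init__(self, dim, window=16):
        super().__init__()
        self.qkv = nn.Linear(dim, 3 * dim)
        self.window = window
        self.scale = dim ** -0.5

    def forward(self, x):                                   # x: (batch, seq, dim)
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        scores = (q @ k.transpose(-2, -1)) * self.scale     # (batch, seq, seq)
        # Mask out positions outside the local window to sparsify attention.
        idx = torch.arange(x.size(1), device=x.device)
        mask = (idx[None, :] - idx[:, None]).abs() > self.window
        scores = scores.masked_fill(mask, float("-inf"))
        return F.softmax(scores, dim=-1) @ v

class EAMBlock(nn.Module):
    """Sparse attention -> token pooling -> mixture-of-experts feed-forward."""
    def __init__(self, dim, pool=2, num_experts=4):
        super().__init__()
        self.attn = LocalSparseAttention(dim)
        self.pool = nn.AvgPool1d(pool)                      # token pooling shortens the sequence
        self.gate = nn.Linear(dim, num_experts)             # router over experts
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(num_experts)
        )

    def forward(self, x):                                   # x: (batch, seq, dim)
        x = self.attn(x)
        x = self.pool(x.transpose(1, 2)).transpose(1, 2)    # (batch, seq // pool, dim)
        weights = F.softmax(self.gate(x), dim=-1)           # soft routing for simplicity
        expert_out = torch.stack([e(x) for e in self.experts], dim=-1)
        return x + (expert_out * weights.unsqueeze(-2)).sum(-1)

# Usage: a batch of 8 sequences, 128 tokens each, 64-dim embeddings.
block = EAMBlock(dim=64)
x = torch.randn(8, 128, 64)
print(block(x).shape)                                       # torch.Size([8, 64, 64])

In this sketch, pooling after attention is what cuts the memory and compute of later layers, while the gated experts keep per-token capacity; an actual EAM implementation may route tokens to a sparse top-k subset of experts rather than mixing all of them softly.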

Keyphrases: Efficient Attention-based Model, Mixture of Experts, memory efficiency, sequence modeling, sparse attention, transformer

BibTeX entry
BibTeX does not have the right entry for preprints. This is a hack for producing the correct reference:
@booklet{EasyChair:15262,
  author       = {Sk Sayril Amed},
  title        = {Efficient Sequence Pooling Model for Sparse Attention with Mixture of Experts},
  howpublished = {EasyChair Preprint 15262},
  year         = {EasyChair, 2024}}