Improving Weight Excitation for ConvNets and MLP Mixer

Publisher

Department of Computer Science and Engineering (CSE), Islamic University of Technology (IUT), Board Bazar, Gazipur-1704, Bangladesh

Abstract

To improve the representational power of convolutional neural networks, several attention mechanisms have been introduced in recent years. These mechanisms are computed on input feature maps, enhancing the informative parts of the input and diminishing the less informative ones, since not all parts of the input carry features useful for training. One exception replaces the input feature maps with the network's weights; this approach is known as weight excitation. Since the weights of a CNN are fine-tuned on the input data, computing attention on weights can serve as an alternative to computing attention on input feature maps. One advantage of this approach is that it introduces no additional computational cost at inference time. In this work, we explore different mechanisms of weight excitation on different types of architectures. We conduct several experiments to determine whether weights can be used as an alternative to input feature maps for computing attention, and whether this applies to all existing attention mechanisms for convolutional neural networks. We also examine other properties of weight excitation, such as its regularizing effect.
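For illustration, the following is a minimal sketch of one possible weight-excitation mechanism: a SENet-style gate computed from the convolution weights instead of the feature maps. The class name, the magnitude-based filter statistic, and the bottleneck design are hypothetical choices for this sketch, not the thesis's exact formulation. Because the gate depends only on the weights, the excited weights can be precomputed once after training, which is why the method adds no cost at inference time.

import torch
import torch.nn as nn
import torch.nn.functional as F

class WeightExcitedConv2d(nn.Conv2d):
    """Conv2d whose weights are rescaled by a learned excitation.

    Hypothetical sketch: the gate is a function of the weights alone,
    so after training it can be folded into the weights permanently.
    """

    def __init__(self, in_channels, out_channels, kernel_size,
                 reduction=4, **kwargs):
        super().__init__(in_channels, out_channels, kernel_size, **kwargs)
        hidden = max(out_channels // reduction, 1)
        # Small bottleneck MLP producing one gate per output filter,
        # analogous to the squeeze-and-excitation block on feature maps.
        self.fc1 = nn.Linear(out_channels, hidden)
        self.fc2 = nn.Linear(hidden, out_channels)

    def excited_weight(self):
        # Summarize each filter by the mean of its absolute weights
        # (plays the role of global average pooling in SE blocks).
        stats = self.weight.abs().mean(dim=(1, 2, 3))             # (out_channels,)
        gate = torch.sigmoid(self.fc2(F.relu(self.fc1(stats))))  # (out_channels,)
        return self.weight * gate.view(-1, 1, 1, 1)

    def forward(self, x):
        # Convolve with the excited weights instead of the raw ones.
        return self._conv_forward(x, self.excited_weight(), self.bias)

# Usage: a drop-in replacement for nn.Conv2d during training.
conv = WeightExcitedConv2d(3, 16, kernel_size=3, padding=1)
y = conv(torch.randn(1, 3, 32, 32))
print(y.shape)  # torch.Size([1, 16, 32, 32])

After training, calling excited_weight() once and copying the result into a plain nn.Conv2d yields an identical network with no gating computation left at inference.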

Description

Supervised by Professor Dr. Md. Hasanul Kabir; Co-Supervisors: Mr. Shahriar Ivan, Lecturer, and Mr. Md. Zahidul Islam, Department of Computer Science and Engineering (CSE), Islamic University of Technology (IUT), Board Bazar, Gazipur-1704, Bangladesh
