Improving Weight Excitation for ConvNets and MLP Mixer

dc.contributor.author: Mahbub, Ridwan
dc.contributor.author: Anuva, Samiha Shafiq
dc.contributor.author: Khan, Ifrad Towhid
dc.date.accessioned: 2024-01-18T06:36:55Z
dc.date.available: 2024-01-18T06:36:55Z
dc.date.issued: 2023-05-30
dc.description: Supervised by Professor Dr. Md. Hasanul Kabir, Co-Supervisors: Mr. Shahriar Ivan, Lecturer, Mr. Md. Zahidul Islam, Department of Computer Science and Engineering (CSE), Islamic University of Technology (IUT), Board Bazar, Gazipur-1704, Bangladesh (en_US)
dc.description.abstract: To improve the representational power of convolutional neural networks, several attention mechanisms have been introduced in recent years. These mechanisms are computed on input feature maps: they enhance some parts of the input and diminish others, since not all parts of the input contain features that are important for training. One exception uses weights in place of input feature maps, an approach known as weight excitation. Since the weights of a CNN are fine-tuned based on the input data, calculating attention on weights can be an alternative to calculating attention on input feature maps. One advantage of this method is that it does not introduce any additional computational cost at inference time. In this work, we explore different mechanisms of weight excitation on different types of architectures. We have conducted several experiments to determine whether weights can be used as an alternative to input feature maps for computing attention, and whether this applies to all existing attention mechanisms for Convolutional Neural Networks. We also test other properties of weight excitation, such as its regularizing effect. (en_US)
dc.identifier.uri: http://hdl.handle.net/123456789/2059
dc.language.iso: en (en_US)
dc.publisher: Department of Computer Science and Engineering (CSE), Islamic University of Technology (IUT), Board Bazar, Gazipur-1704, Bangladesh (en_US)
dc.title: Improving Weight Excitation for ConvNets and MLP Mixer (en_US)
dc.type: Thesis (en_US)
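
The abstract's claim that attention computed on weights adds no inference-time cost can be illustrated with a minimal sketch. It assumes a squeeze-and-excitation style gate applied to the rows of a single linear layer's weight matrix. This is an illustrative choice, not necessarily one of the mechanisms studied in the thesis, and every name below (`excite_weights`, `W1`, `W2`, `rand_matrix`) is hypothetical. Because the gate depends only on the weights and not on the input, the gated weights can be computed once after training and stored.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def matvec(M, v):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def excite_weights(W, W1, W2):
    """Hypothetical SE-style gate computed from the weights themselves.

    W  : (out_ch, in_ch) layer weights
    W1 : (hidden, out_ch) bottleneck projection (assumed, for illustration)
    W2 : (out_ch, hidden) expansion projection (assumed, for illustration)
    """
    # Squeeze: one descriptor per output channel, taken from the weights.
    s = [sum(abs(w) for w in row) / len(row) for row in W]
    # Excite: bottleneck with ReLU, then a sigmoid gate in (0, 1).
    h = [max(a, 0.0) for a in matvec(W1, s)]
    g = [sigmoid(a) for a in matvec(W2, h)]
    # Scale each output channel's weights by its gate.
    return [[w * gi for w in row] for row, gi in zip(W, g)]


def rand_matrix(rows, cols):
    return [[random.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
W = rand_matrix(8, 4)
W1 = rand_matrix(2, 8)
W2 = rand_matrix(8, 2)
x = [random.gauss(0.0, 1.0) for _ in range(4)]

# Training-style forward pass: the gate is recomputed from the current weights.
y_train = matvec(excite_weights(W, W1, W2), x)

# Inference: fold the gate into the weights once, then run a plain linear layer.
W_folded = excite_weights(W, W1, W2)
y_infer = matvec(W_folded, x)
```

In this sketch the inference path is an ordinary matrix-vector product with `W_folded`, so its cost matches an unmodified layer of the same shape, whereas a gate computed on input feature maps would have to be re-evaluated for every input.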

Files

Original bundle

Name: Thesis2023_CSE_180041230_180041137_180041225_Book - RIDWAN MAHBUB, 180041230.pdf
Size: 11.33 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission
