giMLPs: Gate with Inhibition Mechanism in MLPs

Kang, Cheng; Prokop, Jindich; Tong, Lei; Zhou, Huiyu; Hu, Yong; Novak, Daneil


arXiv:2208.00929 (cs)

This paper has been withdrawn by Cheng Kang

[Submitted on 1 Aug 2022 (v1), last revised 2 Aug 2022 (this version, v2)]

Title: giMLPs: Gate with Inhibition Mechanism in MLPs

Authors: Cheng Kang, Jindich Prokop, Lei Tong, Huiyu Zhou, Yong Hu, Daneil Novak


Abstract: This paper presents a new model architecture, the gate with inhibition MLP (giMLP). The gate with inhibition on CycleMLP (giCycleMLP) achieves performance equal to the original on the ImageNet classification task, and the same approach also improves the BERT, RoBERTa, and DeBERTaV3 models. It relies on two novel techniques. The first is the gating MLP, in which matrix multiplications between the MLP and the trunk attention input further adjust the model's adaptation. The second is inhibition, which inhibits or enhances the branch adjustment; as the inhibition level increases, it imposes a stronger restriction on the model's features. We show that giCycleMLP with a lower inhibition level is competitive with the original CycleMLP in terms of ImageNet classification accuracy. In addition, we show through a comprehensive empirical study that these techniques significantly improve performance when fine-tuning on NLU downstream tasks. For the gate with inhibition MLPs on DeBERTa (giDeBERTa), we find that fine-tuning achieves appealing results on most NLU tasks without any additional pretraining. We also find that, when gate with inhibition is used, the activation function should have a short and smooth negative tail, so that unimportant features, or features that hurt the model, are moderately inhibited. Experiments on ImageNet and twelve downstream language tasks demonstrate the effectiveness of gate with inhibition, both for image classification and for enhancing natural language fine-tuning without any additional pretraining.
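
The abstract gives no equations, and the paper is withdrawn, so the following is only a minimal sketch of how a gate with an inhibition level might look, not the authors' formulation. Everything in it is an assumption made for illustration: the function names `gelu`, `matvec`, and `gate_with_inhibition`, the choice of GELU as an activation with a short, smooth negative tail, the subtraction of a scalar `inhibition` from the activated branch, and the element-wise product with the trunk input (the abstract speaks of matrix multiplications, whose exact form is not specified).

```python
import math


def gelu(x: float) -> float:
    """GELU, used here as one example of an activation with a short, smooth negative tail."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def matvec(weights, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]


def gate_with_inhibition(trunk, w_gate, inhibition=0.0):
    """Hypothetical gate: scale each trunk feature by (activated branch - inhibition).

    trunk      -- trunk input features (assumed to be a flat vector here)
    w_gate     -- weights of a single-layer gating branch (assumption)
    inhibition -- scalar inhibition level; larger values lower every gate value
    """
    branch = [gelu(v) for v in matvec(w_gate, trunk)]
    return [t * (b - inhibition) for t, b in zip(trunk, branch)]


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    features = [1.0, -1.0]
    for level in (0.0, 0.5, 1.0):
        print(level, gate_with_inhibition(features, identity, inhibition=level))
```

In this sketch, raising `inhibition` shifts every gate value downward, which is one possible reading of "stronger restriction"; a gate value that becomes negative flips the sign of the corresponding trunk feature, and whether the original method allows that is not stated in the abstract.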

Comments:	This submission needs to be replaced in the future, because some additional experiments should be added
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2208.00929 [cs.CL]
	(or arXiv:2208.00929v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2208.00929

Submission history

From: Cheng Kang
[v1] Mon, 1 Aug 2022 15:23:51 UTC (452 KB)
[v2] Tue, 2 Aug 2022 09:51:47 UTC (1 KB) (withdrawn)
