Pretrained Embeddings for E-commerce Machine Learning: When it Fails and Why?

Xu, Da; Yang, Bo

doi:10.1145/3543873.3587669

Abstract: The use of pretrained embeddings has become widespread in modern e-commerce machine learning (ML) systems. In practice, however, we have encountered several key issues when using pretrained embeddings in a real-world production system, many of which cannot be fully explained by current knowledge. Unfortunately, we find that there is a lack of thorough understanding of how pretrained embeddings work, especially their intrinsic properties and interactions with downstream tasks. Consequently, it becomes challenging to make interactive and scalable decisions regarding the use of pretrained embeddings in practice.
Our investigation leads to two significant discoveries about using pretrained embeddings in e-commerce applications. First, we find that the design of the pretraining and downstream models, particularly how they encode and decode information via embedding vectors, can have a profound impact. Second, we establish a principled perspective of pretrained embeddings through the lens of kernel analysis, which can be used to evaluate their predictability interactively and scalably. These findings help to address the practical challenges we faced and offer valuable guidance for the successful adoption of pretrained embeddings in real-world production. Our conclusions are backed by solid theoretical reasoning, benchmark experiments, and online testing.
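
To make the kernel view more concrete, the sketch below scores how well a set of embeddings lines up with downstream labels using centered kernel-target alignment. This is a standard kernel diagnostic chosen here for illustration; the abstract does not specify the authors' exact procedure, so it should not be read as their method. The function names (`gram`, `center`, `frobenius`, `kernel_target_alignment`), the linear inner-product kernel, and the toy data are all assumptions.

```python
import math


def gram(vectors):
    """Linear-kernel Gram matrix: K[i][j] = <vectors[i], vectors[j]>."""
    return [[sum(a * b for a, b in zip(u, v)) for v in vectors] for u in vectors]


def center(K):
    """Double-center a square matrix (equivalent to H K H with H = I - 11^T / n)."""
    n = len(K)
    row = [sum(r) / n for r in K]
    col = [sum(K[i][j] for i in range(n)) / n for j in range(n)]
    total = sum(row) / n
    return [[K[i][j] - row[i] - col[j] + total for j in range(n)] for i in range(n)]


def frobenius(A, B):
    """Frobenius inner product of two equally sized matrices."""
    return sum(a * b for ra, rb in zip(A, B) for a, b in zip(ra, rb))


def kernel_target_alignment(embeddings, labels):
    """Centered alignment between the embedding kernel and the label kernel y y^T.

    Values near 1 suggest the embedding geometry already separates the labels;
    values near 0 suggest it carries little linear signal for them.
    """
    Kc = center(gram(embeddings))
    Lc = center(gram([[float(y)] for y in labels]))
    denom = math.sqrt(frobenius(Kc, Kc) * frobenius(Lc, Lc))
    return frobenius(Kc, Lc) / denom if denom > 0 else 0.0


# Hypothetical toy data: four items with binary labels encoded as +1 / -1.
labels = [1, 1, -1, -1]
informative = [[1.0, 0.0], [1.0, 0.0], [-1.0, 0.0], [-1.0, 0.0]]
uninformative = [[1.0], [-1.0], [1.0], [-1.0]]

print(kernel_target_alignment(informative, labels))
print(kernel_target_alignment(uninformative, labels))
```

A score of this kind needs only the embeddings and a labeled sample, with no downstream model training, which is one way to read the "interactive" evaluation the abstract describes. The pure-Python version is quadratic in the number of items, so in practice it would be run on a subsample or rewritten with an array library.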

Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:2304.04330 [cs.LG]
	(or arXiv:2304.04330v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2304.04330
Related DOI:	https://doi.org/10.1145/3543873.3587669
