Selective Learning-to-Rank for Product Analogs

Fedor Krasnov

Abstract


Product analog discovery is a critical component of modern e-commerce systems, enabling recommendations, catalog deduplication, and search diversification. Unlike classical similarity search, many products in real-world catalogs do not admit valid substitutes, making forced ranking prone to false positives. This work extends selective prediction to learning-to-rank for analog discovery under partial coverage, introducing a simple yet effective confidence-aware reject mechanism based on score gap and absolute score. Experiments on a large proprietary catalog comprising $10^5$ products across 50 categories and $10^6$ labeled pairs show that the proposed method reduces false positives by 25% compared to a forced-ranking baseline while maintaining high coverage and product-level recall. Empirical evaluation across diverse product categories demonstrates a systematic recall-coverage trade-off induced by selective rejection. Price-aware features emerge as the most influential determinants of analog validity, often outweighing fine-grained specification similarity. Overall, selective ranking with abstention is an effective and practically implementable strategy for robust analog discovery at scale.
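The reject mechanism named in the abstract, abstention driven by score gap and absolute score, can be illustrated with a minimal sketch. It assumes the ranker has already produced one relevance score per candidate analog. The function names `select_analog` and `coverage`, the threshold parameters `min_score` and `min_gap`, and their default values are hypothetical and not taken from the paper. The specific rule (abstain when the top score is low, or when it is not clearly separated from the runner-up) is one plausible reading of the abstract, not necessarily the author's exact formulation.

```python
from typing import Optional, Sequence


def select_analog(
    scores: Sequence[float],
    min_score: float = 0.5,  # hypothetical absolute-score threshold
    min_gap: float = 0.1,    # hypothetical top-1 vs. top-2 gap threshold
) -> Optional[int]:
    """Return the index of the top-ranked candidate, or None to abstain.

    Assumed rule: abstain if the best score is below `min_score`, or if
    the margin over the second-best score is below `min_gap`.
    """
    if not scores:
        return None  # no candidates at all: nothing to recommend

    # Rank candidate indices by descending score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    top = scores[order[0]]
    runner_up = scores[order[1]] if len(order) > 1 else float("-inf")

    if top < min_score:
        return None  # absolute confidence too low
    if top - runner_up < min_gap:
        return None  # top candidate not clearly separated
    return order[0]


def coverage(decisions: Sequence[Optional[int]]) -> float:
    """Fraction of query products for which an analog was returned."""
    return sum(d is not None for d in decisions) / len(decisions)


if __name__ == "__main__":
    queries = [[0.9, 0.3, 0.2], [0.4, 0.1], [0.9, 0.85], [0.2, 0.95]]
    decisions = [select_analog(s) for s in queries]
    print(decisions, coverage(decisions))
```

Raising either threshold in this sketch makes the selector abstain more often, which lowers coverage in exchange for fewer forced, potentially invalid matches. This is the recall-coverage trade-off the abstract refers to.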







ISSN: 2307-8162