eBay Announces 2023 University Machine Learning Competition

Liz Morton

The 5th annual eBay University Machine Learning Competition is underway, with winners offered the chance of a summer internship with eBay in 2024.

Last year's competition challenged students to build machine learning models that extract and label "named entities" in listing titles: people, brands, organizations, locations, styles, materials, patterns, product names, units of measure, sizes, etc.

eBay 2022 University Machine Learning Competition
eBay announces 4th Annual Machine Learning University Challenge - extracting product data from listing titles.

The problem we invite you to consider for this year is to build a model that can accurately extract and label the named entities in the dataset of item titles on eBay. Named Entities are the semantic strings/words/phrases that refer to people, brands, organizations, locations, styles, materials, patterns, product names, units of measure, clothing sizes, etc.

This year's competition looks very similar, still focusing on extracting named entities from listing titles, but eBay has added an international component by making the data set specific to listings taken from the athletic shoes category on the eBay Germany site.

Interested students can sign up on EvalAI until October 10, 2023.

The problem we invite you to consider for this year is to build a model that can accurately extract and label the named entities in the dataset of item titles on eBay. Named Entities are the semantic strings/words/phrases that refer to people, brands, organizations, locations, styles, materials, patterns, product names, units of measure, clothing sizes, etc.

Named Entity Recognition (NER) is the machine learning process of automatic labeling and extracting important named entities in a text that carry a particular meaning. In e-commerce, NER is used to process listing or product titles and descriptions, queries, and reviews, or wherever extraction of important data from raw text is desired.

At eBay, we apply NER in a variety of applications, in particular for extracting aspects from listings (seller-facing context), and from search queries (buyer-facing context). In both of these contexts NER plays a crucial role to bridge unstructured text data to structured data. This challenge focuses on extraction from listings.
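As a rough illustration of what "bridging unstructured text to structured data" can look like, the sketch below groups per-token tags into an aspect dictionary. It assumes a BIO tagging scheme (B- starts an entity, I- continues it, O is outside any entity), which is a common NER convention and not something the competition materials quoted here specify. The title, the tag sequence, the aspect names, and the function name `tags_to_aspects` are all made up for the example.

```python
def tags_to_aspects(tokens, tags):
    """Group BIO-tagged tokens into {aspect_name: [values]}.

    Assumes tags look like "B-Brand", "I-Brand", or "O" (hypothetical format).
    An "I-" tag that does not continue an open span of the same name is ignored.
    """
    spans = []          # each entry: (aspect_name, [tokens in the span])
    span_open = False   # True while the previous token belonged to a span
    for token, tag in zip(tokens, tags):
        prefix, _, name = tag.partition("-")
        if prefix == "B":
            spans.append((name, [token]))
            span_open = True
        elif prefix == "I" and span_open and spans[-1][0] == name:
            spans[-1][1].append(token)
        else:
            span_open = False

    aspects = {}
    for name, span_tokens in spans:
        aspects.setdefault(name, []).append(" ".join(span_tokens))
    return aspects


# Hypothetical English listing title with hand-written tags.
tokens = "Nike Air Max 90 running shoes size 10 black".split()
tags = ["B-Brand", "B-Model", "I-Model", "I-Model",
        "B-Type", "I-Type", "O", "B-Size", "B-Color"]

aspects = tags_to_aspects(tokens, tags)
print(aspects)
```

In a real entry the tags would come from a trained model rather than being written by hand; the grouping step is the part that turns a flat title into structured aspects.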

Reflecting eBay’s international character, we chose this year’s challenge data to be from a non-English site: the data is from listings on eBay’s German site...

The Challenge
Named entity recognition (NER) is a fundamental task in Natural Language Processing (NLP) and one of the first stages in many language understanding tasks. It has drawn research attention for a few decades, and its importance has been well recognized in both academia and industry.

While NER is applied in many different settings, for this challenge, we will only be using eBay listing titles for NER. A few examples of NER labeling of listing titles are shown below (these examples are in English to illustrate the concept; the challenge data will have German language listing titles)...

Data
The data set consists of 10 million randomly selected unlabeled item titles from eBay Germany, all of which are from “Athletic Shoes” categories. Among these item titles there will be 10,000 labeled item titles (“labeled” means the aspects have been extracted). There will also be an annexure document provided that describes the dataset. Finally, we will provide the set of aspect names that should be extracted from each item title (as stated before, not all titles have all aspects). Each item title will have a unique identifier (a record number).

The 10,000 labeled item titles will be split into three groups:

  • Training set (5,000 records)
  • Quiz set (2,500 records)
  • Test set (2,500 records)

The 10 million unlabeled titles and the training set are intended for participants to build their models/prediction systems. The actual aspects will be provided for each item title in the training set, along with the item title record number to link the aspects to the title.
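The record-number linkage described above could be handled along these lines. This is a minimal sketch: the article does not describe the file format, so the in-memory shapes (`titles` as a dict, `labels` as tuples), the sample German title, and the function name `link_aspects` are assumptions for illustration only. The split sizes are the ones stated in the article.

```python
# Split sizes as stated in the article: 10,000 labeled titles in total.
SPLIT_SIZES = {"training": 5_000, "quiz": 2_500, "test": 2_500}


def link_aspects(titles, labels):
    """Attach labeled aspects to their titles via the record number.

    titles: {record_number: title}                     (hypothetical shape)
    labels: iterable of (record_number, aspect_name, aspect_value)
    """
    linked = {}
    for record_number, aspect_name, aspect_value in labels:
        if record_number not in titles:
            raise KeyError(f"label refers to unknown record {record_number}")
        entry = linked.setdefault(
            record_number, {"title": titles[record_number], "aspects": {}}
        )
        entry["aspects"].setdefault(aspect_name, []).append(aspect_value)
    return linked


# Made-up sample data; not taken from the competition data set.
titles = {1: "Adidas Laufschuhe Herren Gr. 42", 2: "Puma Sneaker Damen"}
labels = [(1, "Brand", "Adidas"), (1, "Size", "42"), (2, "Brand", "Puma")]

linked = link_aspects(titles, labels)
print(linked[1])
print(sum(SPLIT_SIZES.values()))
```

Keying everything on the record number keeps the labels usable even if the title and label records arrive in different orders.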


View the full rules, FAQ, and timeline for the eBay 2023 University Machine Learning Competition at EvalAI:

EvalAI: Evaluating state of the art in AI
EvalAI is an open-source web platform for organizing and participating in challenges to push the state of the art on AI tasks.


Liz Morton is a seasoned ecommerce pro with 17 years of online marketplace sales experience, providing commentary, analysis & news about eBay, Etsy, Amazon, Shopify & more at Value Added Resource!

