RecTour 2024 Challenge

Description:

The RecTour 2024 Challenge, organized by Booking.com, takes place as part of the RecTour workshop.

The challenge focuses on ranking reviews, an important factor in users’ decision-making. The simplest way to rank reviews is by review score or by date.

An alternative approach is to rank reviews by the number of “helpfulness” votes they receive. However, most reviews never receive any helpfulness votes, so this approach suffers from presentation bias.

In this challenge, the task is to match given accommodations and users to their respective review IDs. The idea is that when a new user interacts with the booking system, we can analyze the accommodation they are viewing along with the available user features (e.g., couple, country). This enables us to display reviews in an order that takes the review content into account with respect to the user and accommodation characteristics.

To do so, Booking.com provides a unique training dataset containing 1.6 million reviews based on real anonymized bookings. 

Registration and participation:

To participate in the Booking.com RecTour 2024 Challenge, each team needs to fill out this form – Link

Upon completing your registration, you will receive an email with a link to the dataset and further instructions on how to extract it.

Data:

There are 3 sets of data for this challenge. Currently, only the training data is available. As the challenge progresses, we will also release a validation set and a test set.

Each set is separated into three files:

  1. Users – holds information regarding users and accommodation features.
  2. Review – holds information regarding reviews.
  3. Matches – the true label linking a given user_id and accommodation_id to a review_id (only positive examples).

The Matches file for the test set (the true labels) will not be accessible during the competition; it will be used to assess submitted predictions.

Participants are encouraged to create their own negative labels by leveraging the information from the Matches file.
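
One possible way to build negative labels is sketched below in Python. It rests on an assumption that is not stated by the organizers: that any other review of the same accommodation can serve as a negative example for a given user. The function name sample_negatives, the tuple layout, and the parameter n_neg are all hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict


def sample_negatives(matches, n_neg=2, seed=0):
    """Add sampled negative examples to a list of positive matches.

    matches: list of (user_id, accommodation_id, review_id) tuples,
             i.e. the positive rows of the Matches file (layout assumed).
    Returns a list of (user_id, accommodation_id, review_id, label) tuples,
    where label is 1 for a true match and 0 for a sampled negative.
    """
    rng = random.Random(seed)

    # Collect every review_id known for each accommodation.
    reviews_by_acc = defaultdict(set)
    for _, acc_id, review_id in matches:
        reviews_by_acc[acc_id].add(review_id)

    labelled = []
    for user_id, acc_id, review_id in matches:
        labelled.append((user_id, acc_id, review_id, 1))
        # Assumption: other reviews of the same accommodation are negatives.
        candidates = sorted(reviews_by_acc[acc_id] - {review_id})
        for neg in rng.sample(candidates, min(n_neg, len(candidates))):
            labelled.append((user_id, acc_id, neg, 0))
    return labelled
```

Sampling within the same accommodation keeps the negatives close to the ranking task itself, since the model has to order reviews of one accommodation for one user. Other strategies, such as sampling reviews from different accommodations, are equally possible.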

A description of the fields available within this dataset can be found in the list below:

Metric for evaluation:

We will assess performance using MRR@10. Participants must submit a prediction file containing accommodation_id, user_id, and 10 review_ids sorted by their algorithm.
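
For local validation, MRR@10 can be computed along the following lines. This is a minimal sketch that follows the standard definition of the metric (the reciprocal of the rank of the true review, or 0 if it is not in the top 10, averaged over all queries); it is not the organizers' evaluation script. The function name mrr_at_k and the dictionary layout keyed by (accommodation_id, user_id) are assumptions made for illustration.

```python
def mrr_at_k(predictions, truth, k=10):
    """Mean reciprocal rank, truncated at rank k.

    predictions: dict mapping (accommodation_id, user_id) to a ranked
                 list of review_ids, best first (layout assumed).
    truth:       dict mapping (accommodation_id, user_id) to the single
                 true review_id from the Matches file.
    """
    if not truth:
        return 0.0

    total = 0.0
    for key, true_review in truth.items():
        ranked = predictions.get(key, [])[:k]
        if true_review in ranked:
            # Ranks are 1-based, so the first position scores 1.0.
            total += 1.0 / (ranked.index(true_review) + 1)
        # A true review outside the top k contributes 0.
    return total / len(truth)
```

Under this definition, a true review ranked first contributes 1, one ranked second contributes 0.5, and one missing from the top 10 contributes 0.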

The submission file should include 12 columns: accommodation_id, user_id, and the top 10 review_ids ranked according to your model’s predictions.

Here is an example of a ranking for accommodation_id 1 and user_id 1. The algorithm predicted the following ranking of reviews: (152, 178, 689, … 42). Consequently, the submission file will contain the following row:

Accommodation id   User id   Review 1   Review 2   Review 3   …..   Review 10
1                  1         152        178        689        …..   42
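
A 12-column file of this shape could be produced as sketched below. The exact column names and file format expected by the organizers are not specified here, so the header names (accommodation_id, user_id, review_1 to review_10), the CSV format, and the function name format_submission are assumptions. The check that the 10 review_ids are distinct is also an assumption.

```python
import csv
import io

# Hypothetical header: 2 id columns followed by 10 ranked review columns.
HEADER = ["accommodation_id", "user_id"] + [f"review_{i}" for i in range(1, 11)]


def format_submission(predictions):
    """Render predictions as CSV text with 12 columns per row.

    predictions: dict mapping (accommodation_id, user_id) to a list of
                 exactly 10 review_ids, ordered best first (layout assumed).
    """
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(HEADER)
    for (acc_id, user_id), reviews in predictions.items():
        # Assumption: each row needs exactly 10 distinct review_ids.
        if len(reviews) != 10 or len(set(reviews)) != 10:
            raise ValueError(f"need 10 distinct review_ids for {(acc_id, user_id)}")
        writer.writerow([acc_id, user_id, *reviews])
    return buf.getvalue()
```

The returned text can then be written to a file with open(path, "w", newline="").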

Important additional notes:

  • Each accommodation id contains at least 10 unique reviews.
  • Each review covers at least 3 different topics, as detected using Text2Topic [1]. Reviews such as “awesome” are therefore filtered out of the dataset because they are not informative enough.
  • To extract the data, participants should use Git LFS commands.

[1] Wang, Fengjun, et al. “Text2Topic: Multi-Label Text Classification System for Efficient Topic Detection in User Generated Content with Zero-Shot Capabilities.” 2023 Conference on Empirical Methods in Natural Language Processing, EMNLP 2023. 2023.

Top performing teams will be invited to submit short papers to the workshop about their solution approach. 

Important dates (tentative):

  • June 20, 2024: Registration and challenge start
  • August 15, 2024: Release of validation dataset
  • August 27, 2024: Release of test dataset
  • September 1, 2024: Final leaderboard – submission of predictions on test set
  • September 13, 2024: Challenge paper submission deadline
  • September 20, 2024: Notification of acceptance 
  • September 27, 2024: Camera-ready submissions due
  • October 19, 2024: RecTour 2024 takes place

For other questions regarding the challenge, contact rectour2024challenge@booking.com.