This page provides links to the datasets. Our benchmark has two settings: 1) user-based separation and 2) time-based separation. The details of each set are reported in the paper. Note that the Avocado (Personalized Email Subject Generation) dataset is not publicly available; however, we provide the code here, along with the sample IDs we used to generate the dataset. Once you have access to the Avocado dataset, follow the instructions to generate the dataset.
Notice: We have deprecated the LaMP 2: Personalized News Categorization task and replaced it with a new task, LaMP 2: Personalized Movie Tagging.