Data
You can download the data set here: final_project.zip
There are several files which contain all the information about the task:
- users.csv: The user IDs and corresponding demographic data.
- User-ID: user IDs which have been anonymized
- Location: demographic data (may contain NULL-value)
- Age: demographic data (may contain NULL-value)
- book_ratings_train/test.csv: Contains the book rating information.
- User-ID: user IDs which have been anonymized
- ISBN: International Standard Book Number, which acts as the book ID (you can find a description through this)
- Book-Rating: the book ratings, ranging from 1 to 10 (note that the test data does not include this value)
- implicit_ratings.csv: Contains the implicit book rating information.
- User-ID: user IDs which have been anonymized
- ISBN: International Standard Book Number, which acts as the book ID (you can find a description through this)
- Book-Rating: all the book ratings are implicit, that is, 0
- books.csv: Contains content information about books.
- ISBN: International Standard Book Number
- Book-Title: content-based information
- Book-Author: content-based information
- Year-Of-Publication: content-based information
- Publisher: content-based information
- Image-URL-S: URLs linking to cover images (small size)
- Image-URL-M: URLs linking to cover images (medium size)
- Image-URL-L: URLs linking to cover images (large size)
- Book-Description: TA-crawled descriptions of books
- submission.csv: Contains the predictions for the testing samples.
- Book-Rating: the predicted book ratings (Please note that you don't need to provide the header in the first row)
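
To make the file layout above concrete, here is a minimal Python sketch of reading a ratings file and writing a header-less submission. It makes several assumptions that the description does not confirm: that the files are comma-separated with a header row using the column names listed above, that the test file simply omits (or leaves empty) the Book-Rating column, that ratings are integers, and that submission rows follow the order of the test file. The helper names read_ratings and write_submission, the tiny in-memory sample rows, and the mean-rating baseline are hypothetical and only stand in for the real files and a real model.

```python
import csv
import io


def read_ratings(fileobj):
    """Parse a ratings file into (user_id, isbn, rating) tuples.

    rating is None when the Book-Rating column is absent or empty,
    which is how the test data is assumed to look.
    """
    rows = []
    for row in csv.DictReader(fileobj):
        raw = row.get("Book-Rating")
        rating = int(raw) if raw not in (None, "") else None
        rows.append((row["User-ID"], row["ISBN"], rating))
    return rows


def write_submission(predictions, fileobj):
    """Write one predicted rating per line, with no header row."""
    writer = csv.writer(fileobj, lineterminator="\n")
    for rating in predictions:
        writer.writerow([rating])


# Hypothetical in-memory stand-ins for book_ratings_train.csv and
# book_ratings_test.csv (the real files would be opened with open(...)).
train_csv = io.StringIO(
    "User-ID,ISBN,Book-Rating\n"
    "u1,0000000001,8\n"
    "u2,0000000002,6\n"
)
test_csv = io.StringIO(
    "User-ID,ISBN\n"
    "u1,0000000002\n"
    "u2,0000000001\n"
)

train = read_ratings(train_csv)
test = read_ratings(test_csv)

# Placeholder baseline (an assumption, not the intended method):
# predict the rounded mean training rating for every test row.
mean_rating = round(sum(r for _, _, r in train) / len(train))
predictions = [mean_rating for _ in test]

out = io.StringIO()
write_submission(predictions, out)
submission_text = out.getvalue()
```

In practice, the StringIO objects would be replaced by files opened with open(path, newline="") for reading and open("submission.csv", "w", newline="") for writing; the delimiter and encoding should be checked against the actual files before relying on this sketch.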