Data

Phase 0

In this phase, we release two data sets:
Training data set: ml14fall_train.7z
Testing data set 1: ml14fall_test1_no_answer.7z

The meaning of features:
All our data are in LIBSVM sparse format, with feature indices starting from 1. There are 14744 instances in the training data set and 7372 in testing data set 1. Each instance has 12810 features: a flattened array of a 105 (width) × 122 (height) image, since 105 × 122 = 12810. The indexing rule is as follows:

The meaning of classes:
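0 = 鼠 (rat), 1 = 牛 (ox), 2 = 虎 (tiger), 3 = 兔 (rabbit), 4 = 龍 (dragon), 5 = 蛇 (snake), 6 = 馬 (horse), 7 = 羊 (goat), 8 = 猴 (monkey), 9 = 雞 (rooster), 10 = 狗 (dog), 11 = 豬 (pig), 12 = 一 (one), 13 = 二 (two), 14 = 三 (three), 15 = 四 (four), 16 = 五 (five), 17 = 六 (six), 18 = 七 (seven), 19 = 八 (eight), 20 = 九 (nine), 21 = 十 (ten), 22 = 壹 (formal one), 23 = 貳 (formal two), 24 = 參 (formal three), 25 = 肆 (formal four), 26 = 伍 (formal five), 27 = 陸 (formal six), 28 = 柒 (formal seven), 29 = 捌 (formal eight), 30 = 玖 (formal nine), 31 = 拾 (formal ten)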
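
As a rough illustration of the format and class table above, the following Python sketch parses one LIBSVM-style line into a label and a dense feature list, then looks up the class character. The function name parse_libsvm_line and the sample line are hypothetical and not taken from the released data. It also assumes that the label is the first token on each line and that feature values are plain numbers. It stops at the flat feature vector and does not attempt to rebuild the 2-D image.

```python
NUM_FEATURES = 12810  # 105 (width) x 122 (height), as stated above

# Class characters in label order 0..31, copied from the class table.
CLASS_NAMES = list("鼠牛虎兔龍蛇馬羊猴雞狗豬一二三四五六七八九十壹貳參肆伍陸柒捌玖拾")


def parse_libsvm_line(line, num_features=NUM_FEATURES):
    """Parse 'label idx:val idx:val ...' into (label, dense feature list).

    Hypothetical helper: assumes an integer label in the first token and
    1-based feature indices, as described for this data set.
    """
    tokens = line.split()
    label = int(float(tokens[0]))
    features = [0.0] * num_features  # features absent from the line stay 0
    for token in tokens[1:]:
        index_text, value_text = token.split(":", 1)
        index = int(index_text)
        if not 1 <= index <= num_features:
            raise ValueError(f"feature index {index} out of range")
        features[index - 1] = float(value_text)  # 1-based -> 0-based
    return label, features


# Made-up example line, only to show the shape of the input.
label, features = parse_libsvm_line("4 1:0.5 12810:1.0")
print(label, CLASS_NAMES[label], len(features))
```

For the full files, an existing LIBSVM reader is likely a better choice than a hand-written parser; this sketch is only meant to make the 1-based indexing concrete.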

Submission format:
Each line in the submission file is an integer in [0, 31], and line numbers correspond to the order of the test data, i.e., line 10 of the submission file is the answer for the 10th instance in ml14fall_test1_no_answer.dat. You can refer to the sample submission.
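
The submission rule above can be sketched in Python as follows. The helper name format_submission, its parameters, and the sample predictions are assumptions made for illustration; the sample submission remains the authoritative reference for the exact file layout.

```python
def format_submission(predictions, num_classes=32, expected_count=None):
    """Return submission text with one integer label per line.

    Hypothetical helper: line i (1-based) holds the prediction for the
    i-th test instance. Pass expected_count (e.g. 7372 for testing data
    set 1) to also check the number of lines.
    """
    predictions = list(predictions)
    if expected_count is not None and len(predictions) != expected_count:
        raise ValueError(
            f"expected {expected_count} predictions, got {len(predictions)}"
        )
    lines = []
    for position, label in enumerate(predictions, start=1):
        is_plain_int = isinstance(label, int) and not isinstance(label, bool)
        if not is_plain_int or not 0 <= label < num_classes:
            raise ValueError(f"line {position}: invalid label {label!r}")
        lines.append(str(label))
    return "\n".join(lines) + "\n"


# Made-up predictions for three instances, only to show the output shape.
text = format_submission([0, 31, 12])
print(text, end="")
```

Writing the returned text to a file, for example with open(path, "w"), yields a submission whose line order matches the order of the test instances.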

Phase 1

In this phase, we will release the answers for testing data set 1, along with testing data set 2 without answers.
The release is scheduled for 01/05/2015 00:00 in the "Final Project Track 0 Phase 1".
On the Phase 1 scoreboard, only part of the data will be used to compute the "Public Score"; after the competition ends, the "Private Score" will also be shown on the scoreboard.