Evaluation metrics
Submissions to all subtasks will be ranked using the macro-F1 score. For the multi-label task, additional evaluation metrics include micro-F1, example-based F1, and Hamming loss. Macro-averaged metrics are computed by evaluating each label independently and averaging across labels, while micro-averaged metrics aggregate true positives, false positives, and false negatives across all labels before computing the score. Hamming loss quantifies the proportion of incorrectly predicted label entries.
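As a reference, the multi-label metrics above can be sketched in plain Python as follows. This is an illustrative implementation based on the standard definitions, not the organizers' official scoring script; the toy `y_true`/`y_pred` matrices are made-up examples.

```python
# Toy ground truth and predictions: rows are instances, columns are labels
# (0/1 indicator values). Purely illustrative data.
y_true = [[1, 0, 1], [0, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 0]]

def per_label_f1(y_true, y_pred, label):
    """F1 for a single label column."""
    tp = sum(t[label] == 1 and p[label] == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t[label] == 0 and p[label] == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t[label] == 1 and p[label] == 0 for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def macro_f1(y_true, y_pred):
    """Average of per-label F1 scores (each label weighted equally)."""
    n_labels = len(y_true[0])
    return sum(per_label_f1(y_true, y_pred, l) for l in range(n_labels)) / n_labels

def micro_f1(y_true, y_pred):
    """F1 over counts pooled across all labels and instances."""
    tp = fp = fn = 0
    for t, p in zip(y_true, y_pred):
        for ti, pi in zip(t, p):
            tp += ti == 1 and pi == 1
            fp += ti == 0 and pi == 1
            fn += ti == 1 and pi == 0
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def example_f1(y_true, y_pred):
    """Example-based F1: F1 computed per instance, then averaged."""
    scores = []
    for t, p in zip(y_true, y_pred):
        tp = sum(ti == 1 and pi == 1 for ti, pi in zip(t, p))
        denom = sum(t) + sum(p)
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

def hamming_loss(y_true, y_pred):
    """Fraction of label entries predicted incorrectly."""
    wrong = sum(ti != pi for t, p in zip(y_true, y_pred) for ti, pi in zip(t, p))
    return wrong / (len(y_true) * len(y_true[0]))
```

For the single-label task, macro-F1 is computed the same way, treating each class as one label of a one-hot indicator matrix.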
Final submission format
All submissions must be packaged as a single ZIP file. Inside the ZIP, the predictions must be in CSV format, with rows in the same order as the instance IDs in the downloaded data.
For Task 1, participants must submit one predicted label per line, formatted as follows:
0
1
2
3
0
The CSV file for Task 1 must be named task1_predictions.csv.
For Task 2, a multi-label classification task, participants must submit for each instance a 7-dimensional binary vector indicating the predicted labels. For example:
0, 0, 0, 0, 0, 0, 1
1, 0, 1, 0, 1, 1, 1
The CSV file for Task 2 must be named task2_predictions.csv.
Submit a single ZIP file with a generic name, for example "predictions.zip". If you participate in only one task, the ZIP will contain a single CSV file; if you participate in both tasks, include both prediction CSV files in the same ZIP.
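The packaging steps above can be sketched with the Python standard library. The file names follow the instructions in this section; the prediction values below are placeholders, and the label vectors reuse the examples shown above.

```python
import csv
import zipfile

# Task 1: one predicted label per row (placeholder values from the example).
task1_preds = [0, 1, 2, 3, 0]
with open("task1_predictions.csv", "w", newline="") as f:
    writer = csv.writer(f)
    for label in task1_preds:
        writer.writerow([label])

# Task 2: one 7-dimensional binary vector per row (example vectors from above).
task2_preds = [
    [0, 0, 0, 0, 0, 0, 1],
    [1, 0, 1, 0, 1, 1, 1],
]
with open("task2_predictions.csv", "w", newline="") as f:
    writer = csv.writer(f)
    for vec in task2_preds:
        writer.writerow(vec)

# Package both CSVs into a single ZIP; drop one write() call
# if you participate in only one task.
with zipfile.ZipFile("predictions.zip", "w") as zf:
    zf.write("task1_predictions.csv")
    zf.write("task2_predictions.csv")
```

Note that `csv.writer` emits comma-separated values without spaces (`1,0,1,...`), which matches standard CSV; the spaces in the example above are only for readability.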
