Evaluation Server

We have set up an evaluation server for the IWSLT 2011 development and evaluation data sets of each track:

If you do not yet have a username for the evaluation server, you must register online first. The same login information works for both evaluation servers, so you only have to register once.

After login, select 'Make a new Submission' to submit your run. Select the respective track and upload your hypothesis file. The format of the hypothesis files to be uploaded to the evaluation server depends on the translation task. Currently, the following file formats are supported:

Track CTM Format Plain Text XML Format
ASR × ×
MT ×

The segments must appear in the same order as in the corresponding .xml input file. For the ASR task, the order is the same as that of the English-to-French translation task: for dev2010, the sentences of talk 69 are expected first, followed by those of talks 129, 227, 535, 93, 457, 453, and 531; for test2010, the order of talks is 779, 769, 792, 799, 767, 790, 785, 783, 824, 805, 837. You can also enter additional information (the 'SystemID' and 'Description' fields) to keep track of your run submission parameters and the like.
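The talk order above can be checked mechanically before uploading. The following is a minimal Python sketch, assuming your hypotheses are held in memory as one list of lines per talk ID. The names `TALK_ORDER` and `assemble_hypotheses` and this in-memory layout are illustrative assumptions, not part of the evaluation server.

```python
# Talk order for the ASR and English-to-French tracks, as listed above.
TALK_ORDER = {
    "dev2010": [69, 129, 227, 535, 93, 457, 453, 531],
    "test2010": [779, 769, 792, 799, 767, 790, 785, 783, 824, 805, 837],
}


def assemble_hypotheses(dataset, segments_by_talk):
    """Concatenate per-talk hypothesis lines in the expected talk order.

    `segments_by_talk` maps a talk ID to the list of hypothesis lines for
    that talk (a hypothetical layout chosen for this sketch).
    """
    order = TALK_ORDER[dataset]
    # Fail early if any expected talk has no hypotheses at all.
    missing = [talk for talk in order if talk not in segments_by_talk]
    if missing:
        raise ValueError(f"no hypotheses for talks: {missing}")
    lines = []
    for talk in order:
        lines.extend(segments_by_talk[talk])
    return lines
```

Calling `assemble_hypotheses("dev2010", segments_by_talk)` returns the lines starting with talk 69 and ending with talk 531. Writing them to the hypothesis file in the format required by your track is left to the caller.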

After you press 'Calculate Scores', the evaluation scripts are applied and the automatic evaluation scores are sent to the email address that you specified at registration time. The evaluation metrics used for the scoring are as follows:

Last modified: 2014/03/27 16:56