Published in Proc. of ACL/EACL Joint Conference. Madrid. 1997. 112-119.
In the past the evaluation of machine translation systems has focused on single system evaluations because there were only few systems available. But now there are several commercial systems for the same language pair. This requires new methods of comparative evaluation. In the paper we propose a black-box method for comparing the lexical coverage of MT systems. The method is based on lists of words from different frequency classes. It is shown how these word lists can be compiled and used for testing. We also present the results of using our method on 6 MT systems that translate between English and German.
Full version (compressed postscript)