Welcome to FEMTI!

The Framework for Machine Translation Evaluation in ISLE (FEMTI) is a resource that helps MT evaluators define contextual evaluation plans. FEMTI consists of two interrelated classifications, or taxonomies. The first lists the possible characteristics of the contexts of use that apply to MT systems. The second lists the possible quality characteristics of an MT system, along with the metrics that have been proposed to measure them.

Evaluators using FEMTI specify the intended context of use for an MT system using the first classification, and submit it to FEMTI. In return, FEMTI proposes a set of quality characteristics that are relevant to that context, using its embedded knowledge base. Evaluators can modify this set of quality characteristics and select evaluation metrics for each of them, by browsing the second classification. Evaluators can then print the evaluation plan and execute the evaluation. To start using FEMTI, click on RUN FEMTI.

FEMTI is an evolving resource that is improved by its users: please send us your feedback using the comment links, either above or in any pop-up window that you might open. FEMTI helps people who want to compare several MT systems intended for a specific use, evaluate the suitability of a given system for a given task, develop an MT system, or learn about the needs of users in order to find niche applications for new MT systems.


Instructions: RUN FEMTI

  1. Define the intended context of use for the MT system(s) to be evaluated, by selecting relevant characteristics of the context in the left-hand frame, using the checkboxes. Clicking on a characteristic will open a pop-up window with explanatory details.
  2. When all the relevant evaluation requirements (intended context of use) have been specified, click on the submit button at the bottom of the left-hand frame.
  3. After submission, FEMTI computes the quality characteristics relevant to the selected context of use and highlights them in yellow in the right-hand frame. Clicking on a quality characteristic will open a pop-up window with explanatory details.
  4. For each proposed quality characteristic, check the adjacent checkbox to view existing metrics, then select one or more metrics that you want to use for your evaluation. If you do not want to include a proposed quality characteristic in your final evaluation plan, just leave it unchecked.
  5. You can select additional quality characteristics and metrics by checking the corresponding checkboxes: the additional qualities will be highlighted in orange.
  6. Save your quality model (list of quality characteristics and metrics) by clicking on one of the buttons at the bottom of the right-hand frame, depending on the desired output format: PDF, HTML or RTF. This will generate a draft evaluation plan including your intended context of use, the quality characteristics and the selected metrics, with definitions and other related information. The document will be displayed in a new window, from which you can save or print it using your browser.

Alternatively, you can simply browse the list of quality characteristics and metrics in the right-hand frame and choose what you wish to include in your evaluation plan.
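
For readers who prefer to think in terms of data, the plan built in the steps above can be pictured as a small record: the intended context of use, the quality characteristics proposed by FEMTI, and the metrics checked for each characteristic. The Python sketch below is only an illustration under stated assumptions: the class name, the method names, and the example characteristic and metric labels are all hypothetical and are not taken from FEMTI itself.

```python
from dataclasses import dataclass, field


@dataclass
class EvaluationPlan:
    """Hypothetical stand-in for the plan assembled in RUN FEMTI."""
    context: list            # context features checked in the left-hand frame
    proposed: set            # qualities FEMTI proposed (highlighted in yellow)
    metrics: dict = field(default_factory=dict)  # quality -> selected metrics

    def select(self, quality, *metric_names):
        """Check a quality characteristic and pick metrics for it."""
        self.metrics.setdefault(quality, []).extend(metric_names)

    def origin(self, quality):
        """'proposed' if FEMTI suggested it, 'added' if the user chose it."""
        return "proposed" if quality in self.proposed else "added"

    def render(self):
        """Plain-text draft of the plan; unchecked qualities are left out."""
        lines = ["Intended context of use:"]
        lines += [f"  - {feature}" for feature in self.context]
        lines.append("Quality characteristics and metrics:")
        for quality, names in self.metrics.items():
            lines.append(f"  - {quality} ({self.origin(quality)}): {', '.join(names)}")
        return "\n".join(lines)


# Illustrative labels only; real FEMTI items and metrics differ.
plan = EvaluationPlan(
    context=["task: assimilation"],
    proposed={"fidelity", "speed"},
)
plan.select("fidelity", "adequacy rating")       # keep a proposed quality
plan.select("terminology", "term error count")   # add an extra quality
print(plan.render())
```

In this sketch, "speed" was proposed but never checked, so it does not appear in the rendered plan, which mirrors step 4 above.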


What FEMTI offers

Header menu
  • Introduction: this page.
  • RUN FEMTI: displays both classifications (context of use, or evaluation requirements, and quality characteristics plus metrics) and allows the user to start building an evaluation plan.
  • Printable version: opens a document containing the full contents of both classifications, in a new window; this can be used to browse the two classifications offline, but does not contain the links between them.
  • References: a detailed list of publications about MT evaluation; references from pop-up windows point to this list as well.
  • Comments: opens a form to submit feedback about FEMTI. Thanks for using it!
Classifications (visible when clicking on RUN FEMTI)
  • Left-hand frame: a classification of the main features defining the context of use (also called "part 1"), that is: the type of user of the MT system, the type of task the system is used for, and the nature of the input to the system.
  • Right-hand frame: a classification of the MT software quality characteristics (also called "part 2"), into hierarchies of sub-characteristics, with internal and/or external metrics at the lowest levels, while the upper levels are based on the ISO/IEC 9126 characteristics.
  • In both frames, hierarchies can be expanded or collapsed at will; clicking on an item (feature or quality characteristic) will pop up a window displaying its definition and notes about it; in each of these windows, a comment link at the bottom allows users to send feedback about that particular item.
  • The correspondence between context features and quality characteristics is stored internally (as a 'GCQM'). Although it is used implicitly to suggest a quality model after you submit an intended context of use, you cannot view it directly in non-expert mode.
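
To make the idea of a stored correspondence more concrete, here is a minimal Python sketch of how a table linking context features to quality characteristics could be used to suggest a quality model. It is not FEMTI's actual computation, which is not described on this page: the table contents, the integer weights, the threshold, and the function name are all assumptions made for illustration.

```python
from collections import defaultdict

# Assumed structure: context feature -> {quality characteristic: weight}.
# Every entry here is invented; the real correspondence is internal to FEMTI.
GCQM = {
    "task: assimilation": {"fidelity": 2, "speed": 3},
    "task: dissemination": {"fidelity": 3, "well-formedness": 3},
    "input: technical documents": {"terminology": 3, "well-formedness": 1},
}


def suggest_qualities(selected_features, gcqm, threshold=2):
    """Sum the weights of each quality over the selected context features
    and return, in alphabetical order, those reaching the threshold."""
    scores = defaultdict(int)
    for feature in selected_features:
        # Features missing from the table contribute nothing.
        for quality, weight in gcqm.get(feature, {}).items():
            scores[quality] += weight
    return sorted(q for q, total in scores.items() if total >= threshold)


selected = ["task: assimilation", "input: technical documents"]
print(suggest_qualities(selected, GCQM))
```

With these invented weights, "well-formedness" accumulates a score of only 1 for the selected context and is therefore not suggested, while the other three characteristics are.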
