XTAG
A user interface for the ISSCO tagger tool
TATOO version 3.00
 
 
 

Xtag is a user interface written in Tcl/Tk and expectk. It integrates all the features of the tagger modules.

Xtag can read a configuration file containing the defaults for your experiment. All the predefined options, file names, etc. are then automatically transferred to the appropriate location.
It can be useful to save the text-dependent options in a file to avoid retyping them every time. To do so, click the "save Preferences" button and enter the name of the file and a label. The reverse operation is performed with the "Read References" button (you can prepare a references file for each training or tagging phase). See the CONFIGURATION FILE section for the definition of the possible variables.

The Help button opens the man pages in your favourite browser.
 

COMMAND

 
Main window of Xtag
 

In the main window, use the appropriate button to select the part of the tagger you want to run.
You can follow the operations and inspect the output.

CONFIG

First of all, you must configure the options for the text to be processed in the "config" window, using the usual arguments of the tagger commands.

 
 

Configuration window of Xtag
 
 
 

PREPARATION
 

In the "Preparation" window, the text is prepared for the tagger or trainer programs. You can also select the "Simple conversion" option, which applies the conversions without changing the format (-H option). See the mpreptxt man page.
 


 
 
 

TRAINING

In the "Training" window, you can run a training session (mtrain program) and also operate on the matrices (mcreate, mprint and edit). The matrices file can be created by mtrain, which initializes the matrix with equi-probable values based on the tags found in the corpus. The values can be readjusted to reflect user-defined preferences as stated in the biases_file. The training phase can be repeated for any number of iterations, and each iteration may assign different probabilities. The matrices are used by the tagging program mtag to calculate the most probable tag for each word in a text.
mtrain readjusts the parameters and returns the new values in a compiled matrices file.
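
To illustrate what an equi-probable initialization means, the following Python sketch builds a uniform tag-to-tag matrix from the tags found in a small corpus. It is not the code of mtrain or mcreate: the function name uniform_transitions, the in-memory corpus of (word, tag) pairs and the choice of a transition matrix are all assumptions made for the example.

```python
def uniform_transitions(tagged_corpus):
    """Return (tags, matrix) with every tag-to-tag transition equally likely.

    tagged_corpus: iterable of (word, tag) pairs (hypothetical in-memory
    form; the real tools read their own file formats).
    """
    # Collect the distinct tags that occur in the corpus.
    tags = sorted({tag for _word, tag in tagged_corpus})
    if not tags:
        raise ValueError("corpus contains no tags")
    # Each row is a probability distribution, so every cell gets 1/N.
    p = 1.0 / len(tags)
    matrix = {prev: {nxt: p for nxt in tags} for prev in tags}
    return tags, matrix


corpus = [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")]
tags, matrix = uniform_transitions(corpus)
```

Each row of the resulting matrix sums to 1, which gives training a neutral starting point that later iterations can readjust.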
 

 
 
 
 

TAGGING
 

In the "Tagging" window, you can tag a text (mtag program). The "Re-estimate" option re-estimates the probabilities to improve accuracy. If the correct tag list and the matrices output file are specified, the tagger automatically readjusts the values in the matrices according to the correct solutions and retags the text. This obviously improves the result for the given text, but it can also improve the results for texts with similar structures.
 

 
 

 
 

RESULTS
 

In the "Results" window, you can apply the bias rules to the matrices (mbiases program) and print the results with the mdiff, mdiffb or mcontext commands.
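
The general idea of comparing the tagger output with a hand-corrected tag list can be sketched in Python as follows. This does not reproduce the behaviour or output format of mdiff, mdiffb or mcontext; the function compare_tags and the representation of tag lists as plain Python lists are assumptions made for the example.

```python
from collections import Counter


def compare_tags(predicted, correct):
    """Return (accuracy, confusions) for two tag sequences of equal length.

    confusions counts (correct_tag, predicted_tag) pairs that disagree.
    """
    if len(predicted) != len(correct):
        raise ValueError("tag lists must have the same length")
    # Count every position where the tagger disagrees with the reference.
    confusions = Counter(
        (c, p) for p, c in zip(predicted, correct) if p != c
    )
    matches = len(predicted) - sum(confusions.values())
    accuracy = matches / len(predicted) if predicted else 0.0
    return accuracy, confusions


accuracy, confusions = compare_tags(
    ["DET", "NOUN", "NOUN"],   # tagger output (made-up data)
    ["DET", "NOUN", "VERB"],   # hand-corrected tags (made-up data)
)
```

The confusion counts show which pairs of tags are most often mixed up, which is the kind of information that can guide the writing of bias rules.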
 


 

 
CONFIGURATION FILE
 

# General configuration for this test
TAGCNV_F         "states.cnv"    States conversion file
WRDCNV_F         "words.cnv"     Word conversion file
BIASLST_F        "biases.lst"    Biases file
NBRFIELD         3               Number of fields preceding the [BOS|EOS] field
LEXCOLUMN        3               Column in which the word is found
PREMSEP          "\\\\"          Separator within a [LEM,ANNOT] pair
SECSEP           "\\|"           Separator between the sets of ambiguous [LEM,ANNOT] pairs assigned to a given word
# Defaults for the preparation session
PR_INPUT_F       "text"          Input text to be prepared
PR_OUTPUT_F      "text.tr"       Output text
PR_MATRICES_F    "MMinit"        Matrices file
# Defaults for the training session
TR_INPUT_F       "text.tr"       Input text
TR_M_INPUT_F     "MMinit"        Input matrices file
TR_M_OUTPUT_F    "MM_01"         Output matrices file
TR_M_PRINT_F     "MM_01.clr"     Output file of the print command
TR_LOOP          1               Loop number
# Defaults for the tagging session
TA_INPUT_F       "text.tr"       Input text
TA_OUTPUT_F      "text.tg"       Output text
TA_M_INPUT_F     "MM_01"         Matrices file
TA_PRECISION     0               Precision
TA_LOOP          1               Loop number
TA_M_OUTPUT_F    ""              Output matrices file
TA_TAG_OUTPUT_F  "/tmp/Taglst"   Tag list file
# Defaults for the biasing session
B_M_INPUT_F      "MM_01"         Input matrices file
B_M_OUTPUT_F     "MM_01b"        Output matrices file
B_BIASES_F       "biases.lst"    Biases file
# Defaults for the results session
D_INPUT1_F       "/tmp/Taglst"   Tag list file
D_INPUT2_F       "TAG1"          Correct tag list (mhandtag)
D_TAG1           ""              Tag 1 present in the tag list
D_TAG2           ""              Tag 2 present in the correct tag list
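
A file laid out like the listing above could be read with a few lines of Python. The sketch below is not how Xtag itself parses the file (Xtag is written in Tcl/Tk). It assumes that lines starting with "#" are comments, that each setting is an upper-case name followed by a quoted string or a bare number, and that any trailing text is a description. The name parse_config is hypothetical, and quoted values are kept verbatim because the escaping rules are not documented here.

```python
import re

# NAME, then a double-quoted string or a bare token, then optional description.
SETTING = re.compile(r'^([A-Z][A-Z0-9_]*)\s+("(?:[^"\\]|\\.)*"|\S+)\s*(.*)$')


def parse_config(text):
    """Return a dict mapping variable names to their default values."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        match = SETTING.match(line)
        if match is None:
            continue  # skip lines that do not look like a setting
        name, raw, _description = match.groups()
        if raw.startswith('"'):
            value = raw[1:-1]  # drop the quotes, keep the content verbatim
        else:
            try:
                value = int(raw)
            except ValueError:
                value = raw
        settings[name] = value
    return settings


sample = r'''
# Defaults for the training session
TR_INPUT_F  "text.tr"  Input text
TR_LOOP 1  Loop number
TA_M_OUTPUT_F  ""  Output matrices file
SECSEP "\\|" Separator between the sets of pairs
'''
config = parse_config(sample)
```

Numeric defaults such as TR_LOOP come back as integers, while file names and separators come back as strings, including the empty string for unset values.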

SEE ALSO

mtrain(1)
mtag(1)
mcreate(1)
mtagfreq(1)
mprint(1)
mdiff(1)
mdiffb(1)
mcontext(1)
mbiases(1)
mhandtag(1)

AUTHOR

Gilbert ROBERT
(Gilbert.Robert@issco.unige.ch)
Copyright (c) 1998 Issco, Geneva, Switzerland
ISSCO, 54 route des Acacias
1227 Geneva, Switzerland

Comments, suggestions, and bug reports are always welcome.