This is a library for estimating inter-annotator agreement using the weighted Fleiss' kappa coefficient. See the reference for the definition of the weighted Fleiss' kappa coefficient.
To evaluate inter-annotator agreement, the following command-line tool is provided:
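As background only, the following is a minimal sketch of the standard (unweighted) Fleiss' kappa. It is not the weighted variant that this library estimates, whose definition is given in the reference, and it is not taken from iaa.jar. The class name `FleissKappaSketch` and the input layout (one row per item, one column per category, each cell holding the number of annotators who chose that category) are assumptions made for illustration.

```java
class FleissKappaSketch {
    /**
     * Unweighted Fleiss' kappa (illustration only, not the library's weighted version).
     * counts[i][j] = number of annotators who assigned item i to category j.
     * Every row must sum to the same number of annotators n (n >= 2).
     * Returns NaN when all ratings fall into a single category (expected agreement is 1).
     */
    static double kappa(int[][] counts) {
        int items = counts.length;
        int categories = counts[0].length;
        int n = 0;
        for (int c : counts[0]) {
            n += c;
        }

        double[] categoryTotals = new double[categories];
        double observed = 0.0;
        for (int[] row : counts) {
            int rowSum = 0;
            int sumOfSquares = 0;
            for (int j = 0; j < categories; j++) {
                rowSum += row[j];
                sumOfSquares += row[j] * row[j];
                categoryTotals[j] += row[j];
            }
            if (rowSum != n) {
                throw new IllegalArgumentException("every item needs the same number of ratings");
            }
            // Share of agreeing annotator pairs for this item.
            observed += (sumOfSquares - n) / (double) (n * (n - 1));
        }
        observed /= items;

        // Agreement expected by chance, from the overall category proportions.
        double expected = 0.0;
        for (double total : categoryTotals) {
            double p = total / (items * (double) n);
            expected += p * p;
        }
        return (observed - expected) / (1.0 - expected);
    }
}
```

With three annotators and two items, `kappa(new int[][] {{3, 0}, {0, 3}})` gives 1.0 (perfect agreement), while `{{2, 1}, {1, 2}}` gives a negative value, meaning agreement below chance level.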
java -Dfile.encoding=UTF-8 -cp [CLASSPATH] jp.co.d_itlab.dbdc.tool.IAATool -s maa -dic [DIC_DIR] -c INT -a [ANNOTATORS] -i [INPUT_DIR] (-l [LOCALE])
Option Description:
- cp - paths to iaa.jar (placed in the "jar" directory) and to the JAR files of the dependent libraries listed in the Requirements.
- s - command ("maa" estimates inter-annotator agreement by the weighted Fleiss' kappa coefficient).
- dic - path to the directory containing the text files that define the error categories (see the "res/dic" directory).
- a - IDs of the annotators to be evaluated, separated by commas (e.g., W1,W2,W3 for annotators W1, W2, and W3).
- i - path to the annotated data
- l - language of the data (ja: Japanese (default), en: English). Optional.
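The options above can be assembled programmatically, for example when the tool is launched from another Java program. The sketch below only builds the argument list shown in the usage line. The class name `IaaCommandSketch` and all paths passed to it are hypothetical, and `-c INT` is copied verbatim from the usage line because its meaning is not described in the option list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class IaaCommandSketch {
    /**
     * Builds the command line for IAATool as a list of arguments.
     * Pass locale == null to omit the optional -l option.
     */
    static List<String> buildCommand(String classpath, String dicDir,
                                     List<String> annotators, String inputDir,
                                     String locale) {
        List<String> cmd = new ArrayList<>(Arrays.asList(
                "java", "-Dfile.encoding=UTF-8",
                "-cp", classpath,
                "jp.co.d_itlab.dbdc.tool.IAATool",
                "-s", "maa",
                "-dic", dicDir,
                "-c", "INT",                          // copied as-is from the usage line
                "-a", String.join(",", annotators),   // e.g. W1,W2,W3
                "-i", inputDir));
        if (locale != null) {
            cmd.add("-l");
            cmd.add(locale);
        }
        return cmd;
    }
}
```

The resulting list can be handed to `new ProcessBuilder(cmd).inheritIO().start()`. Note that the classpath separator is `;` on Windows and `:` on Unix-like systems.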
Sample launch settings are provided in the batch files under the "sample" directory. To run those batch files, place the dependent libraries in the "picocli", "poi", and "log4j" directories under the "jar" directory, respectively.
Error categories are assumed to be annotated in Excel files (see the template in the "res/excel" directory). The annotated files should be arranged in the following directory layout:
[data]
├─ [annotator-id]
│  ├─ [trial-id]
│  │  ├─ [trial-id_dialogue-system-id].xlsm
│  │  └─ ...
│  └─ ...
└─ ...
See examples in the "sample/data" directory.
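Before running the tool, it can help to check that the data directory follows this layout. The sketch below walks `[data]/[annotator-id]/[trial-id]/` and lists the `.xlsm` files found for each annotator. It is an illustration only: the class name `AnnotationLayoutSketch` is hypothetical, and the code does not reproduce how the library itself reads the files.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class AnnotationLayoutSketch {
    /** Maps each annotator ID to the sorted .xlsm files under its trial directories. */
    static Map<String, List<Path>> collect(Path dataDir) throws IOException {
        Map<String, List<Path>> result = new TreeMap<>();
        try (DirectoryStream<Path> annotators =
                     Files.newDirectoryStream(dataDir, p -> Files.isDirectory(p))) {
            for (Path annotator : annotators) {
                List<Path> files = new ArrayList<>();
                try (DirectoryStream<Path> trials =
                             Files.newDirectoryStream(annotator, p -> Files.isDirectory(p))) {
                    for (Path trial : trials) {
                        // Only Excel macro-enabled workbooks are picked up.
                        try (DirectoryStream<Path> sheets =
                                     Files.newDirectoryStream(trial, "*.xlsm")) {
                            for (Path sheet : sheets) {
                                files.add(sheet);
                            }
                        }
                    }
                }
                Collections.sort(files);
                result.put(annotator.getFileName().toString(), files);
            }
        }
        return result;
    }
}
```

An annotator directory that maps to an empty list, or that lists fewer files than the others, usually points to a misplaced or misnamed file.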
The compiled JAR file is provided: iaa-1.0.0.jar
- JRE 1.8 or later
- Dependent libraries (for running the command-line tool)
- PicoCli - https://github.com/remkop/picocli
Download the latest (4.6.2 or above) JAR file and place it under jar/picocli:
- picocli-4.6.2.jar
- Apache POI - https://poi.apache.org/
Download the latest (5.0.1 or above) binary distribution, extract it, and place the following JAR files under jar/poi:
- poi-5.1.0.jar
- poi-ooxml-5.1.0.jar
- poi-ooxml-lite-5.1.0.jar
- lib/commons-collections4-4.4.jar
- lib/commons-io-2.11.0.jar
- ooxml-lib/commons-compress-1.21.jar
- ooxml-lib/commons-logging-1.2.jar
- ooxml-lib/curvesapi-1.06.jar
- ooxml-lib/xmlbeans-5.0.2.jar
- Log4j 2 - https://logging.apache.org/log4j/2.x/index.html
Download the latest (2.17.0 or above) binary distribution, extract it, and place the following JAR files under jar/log4j:
- log4j-api-2.17.0.jar
-
- Ryuichiro Higashinaka, Masahiro Araki, Hiroshi Tsukahara and Masahiro Mizukami (2021), Integrated taxonomy of errors in chat-oriented dialogue systems, in Proceedings of the 22nd Annual Meeting of the Special Interest Group on Discourse and Dialogue, pp. 89-98.
- The annotation manuals (in Japanese and English) - Integrated taxonomy of errors in chat-oriented dialogue systems
©2021 DENSO IT Laboratory, Inc., All rights reserved. Redistribution or public display not permitted without written permission from DENSO IT Laboratory, Inc.