Annotation guidelines
PARSEME shared task on automatic identification of verbal MWEs - edition 1.0 (2017)


Annotation process and decision tree

We propose the following methodology for VMWE annotation:

  • Step 1 - identify a candidate, that is, a combination of a verb with at least one other word which could form a VMWE. If the candidate is a meaning-preserving variant of a prototypical verbal phrase, the following steps apply to this prototypical phrase, called the canonical form. This step is largely based on the annotators' linguistic knowledge and intuition after reading this guide.
  • Step 2 - determine which components of the candidate (or of its canonical form) are lexicalized, that is, which components cannot be omitted without the VMWE ceasing to occur. Corpus and web searches may be required to confirm intuitions about acceptable variants.
  • Step 3 - formally check if the candidate (or its canonical form) forms a VMWE and categorize it into one of the available categories, using the decision trees and detailed tests in the following sections.

We provide two decision trees that indicate the order in which tests should be applied in step 3. They determine the priority of the different categories when several tests match. The decision trees are a useful summary to consult during annotation, but they contain only very short descriptions of the tests. Each test is detailed and explained with examples in the following sections.

Decision tree 1: Identification

In this tree, a single YES answer to any one of the tests is sufficient to identify a VMWE.
  • Apply test 1 - [CRAN: Candidate contains cranberry word?]
    • YES ⇒ Annotate as a VMWE and go to test 6 - [HEAD]
    • NO ⇒ Apply test 2 - [LEX: Regular replacement of a component ⇒ unexpected meaning shift?]
      • YES ⇒ Annotate as a VMWE and go to test 6 - [HEAD]
      • NO ⇒ Apply test 3 - [MORPH: Regular morphological change ⇒ unexpected meaning shift?]
        • YES ⇒ Annotate as a VMWE and go to test 6 - [HEAD]
        • NO ⇒ Apply test 4 - [MORPHSYNT: Regular morphosyntactic change ⇒ unexpected meaning shift?]
          • YES ⇒ Annotate as a VMWE and go to test 6 - [HEAD]
          • NO ⇒ Apply test 5 - [SYNT: Regular syntactic change ⇒ unexpected meaning shift?]
            • YES ⇒ Annotate as a VMWE and go to test 6 - [HEAD]
            • NO ⇒ Apply the LVC hypothesis - [Candidate has operator verb + activity or state noun?]
              • YES ⇒ Assume a VMWE and go to test 6 - [HEAD]
              • NO ⇒ It is not a VMWE, exit
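
The order of tests in this tree can also be read as a short procedure. The following Python sketch is only an illustration and not part of the shared task tooling: the function name identify, the answer keys and the returned labels are hypothetical, and the YES/NO answers themselves still have to come from the annotator. It assumes that a missing answer counts as NO.

```python
from typing import Mapping, Optional, Tuple

# Tests of decision tree 1, in the order in which the tree applies them.
IDENTIFICATION_TESTS = ("CRAN", "LEX", "MORPH", "MORPHSYNT", "SYNT")


def identify(answers: Mapping[str, bool]) -> Tuple[str, Optional[str]]:
    """Walk decision tree 1 with an annotator's YES/NO answers.

    `answers` maps a test name to True (YES) or False (NO).
    A missing key is treated as NO (an assumption of this sketch).
    Returns (outcome, name of the deciding test or None).
    """
    for test in IDENTIFICATION_TESTS:
        if answers.get(test, False):
            # The first YES is sufficient: continue with test 6 [HEAD].
            return ("VMWE", test)
    if answers.get("LVC_HYPOTHESIS", False):
        # Operator verb + activity or state noun: only *assumed* to be
        # a VMWE at this point; continue with test 6 [HEAD].
        return ("ASSUMED_VMWE", "LVC_HYPOTHESIS")
    return ("NOT_VMWE", None)
```

For example, identify({"CRAN": False, "LEX": True}) stops at test 2 and never consults the later tests, which mirrors the priority encoded by the tree.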

Decision tree 2: Categorization

  • Apply test 6 - [HEAD: Unique verb as syntactic head of the whole?]
    • NO ⇒ Annotate as a VMWE of category OTH
    • YES ⇒ Apply test 7 - [1DEP: Verb v has exactly one dependent d?]
      • NO ⇒ Annotate as a VMWE of category ID
      • YES ⇒ Apply test 8 - [CATEG: What is the morphosyntactic category of d?]
        • Reflexive clitic ⇒ Apply IReflV-specific tests ⇒ IReflV tests positive?
          • YES ⇒ Annotate as a VMWE of category IReflV
          • NO ⇒ It is not a VMWE, exit
        • Particle ⇒ Apply VPC-specific tests ⇒ VPC tests positive?
          • YES ⇒ Annotate as a VMWE of category VPC
          • NO ⇒ It is not a VMWE, exit
        • NP or PP ⇒ Apply LVC-specific decision tree ⇒ Answer positive?
          • YES ⇒ Annotate as a VMWE of category LVC
          • NO ⇒ Annotate as a VMWE of category ID
        • Other category ⇒ Annotate as a VMWE of category ID
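
The categorization tree can be summarized in the same way. The Python sketch below is a hedged illustration only: the function name categorize, its parameters and the category strings passed in for d are hypothetical. It assumes that a NO to test 6 yields OTH, that a NO to test 7 yields ID, and that the outcome of the category-specific tests (IReflV, VPC or LVC) is supplied by the annotator as a single boolean.

```python
from typing import Optional


def categorize(
    unique_verb_head: bool,
    exactly_one_dependent: bool,
    dependent_category: Optional[str] = None,
    specific_tests_positive: bool = False,
) -> Optional[str]:
    """Walk decision tree 2; return a VMWE category, or None if the
    candidate is rejected (not a VMWE)."""
    if not unique_verb_head:        # test 6 [HEAD] answered NO
        return "OTH"
    if not exactly_one_dependent:   # test 7 [1DEP] answered NO
        return "ID"
    # Test 8 [CATEG]: branch on the morphosyntactic category of d.
    if dependent_category == "reflexive_clitic":
        return "IReflV" if specific_tests_positive else None
    if dependent_category == "particle":
        return "VPC" if specific_tests_positive else None
    if dependent_category in ("NP", "PP"):
        # A negative LVC answer falls back to ID, not to rejection.
        return "LVC" if specific_tests_positive else "ID"
    return "ID"                     # any other category of d
```

Note the asymmetry that the sketch makes visible: a candidate with a reflexive clitic or a particle is rejected when its specific tests fail, whereas a candidate with an NP or PP dependent is still annotated, as ID.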