Annotation guidelines
corpora annotated for multiword expressions
Language-specific inherently clitic verbs (LS.ICV)
Inherently Clitic Verbs (LS.ICV), together with the Inherently Reflexive Verbs (IRV), are pronominal verbs. An LS.ICV is formed by a full verb combined with one or more non-reflexive clitics (CLI) that represent the pronominalization of one or more complements. LS.ICV is annotated when (a) the verb never occurs without a non-reflexive clitic, e.g. entrarci (transl. be relevant to something; colloquial), or (b) the LS.ICV and the non-clitic version have clearly different senses or subcategorization frames.
LS.ICVs are a category specific to some Romance languages, and they are particularly frequent in Italian. It is often challenging to distinguish LS.ICV from IRV, particularly because some clitics are ambiguous: se/si, for instance, is a polyfunctional clitic pronoun and grammatical marker with many functions, such as reflexive, reciprocal, impersonal, passivizing, aspectual and middle.
If the CLI has a clear reflexive meaning, the VMWE might be an IRV.
We start by listing the various categories of LS.ICVs before providing tests to decide whether to annotate a given occurrence as an LS.ICV.
- Inherently clitic verbs ⇒ ANNOTATE as LS.ICV
- The verb without the CLI does not exist
infischiarsene (not worry about) vs *infischiare
- The verb without the CLI does exist, but has a very different meaning
darla (gl.: give it.FEM) (transl. fuck around) ≠ dare (give)
prenderle (gl.: take them) (transl. be beaten) ≠ prendere (take)
prenderci (gl.: take it) (transl. grasp the truth) ≠ prendere (take)
starci (gl.: stay there) (transl. agree) ≠ stare (stay)
- The verb has more than one CLI of which the second one is an invariable object complement
fregarsene (gl.: matter self of-it) (transl. not care about)
infischiarsene (transl. not worry about)
curarsene (gl.: take care self of-it) (transl. care about)
prendersela (gl.: take self it.FEM) (transl. be angry/upset)
sentirsela (gl.: feel self it.FEM) (transl. be in the mood for)
sentirselo (gl.: feel self it.MASC) (transl. feel)
vedersela (gl.: see self it.FEM) (transl. manage something)
- The verb has two non-reflexive invariable CLIs:
farcela (gl.: make there it.FEM) (transl. succeed)
- The verb has a different meaning with respect to an intensive use of the same two non-reflexive invariable CLIs:
andarsene (gl.: go away self from-there) (transl. die) ≠ andarsene (go away)
bersela (gl.: drink self it.FEM) (transl. believe) ≠ bersela (drink)
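The examples above are all infinitives with one or two enclitics attached. As a rough illustration of how such a form relates to its bare verb, the following sketch peels clitics off the end of an infinitive. The function name `split_enclitics` and the clitic inventory are assumptions made for this example, not part of the guidelines; the sketch ignores contracted infinitives in -rre (e.g. porsi) and fused clusters such as glielo.

```python
# Illustrative (non-exhaustive) inventory of Italian clitic forms; an assumption
# of this sketch. "gli" is listed first so that it wins over "li".
CLITICS = ("gli", "mi", "ti", "si", "ci", "vi",
           "me", "te", "se", "ce", "ve", "ne",
           "lo", "la", "li", "le")

def split_enclitics(form):
    """Split an infinitive with enclitics into (bare infinitive, [clitics]).

    An infinitive drops its final -e before enclitics (prendere + se + la ->
    prendersela), so clitics are stripped right to left until the remainder
    ends in "r". Returns (form, []) when no clitic analysis is found.
    """
    rest, found = form, []
    while not rest.endswith("r"):
        match = next((c for c in CLITICS if rest.endswith(c)), None)
        if match is None:
            return form, []          # not analysable as verb + clitics
        found.insert(0, match)       # keep the clitics in surface order
        rest = rest[: -len(match)]
    if not found:
        return form, []
    return rest + "e", found         # restore the dropped final -e

print(split_enclitics("prendersela"))
print(split_enclitics("farcela"))
print(split_enclitics("prendere"))
```

The split only recovers the candidate bare verb; whether that verb exists (infischiare does not) or differs in sense remains an annotator's judgement.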
LS.ICV-specific decision tree
- Apply test LS.ICV.1 - [CL-INHERENT]
  - YES ⇒ Annotate as LS.ICV
  - NO ⇒ Apply test LS.ICV.2 - [CL-DIFF-SENSE]
    - YES ⇒ Annotate as LS.ICV
    - NO ⇒ Apply test LS.ICV.3 - [CL-DIFF-SUBCAT]
      - YES ⇒ Annotate as LS.ICV
      - NO ⇒ Exit
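The decision tree can be read as a cascade in which the first test answered positively decides the annotation. A minimal sketch, assuming the annotator's answers to the three tests are supplied as booleans; the function name `ls_icv_decision` and its return format are hypothetical and chosen only for illustration.

```python
def ls_icv_decision(cl_inherent, cl_diff_sense, cl_diff_subcat):
    """Walk the three LS.ICV tests in order.

    Each argument is the annotator's yes/no answer to the corresponding test.
    Returns ("LS.ICV", test_id) for the first test answered yes, or
    (None, None) when all three answers are no (the "Exit" branch).
    """
    answers = [
        ("LS.ICV.1", cl_inherent),     # [CL-INHERENT]
        ("LS.ICV.2", cl_diff_sense),   # [CL-DIFF-SENSE]
        ("LS.ICV.3", cl_diff_subcat),  # [CL-DIFF-SUBCAT]
    ]
    for test_id, answer in answers:
        if answer:
            return "LS.ICV", test_id
    return None, None

# infischiarsene: the bare verb *infischiare does not exist
print(ls_icv_decision(True, False, False))
# prenderle: prendere exists, but with a clearly different sense
print(ls_icv_decision(False, True, False))
# a hypothetical candidate that passes none of the tests
print(ls_icv_decision(False, False, False))
```

Because the tests are ordered, an annotator can stop at the first positive answer; the later arguments then have no effect on the result.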
Test LS.ICV.1 - [CL-INHERENT] Inherent clitic
Does the verb only exist with the CLI and never occur without it?
- YES ⇒ annotate as LS.ICV
infischiarsi ⇒ *infischiare
infischiarsene ⇒ *infischiare
- NO ⇒ next test
Test LS.ICV.2 - [CL-DIFF-SENSE] - Different sense
Given the same verb without the CLI/CLIs, are all of its meanings clearly different from the inherently clitic form?
- YES ⇒ annotate as LS.ICV
smetterla (gl.: quit it) (transl. knock it off) ≠ smettere (quit)
prenderle (gl.: take them) (transl. get beaten up) ≠ prendere (take)
prenderci (gl.: take it) (transl. grasp the truth) ≠ prendere (take)
starci (gl.: stay there) (transl. be up for it) ≠ stare (stay)
curarsene (gl.: take care self of-it) (transl. care about) ≠ curare (take care)
prendersela (gl.: take self it.FEM) (transl. be angry/upset) ≠ prendere (take)
sentirsela (gl.: feel self it.FEM) (transl. be in the mood for) ≠ sentire (feel)
darla (gl.: give it.FEM) (transl. fuck around) ≠ dare (give)
- NO ⇒ next test
Test LS.ICV.3 - [CL-DIFF-SUBCAT] - Different subcategorization frame
Is the subcategorization frame of the simple verb without the CLI different from the subcategorization frame of the LS.ICV?
- YES ⇒ annotate as LS.ICV
X se la prende con Y (transl. X gets angry at Y) ⇔ X prende Y (transl. X takes Y)
- NO ⇒ Exit
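To make the comparison in test LS.ICV.3 concrete, a subcategorization frame can be modelled as a set of argument slots. The sketch below encodes the two frames of the example; the slot labels ("subj", "obj", "obl:con") and the function names are assumptions of this illustration, not labels prescribed by the guidelines.

```python
# Frames as sets of argument slots; the slot labels are illustrative assumptions.
PRENDERE = frozenset({"subj", "obj"})          # X prende Y
PRENDERSELA = frozenset({"subj", "obl:con"})   # X se la prende con Y

def cl_diff_subcat(bare_frame, clitic_frame):
    """True when the bare verb and the clitic verb license different slots."""
    return frozenset(bare_frame) != frozenset(clitic_frame)

def frame_diff(bare_frame, clitic_frame):
    """Return (slots only in the bare frame, slots only in the clitic frame)."""
    bare, clitic = set(bare_frame), set(clitic_frame)
    return sorted(bare - clitic), sorted(clitic - bare)

print(cl_diff_subcat(PRENDERE, PRENDERSELA))
print(frame_diff(PRENDERE, PRENDERSELA))
```

Here the direct object of prendere is replaced by an oblique introduced by con, so the frames differ and the test is answered positively.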