Annotation guidelines
corpora annotated for multiword expressions
Inherently reflexive verbs (IRV)
Reflexive clitics (RCLI) are clitic pronouns that refer to the subject of the verb, like oneself in English. They are very common in many languages and play several semantic roles depending on the context, as detailed below.
Reflexive verbs (REFLV), sometimes also called pronominal verbs, are formed by a full verb combined with a RCLI, although the clitic does not always have a reflexive meaning. REFLV can be categorized into different classes, some of which should be annotated as verbal MWEs.
Namely, we will only annotate a REFLV as an inherently reflexive verb (IRV) when (a) it never occurs without the clitic, or (b) the REFLV and non-reflexive versions have clearly different senses or subcategorization frames. Inherently reflexive verbs constitute a quasi-universal category.
IRVs are difficult to annotate because of several problematic cases. Note in particular that in some languages, e.g. the Slavic ones, reflexive clitics inflect for case, so forms other than the most frequent one (the accusative) must also be considered.
We start by listing the various categories of REFLV before providing tests to decide whether to annotate a given occurrence as IRV.
- Inherently reflexive ⇒ ANNOTATE as IRV
  - The verb without the RCLI does not exist
    BG: усмихвам се 'to smile', страхувам се 'to be afraid'
    CS: stydět se 'to be ashamed', divit se 'to wonder'
    DE: sich schämen 'to be ashamed', sich wundern 'to wonder'
    ES: suicidarse 'to commit suicide', abstenerse 'to abstain'
    FR: s'évanouir 'to faint', se suicider 'to commit suicide'
    IT: suicidarsi 'to commit suicide', arrabbiarsi 'to get angry'
    NL: zich schamen 'to be ashamed', zich vergissen 'to be mistaken'
    PL: dowiedzieć się 'to find out', bać się 'to be afraid'
    PT: queixar-se 'to complain', abster-se 'to abstain'
    RO: a se teme 'to be afraid' (with obligatory ACC reflexive clitic), a își însuși 'to appropriate' (with obligatory DAT reflexive clitic)
    SL: sramovati se 'to be ashamed', bati se 'to be afraid'
    SR: стидети се stideti se 'to be ashamed', бојати се bojati se 'to be afraid'
    SV: att försova sig 'to sleep in', att gifta sig 'to get married'
  - The verb without the RCLI does exist, but has a very different meaning
    BG: смея ≠ смея се 'to dare' ≠ 'to laugh', намирам ≠ намирам се 'to find' ≠ 'to be situated'
    DE: sich enthalten ≠ enthalten 'to abstain' ≠ 'to contain', sich (um etw.) handeln ≠ handeln 'to be' ≠ 'to handle'
    EN: to find oneself in a difficult situation, to help oneself to the cookies
    ES: recoger ≠ recogerse 'to gather' ≠ 'to go home', empeñar ≠ empeñarse 'to pawn' ≠ 'to insist'
    FR: s'apercevoir ≠ apercevoir 'to realize' ≠ 'to see', s'agir ≠ agir 'to be' ≠ 'to act'
    IT: riferire ≠ riferirsi 'to report, to tell' ≠ 'to refer'
    NL: zich aanstellen ≠ aanstellen 'to put on airs, to act' ≠ 'to appoint', zich begeven ≠ begeven 'to proceed' ≠ 'to break down', zich realiseren ≠ realiseren 'to realise (be aware)' ≠ 'to realise (achieve)'
    PL: znajdować ≠ znajdować się 'to find' ≠ 'to be', radzić ≠ radzić sobie 'to advise' ≠ 'to manage'
    PT: encontrar-se ≠ encontrar 'to be' ≠ 'to meet', referir-se ≠ referir 'to concern' ≠ 'to refer'
    RO: a se îndura ≠ a îndura 'to have the heart' ≠ 'to suffer', a se face ≠ a face 'to become' ≠ 'to make' (even if it is inchoative (Dindelegan 2013: 79), a se face 'to become' is IRV: it passes test IRV.2 [DIFF-SENSE])
    SL: dati se 'it is possible (to do something)' ≠ dati 'to give', dobiti se 'to meet' ≠ dobiti 'to get'
    SR: губити ≠ губити се gubiti ≠ gubiti se 'to lose' ≠ 'to pass out'
    SV: att känna sig ledsen/arg 'to feel sad/angry' ≠ 'to touch'
- Reciprocal ⇒ NOT ANNOTATED
  - The RCLI conveys the meaning 'mutually' ('each other'):
    BG: целувам се 'to kiss each other', срещам се 'to meet each other'
    CS: líbat se 'to kiss each other', potkávat se 'to meet each other'
    DE: sich küssen 'to kiss each other', sich treffen 'to meet each other'
    ES: besarse 'to kiss each other', verse 'to see each other'
    FR: s'embrasser 'to kiss each other', se rencontrer 'to meet each other'
    IT: baciarsi 'to kiss each other'
    PL: całować się 'to kiss each other', spotykać się 'to meet each other'
    PT: cumprimentar-se 'to greet each other', ver-se 'to see each other'
    RO: a se saluta 'to greet each other'
    SL: poljubljati se 'to kiss each other', srečati se 'to meet each other'
    SR: пољубити се poljubiti se 'to kiss', срести се sresti se 'to meet'
- Reflexive ⇒ NOT ANNOTATED
  - The RCLI marks the reflexive or reciprocal construction, that is, the clitic plays the role of 'self' in English
    BG: мия се 'to wash oneself', реша се 'to comb oneself'
    CS: mýt se 'to wash oneself', drbat se 'to scratch oneself'
    DE: sich waschen 'to wash oneself', sich kratzen 'to scratch oneself'
    ES: mirarse 'to look at oneself', vestirse 'to dress oneself'
    FR: se laver 'to wash oneself', se parler 'to talk to oneself'
    IT: lavarsi 'to wash oneself', vestirsi 'to dress oneself'
    NL: zich wassen 'to wash oneself', zich scheren 'to shave oneself'
    PL: myć się 'to wash oneself', drapać się po głowie 'to scratch oneself on the head'
    PT: apressar-se 'to hurry oneself', vestir-se 'to dress oneself'
    RO: a se spăla 'to wash oneself'
    SL: umivati se 'to wash oneself', praskati se 'to scratch oneself'
    SR: умивати се umivati se 'to wash one's face', чешати се češati se 'to scratch oneself'
    SV: att tvätta sig 'to wash oneself'
- Body part, also called possessive reflexive ⇒ NOT ANNOTATED
  - Specific type of reflexive use in which the direct object is a body part or, more generally, an inalienable part of the subject
    BG: мия си ръцете (wash REFL.POSSESSIVE hands) 'wash one's hands'
    CS: mýt si nohy (wash RCLI.DAT the feet) 'wash one's feet'
    DE: sich das Bein brechen (RCLI the leg break) 'break one's leg'
    ES: rascarse el brazo (scratch.RCLI the arm) 'scratch one's arm'
    FR: se gratter la tête (RCLI scratch the head) 'scratch one's head'
    IT: grattarsi la testa (RCLI scratch the head) 'scratch one's head'
    PL: myć sobie nogi (wash RCLI.DAT the feet) 'wash one's feet'
    PT: impossible, uses possessive instead
    RO: a-şi rupe mâna (RCLI.DAT break arm) 'break one's arm'
    SL: umivati noge (wash RCLI.DAT the feet) 'wash one's feet', zlomiti roko (RCLI.DAT break arm) 'break one's arm'
    SR: сломити си ногу slomiti si nogu (to break RCLI the foot) 'to break one's own leg', умити си лице umiti si lice (to wash RCLI the face) 'to wash one's own face'
- Middle with preverbal subject, also called synthetic passive ⇒ NOT ANNOTATED
  - The clitic marks a regular syntactic alternation for transitive verbs. Just as in the regular passive alternation, the direct object of the transitive version appears as the subject of the REFLV version, and thus the verb agrees with it.
  - Unlike in the inchoative (see below), the subject of the transitive version is absent from the REFLV version, but it necessarily exists, though it is underspecified.
    BG: книги се пишат трудно (books write.PL RCLI difficult) 'it is difficult to write books'
    DE: die Häuser verkaufen sich gut (the houses sell RCLI well) 'the houses sell well'
    ES: las casas se venden bien (the houses RCLI sell well) 'the houses sell well'
    FR: les pots se vendent bien (the pots RCLI sell well) 'the pots sell well'
    IT: le case si affittano (the houses RCLI rent) 'the houses are rented'
    PL: domy dobrze się sprzedają (houses sell.PL RCLI well) 'houses sell well'
    PT: as casas se vendem bem (the houses RCLI sell well) 'the houses sell well'
    RO: casele se vând bine (houses-the RCLI sell well) 'houses sell well'
    SL: hiše se dobro prodajajo (the houses sell RCLI well) 'the houses sell well'
    SR: земља се добро продаје zemlja se dobro prodaje (the land RCLI well sell) 'the land's selling well'
- Middle with postverbal subject, also called synthetic passive ⇒ NOT ANNOTATED
  - In some languages, the middle alternation with a preverbal subject sounds unnatural and the middle alternation with a postverbal subject is preferred. Depending on the language, the postverbal element is viewed as a subject (ES, PL, PT, RO) or as an object that agrees with the unaccusative verb form (IT). The middle alternation with a postverbal subject is impossible in FR and DE.
    BG: трудно се пишат книги (difficult RCLI write.PL books) 'it is difficult to write books'
    ES: se alquilan casas (RCLI rent houses) 'people rent houses'
    IT: si affittano case (RCLI rent houses) 'people rent houses'
    PL: dobrze sprzedają się te domy (well sell RCLI these houses) 'these houses sell well'. Polish is a relatively free word-order language and a postverbal subject is a regular (even if stylistically marked) alternation.
    PT: alugam-se casas (rent-RCLI houses) 'people rent houses'
    RO: se vând bine apartamentele din blocurile noi (RCLI sell well apartments-the from blocks-the new) 'apartments from new blocks sell well'; se construiesc locuințe noi (RCLI built houses new) 'new houses are built'
    SL: nove hiše se gradijo (new houses RCLI built) 'new houses are built'
    SR: добро се продаје ова роба dobro se prodaje ova roba (well RCLI sell these goods) 'these goods are selling well'
- Impersonal ⇒ NOT ANNOTATED
  - The RCLI marks an impersonal verb alternation possible for various transitivity classes, depending on the language: only transitive verbs (FR), only intransitive verbs with manner adjuncts (DE), preferably intransitive but tolerated for transitive verbs (PT), either transitive or intransitive verbs (IT, ES, RO, PL)
  - There is no noun phrase before the verb (empty subject slot); the presence of the RCLI indicates a verb interpreted with a generic and underspecified subject
  - The verb is in third person singular, even when the object is plural
    BG: не се вечеря късно (not RCLI have dinner late) 'it is not good to have dinner late'
    DE: hier tanzt es sich gut (here dances it RCLI well) 'people dance well here'
    ES: se busca a actores (RCLI searches to actors) 'people look for actors'; se trabaja mejor aquí (RCLI works better here) 'people work better here'
    FR: il se dit des bêtises (it RCLI says silly things) 'people say silly things'
    IT: si lavora troppo (RCLI works too much) 'people work too much'; si affitta molte case (RCLI rents many houses) 'people rent many houses'
    PL: za dużo się pracuje (too much RCLI works) 'people work too much'; bzdury się opowiada (nonsense RCLI tells) 'people tell nonsense'
    PT: dorme-se muito (sleeps-RCLI much) 'people sleep a lot'; conta-se histórias (tells-RCLI stories) 'people tell stories'. The transitive impersonal is considered wrong by traditional grammar but it is found in corpora.
    RO: se lucrează până târziu (RCLI works until late) 'people work until late'. Transitive verbs can be impersonal in RO only when they are null-object verbs (se lucrează până târziu - *este lucrat până târziu) or when their subject is realized by a clause headed by a complementizer (Dindelegan 2013: 174).
    RO: se suferă din cauza sărăciei (RCLI suffer because of poverty) 'one suffers because of poverty'. RO impersonal reflexive verbs are mostly intransitive (Dindelegan 2013: 173).
    RO: se aleargă dimineața (RCLI run in the morning) 'people run in the morning'
    SL: govori se/govorijo se neumnosti (it says/they say RCLI silly things) 'people say silly things'
    SR: ради се превише radi se previše (it works RCLI too much) 'there's too much work being done'; говоре се глупости govore se gluposti (they say RCLI nonsense) 'nonsense is being said'
- Inchoative ⇒ NOT ANNOTATED
  - Similar to the middle, but the RCLI marks a less productive syntactic alternation:
    - the direct object of the transitive version appears as subject of the REFLV
    - the subject of the transitive version is not only absent, it is also semantically unclear or nonexistent
    BG: вратата се отваря 'the door opens'
    CS: dveře se otvírají 'the door opens'
    DE: die Tür öffnet sich 'the door opens'
    ES: la puerta se abrió 'the door opened'
    FR: la porte s'est subitement ouverte 'the door suddenly opened'
    IT: la porta si apre 'the door opens'
    PL: drzwi się otwierają 'the door opens'
    PT: o vaso se quebrou 'the vase broke'
    RO: mașina s-a stricat 'the car broke down'; ușa s-a deschis 'the door opened'
    SL: vrata se odpirajo 'the door opens'
    SR: врата се отварају vrata se otvaraju 'the doors are opening'
    SV: dörren öppnar sig 'the door opens'
IRV-specific decision tree
- Apply test IRV.1 - [INHERENT]
  - YES ⇒ Annotate as IRV
  - NO ⇒ Apply test IRV.2 - [DIFF-SENSE]
    - YES ⇒ Annotate as IRV
    - NO ⇒ Apply test IRV.3 - [DIFF-SUBCAT]
      - YES ⇒ Annotate as IRV
      - NO ⇒
        - verb has no subject ⇒ Apply test IRV.4 - [IMPERS]
          - YES ⇒ It is not a VMWE, exit
          - NO ⇒ Annotate as IRV
        - verb has a subject ⇒ Apply test IRV.5 - [MIDDLE-INCHO]
          - YES ⇒ It is not a VMWE, exit
          - NO ⇒ Apply test IRV.6 - [REFL]
            - YES ⇒ It is not a VMWE, exit
            - NO ⇒
              - subject is SINGULAR ⇒ Apply test IRV.7 - [REFL-MUTUAL]
                - YES ⇒ It is not a VMWE, exit
                - NO ⇒ Annotate as IRV
              - subject is PLURAL ⇒ Apply test IRV.8 - [RECIPRO]
                - YES ⇒ It is not a VMWE, exit
                - NO ⇒ Annotate as IRV
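The decision tree can be sketched in code as a function over the annotator's yes/no answers. This is only an illustrative sketch: the names `Answers` and `decide` and the field names are hypothetical, and the answers themselves still have to come from a human applying tests IRV.1 to IRV.8 as described in the test descriptions that follow.

```python
from dataclasses import dataclass


@dataclass
class Answers:
    """Annotator's answers (hypothetical structure); True means the test is answered YES."""
    inherent: bool = False           # IRV.1 [INHERENT]
    diff_sense: bool = False         # IRV.2 [DIFF-SENSE]
    diff_subcat: bool = False        # IRV.3 [DIFF-SUBCAT]
    has_subject: bool = True         # branching condition after IRV.3
    impers: bool = False             # IRV.4 [IMPERS]
    middle_incho: bool = False       # IRV.5 [MIDDLE-INCHO]
    refl: bool = False               # IRV.6 [REFL]
    subject_is_plural: bool = False  # branching condition after IRV.6
    refl_mutual: bool = False        # IRV.7 [REFL-MUTUAL]
    recipro: bool = False            # IRV.8 [RECIPRO]


def decide(a: Answers) -> str:
    """Walk the IRV decision tree and return 'IRV' or 'not a VMWE'."""
    # IRV.1 to IRV.3: a YES on any of them means the REFLV is an IRV.
    if a.inherent or a.diff_sense or a.diff_subcat:
        return "IRV"
    # No subject: only the impersonal test applies.
    if not a.has_subject:
        return "not a VMWE" if a.impers else "IRV"
    # With a subject: middle/inchoative, then reflexive.
    if a.middle_incho or a.refl:
        return "not a VMWE"
    # Last test depends on the number of the subject.
    if a.subject_is_plural:
        return "not a VMWE" if a.recipro else "IRV"
    return "not a VMWE" if a.refl_mutual else "IRV"


# Example: FR se laver answers YES to IRV.6, so it is not annotated.
print(decide(Answers(refl=True)))
```

Encoding the branching conditions (`has_subject`, `subject_is_plural`) as explicit fields keeps the function a direct transcription of the tree rather than an automatic classifier.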
Test IRV.1 - [INHERENT] - Inherent clitic
Does the verb exist only with the RCLI and never occur without it?
- YES ⇒ annotate as IRV
  BG: страхувам се ⇒ *страхувам 'to be afraid'; усмихвам се ⇒ *усмихвам 'to smile'
  DE: sich schämen ⇒ *schämen 'to be ashamed'; sich wundern ⇒ *wundern 'to wonder'
  ES: suicidarse ⇒ *suicidar 'to commit suicide'; abstenerse ⇒ *abstener 'to abstain'
  FR: s'évanouir ⇒ *évanouir 'to faint'; se suicider ⇒ *suicider 'to commit suicide'
  IT: suicidarsi ⇒ *suicidare 'to commit suicide'
  NL: zich schamen ⇒ *schamen 'to be ashamed'; zich vergissen ⇒ *vergissen 'to be mistaken'
  PL: dowiedzieć się ⇒ *dowiedzieć 'to find out'; bać się ⇒ *bać 'to be afraid'; wydarzyć się ⇒ *wydarzyć 'to happen'
  PT: queixar-se ⇒ *queixar 'to complain'; abster-se ⇒ *abster 'to abstain'
  RO: a se teme ⇒ *a teme 'to be afraid'; a își însuși ⇒ *a însuși 'to appropriate'
  SL: sramovati se ⇒ *sramovati 'to be ashamed'; čuditi se ⇒ *čuditi 'to wonder'
  SR: бавити се ⇒ *бавити baviti se ⇒ *baviti 'to deal with'; дивити се ⇒ *дивити diviti se ⇒ *diviti 'to admire'
- NO ⇒ next test
Test IRV.2 - [DIFF-SENSE] - Different sense
Given the same verb without the RCLI, are all of its meanings clearly different from the REFLV form?
- YES ⇒ annotate as IRV
  BG: намирам се ≠ намирам 'to be situated' ≠ 'to find'; радвам се ≠ радвам 'to feel happy' ≠ 'to make happy'
  DE: sich verstehen ≠ verstehen 'to get along well' ≠ 'to understand'
  EN: to find oneself in a difficult situation; to help oneself to the cookies
  ES: recogerse ≠ recoger 'to go home' ≠ 'to pick up, to gather'
  FR: s'apercevoir ≠ apercevoir 'to realize' ≠ 'to see'; s'agir ≠ agir 'to be' ≠ 'to act'
  IT: riferirsi ≠ riferire 'to refer' ≠ 'to report, to tell'
  NL: zich voordoen ≠ voordoen 'to arise' ≠ 'to show'
  PL: znajdować się ≠ znajdować 'to be' ≠ 'to find'; sprawdzić się ≠ sprawdzić 'to prove appropriate' ≠ 'to check'; wybrać się ≠ wybrać 'to go' ≠ 'to choose'
  PT: encontrar-se ≠ encontrar 'to be' ≠ 'to meet'; referir-se ≠ referir 'to concern' ≠ 'to refer'
  RO: a se îndura ≠ a îndura 'to have the heart to' ≠ 'to suffer'
  SL: razumeti se ≠ razumeti 'to get along well' ≠ 'to understand'
  SR: знати ≠ знати се znati ≠ znati se 'to know' ≠ 'to know someone'; забављати ≠ забављати се zabavljati ≠ zabavljati se 'to amuse someone else' ≠ 'to amuse oneself', 'to amuse someone' ≠ 'to date someone'
- NO ⇒ next test
Test IRV.3 - [DIFF-SUBCAT] - Different subcategorization frame
Is the subcategorization frame of the simple verb without the RCLI different from the subcategorization frame of the REFLV, except for the addition of a direct or indirect object corresponding to the same syntactic argument as the RCLI in the REFLV version?
- YES ⇒ annotate as IRV
  DE: X verliert sich in Y ⇔ X verliert Y (X loses RCLI in Y ⇔ X loses Y)
  ES: X se olvidó de Y ⇔ X olvidó Y (X RCLI forgot of Y ⇔ X forgot Y)
  FR: X se confesse de Y ⇔ X confesse Y (but *X confesse de Y) (X RCLI confesses of Y ⇔ X confesses Y (but not *X confesses of Y))
  FR: X se plaint de Z ⇒ *Y plaint (à) X de Z (X RCLI complains of Z ⇒ *Y complains (to) X of Z). The verb without the RCLI, plus a direct or indirect object, does not subcategorize for the PP with the preposition de.
  FR: X se refuse à Vinf ⇒ *Y refuse (à) X à Vinf (X RCLI refuses to Vinf ⇒ *Y refuses (to) X to Vinf)
  IT: X si è dimenticato di Y ⇔ X ha dimenticato Y (X RCLI forgot of Y ⇔ X forgot Y)
  NL: X verwondde zich aan Y ⇔ X verwondde Y (X wounded/injured RCLI to Y ⇔ X wounded/injured Y)
  NL: X toonde zich ADJ ⇔ X toonde NOUN (X showed RCLI ADJ ⇔ X showed NOUN). Compare ?? elle se trouve grosse, because se trouver here has the same meaning as trouver.
  PL: X tłumaczy się z Y ⇔ X tłumaczy Y (X explains SELF of Y ⇔ X explains Y)
  PL: X dziwi się Y.dat ⇔ Y dziwi X ⇔ Z dziwi X Y.inst (X surprises SELF Y.dat ⇔ Y surprises X ⇔ Z surprises X Z.inst)
  PT: X se esqueceu de Y ⇔ X esqueceu Y (X RCLI forgot of Y ⇔ X forgot Y)
  RO: X se gândeşte la Y ⇔ X gândeşte că Y (X RCLI thinks of Y ⇔ X thinks that Y)
  SR: А се објаснио с Б ⇔ А је објаснио Б A se objasnio s B 'A resolved the issues with B' ⇔ 'A explained something to B'
- NO ⇒ next test
Test IRV.4 - [IMPERS] - Impersonal
When you replace the RCLI by an underspecified subject such as one or people, does the sentence keep its meaning?
- YES ⇒ do NOT annotate as verbal MWE
  BG: не се вечеря късно ⇔ хората не вечерят късно (not RCLI have dinner late) 'it is not good to have dinner late'
  DE: hier tanzt es sich gut ⇔ hier tanzen die Leute gut 'people dance well here'
  ES: se duerme mucho ⇔ las personas duermen mucho 'people sleep a lot'; se busca a actores ⇔ la gente busca a actores 'people look for actors'
  FR: il se dit des bêtises ⇔ les personnes disent des bêtises 'people say silly things'
  IT: si dorme molto ⇔ le persone dormono molto 'people sleep a lot'; si affitta molte case ⇔ le persone affittano molte case 'people rent many houses'
  PL: pracuje się za dużo ⇔ ludzie pracują za dużo 'people work too much'; opowiada się bzdury ⇔ ludzie opowiadają bzdury 'people tell nonsense'
  PT: dorme-se muito ⇔ as pessoas dormem muito 'people sleep a lot'; conta-se histórias ⇔ as pessoas contam histórias 'people tell stories'
  RO: se lucrează până târziu ⇔ lumea lucrează până târziu 'people work until late'; se aleargă dimineața ⇔ lumea aleargă dimineața 'people run in the morning'
  SL: govorijo se neumnosti ⇔ ljudje govorijo neumnosti 'people tell nonsense'
  SR: ради се превише ⇔ људи раде превише radi se previše ⇔ ljudi rade previše 'there's too much work being done' ⇔ 'people are working too much'
- NO ⇒ annotate as IRV
Test IRV.5 - [MIDDLE-INCHO] - Middle or Inchoative
When you move the subject to the object position, remove the RCLI and add a generic subject (people, somebody), thus building a transitive version, does it imply the REFLV version? In other words, people/somebody V [to] X ⇒ X REFLV?
- YES ⇒ do NOT annotate as verbal MWE
  BG: някой отваря вратата ⇒ вратата се отваря 'somebody opens the door' ⇒ 'the door opens'
  DE: man kann die Häuser gut verkaufen ⇒ die Häuser verkaufen sich gut 'people can sell the houses well' ⇒ 'the houses sell well'
  DE: jemand öffnet die Tür ⇒ die Tür öffnet sich 'somebody opens the door' ⇒ 'the door opens'
  ES: la gente cuenta historias ⇒ se cuentan historias 'people tell stories' ⇒ 'stories are told'
  ES: alguien abrió la puerta ⇒ la puerta se abrió 'somebody opened the door' ⇒ 'the door opened'
  FR: on vend bien ce produit ⇒ ce produit se vend bien 'people sell this product well' ⇒ 'this product sells well'
  FR: quelqu'un ouvre la porte ⇒ la porte s'ouvre 'somebody opens the door' ⇒ 'the door opens'
  IT: qualcuno vende bene questo prodotto ⇒ questo prodotto si vende bene 'someone sells this product well' ⇒ 'this product sells well'
  IT: qualcuno apre la porta ⇒ la porta si apre 'somebody opens the door' ⇒ 'the door opens'
  PL: ktoś sprzedaje te domy ⇒ te domy się sprzedają 'somebody sells these houses' ⇒ 'these houses sell well'
  PL: ktoś otwiera drzwi ⇒ drzwi się otwierają 'somebody opens the door' ⇒ 'the door opens'
  PL: ktoś nasila skargi ⇒ skargi nasilają się 'somebody increases complaints' ⇒ 'complaints increase'
  PL: ktoś rozgrywa mecz ⇒ mecz rozgrywa się 'somebody plays a game' ⇒ 'the game plays'
  PT: alguém conta histórias ⇒ contam-se histórias (somebody tells stories ⇒ tell.PL-RCLI stories) 'somebody tells stories' ⇒ 'stories are told'
  PT: alguém acalmou o menino ⇒ o menino se acalmou (somebody calmed the boy ⇒ the boy RCLI calmed) 'somebody calmed the boy down' ⇒ 'the boy calmed down'
  PT: o juiz casou João com Maria ⇒ João se casou com Maria (the judge married João with Maria ⇒ João RCLI married with Maria) 'the judge married João with Maria' ⇒ 'João got married to Maria'
  PT: o juiz casou Maria e João ⇒ Maria e João se casaram (the judge married Maria and João ⇒ Maria and João RCLI married) 'the judge married Maria and João' ⇒ 'Maria and João got married'
  PT: alguém lembrou João do meu aniversário ⇒ João se lembrou do meu aniversário (somebody reminded João of my birthday ⇒ João RCLI reminded of my birthday) 'somebody reminded João of my birthday' ⇒ 'João remembered my birthday'
  RO: cineva spune glume ⇒ se spun glume 'somebody tells jokes' ⇒ 'jokes are told'
  RO: cineva a deschis ușa ⇒ ușa s-a deschis 'somebody opened the door' ⇒ 'the door opened'
  SL: nekdo pripoveduje šale ⇒ šale se pripovedujejo 'somebody tells jokes' ⇒ 'jokes are told'
  SL: nekdo je odprl vrata ⇒ vrata so se odprla 'somebody opened the door' ⇒ 'the door opened'
  SR: неко је отварао врата ⇒ врата се отварају neko je otvarao vrata ⇒ vrata se otvaraju 'someone was opening the doors' ⇒ 'the doors were being opened'
  SR: неко шири гласине ⇒ гласине се шире neko širi glasine ⇒ glasine se šire 'someone's spreading the rumors' ⇒ 'the rumors are being spread'
- NO ⇒ next test
Test IRV.6 - [REFL] - Reflexive
When you replace the RCLI by oneself only or to oneself only, does it imply the REFLV version? In other words, X V [to] himself only ⇒ X REFLV?
- YES ⇒ do NOT annotate as verbal MWE
  BG: Павел лекува себе си ⇒ Павел се лекува 'Pavel heals himself'
  DE: Paul kratzt nur sich selbst ⇒ Paul kratzt sich 'Paul scratches himself'
  EN: Paul washes only himself ⇒ Paul washes himself
  ES: Pablo se lava a sí mismo ⇒ Pablo se lava 'Paul washes himself'
  FR: Paul ne soigne que lui-même ⇒ Paul se soigne 'Paul heals himself'; Paul ne parle qu'à lui-même ⇒ Paul se parle 'Paul talks to himself'
  IT: Paolo cura solo se stesso ⇒ Paolo si cura 'Paul heals himself'; Paolo parla solo a se stesso ⇒ Paolo si parla 'Paul talks to himself'
  NL: Paul wast alleen zichzelf ⇒ Paul wast zich(zelf) 'Paul washes himself'
  PL: Paweł leczy tylko siebie ⇒ Paweł leczy się 'Paul heals himself'; Paweł bogaci tylko siebie ⇒ Paweł bogaci się 'Paul enriches himself' / 'Paul gets rich'; Paweł myje tylko siebie ⇒ Paweł myje się 'Paul washes himself'
  PT: Paulo só lava a si mesmo ⇒ Paulo se lava 'Paul washes himself'
  RO: Paul se spală doar pe sine ⇒ Paul se spală 'Paul washes himself'
  SL: Pavel praska sam sebe ⇒ Pavel se praska 'Paul scratches himself'
  SR: Марко лечи сам себе ⇒ Марко се лечи Marko leči sam sebe ⇒ Marko se leči 'Marko is treating himself' ⇒ 'Marko is getting treated'
- NO ⇒ next test
  - The subject is singular: test REFL-MUTUAL
  - The subject is plural or coordinated (Bob and Alice): test RECIPRO
Test IRV.7 - [REFL-MUTUAL] - Reflexive-mutual
Is a reciprocal version possible? Namely, is it acceptable to replace the singular subject by a plural one and add each other to the REFLV form without changing the REFLV's meaning?
- YES ⇒ do NOT annotate as verbal MWE
  Note: the test applies only if test IRV.2 [DIFF-SENSE] has failed. For example, for "X se marie" 'X gets married' in French, it is odd though possible to say 'X and Y marry each other', but this does not mean 'X gets married', because it is only possible if X and Y are marriage officiants.
  BG: Павел се мие ⇔ те се мият един друг 'they wash each other'
  DE: Paul wäscht sich ⇔ Sie waschen sich gegenseitig / einander 'they wash each other'
  ES: Pablo se lava ⇔ ellos se lavan mutuamente / los unos a los otros 'they wash each other'
  FR: Paul se lave ⇔ ils se lavent mutuellement / les uns les autres 'they wash each other'
  IT: Paolo si lava ⇔ essi si lavano reciprocamente / l'un l'altro 'they wash each other'
  NL: Paul wast zich ⇔ Zij wassen elkaar 'they wash each other'
  PL: Paweł się myje ⇔ oni myją się nawzajem 'they wash each other'
  PT: Paulo se lava ⇔ eles se lavam mutuamente / uns aos outros 'they wash each other'
  RO: el se spală ⇔ ei se spală unul pe altul 'they wash each other'
  SL: Pavel se umiva ⇔ umivajo drug drugega 'they wash each other'
  SR: Марко се забавља ⇔ они један другог забављају Marko se zabavlja ⇔ oni jedan drugog zabavljaju 'Marko is amusing himself' ⇔ 'they are amusing one another'
- NO ⇒ annotate as IRV
Test IRV.8 - [RECIPRO] - Reciprocal
Is it possible to remove the RCLI and replace the coordinated subject (A and B) or plural subject (A.PL) by a singular subject (A or A.PL) and a singular object, often introduced by to/with (B or A.PL), without changing the REFLV's meaning? That is:
- Coordinated subject: A and B REFLV ⇔ A V [to/with] B and B V [to/with] A?
- Plural subject: A.PL REFLV ⇔ A.PL V [to/with] A.PL?
- YES ⇒ do NOT annotate as verbal MWE
  BG: Павел и Елена се целуват ⇔ Павел целува Елена и Елена целува Павел 'Pavel and Elena kiss'
  DE: Paul und Anna umarmen sich ⇔ Paul umarmt Anna and Anna umarmt Paul 'Paul and Anna hug each other'; die Affen kratzen sich ⇔ die Affen kratzen die Affen 'the monkeys scratch each other'
  ES: Pablo y Ana se abrazan ⇔ Pablo abraza a Ana and Ana abraza a Pablo 'Paul and Ann hug each other'; los niños se abrazan ⇔ los niños abrazan a los niños 'the children hug each other'
  FR: Paul et Anne s'embrassent ⇔ Paul embrasse Anne and Anne embrasse Paul 'Paul and Ann kiss'; les jours se suivent ⇔ les jours suivent les jours 'the days follow each other'
  IT: Giovanni e Anna si baciano ⇔ Giovanni bacia Anna and Anna bacia Giovanni 'John and Ann kiss'; i giorni si seguono ⇔ i giorni seguono i giorni 'the days follow each other'
  PL: Paweł i Elena całują się ⇔ Paweł całuje Elenę i Elena całuje Pawła, Paweł i Elena całują się nawzajem 'Paweł kisses Elena and Elena kisses Paweł', 'Paweł and Elena kiss'
  PT: João e Ana se beijam ⇔ João beija Ana and Ana beija João 'John and Ann kiss'; os presos se agridem ⇔ os presos agridem os presos 'the prisoners attack each other'
  RO: Ion şi George se salută ⇔ Ion îl salută pe George and George îl salută pe Ion 'Ion and George greet each other'; participanții se salută ⇔ participanții îi salută pe participanți 'the participants greet each other'
  SL: Pavel in Ana se objemata ⇔ Pavel objema Ano in Ana objema Pavla 'Paul and Anna hug each other'
  SR: М и Н су се пољубили ⇔ М је пољубио Н и Н је пољубила М M i N su se poljubili ⇔ M je poljubio N i N je poljubila M 'M and N kissed' ⇔ 'M kissed N and N kissed M'
- NO ⇒ annotate as IRV
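The two paraphrase patterns of test IRV.8 can be written as string templates. The sketch below is purely illustrative: the function name `reciprocal_paraphrase` and its parameters are hypothetical, it uses English placeholders, and it deliberately ignores inflection and agreement, which an annotator has to adjust by hand (e.g. case marking on the object in Polish or Romanian).

```python
from typing import Optional


def reciprocal_paraphrase(verb: str, a: str, b: Optional[str] = None, prep: str = "") -> str:
    """Build the non-reflexive paraphrase used in test IRV.8 [RECIPRO].

    verb: the verb form WITHOUT the reflexive clitic (no inflection is applied)
    a, b: the subject(s); pass only `a` for a plural subject (A.PL)
    prep: optional preposition introducing the object ('to', 'with', ...)
    """
    obj_prefix = f"{prep} " if prep else ""
    if b is None:
        # Plural subject: A.PL V [to/with] A.PL
        return f"{a} {verb} {obj_prefix}{a}"
    # Coordinated subject: A V [to/with] B and B V [to/with] A
    return f"{a} {verb} {obj_prefix}{b} and {b} {verb} {obj_prefix}{a}"


print(reciprocal_paraphrase("hugs", "Paul", "Anna"))
print(reciprocal_paraphrase("follow", "the days"))
print(reciprocal_paraphrase("talks", "Paul", "Anna", prep="to"))
```

The annotator then judges whether the generated paraphrase preserves the meaning of the REFLV; the template only fixes the shape of the comparison.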
Problematic cases and remarks
Keep in mind that both simple and reflexive verbs can have several senses. In test IRV.2 [DIFF-SENSE], we require that ALL senses of the simple verb you can think of be different from the sense of the REFLV form in the given context. For example, the French verb trouver can mean 'to find something', 'to have an opinion about something', 'to discover something', etc. But se trouver has a totally different and unrelated meaning, 'to be (located at)', in the sentence L'église se trouve à Paris 'the church is located in Paris'. It should thus be annotated as an MWE. As the REFLV is itself polysemous, it should NOT be annotated as IRV in sentences like Elle se trouve grosse 'she finds herself fat', where it means 'to have an opinion about (herself)', equivalent to the non-reflexive version.
In some languages the clitic is joined to the verb, sometimes with a hyphen but not always. When there is no hyphen, the REFLV will probably be tokenized as a single token in the corpus.
- In French, orthography and pronunciation rules require the clitic to be attached to a verb beginning with a vowel sound, with the clitic's final vowel replaced by an apostrophe (elision):
  - s'abstenir 'to abstain'
- In Spanish and Italian, the clitic can appear concatenated after the verb in some verbal forms (e.g. infinitives, gerunds):
  - enamorarse 'to fall in love'
  - alzarsi 'to get up'
- In Portuguese, postposed clitics (enclisis) are always hyphenated, but in the conditional tense the clitic appears in the middle of the verb (mesoclisis), separating the root from the suffix:
  - queixar-se-ia 'would complain'
- In Romanian the clitic and the verb are either separate or joined by a hyphen:
  - se aude un clopot (RCLI hears a bell) 'a bell is heard'
  - s-aude un clopot (RCLI-hears a bell) 'a bell is heard'
The current annotation format allows a single token to be annotated as an MWE if it is a multiword token. Therefore, a REFLV written as a single token should still be annotated as an MWE when it qualifies as IRV.
Some idiomatic constructions include reflexive clitics. Two cases are possible:
- If a syntactically comparable literal construction is impossible or the REFLV would not be annotated in syntactically comparable literal constructions, annotate only the VID:
  BG: пилците се броят наесен (chicken REFL are counted in the autumn) 'the true results can be seen only at the end' ⇒ кокошките се броят (the hens REFL counted)
  DE: sich über etwas im Klaren sein, dass S (RCLI about s.th. in.the clear be) 'to be aware of s.th./that S' ⇒ *sich in N sein, dass for any noun N
  ES: darse cuenta de 'to realize' ⇒ *darse N de for any noun N
  ES: meterse en líos 'to get in trouble' ⇒ REFLV not annotated in literal equivalents like meterse en una tienda 'to get in a store'
  FR: se rendre compte de 'to realize' ⇒ *se rendre N de for any noun N
  FR: s'arracher les cheveux (RCLI tear the hair) 'to worry' ⇒ REFLV not annotated in literal equivalents like s'arracher un ongle 'to tear one's own nail'
  IT: rendersi conto di 'to realize' ⇒ *si rende N di for any noun N
  IT: si strappa i capelli (RCLI tear the hair) 'to worry' ⇒ REFLV not annotated in literal equivalents like strapparsi un unghia 'to tear one's own nail'
  NL: zich uit de voeten maken (RCLI out of the feet make) 'to get out of the way' ⇒ *zich uit de N maken for any noun N
  NL: zich in de kijker spelen (RCLI in the field-glass play) 'to attract attention with one's skills' ⇒ *zich in de N spelen for any noun N
  PL: zdawać sobie sprawę z 'to realize' ⇒ *zdawać sobie N z for any noun N
  PT: dar-se mal 'to fail' ⇒ dar-se ADV intransitive is acceptable only for the antonym bem 'well'
  PT: meter-se numa fria (to get-RCLI in a cold) 'to get in trouble' ⇒ REFLV not annotated in literal equivalents like meter-se numa cabine 'to get into a cabin'
  RO: a-și smulge părul din cap
  SL: puliti si lase (tear RCLI the hair) 'to worry' ⇒ REFLV not annotated in literal equivalents like puliti si obrvi 'to pluck one's eyebrows'
  SR: китити се туђим перјем kititi se tuđim perjem (decorate RCLI someone else's feathers) 'steal someone's thunder; take credit for someone else's accomplishments'
- If the REFLV would be annotated as IRV in syntactically comparable literal constructions, annotate both the IRV and the VID as embedded MWEs (rare):
  BG: смея се през сълзи (laugh REFL through tears) 'to laugh bitterly'
  PL: rozlatywać się w proch (scatter itself into dust) 'disappear'
  PT: virar-se nos trinta (turn-RCLI in-the thirty) contains virar-se 'to get by' ≠ virar 'to turn/become'
  RO: a i se face rău (to CL.DAT RCLI.ACC make ill) 'to feel sick'. This is a case where both a non-reflexive dative clitic and a RCLI.ACC appear in the structure; the REFLV is annotated as IRV; both the IRV and the ID are annotated as embedded MWEs; note that the non-reflexive clitic is also considered part of a VID (6.4_R).
  RO: a se duce pe apa sâmbetei (RCLI go on water-the Saturday-of) 'to get lost'. The REFLV is annotated in the literal equivalent a se duce pe apa Bistriței 'he goes on the river Bistriţa'; there is a notable difference in meaning between the non-REFLV a duce 'to take' and the REFLV a se duce 'to go'.
  SL: režati se kot pečen maček (to laugh RCLI like a baked tomcat) 'to laugh loudly'; režati se is IRV
  SR: смејати се као луд smejati se kao lud 'to laugh like crazy'
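The second case above (an IRV embedded in a VID) can be pictured as two annotations over the same sentence, one being a proper subset of the other. The sketch below uses a made-up in-memory representation, not the corpus file format: `MweAnnotation`, `is_embedded` and the choice of which tokens count as lexicalized components of the VID are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class MweAnnotation:
    """One annotated MWE: a category plus the positions of its tokens (hypothetical structure)."""
    category: str                  # e.g. "IRV" or "VID"
    token_indices: FrozenSet[int]  # 0-based token positions in the sentence


def is_embedded(inner: MweAnnotation, outer: MweAnnotation) -> bool:
    """True if every token of `inner` also belongs to `outer`, and `outer` has more tokens."""
    return inner.token_indices < outer.token_indices  # proper-subset test


# SL example from the guidelines: režati se kot pečen maček 'to laugh loudly'
tokens = ["režati", "se", "kot", "pečen", "maček"]
irv = MweAnnotation("IRV", frozenset({0, 1}))           # režati se
vid = MweAnnotation("VID", frozenset({0, 1, 2, 3, 4}))  # whole idiom (assumed extent)

print(is_embedded(irv, vid))
```

Keeping the two annotations as separate objects, rather than merging them, mirrors the instruction to annotate both the IRV and the VID.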
It is rare, although possible, to find light verb constructions in which a reflexive clitic changes the original meaning significantly, thus characterizing an IRV:
In this case, the whole construction, including the verb, the noun and the reflexive clitic, must be annotated as VID, since there are two syntactic arguments:
Notice that annotating only the verb and the RCLI as IRV would be wrong, since it will have a completely different meaning without the noun, sometimes even coinciding with another IRV:
In some languages, e.g. Polish, clitics inflect for case. Most IRVs seem to be restricted to the accusative:
a se sfii (to RCLI.ACC be.shy) 'to be shy'
a se căi (to RCLI.ACC repent) 'to repent'
However, other cases can appear in IRVs:
a-și apropria (to-RCLI.DAT appropriate) 'to appropriate' - with a dative clitic
Some expressions can have double clitics. Only the first two words belong to the IRV:
radzić sobie z sobą (to advise RCLI.DAT with RCLI.INST) 'to manage with oneself'
This category does not cover other types of pronouns and clitics. They are covered by regular VID tests and should be annotated as such. Examples of constructions that should be annotated as VID rather than IRV include:
s'en aller (to self from-it go) 'to leave'
en avoir marre (to have from-it enough) 'to be fed up'
il y avoir (it at-it have) 'to exist'
prender-le (to take it) 'to be beaten'
a o lua pe jos (to take CL.ACC on foot) 'to walk'. According to the current guidelines, such examples pass the ID tests (see also 6.3_B5); both have literal correspondents that are not characterized by an obligatory non-reflexive clitic: a arde 'to burn' and a lua 'to take'.
a-i repugna (to CL.DAT loathe) 'to loathe'
a-i prii (to CL.DAT) 'to be favourable to sb.'