Annotation guidelines
PARSEME shared task on automatic identification of verbal MWEs - edition 1.1 (2018)

Categories of verbal MWEs

In edition 1.1 of this task we distinguish the following categories of verbal MWEs:

  • Two universal categories, i.e. valid for all languages participating in the task:
    • Light verb constructions (LVCs) with two subcategories:
      • LVCs in which the verb is semantically totally bleached (LVC.full)
        • държа под контрол to keep under control
        • eine Rede halten a speech hold to give a speech
        • κάνω μία βόλτα make-1SG a walk to walk
          δίνω μια εξήγηση
        • to give a lecture
        • hacer una promesa to_make a promise to make a promise
        • min hartu pain take to hurt oneself
          lo egin sleep do to sleep
        • avoir du courage to have courage
        • bain triail as extract trial from try
        • držati govor hold a speech to give a speech
        • fare un discorso to_make a speech to give a speech
          fare una promessa to_make a promise to make a promise
        • ħa deċizjoni took a decision
        • podjąć decyzję to take a decision
        • fazer uma promessa to make a promise
        • a lua o decizie to take a decision to make a decision
        • imeti predavanje to have a lecture to give a lecture, biti mnenja to be of opinion to have an opinion
        • hålla ett tal hold a speech to give a speech
      • LVCs in which the verb adds a causative meaning to the noun (LVC.cause)
        • давам възможност give an opportunity
        • δίνω προτεραιότητα
        • to grant rights
          to give a headache
          to provoke the destruction of the building
        • dar dolor de cabeza to_give pain of head to give a headache
          hacer ilusión to_make excitement to make excited/to look forward to
        • cuir lúcháir ar put joy on give delight to
        • zadati glavobolju komu to give a headache to someone, izazvati nezadovoljstvo to cause dissatisfaction
        • dare il mal di testa to_give pain of head to give a headache
          dare noia to_give trouble to annoy
        • nakłada obowiązek na użytkowników put a duty on the users
          dać prawo to give the right to grant the right
          narazić na straty expose to losses
          stawiać komuś cel to put an aim to someone to set a goal to someone
        • da cuiva bătăi de cap give sb. a hard time
        • dati ime nekomu to give (somebody) a name to name (somebody), narediti konec nečemu to make an end (to something) to end (something)
    • verbal idioms (VIDs):
      • правя се на дръж ми шапката to behave myself as 'hold my hat' pretend to be naive and innocent
        цъфна и вържа to blossom and give fruit (usually sarcastically) to prosper
        река и отсека to say and cut to say firmly, decisively
      • schwarz fahren to drive black take a ride without a ticket, in Kraft treten into force step to come into effect, in die Waagschale werfen in the weighing pan throw to bring to bear
        einen drauf setzen going one better
      • χάνω τα αυγά και τα καλάθια lose-1SG the eggs and the baskets to be at a complete and utter loss
        απορώ και εξίσταμαι wonder1SG.PST and be-amazed1SG.PST to wonder
      • to go bananas
        fortune favors the bold
        to drink and drive
        to voice act
        to pretty-print
        to short-circuit
        to tumble dry
      • hacer de tripas corazón make of intestines heart to pluck up the courage
        dejar con la miel en los labios to_leave with the honey in the lips leave (sb) wanting more
        dar gato por liebre to_give cat for hare to rip off, to take for a ride
      • adarra jo horn play to pull (somebody's) leg, to be kidding
        burua hautsi head break to rack one's brains, to think very hard
        ikusi eta ikasi see and learn
        hortxe dago koska just-there is the-crux that's the crux of the matter
      • défendre son bifteck defend one's beefsteak to defend one's interests
        court-circuiter to short-circuit
      • ag cur is ag cúiteamh arguing and debating arguing back and forth
      • mlatiti praznu slamu to beat empty straw to talk aimlessly, mazati komu oči to blur eyes to someone to cheat someone
      • gettare le perle ai porci to_throw the pearls to the pigs to waste something good on someone who doesn't care about it
        andare e venire to_come and go back and forth
        to short-circuit
      • għasfur żgħir qalli a bird small told me to hear something from the grapevine
        iqum u joqgħod jump and stay to fidget
      • rzucać grochem o ścianę throw peas against a wall to try to convince somebody in vain
        pluć i łapać to spit and catch to be lazy, to do nothing useful
      • fazer das tripas coração transform the tripes into heart to try everything possible
        pintar e bordar paint and knit to abuse
      • a trage pe sfoară to pull on rope to fool
        a tunat și i-a adunat it.has thundered and CL.ACC-it.has gathered birds of a feather flock together
      • ubiti dve muhi na en mah to kill two flies with one strike to achieve two aims at once, spati kot ubit to sleep like dead to sleep soundly
  • Three quasi-universal categories, valid for some language groups or languages but non-existent or very exceptional in others:
    • inherently reflexive verbs (IRV):
      • усмихвам се to smile
      • sich bemühen to endeavour, sich enthalten himself contain to abstain
      • n.a.
      • to find oneself in a difficult situation
        to help oneself to the cookies
      • suicidarse to suicide
        quejarse to complain
      • n.a.
      • se suicider to suicide
        se soucier to worry
      • n.a.
      • smijati se to laugh
      • suicidarsi to suicide
        lamentarsi to moan
      • bać się to fear SELF to be afraid
      • se queixar to complain
      • a se gândi to think
      • bati se to be afraid, smejati se to laugh, drzniti si to dare to do something
    • verb-particle constructions (VPC) with two subcategories:
      • fully non-compositional VPCs (VPC.full), in which the particle totally changes the meaning of the verb
        • not applicable to Bulgarian
        • er gibt auf he gives up, er wirft ihr das vor he throws her that against he reproaches her for that
        • μπαίνω μέσα get in to go bankrupt
        • to do in
        • n.a.
        • n.a.
        • cas chuig turn towards happen to have
        • postaviti za to set for to appoint
        • buttare giù to_throw down to swallow
        • not applicable to Polish
        • jogar fora: this seems to be the only VPC in Portuguese. We annotate it as ID and do not use the VPC category.
        • n.a.
        • n.a.
      • semi non-compositional VPCs (VPC.semi), in which the particle adds a partly predictable but non-spatial meaning to the verb
        • not applicable to Bulgarian
        • n.a.
        • to eat up
        • n.a.
        • tabhair suas give up
        • andare avanti to_go forward to move on
        • n.a.
        • n.a.
    • multi-verb constructions (MVC):
      • will sagen want to say that is to say
      • έχω να κάνω με have to do with to cope
      • to let go
        to make do
      • querer decir to_want to_say to mean
      • ?
      • laisser tomber let fall to give up
        vouloir dire want say to mean
      • ?
      • može biti can be it is possible
      • lasciar andare to_let go to unhand
        voler dire to_want say to mean
      • dać komuś żyć to let someone live not to bother someone
        można wytrzymać one can stand the situation is reasonably good
      • querer dizer want say to mean
        ouvir falar hear speak to know/remember vaguely
      • n.a.
      • n.a.
  • language-specific categories, defined for a particular language in a separate documentation.

We also introduce an optional experimental category which (if admitted by the given language) is to be considered in the post-annotation step:

  • inherently adpositional verbs (IAVs)
    • излизам пред някого/нещо come in front of someone/something to surpass, to outdo
      излизам със становище come out with a statement
    • n.a.
    • to come across
      to rely on
    • confiar en to_trust in to trust in
      entender de to_understand of to know about
    • n.a.
    • caith anuas ar throw down on belittle
    • suočiti s to face with
    • confidare su to_trust in to trust in
      intendersi di to_understand of to know about
    • godzić się na każde warunki to agree on any condition
      mieć do czynienia z czymś to have to do with sth
      odwieść kogoś od czegoś to dissuade someone from doing sth
    • conta pe count on
    • dati skozi give through to go through, gre za it goes about it is about
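
The category inventory listed above can be sketched as a small lookup table, which may be handy when checking annotation labels in scripts. The labels (LVC.full, LVC.cause, VID, IRV, VPC.full, VPC.semi, MVC, IAV) are taken from this page; the dictionary layout, the scope names and the helper functions are illustrative assumptions and not part of any PARSEME tool. Language-specific categories are defined in separate documentation and are therefore left out.

```python
from typing import Dict, List, Tuple

# Hypothetical lookup table: label -> (scope, short description).
# Labels come from the guidelines; everything else is an assumption.
CATEGORIES: Dict[str, Tuple[str, str]] = {
    "LVC.full": ("universal", "light verb construction, verb semantically bleached"),
    "LVC.cause": ("universal", "light verb construction, verb adds a causative meaning"),
    "VID": ("universal", "verbal idiom"),
    "IRV": ("quasi-universal", "inherently reflexive verb"),
    "VPC.full": ("quasi-universal", "fully non-compositional verb-particle construction"),
    "VPC.semi": ("quasi-universal", "semi non-compositional verb-particle construction"),
    "MVC": ("quasi-universal", "multi-verb construction"),
    "IAV": ("experimental", "inherently adpositional verb (optional, post-annotation)"),
}


def scope(label: str) -> str:
    """Return the scope of a category label (raises KeyError if unknown)."""
    return CATEGORIES[label][0]


def labels_in_scope(scope_name: str) -> List[str]:
    """Return all labels with the given scope, sorted alphabetically."""
    return sorted(l for l, (s, _) in CATEGORIES.items() if s == scope_name)


def parent(label: str) -> str:
    """Return the main category of a label, e.g. 'LVC' for 'LVC.cause'."""
    return label.split(".", 1)[0]
```

For example, `labels_in_scope("universal")` lists the two universal categories with the LVC subcategories expanded, and `parent("VPC.semi")` gives the main category `VPC`.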

In practice, to identify and categorize verbal MWEs during manual annotation, one must use the rigorous generic decision tree and the structural and category-specific cross-lingual tests provided.

For a summary of changes with respect to edition 1.0 of the guidelines, see the what's new file.