Annotation guidelines
PARSEME shared task on automatic identification of verbal MWEs - edition 1.0 (2017)

Verbal multiword expressions versus collocations

Collocations are not considered VMWEs in this task and should not be annotated. However, the boundary between the two categories is not always easy to draw, so borderline cases should be handled with care.

We understand collocations as combinations of words whose idiosyncrasy is purely statistical. In other words, words in collocations tend to co-occur with each other more often than expected by chance, but they show no substantial orthographic, morphological, syntactic and (most notably) semantic idiosyncrasy.
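As an illustration only, "more often than expected by chance" is often quantified with an association measure such as pointwise mutual information (PMI). The guidelines do not prescribe any particular measure. The sketch below assumes simple co-occurrence counts; the function name `pmi` and all counts are invented for the example.

```python
import math


def pmi(pair_count, w1_count, w2_count, total):
    """Pointwise mutual information (base 2) of a word pair.

    Compares the observed probability of the pair with the probability
    expected if the two words occurred independently of each other.
    """
    p_pair = pair_count / total
    p_w1 = w1_count / total
    p_w2 = w2_count / total
    return math.log2(p_pair / (p_w1 * p_w2))


# Toy, invented counts (not taken from any corpus).
# The pair occurs far more often than independence would predict.
collocation_like = pmi(pair_count=50, w1_count=1000, w2_count=200, total=1_000_000)

# Here the pair occurs about as often as chance predicts.
chance_like = pmi(pair_count=2, w1_count=1000, w2_count=2000, total=1_000_000)

print(collocation_like > chance_like)
```

A high score only signals statistical association. It says nothing about the orthographic, morphological, syntactic or semantic idiosyncrasy that defines a VMWE.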

Some combinations happen to be very frequent and are perceived as "frozen":

  • качвам цената raise the price
  • eine Frage beantworten to answer a question, die Graphik zeigt the graphic shows, einen Bus nehmen to take a bus
  • κάνω βόλτα take-1SG a walk
  • drastically drop
    the graphic shows
    to take a bus
  • responder a una pregunta to answer a question
    el gráfico muestra the graphic shows
    coger el autobús to take the bus
  • rispondere a una domanda to answer a question
    il grafico mostra the graphic shows
    prendere un bus to take a bus
  • zalać rynek to flood the market to dominate the market
  • bater um recorde to break a record (bater to beat has a regular sense of to overcome in addition to the literal sense)
    entrar em cartaz enter into poster arrive in theaters (for a movie) (the MWE is em cartaz in poster in theaters, the verb just usually collocates with this MWE)
  • lua un autobuz take a bus
  • drastičen upad drastically drop, graf prikazuje graphic shows, vzeti taksi to take a taxi

However, applying regular lexical alternations to them does not markedly impact their meaning.

  • вдигам цената raise the price, увеличавам цената raise the price, качвам залога raise the bet, качвам температурата raise the temperature
  • eine Anfrage beantworten to answer a request, das Diagramm zeigt the diagram shows, mit einem Bus fahren to take a bus
  • πάω βόλτα go for a walk
  • significantly drop, drastically decrease, the diagram shows, the graphic illustrates, to take a coach
  • responder a una petición to answer a request
    el diagrama muestra the diagram shows
    coger un tren take a train
  • rispondere a una richiesta to answer a request
    il diagramma mostra the diagram shows
  • zdominować/zarzucić/zapełnić/nasycić rynek to dominate/overwhelm/fill/saturate the market
  • quebrar/bater/ultrapassar/estabelecer um recorde to break/beat/overcome/establish a record
    o recorde foi quebrado the record was broken
    entrar/estar/permanecer/ficar/continuar/ter em cartaz enter/be/remain/stay/continue/have in poster
  • lua o mașină take a car
  • občuten upad significantly drop, drastično zmanjšanje drastically decrease, diagram prikazuje diagram shows, slika prikazuje picture shows

The difficulty of distinguishing collocations from VMWEs lies in the fact that lexical variability is relevant to some VMWEs:

  • нямам пукната пара/пукнат грош to not have a single penny, be very poor
    имам твърда/дебела глава to have a thick head, to be stubborn and not listen to advice
  • einen Willen/Menschen brechen to break a will/person
  • to come in handy/useful, to stand firm/fast, to break someone's spirit/will, to take the cake/biscuit
  • dar un paseo/una vuelta give a walk/a turn to go for a walk
    darse/tomar una ducha give.self/take a shower take a shower
  • cogliere/prendere di sorpresa to catch/take by surprise, dare/fornire un contributo to give/provide a contribution
  • zapisać się złotymi literami/zgłoskami to record itself with golden letters/syllables to be remembered and commemorated for a merit
    zamarznąć na kość/lód/sopel to freeze to bone/ice/icicle to freeze strongly
  • levar em conta/consideração take into account/consideration
    chutar o balde/pau da barraca to kick the bucket/the tent's stick to act irresponsibly
  • lua o decizie/hotărâre make a decision
  • sprejeti odločitev/sklep make a decision

However, the extent of the vocabulary concerned by this variability differs between collocations and VMWEs. A head verb in a collocation usually selects a whole semantic class for each of its required arguments. For instance, the verb to take (in the sense of to use a vehicle to travel) selects the whole semantic class of means of transport. Similarly, the verb to drop can select a large set of adverbs describing degree: drastically/significantly/remarkably/slightly/reasonably drop. Conversely, lexical variability in a VMWE is limited to a closed list of lexemes, sometimes only loosely semantically related. For instance, the VMWEs to take the cake/biscuit and to stand firm/fast do not keep their idiomatic readings with semantically close complements: #to take the cookie/wafer, *to stand hard/rigid/solid, etc. See also test 2.
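The contrast between a whole semantic class and a closed list can be sketched as a small check. This is a hypothetical illustration, not part of the annotation procedure: the function `variability` and the hand-built sets are assumptions, and the acceptability judgments are simply copied from the English examples in this section.

```python
def variability(accepted, candidates):
    """Describe how far a slot's lexical variability extends.

    accepted:   lexemes judged to keep the intended reading
    candidates: semantically close lexemes that were tried in the slot
    """
    rejected = candidates - accepted
    if rejected:
        # Some close neighbours lose the reading: variation is a closed list.
        return "closed list (VMWE-like)"
    # Every tried member of the class works: the verb selects the class.
    return "whole class (collocation-like)"


# "to take a ___" in the sense of using a vehicle to travel
vehicles = {"bus", "coach", "train", "taxi"}
print(variability(accepted=vehicles, candidates=vehicles))

# "to take the ___" in its idiomatic reading
print(variability(accepted={"cake", "biscuit"},
                  candidates={"cake", "biscuit", "cookie", "wafer"}))
```

In practice the judgments themselves come from annotators, and the candidate set is never exhaustive, so a result like this can only support a decision, not replace it.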