Collocations are not considered VMWEs in this task and should not be annotated. However, the boundary between the two categories is not always easy to draw and should be handled with care.
We understand collocations as combinations of words whose idiosyncrasy is purely statistical. In other words, tokens in collocations tend to co-occur with each other more often than expected by chance, but they show no substantial orthographic, morphological, syntactic or (most notably) semantic idiosyncrasy. In this respect, we contrast MWEs with collocations.
Note that other authors understand collocations slightly differently. For Sag et al. (2002), for example, collocations are any statistically significant co-occurrences, i.e. they include all forms of MWE. For Baldwin and Kim (2010), collocations form a proper subset of MWEs. According to Melcuk (2010), collocations are binary, semantically compositional combinations of words subject to lexical selection constraints, i.e. they intersect with what is here understood as MWEs.
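The notion of tokens co-occurring "more often than expected by chance" can be made concrete with an association measure. The sketch below uses pointwise mutual information (PMI), which is one common choice and an assumption here, since these guidelines do not prescribe any particular measure. The toy verb-noun pairs and the function name `pmi_scores` are purely illustrative.

```python
import math
from collections import Counter

# Hypothetical toy corpus of (verb, noun) pairs; not taken from any real data.
TOY_PAIRS = (
    [("take", "bus")] * 8
    + [("take", "book")] * 2
    + [("read", "book")] * 8
    + [("read", "bus")] * 2
)


def pmi_scores(pairs):
    """Return PMI = log2(P(v, n) / (P(v) * P(n))) for every observed pair."""
    pair_counts = Counter(pairs)
    verb_counts = Counter(verb for verb, _ in pairs)
    noun_counts = Counter(noun for _, noun in pairs)
    total = len(pairs)

    scores = {}
    for (verb, noun), count in pair_counts.items():
        p_pair = count / total
        p_verb = verb_counts[verb] / total
        p_noun = noun_counts[noun] / total
        # Positive PMI: the pair occurs more often than independence predicts.
        scores[(verb, noun)] = math.log2(p_pair / (p_verb * p_noun))
    return scores


if __name__ == "__main__":
    for pair, score in sorted(pmi_scores(TOY_PAIRS).items()):
        print(pair, round(score, 3))
```

A positive score only signals statistical association. Under the definition above, it does not by itself make a combination a VMWE, since no semantic or syntactic idiosyncrasy is measured.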
Some combinations happen to be very frequent and are perceived as "frozen":
أجاب على سؤال 'answer a question', إشترى كتاب 'buy a book', قدم وجبة 'serve a meal'
качвам цената 'raise the price'
eine Frage beantworten 'to answer a question', die Graphik zeigt 'the graphic shows', einen Bus nehmen 'to take a bus'
παίρνω το λεωφορείο (perno to leoforio) 'take-1SG the bus'
drastically drop, the graphic shows, to take a bus
responder a una pregunta 'to answer a question', el gráfico muestra 'the graphic shows', coger el autobús 'to take the bus'
interesa agertu (lit. 'interest show') 'to show interest', galdera bati erantzun (lit. 'question one-to answer') 'answer a question', autobusa hartu (lit. 'bus take') 'to take the bus'
riješiti dvojbu 'to solve a dilemma', pripremati jelo 'to prepare a meal'
rispondere a una domanda 'to answer a question', il grafico mostra 'the graphic shows', prendere un bus 'to take a bus'
de bus nemen 'to take the bus'
zalać rynek (lit. 'to flood the market') 'to dominate the market'
bater um recorde 'to break a record' (bater 'to beat' has a regular sense of 'to overcome' in addition to the literal sense), entrar em cartaz (lit. 'enter into poster') 'arrive in theaters (for a movie)' (the MWE is em cartaz, lit. 'in poster', 'in theaters'; the verb just usually collocates with this MWE)
lua un autobuz 'take a bus'
drastičen upad 'drastic drop', graf prikazuje 'graphic shows', vzeti taksi 'to take a taxi'
古人 云 (lit. 'the-ancient say') 'the ancient people said', 据 报道 (lit. 'according-to report') 'according to what is reported'
However, applying regular lexical alternations to them does not markedly impact their meaning.
أجاب على الاستبيان 'answer a questionnaire', فطور القدم 'serve a breakfast', إشترى جريدة 'buy a newspaper'
вдигам цената 'raise the price', увеличавам цената 'raise the price', качвам залога 'raise the bet', качвам температурата 'raise the temperature'
eine Anfrage beantworten 'to answer a request', das Diagramm zeigt 'the diagram shows', mit einem Bus fahren 'to go by bus'
παίρνω το πλοίο (perno to plio) 'take the ship', παίρνω το τραίνο (perno to treno) 'take the train'
significantly drop, drastically decrease, the diagram shows, the graphic illustrates, to take a coach
responder a una petición 'to answer a request', el diagrama muestra 'the diagram shows', coger el tren 'to take the train'
interesa erakutsi (lit. 'interest show') 'to show interest' → 'erakutsi' and 'agertu' are synonyms in this context in Basque; zalantza bati erantzun (lit. 'doubt one-to answer') 'answer a doubt'; trena hartu (lit. 'train take') 'to take the train'
riješiti dilemu 'to solve a dilemma', pripremati obrok 'to prepare a meal'
rispondere a una richiesta 'to answer a request', il diagramma mostra 'the diagram shows'
met de bus gaan 'to go by bus'
zdominować/zarzucić/zapełnić/nasycić rynek 'to dominate/overwhelm/fill/saturate the market'
quebrar/bater/ultrapassar/estabelecer um recorde 'to break/beat/overcome/establish a record', o recorde foi quebrado 'the record was broken', entrar/estar/permanecer/ficar/continuar/ter em cartaz (lit. 'enter/be/remain/stay/continue/have in poster')
насецкати лук / насецкати першун / исецкати сланину (naseckati luk / naseckati peršun / iseckati slaninu) 'to chop onions / to chop parsley / to chop bacon'
古人 说 (lit. 'the-ancient say') 'the ancient people said', 圣者 云 (lit. 'the-saint say') 'the saint said', 据 称 (lit. 'according-to report') 'according to what is reported', 有 报道 (lit. 'have report') 'there are reports'
The difficulty of distinguishing collocations from VMWEs lies in the fact that some VMWEs also allow lexical variability:
أعطى/أسدى نصيحة 'to give / weave an advice', ألقى كلمة/خطاب (lit. 'threw a word / speech') 'give a word/speech'
нямам пукната пара/пукнат грош (lit. 'to not have a single penny') 'to be very poor', имам твърда/дебела глава (lit. 'to have a thick head') 'to be stubborn and not listen to advice'
einen Willen/Menschen brechen 'to break a will/person'
παίρνω / λαμβάνω απόφαση (perno / lamvano apofasi) (lit. 'take / take decision') 'to decide'
to come in handy/useful, to stand firm/fast, to break someone's spirit/will, to take the cake/biscuit
dar un paseo/una vuelta (lit. 'give a walk / a turn') 'to go for a walk', darse/tomar una ducha (lit. 'give.self/take a shower') 'take a shower'
min eman/egin (lit. 'pain give/do') 'to hurt (somebody)'; eskola/klasea eman (lit. 'class give') 'to give a class' → 'eskola' and 'klasea' are synonyms in Basque
περὶ πολλοῦ / ἐλάττονος ποιέομαι (peri pollou / elattonos poieomai) (lit. 'above much.GEN / more.GEN / little.GEN do.1SG') 'to hold in high / higher / low esteem'
cogliere/prendere di sorpresa 'to catch/take by surprise', dare/fornire un contributo 'to give/provide a contribution'
zapisać się złotymi literami/zgłoskami (lit. 'to record itself with golden letters/syllables') 'to be remembered and commemorated for a merit', zamarznąć na kość/lód/sopel (lit. 'to freeze to bone/ice/icicle') 'to freeze strongly'
levar em conta/consideração 'take into account/consideration', chutar o balde/pau da barraca (lit. 'to kick the bucket/the tent's stick') 'to act irresponsibly'
lua o decizie/hotărâre 'make a decision'
imeti nekaj na voljo/razpolago 'to have something available/at disposal', odpreti nekomu pot/vrata (lit. 'to open a way/a door (for someone)') 'to give someone an opportunity to do something'
крити нешто као змија ноге/крити нешто као гуја ноге (kriti nešto kao zmija noge/kriti nešto kao guja noge) (lit. 'to hide (sth.) like a snake hides its legs/to hide (sth.) like a serpent hides its legs') 'to hide something with extreme caution'
However, the extent of the vocabulary affected by this variability differs between collocations and VMWEs. A head verb in a collocation usually selects a whole semantic class for each of its required arguments. For instance, the verb to take 'to use a vehicle to travel' selects the whole semantic class of means of transport. Similarly, the verb to drop can select a large set of adverbs describing the degree: drastically/significantly/remarkably/slightly/reasonably drop. Conversely, lexical variability in a VMWE is limited to a closed list of lexemes, sometimes only loosely semantically related. For instance, the VMWEs to take the cake/biscuit and to stand firm/fast do not keep their idiomatic readings with semantically close complements: #to take the cookie/wafer, *to stand hard/rigid/solid, etc.
See also Test VID.2.
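The contrast between an open semantic class and a closed list of lexemes can be illustrated with a small sketch. It assumes that an annotator has already judged which substitutes preserve the reading in question. The function name `substitution_profile`, the 0.8 threshold and the candidate sets are hypothetical and serve only to illustrate the reasoning; they are not part of the guidelines or of any test defined in them.

```python
def substitution_profile(accepted, semantic_class, threshold=0.8):
    """Compare substitutes judged acceptable with the whole semantic class.

    accepted: lexemes that keep the intended reading (annotator judgment).
    semantic_class: semantically close candidates that were tried.
    threshold: hypothetical cut-off; not prescribed by the guidelines.
    """
    semantic_class = set(semantic_class)
    if not semantic_class:
        raise ValueError("semantic_class must not be empty")
    covered = set(accepted) & semantic_class
    coverage = len(covered) / len(semantic_class)
    # High coverage suggests selection of a whole class (collocation-like);
    # low coverage suggests a closed list of lexemes (VMWE-like).
    label = "open class" if coverage >= threshold else "closed list"
    return coverage, label


# 'to take' + means of transport: every candidate keeps the reading.
TRANSPORT = {"bus", "coach", "train", "ship", "taxi"}
print(substitution_profile(TRANSPORT, TRANSPORT))

# 'to take the cake/biscuit': close synonyms lose the idiomatic reading.
SWEETS = {"cake", "biscuit", "cookie", "wafer"}
print(substitution_profile({"cake", "biscuit"}, SWEETS))
```

The ratio is only a rough aid for thinking about borderline cases. The decision itself still rests on the annotator's judgment of whether each substitute preserves the reading.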
Some light-verb constructions (LVCs) and multiverb constructions (MVCs) belong to the gray zone between MWEs and collocations, in that some operator (light) verbs seem to select large classes of nouns, as in to make a speech/declaration/remark/etc. However, some studies (e.g. Bonial 2014) show that no light verb is truly productive (e.g. to give a look vs. to give a stare). Therefore, we do include LVCs and MVCs in our annotation scope.