Verbal multiword expressions versus collocations
Collocations are not considered VMWEs in this task and should not be annotated. However, the boundary between the two categories is not always easy to define and should be handled with care.
We understand collocations as combinations of words whose idiosyncrasy is purely statistical. In other words, tokens in collocations tend to co-occur with each other more often than expected by chance, but they show no substantial orthographic, morphological, syntactic and (most notably) semantic idiosyncrasy. In this way we oppose MWEs to collocations.
Note that other authors understand collocations slightly differently. For Sag et al. (2002), collocations are any statistically significant co-occurrences, i.e. they include all forms of MWE. For Baldwin and Kim (2010), collocations form a proper subset of MWEs. According to Mel'čuk (2010), collocations are binary, semantically compositional combinations of words subject to lexical selection constraints, i.e. they intersect with what is here understood as MWEs.
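The statistical notion of collocation used here ("co-occur more often than expected by chance") can be illustrated with pointwise mutual information (PMI). The sketch below is only an illustration: the pair list is invented toy data, the function name `pmi` is hypothetical, and PMI is just one of several association measures that could be used; the guidelines do not prescribe any of them.

```python
import math
from collections import Counter

# Toy (verb, object) pairs, invented for illustration; not corpus data.
PAIRS = [
    ("take", "bus"), ("take", "bus"), ("take", "bus"), ("take", "train"),
    ("answer", "question"), ("answer", "question"), ("buy", "book"),
    ("buy", "bus"), ("answer", "book"), ("take", "question"),
]

def pmi(pairs, verb, noun):
    """Return log2(P(verb, noun) / (P(verb) * P(noun))) over the pair list."""
    n = len(pairs)
    pair_counts = Counter(pairs)
    verb_counts = Counter(v for v, _ in pairs)
    noun_counts = Counter(o for _, o in pairs)
    joint = pair_counts[(verb, noun)] / n
    if joint == 0:
        # The pair never occurs: association is minus infinity by convention.
        return float("-inf")
    expected = (verb_counts[verb] / n) * (noun_counts[noun] / n)
    return math.log2(joint / expected)

# A positive score means the pair co-occurs more often than chance predicts.
print(pmi(PAIRS, "take", "bus"))
print(pmi(PAIRS, "take", "question"))
```

A high score only signals statistical attraction. It says nothing about semantic idiosyncrasy, which is why a frequent pair such as to take a bus is still not a VMWE under these guidelines.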
Some combinations happen to be very frequent and are perceived as "frozen":
سؤالعلى أجاب answer a question
كتاب إشترى buy a book
وجبة قدم serve a meal
качвам цената raise the price
eine Frage beantworten to answer a question
die Graphik zeigt the graphic shows
einen Bus nehmen to take a bus
παίρνω το λεωφορείο perno to leoforio take-1SG the bus
drastically drop
the graphic shows
to take a bus
responder a una pregunta to answer a question
el gráfico muestra the graphic shows
coger el autobús to take the bus
interesa agertu interest show to show interest
galdera bati erantzun question one-to answer answer a question
autobusa hartu bus take to take the bus
riješiti dvojbu to solve a dilemma
pripremati jelo to prepare a meal
rispondere a una domanda to answer a question
il grafico mostra the graphic shows
prendere un bus to take a bus
de bus nemen to take the bus
zalać rynek to flood the market to dominate the market
bater um recorde to break a record (bater to beat has a regular sense of to overcome in addition to the literal sense)
entrar em cartaz enter into poster arrive in theaters (for a movie) (the MWE is em cartaz in poster in theaters, the verb just usually collocates with this MWE)
lua un autobuz take a bus
drastičen upad drastic drop
graf prikazuje graphic shows
vzeti taksi to take a taxi
графикон приказује grafikon prikazuje the graph displays
дијаграм илуструје dijagram ilustruje the diagram illustrates
古人 云 the-ancient say the ancient people said
据 报道 according-to report according to what is reported
However, applying regular lexical alternations to them does not markedly impact their meaning.
الاستبيان على أجاب answer a questionnaire
فطور القدم serve a breakfast
جريدة إشترى buy a newspaper
вдигам цената raise the price
увеличавам цената raise the price
качвам залога raise the bet
качвам температурата raise the temperature
eine Anfrage beantworten to answer a request
das Diagramm zeigt the diagram shows
mit einem Bus fahren to go by bus
παίρνω το πλοίο perno to plio take the ship
παίρνω το τραίνο perno to treno take the train
significantly drop
drastically decrease
the diagram shows
the graphic illustrates
to take a coach
responder a una petición to answer a request
el diagrama muestra the diagram shows
coger el tren to take the train
interesa erakutsi interest show to show interest → 'erakutsi' and 'agertu' are synonyms in this context in Basque
zalantza bati erantzun doubt one-to answer answer a doubt
trena hartu train take to take the train
riješiti dilemu to solve a dilemma
pripremati obrok to prepare a meal
rispondere a una richiesta to answer a request
il diagramma mostra the diagram shows
met de bus gaan to go by bus
zdominować/zarzucić/zapełnić/nasycić rynek to dominate/overwhelm/fill/saturate the market
quebrar/bater/ultrapassar/estabelecer um recorde to break/beat/overcome/establish a record
o recorde foi quebrado the record was broken
entrar/estar/permanecer/ficar/continuar/ter em cartaz enter/be/remain/stay/continue/have in poster
lua o mașină
občuten upad significant drop
drastično zmanjšanje drastic decrease
diagram prikazuje diagram shows
slika prikazuje picture shows
насецкати лук / насецкати першун / исецкати сланину naseckati luk / naseckati peršun / iseckati slaninu to chop onions / to chop parsley / to chop bacon
古人 说 the-ancient say the ancient people said
圣者 云 the-saint say the saint said
据 称 according-to report according to what is reported
有 报道 have report there are reports
The difficulty of distinguishing collocations from VMWEs lies in the fact that lexical variability is relevant to some VMWEs:
نصيحة أعطى/أسدى to give / weave advice
كلمة /خطاب ألقى threw a word / speech give a word/speech
нямам пукната пара/пукнат грош to not have a single penny, to be very poor
имам твърда/дебела глава to have a thick head, to be stubborn and not listen to advice
einen Willen/Menschen brechen to break a will/person
παίρνω / λαμβάνω απόφαση perno / lamvano apofasi take / take decision to decide
to come in handy/useful
to stand firm/fast
to break someone's spirit/will
to take the cake/biscuit
dar un paseo/una vuelta give a walk / a turn to go for a walk
darse/tomar una ducha give.self/take a shower take a shower
min eman/egin pain give/do to hurt (somebody)
eskola/klasea eman class give to give a class → 'eskola' and 'klasea' are synonyms in Basque
περὶ πολλοῦ / ἐλάττονος ποιέομαι peri pollou / elattonos poieomai above much.GEN / more.GEN / little.GEN do.1SG to hold in high / higher / low esteem
slomiti čiju/čiji volju/duh to break someone's will/spirit
cogliere/prendere di sorpresa
dare/fornire un contributo
zapisać się złotymi literami/zgłoskami to record itself with golden letters/syllables to be remembered and commemorated for a merit
zamarznąć na kość/lód/sopel to freeze to bone/ice/icicle to freeze strongly
levar em conta/consideração take into account/consideration
chutar o balde/pau da barraca to kick the bucket/the tent's stick to act irresponsibly
lua o decizie/hotărâre make a decision
imeti nekaj na voljo/razpolago to have something available/at disposal
odpreti nekomu pot/vrata to open a way/a door (for someone) to give someone an opportunity to do something
крити нешто као змија ноге/крити нешто као гуја ноге kriti nešto kao zmija noge/kriti nešto kao guja noge to hide (sth.) like a snake hides its legs/to hide (sth.) like a serpent hides its legs to hide something with extreme caution
However, the extent of the vocabulary affected by this variability differs between collocations and VMWEs. A head verb in a collocation usually selects a whole semantic class for each of its required arguments. For instance, the verb to take, in the sense 'to use a vehicle to travel', selects the whole semantic class of means of transport. Similarly, the verb to drop can select a large set of adverbs describing degree: drastically/significantly/remarkably/slightly/reasonably drop. Conversely, lexical variability in a VMWE is limited to a closed list of lexemes, sometimes only loosely semantically related. For instance, the VMWEs to take the cake/biscuit and to stand firm/fast do not keep their idiomatic readings with semantically close complements: #to take the cookie/wafer, *to stand hard/rigid/solid, etc.
See also Test VID.2.
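The contrast between an open semantic class and a closed list of variants can be sketched as a simple substitution check. This is a toy illustration, not an annotation procedure: the candidate lists and acceptability judgments are hand-written from the examples in this section, and the names `CANDIDATES`, `KEEPS_READING`, `substitution_profile` and the `threshold` parameter are all hypothetical.

```python
# Candidate complements drawn from one semantic class per expression.
# Hand-written toy data based on the examples above (an assumption).
CANDIDATES = {
    "take (use a vehicle)": ["bus", "train", "taxi", "coach", "ship"],
    "take the ___ (idiom)": ["cake", "biscuit", "cookie", "wafer"],
}

# Complements judged to preserve the intended reading (annotator judgment).
KEEPS_READING = {
    "take (use a vehicle)": {"bus", "train", "taxi", "coach", "ship"},
    "take the ___ (idiom)": {"cake", "biscuit"},
}

def substitution_profile(expression):
    """Share of same-class complements that keep the intended reading."""
    candidates = CANDIDATES[expression]
    kept = [c for c in candidates if c in KEEPS_READING[expression]]
    return len(kept) / len(candidates)

def looks_like_collocation(expression, threshold=1.0):
    """True if (nearly) the whole semantic class is accepted.

    The threshold is an illustrative assumption, not a guideline value.
    """
    return substitution_profile(expression) >= threshold

for expr in CANDIDATES:
    print(expr, substitution_profile(expr), looks_like_collocation(expr))
```

In this toy data the vehicle sense of to take accepts every member of its class, while the idiom accepts only a closed list, which mirrors the distinction drawn above. Real decisions rest on the guideline tests and annotator judgment rather than on a numeric cut-off.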
Some light-verb constructions (LVCs) and multiverb constructions (MVCs) belong to the gray zone between MWEs and collocations, in the sense that some operator (light) verbs seem to select large classes of nouns, as in to make a speech/declaration/remark/etc. However, some studies (e.g. Bonial 2014) show that there is no such thing as a truly productive light verb (e.g. to give a look vs. to give a stare). Therefore, we do include LVCs and MVCs in our annotation scope.