Annotation guidelines (version 2.0; UNDER CONSTRUCTION)
Used by the
corpora annotated for multiword expressions
Neutral forms of MWEs
MWEs occurring in a corpus can have various syntactic structures. For instance, to take someone by surprise can be inflected (they took me by surprise), negated (they did not take me by surprise), passivised (I was taken by surprise), or subject to extraction (the surprise by which I was taken). Similarly, a brain washing can be transformed into a structure with a nominal-adpositional modifier (washing of a brain), an extraction (brain whose washing [did not succeed]), etc.
Since the linguistic tests are structure-driven (cf. e.g. structural tests), this variation must be neutralized before the tests are applied. This section introduces the definitions that serve this need.
Neutral form
A neutral form (previously called canonical form) of an MWE or an MWE candidate is its least syntactically marked form which preserves its meaning. We consider that:
- a form with a finite verb is less marked than one with an infinitive, a participle, an analytical tense or a modal,
- active voice is less marked than passive and other diathesis alternations,
- a non-negated form is less marked than a negated one,
- a form with an extraction is more marked than one without it,
- a form with an adpositional modifier is more marked than one without it,
- a form with interposed complex determiners and quantifiers is more marked than one without them,
- a form with coordination is more marked than one without it.
she has taken him by surprise - the neutral form is she took him by surprise [just now]
she was taking him by surprise - the neutral form is she took him by surprise [and this happened at the same time as ...]
she wants to take him by surprise - the neutral form is she takes him by surprise [, that's her plan]
będą ją pociągać do odpowiedzialności they will pull her to responsibility they will accuse her - the neutral form is pociągną ją do odpowiedzialności they will pull her to responsibility they will accuse her
będą ją pociągali do odpowiedzialności they will pull her to responsibility they will accuse her is a neutral form
chcą ją pociągnąć do odpowiedzialności they want to pull her to responsibility they want to accuse her - the neutral form is pociągną ją do odpowiedzialności they will pull her to responsibility they will accuse her
pociągnęli ją do odpowiedzialności they pulled her to responsibility they accused her is a neutral form
bo by pociągnęli ją do odpowiedzialności they would pull her to responsibility they would accuse her is a neutral form
pociągnęliby ją do odpowiedzialności they would pull her to responsibility they would accuse her is a neutral form; only the finite verb in the conditional form is annotated
pociągając ją do odpowiedzialności pulling her to responsibility accusing her - the neutral form is pociągną ją do odpowiedzialności they will pull her to responsibility they will accuse her
w takich warunkach decyzje podejmują się same under such circumstances decisions take themselves on their own under such circumstances no effort is needed to take decisions - the neutral form is w takich warunkach ludzie podejmują decyzje [bez wysiłku]
the brain whose washing did not succeed - the neutral form is [there was] a brain washing[, it did not succeed for a brain]
nie mieli cienia wątpliwości they didn't have a shade of a doubt - the neutral form is [nie jest prawdą, że] mieli jakąkolwiek wątpliwość it is not true that they had any doubt
nie od razu Kraków zbudowano Cracow was not built at once Rome was not built in a day - this is the neutral form on its own rather than Zbudowali Kraków od razu they built Cracow at once
metoda kija i marchewki the method of a stick and a carrot offer people things in order to persuade them to do something and punish them if they refuse to do it - this is the neutral form on its own rather than metoda kija i metoda marchewki the method of a stick and the method of a carrot
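The markedness criteria listed above can be read as a simple ranking over candidate variants. The following is a minimal sketch of that reading, not part of the guidelines themselves: the `Variant` class, its feature flags and the unweighted count in `markedness` are all illustrative assumptions, and the flags are assumed to be supplied by hand. The sketch does not check meaning preservation, so only variants that keep the MWE's meaning should be passed in (this is why, for instance, nie od razu Kraków zbudowano remains its own neutral form despite the negation).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """A meaning-preserving variant of an MWE with hand-assigned marking flags.

    All field names are hypothetical; each flag mirrors one criterion
    from the list of markedness criteria.
    """
    text: str
    non_finite_or_analytic: bool = False  # infinitive, participle, analytical tense, modal
    non_active: bool = False              # passive or another diathesis alternation
    negated: bool = False
    extraction: bool = False
    adpositional_modifier: bool = False
    complex_determiner: bool = False      # interposed complex determiners/quantifiers
    coordination: bool = False


def markedness(v: Variant) -> int:
    """Count the marking features a variant carries (unweighted, an assumption)."""
    return sum([
        v.non_finite_or_analytic,
        v.non_active,
        v.negated,
        v.extraction,
        v.adpositional_modifier,
        v.complex_determiner,
        v.coordination,
    ])


def least_marked(variants):
    """Return the variant with the fewest marking features."""
    return min(variants, key=markedness)


candidates = [
    Variant("I was taken by surprise", non_active=True),
    Variant("they did not take me by surprise", negated=True),
    Variant("the surprise by which I was taken", non_active=True, extraction=True),
    Variant("they took me by surprise"),
]
neutral = least_marked(candidates)
```

Here `neutral.text` is the unmarked active, non-negated form. Since an MWE may have several equally unmarked variants, `min` simply returns the first of them; the choice among ties is left to the annotator.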
Neutral form in MWEs containing deverbal forms
We consider that the existence of deverbal nouns, masdars, adjectives and adverbs in MWEs does not imply syntactic marking. For instance, a wild goose chase, a decision maker and a heartbreaking story are neutral forms on their own. Consequently, they are considered nominal and adjectival MWEs, rather than verbal MWEs. Their connection to the corresponding VMWEs, if any (make a decision and break hearts, in the last two cases) is made explicit through their subcategories (NV.VID, NV.IVPC.full and AV.VID, respectively). Other examples of such cases include:
Wortbruch word-break a promise which has not been kept - this is a neutral form on its own (a deverbal nominal MWE deriving from ein Wort brechen to break a word to fail to keep a promise)
a wild goose chase - this is a neutral form on its own (nominal MWE); it is not deverbal since chase a wild goose is not a VMWE
during take-off and landing - this is a neutral form on its own (a deverbal nominal MWE, here NV.IVPC.full, deriving from take off)
a run-down apartment - this is a neutral form on its own (a deverbal adjectival MWE, here AV.IVPC.full, deriving from run down)
porte-feuille carry-sheets wallet - this is a neutral form on its own (nominal MWE); it is not deverbal since porter des feuilles is not a VMWE
couru d'avance run in advance forgone conclusion - this is a neutral form on its own (adjectival MWE); it is not deverbal since courir d'avance is not a VMWE
la prise en compte the fact of taking into account - this is a neutral form on its own (a deverbal nominal MWE, here NV.VID, deriving from prendre en compte to take into account)
une mise à disposition putting into disposal the fact of making available - this is a neutral form on its own (a deverbal nominal MWE, here NV.LVC.cause, deriving from mettre à disposition to put into disposal to make available)
zabawa czyimś kosztem a play at someone else's expense - this is a neutral form on its own (a deverbal nominal MWE, here NV.VID, deriving from bawić się czyimś kosztem to enjoy oneself at someone else's expense)
Note that it is notoriously hard to distinguish deverbal nouns, adjectives and adverbs from verbal inflected forms like gerunds, participles, etc.
she was breaking his heart - verbal MWE (VMWE) vs. heart-breaking story - deverbal adjectival MWE (AV.VID)
łamać serca to break hearts - verbal MWE (VMWE) vs. łamanie serc breaking of hearts - deverbal nominal MWE (NV.VID)
The underlying morpho-syntactic annotation might help in decision making.
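As a rough illustration of how the morpho-syntactic layer could inform this decision, the sketch below maps the part-of-speech tag of an MWE's head token to a candidate MWE class. It assumes the corpus carries Universal Dependencies style UPOS tags (`VERB`, `NOUN`, `ADJ`); the mapping table and the function name `mwe_class_hint` are hypothetical and not prescribed by these guidelines. Because treebanks differ in how they tag gerunds and participles, the result is only a hint for the annotator, not a decision.

```python
# Hypothetical mapping from the UPOS tag of the MWE's head token
# to the class of MWE it suggests (assumption, not from the guidelines).
UPOS_TO_CLASS = {
    "VERB": "verbal MWE (VMWE)",
    "NOUN": "nominal MWE",
    "ADJ": "adjectival MWE",
}


def mwe_class_hint(head_upos: str) -> str:
    """Suggest an MWE class from the head token's UPOS tag.

    Returns "undecided" for any tag not covered by the mapping,
    leaving the decision to the annotator.
    """
    return UPOS_TO_CLASS.get(head_upos, "undecided")


# "breaking" in "she was breaking his heart" is assumed to be tagged VERB,
# "heart-breaking" in "heart-breaking story" is assumed to be tagged ADJ.
hint_verbal = mwe_class_hint("VERB")
hint_adjectival = mwe_class_hint("ADJ")
```

A lookup table keeps the rule transparent and easy to adapt per language, for instance when a treebank tags verbal nouns such as łamanie as `NOUN`.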
Non-unicity of a neutral form
Note that a given MWE type often has more than one neutral form.
In previous versions of these guidelines, a neutral form was called canonical form.