Annotation guidelines (version 2.0; UNDER CONSTRUCTION)
Used by the PARSEME corpora annotated for multiword expressions


Neutral forms of MWEs

MWEs occurring in a corpus can have various syntactic structures. For instance, to take someone by surprise can be inflected (they took me by surprise), negated (they did not take me by surprise), passivised (I was taken by surprise), or subject to extraction (the surprise by which I was taken). Similarly, a brain washing can be transformed into a structure with a nominal-adpositional modifier (washing of a brain), an extraction (brain whose washing [did not succeed]), etc.

Since the linguistic tests are structure-driven (cf. e.g. structural tests), this variation must be neutralized before the tests are applied. In this section we introduce definitions that meet this need.

Neutral form

A neutral form (previously called canonical form) of a MWE or a MWE candidate is its least syntactically marked form which preserves its meaning. We consider that:

  • a form with a finite verb is less marked than one with an infinitive, a participle, an analytical tense or a modal,
    • she will take him by surprise - the neutral form is she takes him by surprise [- this is the plan for the future]
      she has taken him by surprise - the neutral form is she took him by surprise [just now]
      she was taking him by surprise - the neutral form is she took him by surprise [and this happened at the same time as ...]
      she wants to take him by surprise - the neutral form is she takes him by surprise [, that's her plan]
    • pociągną ją do odpowiedzialności they will pull her to responsibility they will accuse her is a neutral form
      będą ją pociągać do odpowiedzialności they will pull her to responsibility they will accuse her - the neutral form is pociągną ją do odpowiedzialności they will pull her to responsibility they will accuse her
      będą ją pociągali do odpowiedzialności they will pull her to responsibility they will accuse her is a neutral form
      chcą ją pociągnąć do odpowiedzialności they want to pull her to responsibility they want to accuse her - the neutral form is pociągną ją do odpowiedzialności they will pull her to responsibility they will accuse her
      pociągnęli ją do odpowiedzialności they pulled her to responsibility they accused her is a neutral form
      bo by pociągnęli ją do odpowiedzialności they would pull her to responsibility they would accuse her is a neutral form
      pociągnęliby ją do odpowiedzialności they would pull her to responsibility they would accuse her is a neutral form; only the finite verb in the conditional form is annotated
      pociągając ją do odpowiedzialności pulling her to responsibility accusing her - the neutral form is pociągną ją do odpowiedzialności they will pull her to responsibility they will accuse her
  • active voice is less marked than passive and other diathesis alternations,
    • he was taken by surprise - the neutral form is someone took him by surprise
    • décisions importantes se prennent en groupes important decisions take themselves in groups important decisions are taken in groups - the neutral form is on prend des décisions importantes en groupes one takes important decisions in groups
    • została pociągnięta do odpowiedzialności she was pulled to responsibility she was accused - the neutral form is pociągnęli ją do odpowiedzialności they pulled her to responsibility they accused her
      w takich warunkach decyzje podejmują się same under such circumstances decisions take themselves on their own under such circumstances no effort is needed to take decisions - the neutral form is w takich warunkach ludzie podejmują decyzje [bez wysiłku]
  • a non-negated form is less marked than a negated one,
    • they did not take him by surprise - the neutral form is it is not true that they took him by surprise
    • nie pociągną jej do odpowiedzialności they will not pull her to responsibility they will not accuse her - the neutral form is pociągną ją do odpowiedzialności they will pull her to responsibility they will accuse her
  • a form with an extraction is more marked than one without it,
    • the surprise by which they took him - the neutral form is they took him by surprise[, this surprise ...]
      the brain whose washing did not succeed - the neutral form is [there was] a brain washing[, it did not succeed for a brain]
    • decyzja, którą podjęłam the decision which I made - the neutral form is podjęłam decyzję I made a decision
  • a form with an adpositional modifier is more marked than one without it,
    • the washing of a brain - the neutral form is brain washing
    • pranie przeznaczone dla mojego mózgu the washing dedicated to my brain - the neutral form is pranie mózgu [przeznaczone dla mnie] brain washing [dedicated to me]
  • a form with interposed complex determiners and quantifiers is more marked than one without them,
    • they took a significant number of steps - the neutral form is they took steps whose number was significant
    • dostali połowę spadku they received half of the inheritance - the neutral form is dostali spadek[, ale nie cały tylko połowę] they received an inheritance [but half of it rather than the whole]
      nie mieli cienia wątpliwości they didn't have a shade of a doubt - the neutral form is [nie jest prawdą, że] mieli jakąkolwiek wątpliwość it is not true that they had any doubt
  • a form with coordination is more marked than one without it,
    • a guide to red and yellow cards in soccer - the two neutral forms are a guide to red cards and to yellow cards
    • dwie czerwone i cztery żółte kartki two red and four yellow cards - the two neutral forms are dwie czerwone kartki i cztery żółte kartki
This reasoning may be applied several times until the least syntactically marked form preserving the meaning is found:
  • a bunch of decisions which were made by her - the form contains passivization, extraction and a complex determiner; the neutral form is she made decisions [which were quite numerous]
  • wiele decyzji i działań, które zostały podjęte many decisions and actions which were taken - the form contains passivization, extraction and coordination; the neutral form is podjęli decyzje i podjęli działania [których było wiele] they took decisions and they took actions [which were many]
In some cases, transforming a MWE or a MWE candidate to a less syntactically marked form does not preserve its meaning. In this case, a more syntactically marked form is considered neutral.
  • the die is cast the point of no retreat has been reached - this is a neutral form on its own rather than they cast the die
  • les carottes sont cuites the carrots are cooked it's too late - this is a neutral form on its own rather than j'ai cuit les carottes I have cooked the carrots
  • kości zostały rzucone the dice have been thrown alea iacta est - this is the neutral form on its own rather than ktoś rzucił kości someone threw the dice
    nie od razu Kraków zbudowano Cracow was not built at once Rome was not built in a day - this is the neutral form on its own rather than Zbudowali Kraków od razu they built Cracow at once
    metoda kija i marchewki the method of a stick and a carrot offer people things in order to persuade them to do something and punish them if they refuse to do it - this is the neutral form on its own rather than metoda kija i metoda marchewki the method of a stick and the method of a carrot
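
The procedure above (strip marked features repeatedly, but keep any feature whose removal would lose the meaning) can be illustrated with a minimal Python sketch. The feature names, the function neutralize and the meaning_changing argument are illustrative assumptions, not part of the guidelines; whether a transformation preserves the meaning remains a human judgement and is simply passed in here.

```python
# Markedness features mirroring the bullet list above.
# The identifiers are illustrative, not official PARSEME labels.
MARKED_FEATURES = [
    "non_finite_or_analytic_verb",
    "passive",
    "negation",
    "extraction",
    "adpositional_modifier",
    "complex_determiner",
    "coordination",
]


def neutralize(features, meaning_changing=frozenset()):
    """Return the marked features that survive in the neutral form.

    features: marked features observed in the corpus occurrence.
    meaning_changing: features whose removal an annotator judged to
        destroy the meaning of the MWE (hypothetical, human-provided).
    """
    remaining = set(features)
    changed = True
    while changed:  # apply the reasoning repeatedly until nothing changes
        changed = False
        for feature in MARKED_FEATURES:
            if feature in remaining and feature not in meaning_changing:
                remaining.remove(feature)
                changed = True
    return remaining


# "a bunch of decisions which were made by her": everything can be stripped.
print(neutralize({"passive", "extraction", "complex_determiner"}))

# "the die is cast": undoing the passive would lose the idiomatic meaning,
# so the passive form is itself the neutral form.
print(neutralize({"passive"}, meaning_changing={"passive"}))
```

An empty result corresponds to a fully unmarked neutral form such as she made decisions; a non-empty result corresponds to the cases listed above, where a more marked form is considered neutral.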

Neutral form in MWEs containing deverbal forms

We consider that the existence of deverbal nouns, masdars, adjectives and adverbs in MWEs does not imply syntactic marking. For instance, a wild goose chase, a decision maker and a heartbreaking story are neutral forms on their own. Consequently, they are considered nominal and adjectival MWEs, rather than verbal MWEs. Their connection to the corresponding VMWEs, if any (make a decision and break hearts, in the last two cases) is made explicit through their subcategories (NV.VID, NV.IVPC.full and AV.VID, respectively).

Other examples of such cases include:

  • Vergiss-mein-nicht forget-me-not - this is a neutral form on its own (nominal MWE); it is not deverbal since vergiss mein nicht is not a VMWE
    Wortbruch word-break a promise which has not been kept - this is a neutral form on its own (a deverbal nominal MWE deriving from ein Wort brechen to break a word to fail to keep a promise)
  • forget-me-not - this is a neutral form on its own (nominal MWE); it is not deverbal since forget me not is not a VMWE
    a wild goose chase - this is a neutral form on its own (nominal MWE); it is not deverbal since chase a wild goose is not a VMWE
    during take-off and landing - this is a neutral form on its own (a deverbal nominal MWE, here NV.IVPC.full, deriving from take off)
    a run-down apartment - this is a neutral form on its own (a deverbal adjectival MWE, here AV.IVPC.full, deriving from run down)
  • peut-être may-be maybe - this is a neutral form on its own (adverbial MWE); it is not deverbal since peut être is not a VMWE
    porte-feuille carry-sheets wallet - this is a neutral form on its own (nominal MWE); it is not deverbal since porter des feuilles is not a VMWE
    couru d'avance run in advance forgone conclusion - this is a neutral form on its own (adjectival MWE); it is not deverbal since courir d'avance is not a VMWE
    la prise en compte the fact of taking into account - this is a neutral form on its own (a deverbal nominal MWE, here NV.VID, deriving from prendre en compte to take into account)
    une mise à disposition putting into disposal the fact of making available - this is a neutral form on its own (a deverbal nominal MWE, here NV.LVC.cause, deriving from mettre à disposition to put into disposal to make available)
  • zrobić coś za Bóg-zapłać do something for a God-pay to do something for free - this is a neutral form on its own (nominal MWE); it is not deverbal since Bóg zapłaci God will pay is not a VMWE
    zabawa czyimś kosztem a play at someone else's expense - this is a neutral form on its own (a deverbal nominal MWE, here NV.VID, deriving from bawić się czyimś kosztem to enjoy oneself at someone else's expense)

Note that it is notoriously hard to distinguish deverbal nouns, adjectives and adverbs from verbal inflected forms like gerunds, participles, etc.

  • all hearts broken by her - verbal MWE (VMWE) vs. broken hearts - deverbal nominal MWE (NV.VID)
    she was breaking his heart - verbal MWE (VMWE) vs. heart-breaking story - deverbal adjectival MWE (AV.VID)
  • wszystkie serca, które zostały przez nią złamane all the hearts which were broken by her - verbal MWE (VMWE) vs. wszystkie złamane przez nią serca - deverbal nominal MWE (NV.VID)
    łamać serca to break hearts - verbal MWE (VMWE) vs. łamanie serc breaking of hearts - deverbal nominal MWE (NV.VID)

The underlying morpho-syntactic annotation might help in decision making.

Non-unicity of a neutral form

Note that a given MWE type often has more than one neutral form:
  • decisions made - the neutral form can be she makes decisions, I make decisions, she/I/we made decisions, etc.
  • serca, które zostały złamane - the neutral form can be złamała serca, złamał serca, złamali serca, etc.
Thus, a neutral form is not the same thing as a lemma, i.e. a unique form representative of a MWE.
In previous versions of these guidelines, a neutral form was called canonical form.
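
The difference between neutral forms and a lemma can be pictured as a difference in data shape. The following Python sketch is only an illustration under assumed names (neutral_forms, lemma, record and the key "make decisions" are hypothetical): one MWE type maps to a set of neutral forms, whereas a lemma is a single representative string.

```python
from collections import defaultdict

# Hypothetical store: one MWE type -> a set of equally valid neutral forms.
neutral_forms = defaultdict(set)


def record(mwe_type, neutral_form):
    """Register one neutral form for the given MWE type."""
    neutral_forms[mwe_type].add(neutral_form)


# Several neutral forms for the occurrence "decisions made".
record("make decisions", "she makes decisions")
record("make decisions", "I make decisions")
record("make decisions", "we made decisions")

# A lemma, by contrast, is one unique representative per MWE type.
lemma = {"make decisions": "make decisions"}

print(sorted(neutral_forms["make decisions"]))
print(lemma["make decisions"])
```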
