Project outcomes

The goal of our project is to develop linguistic resources (lexicons, corpora, annotation guidelines) and software (parsers, MWE identifiers and linkers). Some of these are still under development and will be published here once they are ready.

Software

MWE identification software

Tools which annotate multiword expressions automatically in running text, developed within the project or in close collaboration with PARSEME-FR project members.

Some of these tools can be tested online on the PARSEME-FR demonstrator.

Other software

Language resources and datasets

Verbal MWE-annotated corpora of the PARSEME shared tasks

The datasets of the PARSEME shared tasks cover 18-20 languages, including French, and can be downloaded from:

Full-MWE annotated Sequoia treebank

MWE and coreference corpus

Manually annotated web sample

Multilingual corpus of literal occurrences of multiword expressions

French metagrammar with verbal MWEs

Project-internal resources