Agence Nationale de la Recherche



This shows you the differences between two versions of the page.

wp1 [2016/02/17 17:52]
wp1 [2017/09/18 15:46]
matthieu.constant some words on WP1
Line 1: Line 1:
-__Work Package 2__: **Multiword Expression Lexicon**
-
-  * **Partners in charge**: LI (Agata Savary) and LIGM (Mathieu Constant)
-  * **Partners involved**: LI, LIF, LIGM
-  * **Objectives**: build a unified and enriched MWE lexicon, including morphological, distributional, syntactic and semantic information; multiword NEs will get special treatment as they will be associated with pragmatic information (i.e. linking with the LOD). The encoded features will be of varying nature, either symbolic or numeric.
-  * **Final product**:
-    * (FP.2.1) a new lexical resource, distributed under an open license, in a standard format
-    * (FP.2.2) a tool to project an MWE lexicon on treebanks
+__Work Package 1__: **MWE representation and annotation**
+  * **Partners in charge**: LLF (Marie Candito) and ATILF (Mathieu Constant)
+  * **Partners involved**: LLF, LI, LIF, LIFO, ATILF
+  * **Objectives**: Select the set of criteria to be used in the project for MWE identification, classification, and properties. Produce a gold standard corpus.
+  * **Final products**:
+    * **FP.1.1**: A state-of-the-art report on MWE representation
+    * **FP.1.2**: Guidelines indicating the criteria to identify and classify MWEs, as well as the list of properties to be encoded in the lexicon and an annotation scheme
+    * **FP.1.3**: A gold standard corpus manually annotated by experts, including deep MWE annotation, together with the annotation guidelines
   * **Subtasks**:
-    * WP 2.1 [[WP2.1|Compilation and analysis of existing lexicons]]
-    * WP 2.2 Construction of a unified framework;
-    * WP 2.3 Enrichment of the lexicon
-    * WP 2.4: Interlinking of MWEs with the Linked Open Data
-    * WP 2.5: Converting the lexicon to a standard export format
-    * WP 2.6: Projection on treebanks
+    * **WP 1.1**: State of the art on MWEs in language resources
+    * **WP 1.2**: Setup of formal criteria for MWE identification and classification
+    * **WP 1.3**: A gold standard
+
+----
+**Results**
+In the framework of the [[|PARSEME Shared Task on identification of verbal MWEs]], Agata Savary, Carlos Ramisch and Marie Candito participated in the writing of the annotation guidelines (Savary et al., MWE 2017). Marie Candito, Mathieu Constant, Carlos Ramisch, Agata Savary, Yannick Parmentier, Caroline Pasquer and Jean-Yves Antoine produced the French dataset (Candito et al., TALN 2017). This dataset, composed of the Sequoia corpus and the French UD treebank (about 19,000 sentences), includes 5,000 annotated verbal MWEs.
+**Work in progress**
+The annotation of the Sequoia corpus is now being extended to all MWEs, using annotation guidelines that are still under construction. The release of the data is planned for the end of 2017.
wp1.txt · Last modified: 2017/09/18 15:46 by matthieu.constant