Differences

This shows you the differences between two versions of the page.

--- 2018-lifat-m2-1 [2018/09/21 14:11] agata.savary
+++ 2018-lifat-m2-1 [2018/09/21 14:42] agata.savary
@@ Line 1: / Line 1: @@
-====== Lexicon-to-corpus multiword expression browser  ======
+====== Verbal Multiword Expression Discovery in French Based on Seen Data and Distributional Semantics  ======
   * **Domain:** Natural Language Processing
@@ Line 27: / Line 27: @@
 The objectives of this internship are to exploit word embeddings for discovery of new MWEs based on their semantic proximity to the previously seen MWEs, contained in a lexicon or in an annotated corpus (resources of both types belong to the outcomes of the PARSEME-FR project). The discovery should lead to (semi-)automatic enrichment of these initial resources. Two stages are to be considered:
-  * (i) candidates for new MWEs are generated by replacing individual components of known MWEs by their semantically close words, established notably via word embeddings;
+  * candidates for new MWEs are generated by replacing individual components of known MWEs by their semantically close words, established notably via word embeddings;
-  * (ii) the candidates generated in this way are filtered based on their corpus frequency or contexts of occurrence; for instance, adjectives //chaud/froid// ‘hot/cold’ tend to co-occur more frequently with //**prendre** un **bain**/une **douche**// ‘to take a bath/shower’ than with //**prendre** une **baignoire**// (spacieuse/solide...) ‘to take a (spacious/solid) bathtub’.
+  * the candidates generated in this way are filtered based on their corpus frequency or contexts of occurrence; for instance, adjectives //chaud/froid// ‘hot/cold’ tend to co-occur more frequently with //**prendre** un **bain**/une **douche**// ‘to take a bath/shower’ than with //**prendre** une **baignoire**// (spacieuse/solide...) ‘to take a (spacious/solid) bathtub’.
 Possible extensions of the objectives:
-  * (iii) integrating MWE discovery with MWE identification in //varIDE//
+  * integrating MWE discovery with MWE identification in //varIDE//
-  * (iv)  coupling word embedding-based lexical replacement with semantic resources such as WordNet.
+  * coupling word embedding-based lexical replacement with semantic resources such as WordNet.
@@ Line 56: / Line 56: @@
 ===== References =====
+  * Baldwin, T. and Kim, S. N. (2010) [[https://people.eng.unimelb.edu.au/tbaldwin/pubs/handbook2009.pdf|Multiword Expressions]], in Nitin Indurkhya and Fred J. Damerau (eds.)  Handbook of Natural Language Processing, Second Edition, CRC Press, Boca Raton, USA, pp. 267-292.
-Marie Candito, Mathieu Constant, Carlos Ramisch, Agata Savary, Yannick Parmentier, Caroline Pasquer, and Jean-Yves Antoine. Annotation d’expressions polylexicales verbales en français. In Jean-Yves Antoine Iris Eshkol, editor, 24e conférence sur le Traitement Automatique des Langues Naturelles (TALN), Actes de TALN, volume 2 : articles courts, pages 1–9, Orléans, France, 06 2017.
+  * Farahmand, M., Henderson, J. (2016) [[http://www.aclweb.org/anthology/W16-1809|Modeling the non-substitutability of multiword expressions with distributional semantics and a loglinear model]], in Proceedings of the ACL 2016 Workshop on MWEs, Berlin, pp. 61-66.
+  * Afsaneh Fazly, Paul Cook and Suzanne Stevenson. 2009. [[http://www.aclweb.org/anthology/J09-1005|Unsupervised type and token identification of idiomatic expressions]]. Computational Linguistics 35(1):61–103
-Maurice Gross. Lexicon-grammar and the syntactic analysis of French. In Proc. of COLING-ACL 1984, pages 275–282, Stanford, CA, 1984. Association for Computational Linguistics.
+  * Peng, J., Aharodnik, K., Feldman, A. (2018) A Distributional Semantics Model for Idiom Detection - The Case of English and Russian. Special Session on Natural Language Processing in Artificial Intelligence, pp. 675-682.
+  * Pasquer, C., Savary, A., Antoine, J.-Y., Ramisch, C. (2018b) [[http://aclweb.org/anthology/C18-1219|If you’ve seen some, you’ve seen them all: Identifying variants of multiword expressions]], in the Proceedings of the 27th International Conference on Computational Linguistics (COLING-18), Santa Fe, USA.
-Agata Savary, Carlos Ramisch, Silvio Cordeiro, Federico Sangati, Veronika Vincze, Behrang QasemiZadeh, Marie Candito, Fabienne Cap, Voula Giouli, Ivelina Stoyanova, and Antoine Doucet. The PARSEME shared task on automatic identification of verbal multiword expressions. In Proc. of EACL 2017 Workshop on MWEs, pages 31–47, Valencia, April 2017.
+  * Ramisch, C., Cordeiro, S., Savary, A., Vincze, V. et al. (2018) [[http://aclweb.org/anthology/W18-4925|Edition 1.1 of the PARSEME Shared Task on Automatic Identification of Verbal Multiword Expressions]], in Proceedings of the Joint Workshop on Linguistic Annotation, Multiword Expressions and Constructions (LAW-MWE-CxG-2018), Santa Fe, USA, pp. 222-240.
+  * Savary, A., Jacquemin, Ch. (2003): [[https://link.springer.com/content/pdf/10.1007%2F978-3-540-45115-0_6.pdf|Reducing Information Variation in Text]], in Renals, S., Grefenstette, G. (eds.) Text- and Speech-Triggered Information Access, Proceedings of TESTIA 2000, 8th ELSNET European Summer School on Language and Speech Communication, Lecture Notes in Artificial Intelligence 2705, Springer Verlag, pp. 145-181.
 ------------------------------
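
The two stages listed in the internship objectives (candidate generation by lexical replacement, then corpus-based filtering) could be sketched roughly as follows. This is only an illustrative toy, not the project's method: the vectors in `EMBEDDINGS`, the counts in `CORPUS_COUNTS`, the function names and the `min_count` threshold are all hypothetical, and a plain frequency threshold stands in for the richer context-based filtering (e.g. adjective co-occurrence) mentioned in the text.

```python
import math

# Hypothetical 3-dimensional "embeddings" for a few French lemmas.
# Real experiments would load pretrained vectors instead.
EMBEDDINGS = {
    "bain": (0.9, 0.1, 0.0),
    "douche": (0.85, 0.2, 0.05),
    "baignoire": (0.7, 0.1, 0.5),
    "prendre": (0.1, 0.2, 0.9),
    "saisir": (0.2, 0.3, 0.8),
    "faire": (0.0, 0.4, 0.8),
}

# Hypothetical corpus counts of (verb, noun) lemma pairs.
CORPUS_COUNTS = {
    ("prendre", "douche"): 40,
    ("prendre", "baignoire"): 1,
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest(word, k):
    """Return the k lemmas most similar to `word` (excluding itself)."""
    target = EMBEDDINGS[word]
    scored = [(cosine(target, vec), w) for w, vec in EMBEDDINGS.items() if w != word]
    scored.sort(reverse=True)
    return [w for _, w in scored[:k]]


def generate_candidates(mwe, k=2):
    """Stage 1: replace each component of a known MWE by its k nearest neighbours."""
    candidates = set()
    for i, component in enumerate(mwe):
        for neighbour in nearest(component, k):
            candidates.add(mwe[:i] + (neighbour,) + mwe[i + 1:])
    return candidates


def filter_candidates(candidates, known, min_count=5):
    """Stage 2: keep unseen candidates that are frequent enough in the corpus."""
    return sorted(
        c for c in candidates
        if c not in known and CORPUS_COUNTS.get(c, 0) >= min_count
    )


known = {("prendre", "bain")}
candidates = generate_candidates(("prendre", "bain"))
discovered = filter_candidates(candidates, known)
print(discovered)
```

With these toy values, //prendre une baignoire// is generated as a candidate but rejected by the frequency filter, while //prendre une douche// survives, mirroring the example given in the objectives.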

Syntactic Parsing and Multiword Expressions in French
