This shows you the differences between two versions of the page.
2018-lifat-m2-1 [2018/09/21 14:00] agata.savary |
2018-lifat-m2-1 [2018/09/21 14:42] (current) agata.savary |
||
---|---|---|---|
Line 1: | Line 1: | ||
- | ====== | + | ====== |
* **Domain:** Natural Language Processing | * **Domain:** Natural Language Processing | ||
Line 19: | Line 19: | ||
Automatic identification of MWEs in 19 languages was addressed by the PARSEME shared task (Ramisch et al., 2018), in which the BdTln team participated with the VarIDE system (Pasquer et al., 2018a). The results of the shared task show that identifying **unseen MWEs** (i.e. those MWEs which do not occur in the training data) is particularly challenging. Thus, identification should, ideally, exploit not only annotated corpora but also MWE lexicons and MWE discovery methods. | Automatic identification of MWEs in 19 languages was addressed by the PARSEME shared task (Ramisch et al., 2018), in which the BdTln team participated with the VarIDE system (Pasquer et al., 2018a). The results of the shared task show that identifying **unseen MWEs** (i.e. those MWEs which do not occur in the training data) is particularly challenging. Thus, identification should, ideally, exploit not only annotated corpora but also MWE lexicons and MWE discovery methods. | ||
+ | ===== Topics ===== | ||
+ | This internship investigates how MWE discovery could benefit from previously seen data, rather than being performed from scratch. The hypothesis to be tested is that new (unseen) MWEs of certain types can be discovered through their semantic similarity to known (previously seen) MWEs. For instance, knowing that //**haute température**// | ||
+ | |||
+ | To perform lexical substitution, | ||
===== Objectives ===== | ===== Objectives ===== | ||
- | The main objective | + | The objectives |
- | The internship can be divided in the following tasks: | + | * candidates for new MWEs are generated by replacing individual components |
- | * Study the linguistic properties | + | * the candidates generated |
- | * Develop a tool to automatically link annotated verbal MWEs and their corresponding entries | + | |
- | * Develop | + | |
- | * Optionally, extend the work to all multiword expressions, | + | |
- | ===== Profile ==== | + | Possible extensions |
- | * Master 1 or Master 2 in computational linguistics or computer science, | + | |
- | * Good knowledge | + | |
- | * Interests in linguistics and familiarity with language technology, | + | |
- | * Programming skills (python, web programming). | + | |
- | ===== Important dates ==== | + | |
- | * Application deadline: 15 January 2018 (or until filled) | + | * coupling word embedding-based lexical replacement with semantic resources such as WordNet. |
- | * Notification: | + | |
- | | + | |
- | * Position ends: July-August 2018 | + | |
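The candidate-generation step named in the objectives (replacing individual components of a known MWE with semantically similar words) could be sketched roughly as follows. This is only an illustration under stated assumptions: the three-dimensional vectors, the substitute words, the similarity threshold, and the function names ''nearest_neighbours'' and ''generate_candidates'' are all invented here and are not taken from the announcement. A real setup would load pretrained word embeddings instead of the toy dictionary.

```python
import math

# Toy 3-dimensional "embeddings", invented for illustration only.
# A real experiment would load pretrained vectors instead.
TOY_EMBEDDINGS = {
    "haute": (0.9, 0.1, 0.0),
    "élevée": (0.85, 0.15, 0.05),
    "basse": (0.8, 0.2, 0.1),
    "température": (0.1, 0.9, 0.2),
    "pression": (0.15, 0.85, 0.25),
    "chanson": (0.0, 0.1, 0.95),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_neighbours(word, embeddings, k=2, threshold=0.9):
    """Return up to k words whose similarity to `word` is at least `threshold`."""
    if word not in embeddings:
        return []
    scored = [
        (cosine(embeddings[word], vec), other)
        for other, vec in embeddings.items()
        if other != word
    ]
    scored = [(score, other) for score, other in scored if score >= threshold]
    scored.sort(reverse=True)  # most similar first
    return [other for _, other in scored[:k]]


def generate_candidates(seen_mwe, embeddings, k=2, threshold=0.9):
    """Replace one component at a time by each of its nearest neighbours."""
    candidates = []
    for i, component in enumerate(seen_mwe):
        for substitute in nearest_neighbours(component, embeddings, k, threshold):
            candidates.append(seen_mwe[:i] + (substitute,) + seen_mwe[i + 1:])
    return candidates


# A previously seen MWE, represented as a tuple of its components.
candidates = generate_candidates(("haute", "température"), TOY_EMBEDDINGS)
for candidate in candidates:
    print(" ".join(candidate))
```

The sketch covers generation only. It overgenerates by design (some outputs are not well-formed expressions), so the generated candidates still need a separate validation step.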
- | ===== References | + | ===== Candidate' |
+ | * 2nd-year master student in computational linguistics, | ||
+ | * Interests in linguistics and familiarity with language technology | ||
+ | * Good knowledge of French | ||
+ | * Good programming skills, preferably in Python | ||
- | Marie Candito, Mathieu Constant, Carlos Ramisch, Agata Savary, Yannick Parmentier, Caroline Pasquer, and Jean-Yves Antoine. Annotation d’expressions polylexicales verbales en français. In Jean-Yves Antoine Iris Eshkol, editor, 24e conférence sur le Traitement Automatique des Langues Naturelles | + | ===== Important dates =====
+ | * Application deadline: 15 December 2018 (or until filled) | ||
+ | * Notification: 15 January 2019 | ||
+ | * Position starts: around February-March 2019 | ||
+ | * Position ends: around July-August 2019 | ||
- | Maurice Gross. Lexicon-grammar | + | ===== How to apply ===== |
- | + | Send your CV and a cover letter to: | |
- | Agata Savary, | + | * Caroline Pasquer: first.last@etu.univ-tours.fr |
+ | | ||
+ | * Carlos Ramisch: first.last@lis-lab.fr | ||
+ | ===== References =====
+ | * Baldwin, T. and Kim, S. N. (2010) [[https:// | ||
+ | * Farahmand, M., Henderson, J., [[http:// | ||
+ | * Fazly, A., Cook, P., Stevenson, S. (2009) [[http:// | ||
+ | * Peng, J., Aharodnik, K., Feldman, A. (2018). A Distributional Semantics Model for Idiom Detection - The Case of English and Russian. Special Session on Natural Language Processing in Artificial Intelligence, | ||
+ | * Pasquer, C., Savary, A., Antoine, J.-Y., Ramisch, C. (2018b) [[http:// | ||
+ | * Ramisch C., Cordeiro, S., Savary, A., Vincze, V. et al. (2018) [[http:// | ||
+ | * Savary, A., Jacquemin, Ch. (2003): [[https:// | ||
------------------------------ | ------------------------------ | ||
- | ===== How to apply ===== | ||
- | Applications should be sent to Mathieu.Constant@univ-lorraine.fr. They should include a CV, a cover letter, and possibly support letters by teacher. |