This shows you the differences between two versions of the page.
2018-lifat-m2-1 [2018/09/21 14:40] agata.savary |
2018-lifat-m2-1 [2018/09/21 14:42] (current) agata.savary |
||
---|---|---|---|
Line 27: | Line 27: | ||
The objectives of this internship are to exploit word embeddings to discover new MWEs based on their semantic proximity to previously seen MWEs contained in a lexicon or in an annotated corpus (both types of resources are among the outcomes of the PARSEME-FR project). The discovery should lead to (semi-)automatic enrichment of these initial resources. Two stages are to be considered: | The objectives of this internship are to exploit word embeddings to discover new MWEs based on their semantic proximity to previously seen MWEs contained in a lexicon or in an annotated corpus (both types of resources are among the outcomes of the PARSEME-FR project). The discovery should lead to (semi-)automatic enrichment of these initial resources. Two stages are to be considered: | ||
- | * (i) candidates for new MWEs are generated by replacing individual components of known MWEs by their semantically close words, established notably via word embeddings; | + | * candidates for new MWEs are generated by replacing individual components of known MWEs by their semantically close words, established notably via word embeddings; |
- | * (ii) the candidates generated in this way are filtered based on their corpus frequency or contexts of occurrence; for instance, adjectives // | + | * the candidates generated in this way are filtered based on their corpus frequency or contexts of occurrence; for instance, adjectives // |
Possible extensions of the objectives: | Possible extensions of the objectives: | ||
- | * (iii) integrating MWE discovery with MWE identification in // | + | * integrating MWE discovery with MWE identification in // |
- | * (iv) | + | * coupling word embedding-based lexical replacement with semantic resources such as WordNet. |
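
The two stages described above (candidate generation by lexical replacement, then corpus-based filtering) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the method prescribed by the internship description: the toy 2-dimensional vectors in `EMBEDDINGS`, the example expression `("strong", "tea")`, the helper names and the raw n-gram count threshold are all hypothetical. A real setup would load pre-trained embeddings and use a large corpus.

```python
from collections import Counter
from math import sqrt

# Toy 2-dimensional embeddings; the values are invented for illustration only.
EMBEDDINGS = {
    "strong": (0.9, 0.1),
    "powerful": (0.85, 0.2),
    "heavy": (0.7, 0.6),
    "tea": (0.1, 0.9),
    "coffee": (0.15, 0.85),
    "rain": (0.5, 0.5),
}


def cosine(u, v):
    """Cosine similarity between two vectors given as tuples."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def neighbours(word, k=2):
    """Return the k words closest to `word` in the toy embedding space."""
    if word not in EMBEDDINGS:
        return []
    scored = [
        (cosine(EMBEDDINGS[word], vec), other)
        for other, vec in EMBEDDINGS.items()
        if other != word
    ]
    scored.sort(reverse=True)
    return [other for _, other in scored[:k]]


def generate_candidates(mwe, k=2):
    """Stage 1: replace one component at a time by each of its k neighbours."""
    candidates = set()
    for i, component in enumerate(mwe):
        for substitute in neighbours(component, k):
            candidates.add(mwe[:i] + (substitute,) + mwe[i + 1:])
    return candidates


def filter_by_frequency(candidates, corpus_tokens, min_count=1):
    """Stage 2: keep candidates seen at least min_count times as contiguous n-grams."""
    counts = Counter()
    for n in {len(c) for c in candidates}:
        for i in range(len(corpus_tokens) - n + 1):
            counts[tuple(corpus_tokens[i:i + n])] += 1
    return {c for c in candidates if counts[c] >= min_count}


known_mwe = ("strong", "tea")  # hypothetical seed expression
corpus = "he drinks strong coffee and hates heavy rain".split()
candidates = generate_candidates(known_mwe)
kept = filter_by_frequency(candidates, corpus)
```

Counting contiguous n-grams is the simplest possible filter. It would miss discontinuous or inflected occurrences, which is one reason the description also mentions filtering on contexts of occurrence.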