Alexina

Morphological (and sometimes syntactic) lexicons (including the Lefff)

Description

Alexina is ALMAnaCH's framework for the acquisition and modeling of morphological and syntactic lexical information. The first and most advanced lexical resource developed in this framework is the Lefff, a morphological and syntactic lexicon for French.

Alexina lexicons rely on a two-level architecture:

  • The intensional lexicon, which describes for each entry its lemma (canonical form + inflection table) as well as deep syntax information (deep sub-categorisation frame + possible realisations + possible restructurations); it is associated with a morphological grammar and a restructuration grammar.
  • The extensional lexicon, automatically built by compiling the intensional lexicon. This generation process includes an inflection step, which depends on the inflection class associated with the intensional entry, then a step that constructs the different syntactic structures (one for each relevant restructuration) associated with each inflected form. Syntactic information may vary from one form to another, in particular for infinitive and participle forms, and from one restructuration to another.
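
As a rough illustration of the two-level architecture described above, the following Python sketch compiles one toy intensional entry into extensional entries. It is not the alexina-tools implementation and does not use the actual Alexina file formats: the class names, the `MORPHOLOGY` and `RESTRUCTURATIONS` tables, the `stem` field, and the frame notation are all assumptions made for this example only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntensionalEntry:
    lemma: str             # canonical form, e.g. "manger"
    stem: str              # hypothetical field: what the toy suffixes attach to
    inflection_class: str  # key into the toy morphological grammar
    frame: tuple           # toy deep subcategorisation frame


@dataclass(frozen=True)
class ExtensionalEntry:
    form: str
    lemma: str
    tag: str
    restructuration: str
    frame: tuple


# Toy morphological grammar (assumption): inflection class -> (suffix, tag) pairs.
MORPHOLOGY = {
    "v-er": [("er", "infinitive"), ("e", "present-3sg"), ("é", "past-participle")],
}


def passive(frame):
    # Toy transform: deep object surfaces as subject, deep subject as optional agent.
    mapping = {"Suj": "(Obl-agent)", "Obj": "Suj"}
    return tuple(mapping.get(function, function) for function in frame)


# Toy restructuration grammar (assumption): (name, tags it applies to, transform).
# A tag set of None means the restructuration applies to every inflected form.
RESTRUCTURATIONS = [
    ("default", None, lambda frame: frame),
    ("passive", {"past-participle"}, passive),
]


def compile_entry(entry):
    """Inflection step, then one syntactic structure per relevant restructuration."""
    result = []
    for suffix, tag in MORPHOLOGY[entry.inflection_class]:
        form = entry.stem + suffix
        for name, tags, transform in RESTRUCTURATIONS:
            if tags is None or tag in tags:
                result.append(
                    ExtensionalEntry(form, entry.lemma, tag, name, transform(entry.frame))
                )
    return result


entry = IntensionalEntry("manger", "mang", "v-er", ("Suj", "Obj"))
for extensional in compile_entry(entry):
    print(extensional)
```

In this sketch only the past participle receives a second, passive structure, which mirrors the point that syntactic information can differ from one inflected form to another and from one restructuration to another.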

Lexical information in Alexina lexicons comes both from manual work and from a variety of pre-existing, freely available sources. The details differ from one lexicon to another and are described in the corresponding papers.

Regarding the Lefff, the main sources of lexical information include:

  • automatic acquisition (with manual validation) using statistical techniques applied to raw corpora (Clément, Sagot and Lang 2004; Sagot 2005),
  • automatic acquisition (with manual validation) of specific syntactic information (Sagot 2006 (PhD dissertation), ch. 7),
  • manual correction and extension guided by automatic techniques, such as error mining in parsing results (Sagot and de La Clergerie, 2006),
  • comparison with other resources:
    • integration of information extracted from Lexicon-Grammar Tables: impersonal constructions, pronominal constructions, adverbs in -ment, several classes of frozen verbal expressions (Sagot and Danlos 2006; Danlos and Sagot 2007; Sagot and Danlos 2007; Sagot and Fort 2007);
    • merging with the whole Dicovalence lexicon, with manual validation of all entries for the 100 most frequent lemmas and of all entries for which the merging output contained more entries than both the Lefff and Dicovalence;
    • several nouns and adjectives have their origin in the French Multext lexicon (Véronis 1998).
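
The selection rule used for manual validation after the Dicovalence merge can be sketched as follows. This is only one reading of the criterion stated above ("more entries than both" is interpreted as "more than the larger of the two counts"), and the function name, the input layout, and the sample counts are hypothetical; they are not taken from the Lefff tooling or data.

```python
def select_for_validation(entry_counts, frequent_lemmas):
    """Return the lemmas whose merged entries would be manually validated.

    entry_counts maps a lemma to (n_lefff, n_dicovalence, n_merged), the
    number of entries for that lemma in each source and in the merge output.
    frequent_lemmas stands in for the set of the 100 most frequent lemmas.
    """
    selected = []
    for lemma in sorted(entry_counts):
        n_lefff, n_dicovalence, n_merged = entry_counts[lemma]
        merge_grew = n_merged > max(n_lefff, n_dicovalence)
        if lemma in frequent_lemmas or merge_grew:
            selected.append(lemma)
    return selected


# Made-up counts, for illustration only.
counts = {
    "faire": (5, 6, 6),     # frequent lemma: always validated
    "abaisser": (2, 2, 3),  # merge produced more entries than either source
    "zézayer": (1, 1, 1),   # neither condition holds
}
print(select_for_validation(counts, frequent_lemmas={"faire"}))
```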

Download

For technical reasons (migration from GForge to GitLab), direct download links to lexicons other than the Lefff and to older versions of the Lefff are temporarily unavailable.

Lefff (Cecill-L licence)

Latest release (3.4)

  • Intensional Lefff: compiling it into an extensional lexicon requires alexina-tools to be installed first
  • If you do not want to recompile the Lefff:

Citation and publication(s)

    If you use this work, please cite the following:

    The Lefff, a freely available and large-coverage morphological and syntactic lexicon for French

    Benoît Sagot. 2010. 7th international conference on Language Resources and Evaluation (LREC 2010). Valletta, Malta.
    @inproceedings{sagot_The-Lefff-a-freely_2010,
     address = {Valletta, Malta},
     author = {Sagot, Beno{\^i}t},
     title = {{The Lefff, a freely available and large-coverage morphological and syntactic lexicon for French}},
     year = {2010},
     booktitle = {{7th international conference on Language Resources and Evaluation (LREC 2010)}},
     url = {https://hal.inria.fr/inria-00521242},
     pdf = {https://hal.inria.fr/inria-00521242/file/lrec10lefff.pdf},
    }

    Contact

    For more information, or if you have any questions, please contact Benoît Sagot:

    Benoit.Sagot[at]inria.fr