2 open positions!

ALMAnaCH project-team

Inria Paris' NLP research team

Research in Artificial Intelligence bringing together Natural Language Processing and Computational Humanities.
News
6 Nov 2024
🏆 Thibault Clérice has won the 2024 Open Science PhD Award
Thibault Clérice has won the 2024 Open Science PhD Award for his PhD thesis on the detection of isotopies in Latin texts and the production of open corpora and open tools, which he carried out at Université Jean Moulin before joining ALMAnaCH.
1 Nov 2024
🤝 Gabrielle Le Bellier just joined ALMAnaCH as a PhD student
Gabrielle will be working on controlled generation for bias mitigation and cultural awareness in conversational language models.
1 Nov 2024
🤝 Célia Nouri just joined ALMAnaCH as a PhD student
Célia will be working on the automatic analysis of social media conversations.
1 Oct 2024
🤝 Panagiotis Tsolakis just joined ALMAnaCH as a research engineer
Panagiotis will be working on the extraction of resources and the development of translation infrastructure in the context of the MaTOS project.
1 Oct 2024
🤝 Oriane Nédey just started a PhD at ALMAnaCH
Oriane will be working on low-resource machine translation for dialect continuums, with a focus on languages of France (e.g. Occitan, Picard, Alsatian), within the COLaF project.
1 Oct 2024
🤝 Aina Gari Soler just joined ALMAnaCH as a post-doctoral researcher
Aina will be working on the dynamics of word meaning in dialog: how speakers understand, negotiate and adapt their word usage in interaction.
1 Oct 2024
🤝 Yanzhu Guo just joined ALMAnaCH for the last semester of her thesis
Yanzhu will be working on the automatic evaluation of generated utterances in conversations.
18 Sept 2024
🎓 Djamé Seddah's HDR defence
Djamé defended his HDR (Habilitation to Supervise Research) entitled "From French Statistical Parsing to Low-Resource Language Modeling: a Research Journey".
2 Sept 2024
🤝 Marine Carpuat is joining us at ALMAnaCH this year.
Marine is on sabbatical from the University of Maryland and will be staying with us until the end of the academic year.
1 Sept 2024
🤝 Thibault Clérice is now a permanent member of the team
Thibault's research areas include digital humanities, computational humanities, and natural language processing for ancient languages.
1 Aug 2024
🤝 Djamé Seddah is now an Inria researcher
Djamé, who used to be a Maître de conférences at Sorbonne Université on secondment at Inria within ALMAnaCH, has become an Inria "chargé de recherche" (research scientist) within the team.
5 Jul 2024
🎓 Chadi Helwe's PhD defence
Chadi defended his PhD thesis, supervised by Fabian Suchanek and Chloé Clavel, on evaluating and improving the reasoning abilities of language models.
1 Jul 2024
Inria Paris has moved!
The team can now be found at the new Inria Paris offices at 48 rue Barrault in the 13th arrondissement.
24 Jun 2024
🏆 ILLC best student paper award at LREC-COLING 2024.
Niyati Bafna, Cristina España-Bonet, Josef van Genabith, Benoît Sagot and Rachel Bawden were awarded the prize for their paper entitled “When Your Cousin Has the Right Connections: Unsupervised Bilingual Lexicon Induction for Related Data-Imbalanced Languages”.
1 May 2024
🤝 Malik Marmonier just joined ALMAnaCH as a research engineer
Malik will be working on translating with large language models in low-resource scenarios and for unseen languages in the context of the TraLaLaM ANR project.
9 Apr 2024
🎓 Tú Anh Nguyễn's PhD defence
Tú Anh defended his PhD thesis, supervised by Benoît Sagot and Emmanuel Dupoux (META), on Spoken Language Modeling from Raw Audio.
14 Mar 2024
🎓 Paul-Ambroise Duquenne's PhD defence
Paul-Ambroise defended his PhD thesis, supervised by Benoît Sagot and Holger Schwenk (META), on Sentence Embeddings for Massively Multilingual Speech and Text Processing.
26 Feb 2024
🤝 Pierre Chambon just joined ALMAnaCH as a PhD student
Pierre's CIFRE PhD, jointly supervised by Benoît Sagot with META (Gabriel Synnaeve and Baptiste Rozière), will be dedicated to code generation with large language models.
1 Jan 2024
🤝 Sinem Demirkan just joined ALMAnaCH as a research engineer
Sinem will be working under the supervision of Justine Cassell on hyperscanning pairs of children in order to better understand the neural correlates of rapport.
1 Jan 2024
🤝 Hao Wang just joined ALMAnaCH as a research engineer
Hao will be working under the supervision of Biswesh Mohapatra on benchmarking conversational grounding in LLMs.
1 Jan 2024
🤝 Anh Ha Ngo just joined ALMAnaCH as a PhD student
Anh will be working under the supervision of Chloé Clavel and Catherine Pelachaud (ISIR) on multimodal models, conversation repair and human-agent interaction.
15 Dec 2023
🎓 Alafate Abulimiti's PhD defence
Alafate defended his PhD thesis, supervised by Justine Cassell and Chloé Clavel, on the role of socio-conversational strategies in task-oriented dialogues in the case of peer-tutoring interactions, with a focus on off-task talk and hedges.
1 Dec 2023
🤝 Oriane Nédey just joined ALMAnaCH as a research engineer
She will be working under the supervision of Thibault Clérice, Rachel Bawden and Benoît Sagot on data collection and translation models for a regional language of France.
1 Dec 2023
🤝 Cecilia Graiff just joined ALMAnaCH as a research engineer
She will be working on named entity disambiguation for the National Archives of France in the context of the NER4Archives project, supervised by Laurent Romary.
30 Nov 2023
🏆 Benoît Sagot is this year's holder of the annual chair position in informatics and digital sciences at the Collège de France
His inaugural lecture will take place on 30 November at 6pm, followed over the subsequent weeks by 8 classes open to the public, each one accompanied by a seminar from a specialist in the field.
29 Nov 2023
🏆 Thibault Clérice and Alix Chagué have won the Open Science Young Researchers prize for HTR-United
HTR-United is a catalogue of metadata for transcription and segmentation datasets.
1 Nov 2023
🤝 Justine Cassell and her research team, Articulabo, just joined ALMAnaCH
Her research focuses on Human-Computer interaction, dialogue systems, and embodied conversational agents.
1 Nov 2023
🤝 Rasul Dent just joined ALMAnaCH as a PhD student
His research will focus on fine-grained linguistic classification in the context of both inference efficiency and accuracy, with a focus on languages of France and French-based Creoles.
1 Nov 2023
🤝 Armel Zebaze just joined ALMAnaCH as a PhD student
His research will focus on the use of analogy for multilingual natural language processing.
1 Nov 2023
🤝 Christelle Rosello just joined ALMAnaCH as administrative assistant, taking over from Meriem Guemair
We look forward to working with Christelle and thank Meriem for the 5 great years we spent together.
1 Nov 2023
🤝 Marius Le Chapelier just joined ALMAnaCH as a research engineer
He will be working on further developing SARA (the Socially Aware Robot Assistant), an embodied dialogue system that will be able to build social bonds with its users in such a way as to improve its performance.
19 Oct 2023
🎓 Lionel Tadonfouet Tadjou's PhD defence
Lionel defended his PhD thesis, supervised by Laurent Romary and Éric de la Clergerie, on the construction of coherent discussion threads based on conversations from professional communication and collaboration tools.
3 Oct 2023
🎓 José Carlos Rosales Núñez's PhD defence
José defended his PhD thesis, supervised by Guillaume Wisniewski and Djamé Seddah, on the machine translation of user-generated content: an evaluation of neural translation systems in zero-shot settings.
1 Oct 2023
🤝 Nicolas Dahan just joined ALMAnaCH as a PhD student
He will be working under the supervision of Rachel Bawden and François Yvon (CNRS) on machine translation evaluation for academic documents in the context of the ANR project MaTOS.
1 Oct 2023
🤝 Ziqian Peng just joined ALMAnaCH as a PhD student (recruited at the CNRS)
She will be working under the supervision of François Yvon (CNRS) and Rachel Bawden on document-level machine translation for academic texts in the context of the ANR project MaTOS.
1 Oct 2023
🤝 Juliette Janès just joined ALMAnaCH as a research engineer
She will be working under the supervision of Thibault Clérice and Benoît Sagot on the recovery, encoding, maintenance, and publication of textual data on French and other languages of France produced within the framework of the DEFI COLaF.
1 Oct 2023
🤝 Sarah Bénière just joined ALMAnaCH as a research engineer
She will be working on automatic analysis of digitized sales catalogs as part of the DataCatalogue project with Hugo Scheithauer and under the supervision of Laurent Romary.
1 Oct 2023
🤝 Samuel Scalbert just joined ALMAnaCH as a research engineer
He will be working on the detection of software in HAL articles using GROBID and Softcite in the context of the GrapOS project.
1 Oct 2023
🤝 Chloé Clavel just joined ALMAnaCH as Senior Researcher
Her research will focus on neural models for analysing and generating socio-emotional behaviour in interactions, with the aim of making these models more transparent and controllable.
26 Sept 2023
🎓 Robin Algayres's PhD defence
Robin Algayres defended his PhD thesis, supervised by Emmanuel Dupoux and Benoît Sagot, on unsupervised word discovery on speech data.
19 Sept 2023
🏆 Lydia Nishimwe has won the best paper award at RECITAL 2023
Her article is on lexical normalisation of user-generated content on social media (in French).
1 Aug 2023
🤝 Seth Aycock just joined ALMAnaCH as an engineer
He will be working under the supervision of Rachel Bawden on domain adaptation for neural machine translation in the context of the DadaNMT project.
1 May 2023
🤝 Thibault Clérice just joined ALMAnaCH as an SRP (starting research position)
Thibault Clérice will work on the collection and development of textual resources for all varieties of French around the world and for languages spoken in France in the context of the COLaF Défi (Corpus et Outils pour les Langues de France).
17 Nov 2022
🎓 Benjamin Muller's PhD defence
Benjamin Muller defended his PhD thesis, supervised by Djamé Seddah and Benoît Sagot, on handling language variation and language diversity in neural language models, with a focus on low-resource languages.
1 Nov 2022
🤝 Francis Kulumba has joined ALMAnaCH as a PhD student
Francis Kulumba will work on the disambiguation of entities in scholarly publications, in particular authors, affiliations, etc.
3 Oct 2022
🤝 Niyati Bafna just joined ALMAnaCH as a research engineer
She will be working on linguistically inspired language models for closely related languages, co-supervised by Benoît Sagot, Rachel Bawden and (from the DFKI) Cristina España-Bonet and Josef van Genabith.
26 Sept 2022
🎓 Clémentine Fourrier's PhD defence
Clémentine Fourrier defended her PhD thesis, supervised by Benoît Sagot, Rachel Bawden and Laurent Romary, on neural approaches to historical word reconstruction.
27 Jul 2022
🎓 Pedro Ortiz Suarez's PhD defence
Pedro Ortiz Suarez defended his PhD thesis, supervised by Laurent Romary and Benoît Sagot, on a data-driven approach to natural language processing for contemporary and historical French.
31 Mar 2022
🤝 Anna Chepaikina just joined ALMAnaCH as a research engineer
She will be working on oenological comment generation, supervised by Benoît Sagot in collaboration with the Winespace startup company.
9 Mar 2022
We are excited to announce the publication of a new version of our massive multilingual corpus OSCAR, namely version 22.02
Main changes: document-oriented corpus with annotations you can filter on, document-level language identification, a new multilingual subcorpus for multilingual documents, and more!
1 Mar 2022
🤝 Wissam Antoun just joined ALMAnaCH as a research engineer
He will work with Djamé Seddah and Benoît Sagot on language models for languages displaying high variability, in particular on Arabic dialects as found in user-generated content on social media.
1 Feb 2022
🤝 Jesujoba Alabi just joined ALMAnaCH as a research engineer
He will be working under the supervision of Rachel Bawden on domain adaptation for neural machine translation in the context of the DadaNMT project.
17 Jan 2022
🤝 Rua Ismail just joined ALMAnaCH as a research engineer
She will be working under the supervision of Benoît Sagot on the OSCAR corpus, in particular on language identification, and on the description of two Nubian languages.
1 Dec 2021
🤝 Nathan Godey just joined ALMAnaCH as a PhD student
He will be working under the supervision of Benoît Sagot and Éric de la Clergerie on improving language models, in particular by using approaches derived from optimal transport.
27 Oct 2021
🎓 Louis Martin's PhD defence
Louis Martin defended his PhD thesis, supervised by Benoît Sagot, Éric de la Clergerie and Antoine Bordes (FAIR Paris), on sentence simplification using controllable and unsupervised methods.
1 Oct 2021
🤝 Lydia Nishimwe just joined ALMAnaCH as a PhD student
Within the framework of Rachel Bawden's PRAIRIE chair, she will be working on robust neural machine translation for user-generated content.
1 Oct 2021
🤝 You Zuo just joined ALMAnaCH as a research engineer
She will be working on fine-grained patent classification in collaboration with INPI, the French intellectual property office.
1 Oct 2021
🤝 Roman Castagné is now an ALMAnaCH PhD student
He will be working under the supervision of Benoît Sagot and Éric de la Clergerie on improving language models by better understanding what they learn and how they learn it.
20 Sept 2021
🤝 Camille Rey just joined ALMAnaCH as a Master 2 intern
She will be studying the errors produced by neural machine translation systems.
15 May 2021
🤝 Paul-Ambroise Duquenne just joined ALMAnaCH as a PhD student
He will carry out his research on LASER-like sentence representation spaces under the joint supervision of Benoît Sagot, for ALMAnaCH, and Holger Schwenk, for FAIR (Facebook's AI research lab in Paris) in the context of an industrial (“CIFRE”) PhD.
4 May 2021
We are thrilled to announce PAGnol, a new addition to our language model family.
PAGnol is a free, GPT-3-like generative LM for French, developed in collaboration with LightOn.
3 May 2021
🤝 Matthieu Futeral-Peter just joined ALMAnaCH as a Master 2 intern
His work is in collaboration with the Willow project-team at Inria, with the aim of constructing better multilingual and multimodal word embeddings.
19 Apr 2021
🤝 Tú Anh Nguyễn just joined ALMAnaCH as a PhD student
He will carry out his research on the unsupervised learning of linguistic representations from speech (audio) data under the joint supervision of Benoît Sagot, for ALMAnaCH, and Emmanuel Dupoux, for FAIR (Facebook's AI research lab in Paris) in the context of an industrial (“CIFRE”) PhD.
5 Apr 2021
🤝 Hugo Scheithauer just joined ALMAnaCH as a Master 2 intern
He will work on the addition of NER technologies into the open-source eScriptorium environment for automatic transcription using the use case provided by the LECTAUREP project.
1 Apr 2021
🤝 Syrielle Montariol just joined ALMAnaCH as a post-doc
She will work within the H2020 CounteR project under the main supervision of Djamé Seddah on the detection of semantic changes in social media posts at an individual level, in order to contribute to detecting and analysing multiple types of radicalisation processes.
1 Apr 2021
🤝 Thomas Wang just joined ALMAnaCH as a research engineer
Within the framework of Benoît Sagot's PRAIRIE chair, he will work on novel neural language modelling architectures that require less computing power, less memory and/or less data for training. He will notably work on reducing the computational and memory impact of attention mechanisms, especially when long inputs must be processed at once.
1 Apr 2021
🤝 Roman Castagné just joined ALMAnaCH as a Master 2 intern
Within the framework of Benoît Sagot's PRAIRIE chair, he will work on multi-level neural language modelling architectures in order to lower the impact of input noise in the performance of such models.
1 Apr 2021
🤝 Julien Abadji just joined ALMAnaCH as a research engineer
Within the framework of Benoît Sagot's PRAIRIE chair, he will work on the quantitative (volume, number of languages) and qualitative (language classification accuracy, offensive content filtering) improvement of our Common-Crawl-based large multilingual corpus OSCAR. He will also work on the production of new versions of OSCAR on a regular basis.
8 Mar 2021
🤝 Manon Ovide just joined ALMAnaCH as a Master 2 intern
She will work on the digital scientific publishing pipeline set up for the DAHN project, and in particular on the publication step, in compliance with TEI guidelines.
11 Feb 2021
Cap'FALC conference
Inria Paris, Unapei and Facebook Artificial Intelligence Research present Cap'FALC, a project aiming to improve information accessibility for people with intellectual disabilities by developing a new digital tool to help produce more content in FALC (“Facile à Lire et à Comprendre”, i.e. Easy to Read and Understand). [The event will be held in French]
1 Feb 2021
🤝 Thibault Charmet just joined ALMAnaCH as a research engineer
He will work in collaboration with the Cour de Cassation on tools for improving jurisprudence consistency, as part of the IA Lab, an initiative within DINUM (the Direction of Digital Affairs attached to the French Prime Minister) whose goal is to help the State's public administrations to benefit from the recent advances in AI.
13 Jan 2021
New ALMAnaCH website launched!
19 Nov 2020
Article on the collaboration between ALMAnaCH and the Winespace start-up on Inria's website
16 Nov 2020
📣 Benoît Sagot at "France Is AI"
Benoît Sagot was invited as a panellist with François Yvon at "France Is AI", France's biggest event in Artificial Intelligence.
1 Nov 2020
🥂 Benoît Sagot promoted to "Directeur de Recherches"
1 Nov 2020
🤝 Rachel Bawden just joined ALMAnaCH as an Inria “Chargée de Recherches”
She will be working on machine translation and multilingual NLP.
1 Nov 2020
🤝 Arij Riabi just joined ALMAnaCH as a research engineer
Within the framework of Benoît Sagot's PRAIRIE chair, she will be working on NLP for low-resource, non-standardised language varieties, especially North-African dialectal Arabic written using the Latin script (Arabizi).
1 Nov 2020
🤝 Lucas Terriel just joined ALMAnaCH as a research engineer
Within the EHRI, DAHN and NER4Archives projects, he will work at the interface between NLP and Digital Humanities for archival documents, with a focus on named entity recognition in finding aids.
8 Oct 2020
🎓 Jack Bowers's PhD defence
Jack Bowers defended his PhD thesis, supervised by Laurent Romary, on language documentation and standards in digital humanities, and more precisely on the use of the TEI to document Mixtepec-Mixtec.
1 Oct 2020
🎓 Mohamed Khemakhem's PhD defence
Mohamed Khemakhem defended his PhD thesis, supervised by Laurent Romary, on standard-based lexical models for the automatic structuring of electronic dictionaries.
1 Sept 2020
🤝 Yves Tadjo just joined ALMAnaCH as a research engineer
Within the DAHN project, he will develop tools for digital humanities for archival documents.
15 Jul 2020
🎓 Loïc Grobol's PhD defence
Loïc Grobol defended his PhD thesis, supervised by Isabelle Tellier†, Frédéric Landragin, Marco Dinarelli and Éric de la Clergerie, on coreference resolution for French.
25 May 2020
Article on the Cap'FALC initiative on Inria's website
Cap'FALC is an initiative involving FAIR (Facebook) and UNAPEI. Its goal is to develop a text simplification algorithm and an accessible tool to aid the production of FALC (the French equivalent of “Easy read”) for people with mental disabilities.
4 May 2020
Laurent Romary interviewed on Inria's website
19 Nov 2019
📰 French national radio station France Culture speaks about CamemBERT
18 Nov 2019
📰 The French newspaper Le Monde publishes an article on CamemBERT
1 Jul 2019
ALMAnaCH is now an Inria project-team
1 Jan 2017
Creation of ALMAnaCH as an Inria team