Access to large pre-trained models of varied architectures, in many different languages, is central to the democratisation of NLP. We introduce PAGnol, a collection of French GPT models built with LightOn. PAGnol is based on the GPT-3 architecture with some GPT-2-specific components, and we use scaling laws to train PAGnol-XL (1.5B parameters) efficiently, with the same computational budget as much smaller BERT-based models for French. PAGnol-XL is the largest model trained to date for the French language.
PAGnol is distributed under the MIT licence.