goclassy

Asynchronous, parallel pipeline for classifying Common Crawl

Main website

Description

An asynchronous, parallel pipeline for classifying Common Crawl data using the fastText architecture.

Citation and publication(s)

If you use this work, please cite:

Pedro Javier Ortiz Suárez, Benoît Sagot and Laurent Romary. 2019. Asynchronous Pipeline for Processing Huge Corpora on Medium to Low Resource Infrastructures.
In 7th Workshop on the Challenges in the Management of Large Corpora (CMLC-7). Leibniz-Institut für Deutsche Sprache. Cardiff, United Kingdom.
HAL PDF
@inproceedings{ortizsuarez:hal-02148693,
 address = {Cardiff, United Kingdom},
 author = {Ortiz Su{\'a}rez, Pedro Javier and Sagot, Beno{\^i}t and Romary, Laurent},
 title = {{Asynchronous Pipeline for Processing Huge Corpora on Medium to Low Resource Infrastructures}},
 year = {2019},
 booktitle = {{7th Workshop on the Challenges in the Management of Large Corpora (CMLC-7)}},
 publisher = {{Leibniz-Institut f{\"u}r Deutsche Sprache}},
 editor = {Piotr Ba{\'n}ski and Adrien Barbaresi and Hanno Biber and Evelyn Breiteneder and Simon Clematide and Marc Kupietz and Harald L{\"u}ngen and Caroline Iliadi},
 doi = {10.14618/IDS-PUB-9021},
 url = {https://inria.hal.science/hal-02148693},
 hal_pdf = {https://inria.hal.science/hal-02148693v1/file/Asynchronous_Pipeline_for_Processing_Huge_Corpora_on_Medium_to_Low_Resource_Infrastructures.pdf},
}

Contact

For more information or to ask a question, please contact Pedro Ortiz Suarez:

pedro.ortiz-suarez[at]inria.fr