Source: wikibot/hamshahri-corpus

= Hamshahri Corpus
{wiki=Hamshahri_Corpus}

The Hamshahri Corpus is a large-scale Persian text dataset created to support natural language processing (NLP) research and applications for the Persian language. It consists of newspaper articles published by Hamshahri, a newspaper in Iran.