The Bijankhan Corpus is a large annotated corpus of Persian, created to support research in natural language processing (NLP) and computational linguistics. It draws on a diverse set of sources, including literary, scientific, and journalistic texts, and provides annotations for several linguistic features, such as part-of-speech tags, dependency relations, and named entities.