NANDA is a 13-billion-parameter model trained on approximately 2.13 trillion tokens of language data, with a strong focus on Hindi.