Documentation
The Dzongkha Next Word Prediction System is a machine-learning research initiative that aims to improve digital productivity and help preserve Dzongkha, the national language of Bhutan, using NLP and AI techniques.
System Guide Video
Detailed walkthrough of the predictive interface and syllable-based input methods.
Technical Methodology
1. Data Curation
Scraping the web and compiling a diverse Dzongkha corpus that spans news, legal, and literary texts.
2. Syllable Segmentation
Segmenting Dzongkha text into syllables with a rule-based method verified by linguists.
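As an illustration of rule-based segmentation, the sketch below splits text on the tsheg (U+0F0B), the mark that separates syllables in the Tibetan script used for Dzongkha, and on the shad (U+0F0D) that closes a clause. The function name `segment_syllables` and the exact delimiter set are assumptions made for this example; the project's linguist-verified rules are not described here and are likely more detailed.

```python
import re

# Delimiters assumed for this sketch (Unicode Tibetan block):
TSHEG = "\u0F0B"     # intersyllabic tsheg, the usual syllable separator
NB_TSHEG = "\u0F0C"  # non-breaking variant of the tsheg
SHAD = "\u0F0D"      # shad, marks the end of a clause or sentence

# One or more delimiters or whitespace characters in a row count as a single break.
_DELIMITERS = re.compile(f"[{TSHEG}{NB_TSHEG}{SHAD}\\s]+")


def segment_syllables(text: str) -> list[str]:
    """Split Dzongkha text into syllables, dropping the delimiters themselves."""
    return [syllable for syllable in _DELIMITERS.split(text) if syllable]


if __name__ == "__main__":
    # "Dzongkha" written as two syllables followed by a shad.
    sample = "\u0F62\u0FAB\u0F7C\u0F44\u0F0B\u0F41\u0F0D"
    print(segment_syllables(sample))
```

Splitting on delimiters alone does not handle numerals, loanwords, or mixed-script text, which is where hand-written rules and linguist review would come in.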
3. Model Training
Fine-tuning deep learning models using PyTorch and TensorFlow.
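To make the prediction task concrete, the sketch below uses a simple bigram frequency baseline: it counts which syllable follows which in a segmented corpus and suggests the most frequent followers. This is a deliberately swapped-in technique for illustration, not the deep learning models the project fine-tunes, and the names `train_bigram` and `predict_next` as well as the placeholder tokens are hypothetical.

```python
from collections import Counter, defaultdict


def train_bigram(sequences):
    """Count how often each syllable is followed by each other syllable."""
    counts = defaultdict(Counter)
    for seq in sequences:
        # Pair every syllable with its immediate successor.
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    return counts


def predict_next(counts, prev, k=3):
    """Return up to k of the most frequent followers of `prev`."""
    followers = counts.get(prev, Counter())
    return [syllable for syllable, _ in followers.most_common(k)]


if __name__ == "__main__":
    # Toy corpus of already-segmented sequences; the tokens are placeholders,
    # not real Dzongkha syllables.
    corpus = [["a", "b", "c"], ["a", "b", "d"], ["a", "c"]]
    model = train_bigram(corpus)
    print(predict_next(model, "a"))
```

A neural model replaces these raw counts with learned probabilities over a longer context, but the interface stays the same: a sequence of syllables goes in and a ranked list of candidates comes out.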
Development Stack
- Languages: Python, JavaScript
- AI Frameworks: PyTorch, TensorFlow
- Web Architecture: Django Framework
- Hardware: NVIDIA GPU for model training
Social Impact
National Identity
Strengthening the presence of Dzongkha in the digital age.
Efficiency
Reducing typing effort by approximately 80%, boosting productivity.
Digital Literacy
Empowering students and educators with modern AI tools.