Sri Lanka launches largest Sinhala LLM with 10 million sentences (SinLlama)

Research students at the Department of Computer Science and Engineering, University of Moratuwa, have developed the country’s first large-scale large language model (LLM) dedicated exclusively to Sinhala, a breakthrough in advancing local language computing.

This project was jointly supervised by Dr Surangika Ranathunga (Massey University, New Zealand), Dr Nisansa de Silva (University of Moratuwa) and Dr Rishemjit Kaur (Central Scientific Instruments Organisation, India).

The model, named “SinLlama,” was built by continually pre-training Llama-3-8B on nearly 10 million Sinhala sentences. According to the research team, SinLlama is the largest Sinhala LLM to date and has already outperformed the base Llama-3-8B model on Sinhala text classification benchmarks.
