I’m training up some machine learning models to 'disambiguate homographs' (aka heteronyms, or words with identical Latin alphabet spellings but different pronunciations like ‘bow/bow’, ‘tear/tear’). This will help solve one of the more annoying aspects of auto transliteration into #Shavian. It is both very exciting and intensely boring, since I’m having to make the data sets. Hard to believe there is almost no publicly available training data for this. #𐑖𐑱𐑝𐑾𐑯
@shavian Sounds like a great project! I have spent many an hour manually resolving homographs :)