I’m training up some machine learning models to 'disambiguate homographs' (aka heteronyms, or words with identical Latin alphabet spellings but different pronunciations like ‘bow/bow’, ‘tear/tear’). This will help solve one of the more annoying aspects of auto transliteration into #Shavian. It is both very exciting and intensely boring, since I’m having to make the data sets. Hard to believe there is almost no publicly available training data for this. #𐑖𐑱𐑝𐑾𐑯
@shavian This is a big problem with Persian texts too