Machine-Learning-Based Transformation of Japanese Passive Sentences into Active by Separating Training Data into Each Input Particle
Masaki MURATA, Toshiyuki KANAMARU, Tamotsu SHIRADO, Hitoshi ISAHARA
We developed a new method of transforming Japanese case particles when converting Japanese passive sentences into active sentences. The method partitions the training data by input particle and trains a separate machine-learning model for each particle. We also used a large set of rich features for learning. Murata et al. conducted a previous study on transforming Japanese passive sentences into active sentences. They used machine learning, but they did not separate the training data by input particle and did not use as many rich features. They achieved an accuracy rate of 89.77%. By adding many rich features to those used in Murata et al.'s study, we obtained an accuracy rate of 92.00%. By additionally separating the training data by input particle and training a model for each particle, we obtained an accuracy rate of 94.30%. A statistical test confirmed that these improvements are significant. We also ran experiments with traditional methods based on verb dictionaries and manually prepared heuristic rules, and confirmed that our method achieves much higher accuracy rates than these traditional methods.
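The idea of separating training data by input particle can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not name the learning algorithm or the feature set, so the sketch substitutes a simple count-based voter for the learner and uses a single hypothetical feature, the verb. The names `ParticleModel`, `train_per_particle`, and `transform_particle`, and the toy examples in `TRAIN`, are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Toy training data: (input particle, features, output particle).
# Illustrative assumption only; this is not the data used in the paper.
TRAIN = [
    ("ni", {"verb": "naguru"}, "ga"),
    ("ni", {"verb": "homeru"}, "ga"),
    ("ni", {"verb": "okuru"}, "ni"),
    ("ga", {"verb": "naguru"}, "wo"),
    ("ga", {"verb": "homeru"}, "wo"),
    ("ga", {"verb": "okuru"}, "wo"),
]


class ParticleModel:
    """Stand-in learner: votes by feature/label co-occurrence counts."""

    def __init__(self):
        self.label_counts = Counter()
        self.feature_label_counts = defaultdict(Counter)

    def fit(self, examples):
        for features, label in examples:
            self.label_counts[label] += 1
            for item in features.items():
                self.feature_label_counts[item][label] += 1
        return self

    def predict(self, features):
        votes = Counter()
        for item in features.items():
            votes.update(self.feature_label_counts.get(item, {}))
        if not votes:
            # No known feature value: fall back to the majority label.
            return self.label_counts.most_common(1)[0][0]
        return votes.most_common(1)[0][0]


def train_per_particle(data):
    """Partition the data by input particle and fit one model per partition."""
    buckets = defaultdict(list)
    for in_particle, features, out_particle in data:
        buckets[in_particle].append((features, out_particle))
    return {p: ParticleModel().fit(ex) for p, ex in buckets.items()}


def transform_particle(models, in_particle, features):
    """Predict the active-sentence particle with the model for in_particle."""
    model = models.get(in_particle)
    if model is None:
        # Assumption: leave a particle unchanged if it was never seen in training.
        return in_particle
    return model.predict(features)


models = train_per_particle(TRAIN)
result_ni = transform_particle(models, "ni", {"verb": "naguru"})
result_ga = transform_particle(models, "ga", {"verb": "naguru"})
```

Because each model sees only the examples for its own input particle, the same verb can map to different output particles depending on which particle is being transformed. In this toy data, "naguru" yields "ga" for input "ni" and "wo" for input "ga".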