Learning Topic Representation for SMT with Neural Networks
- Lei Cui
- Dongdong Zhang
- Shujie Liu
- Qiming Chen
- Mu Li
- Ming Zhou
- Muyun Yang
ACL 2014
Published by ACL - Association for Computational Linguistics
Statistical Machine Translation (SMT) usually utilizes contextual information to disambiguate translation candidates. However, this context is often limited to sentence boundaries, so broader topical information cannot be leveraged. In this paper, we propose a novel approach to learning topic representations for parallel data using a neural network architecture, in which abundant topical context is embedded via topic-relevant monolingual data. By associating each translation rule with a topic representation, topic-relevant rules are selected according to their distributional similarity with the source text during SMT decoding. Experimental results show that our method significantly improves translation accuracy on the NIST Chinese-to-English translation task compared to a state-of-the-art baseline.
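
To make the rule-selection idea concrete, the following is a minimal sketch of how translation rules might be ranked by their similarity to a source text in topic space. It is not the paper's implementation: the abstract does not name the similarity measure, so cosine similarity is an assumption here, and the function names (`cosine`, `rank_rules`), the toy rules, and the three-dimensional topic vectors are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as carrying no topical signal
    return dot / (norm_u * norm_v)


def rank_rules(source_topic, rules):
    """Sort (rule, topic_vector) pairs by similarity to the source text.

    Returns a list of (score, rule) pairs, most similar first.
    """
    scored = [(cosine(source_topic, vec), rule) for rule, vec in rules]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical topic vector for a source text that is mostly about finance.
source_topic = [0.9, 0.1, 0.0]

# Hypothetical translation rules, each paired with a made-up topic vector.
rules = [
    ("bank ||| river side", [0.0, 0.1, 0.9]),
    ("bank ||| financial institution", [0.8, 0.2, 0.0]),
]

ranked = rank_rules(source_topic, rules)
for score, rule in ranked:
    print(f"{score:.3f}  {rule}")
```

In a decoder, a score of this kind would more plausibly enter as one feature among several than as a hard filter; the sketch only illustrates the ranking step.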