This paper demonstrates the importance of relation equivalence for discovering entity translation pairs. Existing approaches to understanding relation equivalence have focused on explicit features of co-occurring entities. In this paper, we explore latent features of temporality for understanding relation equivalence, and empirically show that the explicit and latent features complement each other. Our proposed hybrid approach using both explicit and latent features improves relation translation by 0.16 in F1-score, and in turn improves entity translation by 0.02.
Understanding relations is important in entity tasks. In this paper, we illustrate this importance using the named entity (NE) translation mining problem. Early research on NE translation used phonetic similarities, for example, to mine the translation 'Mandelson'-'曼德尔森' [ManDeErSen] from the similar sounds [9, 15]. However, not all NE translations are based on transliteration; some are based on semantics (e.g., 'WTO'-'世贸组织' [ShiMaoZuZhi]), or are even arbitrary (e.g., 'Jackie Chan'-'成龙' [ChengLong]).
To address this challenge, current state-of-the-art approaches build an entity graph for each language corpus and align the two graphs by propagating seed translation similarities (Figure 1) [7, 17]. For example, an arbitrary translation pair such as (Jackie Chan, 成龙) can be obtained if he is connected to his film 'Drunken Master' (醉拳) in both graphs. That is, we can propagate the seed translation similarity of (Drunken Master, 醉拳) to the neighboring entities 'Jackie Chan' and '成龙' in each graph.
When two graphs are obtained from parallel corpora, the graphs are symmetric and the "blind propagation" described above is effective. In contrast, Lee and Hwang [11] propose "selective propagation" for asymmetric graphs, which compares the semantics of relations before propagating.
A key contribution of this paper is using relation temporality for determining relation equivalence.
Existing work [14, 12, 11] uses only co-occurring entity pairs, or explicit features (EF). For example, for a relation such as pay an official visit to, EF are the entity pairs observed to co-occur with the relation in extracted statements.
In contrast, we propose to explore corpus latent features (LF) to mitigate the sparsity problem of EF: out of 158 randomly chosen correct relation translation pairs we labeled, 64% have only one co-occurring entity pair, which makes EF ineffective for identifying these relation translations. We therefore leverage relation temporality, which is both orthogonal and complementary to existing efforts leveraging entity temporality [8, 6, 16]. In particular, we discover three new challenges in using temporality for relation understanding in comparable corpora, which we discuss in detail in Section 3.2. Based on these challenges, we identify three new features for LF.
We observe the complementary nature of EF and LF, and propose a hybrid approach combining both. The hybrid approach significantly improves relation translation (0.16 higher F1-score than EF) and, in turn, improves entity translation (0.02 higher F1-score).
Selective propagation, which leverages the statements extracted from bilingual comparable corpora, can be summarized in the following steps.
1. Initialize the entity translation function $T_e^0$.
2. Build the relation translation function $T_r$ using $T_e^i$.
3. Update the entity translation function to acquire $T_e^{i+1}$ using $T_r$.
4. Repeat Steps 2 and 3.
For Step 1, an existing method for entity translation is adopted. In our experiments, we use a non-selective propagation approach [17] (hence not requiring relation translations) with [10] as the base translation matrix. The focus of this paper is Step 2, building the translation score $T_r(r_E, r_C)$ of an English relation $r_E$ and a Chinese relation $r_C$; we discuss the detailed procedure of Step 2 and propose how to improve it in Section 3. Step 3 is the stage at which selective propagation takes place.
Steps 2 and 3 reinforce each other to improve the final entity translation function. While Step 3 is well defined in [11], propagating entity translation scores when the relation semantics of the edges are equivalent, Step 2 has been restricted to the explicit feature, i.e., co-occurring entities or shared context. In clear contrast, by discovering novel latent features based on temporal properties, we increase the accuracy of both entity and relation translations. Note that we omit the iteration superscript for readability in the following sections.
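To make the flow of Steps 1-3 concrete, the following is a minimal schematic sketch of the loop; the three step functions are hypothetical placeholders standing in for the components described above, not the actual implementation.

```python
# Schematic sketch of the selective-propagation framework (Section 2).
# The three step functions are illustrative placeholders only.

def step1_init_entity_translation(seed_pairs):
    # Step 1: e.g., seed translations [10] plus non-selective propagation [17].
    return dict(seed_pairs)

def step2_build_relation_translation(T_e):
    # Step 2: score English/Chinese relation pairs using T_e (Section 3).
    return {}

def step3_selective_propagation(T_e, T_r):
    # Step 3: propagate entity scores only across edges whose relation
    # semantics are equivalent under T_r.
    return T_e

def run(seed_pairs, iterations=1):
    T_e = step1_init_entity_translation(seed_pairs)
    T_r = {}
    for _ in range(iterations):
        T_r = step2_build_relation_translation(T_e)
        T_e = step3_selective_propagation(T_e, T_r)   # Steps 2-3 repeat
    return T_e, T_r

print(run({("Drunken Master", "醉拳"): 1.0}))
```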
In this section, we present our approaches to obtaining relations with equivalent semantics across languages (e.g., visit-访问). Formally, our goal is to build the relation translation score function $T_r(r_E, r_C)$ for an English relation $r_E$ and a Chinese relation $r_C$.
In this section, we briefly describe the baseline method EF [11]. As mentioned in the introduction, traditional approaches leverage common co-occurring entity pairs. This observation also holds in the bilingual setting by exploiting seed entity translations. For example, suppose we have two extracted statements: (Bruce Willis, star in, The Sixth Sense) and (布鲁斯·威利斯 (Bruce Willis), 主演 (star in), 第六感 (The Sixth Sense)). Knowing a few seed entity translations from $T_e^0$, 'Bruce Willis'-'布鲁斯·威利斯' and 'The Sixth Sense'-'第六感', we can infer that star in and 主演 are semantically similar.
Specifically, we quantify this similarity based on the number of such common entity pairs, which we denote as $C(r_E, r_C)$ for an English relation $r_E$ and a Chinese relation $r_C$. The existing approaches are variations on the use of $C(r_E, r_C)$. Our baseline implementation uses the one in [11], and we refer the reader to that paper for the formal definitions and processing steps omitted here due to the page limit.
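As a concrete illustration of this signal, the sketch below counts common co-occurring entity pairs through seed entity translations; the statement tuples and the seed dictionary are simplified assumptions, not the data structures of [11].

```python
# Hedged sketch: counting common co-occurring entity pairs C(r_E, r_C)
# through seed entity translations. Input formats are illustrative only.

def common_entity_pairs(en_statements, zh_statements, r_en, r_zh, seed):
    """Count entity pairs of r_en whose seed translations co-occur with r_zh."""
    en_pairs = {(a, b) for (a, r, b) in en_statements if r == r_en}
    zh_pairs = {(a, b) for (a, r, b) in zh_statements if r == r_zh}
    count = 0
    for (a, b) in en_pairs:
        ta, tb = seed.get(a), seed.get(b)   # translate both arguments
        if ta is not None and tb is not None and (ta, tb) in zh_pairs:
            count += 1
    return count

# Toy usage with the example from the text.
en = [("Bruce Willis", "star in", "The Sixth Sense")]
zh = [("布鲁斯·威利斯", "主演", "第六感")]
seed = {"Bruce Willis": "布鲁斯·威利斯", "The Sixth Sense": "第六感"}
print(common_entity_pairs(en, zh, "star in", "主演", seed))  # -> 1
```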
Unfortunately, this approach suffers from the sparsity of common entity pairs, due to the incomparability of the corpora and to entities that cannot be translated by $T_e$. We therefore leverage corpus latent features as an additional signal to overcome this problem.
We exploit the temporal distribution $d_x$ of a textual element $x$ in statements, where $d_x(t)$ is the normalized number of occurrences of $x$ during the $t$-th week; we count the occurrences of the element on a weekly basis and normalize them to obtain $d_x$. For example, Figure 5 shows the temporal distribution of the relation visit against the week index $t$. Unlike for entities, we can easily observe dissimilarity between the temporal distributions of semantically equivalent relations. We identify three key challenges in exploiting temporality for relation translation.
[C1] Considering temporal distributions of relations alone is not sufficient. For relations, such as visit, that involve diverse entities, the temporal distributions are highly noisy (Figure 5).
To address the first challenge, we use a finer-granularity unit for observing temporality. More specifically, we exploit a coupling $(e, r)$ of an entity and a relation, i.e., statements of the form $(e, r, *)$, where $e$ is an entity, $r$ a relation, and * a placeholder indicating that any noun phrase is accepted for the second argument of a statement. We use couplings in both argument positions to measure the relation translation scores and take the average score, but in this section we use only $(e, r, *)$ for readability. As shown in Figure 5, the coupling distribution $d_{(e,r)}$ is more distinctive and hence a key clue for finding semantically equivalent relations.
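The following is a minimal sketch of how the weekly temporal distributions of a relation and of an entity-relation coupling might be built; the (subject, relation, object, week) statement format is an assumed simplification.

```python
# Hedged sketch: weekly temporal distributions d_x (Section 3.2).
# Statements are assumed to be (subject, relation, object, week) tuples.
from collections import Counter

def normalize(counts_by_week, num_weeks):
    """Turn weekly occurrence counts into a distribution over weeks."""
    total = sum(counts_by_week.values())
    if total == 0:
        return [0.0] * num_weeks
    return [counts_by_week.get(t, 0) / total for t in range(num_weeks)]

def relation_distribution(statements, relation, num_weeks):
    counts = Counter(w for (_, r, _, w) in statements if r == relation)
    return normalize(counts, num_weeks)

def coupling_distribution(statements, entity, relation, num_weeks):
    # Coupling (e, r, *): the second argument is left unconstrained.
    counts = Counter(w for (s, r, _, w) in statements
                     if s == entity and r == relation)
    return normalize(counts, num_weeks)

# Toy usage over a 4-week window.
stmts = [("Bush", "visit", "China", 0), ("Bush", "visit", "Japan", 0),
         ("Obama", "visit", "Canada", 2)]
print(relation_distribution(stmts, "visit", 4))           # noisier, mixes entities
print(coupling_distribution(stmts, "Bush", "visit", 4))   # sharper, single entity
```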
[C2] Considering the entity-relation coupling distribution alone is not sufficient, due to the domination of individual temporality. For example, Figure 8 shows entity-dominated entity-relation temporality: if an entity has a peak at some period (Figure 8), most relations coupled with that entity also have a peak at the very same period. This makes all relations appearing with this entity very similar to each other regardless of semantics. To address this challenge, we use features that measure whether the coupling distribution $d_{(e,r)}$ is too close to either the entity distribution $d_e$ or the relation distribution $d_r$.
[C3] Lastly, we have to eliminate false positives in relation temporality. To illustrate, the two relations deploy and 部署在 (deploy at) have similar temporal behaviors (Figure 5). However, the first relation takes a [person] as its second argument, whereas the second takes a [location].
To address this, we check for a common co-occurring entity pair of the two relations. For example, we can find "Russia deployed an aircraft carrier", but not "Russia deployed at (部署在) an aircraft carrier". Thus, we cannot acquire any common entity pair such as (Russia, aircraft carrier) for deploy and 部署在 (deploy at).
We compute the similarity of two relations $r_E$ in English and $r_C$ in Chinese in the following two steps.
1. Compute the similarity of the temporal distributions of entity-relation couplings for each bilingual entity pair $(e_E, e_C)$.
2. Compute the translation score $T_r(r_E, r_C)$ by aggregating the coupling similarities.
Considering the three challenges, we produce a list of features to measure the coupling similarity as follows.
[Base feature] $f_{base} = T_e(e_E, e_C)$: the entity translation score obtained in the previous iteration, or the seed entity translation score.
[C1] $f_{C1} = 1 - \mathrm{JS}(d_{(e_E, r_E)}, d_{(e_C, r_C)})$: the temporal similarity of the couplings, where $\mathrm{JS}(P, Q)$ is the Jensen-Shannon divergence of two distributions $P$ and $Q$, defined as $\mathrm{JS}(P, Q) = \frac{1}{2} D_{KL}(P \,\|\, M) + \frac{1}{2} D_{KL}(Q \,\|\, M)$, with $M = \frac{1}{2}(P + Q)$ and $D_{KL}$ the Kullback-Leibler divergence.
[C2] $\mathrm{JS}(d_{e_E}, d_{(e_E, r_E)})$, $\mathrm{JS}(d_{r_E}, d_{(e_E, r_E)})$, $\mathrm{JS}(d_{e_C}, d_{(e_C, r_C)})$, $\mathrm{JS}(d_{r_C}, d_{(e_C, r_C)})$: the entity to entity-relation coupling distribution difference (D1) and the relation to entity-relation coupling distribution difference (D2), for English and Chinese respectively.
[C3] $f_{C3}$: the existence of a common entity pair under the seed entity translations (boolean). That is, $f_{C3} = 1$ if $C(r_E, r_C) > 0$, and $f_{C3} = 0$ otherwise.
Additionally, we use the following features to consider absolute frequencies of textual elements as well because 1) we are more confident with more evidence and 2) in the comparable corpora, the equivalent elements are likely to show similar frequencies.
$g(n_{(e_E, r_E)})$ and $g(n_{(e_C, r_C)})$: the normalized frequencies of the English and Chinese couplings, where $n_x$ is the number of occurrences of $x$ and $g$ is a normalization function, for which we use a sigmoid function over a linear transformation of $n_x$. We use the analogous normalized frequencies of the relations $r_E$ and $r_C$ as well.
With these features, we measure the similarity of a pair of couplings as follows:

$$\mathrm{sim}\big((e_E, r_E), (e_C, r_C)\big) = \Phi\big(f_1, \dots, f_m\big) \qquad (1)$$

where $\Phi$ combines the features $f_1, \dots, f_m$ listed above.
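To make the feature computation concrete, here is a minimal sketch of the Jensen-Shannon divergence and the per-coupling feature vector; the frequency features are omitted for brevity, and the combination $\Phi$ of Equation 1 is deliberately left abstract.

```python
# Hedged sketch: JS divergence and the per-coupling features of Section 3.2.
# The combination Phi of Equation (1) is deliberately left abstract here.
import math

def kl(p, q):
    """Kullback-Leibler divergence (base-2 logs; terms with p_i = 0 contribute 0)."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js(p, q):
    """Jensen-Shannon divergence, bounded by 1 with base-2 logarithms."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def coupling_features(te_score, d_cpl_en, d_cpl_zh,
                      d_ent_en, d_rel_en, d_ent_zh, d_rel_zh,
                      has_common_pair):
    """Feature vector for one (English coupling, Chinese coupling) pair."""
    return {
        "base": te_score,                              # entity translation score
        "C1_similarity": 1.0 - js(d_cpl_en, d_cpl_zh), # temporal similarity
        "C2_D1_en": js(d_ent_en, d_cpl_en),            # entity vs. coupling (EN)
        "C2_D2_en": js(d_rel_en, d_cpl_en),            # relation vs. coupling (EN)
        "C2_D1_zh": js(d_ent_zh, d_cpl_zh),            # entity vs. coupling (ZH)
        "C2_D2_zh": js(d_rel_zh, d_cpl_zh),            # relation vs. coupling (ZH)
        "C3_common_pair": 1.0 if has_common_pair else 0.0,
    }

# Toy usage with 4-week distributions.
p = [0.7, 0.1, 0.1, 0.1]
q = [0.6, 0.2, 0.1, 0.1]
print(coupling_features(0.9, p, q, p, q, q, p, True))
```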
By aggregating the coupling similarities, we measure the translation score of two relations:

$$T_r(r_E, r_C) = \sum_{(e_E, e_C) \in S} \mathrm{sim}\big((e_E, r_E), (e_C, r_C)\big) \qquad (2)$$

where $S$ is a set of entity translation pairs obtained from the seeds or the previous iteration, such as (Bush, 布什).
We normalize the obtained scores for each English relation using its top-$k$ Chinese translations. That is, for an English relation $r_E$ and a candidate $r_C$, we redefine the score as $T_r(r_E, r_C) / \sum_{j=1}^{k} T_r(r_E, r_C^{(j)})$, where $r_C^{(j)}$ is the $j$-th ranked Chinese relation for $r_E$ by Equation 2. We set $k$ empirically.
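A small sketch of the aggregation and top-$k$ normalization steps follows, assuming for illustration a simple sum over seed entity pairs and a sum-to-one normalization over the top-$k$ candidates.

```python
# Hedged sketch: aggregating coupling similarities into T_r(r_E, r_C)
# (Equation 2) and normalizing over the top-k candidates. The exact
# aggregation and normalization are assumptions for illustration.

def relation_score(r_en, r_zh, seed_pairs, coupling_sim):
    """coupling_sim(e_en, r_en, e_zh, r_zh) -> similarity of two couplings."""
    return sum(coupling_sim(e_en, r_en, e_zh, r_zh)
               for (e_en, e_zh) in seed_pairs)

def normalize_top_k(scores, k):
    """Renormalize a {chinese_relation: score} dict by its top-k total."""
    denom = sum(sorted(scores.values(), reverse=True)[:k]) or 1.0
    return {r: s / denom for r, s in scores.items()}

# Toy usage with made-up candidate scores for one English relation.
candidates = {"访问": 0.9, "离开": 0.3, "讨论": 0.1}
print(normalize_top_k(candidates, k=2))
```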
We find that LF and EF are complementary. Table 1 shows examples of relations and their translations. In general, LF can translate more relations (e.g., support and capture). However, in some cases, such as ratify, highly related relations may introduce noise: one always 讨论 (discusses) before 批准 (ratifying) something, and hence the temporal behavior of 讨论 (discuss) is also very similar to that of ratify. On the other hand, ratify can be correctly translated using EF.
Thus, we produce the hybrid relation translation score by combining the two:

$$T_r^{hybrid}(r_E, r_C) = \lambda \, T_r^{LF}(r_E, r_C) + (1 - \lambda) \, T_r^{EF}(r_E, r_C) \qquad (3)$$

where the interpolation weight $\lambda$ is set empirically.
Table 1: Examples of relations and their translations by LF and EF.

English | LF | EF
---|---|---
visit | 访问 (visit) | 访问 (visit)
support | 向…提供 (provide to …) | -
ratify | 讨论 (discuss) | 批准 (ratify)
In this section, we evaluate the proposed approach on the entity translation task and the relation translation task. We extract English and Chinese statements from 2008 news articles from Xinhua News, which publishes news in both English and Chinese; the same corpora were used by Lee and Hwang [11]. There are 100,746 English articles and 88,031 Chinese articles. As the difference in the number of documents suggests, the two corpora are not direct translations of each other; they exhibit asymmetry in their entities and relations.
Table 2: Entity translation performance (precision, recall, F1) for person and organization entities.

Method | Person P. | Person R. | Person F1 | Org. P. | Org. R. | Org. F1
---|---|---|---|---|---|---
LF+EF | 0.84 | 0.80 | 0.82 | 0.60 | 0.52 | 0.56
EF | 0.81 | 0.79 | 0.80 | 0.56 | 0.52 | 0.54
Seed | 0.80 | 0.77 | 0.78 | 0.49 | 0.44 | 0.46
PH+SM | 0.59 | 0.59 | 0.59 | 0.29 | 0.29 | 0.29
In this section, we present the experimental settings and results for translating entities using our proposed approaches. To measure effectiveness, we use a set of gold-standard entity translation pairs consisting of 221 person entities and 52 organization entities. We measure precision, recall, and F1-score based on the translation pairs returned for each English entity, as done in [11].
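For clarity, the sketch below shows a simplified corpus-level precision/recall/F1 computation over returned entity translation pairs against the gold standard; the per-English-entity measurement follows the same idea, and this is not the evaluation script of [11].

```python
# Hedged sketch: precision/recall/F1 over returned entity translation pairs,
# compared against gold-standard pairs (simplified, corpus-level version).

def prf1(returned_pairs, gold_pairs):
    """Both arguments are sets of (english_entity, chinese_entity) pairs."""
    correct = len(returned_pairs & gold_pairs)
    precision = correct / len(returned_pairs) if returned_pairs else 0.0
    recall = correct / len(gold_pairs) if gold_pairs else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall > 0 else 0.0)
    return precision, recall, f1

# Toy usage: one correct and one incorrect returned translation.
returned = {("Bush", "布什"), ("Jackie Chan", "杰基")}
gold = {("Bush", "布什"), ("Jackie Chan", "成龙")}
print(prf1(returned, gold))  # (0.5, 0.5, 0.5)
```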
We compare our hybrid approach, denoted LF+EF, with EF [11]; with PH+SM, a combined approach of phonetic similarity and letter-wise semantic translation for better accuracy on organizations [10]; and with the seed translations Seed that we adopt from [17], using PH+SM as the base translation matrix. Our results leveraging relation temporality outperform the reported results using entity temporality on the same data set; the two uses of temporality are orthogonal and can be aggregated, which we leave as future work. We run one iteration of the entire framework (Steps 1-3) for both LF+EF and EF.
Table 2 compares the methods. Our proposed approach LF+EF shows a higher F1-score than the baselines; in particular, it outperforms EF. For example, 'Matthew Emmons' is a lesser-known entity, and only a few statements in the corpora mention him. The corpus explicit feature EF alone cannot translate the relation win and, in turn, cannot translate 'Matthew Emmons'. However, LF+EF translates him correctly into 马修·埃蒙斯 through the relation win.
This section considers the relation translation task. Each relation translation method translates an English relation into a ranked list of Chinese relations, and we check whether the Chinese relation with the highest translation score is a correct translation. We consider a relation translation correct when the semantics are equivalent. For example, 去 (leave for/go to) is a correct translation of leave for, but 离开 (leave) is not. In total, 3,342 English-Chinese relation translation pairs returned by our method and the baselines were presented in random order and labeled; out of the 3,342 pairs, 399 were labeled as correct.
Table 3: Relation translation performance of LF, EF, and their hybrid LF+EF.

Method | Precision | Recall | F1
---|---|---|---
LF+EF | 0.37 | 0.44 | 0.40
LF | 0.26 | 0.25 | 0.26
EF | 0.41 | 0.17 | 0.24
Table 3 compares LF, EF, and their hybrid LF+EF. LF shows higher recall than EF, while EF shows higher precision. As emphasized in Section 3.3, this reflects their complementary nature. Their hybrid LF+EF has both high precision and high recall, and thus the highest F1-score.
Note that the absolute numbers may look low due to the harsh evaluation criterion, but the top translations are still relevant (e.g., fight is translated to 驻 (deploy troops)). In addition, lower-ranked but correct relation translations also contribute to entity translation. Therefore, even the lower-performing EF boosted entity translation, and, in effect, our approach achieves a higher F1-score in the entity translation task.
Table 4: Rank of the correct Chinese translation for selected English relations, by feature set and by EF.

English relation | C1 | C1+C2 | C1+C2+C3 | EF
---|---|---|---|---
visit | 15 | 4 | 1 | 1
drop | 21 | 14 | 1 | -
capture | 6 | 4 | 1 | -
To illustrate the detailed effects of the corpus latent features, Table 4 shows the rank of the correct Chinese translation for several English relations when using the features designed for each challenge; for comparison, the ranks of the correct translations under EF are also shown. Using only the entity-relation coupling similarity feature for [C1] often fails to find the correct translation, but using all of the features removes such noise.
This paper studied temporality features for relation equivalence. With the proposed features, we devised a hybrid approach combining corpus latent and explicit features, which have complementary strengths. We empirically showed the effectiveness of our hybrid approach on relation translation, which in turn improved entity translation.
This research was supported by the MSIP (Ministry of Science, ICT and Future Planning), Korea, and Microsoft Research, under the IT/SW Creative Research program supervised by the NIPA (National IT Industry Promotion Agency) (NIPA-2013-H0503-13-1009).