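Yamaguchi Takahira (山口 高平, ヤマグチ タカヒラ)
Affiliation | Kanagawa University, Faculty of Informatics, Department of Applied Systems and Mathematics (システム数理学科); Kanagawa University Graduate School, Graduate School of Engineering, Engineering Program (Information Systems Creation field) |
Position | Professor |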
Language | English |
Date of publication/presentation | 2013/11 |
Publication type | Academic journal |
Peer review | Peer-reviewed |
Title | An Automatic sameAs Link Discovery from Wikipedia |
Authorship | Co-authored |
Journal name | SEMANTIC TECHNOLOGY |
Publisher | SPRINGER-VERLAG BERLIN |
Volume, issue, pages | 8388, pp. 399-413 |
Authors / co-authors | Kosuke Kagawa, Susumu Tamagawa, Takahira Yamaguchi |
Abstract | Spelling variants and word sense ambiguity incur substantial costs in processes such as data integration, information search, and data preprocessing for data mining. To meet these demands, it is useful to construct relations between a word or phrase and a representative name of the entity. To reduce the costs, this paper discusses how to automatically discover "sameAs" and "meaningOf" links from Japanese Wikipedia. To do so, we gathered relevant features such as IDF, string similarity, and the number of hypernyms. We identified a link-based score on salient features based on SVM results with 960,000 anchor link pairs. The case studies show that our link discovery method performs well, with precision and recall rates above 70%. |
DOI | 10.1007/978-3-319-06826-8_29 |
ISSN | 0302-9743 |