I. Ontology-based Information Retrieval(3)

2021-04-06 05:55

Abstract: This article presents a new, ontology-based approach to information retrieval (IR). The system is based on a domain knowledge representation schema in the form of an ontology. New resources registered within the system are linked to conce

where S is the diagonal matrix of singular values and U, V are the matrices of left and right singular vectors. If the singular values in S are ordered by size, the first k largest values may be kept and the remaining smaller ones set to zero. The product of the resulting matrices is a matrix approximately equal to A, and it is the closest to A in the least-squares sense.

$$A \approx A_{SVD}, \quad \text{where } A_{SVD} = U_k S_k V_k^{T}$$
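The truncation described above can be sketched with NumPy. This is only an illustration under stated assumptions: the function name `rank_k_approximation` and the small term-by-document matrix are hypothetical and do not come from the article.

```python
import numpy as np

def rank_k_approximation(A, k):
    """Build U_k S_k V_k^T from the k largest singular values of A."""
    # np.linalg.svd returns the singular values sorted in descending order,
    # so slicing the first k entries keeps the k largest ones.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Hypothetical 4-term x 3-document matrix, for illustration only.
A = np.array([
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 1.0],
    [1.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])

A_svd = rank_k_approximation(A, k=2)

# Frobenius norm of the residual A - A_SVD. For a truncated SVD it equals
# the square root of the sum of squares of the discarded singular values.
residual = np.linalg.norm(A - A_svd)
```

The approximation keeps the shape of A but has rank at most k, which is what makes the reduced feature space usable for comparing documents.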

In order to determine the similarity between a query and the approximate document vector $D_{i,SVD}$, we need to transform the query vector into the new feature space. (The original query vector is computed with the tf-idf scheme, as described above for the vector model approach.)

$$Q_{SVD} = Q_{TF\text{-}IDF}^{T} \, U_k \, S_k^{-1}$$

and then we can compute similarity in the same way as before, i.e.

$$sim_{SVD}(Q_{SVD}, D_{i,SVD}) = \frac{D_{i,SVD} \cdot Q_{SVD}}{\lVert D_{i,SVD} \rVert \, \lVert Q_{SVD} \rVert}$$
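The query transformation and the similarity computation can be sketched as follows. The sketch rests on two assumptions that the text does not state explicitly: the reduced document vectors are taken to be the rows of V_k, and the similarity is read as the cosine of the angle between the two vectors. All function names are hypothetical.

```python
import numpy as np

def lsi_fit(A, k):
    """Truncated SVD of a term-by-document matrix A (terms x documents)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Assumption: document i is represented by row i of V_k.
    return U[:, :k], s[:k], Vt[:k, :].T

def fold_in_query(q_tfidf, U_k, s_k):
    """Map a tf-idf query vector into the k-dimensional space."""
    # Q_SVD = Q^T U_k S_k^{-1}. Dividing by s_k applies the inverse of the
    # diagonal matrix S_k; this assumes the kept singular values are nonzero.
    return (q_tfidf @ U_k) / s_k

def sim_svd(q_svd, d_svd):
    """Cosine similarity between a reduced query and a reduced document."""
    denom = np.linalg.norm(q_svd) * np.linalg.norm(d_svd)
    return float(d_svd @ q_svd / denom) if denom > 0 else 0.0

def rank_documents(A, q_tfidf, k):
    """Return (document index, similarity) pairs, best match first."""
    U_k, s_k, docs_k = lsi_fit(A, k)
    q_svd = fold_in_query(q_tfidf, U_k, s_k)
    scores = [sim_svd(q_svd, d) for d in docs_k]
    return sorted(enumerate(scores), key=lambda pair: pair[1], reverse=True)
```

One way to check such a sketch is to use a document's own column of A as the query: after folding in, it coincides with that document's row of V_k, so the document should rank first with a similarity close to 1.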

2.3. ONTOLOGY-BASED APPROACH

This part describes the Webocrat-like approach, which uses an ontology for document retrieval. In the experiments described below, we did not consider the type of relation in the ontology when calculating the similarity between concepts. Moreover, we assumed that the set of concepts relevant to the query is known. This condition can be met with any technique for assigning concepts from the ontology to a query, e.g. manual assignment or assignment based on synonyms of the query terms, making use of WordNet or a similar resource.

The way in which a query is processed by this approach is shown in Figure 1. For a given query, the appropriate concepts are retrieved first, in our case manually from the user. Then the set of concepts associated with each document is retrieved from the database. Next, these two sets are compared using a simple metric that expresses the similarity between a document $D_i$ and the given query $Q$.

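$$
sim_{onto}(Q, D_i) =
\begin{cases}
\dfrac{|Q_{con} \cap D_{i,con}|}{|Q_{con} \cup D_{i,con}|} & \text{if } |Q_{con} \cup D_{i,con}| \neq 0 \\[2ex]
k & \text{otherwise}
\end{cases}
$$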

where $Q_{con}$ is the set of concepts assigned to query $Q$, $D_{i,con}$ is the set of concepts assigned to document $D_i$, and $k$ is a small constant, e.g. 0.1. The resulting number is the ontology-based similarity measure. Better results were achieved when this number was combined with one of the two retrieval approaches described above (i.e. the LSI approach or the vector model). The final similarity is then computed as a product, e.g.

$$sim(Q, D_i) = sim_{onto}(Q, D_i) \cdot sim_{TF\text{-}IDF}(Q, D_i)$$
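A minimal sketch of the ontology-based measure and its combination with a text-based score is given below. It assumes that the set comparison is the ratio of shared concepts to all concepts of the query and the document, and that the constant k is returned only when both sets are empty. The function names and concept labels are hypothetical, and the text-based score is passed in as an already computed number.

```python
def sim_onto(query_concepts, doc_concepts, k=0.1):
    """Ontology-based similarity between a query and a document.

    Assumption: overlap of the two concept sets divided by their union;
    k is the small fallback constant used when the union is empty.
    """
    q = set(query_concepts)
    d = set(doc_concepts)
    union = q | d
    if not union:
        return k
    return len(q & d) / len(union)

def combined_similarity(query_concepts, doc_concepts, text_similarity, k=0.1):
    """Multiply the ontology-based score by a text-based score.

    text_similarity stands for a tf-idf or LSI similarity that was
    computed elsewhere for the same query and document.
    """
    return sim_onto(query_concepts, doc_concepts, k) * text_similarity

# Hypothetical concept labels, for illustration only.
query = {"local_government", "waste_management"}
document = {"waste_management", "recycling"}
score = combined_similarity(query, document, text_similarity=0.6)
```

Because the two scores are multiplied, a document that shares no concept with the query receives a final score of zero under this reading, however well its text matches.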

