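Master's Thesis
Title: Retrieval Performance Optimization and Resource Evaluation in the Maze System
Name:
Student ID:
Department: School of Information Science and Technology
Major: Computer Architecture
Research Area: Computer Networks and Distributed Systems
Advisor: Professor, Lecturer
May 2006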
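Copyright Notice
No organization or individual that receives or holds any version of this thesis may lend it to others, or copy, transcribe, photograph, or otherwise distribute it, without the author's consent. Anyone who does so and thereby infringes the author's copyright may bear legal liability.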
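Abstract
Maze is a P2P-based content exchange system that uses a centralized architecture to manage users and resources. This thesis covers the Maze retrieval system and the properties of the resources in Maze.
The first part describes the design and implementation of the Maze retrieval system in detail, studies its performance problems, discusses the factors that affect retrieval efficiency, and proposes several improvements. These include an improved integer compression encoding and a Peer-based multi-level caching technique. Finally, the retrieval efficiency of the Maze system is evaluated as a whole and an improvement plan is proposed.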
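The abstract does not say which integer code the thesis improves on. For orientation only, the sketch below shows a common baseline for compressing inverted-file posting lists: variable-byte coding of the gaps between sorted document ids. The function names are hypothetical, and this is not claimed to be the encoding used in Maze.

```python
def vbyte_encode(numbers):
    """Encode non-negative integers with variable-byte coding (7 data bits per byte)."""
    out = bytearray()
    for n in numbers:
        if n < 0:
            raise ValueError("variable-byte coding needs non-negative integers")
        # Low-order 7-bit groups come first; a set high bit means "more bytes follow".
        while n >= 0x80:
            out.append((n & 0x7F) | 0x80)
            n >>= 7
        out.append(n)
    return bytes(out)


def vbyte_decode(data):
    """Inverse of vbyte_encode."""
    numbers = []
    value = 0
    shift = 0
    for byte in data:
        value |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7
        else:
            numbers.append(value)
            value = 0
            shift = 0
    return numbers


def encode_postings(doc_ids):
    """Compress a strictly increasing list of document ids by storing gaps."""
    gaps = []
    previous = 0
    for doc_id in doc_ids:
        gaps.append(doc_id - previous)
        previous = doc_id
    return vbyte_encode(gaps)


def decode_postings(data):
    """Rebuild the document ids by summing the decoded gaps."""
    doc_ids = []
    total = 0
    for gap in vbyte_decode(data):
        total += gap
        doc_ids.append(total)
    return doc_ids
```

Storing gaps rather than absolute ids keeps most values small, so dense posting lists tend to need only one byte per entry under this kind of code.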
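The second part studies the properties of the resources in Maze. It first proposes a search method based on file fingerprints, which aggregates mirror files by fingerprint and offers users every downloadable source of the file they want. It then proposes a method for identifying banned files that combines a banned-fingerprint database with a banned-word list, in order to control the spread of banned files in the Maze network. Finally, the thesis proposes the ResourceRank algorithm, which uses download relationships to build a voting model that estimates the value of a resource. ResourceRank evaluates resources globally, which helps with selectively indexing file resources and with ranking returned results sensibly.
Keywords: P2P, retrieval system, inverted file, caching mechanism, ResourceRank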
Retrieval Performance Optimization and Resource Evaluation in the Maze System
Abstract
Maze is a P2P-based file exchange system that manages users and resources through a centralized architecture. In this thesis, we study the Maze retrieval system and the attributes of the resources in Maze.
In the first part, we describe the design and implementation of the Maze retrieval system in detail. We then study the performance of the retrieval system and discuss the factors that influence retrieval efficiency. Several improvements are presented, including an improved integer compression code and a Peer-based multi-level caching technique. Finally, we evaluate the efficiency of the Maze retrieval system and propose a scheme for improving it.
In the second part, we study the attributes of the resources in Maze. We first propose a search method based on file fingerprints: fingerprints are used to cluster mirror files, so that all available sources of a file can be offered to the user. We also present a recognition method that uses both a banned-file fingerprint database and a banned-word list to control the transfer of banned files in Maze.
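As a rough illustration of the two ideas in this paragraph, the sketch below groups shared files by a content fingerprint and checks a file against a banned-fingerprint set and a banned-word list. It rests on assumptions not stated in the abstract: a SHA-1 digest of the file content stands in for the fingerprint, the two banned-file checks are combined with a simple "either one matches" rule, and all function names and the tuple layout are hypothetical.

```python
import hashlib
from collections import defaultdict


def fingerprint(content: bytes) -> str:
    # Assumption: a SHA-1 digest of the whole file stands in for the thesis's fingerprint.
    return hashlib.sha1(content).hexdigest()


def group_sources(shared_files):
    """Map each fingerprint to the peers sharing a file with that content.

    shared_files: iterable of (peer_id, file_name, content) tuples.
    """
    sources = defaultdict(list)
    for peer_id, file_name, content in shared_files:
        sources[fingerprint(content)].append((peer_id, file_name))
    return dict(sources)


def is_banned(file_name, file_fp, banned_fingerprints, banned_words):
    """Flag a file if its fingerprint is listed or its name contains a banned word.

    Assumption: the two checks are combined with a logical OR.
    """
    if file_fp in banned_fingerprints:
        return True
    lowered = file_name.lower()
    return any(word.lower() in lowered for word in banned_words)
```

With this layout, two peers sharing byte-identical files under different names end up in the same group, so a lookup by fingerprint returns both as sources. A renamed copy of a banned file is still caught by the fingerprint check even when its name contains no banned word.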
Finally, an algorithm called ResourceRank is proposed. We use the upload and download relationships to construct a voting model that evaluates the resources. ResourceRank makes the