Uploaded by 鼠标, 2023/6/8

Recommender Systems: Item-Based Collaborative Filtering (ItemCF)

A while ago I finished the user-based recommender; the details are in my other article on UserCF. (I suggest understanding UserCF before reading this post, though experts can of course skip that.)

Once you understand user-based collaborative filtering, item-based collaborative filtering will not feel very different. In plain terms: user A has liked a set of items s, so at recommendation time we recommend to user A the top N items most similar to the items in s. That's it. Short and sweet, right? The reasoning really is much the same, so if you followed UserCF, ItemCF will feel familiar.

The concrete steps:

1. Compute the similarity between items.
2. Generate a recommendation list for the user from the item similarities and the user's history.

As before, the similarity is cosine similarity: for two items i and j, the number of users who liked both, divided by sqrt(N(i) * N(j)), where N(i) is the number of users who liked item i. See my UserCF post for the details; since the two are so close I won't repeat them here. Compare that post with the book and you will see they are essentially the same.

Here is a worked example from the book; the bottom row is the system's Top-N recommendation, which is easy to follow:

[figure: worked ItemCF example from the book]

As before, here is the ItemCF code, posted for everyone to study:

# -*- coding: utf-8 -*-
'''
Created on 2015-06-******

@author: Lockvictor
'''
import sys
import random
import math
import os
from operator import itemgetter

# Dotted names (self.xxx, sys.stderr, math.sqrt, ...) are missing from this
# copy of the code; the ones used below are reconstructed from context.
random.seed(0)


class ItemBasedCF(object):
    ''' TopN recommendation - Item Based Collaborative Filtering '''

    def __init__(self):
        self.trainset = {}
        self.testset = {}

        self.n_sim_movie = 20  # K: similar movies considered per watched movie
        self.n_rec_movie = 10  # N: movies recommended per user

        self.movie_sim_mat = {}
        self.movie_popular = {}
        self.movie_count = 0

        print('Similar movie number = %d' % self.n_sim_movie, file=sys.stderr)
        print('Recommended movie number = %d' %
              self.n_rec_movie, file=sys.stderr)

    @staticmethod
    def loadfile(filename):
        ''' load a file, return a generator. '''
        with open(filename, 'r') as fp:
            for i, line in enumerate(fp):
                yield line.strip('\r\n')
                if i % 100000 == 0:
                    print('loading %s(%s)' % (filename, i), file=sys.stderr)
        print('loaded %s' % filename, file=sys.stderr)

    # The default value of pivot is missing from this copy; 0.7 is an assumption.
    def generate_dataset(self, filename, pivot=0.7):
        ''' load rating data and split it to training set and test set '''
        trainset_len = 0
        testset_len = 0

        for line in self.loadfile(filename):
            user, movie, rating, _ = line.split('::')
            # split the data by pivot
            if random.random() < pivot:
                self.trainset.setdefault(user, {})
                self.trainset[user][movie] = int(rating)
                trainset_len += 1
            else:
                self.testset.setdefault(user, {})
                self.testset[user][movie] = int(rating)
                testset_len += 1

        print('split training set and test set', file=sys.stderr)
        print('train set = %s' % trainset_len, file=sys.stderr)
        print('test set = %s' % testset_len, file=sys.stderr)

    def calc_movie_sim(self):
        ''' calculate movie similarity matrix '''
        print('counting movies number and popularity...', file=sys.stderr)

        for user, movies in self.trainset.items():
            for movie in movies:
                # count item popularity
                if movie not in self.movie_popular:
                    self.movie_popular[movie] = 0
                self.movie_popular[movie] += 1

        print('count movies number and popularity', file=sys.stderr)

        # save the total number of movies
        self.movie_count = len(self.movie_popular)
        print('total movie number = %d' % self.movie_count, file=sys.stderr)

        # count co-rated users between items
        itemsim_mat = self.movie_sim_mat
        print('building co-rated users matrix...', file=sys.stderr)

        for user, movies in self.trainset.items():
            for m1 in movies:
                for m2 in movies:
                    if m1 == m2:
                        continue
                    itemsim_mat.setdefault(m1, {})
                    itemsim_mat[m1].setdefault(m2, 0)
                    itemsim_mat[m1][m2] += 1

        print('build co-rated users matrix', file=sys.stderr)

        # calculate similarity matrix (cosine similarity on co-occurrence counts)
        print('calculating movie similarity matrix...', file=sys.stderr)
        simfactor_count = 0
        PRINT_STEP = 2000000

        for m1, related_movies in itemsim_mat.items():
            for m2, count in related_movies.items():
                itemsim_mat[m1][m2] = count / math.sqrt(
                    self.movie_popular[m1] * self.movie_popular[m2])
                simfactor_count += 1
                if simfactor_count % PRINT_STEP == 0:
                    print('calculating movie similarity factor(%d)' %
                          simfactor_count, file=sys.stderr)

        print('calculate movie similarity matrix(similarity factor)',
              file=sys.stderr)
        print('Total similarity factor number = %d' %
              simfactor_count, file=sys.stderr)

    def recommend(self, user):
        ''' Find K similar movies and recommend N movies. '''
        K = self.n_sim_movie
        N = self.n_rec_movie
        rank = {}
        watched_movies = self.trainset[user]

        for movie, rating in watched_movies.items():
            # .get() guards against a movie that never co-occurred with another
            # movie and therefore has no row in the similarity matrix
            for related_movie, similarity_factor in sorted(
                    self.movie_sim_mat.get(movie, {}).items(),
                    key=itemgetter(1), reverse=True)[:K]:
                if related_movie in watched_movies:
                    continue
                rank.setdefault(related_movie, 0)
                rank[related_movie] += similarity_factor * rating
        # return the N best movies
        return sorted(rank.items(), key=itemgetter(1), reverse=True)[:N]

    def evaluate(self):
        ''' print evaluation result: precision, recall, coverage and popularity '''
        print('Evaluation start...', file=sys.stderr)

        N = self.n_rec_movie
        # variables for precision and recall
        hit = 0
        rec_count = 0
        test_count = 0
        # variables for coverage
        all_rec_movies = set()
        # variables for popularity
        popular_sum = 0

        for i, user in enumerate(self.trainset):
            if i % 500 == 0:
                print('recommended for %d users' % i, file=sys.stderr)
            test_movies = self.testset.get(user, {})
            rec_movies = self.recommend(user)
            for movie, _ in rec_movies:
                if movie in test_movies:
                    hit += 1
                all_rec_movies.add(movie)
                popular_sum += math.log(1 + self.movie_popular[movie])
            rec_count += N
            test_count += len(test_movies)

        precision = hit / (1.0 * rec_count)
        recall = hit / (1.0 * test_count)
        coverage = len(all_rec_movies) / (1.0 * self.movie_count)
        popularity = popular_sum / (1.0 * rec_count)

        print('precision=%.4f\trecall=%.4f\tcoverage=%.4f\tpopularity=%.4f' %
              (precision, recall, coverage, popularity), file=sys.stderr)


if __name__ == '__main__':
    # ratings.dat is the '::'-separated rating file of the MovieLens 1M dataset
    ratingfile = os.path.join('ml-1m', 'ratings.dat')
    itemcf = ItemBasedCF()
    itemcf.generate_dataset(ratingfile)
    itemcf.calc_movie_sim()
    itemcf.evaluate()

Code source:
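
As a supplementary sketch (not part of the original post or of the code above), the two steps can be tried on a tiny in-memory dataset without downloading MovieLens. The toy data, the function names `item_similarity` and `recommend`, and the parameters `k` and `n` are all made up for illustration; the similarity is the same co-occurrence cosine described in the text, with every rating set to 1.

```python
import math
from collections import defaultdict
from operator import itemgetter

# Hypothetical toy data: user -> {item: rating}. Every rating is 1 here.
train = {
    "A": {"a": 1, "b": 1, "d": 1},
    "B": {"b": 1, "c": 1, "e": 1},
    "C": {"c": 1, "d": 1},
    "D": {"b": 1, "c": 1, "d": 1},
    "E": {"a": 1, "d": 1},
}


def item_similarity(train):
    """Step 1: cosine similarity between items from co-occurrence counts."""
    popularity = defaultdict(int)                      # N(i): users who liked i
    co_counts = defaultdict(lambda: defaultdict(int))  # users who liked i and j
    for items in train.values():
        for i in items:
            popularity[i] += 1
            for j in items:
                if i != j:
                    co_counts[i][j] += 1
    return {
        i: {j: c / math.sqrt(popularity[i] * popularity[j])
            for j, c in related.items()}
        for i, related in co_counts.items()
    }


def recommend(user, train, sim, k=3, n=2):
    """Step 2: score unseen items via the k nearest items of each liked item."""
    watched = train[user]
    rank = defaultdict(float)
    for item, rating in watched.items():
        neighbours = sorted(sim.get(item, {}).items(),
                            key=itemgetter(1), reverse=True)[:k]
        for other, weight in neighbours:
            if other in watched:
                continue  # never recommend something the user already has
            rank[other] += weight * rating
    return sorted(rank.items(), key=itemgetter(1), reverse=True)[:n]


sim = item_similarity(train)
recs = recommend("E", train, sim)
print(recs)
```

User E has liked items a and d, so the ranking only contains items that co-occur with a or d in other users' histories. Item b ranks first because it is a neighbour of both a and d and therefore collects score from each.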