Abstract
Microblogging is becoming one of the most popular applications. According to statistics, more than 100 million tweets are published every day. These tweets not only describe facts but also convey the emotional states of a massive number of microblog users. This emotional information can help users decide whether to buy a product, give companies an important reference for making market strategy, and provide governments with large-scale data for monitoring public opinion.
In light of this, we propose a sentiment analysis method for Chinese tweets based on a combination of syntactic dependencies and text classification techniques. The method first uses syntactic dependencies to perform sentiment analysis and, at the same time, computes a confidence score for every tweet. Tweets whose confidence is above a certain threshold are chosen as training samples, and a two-step sentiment classifier is trained on them using the content features and media features of tweets. Finally, the sentiment orientation of the tweets is classified again with the trained classifier. In addition, we propose a method that uses common emoticons as the sentiment class labels of tweets and implements an incremental learning method to tackle the problem of real-time sentiment analysis.
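The pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the dependency-based stage has already produced a label and a confidence for each tweet, substitutes a small incremental Naive Bayes model for the unspecified classifier, and uses only bag-of-words tokens in place of the content and media features. The names `IncrementalNB`, `train_two_step`, `classify`, `EMOTICON_LABELS`, the threshold value, and the English sample tweets are all hypothetical.

```python
import math
from collections import Counter, defaultdict


class IncrementalNB:
    """Tiny multinomial Naive Bayes that can be updated one sample at a time.
    (Assumed stand-in; the abstract does not name the classifier.)"""

    def __init__(self):
        self.class_counts = Counter()
        self.token_counts = defaultdict(Counter)
        self.vocab = set()

    def update(self, tokens, label):
        # Incremental learning step: only counts change, no retraining.
        self.class_counts[label] += 1
        self.token_counts[label].update(tokens)
        self.vocab.update(tokens)

    def predict(self, tokens):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            # Add-one smoothing, with one extra slot for unseen tokens.
            denom = sum(self.token_counts[label].values()) + len(self.vocab) + 1
            score = math.log(n / total)
            for t in tokens:
                score += math.log((self.token_counts[label][t] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def train_two_step(scored_tweets, threshold=0.8):
    """scored_tweets: (tokens, label, confidence) triples from the rule-based
    stage, with label in {"pos", "neg", "obj"}. The threshold is an assumption."""
    subj_clf, polar_clf = IncrementalNB(), IncrementalNB()
    for tokens, label, conf in scored_tweets:
        if conf <= threshold:  # keep only high-confidence tweets
            continue
        subj_clf.update(tokens, "objective" if label == "obj" else "subjective")
        if label != "obj":
            polar_clf.update(tokens, label)
    return subj_clf, polar_clf


def classify(tokens, subj_clf, polar_clf):
    # Step 1: subjective vs. objective. Step 2: polarity of subjective tweets.
    if subj_clf.predict(tokens) == "objective":
        return "obj"
    return polar_clf.predict(tokens)


# Hypothetical output of the dependency-based stage.
scored = [
    (["love", "this", "phone"], "pos", 0.95),
    (["great", "camera", "love"], "pos", 0.90),
    (["hate", "this", "battery"], "neg", 0.92),
    (["terrible", "screen", "hate"], "neg", 0.85),
    (["phone", "released", "today"], "obj", 0.90),
    (["store", "opens", "today"], "obj", 0.88),
    (["love", "hate", "today"], "neg", 0.30),  # below threshold, skipped
]
subj_clf, polar_clf = train_two_step(scored)

# Emoticons as class labels for newly arriving tweets (hypothetical mapping).
EMOTICON_LABELS = {":)": "pos", ":(": "neg"}
for raw in ["awesome update :)", "awful lag :("]:
    label = next((l for e, l in EMOTICON_LABELS.items() if e in raw), None)
    if label is not None:
        tokens = [t for t in raw.split() if t not in EMOTICON_LABELS]
        subj_clf.update(tokens, "subjective")
        polar_clf.update(tokens, label)

print(classify(["love", "camera"], subj_clf, polar_clf))
print(classify(["store", "released", "today"], subj_clf, polar_clf))
```

Because `update` only increments counts, a newly arrived emoticon-labeled tweet can be absorbed without retraining on the earlier samples, which is the property that makes this kind of model suitable for a real-time setting.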
Experimental results show that the proposed method improves precision and recall by 6% and 3%, respectively, compared with the method based only on syntactic dependencies. Our two feature sets also outperform unigram features: precision and recall are both 88% for the subjectivity classifier, and they are % and % for the sentiment classifier. In addition, the media features are helpful for tackling the problem of real-time sentiment analysis.
Keywords: Chinese Microblog, Sentiment Analysis, Syntactic Dependencies, Text Classification
Contents
Abstract (Chinese)
Abstract