Project

Text mining on Amazon reviews to extract feature based feedback

Analyzing feedback on a product or service helps improve its quality. Reviews on online shopping sites such as Amazon therefore help not only consumers deciding what to buy but also manufacturers and sellers who want to know the pros and cons of their products. Amazon star ratings alone are not enough for this; one has to read the text reviews to learn which specific feature of a product is failing to satisfy customers. But a product may have thousands of reviews, and it is hard for one person to read them all. Hence, we need a system that gives a statistical report on the number of reviewers who are not satisfied with a specific feature of a product. This project lets the user view feature-based reviews for a selected category of Amazon products. For a particular product, the user can view the percentage of dissatisfied reviewers for each major feature. The dataset, which includes product details and customer reviews for each product, is collected from Amazon.com. The system is implemented using MongoDB and R, and the statistical results it generates are visualized with Tableau. The Amazon reviews undergo natural language processing and text mining to identify the major features of the product. A deep sentiment analysis then identifies the polarity (positive or negative) of each review. The project can be further developed into an interactive web or mobile application in which users choose categories and products and get better visualization of the results. The implementation required integrating both data mining and artificial intelligence techniques.
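To make the per-feature statistic concrete, the following is a minimal sketch of how a "percentage of dissatisfied reviewers for each feature" could be computed. It is not the project's implementation: the project used R and MongoDB, while this sketch uses plain Python, and the feature keywords, the negative-word lexicon, the sample reviews, and the function names are all hypothetical stand-ins for the text mining and sentiment analysis steps.

```python
import re
from collections import defaultdict

# Hypothetical feature keywords and negative-word lexicon (assumptions for
# illustration; the project derives features through text mining instead).
FEATURES = {
    "battery": {"battery", "charge"},
    "screen": {"screen", "display"},
}
NEGATIVE = {"bad", "poor", "terrible", "dies", "dim", "broken"}


def sentence_word_sets(text):
    """Split a review into sentences and return each as a set of lowercase words."""
    parts = re.split(r"[.!?]+", text.lower())
    return [set(re.findall(r"[a-z']+", p)) for p in parts if p.strip()]


def dissatisfaction_by_feature(reviews):
    """Return, per feature, the percentage of mentioning reviews that are negative.

    A review counts as dissatisfied with a feature when any sentence that
    mentions the feature also contains a word from the negative lexicon.
    """
    mentioned = defaultdict(int)
    unhappy = defaultdict(int)
    for review in reviews:
        sentences = sentence_word_sets(review)
        for feature, keywords in FEATURES.items():
            relevant = [words for words in sentences if words & keywords]
            if not relevant:
                continue
            mentioned[feature] += 1
            if any(words & NEGATIVE for words in relevant):
                unhappy[feature] += 1
    return {f: round(100.0 * unhappy[f] / mentioned[f], 1) for f in mentioned}


# Made-up sample reviews, not taken from the Amazon dataset.
reviews = [
    "The battery dies within an hour. The screen is gorgeous.",
    "Great battery life and a sharp display.",
    "Screen is dim outdoors. Battery is fine.",
    "Poor battery. I love the screen though.",
]
report = dissatisfaction_by_feature(reviews)
print(report)
```

A lexicon lookup like this is only a placeholder for sentiment analysis; it would miss negation ("not bad") and sarcasm, which is one reason a deeper analysis is needed in practice.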

Project (M.S., Computer Science)--California State University, Sacramento, 2017.
