Tweet segmentation based on POS Tagging

T. Mounika, K. Deepika

Abstract


Twitter has become one of the most important communication channels because of its potential to deliver the most up-to-date and newsworthy information. Given the wide use of Twitter as a source of information, finding an interesting tweet for a user among a large number of tweets is hard. With a massive number of tweets sent per day by hundreds of millions of users, information overload is inevitable. To extract information from large volumes of tweets, Named Entity Recognition (NER) methods developed for formal texts are typically applied. However, many applications in Information Retrieval (IR) and Natural Language Processing (NLP) suffer severely from the noisy and short nature of tweets. In this paper, we propose a novel framework for tweet segmentation in a batch mode, called HybridSeg. By splitting tweets into meaningful segments, the semantic or context information is well preserved and easily extracted by downstream applications. HybridSeg finds the optimal segmentation of a tweet by maximizing the sum of the stickiness scores of its candidate segments. The stickiness score considers the probability of a segment being a phrase in English (i.e., global context) and the probability of a segment being a phrase within the batch of tweets (i.e., local context). For the latter, we propose and evaluate two models to derive local context by considering the linguistic features and the term dependency in a batch of tweets, respectively. HybridSeg is also designed to learn iteratively from confident segments as pseudo feedback. As an application, we show that high accuracy is achieved in named entity recognition by applying segment-based part-of-speech (POS) tagging.
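
The optimization stated in the abstract (choose the segmentation whose segments have the largest total stickiness) can be illustrated with a small dynamic program. The following is only a minimal sketch under stated assumptions, not the HybridSeg implementation: the names segment_tweet and make_toy_stickiness, the max_len limit, the dictionary-based probabilities, and the linear mix of global and local context are all hypothetical choices made for illustration, since the abstract does not say how the two probabilities are combined.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Segment = Tuple[str, ...]


def segment_tweet(
    words: Sequence[str],
    stickiness: Callable[[Segment], float],
    max_len: int = 5,  # assumed upper bound on segment length (hypothetical)
) -> Tuple[List[Segment], float]:
    """Return the segmentation that maximizes the sum of stickiness scores."""
    n = len(words)
    # best[i] holds the best total score for the first i words.
    best = [float("-inf")] * (n + 1)
    best[0] = 0.0
    # back[i] holds the start index of the last segment in that best split.
    back = [0] * (n + 1)

    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            score = best[start] + stickiness(tuple(words[start:end]))
            if score > best[end]:
                best[end] = score
                back[end] = start

    # Walk the back-pointers from the end to recover the segments.
    segments: List[Segment] = []
    end = n
    while end > 0:
        start = back[end]
        segments.append(tuple(words[start:end]))
        end = start
    segments.reverse()
    return segments, best[n]


def make_toy_stickiness(
    global_prob: Dict[Segment, float],
    local_prob: Dict[Segment, float],
    weight: float = 0.5,
) -> Callable[[Segment], float]:
    """Toy score: a linear mix of global and local phrase probabilities.

    The linear mix is an assumption made for this sketch only.
    """
    def stickiness(segment: Segment) -> float:
        g = global_prob.get(segment, 0.0)  # phrase probability in English
        l = local_prob.get(segment, 0.0)   # phrase probability in the batch
        return (1.0 - weight) * g + weight * l

    return stickiness


if __name__ == "__main__":
    # Made-up probabilities, for illustration only.
    toy_global = {("new", "york", "city"): 0.9, ("new", "york"): 0.8,
                  ("i",): 0.1, ("love",): 0.1}
    toy_local = {("new", "york", "city"): 0.7}
    score_fn = make_toy_stickiness(toy_global, toy_local)
    print(segment_tweet("i love new york city".split(), score_fn))
```

In this sketch the score function is passed in as a parameter, so a different estimate of global or local context could be substituted without changing the search over segmentations.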



Copyright (c) 2017 Edupedia Publications Pvt Ltd

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Publisher

EduPedia Publications Pvt Ltd, D-351, Prem Nagar-2, Suleman Nagar, Kirari, Nagloi, New Delhi, PIN Code 110086, India. Phone: +919958037887 or +919557022047

All published Articles are Open Access at https://edupediapublications.org/journals/

