Word segmentation and part-of-speech (POS) tagging are foundational steps in natural language processing. Word segmentation groups a sequence of individual characters into a sequence of words according to the conventions of the language. POS tagging assigns the most likely part of speech to each word in a given word sequence.
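As an illustration of grouping characters into words, here is a minimal sketch of forward maximum matching, a classical dictionary-based segmentation technique. It is not necessarily the method used by this algorithm; the function name `forward_max_match`, the `max_len` parameter, and the toy vocabulary are assumptions made for the example.

```python
def forward_max_match(text, vocab, max_len=4):
    """Greedy forward maximum matching over a toy dictionary (illustrative only)."""
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until the dictionary matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # A single character is always accepted as a fallback.
            if size == 1 or candidate in vocab:
                words.append(candidate)
                i += size
                break
    return words


# Hypothetical vocabulary, chosen only for this example.
toy_vocab = {"自然", "语言", "自然语言", "处理"}
segments = forward_max_match("自然语言处理", toy_vocab)
print(segments)
```

Greedy matching prefers the longest dictionary entry at each position, so it can segment ambiguous text incorrectly; statistical or neural models are usually used to resolve such cases.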
Named entity recognition is a basic task that higher-level natural language applications depend on. It identifies entities with specific meaning in a text, such as person names, place names, organization names, and other proper nouns.
The text sentiment analysis algorithm automatically identifies the opinions or sentiment expressed in a text and returns sentiment orientation indicators that capture both the polarity and the intensity of that sentiment.
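To make the idea of polarity and intensity indicators concrete, the following is a minimal lexicon-based sketch. It is a simplified stand-in, not the algorithm described here; the lexicon values and the `sentiment` function are hypothetical.

```python
# Hypothetical sentiment lexicon: positive weights for positive words,
# negative weights for negative words.
toy_lexicon = {"good": 1.0, "great": 2.0, "bad": -1.0, "terrible": -2.0}


def sentiment(tokens, lexicon):
    """Return (polarity, intensity) by summing the lexicon weights of the tokens."""
    score = sum(lexicon.get(token, 0.0) for token in tokens)
    if score > 0:
        polarity = "positive"
    elif score < 0:
        polarity = "negative"
    else:
        polarity = "neutral"
    # Intensity is taken here as the magnitude of the summed score.
    return polarity, abs(score)


result = sentiment("the food was great but the service was bad".split(), toy_lexicon)
print(result)
```

A lexicon sum ignores negation and context ("not good" would score as positive), which is one reason learned models are generally preferred in practice.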
The keyword extraction algorithm extracts the core terms in a text that best reflect the meaning of the article.
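One common way to approximate "terms that best reflect the meaning" is TF-IDF scoring, sketched below with the standard library only. This is an assumed technique for illustration, not necessarily the one behind this algorithm; `tfidf_keywords`, the smoothing formula, and the toy corpus are all choices made for the example.

```python
import math
from collections import Counter


def tfidf_keywords(doc_tokens, corpus_tokens, top_k=3):
    """Rank the terms of one document by TF-IDF against a small corpus."""
    tf = Counter(doc_tokens)
    n_docs = len(corpus_tokens)
    scores = {}
    for term, count in tf.items():
        # Document frequency: number of corpus documents containing the term.
        df = sum(1 for doc in corpus_tokens if term in doc)
        # Smoothed inverse document frequency.
        idf = math.log((1 + n_docs) / (1 + df)) + 1.0
        scores[term] = (count / len(doc_tokens)) * idf
    # Sort by descending score, breaking ties alphabetically.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_k]


# Toy, pre-tokenized corpus used only for illustration.
docs = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "chased", "the", "cat"],
    ["the", "stock", "market", "fell", "as", "stock", "prices", "dropped"],
]
top_keyword = tfidf_keywords(docs[2], docs, top_k=1)
print(top_keyword)
```

Terms that are frequent in one document but rare across the corpus score highest, which is why "stock" outranks "the" in the third document.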
The automatic text summarization algorithm generates a short, coherent summary that conveys the core content of the original document. It compresses the original information efficiently and helps users read faster.
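The sketch below shows one simple way to compress a text: extractive summarization, which selects the highest-scoring existing sentences rather than generating new ones. It is an assumption-laden illustration (word-frequency scoring, English-only tokenization, no stop-word handling), and `summarize` is a hypothetical helper, not this algorithm's interface.

```python
import re
from collections import Counter


def summarize(text, n_sentences=1):
    """Keep the n sentences whose words are most frequent in the whole text."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(sentence):
        tokens = re.findall(r"[a-z']+", sentence.lower())
        # Average word frequency, so long sentences are not favored automatically.
        return sum(freq[t] for t in tokens) / len(tokens) if tokens else 0.0

    top = sorted(sentences, key=score, reverse=True)[:n_sentences]
    # Emit the selected sentences in their original order.
    return " ".join(s for s in sentences if s in top)


sample = (
    "Solar power is growing fast. "
    "Solar panels turn sunlight into power. "
    "My neighbor owns a cat."
)
summary = summarize(sample, n_sentences=1)
print(summary)
```

Extractive selection cannot rephrase or merge sentences, so a generated summary of the kind described above would typically need an abstractive model instead.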
The language identification algorithm automatically determines which language a text is written in.
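A coarse first step toward identifying a language is detecting the dominant writing script, sketched below using Unicode character names. This is only a heuristic under stated assumptions: a script does not determine a language (Latin script covers English, French, and many others), and `dominant_script` is a hypothetical helper rather than part of this algorithm.

```python
import unicodedata
from collections import Counter


def dominant_script(text):
    """Return the most common script prefix among the alphabetic characters."""
    counts = Counter()
    for ch in text:
        if ch.isalpha():
            # Unicode names begin with the script, e.g. "LATIN SMALL LETTER A".
            name = unicodedata.name(ch, "")
            if name:
                counts[name.split(" ")[0]] += 1
    return counts.most_common(1)[0][0] if counts else "UNKNOWN"


print(dominant_script("Hello world"))
print(dominant_script("Привет, мир"))
print(dominant_script("你好世界"))
```

Telling apart languages that share a script usually requires statistical features such as character n-gram frequencies.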
The text classification algorithm automatically labels a text with a category according to a given classification system or standard.
Sensitivity analysis performs semantic analysis of text content to automatically identify whether the text contains violent, reactionary, pornographic, or other sensitive information.
The text quality assessment algorithm evaluates the overall quality of a text and identifies noisy data such as garbled characters, code, tags, scripts, invalid information, and meaningless articles.
The event element extraction algorithm automatically identifies the core elements of an event described in an article, such as its time, place, people involved, and event feature words.
A word vector is a word representation commonly used in deep learning. It encodes both the word itself and its semantic relationships with other words.
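The way a vector can capture semantic association is often illustrated with cosine similarity: related words point in similar directions. The sketch below uses hand-picked 3-dimensional vectors that are purely hypothetical; real word vectors are learned from data and have far more dimensions.

```python
import math

# Hypothetical toy vectors, hand-picked for illustration only.
toy_vectors = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.9, 0.7, 0.2],
    "apple": [0.1, 0.2, 0.9],
}


def cosine(u, v):
    """Cosine of the angle between two vectors (0.0 if either has zero length)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest(word, vectors):
    """Return the other word whose vector is most similar to the given word's."""
    return max(
        (w for w in vectors if w != word),
        key=lambda w: cosine(vectors[word], vectors[w]),
    )


print(nearest("king", toy_vectors))
```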
Relation extraction is an important information extraction task that identifies the semantic relations between pairs of entities.
The text similarity algorithm uses deep learning to compute the semantic similarity between texts with high accuracy and efficiency.
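As a rough illustration of semantic similarity computing, the sketch below averages per-word vectors into a text vector and compares texts by cosine similarity. This is a simplified baseline, not the deep learning model referred to above; the embeddings in `toy_embeddings` and the helper names `embed` and `text_similarity` are hypothetical.

```python
import math

# Hypothetical embeddings; a real system would load learned vectors.
toy_embeddings = {
    "cheap": [0.9, 0.1, 0.0],
    "inexpensive": [0.8, 0.2, 0.0],
    "flight": [0.1, 0.9, 0.0],
    "ticket": [0.2, 0.8, 0.1],
    "rainy": [0.0, 0.1, 0.9],
    "weather": [0.1, 0.0, 0.9],
}


def embed(tokens, vectors, dim=3):
    """Average the vectors of known tokens; unknown tokens are ignored."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return [0.0] * dim
    return [sum(column) / len(known) for column in zip(*known)]


def text_similarity(a, b, vectors):
    """Cosine similarity between the averaged vectors of two token lists."""
    u, v = embed(a, vectors), embed(b, vectors)
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


related = text_similarity(["cheap", "flight"], ["inexpensive", "ticket"], toy_embeddings)
unrelated = text_similarity(["cheap", "flight"], ["rainy", "weather"], toy_embeddings)
print(related > unrelated)
```

The two related phrases share no words yet score as similar, which is the property that separates semantic similarity from plain word overlap.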