cleanNLP - A Tidy Data Model for Natural Language Processing
Provides a set of fast tools for converting a textual corpus into a set of normalized tables. Users may make use of the 'udpipe' back end with no external dependencies, or a Python back end with 'spaCy' <https://spacy.io>. Exposed annotation tasks include tokenization, part of speech tagging, named entity recognition, and dependency parsing.
Last updated 4 months ago
corenlp natural-language-processing spacy
209 stars 5.52 score 15 dependencies
ggimg - Graphics Layers for Plotting Image Data with 'ggplot2'
Provides two new layer types for displaying image data as layers within the Grammar of Graphics framework. Displays images using either a rectangle interface, with a fixed bounding box, or a point interface using a central point and general size parameter. Images can be given as local JPEG or PNG files, external resources, or as a list column containing raster image data.
Last updated 12 months ago
ggplot2-geom image-analysis
52 stars 3.14 score 31 dependencies
genlasso - Path Algorithm for Generalized Lasso Problems
Computes the solution path for generalized lasso problems. Important use cases are the fused lasso over an arbitrary graph, and trend fitting of any given polynomial order. Specialized implementations for these two subproblems are given to improve stability and speed. See Taylor Arnold and Ryan Tibshirani (2016) <doi:10.1080/10618600.2015.1008638>.
Last updated 2 years ago
32 stars 3.07 score 11 dependencies 6 dependents
tif - Text Interchange Format
Provides validation functions for common interchange formats for representing text data in R. Includes formats for corpus objects, document term matrices, and tokens. Other annotations can be stored by overloading the tokens structure.
Last updated 10 months ago
corpus natural-language-processing term-frequency text-processing tokenizer
35 stars 2.57 score 2 dependencies
coreNLP - Wrappers Around Stanford CoreNLP Tools
Provides a minimal interface for applying annotators from the 'Stanford CoreNLP' Java library. Methods are provided for tasks such as tokenisation, part of speech tagging, lemmatisation, named entity recognition, coreference detection, and sentiment analysis.
Last updated 2 years ago
1 star 1.31 score 2 dependencies