Package: cleanNLP 3.1.0

cleanNLP: A Tidy Data Model for Natural Language Processing

Provides a set of fast tools for converting a textual corpus into a set of normalized tables. Users may use the 'udpipe' back end, which has no external dependencies, or a Python back end with 'spaCy' <https://spacy.io>. Exposed annotation tasks include tokenization, part-of-speech tagging, named entity recognition, and dependency parsing.
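
As a rough illustration of the workflow described above, the following minimal sketch initializes a back end and annotates a small corpus. It uses the 'stringi' back end (listed in the help manual below as the standard R back end) so that no model files are needed. The two example sentences and the names `docs` and `anno` are made up for illustration, and the exact columns of the returned tables depend on the back end.

```r
library(cleanNLP)

# Initialize the standard R ('stringi') back end; assumed here because
# it needs no model download, unlike 'udpipe' or 'spaCy'.
cnlp_init_stringi()

# Hypothetical two-document corpus; the vector names are intended to
# serve as document identifiers.
docs <- c(
  doc1 = "The quick brown fox jumps over the lazy dog.",
  doc2 = "Natural language processing turns text into tables."
)

# Run the annotation pipeline; verbose = FALSE turns off progress messages.
anno <- cnlp_annotate(docs, verbose = FALSE)

# The result is a named list of tables, including a token table.
names(anno)
head(anno$token)
```

Switching to another back end should only require calling a different `cnlp_init_*` function before `cnlp_annotate`.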

Authors: Taylor B. Arnold [aut, cre]

Source: cleanNLP_3.1.0.tar.gz
Windows: cleanNLP_3.1.0.zip (r-4.5), cleanNLP_3.1.0.zip (r-4.4), cleanNLP_3.1.0.zip (r-4.3)
macOS: cleanNLP_3.1.0.tgz (r-4.4-any), cleanNLP_3.1.0.tgz (r-4.3-any)
Linux: cleanNLP_3.1.0.tar.gz (r-4.5-noble), cleanNLP_3.1.0.tar.gz (r-4.4-noble)
WebAssembly: cleanNLP_3.1.0.tgz (r-4.4-emscripten), cleanNLP_3.1.0.tgz (r-4.3-emscripten)
Manual: cleanNLP.pdf | cleanNLP.html
API: cleanNLP/json
NEWS

# Install 'cleanNLP' in R:

install.packages('cleanNLP', repos = c('https://statsmaths.r-universe.dev', 'https://cloud.r-project.org'))

Bug tracker: https://github.com/statsmaths/cleannlp/issues

Datasets:

un - Universal Declaration of Human Rights
word_frequency - Most frequent English words

On CRAN:

Topics: corenlp, natural-language-processing, spacy

8.85 score, 212 stars, 221 scripts, 728 downloads, 8 exports, 15 dependencies

Last updated 6 months ago from: 0e6bf7d8f6. Checks: OK: 7. Indexed: yes.

Target	Result	Date
Doc / Vignettes	OK	Nov 16 2024
R-4.5-win	OK	Nov 16 2024
R-4.5-linux	OK	Nov 16 2024
R-4.4-win	OK	Nov 16 2024
R-4.4-mac	OK	Nov 16 2024
R-4.3-win	OK	Nov 16 2024
R-4.3-mac	OK	Nov 16 2024

Exports: cnlp_annotate, cnlp_download_spacy, cnlp_init_spacy, cnlp_init_stringi, cnlp_init_udpipe, cnlp_utils_pca, cnlp_utils_tf, cnlp_utils_tfidf

Dependencies: data.table, here, jsonlite, lattice, Matrix, png, rappdirs, Rcpp, RcppTOML, reticulate, rlang, rprojroot, stringi, udpipe, withr

Creating Text Visualizations with Wikipedia Data

Taylor Arnold

Rendered from wikipedia.Rmd using knitr::rmarkdown on Nov 16 2024.

Last update: 2020-03-07
Started: 2019-10-22

Exploring the State of the Union Addresses: A Case Study with cleanNLP

Taylor Arnold

Rendered from state-of-union.Rmd using knitr::rmarkdown on Nov 16 2024.

Last update: 2020-03-07
Started: 2019-10-22

Readme and manuals

Help Manual

Help page	Topics
cleanNLP: A Tidy Data Model for Natural Language Processing	cleanNLP-package cleanNLP
Run the annotation pipeline on a set of documents	cnlp_annotate
Download model files needed for spacy	cnlp_download_spacy
Interface for initializing the spacy backend	cnlp_init_spacy
Interface for initializing the standard R backend	cnlp_init_stringi
Interface for initializing the udpipe backend	cnlp_init_udpipe
Compute Principal Components and store as a Data Frame	cnlp_utils_pca
Construct the TF-IDF Matrix from Annotation or Data Frame	cnlp_utils_tf cnlp_utils_tfidf
Universal Declaration of Human Rights	un
Most frequent English words	word_frequency