Package: cleanNLP 3.1.0

cleanNLP: A Tidy Data Model for Natural Language Processing

Provides a set of fast tools for converting a textual corpus into a set of normalized tables. Users may use the 'udpipe' back end, which has no external dependencies, or a Python back end with 'spaCy' <https://spacy.io>. Exposed annotation tasks include tokenization, part-of-speech tagging, named entity recognition, and dependency parsing.

Authors: Taylor B. Arnold [aut, cre]

cleanNLP_3.1.0.tar.gz
cleanNLP_3.1.0.zip (r-4.5), cleanNLP_3.1.0.zip (r-4.4), cleanNLP_3.1.0.zip (r-4.3)
cleanNLP_3.1.0.tgz (r-4.4-any), cleanNLP_3.1.0.tgz (r-4.3-any)
cleanNLP_3.1.0.tar.gz (r-4.5-noble), cleanNLP_3.1.0.tar.gz (r-4.4-noble)
cleanNLP_3.1.0.tgz (r-4.4-emscripten), cleanNLP_3.1.0.tgz (r-4.3-emscripten)
cleanNLP.pdf | cleanNLP.html
cleanNLP/json (API)
NEWS

# Install 'cleanNLP' in R:
install.packages('cleanNLP', repos = c('https://statsmaths.r-universe.dev', 'https://cloud.r-project.org'))
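
The annotation workflow named in the description can be sketched in a few lines. This is a minimal example, not taken from the package documentation: it assumes the 'udpipe' back end with its English model (which is downloaded on first use and so needs network access), and the two-row corpus is made up for illustration.

```r
library(cleanNLP)

# Initialise the udpipe back end. Assumption: the "english" model is
# fetched automatically the first time this is called.
cnlp_init_udpipe(model_name = "english")

# Hypothetical corpus: one row per document, using the default
# column names expected by cnlp_annotate() ("doc_id" and "text").
corpus <- data.frame(
  doc_id = c("doc1", "doc2"),
  text = c(
    "The quick brown fox jumps over the lazy dog.",
    "All human beings are born free and equal in dignity and rights."
  ),
  stringsAsFactors = FALSE
)

# Run the annotation; the result is a list of normalized tables.
anno <- cnlp_annotate(corpus, verbose = FALSE)

# The token table has one row per token, with lemma, part-of-speech
# and dependency information alongside the document id.
head(anno$token)
```

Named entity recognition is not shown here, since it depends on the 'spaCy' back end (cnlp_init_spacy) and a working Python installation.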

Bug tracker: https://github.com/statsmaths/cleannlp/issues

Datasets:
  • un - Universal Declaration of Human Rights
  • word_frequency - Most frequent English words

On CRAN:

corenlp, natural-language-processing, spacy

8 exports, 209 stars, 5.52 score, 15 dependencies, 207 scripts, 479 downloads

Last updated 4 months ago from: 0e6bf7d8f6. Checks: OK: 7. Indexed: yes.

Target           Result  Date
Doc / Vignettes  OK      Sep 17 2024
R-4.5-win        OK      Sep 17 2024
R-4.5-linux      OK      Sep 17 2024
R-4.4-win        OK      Sep 17 2024
R-4.4-mac        OK      Sep 17 2024
R-4.3-win        OK      Sep 17 2024
R-4.3-mac        OK      Sep 17 2024

Exports: cnlp_annotate, cnlp_download_spacy, cnlp_init_spacy, cnlp_init_stringi, cnlp_init_udpipe, cnlp_utils_pca, cnlp_utils_tf, cnlp_utils_tfidf

Dependencies: data.table, here, jsonlite, lattice, Matrix, png, rappdirs, Rcpp, RcppTOML, reticulate, rlang, rprojroot, stringi, udpipe, withr

Creating Text Visualizations with Wikipedia Data

Rendered from wikipedia.Rmd using knitr::rmarkdown on Sep 17 2024.

Last update: 2020-03-07
Started: 2019-10-22

Exploring the State of the Union Addresses: A Case Study with cleanNLP

Rendered from state-of-union.Rmd using knitr::rmarkdown on Sep 17 2024.

Last update: 2020-03-07
Started: 2019-10-22