Tokenizers break text into pieces that are more usable by machine learning models. Many tokenizers share some preparation steps. This package provides those shared steps, along with a simple tokenizer.
Version: 1.0.2
Depends: R (≥ 2.10)
Imports: cli, glue, rlang (≥ 0.4.2), stringi, stringr
Suggests: covr, testthat (≥ 3.0.0)
Published: 2023-06-02
DOI: 10.32614/CRAN.package.piecemaker
Author: Jon Harmon [aut, cre], Jonathan Bratt [aut], Bedford Freeman & Worth Pub Grp LLC DBA Macmillan Learning [cph]
Maintainer: Jon Harmon <jonthegeek at gmail.com>
BugReports: https://github.com/macmillancontentscience/piecemaker/issues
License: Apache License (≥ 2)
URL: https://github.com/macmillancontentscience/piecemaker, https://macmillancontentscience.github.io/piecemaker/
NeedsCompilation: no
Materials: README, NEWS
CRAN checks: piecemaker results
Reference manual: piecemaker.pdf
Package source: piecemaker_1.0.2.tar.gz
Windows binaries: r-devel: piecemaker_1.0.2.zip, r-release: piecemaker_1.0.2.zip, r-oldrel: piecemaker_1.0.2.zip
macOS binaries: r-release (arm64): piecemaker_1.0.2.tgz, r-oldrel (arm64): piecemaker_1.0.2.tgz, r-release (x86_64): piecemaker_1.0.2.tgz, r-oldrel (x86_64): piecemaker_1.0.2.tgz
Old sources: piecemaker archive
Reverse imports: morphemepiece, wordpiece
Please use the canonical form https://CRAN.R-project.org/package=piecemaker to link to this page.