joncoded
run-times + snippets + thoughts
📜⚙️📃

2025-12-17

code

[snippets]

Pre-processing steps for NLP

splitting text into tokens with tokenization + removing "textual noise" + reducing tokens to their base forms with stemming and lemmatization
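
As a rough illustration of those three steps, here is a minimal pure-Python sketch. Everything in it is an assumption made for the example rather than something taken from this page: the tiny `STOP_WORDS` set, the `SUFFIXES` tuple, the `LEMMAS` lookup table, and the function names are all hypothetical. The stemmer is a naive suffix stripper, not the Porter algorithm, and the lemmatizer is only a dictionary lookup. A real project would more likely reach for a library such as NLTK or spaCy.

```python
import re

# Hypothetical, deliberately tiny stop-word list ("textual noise").
STOP_WORDS = {"a", "an", "and", "the", "is", "are", "of", "to", "in"}

# Hypothetical suffixes for a naive stemmer, checked in this order.
SUFFIXES = ("ing", "ed", "es", "s")

# Hypothetical lemma table; real lemmatizers use a full vocabulary.
LEMMAS = {"mice": "mouse", "ran": "run", "better": "good"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def remove_noise(tokens):
    """Drop tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def naive_stem(token):
    """Strip the first matching suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def lemmatize(token):
    """Map a token to its dictionary form when the table knows it."""
    return LEMMAS.get(token, token)


def preprocess(text):
    """Tokenize, remove noise, then stem each remaining token."""
    return [naive_stem(t) for t in remove_noise(tokenize(text))]


if __name__ == "__main__":
    print(preprocess("The cats are jumping!"))
    print([lemmatize(t) for t in tokenize("Mice ran")])
```

Stemming and lemmatization are kept as separate functions because they trade off differently: the stemmer is cheap but can produce non-words, while the lookup only handles forms it has been told about.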
