joncoded

📜⚙️📃

2025-12-17

code

[snippets]

Pre-processing steps for NLP

splitting text into tokens with tokenization + removing "textual noise" + reducing tokens to their base forms with stemming and lemmatization
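
a minimal sketch of those three steps, using only the Python standard library. everything here is an assumption made for illustration: the tiny `STOP_WORDS` set, the `LEMMAS` lookup table, and the helper names `tokenize`, `remove_noise`, `stem` and `lemmatize` are hypothetical, "textual noise" is taken to mean punctuation and stop words, and the stemmer is a toy suffix stripper rather than a real algorithm such as Porter's

```python
import re

# Hypothetical, deliberately tiny stop-word list; real lists are much longer.
STOP_WORDS = {"the", "a", "an", "and", "are", "is", "in", "of", "to", "were"}

# Hypothetical lemma lookup; real lemmatizers rely on a full dictionary
# and usually on part-of-speech information as well.
LEMMAS = {"mice": "mouse", "ran": "run", "better": "good"}


def tokenize(text):
    """Lowercase the text and split it into word tokens, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def remove_noise(tokens):
    """Drop tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def stem(token):
    """Toy stemmer: strip one common suffix if at least 3 characters remain."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def lemmatize(token):
    """Map a token to its dictionary form when the lookup table knows it."""
    return LEMMAS.get(token, token)


text = "The mice were chasing the cats!"
tokens = remove_noise(tokenize(text))
stems = [stem(t) for t in tokens]
lemmas = [lemmatize(t) for t in tokens]

print(tokens)  # ['mice', 'chasing', 'cats']
print(stems)   # ['mice', 'chas', 'cat']
print(lemmas)  # ['mouse', 'chasing', 'cats']
```

the two reductions behave differently: the toy stemmer chops suffixes and can leave a non-word such as "chas", while the lookup-based lemmatizer returns real words but only for entries it knows about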

a jxc project - [ joncoded ] aka jonchius
