Vectorization (colab notebook)

A notebook on vectorization with Gensim and Word2Vec, and on sentiment analysis with BERT and BART
2025-12-20 12:40
// updated 2025-12-21 13:19

Notes from a lecture on vectorization (2025-12-20):

(accessible via link, since Google Colab does not allow notebooks to be embedded in an iframe)

Topics covered

  • Vectorization
    • Dimensionality
    • Similarity
    • Dissimilarity
    • Most similar
    • (Heatmap of the similarity matrix)
    • Cosine similarity
    • Transformer architecture
      • Architecture sub-types
        • Encoder-decoder architectures (e.g. BART)
        • Encoder-only architectures (e.g. BERT)
        • Decoder-only architectures (e.g. GPT)
      • Sentiment analysis with transformers
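
As a rough illustration of the similarity topics above (cosine similarity, dissimilarity, and "most similar"), here is a minimal sketch in plain Python. It is not taken from the notebook: the helper names and the 3-dimensional toy vectors are made up for illustration, whereas real Word2Vec embeddings are learned from a corpus and typically have a hundred or more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def cosine_distance(a, b):
    """One common dissimilarity measure: 1 minus the cosine similarity."""
    return 1.0 - cosine_similarity(a, b)


def most_similar(word, vectors, topn=2):
    """Rank every other word by cosine similarity to `word`, highest first."""
    query = vectors[word]
    scores = [
        (other, cosine_similarity(query, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:topn]


# Hypothetical toy vectors, invented for this example only.
toy_vectors = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.85, 0.75, 0.2],
    "apple": [0.1, 0.2, 0.9],
}

if __name__ == "__main__":
    for other, score in most_similar("king", toy_vectors):
        print(f"{other}: {score:.3f}")
```

With these toy values, "queen" ranks above "apple" for the query "king" because its vector points in nearly the same direction. A pairwise table of these scores is the kind of data a similarity heatmap would display.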