Number Representations & States

"how numbers are stored and used in computers"

Machine learning

This is a practical resource for experienced software developers who want to understand how machine learning algorithms work internally. Many of the coding examples are JavaScript programs that run directly in your web browser and operate on real-world data.

Mathematical notation

Understanding mathematical notation is essential to understanding machine learning. It provides a precise and compact language for expressing complex concepts, models, and algorithms. Key ideas such as gradient descent, loss functions, matrix operations, and probability distributions are most clearly and rigorously described in mathematical notation. Without understanding this notation, you will be unable to read research papers or effectively reason about machine learning algorithms.
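As a small illustration of how compact this notation can be, here is one step of gradient descent written in its usual textbook form. This is a sketch for orientation only; the symbol names (θ, η, L) follow common convention and are assumptions, not necessarily the ones used elsewhere in this resource.

```latex
% One gradient descent update (conventional symbols, assumed for illustration):
%   \theta_t               : the model parameters at step t
%   \eta                   : the learning rate, a small positive step size
%   L(\theta)              : the loss function being minimized
%   \nabla_\theta L(\theta_t) : the gradient of the loss with respect to the
%                            parameters, evaluated at \theta_t
\theta_{t+1} = \theta_t - \eta \, \nabla_\theta L(\theta_t)
```

Read aloud: the next parameters are the current parameters, moved a small step in the direction that most rapidly decreases the loss.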

However, I am acutely aware of the widespread allergy to mathematical notation, so I have attempted to augment it with human-readable explanations of each relevant term. Try hovering over the algorithm below to see an explanation of each term.

Outline

WORK IN PROGRESS

  • what is machine learning
  • supervised, unsupervised, and reinforcement learning
  • tools and languages (Python, Jupyter, scikit-learn)

Data preparation and preprocessing

  • data cleaning
  • feature engineering
  • normalization and standardization
  • train, validation, and test splits

Supervised learning

  • linear regression
  • cost functions, gradient descent
  • overfitting and underfitting
  • classification algorithms
  • logistic regression
  • k-nearest neighbors
  • decision trees and random forests
  • support vector machines (SVMs)

Model evaluation and tuning

  • performance metrics (accuracy, precision, recall, F1, ROC-AUC)
  • cross-validation
  • hyperparameter tuning (grid search, random search)
  • bias-variance tradeoff

Unsupervised learning

  • clustering
    • k-means
    • DBSCAN
    • hierarchical clustering
  • dimensionality reduction
    • PCA
    • t-SNE

Neural networks and deep learning

  • basics of neural networks
  • activation functions
  • backpropagation and optimization
  • deep learning frameworks (TensorFlow, PyTorch)

Specialized models

  • natural language processing (word2vec, history of LLMs)
  • time series forecasting
  • anomaly detection
  • recommendation systems
  • LLMs

Deployment

  • serialization (pickle, joblib, ONNX)
  • building and deploying ML APIs
  • monitoring and updating models