Less Training Data
2026-02-11
This is part of a series on ML for generalists; you can find the start here.
In practice, we rarely get to generate as much labelled data as we want. Let's look at what happens when we have 200 training samples instead of 800:
python generate.py --count 200 --output data-train
We'll keep our model exactly the same, and run our train.py again. Here's the full output over 20 epochs:
[Epoch 1/20] after 61.3 seconds:
loss: 0.4477
within 5 degrees: 20.50%
within 10 degrees: 34.50%
within 20 degrees: 59.00%
[Epoch 2/20] after 72.5 seconds:
loss: 0.1192
within 5 degrees: 16.00%
within 10 degrees: 33.00%
within 20 degrees: 68.00%
[Epoch 3/20] after 89.1 seconds:
loss: 0.0708
within 5 degrees: 41.00%
within 10 degrees: 77.00%
within 20 degrees: 94.50%
[Epoch 4/20] after 82.2 seconds:
loss: 0.0448
within 5 degrees: 42.50%
within 10 degrees: 76.50%
within 20 degrees: 94.50%
[Epoch 5/20] after 78.9 seconds:
loss: 0.0374
within 5 degrees: 43.50%
within 10 degrees: 73.00%
within 20 degrees: 93.00%
[Epoch 6/20] after 65.9 seconds:
loss: 0.0325
within 5 degrees: 42.00%
within 10 degrees: 75.50%
within 20 degrees: 96.00%
[Epoch 7/20] after 76.0 seconds:
loss: 0.0242
within 5 degrees: 37.50%
within 10 degrees: 67.50%
within 20 degrees: 93.00%
[Epoch 8/20] after 77.5 seconds:
loss: 0.0222
within 5 degrees: 43.50%
within 10 degrees: 67.00%
within 20 degrees: 92.00%
[Epoch 9/20] after 78.7 seconds:
loss: 0.0224
within 5 degrees: 48.00%
within 10 degrees: 74.00%
within 20 degrees: 94.50%
[Epoch 10/20] after 75.5 seconds:
loss: 0.0165
within 5 degrees: 47.50%
within 10 degrees: 80.00%
within 20 degrees: 96.00%
[Epoch 11/20] after 69.8 seconds:
loss: 0.0142
within 5 degrees: 50.50%
within 10 degrees: 78.50%
within 20 degrees: 96.50%
[Epoch 12/20] after 61.2 seconds:
loss: 0.0113
within 5 degrees: 50.00%
within 10 degrees: 82.00%
within 20 degrees: 98.00%
[Epoch 13/20] after 57.6 seconds:
loss: 0.0118
within 5 degrees: 49.50%
within 10 degrees: 83.00%
within 20 degrees: 94.50%
[Epoch 14/20] after 60.8 seconds:
loss: 0.0085
within 5 degrees: 49.50%
within 10 degrees: 83.50%
within 20 degrees: 97.00%
[Epoch 15/20] after 65.5 seconds:
loss: 0.0069
within 5 degrees: 54.00%
within 10 degrees: 82.50%
within 20 degrees: 95.50%
[Epoch 16/20] after 66.8 seconds:
loss: 0.0060
within 5 degrees: 50.00%
within 10 degrees: 82.00%
within 20 degrees: 98.00%
[Epoch 17/20] after 67.9 seconds:
loss: 0.0055
within 5 degrees: 51.00%
within 10 degrees: 83.00%
within 20 degrees: 96.50%
[Epoch 18/20] after 61.5 seconds:
loss: 0.0047
within 5 degrees: 53.50%
within 10 degrees: 83.00%
within 20 degrees: 98.00%
[Epoch 19/20] after 61.8 seconds:
loss: 0.0042
within 5 degrees: 52.00%
within 10 degrees: 81.00%
within 20 degrees: 96.50%
[Epoch 20/20] after 62.9 seconds:
loss: 0.0039
within 5 degrees: 52.00%
within 10 degrees: 82.50%
within 20 degrees: 98.50%
Loss keeps dropping, but our accuracy plateaued around epoch 10! What's happening?
Train vs Test
Our training output shows average loss and accuracy:
[Epoch 8/20] after 77.5 seconds:
loss: 0.0222
within 5 degrees: 43.50%
within 10 degrees: 67.00%
within 20 degrees: 92.00%
Here's how we calculate our loss:
running_loss = 0.0  # reset at the start of each epoch
for images, labels in train_loader:
    optimizer.zero_grad()              # clear gradients from the previous batch
    outputs = model(images)            # forward pass
    loss = criterion(outputs, labels)
    loss.backward()                    # backpropagate
    optimizer.step()                   # update the weights
    running_loss += loss.item()
avg_loss = running_loss / len(train_loader)
Here's how we calculate accuracy:
within_5 = within_10 = within_20 = total = 0
for images, labels in test_loader:
    outputs = model(images)
    # convert the model's (sin, cos) outputs and labels back into angles in degrees
    predicted_angles = sincos_to_angles(outputs[:, 0], outputs[:, 1])
    true_angles = sincos_to_angles(labels[:, 0], labels[:, 1])
    # take the shortest way around the circle, so 359° vs 1° counts as a 2° error
    angle_diff = torch.abs(predicted_angles - true_angles)
    angle_diff = torch.min(angle_diff, 360 - angle_diff)
    within_5 += (angle_diff < 5).sum().item()
    within_10 += (angle_diff < 10).sum().item()
    within_20 += (angle_diff < 20).sum().item()
    total += labels.size(0)
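We haven't shown sincos_to_angles here. As a rough sketch of what such a helper might look like (my reconstruction, not necessarily the exact implementation from earlier in the series), it maps (sin, cos) pairs back to degrees, and the min(diff, 360 - diff) line is why a prediction of 359° for a true angle of 1° counts as being 2° off rather than 358°:

import torch

def sincos_to_angles(sin_vals, cos_vals):
    # Recover angles in degrees, in [0, 360), from (sin, cos) pairs.
    return torch.rad2deg(torch.atan2(sin_vals, cos_vals)) % 360

# Why the wraparound matters:
pred = torch.tensor([359.0])
true = torch.tensor([1.0])
diff = torch.abs(pred - true)
print(torch.min(diff, 360 - diff))  # tensor([2.])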
Notice that we calculate loss on the training data, because that's what the optimiser needs during training.
We calculate accuracy on the test data, because we want to keep it separate to validate our training. If we used our training data to measure accuracy too, we wouldn't know how the model performs on data it hadn't seen before.
This might be the clue. Our model might be memorising our training data and not learning general patterns to solve our problem.
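A quick way to test that hypothesis would be to run the exact same accuracy calculation over the training set and compare it with the test numbers. Here's a minimal sketch, reusing the model, loaders, and sincos_to_angles helper from above (accuracy_within is a hypothetical helper I've added for illustration, not part of train.py):

import torch

def accuracy_within(model, loader, thresholds=(5, 10, 20)):
    # Same metric as the accuracy loop above, but usable on any loader.
    counts = {t: 0 for t in thresholds}
    total = 0
    model.eval()
    with torch.no_grad():
        for images, labels in loader:
            outputs = model(images)
            predicted_angles = sincos_to_angles(outputs[:, 0], outputs[:, 1])
            true_angles = sincos_to_angles(labels[:, 0], labels[:, 1])
            angle_diff = torch.abs(predicted_angles - true_angles)
            angle_diff = torch.min(angle_diff, 360 - angle_diff)
            for t in thresholds:
                counts[t] += (angle_diff < t).sum().item()
            total += labels.size(0)
    return {t: 100.0 * counts[t] / total for t in thresholds}

print("train:", accuracy_within(model, train_loader))
print("test: ", accuracy_within(model, test_loader))

If the training numbers come back far higher than the test numbers, the model is fitting the data it has seen much better than data it hasn't.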
This is called overfitting.