Print stats on the dataset before training #23

Open
opened 2025-10-14 15:51:22 -06:00 by navan · 0 comments

Originally created by @starride-teklia on 1/4/2023

It could be useful to have some stats on the dataset before training. Something like:

  • number of images (training/validation sets)
  • number of characters/words (training/validation sets)
  • character distribution (training set)
  • unknown characters
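
A minimal sketch of what such a report could compute, not PyLaia's actual API. It assumes each split is already loaded as a list of `(image_id, transcription)` pairs and that the known symbols are available as an iterable of characters. The names `split_stats`, `dataset_stats`, and `known_symbols` are hypothetical.

```python
from collections import Counter


def split_stats(samples):
    """Count images, characters, and words for one split.

    `samples` is a list of (image_id, transcription) pairs (assumed format).
    Character counts include spaces; words are whitespace-separated.
    """
    texts = [text for _, text in samples]
    return {
        "images": len(samples),
        "characters": sum(len(t) for t in texts),
        "words": sum(len(t.split()) for t in texts),
    }


def dataset_stats(train, val, known_symbols):
    """Summarise both splits before training starts."""
    # Character distribution is computed on the training set only.
    char_counts = Counter(ch for _, text in train for ch in text)
    # Unknown characters: seen in either split but absent from known_symbols.
    seen = set(char_counts) | {ch for _, text in val for ch in text}
    return {
        "train": split_stats(train),
        "val": split_stats(val),
        "char_distribution": char_counts.most_common(),
        "unknown_chars": sorted(seen - set(known_symbols)),
    }


if __name__ == "__main__":
    # Toy data for illustration only.
    train = [("img1", "hello world"), ("img2", "héllo")]
    val = [("img3", "old word")]
    stats = dataset_stats(train, val, known_symbols="helo wrd")
    for key, value in stats.items():
        print(f"{key}: {value}")
```

With the toy data above, `é` would be reported as the only unknown character, since it appears in the training transcriptions but not in `known_symbols`.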
Reference: github/PyLaia#23