Fed up with calculating dataset splits (e.g. train, validation, test, dev, silly, etc.) for multiple classes to make sure they're balanced? Me too.

I built a tool to help me:

https://sbrl.github.io/research-smflooding/dataset-split-calculator.html

Enter one integer per line (e.g. the number of samples in each class).
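For the curious, here's a rough sketch of the kind of arithmetic involved. It's not the tool's actual code: the function name `split_sizes`, the 70/15/15 weights, and the largest-remainder rounding are all assumptions picked for illustration.

```python
def split_sizes(total, weights):
    """Divide `total` items between splits in proportion to integer `weights`.

    Uses largest-remainder rounding (an assumed choice, not necessarily
    what the tool does) so the split sizes always add up to `total`.
    """
    weight_sum = sum(weights.values())
    sizes, remainders = {}, {}
    for name, weight in weights.items():
        # Integer arithmetic avoids floating-point rounding surprises
        sizes[name], remainders[name] = divmod(total * weight, weight_sum)
    leftover = total - sum(sizes.values())
    # Give the leftover items to the splits that were rounded down the most
    by_remainder = sorted(remainders, key=remainders.get, reverse=True)
    for name in by_remainder[:leftover]:
        sizes[name] += 1
    return sizes


# Hypothetical split weights and per-class sample counts
weights = {"train": 70, "validation": 15, "test": 15}
for class_size in [1000, 333, 57]:
    print(class_size, split_sizes(class_size, weights))
```

Running this per class keeps every class at roughly the same ratio in every split.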

It even spits out shell commands to cut line-based files (e.g. jsonl, csv, etc.) into separate files!
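The post doesn't show what those commands look like, so this is only a guess at the general idea: one `sed -n 'START,ENDp'` per split, each copying a consecutive range of lines. The function name `cut_commands` and the `data.train.jsonl` output naming are made up for illustration.

```python
import os
import shlex


def cut_commands(filename, sizes):
    """Build one sed command per split, each copying a consecutive line range.

    Assumes the file is already shuffled and has no header row
    (a CSV header would need separate handling).
    """
    root, ext = os.path.splitext(filename)
    commands = []
    start = 1
    for name, size in sizes.items():
        if size == 0:
            continue  # nothing to cut for an empty split
        end = start + size - 1
        out = f"{root}.{name}{ext}"  # hypothetical naming scheme
        commands.append(
            f"sed -n '{start},{end}p' {shlex.quote(filename)} > {shlex.quote(out)}"
        )
        start = end + 1
    return commands


for command in cut_commands("data.jsonl", {"train": 700, "validation": 150, "test": 150}):
    print(command)
```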

May write a proper blog post soon!

#AI #DataScience #BigData #Automation #JSONL #CSV #Bash / #Shell #Scripts #AreAwesome

Dataset split size calculator

A tool to calculate dataset split sizes based on input dataset sizes and split ratios
