In an ML model for ID'ing bird calls, you _don't_ want to slice the training data into small (e.g., 5-sec) clips. All bird recordings are polluted with background, non-target birds, so a short clip may not contain the labeled species at all. You need two models: the 1st takes a long window (e.g., 60 secs) and says "ROI for target species at seconds :31-:36", and the 2nd is your fine-tuned discriminator, run on that ROI. #ML #AudioClassification
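
A minimal sketch of that two-stage flow, assuming pure-Python sample lists rather than any particular audio library. Every name here (`two_stage_predict`, `crop`, `toy_locator`, `toy_discriminator`) is hypothetical, and the two "models" are energy-based stand-ins for trained networks, there only to make the wiring runnable.

```python
from typing import Callable, List, Optional, Tuple

Samples = List[float]
Roi = Tuple[float, float]  # (start_sec, end_sec) inside the long window


def crop(samples: Samples, sample_rate: int, roi: Roi) -> Samples:
    """Return the samples that fall inside roi (given in seconds)."""
    start = int(roi[0] * sample_rate)
    end = int(roi[1] * sample_rate)
    return samples[start:end]


def two_stage_predict(
    samples: Samples,
    sample_rate: int,
    locator: Callable[[Samples, int], Optional[Roi]],
    discriminator: Callable[[Samples, int], str],
) -> Optional[Tuple[Roi, str]]:
    """Stage 1 finds the ROI in the long window; stage 2 labels only that ROI."""
    roi = locator(samples, sample_rate)
    if roi is None:
        return None  # stage 1 found nothing worth classifying
    label = discriminator(crop(samples, sample_rate, roi), sample_rate)
    return roi, label


def toy_locator(samples: Samples, sample_rate: int, roi_secs: int = 5) -> Optional[Roi]:
    """ASSUMPTION: placeholder for model 1. Picks the loudest roi_secs stretch."""
    n_secs = len(samples) // sample_rate
    if n_secs < roi_secs:
        return None
    # Energy per whole second of audio.
    energy = [
        sum(x * x for x in samples[s * sample_rate:(s + 1) * sample_rate])
        for s in range(n_secs)
    ]
    starts = range(n_secs - roi_secs + 1)
    best = max(starts, key=lambda s: sum(energy[s:s + roi_secs]))
    if sum(energy[best:best + roi_secs]) == 0:
        return None  # silent window, no ROI
    return (float(best), float(best + roi_secs))


def toy_discriminator(clip: Samples, sample_rate: int) -> str:
    """ASSUMPTION: placeholder for model 2 (the fine-tuned discriminator)."""
    mean_abs = sum(abs(x) for x in clip) / max(len(clip), 1)
    return "target" if mean_abs > 0.5 else "background"


if __name__ == "__main__":
    sr = 10  # tiny sample rate so the demo stays fast
    audio = [0.0] * (60 * sr)           # 60-sec window
    for i in range(31 * sr, 36 * sr):   # synthetic "call" at :31-:36
        audio[i] = 1.0
    print(two_stage_predict(audio, sr, toy_locator, toy_discriminator))
    # prints ((31.0, 36.0), 'target')
```

The point of the split is that the discriminator only ever sees the cropped ROI, so in a real setup it could be trained on clips that actually contain the target species instead of arbitrary 5-sec slices.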