- My problem with using a Butina split for CV is that a dataset rarely consists of n distinct clusters. More often it has some k real clusters plus many miscellaneous molecules that fit none of them. As a result, some test folds are genuinely distinct from the training data, while others are effectively the same as a random CV split. We should always check whether the Butina split actually worked. (Nov 18, 2024 14:54)
- What do you mean by "We should always check if the Butina split actually worked"? Do you have suggestions on better splitting approaches?
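
One possible reading of "check if the Butina split actually worked" is to look at how many molecules end up in single-member clusters, and how unevenly those singletons are spread across folds. The sketch below is only an illustration under that assumption, not the commenter's method. It takes clusters as tuples of molecule indices (the shape RDKit's `Butina.ClusterData` returns, though no RDKit call is made here), and the helper names `singleton_fraction`, `assign_folds`, and `fold_singleton_share` are hypothetical. The greedy fold assignment is likewise an assumed scheme, chosen for simplicity.

```python
def singleton_fraction(clusters):
    """Fraction of all molecules that sit in clusters of size 1."""
    n_total = sum(len(c) for c in clusters)
    n_single = sum(1 for c in clusters if len(c) == 1)
    return n_single / n_total if n_total else 0.0


def assign_folds(clusters, n_folds):
    """Assumed scheme: place clusters, largest first, into the fold
    that currently holds the fewest molecules."""
    folds = [[] for _ in range(n_folds)]
    sizes = [0] * n_folds
    for cluster in sorted(clusters, key=len, reverse=True):
        i = sizes.index(min(sizes))  # first fold with the fewest molecules
        folds[i].append(tuple(cluster))
        sizes[i] += len(cluster)
    return folds


def fold_singleton_share(folds):
    """For each fold, the share of its molecules that come from singleton clusters."""
    shares = []
    for fold in folds:
        n = sum(len(c) for c in fold)
        s = sum(1 for c in fold if len(c) == 1)
        shares.append(s / n if n else 0.0)
    return shares


# Toy input: two real clusters and five molecules that clustered with nothing.
clusters = [(0, 1, 2, 3), (4, 5, 6), (7,), (8,), (9,), (10,), (11,)]
folds = assign_folds(clusters, n_folds=3)
print(singleton_fraction(clusters))
print(fold_singleton_share(folds))
```

If the per-fold shares differ widely, the folds are not testing the same thing, and per-fold scores may be worth reporting separately rather than averaged.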