- A new release of 650,000 public-domain English books, sourced from OCR'd texts in the Internet Archive and Open Library. 400 GB. Biggest resource of its kind, but metadata may not be as rich as one would like? #CSSky huggingface.co/datasets/sto...
Mar 6, 2024 20:54