When.com Web Search

Search results

  1. Results From The WOW.Com Content Network
  2. List of datasets for machine-learning research - Wikipedia

    en.wikipedia.org/wiki/List_of_datasets_for...

    OpenML: [493] Web platform with Python, R, Java, and other APIs for downloading hundreds of machine learning datasets, evaluating algorithms on datasets, and benchmarking algorithm performance against dozens of other algorithms. PMLB: [494] A large, curated repository of benchmark datasets for evaluating supervised machine learning algorithms ...

  3. Category:Datasets in machine learning - Wikipedia

    en.wikipedia.org/wiki/Category:Datasets_in...

    Pages in category "Datasets in machine learning" ...

  4. List of datasets in computer vision and image processing

    en.wikipedia.org/wiki/List_of_datasets_in...

    Wikipedia-based Image Text Dataset: 37.5 million image-text examples with 11.5 million unique images across 108 Wikipedia languages; 11,500,000; image, caption; pretraining, image captioning; 2021 [7]; Srinivasan et al., Google Research.
    Visual Genome: images and their description; 108,000; images, text; image captioning; 2016 [8]; R. Krishna et al.

  5. Template:Machine learning - Wikipedia

    en.wikipedia.org/wiki/Template:Machine_learning

    List of datasets for machine-learning research. List of datasets in computer vision and image processing ...

  6. COCO (dataset) - Wikipedia

    en.wikipedia.org/?title=COCO_(dataset)&redirect=no

    Redirect page. Redirect to: List of datasets for machine-learning research#COCO

  7. Category:Datasets - Wikipedia

    en.wikipedia.org/wiki/Category:Datasets

    Datasets in machine learning (1 C, 12 P) S. Statistical data sets (18 C, 32 P) Pages in category "Datasets"

  8. The Pile (dataset) - Wikipedia

    en.wikipedia.org/wiki/The_Pile_(dataset)

    The Pile is an 886.03 GB diverse, open-source dataset of English text created as a training dataset for large language models (LLMs). It was constructed by EleutherAI in 2020 and publicly released on December 31 of that year. [1] [2] It is composed of 22 smaller datasets, including 14 new ones. [1]

  9. BookCorpus - Wikipedia

    en.wikipedia.org/wiki/BookCorpus

    The dataset consists of around 985 million words, and the books that comprise it span a range of genres, including romance, science fiction, and fantasy. [3] The corpus was introduced in a 2015 paper by researchers from the University of Toronto and MIT titled "Aligning Books and Movies: Towards Story-like Visual Explanations by Watching ...