Climate news DB: a dataset for NLP and climate change media researchers, made up of a number of data artifacts (JSON, JSONL and CSV text files and an SQLite database; a loading sketch follows below). Available from the project's GitHub repository. [394] Creator: ADGEfficiency.
Climatext: a dataset for sentence-based climate change topic detection. Available as an HF dataset. [395] Creator: University of Zurich. ...
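The artifact formats listed for Climate news DB can be read with standard Python tooling. The sketch below is illustrative only: the file names (articles.jsonl, articles.csv, climate_news.db) and the articles table are hypothetical placeholders, not paths taken from the actual repository.

```python
import sqlite3

import pandas as pd

# Hypothetical artifact names, used only to illustrate the formats.
JSONL_PATH = "articles.jsonl"
CSV_PATH = "articles.csv"
SQLITE_PATH = "climate_news.db"

# JSONL: one JSON object per line.
jsonl_df = pd.read_json(JSONL_PATH, lines=True)

# CSV: plain tabular text.
csv_df = pd.read_csv(CSV_PATH)

# SQLite: query an assumed "articles" table.
with sqlite3.connect(SQLITE_PATH) as conn:
    sqlite_df = pd.read_sql_query("SELECT * FROM articles LIMIT 10", conn)

print(len(jsonl_df), len(csv_df), len(sqlite_df))
```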
GPT-2 was pre-trained on a dataset of 8 million web pages. [2] It was partially released in February 2019, followed by the full release of the 1.5-billion-parameter model on November 5, 2019. [3] [4] [5] GPT-2 was created as a "direct scale-up" of GPT-1, [6] with a ten-fold increase in both its parameter count and the size of its training dataset. [5]
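As a usage note (not part of the release history above), the released GPT-2 checkpoints can be loaded through the Hugging Face transformers library. The sketch below assumes transformers and a backend such as PyTorch are installed; it uses the smallest public checkpoint, with "gpt2-xl" being the 1.5-billion-parameter model.

```python
from transformers import pipeline

# "gpt2" is the smallest released checkpoint; swap in "gpt2-xl"
# for the 1.5-billion-parameter model released in November 2019.
generator = pipeline("text-generation", model="gpt2")

out = generator("GPT-2 was pre-trained on a dataset of", max_new_tokens=30)
print(out[0]["generated_text"])
```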
The company received a $2 billion valuation. In February 2023, the company announced a partnership with Amazon Web Services (AWS) that would give AWS customers access to Hugging Face's products. The company also said the next generation of BLOOM would be run on Trainium, a proprietary machine-learning chip created by AWS.
80 Million Tiny Images: 80 million 32×32 images labelled with 75,062 non-abstract nouns. Instances: 80,000,000. Format: image, label. Created: 2008. [3] Creator: Torralba et al.
JFT-300M: dataset internal to Google Research; 300M images with 375M labels in 18,291 categories. Instances: 300,000,000. Format: image, label. Created: 2017. [4] Creator: Google Research.
BigScience Large Open-science Open-access Multilingual Language Model (BLOOM) [1] [2] is a 176-billion-parameter transformer-based autoregressive large language model (LLM). The model, as well as the code base and the data used to train it, are distributed under free licences. [3]
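To make "autoregressive" concrete, the sketch below generates text one token at a time with greedy decoding. It uses the small bigscience/bloom-560m checkpoint rather than the 176-billion-parameter model (which does not fit on ordinary hardware) and assumes transformers and PyTorch are installed; it illustrates autoregressive decoding in general, not how BLOOM was trained or served.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Small BLOOM variant; the same loop applies to larger checkpoints.
name = "bigscience/bloom-560m"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name)
model.eval()

# Autoregressive decoding: each new token is predicted from all the
# tokens generated so far, then appended to the context.
input_ids = tokenizer("BLOOM is a multilingual", return_tensors="pt").input_ids
with torch.no_grad():
    for _ in range(20):
        logits = model(input_ids).logits        # (1, seq_len, vocab_size)
        next_id = logits[0, -1].argmax()        # greedy choice
        input_ids = torch.cat([input_ids, next_id.view(1, 1)], dim=-1)

print(tokenizer.decode(input_ids[0]))
```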
The Pile is an 886.03 GB diverse, open-source dataset of English text created as a training dataset for large language models (LLMs). It was constructed by EleutherAI in 2020 and publicly released on December 31 of that year. [1] [2] It is composed of 22 smaller datasets, including 14 new ones. [1]
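Because of its size, The Pile is usually read in streaming fashion rather than downloaded whole. The sketch below assumes a copy is reachable on the Hugging Face Hub under the EleutherAI/pile identifier and that records carry the "text" and "meta" fields of the original release; hosting of the dataset has changed over time, so the identifier may need adjusting.

```python
from datasets import load_dataset

# Streaming avoids downloading the full ~886 GB corpus up front.
# The dataset identifier below is an assumption; mirrors of The Pile
# have moved or been withdrawn over time.
pile = load_dataset("EleutherAI/pile", split="train", streaming=True)

for i, record in enumerate(pile):
    # In the original release each record holds the raw text plus
    # metadata naming which of the 22 component datasets it came from.
    print(record["meta"], record["text"][:80])
    if i >= 2:
        break
```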
The MMLU was released by Dan Hendrycks and a team of researchers in 2020 [3] and was designed to be more challenging than then-existing benchmarks such as General Language Understanding Evaluation (GLUE), on which new language models were achieving better-than-human accuracy.
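For a sense of what the benchmark items look like, the sketch below loads a few MMLU questions from the copy hosted on the Hugging Face Hub and formats them as four-way multiple-choice prompts; the cais/mmlu identifier, the abstract_algebra subject name, and the question/choices/answer field names follow that hosted copy and are assumptions, not part of the original paper.

```python
from datasets import load_dataset

# "abstract_algebra" is one of the 57 MMLU subjects.
mmlu = load_dataset("cais/mmlu", "abstract_algebra", split="test")

letters = ["A", "B", "C", "D"]
for item in list(mmlu)[:3]:
    prompt = item["question"] + "\n"
    for letter, choice in zip(letters, item["choices"]):
        prompt += f"{letter}. {choice}\n"
    prompt += "Answer:"
    print(prompt, letters[item["answer"]], "\n")
```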
Mixtral 8x7B: License: Apache 2.0. Notes: outperforms GPT-3.5 and Llama 2 70B on many benchmarks; [82] mixture-of-experts model, with 12.9 billion parameters activated per token [83] (a toy routing sketch follows below).
Mixtral 8x22B: Release date: April 2024. Developer: Mistral AI. Parameters: 141 billion. Corpus size: unknown. Training cost: unknown. License: Apache 2.0. [84]
DeepSeek LLM: Release date: November 29, 2023. Developer: DeepSeek. Parameters: 67 billion. Corpus size: 2T tokens [85] (table 2). Training cost: 12,000 petaFLOP-days. License: DeepSeek License.
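The "12.9 billion parameters activated per token" figure reflects sparse mixture-of-experts routing: each token is sent to only a few of the experts, so the remaining experts' parameters do not take part in that token's forward pass. The toy sketch below implements top-2 routing over 8 small feed-forward experts to illustrate the mechanism; the dimensions are made up and are not Mixtral's actual architecture.

```python
import torch
import torch.nn as nn

class Top2MoE(nn.Module):
    """Toy sparse mixture-of-experts layer with top-2 routing."""

    def __init__(self, d_model=64, d_ff=128, n_experts=8, k=2):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                        # x: (tokens, d_model)
        scores = self.router(x)                  # (tokens, n_experts)
        weights, idx = scores.topk(self.k, dim=-1)
        weights = weights.softmax(dim=-1)
        out = torch.zeros_like(x)
        # Only the k selected experts run for each token; all other
        # experts' parameters stay inactive for that token.
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

tokens = torch.randn(5, 64)
print(Top2MoE()(tokens).shape)  # torch.Size([5, 64])
```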