What is BigScience BLOOM?
Overview
BigScience BLOOM is an autoregressive Large Language Model (LLM), trained to continue text from a prompt on vast amounts of text data using industrial-scale computational resources. As such, it can output coherent text in 46 natural languages and 13 programming languages that is hard to distinguish from text written by humans. BLOOM can also be instructed to perform text tasks it has not been explicitly trained for, by casting them as text generation tasks.
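A minimal usage sketch, assuming the Hugging Face `transformers` library and the small public `bigscience/bloom-560m` checkpoint (the full 176B-parameter model requires multi-GPU hardware); the prompt is an arbitrary illustration:

```python
# Minimal autoregressive generation with a small public BLOOM checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom-560m")
model = AutoModelForCausalLM.from_pretrained("bigscience/bloom-560m")

# The model extends the prompt one token at a time (greedy decoding here).
inputs = tokenizer("The capital of France is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```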
Technical Specifications
- Model Architecture and Objective: Decoder-only Transformer architecture with layer normalization applied to the word embeddings layer, ALiBi positional encodings, and GELU activation functions (a sketch of the ALiBi bias follows this list).
- Compute Infrastructure: Trained on the Jean Zay public supercomputer, provided by the French government, using 384 A100 80GB GPUs, with an additional 32 A100 80GB GPUs in reserve.
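For illustration, here is a sketch of ALiBi's per-head linear bias, following the ALiBi paper's construction for power-of-two head counts; this is an independent reimplementation, not BLOOM's actual code:

```python
import torch

def alibi_slopes(n_heads: int) -> torch.Tensor:
    # Head-specific slopes form a geometric sequence: 2^(-8/n), 2^(-16/n),
    # ..., 2^(-8) (valid as written only for power-of-two head counts).
    start = 2 ** (-8 / n_heads)
    return torch.tensor([start ** (h + 1) for h in range(n_heads)])

def alibi_bias(n_heads: int, seq_len: int) -> torch.Tensor:
    # Bias added to attention scores instead of positional embeddings:
    # slope * (key_pos - query_pos), so more distant past keys are
    # penalized linearly. Result shape: (n_heads, seq_len, seq_len).
    distance = torch.arange(seq_len)[None, :] - torch.arange(seq_len)[:, None]
    return alibi_slopes(n_heads)[:, None, None] * distance[None, :, :]
```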
Training
- Training Data: 1.6TB of pre-processed text, converted into 350B unique tokens, including 46 natural languages and 13 programming languages.
- Training Speed and Size: Training throughput of about 150 TFLOPS per GPU, with a checkpoint size of 329GB for the bf16 weights and 2.3TB for a full checkpoint including optimizer states (a rough size check follows this list).
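A back-of-the-envelope check of those sizes, assuming BLOOM's roughly 176B parameters (a model-card figure) and a typical mixed-precision Adam optimizer layout; the per-parameter byte counts are assumptions, not a published breakdown:

```python
params = 176e9  # approximate parameter count from the model card

# 2 bytes per bf16 value.
bf16_weights = params * 2
# Mixed-precision Adam commonly stores fp32 master weights plus two fp32
# moment tensors alongside the bf16 copy: 2 + 4 + 4 + 4 bytes per parameter.
full_checkpoint = params * (2 + 4 + 4 + 4)

print(f"bf16 weights:    {bf16_weights / 2**30:.0f} GiB")    # ~328 GiB
print(f"full checkpoint: {full_checkpoint / 2**40:.2f} TiB")  # ~2.24 TiB
```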
Environmental Impact
- Estimated carbon emissions: Forthcoming.
- Estimated electricity usage: Forthcoming.
Uses
- Intended Use: Enable public research on large language models (LLMs) for language generation or as a pretrained base model that can be further fine-tuned for specific tasks.
- Direct Use: Text generation, exploring characteristics of language generated by a language model.
- Downstream Use: Tasks that leverage language models, such as Information Extraction, Question Answering, and Summarization (see the prompting sketch after this list).
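A hedged sketch of how such a task can be cast as text generation, using the `transformers` pipeline API with the small public checkpoint; the prompt format is an illustrative choice, not a prescribed one:

```python
from transformers import pipeline

generator = pipeline("text-generation", model="bigscience/bloom-560m")

# Question answering framed as plain continuation: the answer is whatever
# the model generates after "Answer:".
prompt = (
    "Context: BLOOM was trained on the Jean Zay supercomputer in France.\n"
    "Question: Where was BLOOM trained?\n"
    "Answer:"
)
result = generator(prompt, max_new_tokens=15, do_sample=False)
print(result[0]["generated_text"])
```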
Risks and Limitations
- The model may overrepresent some viewpoints and underrepresent others, reproduce stereotypes, contain personal information, generate hateful or discriminatory language, and make errors.
- Some outputs, such as sexual content, may not be appropriate for all settings.
Evaluation
- Metrics: Perplexity, cross-entropy loss, and task-specific metrics (the relation between the first two is sketched after this list).
- Factors: Language, domain, demographic characteristics.
- Results: Reported as zero-shot evaluations and train-time evaluations.
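Perplexity is the exponential of the mean cross-entropy loss, so the two headline metrics are two views of the same quantity. A minimal sketch, using the small public checkpoint as an illustrative stand-in for the card's evaluation setup:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom-560m")
model = AutoModelForCausalLM.from_pretrained("bigscience/bloom-560m")

text = "Language models assign probabilities to sequences of tokens."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    # Passing labels makes the model return the mean cross-entropy of its
    # next-token predictions over the sequence.
    loss = model(**inputs, labels=inputs["input_ids"]).loss

print(f"cross entropy: {loss.item():.3f}  perplexity: {torch.exp(loss).item():.1f}")
```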
Recommendations
- Indirect users should be made aware when the content they're working with is created by the LLM.
- Users should be aware of Risks and Limitations and include an appropriate age disclaimer or blocking interface as necessary.
- Models trained or fine-tuned downstream of BLOOM LM should include an updated Model Card.