MosaicML releases the MPT-7B-8K models

xguru · 2023-07-21T10:02:02+09:00

(mosaicml.com)

A 7B-parameter open-source LLM that supports an 8k context length
Trained from MPT-7B on an additional 500B tokens of data, for 3 days on 256 NVIDIA H100 GPUs
Three models released: MPT-7B-8k, MPT-7B-8k-Instruct, MPT-7B-8k-Chat
Available for commercial use
8k input support through ALiBi (Attention with Linear Biases Enables Input Length Extrapolation)
Fast training and inference with FlashAttention and FasterTransformer
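
To illustrate how ALiBi can support long inputs, here is a minimal sketch of the bias it adds to attention scores. Instead of position embeddings, each head subtracts a penalty that grows linearly with the distance between query and key, so nothing in the scheme is tied to a fixed maximum length. The function names `alibi_slopes` and `alibi_bias` are hypothetical, the sketch assumes a power-of-two head count, and it is not taken from the MPT code, which may differ in detail.

```python
def alibi_slopes(num_heads):
    """Per-head slopes 2^(-8/n), 2^(-16/n), ..., 2^(-8) for n heads.

    Assumption: num_heads is a power of two, the simple case of the
    ALiBi slope schedule.
    """
    if num_heads <= 0 or num_heads & (num_heads - 1):
        raise ValueError("this sketch only handles power-of-two head counts")
    return [2.0 ** (-8.0 * (h + 1) / num_heads) for h in range(num_heads)]


def alibi_bias(num_heads, seq_len):
    """Return bias[h][i][j], to be added to the raw attention scores.

    For a query at position i and a key at position j <= i, the bias is
    slope_h * (j - i), i.e. zero on the diagonal and more negative the
    further back the key is. Future keys (j > i) are masked with -inf.
    """
    neg_inf = float("-inf")
    return [
        [
            [m * (j - i) if j <= i else neg_inf for j in range(seq_len)]
            for i in range(seq_len)
        ]
        for m in alibi_slopes(num_heads)
    ]


if __name__ == "__main__":
    bias = alibi_bias(num_heads=8, seq_len=4)
    # Last query row of the first head: penalties shrink toward the diagonal.
    print(bias[0][3])
```

Because the bias depends only on the distance `i - j`, the same function can be called with a larger `seq_len` at inference time without any new parameters.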
