MosaicML

MPT-30B-Chat

MPT-30B-Chat is a chatbot-like model for dialogue generation. It was built by finetuning MPT-30B on the following datasets:
  1. ShareGPT-Vicuna
  2. Camel-AI
  3. GPTeacher
  4. Guanaco
  5. Baize
  6. some generated datasets

License: Non-Commercial Use Only

Demo on Hugging Face Spaces

This model was trained by MosaicML and follows a modified decoder-only transformer architecture.
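
As a quick way to inspect these modifications, you can load the model's configuration (a minimal sketch; the attn_config and max_seq_len fields are the same ones used in the examples later on this page):

      import transformers

      # Load the custom MPT config from the Hub; trust_remote_code pulls in
      # MosaicML's model code
      config = transformers.AutoConfig.from_pretrained(
          'mosaicml/mpt-30b-chat',
          trust_remote_code=True
      )

      # Attention-related modifications (e.g. the attention implementation)
      # live under attn_config; max_seq_len reflects the 8k context window
      print(config.attn_config)
      print(config.max_seq_len)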

Model Date

June 22, 2023

Model License

Non-Commercial Use Only

Documentation

  1. Blog post: Raising the bar for open-source foundation models
  2. Codebase: mosaicml/llm-foundry repo
  3. For any questions, feel free to contact us via the MosaicML Community Slack

Example Dialogue

ksreenivasan:

"I want you to come up with a tweet based on this summary of the article

We are excited to extend the MosaicML Foundation Series with MPT-30B, a new commercially licensed open-source model that is significantly more powerful, and outperforms the original GPT-3. In addition, we are releasing two fine-tuned models, MPT-30B-Instruct and MPT-30B-Chat, that are built on top of MPT-30B and excel at short-form instruction following and multi-turn conversations, respectively. All MPT-30B models come with special features that differentiate them from other LLMs, including an 8k token context window (which can be further extended via finetuning; see MPT-7B-StoryWriter), support for context-length extrapolation via ALiBi, and efficient inference+training performance via FlashAttention. It also has strong coding abilities thanks to its pretraining data mixture."

MPT-30B-Chat:

""Revolutionize your AI Capabilities with MosaicML's latest addition, MPT-30B! Outperforming GPT-3, this opens-source model is significanly more powerful and comes with finely-tuned models for exceptional instruction following and multi-turn conversations. Unleash the potential of an 8K token context window, ALiBi support, and FlashAttention. Experience the future of AI with MosaicML! #AI #ML #innovation""

ksreenivasan:

"Is there anything you want to say to MosaicML about this exciting news?"

MPT-30B-Chat:

"As your AI marketing assistant, I would say that this is truly exciting news for the AI community! MosaicML's latest addition, MPT-30B, is poised to revolutionize the industry with its powerful opensource capabilities and finely-tuned models for instruction following and multi-turn conversations. The special features, including the 8k token conext window, ALiBi support, and FlashAttention, are sure to set MosaicML's MPT-30B apart from other LLMs. Congratulations to the MosiacML team on this incredible achievement!"

How to Use

This model is best used with the MosaicML llm-foundry repository for training and finetuning.


      import transformers

      # Load MPT-30B-Chat; trust_remote_code=True is required because the
      # model uses custom MosaicML model code hosted on the Hugging Face Hub
      model = transformers.AutoModelForCausalLM.from_pretrained(
          'mosaicml/mpt-30b-chat',
          trust_remote_code=True
      )
  

For a GPU-based implementation, the model can be loaded in bfloat16 precision with the triton attention implementation using the code below:


      import torch
      import transformers

      name = 'mosaicml/mpt-30b-chat'

      # Use the triton (FlashAttention-style) attention implementation and
      # initialize the model weights directly on the GPU
      config = transformers.AutoConfig.from_pretrained(
          name,
          trust_remote_code=True
      )
      config.attn_config['attn_impl'] = 'triton'
      config.init_device = 'cuda:0'

      model = transformers.AutoModelForCausalLM.from_pretrained(
          name,
          config=config,
          torch_dtype=torch.bfloat16,  # load in bfloat16 to reduce memory use
          trust_remote_code=True
      )
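
Note that the triton attention implementation assumes a CUDA-capable GPU with the triton package installed; if it is unavailable, leaving attn_impl at its default torch implementation should also work, just more slowly.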
      

To run at sequence lengths beyond the 8k tokens used during training (possible thanks to ALiBi), increase max_seq_len when loading the model:


      import transformers

      name = 'mosaicml/mpt-30b-chat'

      # ALiBi lets the model extrapolate past its training context length,
      # so max_seq_len can be raised above the default of 8192
      config = transformers.AutoConfig.from_pretrained(
          name,
          trust_remote_code=True
      )
      config.max_seq_len = 16384  # e.g. 16k tokens

      model = transformers.AutoModelForCausalLM.from_pretrained(
          name,
          config=config,
          trust_remote_code=True
      )
  

This model was trained with the MPT-30B tokenizer, which is based on the EleutherAI/gpt-neox-20b tokenizer and includes additional padding and eos tokens.


      from transformers import AutoTokenizer

      # The chat model uses the same tokenizer as the base MPT-30B model
      tokenizer = AutoTokenizer.from_pretrained('mosaicml/mpt-30b')
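
Since MPT-30B-Chat is tuned for multi-turn dialogue, prompts are usually wrapped in a chat template before tokenization. The sketch below assumes a ChatML-style format with <|im_start|>/<|im_end|> markers; the format_chat helper is hypothetical, and the exact template should be verified against the sample inputs on the model card:

      # Hypothetical helper: builds a ChatML-style prompt string. The exact
      # template is an assumption; confirm it against the model card.
      def format_chat(messages):
          prompt = ''
          for role, text in messages:
              prompt += f"<|im_start|>{role}\n{text}<|im_end|>\n"
          # Leave the assistant turn open so the model completes it
          prompt += "<|im_start|>assistant\n"
          return prompt

      prompt = format_chat([
          ('system', 'You are a helpful assistant.'),
          ('user', 'Suggest a name for a new open-source LLM.'),
      ])
      inputs = tokenizer(prompt, return_tensors='pt')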
	
	

The model can then be used, for example, within a text generation pipeline:


      import torch
      from transformers import pipeline

      # Generate directly with model.generate
      with torch.autocast('cuda', dtype=torch.bfloat16):
          inputs = tokenizer('Here is a recipe for vegan banana bread:\n',
                             return_tensors="pt").to('cuda')
          outputs = model.generate(**inputs, max_new_tokens=100)
          print(tokenizer.batch_decode(outputs, skip_special_tokens=True))

      # Or wrap the model and tokenizer in a text-generation pipeline
      pipe = pipeline('text-generation', model=model,
                      tokenizer=tokenizer, device='cuda:0')
      with torch.autocast('cuda', dtype=torch.bfloat16):
          print(
              pipe('Here is a recipe for vegan banana bread:\n',
                   max_new_tokens=100,
                   do_sample=True,
                   use_cache=True))
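
Additional generation parameters such as temperature, top_p, and repetition_penalty can be passed through the pipeline call in the same way as max_new_tokens.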