https://generativeai.pub/how-to-run-70b-llms-on-a-single-4gb-gpu-d1c61ed5258c
from airllm import AutoModel
MAX_LENGTH = 128
# load the model from the Hugging Face hub
model = AutoModel.from_pretrained("garage-bAInd/Platypus2-70B-instruct")
# or load the model from a local path
# model = AutoModel.from_pretrained("/home/ubuntu/.cache/huggingface/hub/models--garage-bAInd--Platypus2-70B-instruct/snapshots/b585e74bcaae02e52665d9ac6d23f4d0dbc81a0f")
# prepare the input text
input_text = [
    'What is the capital of United States?',
]
# tokenize the input text
input_tokens = model.tokenizer(
    input_text,
    return_tensors="pt",
    return_attention_mask=False,
    truncation=True,
    max_length=MAX_LENGTH,
    padding=False,
)
# generate the output text
generation_output = model.generate(
    input_tokens['input_ids'].cuda(),
    max_new_tokens=20,
    use_cache=True,
    return_dict_in_generate=True,
)
# decode the output text
output = model.tokenizer.decode(generation_output.sequences[0])
# print the output text
print(output)
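
With decoder-only models, the sequence returned by `generate` typically begins with the prompt tokens, so the decoded string repeats the question before the answer. The following is a minimal sketch of keeping only the newly generated part. `new_tokens_only` is a hypothetical helper (not part of airllm), and the token ids are made-up stand-ins so the snippet runs without loading a model.

```python
def new_tokens_only(sequence, prompt_length):
    """Return the part of a generated sequence that follows the prompt."""
    if prompt_length < 0 or prompt_length > len(sequence):
        raise ValueError("prompt_length must be between 0 and len(sequence)")
    return sequence[prompt_length:]


# Stand-in token ids for illustration only; real ids come from the tokenizer.
prompt_ids = [1, 1724, 338]
full_sequence = prompt_ids + [450, 7483, 2]

completion_ids = new_tokens_only(full_sequence, len(prompt_ids))
print(completion_ids)
```

Assuming the script above, the same slice could be applied to `generation_output.sequences[0]`, with `input_tokens['input_ids'].shape[1]` as the prompt length, before passing the result to `model.tokenizer.decode`.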