LLM evaluation test for gpt2

[download gpt2 to local model path]

openai-community/gpt2 at main (huggingface.co)
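
One way to fill the local model path is the `huggingface_hub` package, which is installed alongside `transformers`. The sketch below assumes the target directory is the `D:/gpt2` path used in app.py and that the safetensors weights plus the tokenizer files are enough for loading. The file list and the helper names `download_gpt2` and `missing_files` are illustrative assumptions, not an official recipe.

```python
from pathlib import Path

# Files assumed sufficient for GPT2LMHeadModel / GPT2Tokenizer (an assumption,
# not an official list). model.safetensors is chosen over the pickle-based
# pytorch_model.bin that the repository also ships.
WANTED_FILES = [
    "config.json",
    "generation_config.json",
    "vocab.json",
    "merges.txt",
    "tokenizer_config.json",
    "model.safetensors",
]


def download_gpt2(local_dir="D:/gpt2"):
    """Download the selected files of openai-community/gpt2 into local_dir."""
    # Imported lazily so missing_files() works without huggingface_hub.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id="openai-community/gpt2",
        local_dir=local_dir,
        allow_patterns=WANTED_FILES,
    )


def missing_files(local_dir):
    """Return the wanted files that are not present in local_dir."""
    root = Path(local_dir)
    return [name for name in WANTED_FILES if not (root / name).is_file()]
```

Calling `download_gpt2()` once and then checking that `missing_files("D:/gpt2")` returns an empty list is a quick way to confirm the folder is complete before running app.py.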


 

 

[app.py]

from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Load the model and tokenizer from the local path
local_model_path = 'D:/gpt2'
model = GPT2LMHeadModel.from_pretrained(local_model_path)
tokenizer = GPT2Tokenizer.from_pretrained(local_model_path)

# Run the model on the CPU
device = "cpu"
model.to(device)

# Input texts for the evaluation
input_texts = [
    "What is jiu-jitsu?",
    "Explain how jiu-jitsu became known in Brazil"
]

# Generate one model response per input text
def generate_responses(input_texts):
    responses = []
    for text in input_texts:
        inputs = tokenizer.encode(text, return_tensors='pt').to(device)
        outputs = model.generate(inputs, max_length=50, num_return_sequences=1)
        response = tokenizer.decode(outputs[0], skip_special_tokens=True)
        responses.append(response)
    return responses

# Generate the responses
responses = generate_responses(input_texts)

# Print the results
for i, response in enumerate(responses):
    print(f"Input: {input_texts[i]}")
    print(f"Response: {response}\n")
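
As written, app.py passes only the input ids to `generate()`, so transformers warns that the attention mask and pad token id were not set. In addition, `max_length=50` counts the prompt tokens as well as the generated ones. A minimal sketch of a variant that addresses both is shown here. The name `generate_response` is hypothetical, and `max_new_tokens` is swapped in for `max_length`, so outputs will be longer than those of the original script.

```python
def generate_response(model, tokenizer, text, max_new_tokens=50, device="cpu"):
    """Generate one continuation, passing the attention mask explicitly."""
    # Calling the tokenizer (instead of tokenizer.encode) returns both
    # input_ids and attention_mask.
    encoded = tokenizer(text, return_tensors="pt")
    input_ids = encoded["input_ids"].to(device)
    attention_mask = encoded["attention_mask"].to(device)

    outputs = model.generate(
        input_ids,
        attention_mask=attention_mask,
        # GPT-2 defines no pad token, so the EOS id is reused here; this is
        # the same fallback that transformers applies on its own.
        pad_token_id=tokenizer.eos_token_id,
        # Budget for generated tokens only; max_length would include the prompt.
        max_new_tokens=max_new_tokens,
    )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

Inside the loop of `generate_responses`, the three lines that encode, generate, and decode could then be replaced by `responses.append(generate_response(model, tokenizer, text))`.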

 

[operation]

pip install transformers torch

python app.py

 

 

[result]

C:\Python311\Lib\site-packages\transformers\tokenization_utils_base.py:1601: FutureWarning: `clean_up_tokenization_spaces` was not set. It will be set to `True` by default. This behavior will be depracted in transformers v4.45, and will be then set to `False` by default. For more details check this issue: https://github.com/huggingface/transformers/issues/31884
  warnings.warn(
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Input: What is jiu-jitsu?
Response: What is jiu-jitsu?

Jiu-jitsu is a martial art that is based on the principles of jiu-jitsu. It is a martial art that is based on the principles of jiu-jitsu.

Input: Explain how jiu-jitsu became known in Brazil
Response: Explain how jiu-jitsu became known in Brazil.

The Brazilian Jiu-Jitsu Association (BJJA) is a Brazilian organization that promotes the use of jiu-jitsu in Brazilian Jiu-Jitsu. The B
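
The first response repeats the same clause twice, which is common when `generate()` runs with its default greedy decoding. As a rough evaluation signal, the sketch below computes a distinct-n ratio: the number of unique word n-grams divided by the total number of n-grams. The helper `distinct_n` is hypothetical, whitespace splitting is a simplification of real tokenization, and the `varied` sentence is a made-up comparison text rather than model output.

```python
def distinct_n(text, n=2):
    """Fraction of word n-grams in text that are unique (1.0 = no repetition)."""
    words = text.lower().split()
    ngrams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    if not ngrams:
        return 0.0
    return len(set(ngrams)) / len(ngrams)


# Continuation copied from the first response above (prompt line omitted).
repetitive = (
    "Jiu-jitsu is a martial art that is based on the principles of jiu-jitsu. "
    "It is a martial art that is based on the principles of jiu-jitsu."
)
# Made-up sentence without repeated bigrams, for comparison only.
varied = "The quick brown fox jumps over the lazy dog near the quiet river bank."

print(f"repetitive: {distinct_n(repetitive):.2f}")
print(f"varied:     {distinct_n(varied):.2f}")
```

A lower ratio indicates more repetition. Options of `generate()` such as `no_repeat_ngram_size`, `repetition_penalty`, or `do_sample=True` with `top_k` / `top_p` are the usual knobs for reducing it, though their effect on these two prompts was not measured here.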

 
