This guide walks through implementing LoRA (Low-Rank Adaptation) with the popular PEFT library to fine-tune a Hugging Face model. The approach is parameter-efficient and works well with large language models.
Step-by-Step Implementation of LoRA Using PEFT
1. Install Required Libraries
Make sure you have the required libraries installed:
pip install transformers peft datasets accelerate
2. Import Necessary Libraries
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model, PeftModel
from datasets import load_dataset
3. Load a Pre-trained Model and Tokenizer
For this example, we’ll use a pre-trained GPT-2 model.
# Load GPT-2 model and tokenizer
model_name = "gpt2"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
# Ensure the tokenizer uses padding
tokenizer.pad_token = tokenizer.eos_token
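Optionally, you can record the base model's size before adding adapters; GPT-2 (small) has roughly 124 million parameters, which gives a useful baseline for the LoRA comparison later:
# Baseline: count the base model's parameters (~124M for GPT-2 small)
print(f"Base parameters: {sum(p.numel() for p in model.parameters()):,}")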
4. Configure LoRA
Define the LoRA parameters. Here, we configure rank, alpha, and the specific layers to apply LoRA.
# Define LoRA configuration
lora_config = LoraConfig(
    task_type="CAUSAL_LM",      # Task type (e.g., causal language modeling)
    inference_mode=False,       # Set to True for inference-only use
    r=8,                        # Low-rank dimension
    lora_alpha=32,              # Scaling factor
    lora_dropout=0.1,           # Dropout for LoRA layers
    target_modules=["c_attn"],  # GPT-2's attention projection layers
)
# Apply LoRA to the model
peft_model = get_peft_model(model, lora_config)
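Under the hood, LoRA leaves each frozen weight matrix W untouched and learns a low-rank update ΔW = (α/r)·B·A. PEFT's built-in helper shows how few parameters this leaves trainable:
# Show trainable vs. total parameter counts (a built-in PEFT helper)
peft_model.print_trainable_parameters()
# With r=8 on GPT-2's c_attn layers this should report on the order of
# ~0.3M trainable parameters out of ~124M total (well under 1%)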
5. Prepare Dataset
Load and preprocess a dataset for fine-tuning. Let’s use the Hugging Face “wikitext” dataset.
# Load a sample dataset
dataset = load_dataset("wikitext", "wikitext-2-raw-v1", split="train")
# Tokenize the dataset
def tokenize_function(examples):
    return tokenizer(examples["text"], truncation=True, padding="max_length", max_length=512)

# Drop the raw text column so the collator only sees token tensors
tokenized_dataset = dataset.map(tokenize_function, batched=True, remove_columns=["text"])
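A quick check that the tokenization behaved as expected; each example should now carry input_ids and an attention_mask of length 512:
# Sanity-check one tokenized example
sample = tokenized_dataset[0]
print(list(sample.keys()))        # expect ['input_ids', 'attention_mask']
print(len(sample["input_ids"]))   # 512, from padding="max_length"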
6. Fine-Tune the Model
Fine-tune the model with the Hugging Face Trainer; only the small set of LoRA adapter parameters is updated.
from transformers import TrainingArguments, Trainer, DataCollatorForLanguageModeling
# Define training arguments
training_args = TrainingArguments(
    output_dir="./lora-fine-tuned",  # Output directory
    per_device_train_batch_size=8,
    num_train_epochs=3,
    logging_dir="./logs",
    logging_steps=10,
    save_steps=500,
    save_total_limit=2,
    learning_rate=5e-4,  # LoRA typically tolerates a higher learning rate than full fine-tuning
)
# Use a data collator that copies input_ids into labels for causal LM loss
data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

# Define a Trainer
trainer = Trainer(
    model=peft_model,
    args=training_args,
    train_dataset=tokenized_dataset,
    data_collator=data_collator,
)
# Fine-tune the model
trainer.train()
7. Save and Use the Fine-Tuned Model
After fine-tuning, save the LoRA adapter weights so they can be reloaded for inference.
# Save the LoRA adapter weights
peft_model.save_pretrained("./lora-fine-tuned")

# Load the adapter on top of a fresh base model for inference
base_model = AutoModelForCausalLM.from_pretrained(model_name)
fine_tuned_model = PeftModel.from_pretrained(base_model, "./lora-fine-tuned")
# Generate text with the fine-tuned model
input_text = "Once upon a time"
inputs = tokenizer(input_text, return_tensors="pt")
outputs = fine_tuned_model.generate(**inputs, max_length=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
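If you'd rather deploy a single standalone checkpoint, PEFT can fold the adapters back into the base weights via merge_and_unload. This is a minimal sketch; the output path ./lora-merged is a placeholder:
# Merge the LoRA adapters into the base weights and save a standalone model
merged_model = fine_tuned_model.merge_and_unload()
merged_model.save_pretrained("./lora-merged")  # placeholder path
tokenizer.save_pretrained("./lora-merged")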
Key Notes
• Trainable Parameters: LoRA cuts the number of trainable parameters dramatically by updating only the low-rank adapters while the rest of the model stays frozen.
• Memory Efficiency: This makes it practical to fine-tune large models on modest hardware.
• Dataset: You can swap in your own text data by preprocessing it the same way; see the sketch below.
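As a sketch of the last point, the datasets library can read a plain text file directly (my_corpus.txt below is a hypothetical placeholder):
# Load your own text file and reuse the same tokenization step
custom_dataset = load_dataset("text", data_files={"train": "my_corpus.txt"}, split="train")
custom_tokenized = custom_dataset.map(tokenize_function, batched=True, remove_columns=["text"])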