This post walks through implementing LoRA (Low-Rank Adaptation) with the popular PEFT library to fine-tune a Hugging Face model. The approach is parameter-efficient and works well with large language models.


Step-by-Step Implementation of LoRA Using PEFT

1. Install Required Libraries

Make sure you have the required libraries installed:

pip install transformers peft datasets accelerate

2. Import Necessary Libraries

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model, PeftModel
from datasets import load_dataset

3. Load a Pre-trained Model and Tokenizer

For this example, we’ll use a pre-trained GPT-2 model.

# Load GPT-2 model and tokenizer
model_name = "gpt2"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# GPT-2 ships without a pad token, so reuse the end-of-sequence token for padding
tokenizer.pad_token = tokenizer.eos_token

4. Configure LoRA

Define the LoRA parameters. Here we configure the rank, the alpha scaling factor, dropout, and the specific layers (target modules) LoRA is applied to.

# Define LoRA configuration
lora_config = LoraConfig(
    task_type="CAUSAL_LM",      # Task type (e.g., causal language modeling)
    inference_mode=False,       # Set to True for inference-only use
    r=8,                        # Low-rank dimension
    lora_alpha=32,              # Scaling factor
    lora_dropout=0.1,           # Dropout for LoRA layers
    target_modules=["c_attn"],  # GPT-2's fused QKV attention projection
)
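For intuition: LoRA freezes each targeted weight matrix W and learns a low-rank update, so the layer computes h = Wx + (lora_alpha / r) · BAx, where B and A are small matrices of rank r. With r=8 and lora_alpha=32, the adapter output is scaled by a factor of 4; increasing r gives the adapter more capacity at the cost of more trainable parameters.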

# Apply LoRA to the model
peft_model = get_peft_model(model, lora_config)
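To see the reduction in practice, PEFT models expose print_trainable_parameters(); calling it here is a quick sanity check that only the adapter weights will be updated:

# Show trainable vs. total parameters (for GPT-2 with r=8 this is well under 1%)
peft_model.print_trainable_parameters()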

5. Prepare Dataset

Load and preprocess a dataset for fine-tuning. Let’s use the Hugging Face “wikitext” dataset.

# Load a sample dataset
dataset = load_dataset("wikitext", "wikitext-2-raw-v1", split="train")

# Tokenize the dataset
def tokenize_function(examples):
    return tokenizer(examples["text"], truncation=True, padding="max_length", max_length=512)

tokenized_dataset = dataset.map(tokenize_function, batched=True)

6. Fine-Tune the Model

Fine-tune the model with the Hugging Face Trainer; with LoRA, only the small adapter weights are updated. A language-modeling data collator is added so the Trainer gets the labels the causal LM loss needs, and the evaluation arguments are dropped since no eval dataset is defined in this example.

from transformers import TrainingArguments, Trainer, DataCollatorForLanguageModeling

# Causal LM training needs labels; this collator copies input_ids into labels
data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

# Define training arguments
training_args = TrainingArguments(
    output_dir="./lora-fine-tuned",  # Output directory
    per_device_train_batch_size=8,
    num_train_epochs=3,
    logging_dir="./logs",
    logging_steps=10,
    save_steps=500,
    save_total_limit=2,
    learning_rate=5e-4,  # LoRA typically tolerates a higher learning rate than full fine-tuning
)

# Define a Trainer
trainer = Trainer(
    model=peft_model,
    args=training_args,
    train_dataset=tokenized_dataset,
    data_collator=data_collator,
)

# Fine-tune the model
trainer.train()

7. Save and Use the Fine-Tuned Model

After fine-tuning, save the model for inference. Note that save_pretrained on a PEFT model stores only the small LoRA adapter weights, not the full base model.

# Save the LoRA adapter weights
peft_model.save_pretrained("./lora-fine-tuned")

# Load the fine-tuned adapter on top of the base model for inference
# (in a fresh session, reload the base model with from_pretrained first)
fine_tuned_model = PeftModel.from_pretrained(model, "./lora-fine-tuned")

# Generate text with the fine-tuned model
input_text = "Once upon a time"
inputs = tokenizer(input_text, return_tensors="pt")
outputs = fine_tuned_model.generate(**inputs, max_length=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
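If you would rather ship a standalone model without a PEFT dependency, the adapter can be folded into the base weights with merge_and_unload(); a minimal sketch (the output directory below is just a placeholder):

# Merge the LoRA weights into the base model and drop the adapter wrappers
merged_model = fine_tuned_model.merge_and_unload()
merged_model.save_pretrained("./lora-merged")  # placeholder output directory
tokenizer.save_pretrained("./lora-merged")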

Key Notes

• Trainable Parameters: LoRA reduces trainable parameters significantly by only modifying the low-rank adapters, leaving most of the model frozen.

• Memory Efficiency: This approach is ideal for fine-tuning large models on smaller hardware.

• Dataset: You can swap in your own text data by preprocessing it the same way; a sketch follows below.
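As a rough sketch of that last point, the datasets library can load plain text files directly ("my_corpus.txt" below is a placeholder for your own file):

# Load your own plain-text corpus; one training example per line
custom_dataset = load_dataset("text", data_files={"train": "my_corpus.txt"}, split="train")
custom_tokenized = custom_dataset.map(tokenize_function, batched=True)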
