Special Annexure 1: PyTorch Interview Questions and Answers (Basic to Advanced)
This annexure compiles curated technical interview questions frequently asked in academic, industrial, and research roles involving PyTorch. The questions span beginner, intermediate, and advanced levels, covering tensors, autograd, neural networks, optimization, GPU acceleration, deployment, and troubleshooting.
Section A: Basic-Level Questions
1. What is PyTorch?
PyTorch is an open-source deep learning framework developed by Facebook AI Research. It provides:
- Dynamic computational graphs
- Efficient tensor operations
- Automatic differentiation
- High flexibility for research and prototyping
2. What is a tensor in PyTorch?
A tensor is a multidimensional array, similar to a NumPy array. Unlike NumPy arrays, which live on the CPU, PyTorch tensors can also be placed on a GPU (CUDA) and can take part in autograd.
3. How do you create a tensor in PyTorch?
x = torch.tensor([1, 2, 3])
Other methods:
- torch.zeros()
- torch.ones()
- torch.randn()
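As a minimal sketch of the constructors listed above (the shapes and variable names are arbitrary choices for illustration, and the default floating-point dtype is assumed to be unchanged):

```python
import torch

zeros = torch.zeros(2, 3)        # 2x3 tensor filled with 0.0
ones = torch.ones(2, 3)          # 2x3 tensor filled with 1.0
noise = torch.randn(2, 3)        # 2x3 samples from a standard normal distribution
ints = torch.tensor([1, 2, 3])   # dtype is inferred from the Python integers

print(zeros.shape, ones.dtype, ints.dtype)
```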
4. What is the difference between NumPy arrays and PyTorch tensors?
| Feature | NumPy | PyTorch |
|---|---|---|
| GPU Support | ❌ No | ✔ Yes (cuda) |
| Autograd | ❌ No | ✔ Yes |
| Deep Learning | Indirect | Native |
5. How do you check if CUDA is available?
torch.cuda.is_available()
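In practice the check is usually combined with a device selection so the same script runs with or without a GPU. A small sketch, where the variable names are illustrative:

```python
import torch

# Fall back to the CPU when no CUDA device is visible.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.ones(3, device=device)  # tensor is created directly on the chosen device
print(x.device.type)
```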
6. What is Autograd in PyTorch?
Autograd is PyTorch’s automatic differentiation engine.
It tracks operations and computes gradients for tensors with requires_grad=True.
7. How do you disable gradient calculation?
with torch.no_grad():
    output = model(x)
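The effect can be checked on a small stand-in model (the nn.Linear layer here is only an assumption for illustration): outputs computed inside the context carry no graph.

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)
x = torch.randn(1, 4)

tracked = model(x)            # normal forward pass, graph is recorded
with torch.no_grad():
    untracked = model(x)      # same computation, no graph is recorded

print(tracked.requires_grad, untracked.requires_grad)  # True False
```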
8. Difference between model.train() and model.eval()?
| Mode | Purpose |
|---|---|
| model.train() | Enables dropout, batchnorm updates |
| model.eval() | Turns off dropout, uses running stats |
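The difference is easy to see on a standalone dropout layer. This is a sketch with an arbitrary input of ones, not a full model:

```python
import torch
import torch.nn as nn

drop = nn.Dropout(p=0.5)
x = torch.ones(1000)

drop.train()
train_out = drop(x)   # entries are zeroed at random; survivors are scaled by 1 / (1 - p) = 2

drop.eval()
eval_out = drop(x)    # dropout is a no-op in eval mode

print(train_out.unique(), torch.equal(eval_out, x))
```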
9. What is a DataLoader?
A DataLoader:
- Loads data in batches
- Supports shuffling
- Uses multiprocessing (num_workers)
loader = DataLoader(dataset, batch_size=32, shuffle=True)
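A self-contained version of the call above, assuming a synthetic TensorDataset of 100 samples (the data and sizes are made up for illustration):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

features = torch.randn(100, 8)
labels = torch.randint(0, 2, (100,))
dataset = TensorDataset(features, labels)

# num_workers=0 loads in the main process; raise it to use worker processes.
loader = DataLoader(dataset, batch_size=32, shuffle=True, num_workers=0)

batch_sizes = [xb.shape[0] for xb, yb in loader]
print(batch_sizes)  # the last batch holds the remaining 4 samples
```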
10. What is the purpose of an optimizer?
Optimizers update model parameters (weights) using gradients during training.
Examples:
- SGD
- Adam
- RMSProp
Section B: Intermediate-Level Questions
11. What does backward() do?
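Computes the gradient of the loss with respect to every leaf tensor in the computation graph that has requires_grad=True, and accumulates the result in each tensor's .grad attribute: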
loss.backward()
12. How do you update parameters in PyTorch?
Call optimizer.step(), but only after clearing the old gradients and running the backward pass:
optimizer.zero_grad()
loss.backward()
optimizer.step()
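Put together, one training step might look like the following sketch. The linear model, MSE loss, learning rate, and random data are assumptions chosen only to make it runnable:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(3, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.MSELoss()
x, y = torch.randn(16, 3), torch.randn(16, 1)

before = model.weight.detach().clone()

optimizer.zero_grad()          # clear gradients left over from the previous step
loss = loss_fn(model(x), y)    # forward pass
loss.backward()                # populate .grad on every parameter
optimizer.step()               # apply the update rule using those gradients

changed = not torch.equal(before, model.weight)
print(changed)
```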
13. What is a custom Dataset class?
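A user-defined dataset that inherits from torch.utils.data.Dataset and implements __len__ and __getitem__.
class MyDataset(Dataset):
    def __len__(self):
        return len(data)
    def __getitem__(self, idx):
        # data and labels are assumed to be sequences defined elsewhere
        return data[idx], labels[idx]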
14. What is the purpose of collate_fn in DataLoader?
It defines how a batch of samples is combined.
Useful for:
- Variable-length sequences (text)
- Audio clips
- Complex structures
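For variable-length sequences, one common approach is to pad each batch to its longest member. The sketch below assumes a toy list of integer sequences; pad_collate is a hypothetical helper name:

```python
import torch
from torch.nn.utils.rnn import pad_sequence
from torch.utils.data import DataLoader

# Toy dataset: three sequences of different lengths.
sequences = [torch.tensor([1, 2, 3]), torch.tensor([4, 5]), torch.tensor([6])]

def pad_collate(batch):
    """Pad every sequence in the batch to the longest one and keep the true lengths."""
    lengths = torch.tensor([len(seq) for seq in batch])
    padded = pad_sequence(batch, batch_first=True, padding_value=0)
    return padded, lengths

loader = DataLoader(sequences, batch_size=3, collate_fn=pad_collate)
padded, lengths = next(iter(loader))
print(padded)
print(lengths)
```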
15. Explain dynamic computation graph.
PyTorch builds the graph on-the-fly during execution.
This means:
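- Flexible designs: ordinary Python loops and conditionals can shape the model
- Easy debugging with standard Python tools
- Well suited to models whose structure depends on the input, as is common in NLP and RL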
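A small illustration of the graph being built at run time: the depth of the graph below depends on an ordinary Python argument. The function name repeat_double is hypothetical.

```python
import torch

def repeat_double(x, steps):
    # The loop runs in plain Python; each iteration adds one multiply node to the graph.
    for _ in range(steps):
        x = x * 2
    return x

x = torch.tensor(1.0, requires_grad=True)
y = repeat_double(x, steps=3)   # y = 8 * x for this call
y.backward()
print(y.item(), x.grad.item())  # 8.0 8.0
```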
16. How do you save and load models?
Recommended method: save only the weights (the state_dict), then load them into a model of the same architecture:
torch.save(model.state_dict(), 'model.pth')
model.load_state_dict(torch.load('model.pth'))
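A round trip can be sketched without touching the filesystem by using an in-memory buffer in place of 'model.pth'. The buffer and the small nn.Linear model are assumptions for illustration:

```python
import io
import torch
import torch.nn as nn

model = nn.Linear(4, 2)

buffer = io.BytesIO()                    # stands in for a file such as 'model.pth'
torch.save(model.state_dict(), buffer)

buffer.seek(0)
restored = nn.Linear(4, 2)               # the architecture must match the saved weights
restored.load_state_dict(torch.load(buffer, map_location="cpu"))
restored.eval()                          # switch to inference mode before predicting

print(torch.equal(model.weight, restored.weight))
```

Passing map_location lets weights saved on a GPU be loaded on a CPU-only machine.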
17. What are nn.Module and nn.functional?
| Component | Description |
|---|---|
| nn.Module | Layer/object class (state + parameters) |
| nn.functional | Stateless functions (like F.relu) |
Example:
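- nn.Linear() is a module that owns its weight and bias parameters
- F.linear() is a plain function: the weight and bias must be passed in on every call
- nn.ReLU() and F.relu() compute the same result; the module form has no parameters but can be placed in containers such as nn.Sequential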
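The relationship between the two styles can be checked directly. In this sketch the functional call reuses the parameters stored on an nn.Linear module:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

layer = nn.Linear(4, 2)          # module: owns weight and bias
x = torch.randn(3, 4)

out_module = layer(x)
out_functional = F.linear(x, layer.weight, layer.bias)  # function: parameters passed explicitly

print(torch.allclose(out_module, out_functional))
```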
18. What are hooks used for?
Hooks allow inspecting:
- Layer inputs/outputs
- Gradients
- Activations
Used for debugging.
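As a sketch of the idea, a forward hook can capture an intermediate activation. The model layout and the save_output helper are illustrative assumptions:

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
activations = {}

def save_output(module, inputs, output):
    # Called after the module's forward pass; store a detached copy for inspection.
    activations["relu"] = output.detach()

handle = model[1].register_forward_hook(save_output)
model(torch.randn(5, 4))
handle.remove()   # remove the hook once it is no longer needed

print(activations["relu"].shape)
```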
19. How does PyTorch handle broadcasting?
Tensors with different shapes can be automatically expanded to match dimensions following NumPy broadcasting rules.
Example:
a + b # if shapes are compatible
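A concrete case, with shapes chosen for illustration: a column of shape (3, 1) and a row of shape (1, 4) broadcast to a (3, 4) result.

```python
import torch

a = torch.arange(3).reshape(3, 1)   # shape (3, 1): [[0], [1], [2]]
b = torch.arange(4).reshape(1, 4)   # shape (1, 4): [[0, 1, 2, 3]]

c = a + b                           # size-1 dimensions are stretched; result is (3, 4)
print(c.shape)
print(c)
```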
20. How to perform gradient clipping?
Used to prevent exploding gradients:
torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)
Section C: Advanced-Level Questions
21. How does PyTorch's Autograd work internally?
- Each tensor has a grad_fn if created by operations
- Backpropagation follows a reverse traversal of the computation graph
- The graph is freed after backward unless retain_graph=True
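These points can be observed on a two-node graph. The sketch below also shows that repeated backward calls accumulate into .grad:

```python
import torch

x = torch.tensor(2.0, requires_grad=True)   # leaf tensor: created by the user, no grad_fn
y = x * x                                   # result of an operation: has a grad_fn

print(x.grad_fn, y.grad_fn)

y.backward(retain_graph=True)   # keep the graph so backward can run again
y.backward()                    # second pass; the graph is freed afterwards
print(x.grad.item())            # gradients accumulate: 4 + 4 = 8
```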
22. What is the difference between TorchScript and eager mode?
| Mode | Description |
|---|---|
| Eager Mode | Pythonic, dynamic, easy to debug |
| TorchScript | Serialized, optimized, deployable (mobile, C++) |
A TorchScript model is produced by tracing (torch.jit.trace) or scripting (torch.jit.script) an eager-mode model.
23. What is Distributed Data Parallel (DDP)?
DDP allows large-scale training across multiple GPUs or nodes.
Key features:
- Efficient gradient synchronization
- Scalable parallelism
- Better performance than DataParallel
24. What is mixed-precision training?
Using FP16 + FP32 to:
- Reduce memory usage
- Improve speed
- Maintain stability
With AMP:
from torch.cuda.amp import autocast, GradScaler
25. Explain custom loss function creation.
class MyLoss(nn.Module):
    def forward(self, pred, target):
        return torch.mean((pred - target)**2)
26. What is gradient accumulation?
Accumulate gradients over multiple batches to simulate larger batches.
loss = loss / accumulation_steps  # scale so the accumulated gradient is an average
loss.backward()
if (i + 1) % accumulation_steps == 0:
    optimizer.step()
    optimizer.zero_grad()
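The claim that accumulation simulates a larger batch can be checked numerically. This sketch assumes equal-sized micro-batches, a mean-reduced loss, and each micro-batch loss divided by the number of accumulation steps; the model and data are made up:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(3, 1)
loss_fn = nn.MSELoss()
x, y = torch.randn(8, 3), torch.randn(8, 1)

# Reference: gradient from one full batch of 8 samples.
model.zero_grad()
loss_fn(model(x), y).backward()
full_grad = model.weight.grad.clone()

# Accumulated: two micro-batches of 4 samples, each loss scaled by 1 / accumulation_steps.
accumulation_steps = 2
model.zero_grad()
for xb, yb in zip(x.chunk(accumulation_steps), y.chunk(accumulation_steps)):
    loss = loss_fn(model(xb), yb) / accumulation_steps
    loss.backward()              # gradients add up in .grad across iterations
accumulated_grad = model.weight.grad.clone()

print(torch.allclose(full_grad, accumulated_grad, atol=1e-6))
```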
27. How do you detect vanishing or exploding gradients?
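Inspect gradient norms after loss.backward(). clip_grad_norm_ returns the total norm measured before clipping, so passing max_norm=float('inf') reports the norm without modifying the gradients:
total_norm = torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=float('inf'))
print(total_norm)
Norms close to zero point to vanishing gradients; very large, inf, or NaN norms point to exploding gradients.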
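Per-parameter norms show where in the network a problem sits. A minimal sketch, assuming a small made-up model and loss, that computes the norms directly from .grad:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(4, 8), nn.Tanh(), nn.Linear(8, 1))
loss = model(torch.randn(16, 4)).pow(2).mean()
loss.backward()

# L2 norm of the gradient of each parameter tensor, keyed by parameter name.
norms = {name: p.grad.norm().item() for name, p in model.named_parameters()}

# Total norm over all parameters: square root of the sum of squared per-tensor norms.
total_norm = torch.sqrt(sum(p.grad.norm() ** 2 for p in model.parameters()))

for name, value in norms.items():
    print(f"{name}: {value:.4f}")
print(f"total: {total_norm.item():.4f}")
```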
28. What is a DataParallel? Why is it slower than DDP?
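DataParallel:
- Splits each batch across GPUs from a single process
- Slow because one Python process drives every GPU (GIL contention), the model is replicated on each forward pass, and inputs and outputs are scattered and gathered through one device
DDP:
- Runs one process per GPU, so the processes work in parallel
- Synchronizes gradients with an efficient communication backend (for example NCCL)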
29. Explain the difference between tracing and scripting in TorchScript.
| Type | When Used | Limitation |
|---|---|---|
| Tracing | Fixed control flow (CNNs) | Records only the path taken for the example input, so data-dependent loops and if-statements are not captured |
| Scripting | Dynamic models | Supports only a subset of Python |
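The tracing limitation can be seen on a function with a data-dependent branch. In this sketch (the shift function is hypothetical), the trace is recorded with a positive input, so only that branch is kept; PyTorch emits a TracerWarning to flag this.

```python
import torch

def shift(x):
    # Data-dependent branch: which line runs depends on the values in x.
    if x.sum() > 0:
        return x + 10
    return x - 10

# Tracing runs the function once with the example input and records the operations executed.
traced = torch.jit.trace(shift, torch.ones(3))

neg = -torch.ones(3)
eager_out = shift(neg)     # eager mode takes the second branch: x - 10
traced_out = traced(neg)   # the trace replays the recorded branch: x + 10

print(eager_out, traced_out)
```

Scripting with torch.jit.script compiles the source instead of recording one run, which is why it can preserve such branches.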
30. How does PyTorch manage memory on GPUs?
Mechanisms:
- Caching allocator
- Asynchronous execution
- Gradient buffer reuse
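Common error:
CUDA out of memory
Fixes:
- Reduce batch size
- Use mixed precision
- Release cached blocks with torch.cuda.empty_cache() (this returns unused cached memory to the driver; it does not free memory still held by live tensors)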
Section D: Real-World Scenario Questions
31. Your loss is not decreasing. How do you debug?
Checklist:
- Check learning rate
- Inspect preprocessing
- Check data-label alignment
- Visualize gradient norms
- Overfit on a tiny batch
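The last item is a quick sanity check: a correctly wired model and training loop should be able to drive the loss down on a handful of fixed samples. A sketch with made-up data, model size, and hyperparameters:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

# One tiny, fixed batch: 8 random samples with random binary labels.
x, y = torch.randn(8, 10), torch.randint(0, 2, (8,))

losses = []
for _ in range(200):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.4f}")
```

If the loss does not fall on a batch this small, the problem is more likely in the model, loss, or data pipeline than in the hyperparameters.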
32. How to freeze layers in transfer learning?
for param in model.features.parameters():
    param.requires_grad = False
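A fuller sketch on a hypothetical two-layer model (TinyNet and its layer names are assumptions): after freezing, only the remaining trainable parameters are handed to the optimizer, and the frozen layer receives no gradient.

```python
import torch
import torch.nn as nn

class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Linear(8, 4)     # stands in for a pretrained backbone
        self.classifier = nn.Linear(4, 2)   # new head to be trained

    def forward(self, x):
        return self.classifier(torch.relu(self.features(x)))

model = TinyNet()
for param in model.features.parameters():
    param.requires_grad = False             # freeze the backbone

# Pass only the parameters that still require gradients to the optimizer.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(trainable, lr=0.01)

model(torch.randn(3, 8)).sum().backward()
print(len(trainable), model.features.weight.grad)  # 2 None
```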
33. Your model is overfitting. What do you do?
Solutions:
- Increase dropout
- Data augmentation
- Use weight decay
- Early stopping
34. How do you deploy a PyTorch model?
Options:
- TorchScript
- ONNX → TensorRT
- PyTorch Mobile
- FastAPI/Flask REST API
35. How do you profile a PyTorch model?
with torch.profiler.profile() as prof:
    output = model(x)
print(prof.key_averages().table(sort_by="cpu_time_total"))
Section E: Coding Challenges (With Expected Answers)
Challenge 1: Write a PyTorch code to compute gradients of a simple function.
x = torch.tensor(5.0, requires_grad=True)
y = x**2 + 3*x + 1
y.backward()
print(x.grad) # Expected: 2x + 3 = 13
Challenge 2: Define a simple feed-forward neural network.
import torch.nn as nn

class MLP(nn.Module):
    def __init__(self):
        super().__init__()
        # 784 inputs (e.g. a flattened 28x28 image), 10 output classes
        self.fc = nn.Sequential(
            nn.Linear(784, 256),
            nn.ReLU(),
            nn.Linear(256, 10)
        )

    def forward(self, x):
        return self.fc(x)
Challenge 3: Build a custom Dataset for images (for use with a DataLoader).
from PIL import Image
from torch.utils.data import Dataset

class ImageDataset(Dataset):
    def __init__(self, image_paths, transform=None):
        self.paths = image_paths
        self.transform = transform

    def __getitem__(self, idx):
        # Load lazily so only the requested image is read from disk
        img = Image.open(self.paths[idx]).convert("RGB")
        if self.transform:
            img = self.transform(img)
        return img

    def __len__(self):
        return len(self.paths)
Conclusion
This Special Annexure 1 provides a complete interview-ready resource covering:
- Fundamental to advanced PyTorch questions
- Real-world debugging scenarios
- Deployment and optimization
- Coding challenges
This is suitable for:
- Students
- Researchers
- ML engineers
- Candidates preparing for interviews
- Trainers and educators