Annexure 9: PyTorch Glossary of Key Terms (Beginner to Advanced)
This annexure compiles the most essential and frequently used PyTorch terms. It covers foundational concepts, intermediate constructs, and advanced components used in deep learning research and deployment.
A. Beginner-Level Terms
1. Tensor
A multi-dimensional array used for all computations in PyTorch. Analogous to NumPy arrays but with GPU support.
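For instance, a minimal sketch of tensor creation and NumPy interop:

```python
import torch
import numpy as np

# Create a 2x3 tensor and inspect its shape
x = torch.tensor([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
print(x.shape)            # torch.Size([2, 3])

# Convert to and from NumPy (the CPU tensors share memory)
y = torch.from_numpy(np.ones((2, 3)))
z = y.numpy()
```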
2. Tensor Rank
The number of dimensions (e.g., 0D scalar, 1D vector, 2D matrix).
3. Autograd
PyTorch’s automatic differentiation engine that computes gradients for tensors with requires_grad=True.
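A minimal sketch of how autograd records operations and computes a derivative:

```python
import torch

x = torch.tensor(2.0, requires_grad=True)  # track operations on x
y = x ** 2 + 3 * x                         # y = x^2 + 3x
y.backward()                               # fills x.grad with dy/dx
print(x.grad)                              # tensor(7.) since 2x + 3 = 7 at x = 2
```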
4. Computational Graph
A directed graph representing operations performed on tensors. PyTorch builds it dynamically.
5. Gradient
The derivative of a function with respect to its variables; essential for optimization.
6. CUDA
NVIDIA’s GPU computing platform; lets tensor operations run on GPUs, e.g., via tensor.to("cuda") or the older tensor.cuda().
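A common device-agnostic pattern, falling back to the CPU when no GPU is present:

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
x = torch.randn(3, 3).to(device)   # moves the tensor to the GPU when available
```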
7. CPU vs GPU
CPU: a general-purpose processor.
GPU: optimized for massively parallel computation, hence much faster for deep learning.
8. Optimizer
A PyTorch object that updates model parameters based on gradients (e.g., SGD, Adam).
9. Loss Function
A function that measures the error between predictions and targets (e.g., MSELoss, CrossEntropyLoss).
10. Model / Network
A class derived from torch.nn.Module that defines the layers and the forward pass.
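A minimal sketch of a custom model (the class name and layer sizes are hypothetical):

```python
import torch
import torch.nn as nn

class TinyNet(nn.Module):
    """Hypothetical two-layer network for illustration."""
    def __init__(self, in_features=10, hidden=32, out_features=2):
        super().__init__()
        self.fc1 = nn.Linear(in_features, hidden)
        self.fc2 = nn.Linear(hidden, out_features)

    def forward(self, x):
        # Forward pass: linear -> ReLU -> linear
        return self.fc2(torch.relu(self.fc1(x)))

model = TinyNet()
out = model(torch.randn(4, 10))   # run the forward pass on a dummy batch
```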
11. DataLoader
Iterates over a Dataset in batches, with optional shuffling and multiprocess loading; see the combined sketch after term 12.
12. Dataset
An abstraction over a collection of samples; can be a built-in dataset such as MNIST or a custom subclass of torch.utils.data.Dataset.
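A combined sketch for terms 11 and 12, using hypothetical random data:

```python
import torch
from torch.utils.data import Dataset, DataLoader

class RandomDataset(Dataset):
    """100 random (feature, label) pairs, purely for illustration."""
    def __init__(self):
        self.x = torch.randn(100, 10)
        self.y = torch.randint(0, 2, (100,))

    def __len__(self):
        return len(self.x)

    def __getitem__(self, idx):
        return self.x[idx], self.y[idx]

# Batches of 16, reshuffled every epoch
loader = DataLoader(RandomDataset(), batch_size=16, shuffle=True)
for features, labels in loader:
    pass  # one training step per batch goes here
```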
13. Epoch
One complete pass over the entire training dataset.
14. Batch Size
The number of samples processed before each parameter update.
15. Learning Rate
Controls the size of parameter updates during training.
B. Intermediate-Level Terms
16. Module
The base class for all neural network components (nn.Module).
17. Forward Pass
Executing the model on input data to get outputs.
18. Backward Pass
Computing gradients of the loss with respect to the model parameters, triggered by loss.backward().
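A minimal training step tying together the model, loss function, optimizer, forward pass, and backward pass (all names and sizes are illustrative):

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 2)                        # stand-in model
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

inputs = torch.randn(16, 10)                    # dummy batch
targets = torch.randint(0, 2, (16,))

optimizer.zero_grad()                           # clear old gradients
outputs = model(inputs)                         # forward pass
loss = criterion(outputs, targets)              # measure the error
loss.backward()                                 # backward pass: compute gradients
optimizer.step()                                # update the parameters
```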
19. state_dict
A Python dictionary mapping parameter (and buffer) names to tensors; both models and optimizers expose one for saving and loading.
20. Checkpoint
Saved model state used for restoring or resuming training.
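A sketch of saving and restoring a checkpoint, covering terms 19 and 20 (model, optimizer, and file name are illustrative):

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 2)
optimizer = torch.optim.Adam(model.parameters())

# Save both state_dicts so training can resume later
torch.save({"model": model.state_dict(),
            "optimizer": optimizer.state_dict()}, "checkpoint.pt")

# Restore
ckpt = torch.load("checkpoint.pt")
model.load_state_dict(ckpt["model"])
optimizer.load_state_dict(ckpt["optimizer"])
```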
21. Activation Function
Introduces non-linearity (e.g., ReLU, Tanh, Sigmoid).
22. Dropout
Regularization technique that randomly zeroes activations during training.
23. Batch Normalization
Normalizes layer inputs for stable training.
24. Gradient Clipping
Restricts gradient magnitude to avoid exploding gradients.
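A sketch, assuming a model and a freshly computed backward pass:

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 2)
loss = model(torch.randn(4, 10)).sum()
loss.backward()

# Rescale gradients so their global norm does not exceed 1.0;
# call this after backward() and before optimizer.step()
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
```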
25. Weight Decay
A penalty on weight magnitudes during optimization; equivalent to L2 regularization in plain SGD, and applied in decoupled form in optimizers like AdamW.
26. Learning Rate Scheduler
Adjusts learning rate dynamically (StepLR, ReduceLROnPlateau).
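A sketch using StepLR (the optimizer and schedule values are illustrative):

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
# Multiply the learning rate by 0.1 every 10 epochs
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.1)

for epoch in range(30):
    # ... one epoch of training (forward, backward, optimizer.step()) ...
    scheduler.step()   # advance the schedule once per epoch
```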
27. TorchScript
A PyTorch model representation used for deployment and optimization.
28. ONNX
Open Neural Network Exchange format for cross-platform model deployment.
29. Mixed Precision Training
Training with float16 + float32 for speed and lower memory use.
30. AMP (Automatic Mixed Precision)
PyTorch’s tooling for safe mixed precision: torch.autocast plus a gradient scaler, historically under torch.cuda.amp.
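A sketch of the standard AMP training step; it assumes a CUDA device is available:

```python
import torch
import torch.nn as nn

device = "cuda"
model = nn.Linear(10, 2).to(device)
optimizer = torch.optim.Adam(model.parameters())
scaler = torch.cuda.amp.GradScaler()     # scales the loss to avoid fp16 underflow

inputs = torch.randn(16, 10, device=device)
targets = torch.randint(0, 2, (16,), device=device)

optimizer.zero_grad()
with torch.autocast(device_type="cuda"):             # lower precision where safe
    loss = nn.functional.cross_entropy(model(inputs), targets)
scaler.scale(loss).backward()                        # backward on the scaled loss
scaler.step(optimizer)                               # unscales, then steps
scaler.update()                                      # adjusts the scale factor
```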
C. Advanced-Level Terms
31. DDP (Distributed Data Parallel)
Parallel training across multiple GPUs/machines.
32. RPC Framework
Enables remote procedure calls between workers (torch.distributed.rpc), supporting model-parallel distributed training.
33. JIT Compiler
Just-in-time compiler (torch.jit) that compiles TorchScript models for faster execution.
34. FX Tracer
Part of torch.fx; symbolically traces a model into a graph intermediate representation for programmatic transformations.
35. Quantization
Reduces model precision (INT8, FP16) for deployment.
36. Pruning
Removing unimportant weights to reduce model size.
37. Triton
A Python-based language and compiler for writing custom GPU kernels; PyTorch’s Inductor backend uses it for code generation.
38. Memory Pinning
Pinned (page-locked) memory speeds up CPU → GPU transfers; enabled in DataLoader via pin_memory=True.
39. TensorRT
NVIDIA’s engine for optimizing trained models for fast inference; PyTorch models typically reach it via ONNX export or Torch-TensorRT.
40. Batch Inference
Running prediction on many inputs simultaneously for faster inference.
41. Micro-Batching
Splitting large batches into smaller ones to avoid out-of-memory (OOM) errors.
42. Gradient Accumulation
Accumulating gradients over several batches to simulate a larger batch size.
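A sketch with illustrative sizes; the effective batch size is batch_size × accum_steps:

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 2)
optimizer = torch.optim.Adam(model.parameters())
accum_steps = 4

for step in range(8):
    inputs = torch.randn(16, 10)
    targets = torch.randint(0, 2, (16,))
    loss = nn.functional.cross_entropy(model(inputs), targets)
    (loss / accum_steps).backward()      # gradients accumulate across calls
    if (step + 1) % accum_steps == 0:
        optimizer.step()                 # update once per accumulated batch
        optimizer.zero_grad()
```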
43. Graph Mode Execution
Optimized execution where dynamic graphs are converted to static graphs.
44. Autocast
Automatically selects precision for operations during mixed precision training.
45. Profiler
PyTorch tool to measure performance of CPU/GPU execution.
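A minimal CPU-only profiling sketch:

```python
import torch
from torch.profiler import profile, ProfilerActivity

model = torch.nn.Linear(10, 2)
with profile(activities=[ProfilerActivity.CPU]) as prof:
    model(torch.randn(4, 10))
print(prof.key_averages().table(sort_by="cpu_time_total"))
```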
46. Custom Dataset & Collate Function
User-defined logic for loading data and customizing batch assembly.
47. Parameter Server
Architecture for distributed training used in very large models.
48. Checkpoint Sharding
Splitting model checkpoints across multiple files/devices.
49. Zero Redundancy Optimizer (ZeRO)
A strategy that shards optimizer state (and optionally gradients and parameters) across workers to train extremely large models memory-efficiently.
50. TorchDynamo
The graph-capture front end of torch.compile; it rewrites Python bytecode to extract graphs that can be compiled and run faster.
51. Inductor
PyTorch’s native deep-learning compiler backend; the default code generator behind torch.compile.
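Terms 50 and 51 are reached through a single entry point, torch.compile (available since PyTorch 2.0); a minimal sketch:

```python
import torch

model = torch.nn.Linear(10, 2)
# TorchDynamo captures the graph; Inductor compiles it by default
compiled_model = torch.compile(model)
out = compiled_model(torch.randn(4, 10))
```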
52. Functorch / vmap
Function transforms, now available as torch.func, including vmap for vectorizing a function over a batch dimension.
53. Autograd Grad Mode
Context managers that control gradient tracking (torch.no_grad, torch.inference_mode).
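A sketch of the two modes:

```python
import torch

model = torch.nn.Linear(10, 2)
x = torch.randn(4, 10)

with torch.no_grad():          # disables gradient tracking
    y1 = model(x)

with torch.inference_mode():   # stricter and typically faster for pure inference
    y2 = model(x)
```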
54. Lazy Tensors
Defers execution for optimization in large-scale systems.
55. TorchServe
Framework for serving PyTorch models in production.
56. Accelerate (HuggingFace)
Library simplifying multi-GPU/mixed-precision training.
57. Flash Attention
Optimized attention operation for large transformer models.
58. Memory-Efficient Attention
Techniques reducing memory used by transformer layers.
59. Kernel Fusion
Combining multiple GPU operations into one for speed.
60. PyTorch Lightning
High-level framework simplifying training loops while keeping flexibility.
D. Practical Quick Reference Table
| Term | Category | Short Definition |
|---|---|---|
| Tensor | Basic | Main data structure |
| Autograd | Basic | Automatic gradient tracking |
| DataLoader | Basic | Batch data iterator |
| Optimizer | Basic | Updates model parameters |
| Scheduler | Intermediate | Adjusts learning rate |
| Dropout | Intermediate | Prevents overfitting |
| AMP | Intermediate | Mixed precision tool |
| DDP | Advanced | Multi-GPU training |
| TorchScript | Advanced | Deployable model format |
| Quantization | Advanced | Model size reduction |
| Profiler | Advanced | Performance measurement |
E. Conclusion
This glossary equips learners, students, and professionals with the most relevant PyTorch terminology. It acts as a rapid reference to reinforce conceptual clarity, improve coding fluency, and support advanced research and deployment workflows.