Publisher's Synopsis
Drastically accelerate the training of complex models with PyTorch and Horovod, extracting the best performance from any computing environment.
Key Features
- Train machine learning models faster by using PyTorch and Horovod
- Reduce model-building time using single or multiple devices, on-premises or in the cloud
- Focus on model quality by rapidly evaluating different model configurations
Book Description
With the advent of complex models, data scientists need to know how to accelerate the model-building process so that they can invest more time in improving model quality by evaluating different configurations. This book will guide you through improving the performance of the training step with two powerful Python frameworks: PyTorch and Horovod.

In the first section of the book, you'll understand how model complexity directly translates into the time required to train it, and you'll explore distinct levels of performance tuning to accelerate the training process.

After this introduction, you'll dive into a set of techniques to accelerate model training. First, you'll learn how to use specialized libraries to optimize math and memory operations. Next, you'll build an efficient data pipeline to keep accelerators busy during the entire training run. After that, you'll learn how to reduce model complexity and adopt mixed precision, decreasing both the computing time needed to train the model and its memory consumption. At the end of this part, you'll learn how to fuse simple kernel operations to reduce overhead and improve performance.

As you progress, you'll cover the basic concepts of distributed training. You'll learn how to use PyTorch to harness the computing power of multicore systems and how to spread the training step across multiple GPUs in a single machine. Finally, you'll be introduced to Horovod and learn how to easily distribute training across multiple machines, each with single or multiple devices.

What you will learn
- Use specialized libraries to accelerate math operations
- Build a data pipeline to boost GPU execution
- Reduce model complexity without penalizing accuracy
- Evaluate opportunities to adopt mixed precision
- Employ kernel fusion techniques to optimize GPU usage
- Distribute the training step across multiple machines and devices
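To give a flavor of the techniques listed above, the mixed-precision item can be sketched with PyTorch's autocast API. This is a minimal illustration, not an excerpt from the book: the model and data are toy stand-ins, and on a GPU you would typically pair `device_type="cuda"` and `torch.float16` with a `GradScaler` rather than the CPU/bfloat16 combination shown here.

```python
import torch

# Toy model and random data, standing in for a real training setup.
model = torch.nn.Linear(128, 10)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

inputs = torch.randn(32, 128)
targets = torch.randint(0, 10, (32,))

# autocast runs eligible ops in lower precision, cutting compute time
# and memory use. On a GPU you would use device_type="cuda" with
# torch.float16 and scale the loss with torch.cuda.amp.GradScaler.
with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    loss = torch.nn.functional.cross_entropy(model(inputs), targets)

loss.backward()
optimizer.step()
```

The forward pass runs under autocast so matrix multiplications execute in reduced precision, while the backward pass and optimizer step stay outside the context, as PyTorch recommends.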
Who this book is for
The primary audience of the book is entry-level data scientists and machine learning engineers.