Video Memory Optimization

This guide is for GPU training.

Video memory optimization reduces the video memory a Program consumes during execution by analyzing the Variables in the Program and reusing the video memory they occupy. Users can apply it from a Python script through the memory_optimize interface. The optimization proceeds in two steps:

  • First, analyze the relationships between the Operators in the Program to determine the remaining lifetime of each Variable;
  • Second, based on those lifetimes, let each later Variable reuse the video memory of a Variable whose lifetime is about to end or has already ended.

z = fluid.layers.sum([x, y])
m = fluid.layers.matmul(y, z)

In this example, x is last used by fluid.layers.sum, so its lifetime ends there and its video memory can be reused by m.
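
The two steps can be illustrated with a small pure-Python sketch. This is a toy model, not Paddle's implementation: the (output, inputs) representation of an Operator and the helper names last_use and plan_reuse are hypothetical, every output is assumed to be a new Variable, and shapes are ignored.

```python
# Toy model of lifetime analysis and buffer reuse. Hypothetical helpers,
# not part of the Paddle API.

def last_use(ops):
    """Map each variable name to the index of the last op that touches it."""
    last = {}
    for i, (out, inputs) in enumerate(ops):
        for name in list(inputs) + [out]:
            last[name] = i
    return last


def plan_reuse(ops):
    """Return {new_var: dead_var}: new_var takes over dead_var's buffer."""
    last = last_use(ops)
    free = []   # variables whose lifetime has ended and whose buffer is unclaimed
    reuse = {}
    for i, (out, inputs) in enumerate(ops):
        # The output may only claim a buffer freed by an EARLIER op, because
        # the inputs of the current op are still being read.
        if free:
            reuse[out] = free.pop(0)
        # Release every input whose last use is this op.
        for name in dict.fromkeys(inputs):
            if last[name] == i:
                free.append(name)
    return reuse


ops = [
    ("z", ["x", "y"]),  # z = sum([x, y])
    ("m", ["y", "z"]),  # m = matmul(y, z)
]
plan = plan_reuse(ops)
print(plan)  # {'m': 'x'}
```

In this sketch z cannot take over the buffer of x, because x is still being read while z is computed; m is the first Variable created after the lifetime of x has ended.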

Disable video memory optimization for specific parts

memory_optimize can exclude specific parts of a network from video memory optimization. To keep the video memory of particular Variables from being reused, pass in a collection of their names through skip_opt_set. memory_optimize can likewise skip the backward part of the network; enable this by passing skip_grads=True.

# Pass the names to skip as a set.
fluid.memory_optimize(fluid.default_main_program(),
        skip_opt_set={"fc"}, skip_grads=True)

In this example, fluid.memory_optimize analyzes the remaining lifetime of the Variables in the default Program, but skips the Variable named fc and all Variables in the backward part of the network. The video memory of these Variables will not be reused by other Variables.
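
The effect of the two skip options can be pictured as a filter over Variable names. The sketch below is a toy model, not Paddle's code: reuse_candidates is a hypothetical helper, and identifying backward Variables by the "@GRAD" marker in their names is an assumption made for illustration.

```python
# Toy filter showing which variables remain eligible for memory reuse.
# Hypothetical helper, not part of the Paddle API.

GRAD_MARKER = "@GRAD"  # assumption: backward (gradient) variables carry this marker


def reuse_candidates(var_names, skip_opt_set=None, skip_grads=False):
    """Return the names that are still allowed to take part in memory reuse."""
    skip = set(skip_opt_set or [])
    kept = []
    for name in var_names:
        if name in skip:
            continue  # explicitly excluded by name
        if skip_grads and GRAD_MARKER in name:
            continue  # backward part of the network is excluded
        kept.append(name)
    return kept


names = ["x", "fc", "fc@GRAD", "loss", "loss@GRAD"]
print(reuse_candidates(names, skip_opt_set={"fc"}, skip_grads=True))
# ['x', 'loss']
print(reuse_candidates(names, skip_opt_set={"fc"}))
# ['x', 'fc@GRAD', 'loss', 'loss@GRAD']
```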

Specify the video memory optimization level

memory_optimize can print information about video memory reuse to help with debugging. Enable this by specifying print_log=True.

memory_optimize supports two levels of video memory optimization, 0 and 1:

  • When the optimization level is 0: after memory_optimize analyzes the remaining lifetime of each Variable, it checks the Variable's shape. Video memory can be reused only between Variables with the same shape;
  • When the optimization level is 1: memory_optimize reuses video memory as much as possible. After the remaining lifetime of each Variable is analyzed, video memory is reused to the greatest extent possible, even between Variables with different shapes.

fluid.memory_optimize(fluid.default_main_program(),
        level=0, print_log=True)

In this example, fluid.memory_optimize analyzes the remaining lifetime of the Variables in the default Program. Video memory is reused only between Variables whose shapes are exactly the same. After the analysis finishes, all debugging information related to video memory reuse is printed.
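
The difference between the two levels can be sketched as a shape check. This is an illustration under stated assumptions, not Paddle's implementation: can_reuse and numel are hypothetical helpers, and the level 1 rule used here (the freed buffer must hold at least as many elements as the new Variable needs) is an assumed reading of "as much as possible".

```python
# Toy shape check for the two optimization levels. Hypothetical helpers,
# not part of the Paddle API.
from functools import reduce
from operator import mul


def numel(shape):
    """Number of elements in a tensor of the given shape."""
    return reduce(mul, shape, 1)


def can_reuse(new_shape, freed_shape, level=0):
    """Decide whether a new variable may take over a freed buffer."""
    if level == 0:
        # Level 0: shapes must match exactly.
        return tuple(new_shape) == tuple(freed_shape)
    if level == 1:
        # Level 1 (assumed rule): shapes may differ as long as the
        # freed buffer is large enough.
        return numel(new_shape) <= numel(freed_shape)
    raise ValueError("level must be 0 or 1")


print(can_reuse([32, 128], [32, 128], level=0))  # True
print(can_reuse([32, 64], [32, 128], level=0))   # False
print(can_reuse([32, 64], [32, 128], level=1))   # True
print(can_reuse([64, 128], [32, 128], level=1))  # False
```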