Input Filters

The input filters are separated into two categories: observation filters and reward filters.
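
For context, filters are usually composed into an InputFilter container that is attached to the environment or agent parameters. The sketch below follows the conventions used in the Coach presets and assumes InputFilter lives in rl_coach.filters.filter and exposes add_observation_filter / add_reward_filter methods; treat it as an illustration rather than a normative API reference.

    from rl_coach.filters.filter import InputFilter
    from rl_coach.filters.observation import ObservationRGBToYFilter, ObservationStackingFilter
    from rl_coach.filters.reward import RewardClippingFilter

    # a small preprocessing pipeline: grayscale the frames, stack the last 4 frames,
    # and clip the environment rewards to [-1, 1]
    input_filter = InputFilter()
    input_filter.add_observation_filter('observation', 'to_grayscale', ObservationRGBToYFilter())
    input_filter.add_observation_filter('observation', 'stacking', ObservationStackingFilter(stack_size=4))
    input_filter.add_reward_filter('clipping', RewardClippingFilter(clipping_low=-1.0, clipping_high=1.0))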

Observation Filters

ObservationClippingFilter

class rl_coach.filters.observation.ObservationClippingFilter(clipping_low: float = -inf, clipping_high: float = inf)[source]

Clips the observation values to a given range. For example, if the observation consists of measurements in an arbitrary range and we want to bound their minimum and maximum values, we can define a range and clip the measurements to it.

Parameters
  • clipping_low – The minimum value to allow in the observation; lower values are clipped to it

  • clipping_high – The maximum value to allow in the observation; higher values are clipped to it
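
For instance, a minimal sketch that keeps measurement values inside an arbitrary, illustrative range of [-200, 200]:

    from rl_coach.filters.observation import ObservationClippingFilter

    # values below -200 become -200, values above 200 become 200
    clipping_filter = ObservationClippingFilter(clipping_low=-200.0, clipping_high=200.0)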

ObservationCropFilter

class rl_coach.filters.observation.ObservationCropFilter(crop_low: numpy.ndarray = None, crop_high: numpy.ndarray = None)[source]

Crops the observation to a given crop window. For example, in Atari, the observations are images with a shape of 210x160. Usually, we will want to crop the observation to a 160x160 square before rescaling it.

Parameters
  • crop_low – A vector where each dimension describes the start index for cropping the observation in the corresponding dimension. A value of -1 will be mapped to the maximum size of that dimension

  • crop_high – A vector where each dimension describes the end index for cropping the observation in the corresponding dimension. A value of -1 will be mapped to the maximum size of that dimension
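
As a sketch, cropping a single-channel 210x160 Atari frame to its bottom 160x160 square; the row offset of 50 is an illustrative choice, and -1 maps to the maximum size of each dimension:

    import numpy as np
    from rl_coach.filters.observation import ObservationCropFilter

    # keep rows 50..210 and all 160 columns, yielding a 160x160 observation
    crop_filter = ObservationCropFilter(crop_low=np.array([50, 0]), crop_high=np.array([-1, -1]))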

ObservationMoveAxisFilter

class rl_coach.filters.observation.ObservationMoveAxisFilter(axis_origin: int = None, axis_target: int = None)[source]

Reorders the axes of the observation. This can be useful when the observation is an image, and we want to move the channel axis to be the last axis instead of the first axis.

Parameters
  • axis_origin – The axis to move

  • axis_target – Where to move the selected axis to
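
For example, a sketch that reorders a channels-first image (e.g. 3x84x84) into a channels-last one (84x84x3):

    from rl_coach.filters.observation import ObservationMoveAxisFilter

    # move axis 0 (the channels axis) to the last position
    move_axis_filter = ObservationMoveAxisFilter(axis_origin=0, axis_target=-1)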

ObservationNormalizationFilter

class rl_coach.filters.observation.ObservationNormalizationFilter(clip_min: float = -5.0, clip_max: float = 5.0, name='observation_stats')[source]

Normalizes the observation values with a running mean and standard deviation of all the observations seen so far. The normalization is performed element-wise. Additionally, when working with multiple workers, the statistics used for the normalization operation are accumulated over all the workers.

Parameters
  • clip_min – The minimum value to allow after normalizing the observation

  • clip_max – The maximum value to allow after normalizing the observation
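
A minimal sketch using the default clipping range; the clip values shown simply restate the defaults from the signature above:

    from rl_coach.filters.observation import ObservationNormalizationFilter

    # normalize element-wise with running statistics, then clip to [-5, 5]
    normalization_filter = ObservationNormalizationFilter(clip_min=-5.0, clip_max=5.0)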

ObservationReductionBySubPartsNameFilter

class rl_coach.filters.observation.ObservationReductionBySubPartsNameFilter(part_names: List[str], reduction_method: rl_coach.filters.observation.observation_reduction_by_sub_parts_name_filter.ObservationReductionBySubPartsNameFilter.ReductionMethod)[source]

Allows keeping only parts of the observation, selected by name. This is useful when the environment provides a measurements vector containing several different measurements, but the agent should only see some of them. For example, the CARLA environment extracts multiple measurements that can be used by the agent, such as speed and location; if we want to use only the speed, it can be done with this filter. This filter currently works only for VectorObservationSpace observations.

Parameters
  • part_names – A list of part names to reduce

  • reduction_method – The reduction method to use: keep or discard the given parts
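
A sketch that keeps only a speed measurement from a measurements vector; the part name 'forward_speed' is a hypothetical example, and the Keep member of the ReductionMethod enum is assumed from the keep/discard description above:

    from rl_coach.filters.observation import ObservationReductionBySubPartsNameFilter

    # keep only the named part of the measurements vector, discarding everything else
    reduction_filter = ObservationReductionBySubPartsNameFilter(
        part_names=['forward_speed'],
        reduction_method=ObservationReductionBySubPartsNameFilter.ReductionMethod.Keep)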

ObservationRescaleSizeByFactorFilter

class rl_coach.filters.observation.ObservationRescaleSizeByFactorFilter(rescale_factor: float)[source]

Rescales an image observation by some factor. For example, the image size can be reduced by a factor of 2.

Parameters
  • rescale_factor – The factor by which the observation will be rescaled
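
For example, a sketch that halves the observation size:

    from rl_coach.filters.observation import ObservationRescaleSizeByFactorFilter

    # shrink the image observation to half its original size
    rescale_by_factor_filter = ObservationRescaleSizeByFactorFilter(rescale_factor=0.5)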

ObservationRescaleToSizeFilter

class rl_coach.filters.observation.ObservationRescaleToSizeFilter(output_observation_space: rl_coach.spaces.PlanarMapsObservationSpace)[source]

Rescales an image observation to a given size. The target size does not necessarily keep the aspect ratio of the original observation. Warning: this requires the input observation to be of type uint8 due to scipy requirements!

Parameters
  • output_observation_space – The output observation space
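
A sketch that rescales frames to 84x84x3, assuming ImageObservationSpace (a planar-maps observation space from rl_coach.spaces) can describe the target space, as is done in the Coach presets:

    import numpy as np
    from rl_coach.filters.observation import ObservationRescaleToSizeFilter
    from rl_coach.spaces import ImageObservationSpace

    # rescale each observation to an 84x84 RGB image with values in [0, 255]
    rescale_to_size_filter = ObservationRescaleToSizeFilter(
        output_observation_space=ImageObservationSpace(np.array([84, 84, 3]), high=255))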

ObservationRGBToYFilter

class rl_coach.filters.observation.ObservationRGBToYFilter[source]

Converts a color image observation specified using the RGB encoding into a grayscale image observation, by keeping only the luminance (Y) channel of the YUV encoding. This can be useful if the colors in the original image are not relevant for solving the task at hand. The channels axis is assumed to be the last axis.
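
A sketch converting RGB frames to grayscale; if the observation is channels-first, it can be reordered with ObservationMoveAxisFilter beforehand, since this filter assumes the channels axis is last:

    from rl_coach.filters.observation import ObservationRGBToYFilter

    # keep only the luminance (Y) channel of the image
    rgb_to_y_filter = ObservationRGBToYFilter()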

ObservationSqueezeFilter

class rl_coach.filters.observation.ObservationSqueezeFilter(axis: int = None)[source]

Removes redundant axes from the observation, which are axes with a dimension of 1.

Parameters
  • axis – Specifies which axis to remove. If set to None, all the axes of size 1 will be removed
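
For instance, a sketch that removes a singleton channel axis from an observation shaped 84x84x1 (the shape is an illustrative assumption):

    from rl_coach.filters.observation import ObservationSqueezeFilter

    # remove the third axis, which has size 1; with axis=None, every size-1 axis would be removed
    squeeze_filter = ObservationSqueezeFilter(axis=2)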

ObservationStackingFilter

class rl_coach.filters.observation.ObservationStackingFilter(stack_size: int, stacking_axis: int = -1)[source]

Stacks several observations on top of each other. For image observations, this creates a 3D blob. The stacking is done lazily in order to reduce memory consumption, by wrapping the observations in the stack with a LazyStack object. For this reason, the ObservationStackingFilter must be the last filter in the input filter stack. This filter is stateful, since it stores the previous step's result and depends on it. The filter adds an additional dimension to the output observation.

Warning: the filter replaces the observation with a LazyStack object, so no filters should be applied after this filter. Applying more filters will cause the LazyStack object to be converted to a numpy array and increase the memory footprint.

Parameters
  • stack_size – The number of previous observations in the stack

  • stacking_axis – The axis along which the observations are stacked
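
A sketch stacking the four most recent frames along the last axis; remember to register this as the final observation filter in the pipeline:

    from rl_coach.filters.observation import ObservationStackingFilter

    # lazily stack the 4 most recent observations along the last axis
    stacking_filter = ObservationStackingFilter(stack_size=4, stacking_axis=-1)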

ObservationToUInt8Filter

class rl_coach.filters.observation.ObservationToUInt8Filter(input_low: float, input_high: float)[source]

Converts a floating point observation into an unsigned 8-bit integer observation. This is mostly useful for reducing memory consumption, and is typically used for image observations. The filter first spreads the observation values over the range 0-255 and then discretizes them into integer values.

Parameters
  • input_low – The lowest value currently present in the observation

  • input_high – The highest value currently present in the observation
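
For example, a sketch converting observations that currently lie in [0.0, 1.0] into uint8 values in [0, 255] (the input range is an illustrative assumption):

    from rl_coach.filters.observation import ObservationToUInt8Filter

    # spread [0.0, 1.0] over [0, 255] and discretize to unsigned 8-bit integers
    to_uint8_filter = ObservationToUInt8Filter(input_low=0.0, input_high=1.0)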

Reward Filters

RewardClippingFilter

class rl_coach.filters.reward.RewardClippingFilter(clipping_low: float = -inf, clipping_high: float = inf)[source]

Clips the reward values into a given range. For example, in DQN, the Atari rewards are clipped to the range [-1, 1] in order to control the scale of the returns.

Parameters
  • clipping_low – The low threshold for reward clipping

  • clipping_high – The high threshold for reward clipping
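
A sketch of the DQN-style clipping mentioned above:

    from rl_coach.filters.reward import RewardClippingFilter

    # clip every reward to the range [-1, 1]
    reward_clipping_filter = RewardClippingFilter(clipping_low=-1.0, clipping_high=1.0)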

RewardNormalizationFilter

class rl_coach.filters.reward.RewardNormalizationFilter(clip_min: float = -5.0, clip_max: float = 5.0)[source]

Normalizes the reward values with a running mean and standard deviation of all the rewards seen so far. When working with multiple workers, the statistics used for the normalization operation are accumulated over all the workers.

Parameters
  • clip_min – The minimum value to allow after normalizing the reward

  • clip_max – The maximum value to allow after normalizing the reward
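
A minimal sketch using the default clipping range from the signature above:

    from rl_coach.filters.reward import RewardNormalizationFilter

    # normalize rewards with running statistics, then clip to [-5, 5]
    reward_normalization_filter = RewardNormalizationFilter(clip_min=-5.0, clip_max=5.0)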

RewardRescaleFilter

class rl_coach.filters.reward.RewardRescaleFilter(rescale_factor: float)[source]

Rescales the reward by a given factor. Rescaling the rewards of the environment has been observed to have a large effect (negative or positive) on the behavior of the learning process.

Parameters
  • rescale_factor – The reward rescaling factor by which the reward will be multiplied
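
For instance, a sketch that scales rewards down; the factor of 0.01 is an arbitrary illustration:

    from rl_coach.filters.reward import RewardRescaleFilter

    # multiply every reward by 0.01, per the parameter description above
    reward_rescale_filter = RewardRescaleFilter(rescale_factor=0.01)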