Input Filters¶
The input filters are separated into two categories: observation filters and reward filters.
Observation Filters¶
ObservationClippingFilter¶
class rl_coach.filters.observation.ObservationClippingFilter(clipping_low: float = -inf, clipping_high: float = inf)

Clips the observation values to a given range of values. For example, if the observation consists of measurements in an arbitrary range, and we want to control the minimum and maximum values of these observations, we can define a range and clip the values of the measurements.
- Parameters
clipping_low – The minimum value to allow after clipping the observation
clipping_high – The maximum value to allow after clipping the observation
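The clipping operation itself is simply an element-wise clamp. A minimal NumPy sketch of what this filter does to an observation (the measurement values are illustrative, and this is not the rl_coach API itself):

```python
import numpy as np

# Hypothetical measurement vector with some out-of-range values
observation = np.array([-3.2, 0.5, 12.7, 1.1])

# Clamp each element into [clipping_low, clipping_high]
clipping_low, clipping_high = -1.0, 2.0
clipped = np.clip(observation, clipping_low, clipping_high)
```
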
ObservationCropFilter¶
class rl_coach.filters.observation.ObservationCropFilter(crop_low: numpy.ndarray = None, crop_high: numpy.ndarray = None)

Crops the observation to a given crop window. For example, in Atari, the observations are images with a shape of 210x160. Usually, we will want to crop the observation to a 160x160 square before rescaling it.
- Parameters
crop_low – a vector where each dimension describes the start index for cropping the observation in the corresponding dimension. A value of -1 will be mapped to the maximum size of that dimension
crop_high – a vector where each dimension describes the end index for cropping the observation in the corresponding dimension. A value of -1 will be mapped to the maximum size of that dimension
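Cropping amounts to slicing the observation per dimension. A minimal sketch of the operation (not the rl_coach implementation; for brevity, only -1 entries in the high indices are mapped to the full dimension size, and the crop indices shown are illustrative):

```python
import numpy as np

def crop(observation, crop_low, crop_high):
    # Map a -1 end index to the full size of that dimension
    high = [dim if h == -1 else h for h, dim in zip(crop_high, observation.shape)]
    slices = tuple(slice(lo, hi) for lo, hi in zip(crop_low, high))
    return observation[slices]

# Atari-style 210x160 frame cropped to a 160x160 square
frame = np.zeros((210, 160))
square = crop(frame, crop_low=[34, 0], crop_high=[194, -1])
```
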
ObservationMoveAxisFilter¶
class rl_coach.filters.observation.ObservationMoveAxisFilter(axis_origin: int = None, axis_target: int = None)

Reorders the axes of the observation. This can be useful when the observation is an image, and we want to move the channel axis to be the last axis instead of the first axis.
- Parameters
axis_origin – The axis to move
axis_target – Where to move the selected axis to
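The channels-first to channels-last case described above corresponds directly to NumPy's moveaxis (a sketch of the operation, not the rl_coach API):

```python
import numpy as np

# Channels-first image (C, H, W) -> channels-last (H, W, C)
image = np.zeros((3, 84, 84))
moved = np.moveaxis(image, 0, -1)  # axis_origin=0, axis_target=-1
```
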
ObservationNormalizationFilter¶
class rl_coach.filters.observation.ObservationNormalizationFilter(clip_min: float = -5.0, clip_max: float = 5.0, name='observation_stats')

Normalizes the observation values with a running mean and standard deviation of all the observations seen so far. The normalization is performed element-wise. Additionally, when working with multiple workers, the statistics used for the normalization operation are accumulated over all the workers.
- Parameters
clip_min – The minimum value to allow after normalizing the observation
clip_max – The maximum value to allow after normalizing the observation
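A single-worker sketch of running normalization using Welford's online algorithm for the element-wise mean and variance, followed by output clipping. This is illustrative only; the class name and update scheme are assumptions, and the actual filter additionally shares statistics across workers:

```python
import numpy as np

class RunningNormalizer:
    """Element-wise running mean/std normalizer with output clipping."""
    def __init__(self, clip_min=-5.0, clip_max=5.0):
        self.clip_min, self.clip_max = clip_min, clip_max
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations (Welford)

    def filter(self, obs):
        obs = np.asarray(obs, dtype=np.float64)
        self.count += 1
        delta = obs - self.mean
        self.mean = self.mean + delta / self.count
        self.m2 = self.m2 + delta * (obs - self.mean)
        std = np.sqrt(self.m2 / self.count) + 1e-8  # epsilon avoids divide-by-zero
        return np.clip((obs - self.mean) / std, self.clip_min, self.clip_max)

normalizer = RunningNormalizer()
for step in [np.array([0.0, 10.0]), np.array([1.0, 20.0]), np.array([2.0, 30.0])]:
    normalized = normalizer.filter(step)
```
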
ObservationReductionBySubPartsNameFilter¶
class rl_coach.filters.observation.ObservationReductionBySubPartsNameFilter(part_names: List[str], reduction_method: ObservationReductionBySubPartsNameFilter.ReductionMethod)

Keeps or discards parts of the observation by specifying their names. This is useful when the environment observation is a measurements vector that includes several different measurements, but you want the agent to see only some of them. For example, the CARLA environment extracts multiple measurements that can be used by the agent, such as speed and location. If we want to use only the speed, it can be done with this filter. This will currently work only for VectorObservationSpace observations.
- Parameters
part_names – A list of part names to reduce
reduction_method – A reduction method to use - keep or discard the given parts
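A sketch of the keep/discard selection on a named measurements vector. The part names and values are made up for illustration (loosely CARLA-style), and this is not the rl_coach implementation:

```python
import numpy as np

# Hypothetical named parts of a measurements vector
names = ['speed', 'location_x', 'location_y']
measurements = np.array([25.0, 104.3, 7.1])

def reduce_by_part_names(obs, names, part_names, keep=True):
    # keep=True keeps only the named parts; keep=False discards them
    indices = [i for i, name in enumerate(names) if (name in part_names) == keep]
    return obs[indices]

speed_only = reduce_by_part_names(measurements, names, ['speed'], keep=True)
without_speed = reduce_by_part_names(measurements, names, ['speed'], keep=False)
```
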
ObservationRescaleSizeByFactorFilter¶
ObservationRescaleToSizeFilter¶
class rl_coach.filters.observation.ObservationRescaleToSizeFilter(output_observation_space: rl_coach.spaces.PlanarMapsObservationSpace)

Rescales an image observation to a given size. The target size does not necessarily keep the aspect ratio of the original observation. Warning: this requires the input observation to be of type uint8 due to scipy requirements!
- Parameters
output_observation_space – the output observation space
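A nearest-neighbor sketch of the resize (the actual filter uses scipy's image resizing; this pure-NumPy version only illustrates that the target size need not preserve the aspect ratio):

```python
import numpy as np

def rescale_to_size(observation, target_height, target_width):
    # Nearest-neighbor resize: map each output pixel to a source pixel
    in_h, in_w = observation.shape[:2]
    rows = np.arange(target_height) * in_h // target_height
    cols = np.arange(target_width) * in_w // target_width
    return observation[rows[:, None], cols]

# Atari-style frame downscaled from 210x160 to 84x84
frame = np.zeros((210, 160), dtype=np.uint8)
small = rescale_to_size(frame, 84, 84)
```
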
ObservationRGBToYFilter¶
class rl_coach.filters.observation.ObservationRGBToYFilter

Converts a color image observation specified using the RGB encoding into a grayscale image observation, by keeping only the luminance (Y) channel of the YUV encoding. This can be useful if the colors in the original image are not relevant for solving the task at hand. The channels axis is assumed to be the last axis.
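The Y channel is a weighted sum of the R, G, and B channels. A sketch using the standard BT.601 luma weights (the exact coefficients used by the filter are an assumption here):

```python
import numpy as np

def rgb_to_y(image):
    # Channels axis assumed last: (H, W, 3) -> (H, W)
    # BT.601 luma weights for R, G, B
    return image @ np.array([0.299, 0.587, 0.114])

white = np.full((2, 2, 3), 255.0)
gray = rgb_to_y(white)  # pure white stays at full luminance
```
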
ObservationSqueezeFilter¶
ObservationStackingFilter¶
class rl_coach.filters.observation.ObservationStackingFilter(stack_size: int, stacking_axis: int = -1)

Stacks several observations on top of each other. For image observations this creates a 3D blob. The stacking is done in a lazy manner in order to reduce memory consumption: a LazyStack object is used to wrap the observations in the stack. For this reason, the ObservationStackingFilter must be the last filter in the input filters stack. This filter is stateful, since it stores the previous step's result and depends on it. The filter adds an additional dimension to the output observation.

Warning: the filter replaces the observation with a LazyStack object, so no filters should be applied after this filter. Applying more filters will cause the LazyStack object to be converted to a numpy array, increasing the memory footprint.
- Parameters
stack_size – the number of previous observations in the stack
stacking_axis – the axis on which to stack the observation on
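An eager sketch of the stacking behavior using a bounded deque. The class name is illustrative, and unlike the real filter it materializes the stack as a numpy array on every step rather than wrapping it in a LazyStack; padding the stack with the first observation is also an assumption:

```python
import numpy as np
from collections import deque

class SimpleObservationStacker:
    """Stacks the last `stack_size` observations along `stacking_axis`."""
    def __init__(self, stack_size, stacking_axis=-1):
        self.stack = deque(maxlen=stack_size)  # old observations fall off automatically
        self.stacking_axis = stacking_axis

    def filter(self, observation):
        if not self.stack:
            # On the first step, fill the stack with copies of the observation
            self.stack.extend([observation] * self.stack.maxlen)
        else:
            self.stack.append(observation)
        return np.stack(self.stack, axis=self.stacking_axis)

stacker = SimpleObservationStacker(stack_size=4)
stacked = stacker.filter(np.zeros((84, 84)))  # adds a new stacking dimension
```
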
ObservationToUInt8Filter¶
class rl_coach.filters.observation.ObservationToUInt8Filter(input_low: float, input_high: float)

Converts a floating point observation into an unsigned 8-bit integer observation. This is mostly useful for reducing memory consumption and is usually used for image observations. The filter will first spread the observation values over the range 0-255 and then discretize them into integer values.
- Parameters
input_low – The lowest value currently present in the observation
input_high – The highest value currently present in the observation
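A sketch of the spread-and-discretize step (not the rl_coach implementation; the frame values are illustrative):

```python
import numpy as np

def to_uint8(observation, input_low, input_high):
    # Spread the values over [0, 255], then discretize to integers
    scaled = (observation - input_low) / (input_high - input_low) * 255.0
    return scaled.astype(np.uint8)

frame = np.array([0.0, 0.5, 1.0])
converted = to_uint8(frame, input_low=0.0, input_high=1.0)
```
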
Reward Filters¶
RewardClippingFilter¶
class rl_coach.filters.reward.RewardClippingFilter(clipping_low: float = -inf, clipping_high: float = inf)

Clips the reward values into a given range. For example, in DQN, the Atari rewards are clipped to the range [-1, 1] in order to control the scale of the returns.
- Parameters
clipping_low – The low threshold for reward clipping
clipping_high – The high threshold for reward clipping
RewardNormalizationFilter¶
class rl_coach.filters.reward.RewardNormalizationFilter(clip_min: float = -5.0, clip_max: float = 5.0)

Normalizes the reward values with a running mean and standard deviation of all the rewards seen so far. When working with multiple workers, the statistics used for the normalization operation are accumulated over all the workers.
- Parameters
clip_min – The minimum value to allow after normalizing the reward
clip_max – The maximum value to allow after normalizing the reward
RewardRescaleFilter¶
class rl_coach.filters.reward.RewardRescaleFilter(rescale_factor: float)

Rescales the reward by a given factor. Rescaling the rewards of the environment has been observed to have a large effect (negative or positive) on the behavior of the learning process.
- Parameters
rescale_factor – The reward rescaling factor by which the reward will be multiplied