Input Filters¶
The input filters are separated into two categories: observation filters and reward filters.
Observation Filters¶
ObservationClippingFilter¶
class rl_coach.filters.observation.ObservationClippingFilter(clipping_low: float = -inf, clipping_high: float = inf)

Clips the observation values to a given range of values. For example, if the observation consists of measurements in an arbitrary range, and we want to control the minimum and maximum values of these observations, we can define a range and clip the values of the measurements.
- Parameters
clipping_low – The minimum value to allow after clipping the observation
clipping_high – The maximum value to allow after clipping the observation
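The clipping operation itself is simply an element-wise clamp. A minimal NumPy sketch of what this filter does to an observation (the measurement values are illustrative, and this is not the rl_coach API itself):

```python
import numpy as np

# Hypothetical measurement vector with some out-of-range values
observation = np.array([-3.2, 0.5, 12.7, 1.1])

# Clamp each element into [clipping_low, clipping_high]
clipping_low, clipping_high = -1.0, 2.0
clipped = np.clip(observation, clipping_low, clipping_high)
```
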
ObservationCropFilter¶
class rl_coach.filters.observation.ObservationCropFilter(crop_low: numpy.ndarray = None, crop_high: numpy.ndarray = None)

Crops the observation to a given crop window. For example, in Atari, the observations are images with a shape of 210x160. Usually, we will want to crop the observation to a 160x160 square before rescaling it.
- Parameters
crop_low – a vector where each dimension describes the start index for cropping the observation in the corresponding dimension. A value of -1 will be mapped to the maximum size of that dimension
crop_high – a vector where each dimension describes the end index for cropping the observation in the corresponding dimension. A value of -1 will be mapped to the maximum size of that dimension
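Cropping amounts to slicing the observation per dimension. A minimal sketch of the operation (not the rl_coach implementation; for brevity, only -1 entries in the high indices are mapped to the full dimension size, and the crop indices shown are illustrative):

```python
import numpy as np

def crop(observation, crop_low, crop_high):
    # Map a -1 end index to the full size of that dimension
    high = [dim if h == -1 else h for h, dim in zip(crop_high, observation.shape)]
    slices = tuple(slice(lo, hi) for lo, hi in zip(crop_low, high))
    return observation[slices]

# Atari-style 210x160 frame cropped to a 160x160 square
frame = np.zeros((210, 160))
square = crop(frame, crop_low=[34, 0], crop_high=[194, -1])
```
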
ObservationMoveAxisFilter¶
class rl_coach.filters.observation.ObservationMoveAxisFilter(axis_origin: int = None, axis_target: int = None)

Reorders the axes of the observation. This can be useful when the observation is an image, and we want to move the channel axis to be the last axis instead of the first axis.
- Parameters
axis_origin – The axis to move
axis_target – Where to move the selected axis to
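The channels-first to channels-last case described above corresponds directly to NumPy's moveaxis (a sketch of the operation, not the rl_coach API):

```python
import numpy as np

# Channels-first image (C, H, W) -> channels-last (H, W, C)
image = np.zeros((3, 84, 84))
moved = np.moveaxis(image, 0, -1)  # axis_origin=0, axis_target=-1
```
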
ObservationNormalizationFilter¶
class rl_coach.filters.observation.ObservationNormalizationFilter(clip_min: float = -5.0, clip_max: float = 5.0, name='observation_stats')

Normalizes the observation values with a running mean and standard deviation of all the observations seen so far. The normalization is performed element-wise. Additionally, when working with multiple workers, the statistics used for the normalization operation are accumulated over all the workers.
- Parameters
clip_min – The minimum value to allow after normalizing the observation
clip_max – The maximum value to allow after normalizing the observation
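A single-worker sketch of running normalization using Welford's online algorithm for the element-wise mean and variance, followed by output clipping. This is illustrative only; the class name and update scheme are assumptions, and the actual filter additionally shares statistics across workers:

```python
import numpy as np

class RunningNormalizer:
    """Element-wise running mean/std normalizer with output clipping."""
    def __init__(self, clip_min=-5.0, clip_max=5.0):
        self.clip_min, self.clip_max = clip_min, clip_max
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations (Welford)

    def filter(self, obs):
        obs = np.asarray(obs, dtype=np.float64)
        self.count += 1
        delta = obs - self.mean
        self.mean = self.mean + delta / self.count
        self.m2 = self.m2 + delta * (obs - self.mean)
        std = np.sqrt(self.m2 / self.count) + 1e-8  # epsilon avoids divide-by-zero
        return np.clip((obs - self.mean) / std, self.clip_min, self.clip_max)

normalizer = RunningNormalizer()
for step in [np.array([0.0, 10.0]), np.array([1.0, 20.0]), np.array([2.0, 30.0])]:
    normalized = normalizer.filter(step)
```
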
ObservationReductionBySubPartsNameFilter¶
class rl_coach.filters.observation.ObservationReductionBySubPartsNameFilter(part_names: List[str], reduction_method: ObservationReductionBySubPartsNameFilter.ReductionMethod)

Keeps or discards parts of the observation by specifying their names. This is useful when the environment observation is a measurements vector that includes several different measurements, but you want the agent to see only some of them. For example, the CARLA environment extracts multiple measurements that can be used by the agent, such as speed and location. If we want to use only the speed, it can be done with this filter. This will currently work only for VectorObservationSpace observations.
- Parameters
part_names – A list of part names to reduce
reduction_method – A reduction method to use - keep or discard the given parts
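A sketch of the keep/discard selection on a named measurements vector. The part names and values are made up for illustration (loosely CARLA-style), and this is not the rl_coach implementation:

```python
import numpy as np

# Hypothetical named parts of a measurements vector
names = ['speed', 'location_x', 'location_y']
measurements = np.array([25.0, 104.3, 7.1])

def reduce_by_part_names(obs, names, part_names, keep=True):
    # keep=True keeps only the named parts; keep=False discards them
    indices = [i for i, name in enumerate(names) if (name in part_names) == keep]
    return obs[indices]

speed_only = reduce_by_part_names(measurements, names, ['speed'], keep=True)
without_speed = reduce_by_part_names(measurements, names, ['speed'], keep=False)
```
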
ObservationRescaleSizeByFactorFilter¶
ObservationRescaleToSizeFilter¶
class rl_coach.filters.observation.ObservationRescaleToSizeFilter(output_observation_space: rl_coach.spaces.PlanarMapsObservationSpace)

Rescales an image observation to a given size. The target size does not necessarily keep the aspect ratio of the original observation. Warning: this requires the input observation to be of type uint8 due to scipy requirements!
- Parameters
output_observation_space – the output observation space
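A nearest-neighbor sketch of the resize (the actual filter uses scipy's image resizing; this pure-NumPy version only illustrates that the target size need not preserve the aspect ratio):

```python
import numpy as np

def rescale_to_size(observation, target_height, target_width):
    # Nearest-neighbor resize: map each output pixel to a source pixel
    in_h, in_w = observation.shape[:2]
    rows = np.arange(target_height) * in_h // target_height
    cols = np.arange(target_width) * in_w // target_width
    return observation[rows[:, None], cols]

# Atari-style frame downscaled from 210x160 to 84x84
frame = np.zeros((210, 160), dtype=np.uint8)
small = rescale_to_size(frame, 84, 84)
```
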
ObservationRGBToYFilter¶
class rl_coach.filters.observation.ObservationRGBToYFilter

Converts a color image observation specified using the RGB encoding into a grayscale image observation, by keeping only the luminance (Y) channel of the YUV encoding. This can be useful if the colors in the original image are not relevant for solving the task at hand. The channels axis is assumed to be the last axis.
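The Y channel is a weighted sum of the R, G, and B channels. A sketch using the standard BT.601 luma weights (the exact coefficients used by the filter are an assumption here):

```python
import numpy as np

def rgb_to_y(image):
    # Channels axis assumed last: (H, W, 3) -> (H, W)
    # BT.601 luma weights for R, G, B
    return image @ np.array([0.299, 0.587, 0.114])

white = np.full((2, 2, 3), 255.0)
gray = rgb_to_y(white)  # pure white stays at full luminance
```
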
ObservationSqueezeFilter¶
ObservationStackingFilter¶
class rl_coach.filters.observation.ObservationStackingFilter(stack_size: int, stacking_axis: int = -1)

Stacks several observations on top of each other. For image observations this creates a 3D blob. The stacking is done in a lazy manner in order to reduce memory consumption: a LazyStack object is used to wrap the observations in the stack. For this reason, the ObservationStackingFilter must be the last filter in the input filters stack. This filter is stateful, since it stores the previous step's result and depends on it. The filter adds an additional dimension to the output observation.

Warning: the filter replaces the observation with a LazyStack object, so no filters should be applied after this filter. Applying more filters will cause the LazyStack object to be converted to a numpy array, increasing the memory footprint.
- Parameters
stack_size – the number of previous observations in the stack
stacking_axis – the axis on which to stack the observation on
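An eager sketch of the stacking behavior using a bounded deque. The class name is illustrative, and unlike the real filter it materializes the stack as a numpy array on every step rather than wrapping it in a LazyStack; padding the stack with the first observation is also an assumption:

```python
import numpy as np
from collections import deque

class SimpleObservationStacker:
    """Stacks the last `stack_size` observations along `stacking_axis`."""
    def __init__(self, stack_size, stacking_axis=-1):
        self.stack = deque(maxlen=stack_size)  # old observations fall off automatically
        self.stacking_axis = stacking_axis

    def filter(self, observation):
        if not self.stack:
            # On the first step, fill the stack with copies of the observation
            self.stack.extend([observation] * self.stack.maxlen)
        else:
            self.stack.append(observation)
        return np.stack(self.stack, axis=self.stacking_axis)

stacker = SimpleObservationStacker(stack_size=4)
stacked = stacker.filter(np.zeros((84, 84)))  # adds a new stacking dimension
```
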
ObservationToUInt8Filter¶
class rl_coach.filters.observation.ObservationToUInt8Filter(input_low: float, input_high: float)

Converts a floating point observation into an unsigned 8-bit integer observation. This is mostly useful for reducing memory consumption and is usually used for image observations. The filter will first spread the observation values over the range 0-255 and then discretize them into integer values.
- Parameters
input_low – The lowest value currently present in the observation
input_high – The highest value currently present in the observation
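A sketch of the spread-and-discretize step (not the rl_coach implementation; the frame values are illustrative):

```python
import numpy as np

def to_uint8(observation, input_low, input_high):
    # Spread the values over [0, 255], then discretize to integers
    scaled = (observation - input_low) / (input_high - input_low) * 255.0
    return scaled.astype(np.uint8)

frame = np.array([0.0, 0.5, 1.0])
converted = to_uint8(frame, input_low=0.0, input_high=1.0)
```
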
Reward Filters¶
RewardClippingFilter¶
class rl_coach.filters.reward.RewardClippingFilter(clipping_low: float = -inf, clipping_high: float = inf)

Clips the reward values into a given range. For example, in DQN, the Atari rewards are clipped to the range [-1, 1] in order to control the scale of the returns.
- Parameters
clipping_low – The low threshold for reward clipping
clipping_high – The high threshold for reward clipping
RewardNormalizationFilter¶
class rl_coach.filters.reward.RewardNormalizationFilter(clip_min: float = -5.0, clip_max: float = 5.0)

Normalizes the reward values with a running mean and standard deviation of all the rewards seen so far. When working with multiple workers, the statistics used for the normalization operation are accumulated over all the workers.
- Parameters
clip_min – The minimum value to allow after normalizing the reward
clip_max – The maximum value to allow after normalizing the reward
RewardRescaleFilter¶
class rl_coach.filters.reward.RewardRescaleFilter(rescale_factor: float)

Rescales the reward by a given factor. Rescaling the rewards of the environment has been observed to have a large effect (negative or positive) on the behavior of the learning process.
- Parameters
rescale_factor – The reward rescaling factor by which the reward will be multiplied