aprel.utils package

aprel.utils.batch_utils module

Utility functions for active batch generation.

aprel.utils.batch_utils.default_query_distance(queries: List[aprel.learning.data_types.Query], **kwargs) → numpy.array

Given a set of m queries, returns an m-by-m matrix, each entry representing the distance between the corresponding queries.

Parameters

queries (List[Query]) – list of m queries for which the distances will be computed
**kwargs –
The hyperparameters.
- metric (str): The distance metric can be specified with this argument. Defaults to ‘euclidean’. See https://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.distance.pdist.html for the set of available metrics.

Returns

an m-by-m numpy array that consists of the pairwise distances between the queries.

Return type

numpy.array

Raises

AssertionError – if the query is not a compatible type. Currently, the compatible types are: FullRankingQuery, PreferenceQuery, and WeakComparisonQuery (all for a slate size of 2).
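
The m-by-m distance matrix described above can be illustrated with a minimal, library-free sketch. It assumes each query has already been summarized as a plain feature vector (the values in `query_vectors` are made up) and uses the Euclidean metric. `pairwise_distances` is a hypothetical helper for illustration, not part of APReL.

```python
import math
from typing import List, Sequence


def pairwise_distances(vectors: Sequence[Sequence[float]]) -> List[List[float]]:
    """Return the m-by-m matrix of Euclidean distances between m vectors."""
    m = len(vectors)
    dist = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(i + 1, m):
            d = math.dist(vectors[i], vectors[j])
            dist[i][j] = dist[j][i] = d  # the matrix is symmetric
    return dist


# Hypothetical per-query summaries; real queries would first need to be
# mapped to vectors, which this sketch does not cover.
query_vectors = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
D = pairwise_distances(query_vectors)
```

Entry D[i][j] is the distance between query i and query j, so the matrix is symmetric with zeros on the diagonal.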

aprel.utils.dpp module

This module handles greedy estimation of the mode of a determinantal point process (DPP). The technique is based on Biyik et al. (2019). The code is adapted from https://github.com/Stanford-ILIAD/DPP-Batch-Active-Learning/blob/master/reward_learning/dpp_sampler.py.

class aprel.utils.dpp.Kernel

Bases: object

getKernel(ps, qs)

class aprel.utils.dpp.Sampler(kernel, distances, k)

Bases: object

addGreedy()

append(ind)

clear()

ratios(item_ids=None)

sample()

warmStart()

class aprel.utils.dpp.ScoredKernel(R, distances, scores)

Bases: aprel.utils.dpp.Kernel

getKernel(p_ids, q_ids)

aprel.utils.dpp.dpp_mode(distances, scores, k)

aprel.utils.dpp.sample_ids_mc(distances, scores, k)

aprel.utils.dpp.setup_sampler(distances, scores, k)
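
Since the members of this module are undocumented, the following is a rough sketch of what greedy DPP mode estimation can look like, not the module's actual implementation. It assumes a kernel of the form scores[i] * scores[j] * exp(-distances[i][j]^2 / 2), which is an illustrative choice, and it repeatedly adds the item that maximizes the determinant of the kernel restricted to the selected set. The names `determinant` and `greedy_dpp_mode` are hypothetical.

```python
import math
from typing import List


def determinant(matrix: List[List[float]]) -> float:
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0  # numerically singular
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def greedy_dpp_mode(distances: List[List[float]], scores: List[float], k: int) -> List[int]:
    """Greedily pick k item indices that approximately maximize det(L_S)."""
    n = len(scores)
    assert k <= n, "cannot select more items than are available"
    # Assumed kernel: quality (score) times Gaussian similarity.
    kernel = [
        [scores[i] * scores[j] * math.exp(-distances[i][j] ** 2 / 2.0) for j in range(n)]
        for i in range(n)
    ]
    selected: List[int] = []
    for _ in range(k):
        best_item, best_det = -1, -math.inf
        for i in range(n):
            if i in selected:
                continue
            ids = selected + [i]
            sub = [[kernel[p][q] for q in ids] for p in ids]
            d = determinant(sub)
            if d > best_det:
                best_item, best_det = i, d
        selected.append(best_item)
    return selected


# Four made-up items on a line; items 0 and 1 are near-duplicates.
positions = [0.0, 0.1, 3.0, 6.0]
distances = [[abs(a - b) for b in positions] for a in positions]
scores = [1.0, 0.9, 0.8, 0.3]
batch = greedy_dpp_mode(distances, scores, k=2)
```

In this toy example the near-duplicate item 1 is skipped in favor of the more distant item 2, which shows how the determinant trades off score against diversity.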

aprel.utils.generate_trajectories module

This module stores the functions for trajectory set generation.

aprel.utils.generate_trajectories.generate_trajectories_randomly(env: aprel.basics.environment.Environment, num_trajectories: int, max_episode_length: Optional[int] = None, file_name: Optional[str] = None, restore: bool = False, headless: bool = False, seed: Optional[int] = None) → aprel.basics.trajectory.TrajectorySet

Generates num_trajectories random trajectories, or loads (some of) them from the given file.

Parameters

env (Environment) – An Environment instance containing the OpenAI Gym environment to be simulated.
num_trajectories (int) – the number of trajectories to generate.
max_episode_length (int) – the maximum number of time steps for the new trajectories. No limit is assumed if None (or not given).
file_name (str) – the file name to save the generated trajectory set to and/or restore the trajectory set from. Note: if restore is true and a set is restored, the restored file will be overwritten with the new set.
restore (bool) – If true, it will first try to load the trajectories from file_name. If the file has fewer trajectories than needed, more trajectories will be generated to make up the difference.
headless (bool) – If true, the trajectory set will be saved and returned with no visualization. This makes trajectory generation faster, but it might be difficult for real humans to compare trajectories based only on the features, without any visualization.
seed (int) – Seed for the randomness of action selection. Note: the environment should be seeded separately. This seed is only for the action selection.

Returns

A set of num_trajectories randomly generated trajectories.

Return type

TrajectorySet

Raises

AssertionError – if restore is true, but no file_name is given.
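
The interplay of file_name and restore described above can be sketched without the library. The example below is a simplified stand-in, not APReL's implementation: `load_or_generate` is a hypothetical helper, each "trajectory" is just a list of three random numbers, and it assumes that a missing file simply means everything is generated from scratch.

```python
import os
import pickle
import random
import tempfile
from typing import List, Optional


def load_or_generate(num_items: int,
                     file_name: Optional[str] = None,
                     restore: bool = False,
                     seed: Optional[int] = None) -> List[List[float]]:
    """Restore items from file_name if requested, then top up to num_items."""
    assert not (restore and file_name is None), "restore requires a file_name"
    rng = random.Random(seed)  # seeds only the generation, nothing else
    items: List[List[float]] = []
    if restore and os.path.exists(file_name):
        with open(file_name, "rb") as f:
            items = pickle.load(f)[:num_items]  # load (some of) the saved items
    while len(items) < num_items:
        # Placeholder for rolling out one random trajectory.
        items.append([rng.uniform(-1.0, 1.0) for _ in range(3)])
    if file_name is not None:
        with open(file_name, "wb") as f:
            pickle.dump(items, f)  # overwrites the restored file with the new set
    return items


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "trajectories.pkl")
    first = load_or_generate(3, file_name=path, seed=0)
    second = load_or_generate(5, file_name=path, restore=True, seed=1)
```

Here the second call reuses the three saved items and generates only the two that are missing.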

aprel.utils.kmedoids module

Function for the K-Medoids algorithm.

aprel.utils.kmedoids.kMedoids(D: numpy.array, k: int, tmax: int = 100) → numpy.array

Runs the K-Medoids algorithm to return the indices of the medoids. This is based on Bauckhage (2015), and the implementation is adapted from https://github.com/letiantian/kmedoids.

Parameters

D (numpy.array) – a distance matrix, where D[a][b] is the distance between points a and b.
k (int) – the number of medoids to return.
tmax (int) – the maximum number of steps to take in forming clusters.

Returns

an array that keeps the indices of the k selected medoids.

Return type

numpy.array
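
For readers unfamiliar with the algorithm, here is a minimal pure-Python sketch of K-Medoids on a precomputed distance matrix. It is an illustration under simplifying assumptions (random initialization, list-of-lists input, ties broken by index order), not the function documented above; `k_medoids` and its `seed` argument are hypothetical.

```python
import random
from typing import Dict, List, Optional


def k_medoids(D: List[List[float]], k: int, tmax: int = 100,
              seed: Optional[int] = None) -> List[int]:
    """Return the sorted indices of k medoids for the distance matrix D."""
    rng = random.Random(seed)
    n = len(D)
    medoids = sorted(rng.sample(range(n), k))  # random initial medoids
    for _ in range(tmax):
        # Assignment step: each point joins its nearest medoid's cluster.
        clusters: Dict[int, List[int]] = {m: [] for m in medoids}
        for p in range(n):
            nearest = p if p in clusters else min(medoids, key=lambda m: D[p][m])
            clusters[nearest].append(p)
        # Update step: the new medoid minimizes total distance within its cluster.
        new_medoids = sorted(
            min(members, key=lambda c: sum(D[c][q] for q in members))
            for members in clusters.values()
        )
        if new_medoids == medoids:
            break  # converged
        medoids = new_medoids
    return medoids


# Two well-separated groups of made-up points on a line.
points = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
D = [[abs(a - b) for b in points] for a in points]
medoid_ids = k_medoids(D, k=2, seed=0)
```

With this input the procedure settles on the middle point of each group, regardless of the initial draw.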

aprel.utils.sampling_utils module

This module contains functions that are useful for the sampling in SamplingBasedBelief.

aprel.utils.sampling_utils.gaussian_proposal(point: Dict) → Dict

For the Metropolis-Hastings sampling algorithm, this function generates the next step in the Markov chain, with a Gaussian distribution of standard deviation 0.05.

Parameters: point (Dict) – the current point in the Markov chain.
Returns: the next point in the Markov chain.
Return type: Dict
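
As a rough illustration of such a proposal step, the sketch below perturbs every entry of a point with zero-mean Gaussian noise of standard deviation 0.05. It assumes the point maps parameter names to lists of floats. `gaussian_step` is a hypothetical stand-in that omits any additional processing the library function may perform.

```python
import random
from typing import Dict, List, Optional


def gaussian_step(point: Dict[str, List[float]], std: float = 0.05,
                  rng: Optional[random.Random] = None) -> Dict[str, List[float]]:
    """Return a new point with independent Gaussian noise added to each entry."""
    rng = rng or random.Random()
    return {
        name: [value + rng.gauss(0.0, std) for value in values]
        for name, values in point.items()
    }


# Made-up current point of the Markov chain.
current = {"weights": [0.6, -0.8, 0.0]}
proposal = gaussian_step(current, rng=random.Random(0))
```

The input dictionary is left untouched, so the current point can still be kept if the proposal is rejected.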

aprel.utils.sampling_utils.uniform_logprior(params: Dict) → float

This is a log prior belief over the user. Specifically, it is a uniform distribution over ||weights|| <= 1.

Parameters: params (Dict) – parameters of the user for which the log prior is going to be calculated.
Returns: the (unnormalized) log probability of weights, which is 0 (as 0 = log 1) if ||weights|| <= 1, and negative infinity otherwise.
Return type: float
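
The rule stated above is short enough to write out directly. The sketch below assumes the weights are stored as a list of floats under the key "weights" and that the norm is Euclidean. `unit_ball_logprior` is a hypothetical name used only for illustration.

```python
import math
from typing import Dict, List


def unit_ball_logprior(params: Dict[str, List[float]]) -> float:
    """Unnormalized log prior: 0 inside the unit ball, -inf outside."""
    norm = math.sqrt(sum(w * w for w in params["weights"]))
    return 0.0 if norm <= 1.0 else -math.inf


inside = unit_ball_logprior({"weights": [0.3, 0.4]})   # norm 0.5
outside = unit_ball_logprior({"weights": [3.0, 4.0]})  # norm 5.0
```

Returning negative infinity means a sampler will never accept a point outside the unit ball.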

aprel.utils.util_functions module

General utility functions.

aprel.utils.util_functions.get_random_normalized_vector(dim: int) → numpy.array

Returns a random normalized vector of the given dimension.

Parameters: dim (int) – The dimensionality of the output vector.
Returns: A random normalized vector that lies on the surface of the dim-dimensional hypersphere.
Return type: numpy.array
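
One standard way to obtain such a vector is to draw independent standard normal components and divide by the Euclidean norm. The sketch below shows that approach with plain Python lists. It is an assumption that the library function works along these lines, and `random_unit_vector` is a hypothetical name.

```python
import math
import random
from typing import List, Optional


def random_unit_vector(dim: int, rng: Optional[random.Random] = None) -> List[float]:
    """Sample a vector on the surface of the dim-dimensional unit hypersphere."""
    rng = rng or random.Random()
    while True:
        vector = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(x * x for x in vector))
        if norm > 0.0:  # retry in the (practically impossible) all-zero case
            return [x / norm for x in vector]


unit = random_unit_vector(5, rng=random.Random(0))
unit_norm = math.sqrt(sum(x * x for x in unit))
```

Because the standard normal distribution is rotationally symmetric, the normalized samples are spread uniformly over the sphere.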