NevarokML: ANevarokMLTrainer API

The ANevarokMLTrainer class represents a trainer actor in NevarokML.

Properties

_state (ENevarokMLState): The state of the trainer.
_tickTimer (float): The timer for trainer ticks.
_trainerTickFunction (FNevarokMLTrainerTickFunction): The tick function for the trainer.
_procHandle (TUniquePtr<FProcHandle>): The process handle for the backend.
_autoKillBackend (bool): Whether to automatically kill the backend process.
_actSpace (UNevarokMLSpace*): The action space for the trainer.
_obsSpace (UNevarokMLSpace*): The observation space for the trainer.
_socketServer (UNevarokMLSocketServer*): The socket server for communication.
_address (FString): The address for the socket server.
_port (int32): The port for the socket server.
_maxEnvsCount (int): The maximum number of environments.
_tickInterval (float): The interval between ticks.
_envUpdatesPerTick (int32): The number of environment updates per tick.
_envs (TArray<ANevarokMLEnv*>): The array of environments.

Methods

ANevarokMLTrainer

ANevarokMLTrainer();

Constructor for the ANevarokMLTrainer class.

GetEnvsCount

UFUNCTION(BlueprintPure, Category="NevarokML|Trainer")
int GetEnvsCount() const;

Returns the number of environments.

RunBackend

bool ANevarokMLTrainer::RunBackend();

Runs the backend process.

KillBackend

void ANevarokMLTrainer::KillBackend();

Kills the backend process.

ValidateSpaces

bool ANevarokMLTrainer::ValidateSpaces() const;

Validates the action and observation spaces.

ValidateEnvs

bool ANevarokMLTrainer::ValidateEnvs() const;

Validates the environments.

ParseData

bool ANevarokMLTrainer::ParseData(const TArray<uint8>& data, ENevarokMLData& dataType);

Parses the received data and determines the data type.
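
As an illustration only, the sketch below shows one way a byte payload could be mapped to a message type. It uses standard C++ instead of Unreal types. The MessageKind enum, the "type" key, the lowercase type names, and the wire format itself are all hypothetical and are not taken from NevarokML.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical message kinds; the real ENevarokMLData values are not listed in this document.
enum class MessageKind { Unknown, Reset, Ready, Complete, Error, Save, Action };

// Assumed wire format: UTF-8 text that contains "type":"<name>".
// Returns false and leaves kind as Unknown when no known type is found.
bool ParseKind(const std::vector<std::uint8_t>& data, MessageKind& kind)
{
    kind = MessageKind::Unknown;
    const std::string text(data.begin(), data.end());
    const std::string key = "\"type\":\"";

    const std::size_t keyPos = text.find(key);
    if (keyPos == std::string::npos) return false;

    const std::size_t valueStart = keyPos + key.size();
    const std::size_t valueEnd = text.find('"', valueStart);
    if (valueEnd == std::string::npos) return false;

    const std::string name = text.substr(valueStart, valueEnd - valueStart);
    if (name == "reset") kind = MessageKind::Reset;
    else if (name == "ready") kind = MessageKind::Ready;
    else if (name == "complete") kind = MessageKind::Complete;
    else if (name == "error") kind = MessageKind::Error;
    else if (name == "save") kind = MessageKind::Save;
    else if (name == "action") kind = MessageKind::Action;

    return kind != MessageKind::Unknown;
}
```

The out-parameter mirrors the shape of the ParseData signature: the return value reports success and the reference argument carries the detected type.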

HandleReset

bool ANevarokMLTrainer::HandleReset(const TSharedPtr<FJsonObject>& jsonObject);

Handles the reset action received from the environment.

HandleReady

bool ANevarokMLTrainer::HandleReady(const TSharedPtr<FJsonObject>& jsonObject);

Handles the ready event received from the environment.

HandleComplete

bool ANevarokMLTrainer::HandleComplete(const TSharedPtr<FJsonObject>& jsonObject);

Handles the complete event received from the environment.

HandleError

bool ANevarokMLTrainer::HandleError(const TSharedPtr<FJsonObject>& jsonObject);

Handles the error event received from the environment.

HandleSave

bool ANevarokMLTrainer::HandleSave(const TSharedPtr<FJsonObject>& jsonObject) const;

Handles the save event received from the environment.

HandleAction

bool ANevarokMLTrainer::HandleAction(const TSharedPtr<FJsonObject>& jsonObject);

Handles the action received from the environment.

ExecuteInit

void ANevarokMLTrainer::ExecuteInit();

Executes the initialization process for the trainer.

KillSocket

void ANevarokMLTrainer::KillSocket();

Kills the socket server.

ExecuteInvalid

void ANevarokMLTrainer::ExecuteInvalid();

Executes the invalid state process for the trainer.

ExecuteInitEnv

void ANevarokMLTrainer::ExecuteInitEnv(int index, ANevarokMLEnv* env);

Executes the initialization process for an environment.

ExecuteStepEnv

void ANevarokMLTrainer::ExecuteStepEnv(int index, ANevarokMLEnv* env);

Executes the step process for an environment.

ExecuteResetEnv

void ANevarokMLTrainer::ExecuteResetEnv(int index, ANevarokMLEnv* env);

Executes the reset process for an environment.

TickTrainer

void ANevarokMLTrainer::TickTrainer(float deltaTime);

Called every frame to update the trainer.
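
How _tickTimer, _tickInterval, and _envUpdatesPerTick interact is not spelled out here. The sketch below is one plausible reading, written in standard C++: frame time accumulates until the interval has elapsed, then a fixed number of environment updates run. TickState and TickSketch are hypothetical names, and the actual trainer logic may differ.

```cpp
#include <functional>

// Hypothetical stand-in for the trainer's timing fields.
struct TickState {
    float tickTimer = 0.0f;     // plays the role of _tickTimer
    float tickInterval = 0.1f;  // plays the role of _tickInterval (illustrative value)
    int envUpdatesPerTick = 2;  // plays the role of _envUpdatesPerTick (illustrative value)
};

// Accumulates frame time. Once the interval has elapsed, resets the timer and
// invokes updateEnv envUpdatesPerTick times.
// Returns the number of updates performed during this call.
int TickSketch(TickState& state, float deltaTime, const std::function<void()>& updateEnv)
{
    state.tickTimer += deltaTime;
    if (state.tickTimer < state.tickInterval) {
        return 0;  // not enough time has passed yet
    }

    state.tickTimer = 0.0f;
    for (int i = 0; i < state.envUpdatesPerTick; ++i) {
        updateEnv();
    }
    return state.envUpdatesPerTick;
}
```

Under this reading, a larger interval lowers how often environments are updated, while a larger update count does more work each time the interval elapses.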

Learn

UFUNCTION(BlueprintCallable, Category="NevarokML|Trainer")
bool Learn(const UNevarokMLBaseAlgorithm* algorithm,
           int timesteps = 10000,
           int evalEps = 0,
           int saveFreq = 6000,
           const FFilePath loadModelPath = FFilePath(),
           const FName saveModelName = FName("Model"),
           const bool deterministic = true,
           const bool showTensorboard = false,
           const bool showReward = false,
           const bool showStepDebug = false,
           const bool showResetDebug = true);

Starts the learning process with the specified algorithm and parameters.

Method Parameters

algorithm (const UNevarokMLBaseAlgorithm*): A pointer to the algorithm object that defines the learning algorithm to be used.
timesteps (int): The maximum number of timesteps to run the learning process. The default value is 10,000.
evalEps (int): The number of evaluation episodes to run during the learning process. The default value is 0, indicating no evaluation episodes.
saveFreq (int): The frequency, in timesteps, at which the model should be saved during the learning process. The default value is 6,000.
loadModelPath (FFilePath): The file path to a pre-trained model that should be loaded before starting the learning process. The default value is an empty file path, indicating no pre-trained model should be loaded.
saveModelName (FName): The name to use when saving the trained model. The default value is "Model".
deterministic (bool): A flag indicating whether to use deterministic behavior during the learning process. The default value is true.
showTensorboard (bool): A flag indicating whether to show TensorBoard visualizations during the learning process. The default value is false.
showReward (bool): A flag indicating whether to show a reward comparison (before and after) once the learning process is complete. The default value is false.
showStepDebug (bool): A flag indicating whether to show step debugging information during the learning process. The default value is false.
showResetDebug (bool): A flag indicating whether to show reset debugging information during the learning process. The default value is true.
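
To make the defaults easier to reason about, the sketch below mirrors them in a standard C++ struct and estimates when saves would occur. LearnOptions and ExpectedSavePoints are hypothetical names. The assumption that a save happens at every multiple of saveFreq is not confirmed by this document; the backend decides the actual schedule.

```cpp
#include <string>
#include <vector>

// Plain-C++ mirror of the Learn defaults listed above (not the real signature).
struct LearnOptions {
    int timesteps = 10000;
    int evalEps = 0;
    int saveFreq = 6000;
    std::string loadModelPath;            // empty: no pre-trained model is loaded
    std::string saveModelName = "Model";
    bool deterministic = true;
    bool showTensorboard = false;
    bool showReward = false;
    bool showStepDebug = false;
    bool showResetDebug = true;
};

// Assumption: a save occurs each time the step count reaches a multiple of saveFreq.
// Returns the timesteps at which a save would be expected under that assumption.
std::vector<int> ExpectedSavePoints(const LearnOptions& options)
{
    std::vector<int> points;
    if (options.saveFreq <= 0) {
        return points;  // no periodic saves
    }
    for (int step = options.saveFreq; step <= options.timesteps; step += options.saveFreq) {
        points.push_back(step);
    }
    return points;
}
```

Under that assumption the defaults would give a single periodic save at timestep 6,000, because the next multiple (12,000) exceeds the 10,000-timestep limit. Choosing a saveFreq that divides timesteps evenly would place the last periodic save at the end of the run.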

Override Methods

BeginPlay

virtual void ANevarokMLTrainer::BeginPlay() override;

Overrides the BeginPlay function.

BeginDestroy

virtual void ANevarokMLTrainer::BeginDestroy() override;

Overrides the BeginDestroy function.

RegisterActorTickFunctions

void ANevarokMLTrainer::RegisterActorTickFunctions(bool bRegister) override;

Overrides the RegisterActorTickFunctions function.

Event Methods

The trainer exposes event methods for its states and actions. These methods can be overridden in Blueprints to provide a custom implementation. The graph below shows the order in which the events fire.

graph LR
  A[/OnConstruct/] --> OnStart
  A -->|Error| B((OnInvalid))
  OnStart -->|Error| B
  OnStart -->|Learn| C[/OnInit/]
  C --> E{OnReset}
  E --> D{OnStep}
  D -->|Error| B
  D -->|Not Done| D
  D -->|Done| E
  E --> Z((OnComplete))
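
The edges of the graph above can be written down as a small transition check. This is a standard C++ sketch for reasoning about the event order, not code from the plugin; TrainerEvent and IsAllowedTransition are hypothetical names.

```cpp
// One enumerator per node in the event graph.
enum class TrainerEvent { OnConstruct, OnStart, OnInvalid, OnInit, OnReset, OnStep, OnComplete };

// Returns true when the graph contains an edge from `from` to `to`.
bool IsAllowedTransition(TrainerEvent from, TrainerEvent to)
{
    switch (from) {
    case TrainerEvent::OnConstruct:
        return to == TrainerEvent::OnStart || to == TrainerEvent::OnInvalid;
    case TrainerEvent::OnStart:
        // OnInit follows only after Learn is called.
        return to == TrainerEvent::OnInit || to == TrainerEvent::OnInvalid;
    case TrainerEvent::OnInit:
        return to == TrainerEvent::OnReset;
    case TrainerEvent::OnReset:
        return to == TrainerEvent::OnStep || to == TrainerEvent::OnComplete;
    case TrainerEvent::OnStep:
        // Not done: step again. Done: reset. Error: invalid.
        return to == TrainerEvent::OnStep || to == TrainerEvent::OnReset ||
               to == TrainerEvent::OnInvalid;
    case TrainerEvent::OnInvalid:
    case TrainerEvent::OnComplete:
        return false;  // no outgoing edges in the graph
    }
    return false;
}
```

Note that, according to the graph, OnComplete is reached from OnReset, not directly from OnStep, and OnInvalid and OnComplete have no outgoing edges.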

OnConstruct

UFUNCTION(BlueprintNativeEvent, Category="NevarokML|Trainer")
void OnConstruct(UNevarokMLSpace* actSpace, UNevarokMLSpace* obsSpace);

The OnConstruct event serves as a setup step before the training process begins, allowing you to configure the action and observation spaces.

OnStart

UFUNCTION(BlueprintNativeEvent, Category="NevarokML|Trainer")
void OnStart();

The OnStart event is triggered once the trainer object has been constructed and initialized successfully. It is a suitable place to start the learning process by calling the Learn function with the desired parameters.

OnInvalid

UFUNCTION(BlueprintNativeEvent, Category="NevarokML|Trainer")
void OnInvalid();

The OnInvalid event is triggered when there are errors or issues encountered during the construction or learning phases of the trainer object. It serves as a notification and allows you to handle and respond to these errors in a customized manner.

OnInit

UFUNCTION(BlueprintNativeEvent, Category="NevarokML|Trainer")
void OnInit(int index, ANevarokMLEnv* env);

The OnInit event is triggered when an environment at a specific index is successfully initialized. It provides an opportunity to perform any necessary setup or customization related to the initialized environment.

OnStep

UFUNCTION(BlueprintNativeEvent, Category="NevarokML|Trainer")
void OnStep(int index, ANevarokMLEnv* env);

The OnStep event is triggered when the trainer performs a step on the environment at a specific index. It provides an opportunity to respond to the environment's state after the step and perform any necessary actions or calculations.

OnReset

UFUNCTION(BlueprintNativeEvent, Category="NevarokML|Trainer")
void OnReset(int index, ANevarokMLEnv* env);

The OnReset event is triggered when the trainer resets the environment at a specific index. It provides an opportunity to handle any necessary actions or logic related to the reset of the environment.

OnComplete

UFUNCTION(BlueprintNativeEvent, Category="NevarokML|Trainer")
void OnComplete();

The OnComplete event is triggered when the training process reaches its completion. It serves as a notification that the training has finished and provides an opportunity to perform any necessary cleanup or additional actions.

OnConstruct_Implementation

virtual void ANevarokMLTrainer::OnConstruct_Implementation(UNevarokMLSpace* actSpace, UNevarokMLSpace* obsSpace);

Implementation of the OnConstruct event method.

OnStart_Implementation

virtual void ANevarokMLTrainer::OnStart_Implementation();

Implementation of the OnStart event method.

OnInvalid_Implementation

virtual void ANevarokMLTrainer::OnInvalid_Implementation();

Implementation of the OnInvalid event method.

OnInit_Implementation

virtual void ANevarokMLTrainer::OnInit_Implementation(int index, ANevarokMLEnv* env);

Implementation of the OnInit event method.

OnStep_Implementation

virtual void ANevarokMLTrainer::OnStep_Implementation(int index, ANevarokMLEnv* env);

Implementation of the OnStep event method.

OnReset_Implementation

virtual void ANevarokMLTrainer::OnReset_Implementation(int index, ANevarokMLEnv* env);

Implementation of the OnReset event method.

OnComplete_Implementation

virtual void ANevarokMLTrainer::OnComplete_Implementation();

Implementation of the OnComplete event method.
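
For a BlueprintNativeEvent, Unreal generates the event entry point and forwards to the matching _Implementation method when no Blueprint override exists, so a C++ subclass customizes behavior by overriding the _Implementation method. The sketch below models only that dispatch pattern in standard C++. TrainerSketch, MyTrainer, and the log member are hypothetical stand-ins and do not use the real ANevarokMLTrainer class.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the trainer; it models the event dispatch only.
class TrainerSketch {
public:
    virtual ~TrainerSketch() = default;

    // Stand-ins for the generated event entry points: each forwards to its
    // native implementation.
    void OnStart() { OnStart_Implementation(); }
    void OnComplete() { OnComplete_Implementation(); }

    std::vector<std::string> log;  // records which implementations ran

protected:
    virtual void OnStart_Implementation() { log.push_back("base start"); }
    virtual void OnComplete_Implementation() { log.push_back("base complete"); }
};

// A subclass overrides only the events it cares about.
class MyTrainer : public TrainerSketch {
protected:
    void OnStart_Implementation() override
    {
        TrainerSketch::OnStart_Implementation();  // keep the default behavior
        log.push_back("custom start");            // a Learn call could go here
    }
};
```

Calling OnStart on a MyTrainer instance runs the base implementation followed by the custom step, while OnComplete falls back to the base implementation because it is not overridden.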