File:Example trajectories of agents trained with different thought cloning strategies.png

From Wikimedia Commons, the free media repository

Original file(1,514 × 688 pixels, file size: 311 KB, MIME type: image/png)

Captions

From the study "Thought Cloning: Learning to Think while Acting by Imitating Human Thinking"

Summary

Description
English: "Example trajectories of agents trained with different strategies. Constant teacher-forcing training refers to exclusively training with the teacher-forcing strategy. In this scenario, the agent does not learn to recover from incorrect thoughts. Once it adopts an incorrect thought, it continues to follow this thought for thousands of time-steps until it reaches the maximum step count (top right from t=53 to t=2880).

Constant auto-regressive training after teacher-forcing training implies directly transitioning to auto-regressive training following an initial phase of teacher-forcing training. In this case, agents begin to generate nonsensical thoughts, as shown on the left, such as open blue at t=24 (left) and pickup door door at t=75 (left).

Gradual decay of teacher-forcing rate involves gradually reducing the ratio of teacher-forcing during training. This strategy is adopted in the final version of Thought Cloning. In this setting, the agent might generate some incorrect thoughts as shown at t=53 (bottom right), but it can recover from these errors to explore new ideas, as evidenced at t=131 (bottom right)."
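
The three strategies named in the description differ only in how the teacher-forcing rate changes over training, which a short sketch can make concrete. The code below is an illustration under stated assumptions, not the study's implementation: the linear shape of the decay, the function names `teacher_forcing_rate` and `choose_thought`, and the parameters `decay_start`, `decay_end` and `final_rate` are all hypothetical, since the description does not specify the schedule used.

```python
import random


def teacher_forcing_rate(step, decay_start, decay_end, final_rate=0.0):
    """Return the teacher-forcing rate at a given training step.

    Assumption: the rate stays at 1.0 until decay_start, then falls
    linearly to final_rate at decay_end. The linear shape is illustrative.
    """
    if step <= decay_start:
        return 1.0
    if step >= decay_end:
        return final_rate
    frac = (step - decay_start) / (decay_end - decay_start)
    return 1.0 + frac * (final_rate - 1.0)


def choose_thought(ground_truth_thought, generated_thought, rate, rng=random):
    """Pick the thought the agent is conditioned on for this step.

    With probability `rate` the ground-truth thought is used (teacher
    forcing); otherwise the agent's own generated thought is used
    (auto-regressive).
    """
    if rng.random() < rate:
        return ground_truth_thought
    return generated_thought
```

In this sketch, constant teacher-forcing corresponds to a rate that is always 1.0, an abrupt switch to auto-regressive training corresponds to `decay_start == decay_end`, and gradual decay corresponds to `decay_end` being well after `decay_start`.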
Date
Source https://arxiv.org/abs/2306.00323
Author Authors of the study: Shengran Hu, Jeff Clune

Licensing

This file is licensed under the Creative Commons Attribution 4.0 International license.
You are free:
  • to share – to copy, distribute and transmit the work
  • to remix – to adapt the work
Under the following conditions:
  • attribution – You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.

File history

Date/Time: 12:01, 9 August 2023 (current)
Dimensions: 1,514 × 688 (311 KB)
User: Prototyperspective
Comment: Uploaded a work by Authors of the study: Shengran Hu, Jeff Clune from https://arxiv.org/abs/2306.00323 with UploadWizard

There are no pages that use this file.