File:Example trajectories of agents trained with different thought cloning strategies.png
From Wikimedia Commons, the free media repository
![File:Example trajectories of agents trained with different thought cloning strategies.png](https://upload.wikimedia.org/wikipedia/commons/thumb/8/81/Example_trajectories_of_agents_trained_with_different_thought_cloning_strategies.png/800px-Example_trajectories_of_agents_trained_with_different_thought_cloning_strategies.png?20230809120121)
Original file (1,514 × 688 pixels, file size: 311 KB, MIME type: image/png)
Captions
From the study "Thought Cloning: Learning to Think while Acting by Imitating Human Thinking"
Summary
Description | English: "Example trajectories of agents trained with different strategies. Constant teacher-forcing training refers to exclusively training with the teacher-forcing strategy. In this scenario, the agent does not learn to recover from incorrect thoughts. Once it adopts an incorrect thought, it continues to follow this thought for thousands of time-steps until it reaches the maximum step count (top right from t=53 to t=2880). Constant auto-regressive training after teacher-forcing training implies directly transitioning to auto-regressive training following an initial phase of teacher-forcing training. In this case, agents begin to generate nonsensical thoughts, as shown on the left, such as open blue at t=24 (left) and pickup door door at t=75 (left). Gradual decay of teacher-forcing rate involves gradually reducing the ratio of teacher-forcing during training. This strategy is adopted in the final version of Thought Cloning. In this setting, the agent might generate some incorrect thoughts as shown at t=53 (bottom right), but it can recover from these errors to explore new ideas, as evidenced at t=131 (bottom right)." |
Date | |
Source | https://arxiv.org/abs/2306.00323 |
Author | Authors of the study: Shengran Hu, Jeff Clune |
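
The three training strategies named in the description differ only in how often the ground-truth thought is fed back to the agent during training. The following is a minimal sketch of that idea, not the study's implementation: the function names, the `switch_fraction` parameter, and the linear shape of the decay are all hypothetical, since the description does not state the actual schedule.

```python
import random


def teacher_forcing_rate(step, total_steps, strategy, switch_fraction=0.5):
    """Probability of feeding the ground-truth thought at a training step.

    Illustrative assumption only: the switch point and the linear decay
    are made up for this sketch and are not taken from the study.
    """
    if strategy == "constant_teacher_forcing":
        # Always teacher-forced: the agent never sees its own thoughts.
        return 1.0

    switch_step = int(total_steps * switch_fraction)

    if strategy == "abrupt_switch":
        # Teacher forcing first, then purely auto-regressive training.
        return 1.0 if step < switch_step else 0.0

    if strategy == "gradual_decay":
        if step < switch_step:
            return 1.0
        # Linearly reduce the rate from 1.0 to 0.0 over the remaining steps.
        remaining = max(1, total_steps - switch_step)
        return max(0.0, 1.0 - (step - switch_step) / remaining)

    raise ValueError(f"unknown strategy: {strategy}")


def pick_thought(ground_truth, generated, rate, rng):
    """Choose which thought conditions the next action during training."""
    return ground_truth if rng.random() < rate else generated


if __name__ == "__main__":
    rng = random.Random(0)
    for step in (0, 50, 75, 100):
        rate = teacher_forcing_rate(step, 100, "gradual_decay")
        thought = pick_thought("open blue door", "<agent thought>", rate, rng)
        print(step, rate, thought)
```

Under this sketch, the gradual schedule exposes the agent to a growing share of its own, possibly incorrect, thoughts while ground-truth thoughts are still sometimes supplied, which matches the recovery behaviour the description attributes to that strategy.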
Licensing
![w:en:Creative Commons](https://upload.wikimedia.org/wikipedia/commons/thumb/7/79/CC_some_rights_reserved.svg/90px-CC_some_rights_reserved.svg.png)
![attribution](https://upload.wikimedia.org/wikipedia/commons/thumb/1/11/Cc-by_new_white.svg/24px-Cc-by_new_white.svg.png)
This file is licensed under the Creative Commons Attribution 4.0 International license.
- You are free:
  - to share – to copy, distribute and transmit the work
  - to remix – to adapt the work
- Under the following conditions:
  - attribution – You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
File history
| | Date/Time | Dimensions | User | Comment |
|---|---|---|---|---|
| current | 12:01, 9 August 2023 | 1,514 × 688 (311 KB) | Prototyperspective | Uploaded a work by Authors of the study: Shengran Hu, Jeff Clune from https://arxiv.org/abs/2306.00323 with UploadWizard |
File usage on Commons
There are no pages that use this file.