A Robust Exploration Strategy in Reinforcement Learning Based on Temporal Difference Error

Muhammad Shadi Hajar*, Harsha Kalutarage, M. Omar Al-Kadri

*Corresponding author for this work

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

    Original language: English
    Title of host publication: AI 2022
    Subtitle of host publication: Advances in Artificial Intelligence - 35th Australasian Joint Conference, AI 2022, Proceedings
    Editors: Haris Aziz, Débora Corrêa, Tim French
    Publisher: Springer Science and Business Media Deutschland GmbH
    Pages: 789-799
    Number of pages: 11
    ISBN (Print): 9783031226946
    Publication status: Published (VoR) - 2022
    Event: 35th Australasian Joint Conference on Artificial Intelligence, AI 2022 - Perth, Australia
    Duration: 5 Dec 2022 to 9 Dec 2022

    Publication series

    Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
    Volume: 13728 LNAI
    ISSN (Print): 0302-9743
    ISSN (Electronic): 1611-3349

    Conference

    Conference: 35th Australasian Joint Conference on Artificial Intelligence, AI 2022
    Country/Territory: Australia
    City: Perth
    Period: 5/12/22 to 9/12/22

    Keywords

    • Exploitation
    • Exploration
    • ε-greedy
    • k-armed bandit
    • Q-learning
    • Reinforcement learning
    • Softmax
