1 |
A Refutation of Finite-State Language Models through Zipf’s Law for Factual Knowledge

In: Entropy ; Volume 23 ; Issue 9 (2021)

BASE

2 |
Cadenas de Markov: métodos cuantitativos para la toma de decisiones III [Markov Chains: Quantitative Methods for Decision Making III]

3 |
The Fundamental Limit Theorem of Countable Markov Chains

In: Senior Honors Theses (2021)

4 |
Robot Motion Planning in an Unknown Environment with Danger Space

In: Electronics ; Volume 8 ; Issue 2 (2019)

5 |
Artificial Intelligence in the Context of Human Consciousness

In: Senior Honors Theses (2019)

6 |
ImproteK: introducing scenarios into human-computer music improvisation

In: ACM Computers in Entertainment, 2017 ; ⟨10.1145/3022635⟩ ; https://hal.archives-ouvertes.fr/hal-01380163

7 |
Hierarchical semi-Markov conditional random fields for deep recursive sequential data

8 |
Statistical Relational Learning and Script Induction for Textual Inference

9 |
Generic Reinforcement Learning Beyond Small MDPs

Abstract:
Feature reinforcement learning (FRL) is a framework within which an agent can automatically reduce a complex environment to a Markov Decision Process (MDP) by finding a map which aggregates similar histories into the states of an MDP. The primary motivation behind this thesis is to build FRL agents that work in practice, both for larger environments and larger classes of environments. We focus on empirical work targeted at practitioners in the field of general reinforcement learning, with theoretical results wherever necessary. The current state-of-the-art in FRL uses suffix trees, which have issues with large observation spaces and long-term dependencies. We start by addressing the issue of long-term dependency using a class of maps known as looping suffix trees, which have previously been used to represent deterministic POMDPs. We show the best existing results on the TMaze domain and good results on larger domains that require long-term memory. We introduce a new value-based cost function that can be evaluated model-free. The value-based cost allows for smaller representations, and its model-free nature allows for its extension to the function approximation setting, which has computational and representational advantages for large state spaces. We evaluate the performance of this new cost in both the tabular and function approximation settings on a variety of domains, and show performance better than the state-of-the-art algorithm MC-AIXI-CTW on the domain POCMAN. When the environment is very large, an FRL agent needs to explore systematically in order to find a good representation. However, it needs a good representation in order to perform this systematic exploration. We decouple the two by considering a different setting, one where the agent has access to the value of any state-action pair from an oracle in a training phase. The agent must learn an approximate representation of the optimal value function.
We formulate a regression-based solution, built on online learning methods, to construct such an agent. We test this agent on the Arcade Learning Environment using a simple class of linear function approximators. While we made progress on the issue of scalability, two major issues with the FRL framework remain: the need for a stochastic search method to minimise the objective function and the need to store an uncompressed history, both of which can be very computationally demanding.
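The core idea in this abstract, aggregating similar histories into MDP states via a suffix-based map, can be illustrated with a minimal sketch. This is not the thesis's actual algorithm: a fixed-depth suffix map is the simplest, non-looping, non-adaptive case of the suffix-tree maps discussed, and the symbols, function names, and histories below are hypothetical.

```python
from collections import defaultdict

def suffix_state(history, k=2):
    # Map a history (sequence of observation/action symbols) to a state
    # given by its length-k suffix: histories that share their k most
    # recent symbols are aggregated into the same MDP state.
    return tuple(history[-k:])

def aggregate(histories, k=2):
    # Count how many histories land in each aggregated state.
    counts = defaultdict(int)
    for h in histories:
        counts[suffix_state(h, k)] += 1
    return dict(counts)

histories = [
    ["a", "b", "a", "b"],
    ["b", "b", "a", "b"],  # shares the suffix ("a", "b") with the first
    ["a", "b", "b", "a"],
]
states = aggregate(histories, k=2)
# states == {("a", "b"): 2, ("b", "a"): 1}
```

The suffix trees and looping suffix trees referred to in the abstract generalise this by choosing variable, context-dependent suffix depths (and, for looping trees, cycles) so that the resulting aggregation is approximately Markov.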

Keyword:
AGI; Arcade Learning Environment; artificial intelligence; Atari games; DAgger; function approximation; general learning agents; imitation learning; looping suffix trees; Markov Decision Processes; MDP; partially observable; POMDP; reinforcement learning; suffix trees

URL: http://hdl.handle.net/1885/110545

10 |
Constructing States for Reinforcement Learning

In: Proceedings of the International Conference on Machine Learning (ICML 2010) (2015)

11 |
Short text authorship attribution via sequence kernels, Markov chains and author unmasking: An investigation

In: Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing ; http://acl.ldc.upenn.edu/W/W06/#W06-1600 (2015)

12 |
Semi-Markov models for sequence segmentation

In: Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL 2007) ; http://www.aclweb.org/anthology-new/D/D07/D07-1.pdf (2015)

14 |
Structural Complexity in Linguistic Systems. Research Topic 3: Mathematical Sciences

In: DTIC (2015)

15 |
A Fast Variational Approach for Learning Markov Random Field Language Models

In: DTIC (2015)

16 |
Planning Human-Computer Improvisation

In: International Computer Music Conference, Sep 2014, Athens, Greece ; http://icmc14-smc14.net ; https://hal.archives-ouvertes.fr/hal-01053834 (2014)

17 |
Markov Substitute Processes: a statistical model for linguistics (Processus de substitution markoviens : un modèle statistique pour la linguistique)

In: General Mathematics [math.GM]. Université Pierre et Marie Curie - Paris VI, 2014. English. ⟨NNT : 2014PA066354⟩ ; https://tel.archives-ouvertes.fr/tel-01127344 (2014)

18 |
Matrix analytic methods with Markov decision processes for hydrological applications

20 |
Probabilistic Sequence Models with Speech and Language Applications