
Search in the Catalogues and Directories

Hits 1 – 20 of 100

1
A Refutation of Finite-State Language Models through Zipf’s Law for Factual Knowledge
In: Entropy ; Volume 23 ; Issue 9 (2021)
2
Cadenas de Markov: métodos cuantitativos para la toma de decisiones III [Markov Chains: Quantitative Methods for Decision Making III]
Fonollosa Guardiet, Juan Bautista; Sunyer Torrents, Albert; Sallán Leyes, José María. - : Universitat Politècnica de Catalunya. Iniciativa Digital Politècnica, 2021
3
The Fundamental Limit Theorem of Countable Markov Chains
In: Senior Honors Theses (2021)
4
Robot Motion Planning in an Unknown Environment with Danger Space
In: Electronics ; Volume 8 ; Issue 2 (2019)
5
Artificial Intelligence in the Context of Human Consciousness
In: Senior Honors Theses (2019)
6
ImproteK: introducing scenarios into human-computer music improvisation
In: ACM Computers in Entertainment ; https://hal.archives-ouvertes.fr/hal-01380163 ; ACM Computers in Entertainment, 2017, ⟨10.1145/3022635⟩ (2017)
7
Hierarchical semi-Markov conditional random fields for deep recursive sequential data
Tran, Truyen; Phung, Dinh; Bui, Hung. - : Elsevier, 2017
8
Statistical Relational Learning and Script Induction for Textual Inference
Mooney, Raymond. - 2017
9
Generic Reinforcement Learning Beyond Small MDPs
Daswani, Mayank. - 2016
Abstract: Feature reinforcement learning (FRL) is a framework within which an agent can automatically reduce a complex environment to a Markov Decision Process (MDP) by finding a map which aggregates similar histories into the states of an MDP. The primary motivation behind this thesis is to build FRL agents that work in practice, both for larger environments and larger classes of environments. We focus on empirical work targeted at practitioners in the field of general reinforcement learning, with theoretical results wherever necessary.
The current state of the art in FRL uses suffix trees, which have issues with large observation spaces and long-term dependencies. We start by addressing the issue of long-term dependencies using a class of maps known as looping suffix trees, which have previously been used to represent deterministic POMDPs. We show the best existing results on the TMaze domain and good results on larger domains that require long-term memory. We introduce a new value-based cost function that can be evaluated model-free. The value-based cost allows for smaller representations, and its model-free nature allows for its extension to the function approximation setting, which has computational and representational advantages for large state spaces. We evaluate the performance of this new cost in both the tabular and function approximation settings on a variety of domains, and show performance better than the state-of-the-art algorithm MC-AIXI-CTW on the POCMAN domain.
When the environment is very large, an FRL agent needs to explore systematically in order to find a good representation; however, it needs a good representation in order to perform this systematic exploration. We decouple the two by considering a different setting, one where the agent has access to the value of any state-action pair from an oracle in a training phase. The agent must learn an approximate representation of the optimal value function. We formulate a regression-based solution based on online learning methods to build such an agent. We test this agent on the Arcade Learning Environment using a simple class of linear function approximators. While we made progress on the issue of scalability, two major issues with the FRL framework remain: the need for a stochastic search method to minimise the objective function and the need to store an uncompressed history, both of which can be very computationally demanding.
Keyword: AGI; Arcade Learning Environment; artificial intelligence; Atari games; DAgger; function approximation; general learning agents; imitation learning; looping suffix trees; Markov Decision Processes; MDP; partially observable; POMDP; reinforcement learning; suffix trees
URL: http://hdl.handle.net/1885/110545
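The core FRL idea in this abstract, aggregating similar histories into the states of an MDP and then learning on those states, can be sketched in a few lines. This is a minimal toy sketch, not the thesis's method: the suffix-tree map is replaced by a fixed-depth suffix of the history, and `ToyTMaze` is a tiny TMaze-like environment invented here for illustration.

```python
import random
from collections import defaultdict

random.seed(0)

# Toy stand-in for a TMaze: a signal observed only at reset determines
# which final action pays off after a short corridor. (Invented here
# for illustration; not the benchmark used in the thesis.)
class ToyTMaze:
    def __init__(self, length=3):
        self.length = length

    def reset(self):
        self.signal = random.choice([0, 1])
        self.pos = 0
        return self.signal           # the signal is the first observation

    def step(self, action):
        if self.pos < self.length:   # walking the corridor
            self.pos += 1
            return 2, 0.0, False     # uninformative corridor observation
        reward = 1.0 if action == self.signal else -0.1
        return 2, reward, True

def suffix_state(history, k=8):
    """Aggregate a history into an MDP state via its last k symbols --
    a fixed-depth simplification of the suffix-tree maps in the thesis."""
    return tuple(history[-k:])

# Tabular Q-learning on the aggregated states.
Q = defaultdict(float)
env = ToyTMaze()
alpha, gamma, eps = 0.1, 0.9, 0.1
for _ in range(5000):
    history = [env.reset()]
    done = False
    while not done:
        s = suffix_state(history)
        if random.random() < eps:
            a = random.choice([0, 1])
        else:
            a = max((0, 1), key=lambda x: Q[(s, x)])
        obs, r, done = env.step(a)
        history += [a, obs]
        s2 = suffix_state(history)
        target = r if done else r + gamma * max(Q[(s2, 0)], Q[(s2, 1)])
        Q[(s, a)] += alpha * (target - Q[(s, a)])
```

With `k=8` the suffix window still covers the initial signal at decision time, so the aggregated states form a learnable MDP; shrinking `k` reintroduces exactly the long-term-dependency failure the abstract attributes to plain suffix trees.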
10
Constructing States for Reinforcement Learning
In: Proceedings of International Conference on Machine Learning (ICML 2010) (2015)
11
Short text authorship attribution via sequence kernels, Markov chains and author unmasking: An investigation
In: Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing ; http://acl.ldc.upenn.edu/W/W06/#W06-1600 (2015)
12
Semi-Markov models for sequence segmentation
In: Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL 2007) ; http://www.aclweb.org/anthology-new/D/D07/D07-1.pdf (2015)
13
Generic Reinforcement Learning Beyond Small MDPs ...
Daswani, Mayank. - : The Australian National University, 2015
14
Structural Complexity in Linguistic Systems Research Topic 3: Mathematical Sciences
In: DTIC (2015)
15
A Fast Variational Approach for Learning Markov Random Field Language Models
In: DTIC (2015)
16
Planning Human-Computer Improvisation
In: International Computer Music Conference ; https://hal.archives-ouvertes.fr/hal-01053834 ; International Computer Music Conference, Sep 2014, Athens, Greece ; http://icmc14-smc14.net (2014)
17
Markov Substitute Processes: a statistical model for linguistics [Processus de substitution markoviens : un modèle statistique pour la linguistique]
Mainguy, Thomas. - : HAL CCSD, 2014
In: https://tel.archives-ouvertes.fr/tel-01127344 ; General Mathematics [math.GM]. Université Pierre et Marie Curie - Paris VI, 2014. English. ⟨NNT : 2014PA066354⟩ (2014)
18
Matrix analytic methods with Markov decision processes for hydrological applications.
19
Understanding Tonal Languages
In: DTIC (2013)
20
Probabilistic Sequence Models with Speech and Language Applications
Henter, Gustav Eje. - Stockholm: KTH, Kommunikationsteori, 2013
