Fall 2020
Computed strategies for Kuhn Poker and Leduc Hold’em. We have shown that it is hard to find global optima for the Stackelberg equilibrium, even in three-player Kuhn Poker. We present a way to compute the MaxMin strategy with the CFR algorithm. The tournaments suggest that the pessimistic MaxMin strategy is the best-performing and most robust strategy.
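For readers unfamiliar with CFR, its per-information-set update is usually built on regret matching: actions are played in proportion to their positive cumulative regret. The sketch below shows only that one step. It is not the report's MaxMin computation, and the function name `regret_matching` and the example regrets are illustrative assumptions.

```python
def regret_matching(cumulative_regrets):
    """Turn cumulative regrets into a strategy (probabilities over actions).

    Hypothetical helper for illustration: actions with positive regret are
    played in proportion to that regret; if no action has positive regret,
    the strategy falls back to uniform.
    """
    positive = [max(r, 0.0) for r in cumulative_regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    n = len(cumulative_regrets)
    return [1.0 / n] * n


# Example: three actions (say fold, call, raise) with made-up regrets.
strategy = regret_matching([-1.0, 3.0, 1.0])
print(strategy)
```

A full CFR solver would call a step like this at every information set on every iteration and average the resulting strategies over time.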
Public Reports
Run examples/leducholdemhuman.py to play with the pre-trained Leduc Hold'em model. Leduc Hold'em is a simplified version of Texas Hold'em with fewer rounds and a smaller deck. Rules can be found here. In this repository we aim to tackle this problem using a version of Monte Carlo tree search called partially observable Monte Carlo planning, first introduced by Silver and Veness in 2010. Rules of the UH-Leduc-Holdem Poker Game: UHLPO is a two-player poker game. The deck contains multiple copies of eight different cards (aces, kings, queens, and jacks in hearts and spades) and is shuffled prior to playing a hand.
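The UHLPO deck described above can be sketched in a few lines. The text only says "multiple copies", so the number of copies per card is an assumption here (the hypothetical parameter `copies_per_card`), as are the function name `build_deck` and the card labels.

```python
import random

RANKS = ("A", "K", "Q", "J")   # aces, kings, queens, jacks
SUITS = ("h", "s")             # hearts and spades


def build_deck(copies_per_card=2, seed=None):
    """Build and shuffle a UHLPO-style deck.

    copies_per_card is an assumed value: the source does not state
    how many copies of each of the eight cards are used.
    """
    deck = [rank + suit for rank in RANKS for suit in SUITS] * copies_per_card
    random.Random(seed).shuffle(deck)  # shuffled prior to playing a hand
    return deck


deck = build_deck(copies_per_card=2, seed=0)
print(len(deck), sorted(set(deck)))
```

Passing a `seed` makes the shuffle reproducible, which helps when comparing agents over the same sequence of deals.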
Public Video Reports
Other Titles
Leduc Hold'em
- Tic Tac Toe Solver
- Beating the house in Blackjack
- Effect of Noise on Learning a Planar Pushing Task using SAC
- ResQNet: Finding Optimal Fire Rescue Routes
- COVID Chatbot
- Regularized Follow-the-Leader in Online MDP for Efficient Topographical Mapping
- Learning POMDP model parameters from missing observations
- Reinforcement Learning for No-Limit Texas Hold ’Em with Bomb Pots
- Identifying Optimal Locations for Satellite Image Capture
- Diet Conscious Meal Planner
- Mitigating Risk of Public Transit during COVID-19
- Predicting the Match: Using Bayesian Networks to Predict Professional Tennis Outcomes
- Efficient Single-Agent Capture of a Moving Target
- Q-Learning Applied to the Taxi Problem
- Settlers of CATAN
- Autonomous Snake
- Q-Learning for Pre-Flop Texas Hold ’Em
- Deep RL for Atari Games
- Simulating a D&D Encounter with Q-Learning
- Deep RL for Automated Stock Trading
- Dating Under Uncertainty
- Retinal Implant Electrical Stimulation via RL
- Batch Offline RL in the Healthcare Setting
- Computer Caddy – Using RL to Advise Golfers’ Club Selection
- RL for Fischer Random Chess
- Timely Decision Making with Probability Path Model
- Satellite-Imagery Based Poverty Level Evaluation System in Mexico with Deep RL Approach
- Pokemon Showdown
- Deep RL for Space Invaders
- Learning to Run
- RL-Based Control of Policy Selection in Near-Accident Scenarios
- Model Predictive Control for an Aircraft Autopilot
- Finding Inharmonic Timbres Locally Consonant with Arbitrary Scales
- Escape Roomba
- Driving in Traffic
- Playing Snake
- The 2020 Flatland Challenge
- Elevator Scheduling with Neural Q-Learning
- Optimizing Immunotherapy Treatment using RL
- Modeling Leduc Hold ’Em Poker
- Auto Trading System Using Q-Learning
- Energy System Modeling
- Optimizing Fox in the Forest through RL
- Learning Gin Rummy
- Car Racing with Deep RL
- Sequential Decision Making for Mineral Exploration
- Advanced Driver Assistance Systems
- Learning to Play Stargunner with Deep Q-Networks
- A Fourth-and-Goal Football Recommender System
- Algorithms for Motion Planning
- Playing Farkle
- Connect 4: A Survey of Different RL Techniques to Destroy Your Pride
- Decision Making in the word game, Codenames
- Reinforcement Learning Approaches for An Adversarial Snake Agent
- An Attention-Based, Reinforcement-Learned Heuristic Solver for the Double Travelling Salesman Problem With Multiple Stacks
- Uncertainty Aware Model-Based Policy Optimization
- Navigating the Four-Way Stop Autonomously
- Ground Water Remediation Using Sequential Decision Making
- Final Project: Satellite Collision Avoidance
- Q Learning for 4th Down Decision Making in the NFL
- Q-Learning for the Game of Nim: Does The Agent Learn a Combinatorially Optimal Strategy On Its Own?
- Contextual Bandit Algorithms in Recommender Systems
- A Comparison of Reinforcement Learning Methods for Autonomous Navigation
- Reinforcement Learning for Behavior Planning in Intersections for Autonomous Vehicles
- Reinforcement Learning for Pacman Capture the Flag
- Comparing Different Optimization Techniques for Learning Continuous Control with Neural Networks
- Autonomous Exploration in Subterranean Environments
- Improving Image Denoising through Decision Making
- Using MDPs to Optimally Allocate Funds
- Explanations Meet Decision Theory
- Learning Policies for Adaptive LiDAR Scanning with POMDPs
- Cautious Markov Games: A New Framework for Human-Robot Interaction
- Selecting a multibasis community structure for the connectome
- Reinforcement Learning Techniques for Long-Term Trading and Portfolio Management
- Optimal Asset Allocation with Markov Decision Processes
- Symbolic Regression with Bayesian Networks
- Scheduling battery charging using deep reinforcement learning
- Online Knapsack Problem Using Reinforcement Learning
- Policy gradient optimization for
- Resource Allocation for Wildfire Prevention
- Using Reinforcement Learning to Play Omaha
- Fraud Detection for Mobile Payments using Bayesian Network and CNN
- Neuro-Adaptive Artificial Neural Networks for Reinforcement Learning
- AI Agent for Qwirkle
- Learning Optimal Wildfire Suppression Policies With Reinforcement Learning
- Bid Smart with Uncertainty: An Autonomous Bidder
- AA228/CS238 Final Report
- Modeling Identification of Approaching Aircraft as a POMDP
- Short-Term Trading Policies for Bitcoin Cryptocurrency Using Q-learning
- Reinforcement Learning of a Battery Power Schedule for a Short-Haul Hybrid-Electric Aircraft Mission
- Autonomous Helicopter Control for Rocket Recovery
- Reinforcement Learning Strategies Solving Game Gomoku
- A Wildfire Evacuation Recommendation System
- Battleship with Algorithm
- Developing an Optimal Structure for Breast Cancer Single Cell Classification
- Utilizing Deep Q Networks to Optimally Execute Stock Market Entrance and Exit Strategies
- Online Planning for a Grid World POMDP
- Contingency Manager Agent for Safe UAV Autonomous Operations
- Solving Mastermind as a POMDP
- Simulated Drone Flight with Advantaged Actor Critic Reinforced Learning in 2 and 3 Dimensions
- Solving Queueing Problem Using Monte Carlo Tree Search
- Bayesian Structure Learning on NFL play data
- Multi-Agent Rendezvous Using Reinforcement Learning
- Dynamic Portfolio Optimization
- Fairness and Efficiency in Multi-Portfolio Liquidation: A Multiple-Agent Deep Reinforcement Learning Approach
- Evaluating Poker Hands
- Saving Artificial Intelligence Clinician
- Evaluation of online trajectory planning methods for autonomous vehicles
- Solving Leduc Hold’em Counterfactual Regret Minimization
- From aerospace guidance to COVID-19: Tutorial for the application of the Kalman filter to track COVID-19
- A Reinforcement Learning Algorithm for Recycling Plants
- Monte Carlo Tree Search with Repetitive Self-Play for Tic-Tac-Toe
- Developing a Decision Making Agent to Play RISK
Fall 2019
Public Reports
Other Titles
- Linear Array Target Motion Analysis Using POMDPs
- Speed or Safety?: Calculating Urban Walking Routes Based on Probability of Crime and Foot Traffic
- AlphaGomoku
- Modelling Uncertainty in Dynamic Real-time Multimodal Routing Problems
- Reinforcement Learning for Portfolio Allocation
- Preparation of Papers for AIAA Technical Conferences
- Autonomous Racing
- Deep Learning Enabled Uncorrelated Space Observation Association
- Landing a Lunar Spacecraft with Deep Q-Learning
- POkerMDP: Decision Making for Poker
- 1V1 Leduc Hold’em Bot
- Political Influencers: Using Election Finance Data to Analyze Campaign Success via Bayesian Networks
- Developing AI Policies for Street Fighter via Q-learning
- Impact of Market Technical Indicators On Future Stock Prices Using Reinforcement Learning
- Allocation of Hearts for Transplant as an MDP
- Multi-Agent Reinforcement Learning in a 2D Environment for Transportation Optimization
- Planning under Uncertainty for Discrete Robotic Navigation with Partial Observability
- Deep Reinforcement Learning Applied to Mid-Frequency Trading
- Application of Subspace Identification for Classification of Neural-Activity during Decision-Making
- Using Markov-Decision Processes to Design Betting Strategies for the NFL
- Maneuvering Characteristics Control Systems using Discrete-Time MDPs
- MDP Based Motion Planning In Unsignaled Intersections
- Competitive Blackjack Using Reinforcement Learning
- Modelling Pedestrian Vehicle Interaction at Stop Sign using Markov Decision Process
- Jeopardy! Wagering Under Uncertainty
- Love Letters Under Uncertainty
- Playing The Resistance with a POMDP
- Robotic Simultaneous Localization and Mapping with 2D Laser Scan
- Mars Rover: Navigating an Uncertain World
- Modeling Blood Donations Over Time as a POMDP
- Reinforcement Learning for Control on OpenAI Gym Environments
- Playing Connect 4 using Reinforcement Learning
- Evaluation of Reduced Algorithmic Complexity for Grasping Tasks by Using a Novel Underactuated Curling Grasper with Reinforcement Learning
- Optimizing Strategies for Settlers of Catan
- Exploring Search Algorithms for Klondike Solitaire
- A Sparse Sampling Control Strategy for Risk Minimization during Stretchable Sensor Network Deployment
- Computing Strategies for the 7 Wonders Board Game
- POMDP modeling of stochastic Tetris
- Solving a Maze with Doors and Hidden Tigers
- Playing “Dominion” with Deep Reinforcement Learning
- Delivery Vehicle Navigation in Crowd with Reinforcement Learning
- Capturing Uncertainty in a Multi-Modal Setting With JRMOT: A Real-Time 2D-3D Multi-Object Tracker
- Decentralized Satellite Network Communication
- Seismic Network Planning
- Reinforcement Learning for PaoDeKuai, A Card Game
- Training A Bai Fen Agent with Reinforcement Learning
- Decision Making for Launch Cancellation Based Upon Storm Conditions
- Optimizing for the Competitive Edge: Modeling Sequential Binary Decision Making for Two Competing Firms
- Datacenter Equipment Maintenance Optimization
- To Heat Or Not To Heat: Reinforcement Learning for Optimal Residential Water Heater Control
- Learning to Play Snake Game with Deep Reinforcement Learning
- Optimal Traffic Light Control for Efficient City Transportation
- Modeling NBA Point Spread Betting as an MDP
- Solving a car racing game using Reinforcement Learning
- Is Uncertainty Really Harmful: Solving Partially Observable Lunar Lander Problem with Deep Reinforcement Learning
- Autonomous Navigation of an RC Boat Under a POMDP Framework
- Evaluating the Bayes-Adaptive MDP Framework on Stochastic Gridworld Environments
- Value Iteration with Enhanced Action Space for Path Planning
- The Medical Triage Problem: Improving Hospitals’ Admission Decisions
- Optimal Route Selection for Riders in Toronto
- Model Free Learning for Optimal Power Grid Management
- Wasting Less Time on the Road Using MDPs
- Learning User Preferences to Produce Sequential Movie Recommendations
- A Comparison of Learning Based Control Methods for Optimal Trajectory Tracking with a Quadrotor
- Artificial Pancreas: Q-Learning Based Control for Closed-loop Insulin Delivery Systems
- Navigating in an Uncertain World
- Teaching an Autonomous Car to Drive through an Intersection with POMDPs
- Atomic structure minimization using simulated annealing with a MCTS temperature scheme
- AI Game Player for 2048
- Deep Q-Learning with GARCH for Stock Volatility Trading
- Learning to Become President
- Solving GNSS Integrity-Based Path Planning in Urban Environments via a POMDP Framework
- Reservoir operation under climate uncertainty
- Reinforcement Learning for Maze Solving
- Using Reinforcement Learning to Find Basins of Attraction
- Planning for Asteroid Prospecting Missions with POMDPs
- Human-Aware Robot Motion Planner
- Determining Federal Funds Rate Changes – Hike / Cut / Hold – Under Economic Uncertainty
- Simulating Work-Life Balance with POMDP
- Solving 2048 as a Markov Decision Process
- Accounting for Delay in Dynamic Resource Allocation for Wildfire Suppression – a POMDP Approach
- Daily Allocation of Assets with Distinct Risk Profiles using Reinforcement Learning
- LocoNets for Deep Reinforcement Learning
- Exploring a full joint observability game with Markov decision processes
- Deep Bayesian Active Learning for Multiple Correct Outputs
- Convolutionally Reducing Markov Decision Processes
- Robust Decision Making Agent for Frozen Lake
- Tic-Tac-Toe How Many In A Row?
- Turbomachinery Optimization Under Uncertainty
- Devising a Policy for Liar’s Dice Using Model Free Reinforcement Learning
- Political Compromises: an Iterative Game of Prisoner’s Dilemma
- Optimal Home Energy System Management using Reinforcement Learning
- Drone Tracking in a 2-dimensional Grid using Particle Filter Algorithm
- Deep Reinforcement Learning for Traffic Signal Control
- A Deep Reinforcement Learning Approach to Recommender Systems
- FlyCroTugs – Collaborative Object Manipulation Using Flying Tugs
- Local Approximation Q-Learning for a Simplified Satellite Reconnaissance Mission
- Developing Policies for Blackjack Using Reinforcement Learning
- Applying Q-learning to the Homicidal Chauffeur Problem
- Optimal Satellite Detumbling through Reinforcement Learning
- Active Preference-Based Gaussian Process Regression for Reward Learning and Optimization
- A Comparative Study on Heart Disease Prediction
- Robot Navigation with Human Intent Uncertainty
- Conquering the Queen of Spades: A Hearts Agent
- Using Markov Decision Processes to Predict Soccer Player Market Value
- Effectiveness of Recurrent Network for Partially-Observable MDPs
- Capture The Flag
- Predicting uncertainty
- Optimal Asset Allocation with Markov Decision Processes
- Nets on Nets: Using Bayesian Networks to Predict Supplier Links in Economic Networks
- Playing 2048 With Reinforcement Learning
- Trading strategies using deep reinforcement learning with news and time series stock data
- Modeling Contract Bridge as a POMDP
- Solving Rubik’s Cubes Using Milestones
- Playing 2048 with Deep Reinforcement Learning
- An Approximate Dynamic Programming Minimum-Time Guidance Policy for High Altitude Balloons
- Identifying Bots on Twitter
- Approaches to Model-Free Blackjack
- Jumping Robot Simulator: An Exploration of Methods to Teach a Bio-Inspired Frog Robot to Navigate
- Air Traffic Control Tower Policy for Terminal Environment Operations
- Managing a Prediction Market Portfolio
- Applying Partially Observable Markov Decision Making Processes to a Product Recommendation System
- Self-Driving Under Uncertainty
- Reinforcement Learning for QWOP
- Modeling Macroeconomic Phenomena with Multi-Agent Reinforcement Learning
- Optimal Learning Policy via POMDP planning
- AI Guidance for Thermalling in a Glider
- Decision Making For Profit: Portfolio Management using Deep Reinforcement Learning
- Self-play Reinforcement Learning for Open-face Chinese Poker
- Feature Constrained Graph Generation with a Modified Multi-Kernel Kronecker Model
- Sensor Fusion of IMU and LiDAR Data Using a Multirate Extended Kalman Filter
- Optimizing Empiric Antibiotic Delivery in the Emergency Department
- The Task Completion Game
- Optimizing Modified Mini-Metro (M³)
- Improving Pragmatic Inferences with BERT and Rational Speech Act Framework and Data Augmentation
- Deep Q-Learning for Playing Hanabi as a POMDP
Fall 2018
Public Reports
Other Titles (excluding optional final projects)
- Occlusion Handling for Local Path Planning with Stereo Vision
- Pre-Flop Betting Policy in Poker
- Optimal Impulsive Maneuver Times for Simultaneous Imaging and Gravity Recovery of an Asteroid
- Monte-Carlo Planning in Subsurface Resource Development
- Learning to Win at Go-Stop
- Police Officer Distribution
- Optimizing Road Construction to Improve Traffic Conditions Using Reinforcement Learning
- Q-Learning for Casino Hold’em
- Modeling a Connected Highway Merge as a POMDP Using Dynamic GPS Error
- Figure 8 Race Track Optimal and Safe Driving
- Predictive Maintenance of Trucks using POMDPs
- Predictive Models for Maximizing Return on Agriculture given Location and Temperature
- A Policy to Deal With Delay Uncertainty
- Reinforcement Learning Methods for Energy Microgrids
- Boom! Tetris for Bot – Designing a Reinforcement Learning Framework for NES Tetris
- Hidden Markov Models for Economic Cycle Regime Estimation
- Push Me: Optimizing Notification Timing to Promote Physical Activity
- Resource Allocation for Floridian Hurricanes
- Motion Planning in Human-Robot Interaction Using Reinforcement Learning
- Automated Neural Network Architecture Tuning with Reinforcement Learning
- Imitation Learning in OpenAI Gym with Reward Shaping
- Collision Avoidance for Unmanned Rockets using Markov Decision Processes
- MDP Solvers for a Successful Sushi Go! Agent
- Uncovering Personalized Mammography Screening Recommendations through the use of POMDP Methods
- Implementing Particle Filters for Human Tracking
- Decision Making in the Stock Market: Can Irrationality be Mathematically Modelled?
- Single and Multi-Agent Autonomous Driving using Value Iteration and Deep Q-Learning
- Buying and Selling Stock with Q-Learning
- Application and Analysis of Online, Offline, and Deep Reinforcement Learning Algorithms on Real-World Partially-Observable Markov Decision Processes
- Reward Augmentation to Model Emergent Properties of Human Driving Behavior Using Imitation Learning
- Classification and Segmentation of Cancer Under Uncertainty
- Comparison of Learning Methods for Price Setting of Airfare
- QMDP Method Comparisons for POMDP Pathfinding
- Global Value Function Approximation using Matrix Completion
- Artificial Intelligence Techniques for a Game of 2048
- Exploring the Boundaries of Art
- An Iterative Linear Algebra Approach to Dynamic Programming
- Solving Open AI Gym’s Lunar Lander with Deep Reinforcement Learning
- Application of Imitation Learning to Modeling Driver Behavior in Generalized Environments
- Craps Shoot: Beating the House…?
- Movie Recommendations with Reinforcement Learning
- Playing Atari 2600 Games Using Deep Learning
- Traverse Synthesis for Planetary Exploration
- Optimal operation of an islanded microgrid under a Markov Decision Process framework
- Implementing Deep Q-learning Extensions in Julia with Flux.jl
- Learning How to Buy Food
- Using Dynamic Programming for Optimal Meal Planning
- Modelling Wildfire Evacuation using MDPs
- Comparing Multimodal Representations for Robotic Reinforcement Learning Tasks
- Applying Reinforcement Learning to Packet Routing in Mesh Networks
- Xs & Os: Creating a Tic-Tac-Toe Foe
- Doggo Does a Backflip: Deep Reinforcement Learning on a Quadruped Platform
- GrocerAI: Using Reinforcement Learning to Optimize Supermarket Purchases
- Reinforcement Learning For The Buying and Selling of Financial Assets
- Towards Designing a Policy on Automotive GPS Integrity
- Generalized Kinetic Component Analysis
- Trading Wheat Futures Contracts Using PCR, Neural Networks, and Reinforcement Learning
- Reinforcement Learning for Inverted Pendulums
- Electric Vehicle Charging under Uncertainty
- Automatic Accompaniment Generator: An MDP Approach
- Comparison of Methods in Artificial Life
- Modeling a Better Visual Acuity Test
- Online Methods Applied to the Game of Euchre
- Missile Defense Strategy: Towards Optimal Interceptor Allocation
- Smart Charging of Electric Vehicles under State Uncertainty
- Learning to Play Atari Breakout Using Deep Q-Learning and Variants
- Decision making on fault-code
- Learning FlappyBird with Deep Q-Networks
- A Fresh Start: Using Reinforcement Learning to Minimize Food Waste and Stock-Outs in Supermarkets
- Autonomous orbital maneuvering using reinforcement learning
- Autonomous Decision-Making for Space Debris Avoidance
- Maximizing Monthly Expenditures Under Uncertainty
- Modeling Voter Preferences in US General Elections
- Application of Reinforcement Learning to the Path Planning with Dynamic Obstacles
- A Decision Making framework for Medical Diagnostics
- Learning to Walk Using Deep RL
- Q(λ)-Learning with Boltzmann and ε-greedy Exploration Applied to a Race Car Simulation
- Reinforcement learning for Glassy/Phase Transitions
- Proximal Policy Optimization in Julia
- University Technology Patent and License Decisions: Open- versus Closed-Loop Planning in a Markov Decision Process
- A Policy Gradient Approach for Continuous-Time Control of Spacecraft Manipulator Systems
- Applying Techniques in Reinforcement Learning to Motion Planning in Redundant Robotic Manipulators
- Deep Q-Learning for Atari Pong
- Adversarial Curiosity for Model-Based Reinforcement Learning
- A Markov Decision Process Approach to Home Energy Management with Integrated Storage
- Using Maximum Likelihood Model-Based Reinforcement Learning to Play Skull
- Cryptocurrency Trading Strategy with Deep Reinforcement Learning
- Evaluating Multisense Word Embeddings Final Report
- Near-Earth Object (NEO) Deflection via POMDP
- Reinforcement Learning for Car Driving
- Reinforcement Learning for Automatic Wheel Alignment
- Julia Implementation of Trust Region Policy Optimization
- Deep Reinforcement Learning with Target and Prediction Networks
- Playing Tower Defense with Reinforcement Learning
- Q-Learning agent as a portfolio trader
- Multi-Robot Rendezvous from Indoor Acoustics
- Portfolio Asset Allocation using Reinforcement Learning
- Creating a 2048 AI Solver using Expectimax
- Robustness of Reinforcement Learning Based Communication Networks in Multi-Modal Multi-Step Games to Input Based Adversarial Attacks
- Deep Q-Learning with Nearest Neighbors in Sequential Decision-Making for Sepsis Treatment
- Positioning Archival Radar Data with a Particle Filter
- Reinforcement Learning for Atari Skiing
- Understanding Donations with Reinforcement Learning
- Known and Unknown Discrete Space Exploration Using Deep Q-Learning
- Speeding Up Reinforcement Learning with Imitation
- Learning Bandwidth-Limited Communication in Decentralized Multi-agent Reinforcement Learning
Fall 2017
- 2048 as a MDP
- A Computational Approach to Employee Resource Allocation between Multiple Projects
- Accelerated Training of Deep Q Learning Models for Atari Games
- AlphaOthello: Developing an Othello player through Reinforcement Learning on Deep Neural Networks
- An Online Approach to Energy Storage Management Optimization
- An Optimal Basketball Foul Strategy by Value Iteration
- Annealed Reward Functions in Continuous Control Reinforcement Learning
- Applications of Inverse Reinforcement Learning for Multi-Feature Path Planning
- Attributing Authorship in the Case of Anonymous Supreme Court Opinions Utilizing SVMs and Probabilistic Inference on Score Uncertainty
- Balancing Safety and Performance in Imitation Learning
- Baseball Pitch Calling as a Markov Decision Process
- Batch Reinforcement Learning Technological Investment Strategies Utilizing The Contingent Effectiveness Model In A Markov Decision Process
- Bayesian Learning of Image Transformations from User Preferences for Individualized Automatic Filters
- BetaMiniMax: An Agent for Cheat
- Building a Game Agent to Play Resistance
- Building Trust in Autonomy: Sharing Future Behavior in Reinforcement Learning
- Car racing with low dimensional input features
- Comparison of Classical Control Methods and POMDPs for 3D Motion Control
- Control of a Partially-Observable Linear Actuator
- DDQN Learning for 2048
- Deep Q-learning in OpenAI gym
- Deep Q-Learning with Target Networks for Game Playing
- Design of A Planning Machinery for Choosing an NBA Team’s Play Style Strategy
- Detecting Human from Image with Double DQN
- Determining the Optimal Betting Policy: World Series
- Disrupting Distributed Consensus (or Not) Using Reinforcement Learning
- Dominating Dominoes
- Double A3C: Deep Reinforcement Learning on OpenAI Gym Games
- Emergent Language in Multi-Agent Co-operative Reinforcement Learning
- Explore the Frontier of Safe Imitation Learning for Autonomous Driving
- Fast Operation of Coordinated Distributed Energy Resources without Network Models using Deep Deterministic Policy Gradients
- Faster Algorithms for Contextual Combinatorial Cascading Bandits
- Finding a Scent Source with a Soft Growing Robot Using Monte Carlo Tree Search
- Gaming Bitcoin Leveraging Model-Based Reinforcement Learning
- Get Ready for Demand Response
- GlideAI: Optimizing Soaring Strategy with Reinforcement Learning
- Grid Stability Management and Price Arbitrage for Distributed Energy Storage and Generation via Reinforcement Learning
- Guiding the management of sepsis with deep reinforcement learning
- HMMs for Prediction of High-Cost Failures
- Integrating Mini-Model Evidence into Policy Evaluation
- Investigating Parametric Insurance Models As Multi-Variable Decision Networks
- Learning an Optimal Policy for Police Resource Allocation on Freeways
- Learning Terminal Airspace Models from TSAA Data
- Learning the Education System
- Learning the Policy of the Policy: Deep Reinforcement Learning with Model-Based Feedback Controllers
- Learning to Play a Simplified Version of Monopoly Using Multi-Agent SARSA
- Learning to Play Othello Without Human Knowledge
- Limbed Robot Motion Control through Online Reinforcement Learning
- Linear Approximation Q-Learning to Learn Movement in a 2D Space
- Locally Optimal Risk Aware Path Planning
- Massively Parallel Reinforcement Learning Using Trust Region Policy Optimization
- Model-Free Learning of Casino Blackjack
- Model-Free Reinforcement Learning of a Modified Helicopter Game
- Model-Free Reinforcement Learning on Flappy Bird
- Modeling Disaster Evacuation Paths
- Modeling Flight Delay and Cancelation
- Modeling NBA Matchups
- Modeling Optimally Efficient Earth to Earth Flight Trajectories in Kerbal Space Program with Reinforcement Learning
- Modeling Real Estate Investment with Deep Reinforcement Learning
- Multi-Agent Cooperative Language Learning
- Multi-armed Bandits with Unobserved Confounders
- Multidisciplinary Design Optimization for Approximating Unsteady Aerodynamics of Flexible Aircraft Structures
- Navigating Chaos: Autonomous Driving in a Highly Stochastic Environment
- Optimal Flight Itineraries Under Uncertainty Using a Stochastic Markov Decision Process
- Optimal Strategy for Two-Player Incremental Classification Games Under Non-Traditional Reward Mechanisms
- Optimizing sequential time-lapse seismic data collection using a POMDP
- Personal Portfolio Asset Allocation as An MDP Problem
- Planetary Lander with Limited Sensor Information and Topographical Uncertainty
- Playing Flappy Bird Using Deep Reinforcement Learning
- POMDP and MDP for Underwater Navigation
- POMDP Modeling of a Simulated Automatic Faucet for Cognitive State and Task Recognition
- Portfolio Management
- Power Grid real time optimization using Q-Learning
- Predicting Congressional Voting Behavior and Party Affiliation using Machine Learning
- Predicting Income From OkCupid Profiles
- Predicting NBA Game Outcomes using POMDPs
- Predicting Subjective Sleep Quality
- Preparation of Papers for AIAA Technical Journals
- Pursuit-Evasion Game with an Agent Unaware of its Role
- Rapid Reinforcement Learning by Injecting Stochasticity into Bellman
- Real Time Collision Detection and Identification for Robotic Manipulators
- Reinforcement Learning Applied to Quadcopter Hovering
- Reinforcement Learning Approaches to Pathfinding in Stochastic Environments
- Reinforcement Learning For A Reach-Avoid Game
- Reinforcement Learning for Atari Breakout
- Reinforcement Learning for Crypto-Currency Arbitrage Bot
- Reinforcement Learning for Precision Landing of a Physically Simulated Rocket
- Reinforcement learning in an online multiplayer game
- Reinforcement Learning of Blackjack Variants
- Reinforcement training of nonlinear reduced order models
- Reward Shaping with Dynamic Guidance to Accelerate Learning for Multi-Agent Systems
- Risk – Bayesian World Conquest
- Roboat: Reinforcement of Boat’s Optimal Adaptive Trajectory
- Robotic Arm Motion Planning Based on Reinforcement Learning
- Robotic Decision Making Under Uncertainty
- Sensor Selection for Energy-Efficient Path Planning Under Uncertainty
- ShAIkespeare: Generating Poetry with Reinforcement Learning and Factor Graphs
- Shared Policies in Aircraft Avoidance
- Simulated Autonomous Driving with Deep Reinforcement Learning
- Simulating Coverage Path Planning with Roomba
- SLAMming into Obstacles: Simultaneous Localization and Mapping the Path of a Turtlebot
- Smart Health Coach: Using Markov Decision Processes to Optimize Health Advising Strategies
- Smarter Queues by Reinforcement Learning
- Solving Real-world Oil Drilling Problem with Multi-Armed Bandit and POMDP Models
- Stay in Your Lane: Probabilistic Vehicular Automation for DIY Robocars
- Supervised Learning and Reinforcement Learning for Algorithmic Trading
- Taking Out the GaRLbage
- Terrain Relative Navigation and Path Planning for Planetary Rovers
- Time-Constrained Sample Retrieval in a Martian Gridworld with Unknown Terrain
- Trade-offs in Connect Four Game-Playing Agents
- Training an Intelligent Driver on Highway Using Reinforcement Learning
- UAV Collision Avoidance Using Neural Network-Assisted Q-Learning
- Understanding Limitations of Network Meta-analysis Approaches to Rank Effectiveness of Treatments
- Using Bayesian Networks to Impute Missing Data
- Using Bayesian Networks to Predict Credit Card Default
- Using Bayesian Networks to Understand and Predict Wildfires
- Using Classification Models to Represent and Predict Students’ Restaurant Preferences
- Using Q-Learning to Optimize Lunar Lander Game Play
- Using the QMDP Method to Determine an Open Ocean Fishing Policy
- Utilizing Fundamental Factors in Reinforcement Learning for Active Portfolio Management
Fall 2016
- Model-Free Reinforcement Learning of Blackjack
- Partially Observable Actions in Solving Markov Decision Processes. The Case for Insulin Dosing Optimization in Diabetic Patients.
- Using Monte-Carlo Tree Search to Solve the Board Game Hive
- Blackjack: How to use MDP’s to (nearly) beat the house
- Cancer Metabolism Mapping: Bayesian Networks and Network Learning Techniques to Understand Cancer Metabolic and Regulatory Pathways
- Gibbs Sampling in BayesNets.jl
- UAV Collision Avoidance Policy Optimization with Deep Reinforcement Learning
- Improving Training Efficiency in Deep Q-Learning for Atari Breakout
- Monitoring Machine Workload Distribution with Kalman Filter
- Approximating Transition Functions to Cart Track MDPs via Sub-State Sampling
- Approaching Quantitative Trading with Machine Learning
- Structure and Parameter Learning in Bayesian Networks with Applications to Predicting Breast Cancer Tumor Malignancy in a Lower Dimension Feature Space
- Autonomous Racing by Learning from Professionals
- Bravo Zulu: Optimizing Teammate Selection for Military and Civilian Applications
- Investigating Transfer Learning in Deep Reinforcement Learning Context
- Simultaneous Estimation and Control with MCTS
- Controlling Soft Robots with POMCP
- Automatic Learning of Computer Users’ Habits
- Learning to Play Soccer in the OpenAI Gym
- Playing Ultimate Tic-Tac-Toe with TD Learning and Monte Carlo Tree Search
- A Bayesian Network Model of Pilot Response to TCAS Resolution Advisories
- Improving Head Impact Kinematics Measurement Accuracy using Sensor Fusion
- Drive Decision Making at Intersections
- Deterministic and Bayesian Techniques for Spaceborne Vision-Based Non-Stellar Object Detection
- A Two-Phased Deep Reinforcement Learning Algorithm for High-Frequency Trading
- Implementation and Experimentation of a DQN solver in Julia for POMDPs.jl
- Landing on the Moon
- Deserted Island: Cooperative Behavior in Absence of Explicit Delayed Reward
- DeepGo.py
- Managing Groundwater under Uncertain Seasonal Recharge
- Using Reinforcement Learning to Find Flaws in Collision Avoidance Systems
- Effectiveness of Bayesian Networks in Building a Prediction Model for Movie Success
- Data Driven Agent based on Aircraft Intent
- Deep Q-Learning with Natural Gradients
- A Shot in the Dark: Beating Battleship with POMCP
- Accelerated Asynchronous Deep Reinforcement Learning Variant of Advantage Actor-Critic
- Applying Reinforcement Learning and Online Methods on the Inverted Pendulum Problem
- Predicting Sentiment with Deep Q-Learning
- A Lookahead Strategy for Super-Level Set Estimation using Gaussian Processes
- Modeling Breast Cancer Treatment as a Markov Decision Process
- Learning 31 using Cross-Entropy Methods
- Improving Haptic Guidance using Reinforcement Learning
- NLPLab: Actor-Critic Training in Natural Language Processing
- Deep Reinforcement Learning on Atari Breakout
- Reinforcement Learning for LunarLander
- Reinforcement Learning for AI Machine Playing Hearthstone
- Using Deep Q-Learning to Automate CNN Training
- Automatic Continuous Variable Encoder in Bayesian Network
- Side Channel Analysis using Neural Networks and Random Forests
- A Decision-Making System for Wildfire Management
- Decentralized Game Theoretic Methods for the Distributed Graph Coverage Problem
- Autonomous altitude control for high altitude balloons
- Neural Network Arbitration for Better Time and Accuracy trade-offs
- Deep Deterministic Policy Gradient with Robot Soccer
- Towards a Personal Decision Support System
- Optimal Gerrymandering under Uncertainty
- The Ambulance Dilemma: Crossing an Intersection with Monte Carlo Tree Search
- DeepDominion
- Developmental Policy Design: an MDP approach
- Training of a craps betting strategy with Reinforcement Learning Techniques
- Engineering a Better Monkey
- Decision Making During a Bicycle Race
- Using Discrete Pressure Measurements to Understand Subsonic Bluff-Body Dynamic Damping
- Effective Move Selection in Chess Without Lookahead Search
- Solving Texas Hold’em with Monte-Carlo Planning
- Reinforcement Learning of High-Speed Autonomous Driving through Unknown Map
- Implementation and deployment of particle filter for simulated and real-world localization tasks
- Tree Augmented Naive Bayes and Backward Simulation
- Transfer of Q values across tasks in Reinforcement Learning
- Training Regime Modifications for Deep Q-Network Learning Acceleration
- Reinforce Optimizer
- Approximating Ligand Docking Using a Markov Decision Process
- Breaking Down Social Media Filter Bubbles via Reinforcement Learning
- Performing an N-Sentiment Classification Task on Tinder Profiles Based On Image Feature Extraction
- Play Blackjack With Monte Carlo Simulation And Q-learning with Linear Regression
- Observer-Actor Neural Networks for Self-Play in Imperfect Information Games
- Using Hybrid Bayes Nets to Model Country Prosperity
- Solving a Pandemic! Various Approaches for Tackling the Board Game
- Improved Markov Decision Process Model for Resource Allocation in Disaster Scenarios
- Learning Chess through Reinforcement Learning
- Deep Reinforcement Learning For Continuous Control: An Investigation of Techniques and Tricks
- Computer Vision Through Perception: Semantic Understanding of Novel Scenes through Data Programming
- Path Planning for Insertion of a Bevel-Tip Needle
- Modeling human biases through reinforcement learning
- Bootstrapping Neural Network with Auxiliary Tasks
- Q-Learning Application in Optimizing Pokémon Battle Strategy
- Model-based exploration in natural language generation
- Automated Aircraft Touchdown
- Longitudinal Vehicle Control using a Markov Decision Process and Deep Neural Network
- MOMDP-based Aerial Target Search Optimization
- Greedy Thick-Thinning Structure Learning and Bayesian Network Conditional Independence Implementations in BayesNets.jl
- Multiagent Planning For Aerial Broadband Internet
- Viral Marketing as an MDP
- Neural Soccer – Towards Exploration by the Pursuit of Novelty
- Locally Weighted Value Iteration in Julia
- Fully-Nested Interactive POMDPs for Partially-Observable Turn-Based Games
- Optimal Policy Considerations for Gas Turbine Maintenance
- Learning Optimal Manipulation of Food Webs
- Estimating Resource Prospector’s Probability of Failure Using Importance Sampling and Cross Entropy
- Dynamically Discount Deep Reinforcement Learning
- Deep Reinforcement Learning: Accelerated Learning with Effective Gradient Ascent Optimization Algorithms
- Autonomous Human Tracking in Simulated Environment
- A LQG Library for POMDPs.jl
Fall 2015
- Mars Hab-Bot: Using MDPs to simulate a robot constructing human-livable habitats on Mars
- A Value Iteration Study of BlackJack
- Optimized Store-Stocking via Monte Carlo Tree Search with Stochastic Rewards
- Trajectory Planning for Map Exploration Using Terrain Features
- Instruction Following with Deep Reinforcement Learning
- Using Markov Decision Processes to Minimize Golf Score
- Reinforcement Learning for Scheduling I/O Requests on Multi-Queue Flash Storage
- Finding the Perfect ‘Job’ in resource allocation
- Maximizing Influence in Social Networks
- A Machine Learning Regression Approach to General Game Playing
- Modeling GPS Spoofing as a Partially Observable Markov Decision Process
- Travel Hacking with MDPs
- Optimal Mission Planning for a Satellite-Based Particle Detector via Online Reinforcement Learning
- An MDP Approach to Motion Planning for Hopping Rovers on Small Solar System Bodies
- Sampling Strategies for Deep Reinforcement Learning
- Descriptive Power of Bayesian Structure Learning in Stock Market
- Large-Scale Traffic Grid Signal Control Using Fuzzy Logic and Decentralized Reinforcement Learning
- Simulated Pedestrian-like Navigation with a 1D Kalman Filter with an Accelerometer and the Global Positioning System
- Search and Track Tradeoff for Multifunction Radars
- Play Calling in American Football Using Value Iteration
- Reinforcement learning for commodity trading
- Learning the Stock Market, a Naive Approach
- A POMDP Framework for Modelling Robotic Guidance During a Tissue Palpation Task
- Toy Helicopter Control via Deep Reinforcement Learning
- Gas Refuelling Optimization Modelled as a Markov Decision Process
- Q-Matrix and Policy Compression via Deep Learning
- Augmenting Self-Learning In Chess Through Expert Imitation
- Monte Carlo Tree Search Applied to a Variant of the RockSample Problem
- Supply Chain Management using POMDPs
- Online Markov Decision Process Framework for Modeling Efficient Home Robot Cleaners
- Reinforcement Learning for Path Planning with Soft Robotic Manipulators
- Exploring POMDPS with Recurrent Neural Networks
- Tic-tac-toe with reinforcement learning: best strategies and influence of parameters
- Vehicle Speed Prediction using Long Short-Term Memory Networks
- Explorations on Learning Bayesian Networks
- Playing unknown game on a visual world
- Reinforcement Learning for Atari Games
- Q-learning in the Game of Mastermind
- Modeling of a Baseball Inning as MDP
- Autonomous Driving on a Multi-lane Highway Using POMDPs
- Solving a Maze Without Location Data
- Markov Decision Processes and Optimal Policy Determination for Street Parking
- Solving an opponent-based match-three mobile game
- Life begins as a POMDP: improving decision making in the IVF clinic
- Path Planning for Target-Tracking Unmanned Aerial Vehicle
- Discrete State Filter Implementation for a Battleships Artificial Intelligence
- POMDP for Search and Rescue with Obstacle Avoidance: Incorporation of Human in the Loop
- Application performance over cellular networks
- Solving Dudo: beating Liar’s Dice with a POMDP
- Reinforcement Learning for Tetris
- Robot Path Planning using Monte Carlo POMDP
- Reinforcement Learning of an Artificially Intelligent Hearts Player
- Enhancing Computational Efficiency of PILCO Model-based Reinforcement Learning Algorithm
- Analysis of UCT Exploration Parameter in Sailing Domain Problems
- Solving a Search and Rescue Planning problem with MOMDPs
- Robot Motion Planning in Unknown Environments using Monte Carlo Tree Search
- Delivery optimization of an on-demand delivery service
- Solving MultiAgent Decision Making using MDPs
- Efficient and Modular Inventory Management Framework for Small Businesses
- Markov Decision Processes in Board Game Playing
- Automated Model Selection via Gaussian Processes
- Predictive Hybrid Vehicle Control Policy
- Optimal Policies for In-Space Satellite Communications
- Spacecraft Navigation in Cluttered, Dynamic Environments Using 3D Lidar
- Playing Chess Endgames using Reinforcement Learning
- Space Debris Removal
- Relation Extraction from Scratch
- Lane Merging as a Markov Decision Process
- Using MDP/POMDP to Help in Search of Survivors of a Plane Crash
- Applying POMCP to Controlling Partially Observable Diffusion Processes
- Credit Risk Classification using Bayesian Network
Fall 2014
- Automating Air Traffic Management for Flight Arrivals
- Policy Learning for Sokoban
- Flight Path Optimization Under Constraints Using a Markov Decision Process Approach
- Visual Localization and POMDP for Autonomous Indoor Navigation
- Monte Carlo Tree Search for Online Learning in Golf Course Management
- Pushing on Leaves
- Beating 2048
- Improved electrical grid balancing with demand response scheduled by an MDP
- Multi-Fidelity Model Management in Engineering Design Optimization Using Partially Observable Markov Decision Processes
- Smarter Generators in Power Markets
- Beach Paddle Ball
- Applying POMDP to RockSample problem
- Targeting Hostile Vehicle Modeled as a Partially Observable Markov Decision Process with State-Dependent Observation Model
- Reinforcement Learning and Linear Gaussian Dynamics Applied to Multifidelity Optimization of a Supersonic Wedge
- Approximate POMDP Solutions for Short-Range UAV Traffic Conflict Resolution
- WorkSmart: The Implementation of a Modified Q-Learning Algorithm for an Intelligent Daily To-Do List Android Application
- Imminent Obstacle Avoidance with Friction Uncertainty
- Dynamic Restrictions during Commercial Space Vehicle Launches
- Autonomous Direct Marketing with Deep Q-Learning
- Efficient Risk Estimation for Chance-Constrained Robotic Motion Planning Under Uncertainty
- Probabilistic Aircraft Arrival Rate Prediction
- Audio Keylogging: Translating Acoustic Signals into Keystrokes
- Collision Avoidance for Small Multi-Rotor Aircraft using SARSA(λ) and Fourier Basis Functions
- Reinforcement Learning with Tetris
- Stock Market Reinforcement Learning
- Obstacle Avoidance for Automated Vehicle using Markov Decision Processes
- Control of Epidemics on a Graph
- Autonomous ATC for non-towered airports
- Path Planning for Terrain Relative Navigation using POMDPs
- Vehicle Braking Controller in a Markov Decision Process Framework
- Multi-Armed Bandit Heuristics for HTTP Denial-of-Service Attacks
- Structure Learning for Probabilistic Driving Models
- Casino Blackjack Modeled as a Markov Decision Process
- Competitive Collision Avoidance
- Efficient Sampling Of Protein Landscapes Via Markov Decision Processes
- Flight Deck Interval Management (An MDP Approach)
- BGT Model for Analysis of Head-On Collisions
- Collision Avoidance System Parameter Optimization
- Dynamic Demand Prediction and Routing for Autonomous Mobility-on-Demand Systems
- Action-Constrained, Multi-Species Task Scheduling: The Kayaker Problem
- Reinforcement Learning with Low-rank Matrix Factorization
- Automated Sequencing and Spacing of Arrival Aircraft in Final Vector Approach Airspace
- Exploring Policy Learning for Blackjack