Publications by Year

2019

  • The Utility of Sparse Representations for Control in Reinforcement Learning. Vincent Liu, Raksha Kumaraswamy, Lei Le and Martha White. AAAI Conference on Artificial Intelligence, 2019. [pdf]
  • Meta-descent for Online, Continual Prediction. Andrew Jacobsen, Matthew Schlegel, Cameron Linke, Thomas Degris, Adam White, Martha White. AAAI Conference on Artificial Intelligence, 2019. [pdf]

2018

  • An Off-policy Policy Gradient Theorem Using Emphatic Weightings. Ehsan Imani, Eric Graves and Martha White. Advances in Neural Information Processing Systems, 2018. [pdf]
  • Evaluating Predictive Knowledge. Alex Kearney, Anna Koop, Craig Sherstan, Richard S. Sutton, Patrick M. Pilarski, Matthew E. Taylor. 2018. [pdf]
  • Comparing Direct and Indirect Temporal-Difference Methods for Estimating the Variance of the Return. Craig Sherstan, Dylan R. Ashley, Brendan Bennett, Kenny J. Young, Adam White, Martha White, Richard S. Sutton. UAI, 2018. [pdf]
  • Directly Estimating the Variance of the λ-Return Using Temporal-Difference Methods. Craig Sherstan, Brendan Bennett, Kenny J. Young, Dylan R. Ashley, Adam White, Martha White, Richard S. Sutton. ArXiv, 2018. [pdf]
  • Two geometric input transformation methods for fast online reinforcement learning with neural nets. Sina Ghiassian, Huizhen Yu, Banafsheh Rafiee, Richard S. Sutton. ArXiv, 2018. [pdf]
  • Multi-Step Reinforcement Learning: A Unifying Algorithm. Kristopher De Asis, J. Fernando Hernandez-Garcia, G. Zacharias Holland, Richard S. Sutton. AAAI, 2018. [pdf]
  • TIDBD: Adapting Temporal-difference Step-sizes Through Stochastic Meta-descent. Alexandra Kearney, Vivek Veeriah, Jaden B. Travnik, Richard S. Sutton, Patrick M. Pilarski. ArXiv, 2018. [pdf]
  • Model-based Reinforcement Learning with Non-linear Expectation Models and Stochastic Environments. Yi Wan, Muhammad Zaheer, Martha White, Richard S. Sutton. 2018. [pdf]
  • Integrating Episodic Memory into a Reinforcement Learning Agent using Reservoir Sampling. Kenny J. Young, Richard S. Sutton, Shuo Yang. ArXiv, 2018. [pdf]
  • Reactive Reinforcement Learning in Asynchronous Environments. Jaden B. Travnik, Kory Wallace Mathewson, Richard S. Sutton, Patrick M. Pilarski. Front. Robot. AI, 2018. [pdf]
  • Predicting Periodicity with Temporal Difference Learning. Kristopher De Asis, Brendan Bennett, Richard S. Sutton. ArXiv, 2018. [pdf]
  • Per-decision Multi-step Temporal Difference Learning with Control Variates. Kristopher De Asis, Richard S. Sutton. UAI, 2018. [pdf]
  • Online Off-policy Prediction. Sina Ghiassian, Andrew Patterson, Martha White, Richard S. Sutton, Adam White. ArXiv, 2018. [pdf]
  • Smoothed Action Value Functions for Learning Gaussian Policies. Ofir Nachum, Mohammad Norouzi, George Tucker, Dale Schuurmans. ICML, 2018. [pdf]
  • Variational Rejection Sampling. Aditya Grover, Ramki Gummadi, Miguel Lázaro-Gredilla, Dale Schuurmans, Stefano Ermon. AISTATS, 2018. [pdf]
  • The Voice of the Heart: Vowel-Like Sound in Pulmonary Artery Hypertension. Mohamed Elgendi, Prashant Raviprakash Bobhate, Shreepal Ambalal Jain, Long Bao Guo, Jennifer M. Rutledge, Yashu Coe, Roger J. Zemp, Dale Schuurmans, Ian Adatia. Diseases, 2018. [pdf]
  • Planning and Learning with Stochastic Action Sets. Craig Boutilier, Alon Cohen, Amit Daniely, Avinatan Hassidim, Yishay Mansour, Ofer Meshi, Martin Mladenov, Dale Schuurmans. IJCAI, 2018. [pdf]
  • Kernel Exponential Family Estimation via Doubly Dual Embedding. Bo Dai, Hanjun Dai, Arthur Gretton, Liyuan Song, Dale Schuurmans, Niao He. ArXiv, 2018. [pdf]
  • Understanding the impact of entropy on policy optimization. Zafarali Ahmed, Nicolas Le Roux, Mohammad Norouzi, Dale Schuurmans. ArXiv, 2018. [pdf]
  • Online Learning to Rank with Features. Shuai Li, Tor Lattimore, Csaba Szepesvári. ArXiv, 2018. [pdf]
  • Bandits with Delayed, Aggregated Anonymous Feedback. Ciara Pike-Burke, Shipra Agrawal, Csaba Szepesvári, Steffen Grünewälder. ICML, 2018. [pdf]
  • Garbage In, Reward Out: Bootstrapping Exploration in Multi-Armed Bandits. Branislav Kveton, Csaba Szepesvári, Zheng Wen, Mohammad Ghavamzadeh, Tor Lattimore. ArXiv, 2018. [pdf]
  • PAC-Bayes bounds for stable algorithms with instance-dependent priors. Omar Rivasplata, Emilio Parrado-Hernández, John Shawe-Taylor, Shiliang Sun, Csaba Szepesvári. NeurIPS, 2018. [pdf]
  • Regret Bounds for Model-Free Linear Quadratic Control. Yasin Abbasi-Yadkori, Nevena Lazic, Csaba Szepesvári. ArXiv, 2018. [pdf]
  • LeapsAndBounds: A Method for Approximately Optimal Algorithm Configuration. Gellért Weisz, András György, Csaba Szepesvári. ICML, 2018. [pdf]
  • BubbleRank: Safe Online Learning to Rerank. Branislav Kveton, Chang Li, Tor Lattimore, Ilya Markov, Maarten de Rijke, Csaba Szepesvári, Masrour Zoghi. ArXiv, 2018. [pdf]
  • Linear Stochastic Approximation: How Far Does Constant Step-Size and Iterate Averaging Go?. Chandrashekar Lakshminarayanan, Csaba Szepesvári. AISTATS, 2018. [pdf]
  • Gradient Descent for Sparse Rank-One Matrix Completion for Crowd-Sourced Aggregation of Sparsely Interacting Workers. Yao Ma, Alexander Olshevsky, Csaba Szepesvári, Venkatesh Saligrama. ICML, 2018. [pdf]
  • An Exponential Tail Bound for Lq Stable Learning Rules. Application to k-Folds Cross-Validation. Karim T. Abou-Moustafa, Csaba Szepesvári. ISAIM, 2018. [pdf]
  • Cleaning up the neighborhood: A full classification for adversarial partial monitoring. Tor Lattimore, Csaba Szepesvári. ArXiv, 2018. [pdf]
  • Model-Free Linear Quadratic Control via Reduction to Expert Prediction. Yasin Abbasi-Yadkori, Nevena Lazic, Csaba Szepesvári. 2018. [pdf]
  • Rigorous Agent Evaluation: An Adversarial Approach to Uncover Catastrophic Failures. Jonathan Uesato, Ananya Kumar, Csaba Szepesvári, Tom Erez, Avraham Ruderman, Keith Anderson, Krishnamurthy Dvijotham, Nicolas Heess, Pushmeet Kohli. ArXiv, 2018. [pdf]
  • Hoeffding Bounds vs. Empirical. Volodymyr Mnih, Csaba Szepesvári. 2018. [pdf]
  • TopRank: A practical algorithm for online stochastic ranking. Tor Lattimore, Branislav Kveton, Shuai Li, Csaba Szepesvári. NeurIPS, 2018. [pdf]
  • Monitoring food pathogens: Novel instrumentation for cassette PCR testing. Darin Hunt, Curtis Figley, Dammika P Manage, Jana Lauzon, Rachel Figley, Linda M. Pilarski, Lynn M. McMullen, Patrick M. Pilarski. PloS one, 2018. [pdf]
  • Characterization of normative hand movements during two functional upper limb tasks. Aïda M. Valevicius, Quinn A. Boser, Ewen B. Lavoie, Glyn Murgatroyd, Patrick M. Pilarski, Craig S. Chapman, Albert H. Vette, Jacqueline S. Hebert. PloS one, 2018. [pdf]
  • Predictions, Surprise, and Predictions of Surprise in General Value Function Architectures. Johannes Gunther, Alex Kearney, Michael Rory Dawson, Craig Sherstan, Patrick M. Pilarski. 2018. [pdf]
  • Incrementally Added GVFs are Learned Faster with the Successor Representation. Craig Sherstan, Marlos C. Machado, Patrick M. Pilarski. 2018. [pdf]
  • Cluster-based upper body marker models for three-dimensional kinematic analysis: Comparison with an anatomical model and reliability analysis. Quinn A. Boser, Aïda M. Valevicius, Ewen B. Lavoie, Craig S. Chapman, Patrick M. Pilarski, Jacqueline S. Hebert, Albert H. Vette. Journal of biomechanics, 2018. [pdf]
  • Using synchronized eye and motion tracking to determine high-precision eye-movement patterns during object-interaction tasks. Ewen B. Lavoie, Aïda M. Valevicius, Quinn A. Boser, Ognjen Kovic, Albert H. Vette, Patrick M. Pilarski, Jacqueline S. Hebert, Craig S. Chapman. Journal of vision, 2018. [pdf]
  • Accelerating Learning in Constructive Predictive Frameworks with the Successor Representation. Craig Sherstan, Marlos C. Machado, Patrick M. Pilarski. 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2018. [pdf]
  • Preliminary Testing of a Telerobotic Haptic System and Analysis of Visual Attention During a Playful Activity. Fernanda Aparecida Heleno Batista, Takeshi Matsuda, Mahdi Tavakoli, Patrick M. Pilarski, Kim Adams. 2018 7th IEEE International Conference on Biomedical Robotics and Biomechatronics (BioRob), 2018. [pdf]
  • Context-Aware Learning from Demonstration: Using Camera Data to Support the Synergistic Control of a Multi-Joint Prosthetic Arm. Gautham Vasan, Patrick M. Pilarski. 2018 7th IEEE International Conference on Biomedical Robotics and Biomechatronics (BioRob), 2018. [pdf]
  • Initial Investigation of a Self-Adjusting Wrist Control System to Maintain Prosthesis Terminal Device Orientation Relative to the Ground Reference Frame. Jintao Shen, Michael Rory Dawson, Glyn Murgatroyd, Jason P. Carey, Patrick M. Pilarski. 2018 7th IEEE International Conference on Biomedical Robotics and Biomechatronics (BioRob), 2018. [pdf]
  • Generalizing Value Estimation over Timescale. Craig Sherstan, James MacGlashan, Patrick M. Pilarski. 2018. [pdf]
  • Design and Implementation of a Two-Channel Interleaved Vienna-Type Rectifier With >99% Efficiency. Qiong Wang, Xuning Zhang, Rolando Burgos, Dushan Boroyevich, Adam M. White, Mustansir H. Kheraluwala. IEEE Transactions on Power Electronics, 2018. [pdf]
  • Context-dependent upper-confidence bounds for directed exploration. Raksha Kumaraswamy, Matthew Schlegel, Adam White, Martha White. NeurIPS, 2018. [pdf]
  • The Barbados 2018 List of Open Issues in Continual Learning. Tom Schaul, Hado van Hasselt, Joseph Modayil, Martha White, Adam White, Pierre-Luc Bacon, Jean Harb, Shibl Mourad, Marc G. Bellemare, Doina Precup. ArXiv, 2018. [pdf]
  • Organizing Experience: A Deeper Look at Replay Mechanisms for Sample-based Planning in Continuous State Domains. Yangchen Pan, Muhammad Zaheer, Adam White, Andrew Patterson, Martha White. IJCAI, 2018. [pdf]
  • General Value Function Networks. Matthew Schlegel, Adam White, Andrew Patterson, Martha White. ArXiv, 2018. [pdf]
  • Reinforcement Learning with Function-Valued Action Spaces for Partial Differential Equation Control. Yangchen Pan, Amir-massoud Farahmand, Martha White, Saleh Nabi, Piyush Grover, Daniel Nikovski. ICML, 2018. [pdf]
  • Can E-Cigarettes and Pharmaceutical Aids Increase Smoking Cessation and Reduce Cigarette Consumption? Findings From a Nationally Representative Cohort of American Smokers. Tarik Benmarhnia, John P Pierce, Eric C. Leas, Martha White, David R. Strong, Madison L. Noble, Dennis R. Trinidad. American journal of epidemiology, 2018. [pdf]
  • Correction: Income disparities in smoking cessation and the diffusion of smoke-free homes among U.S. smokers: Results from two longitudinal surveys. Maya Vijayaraghavan, Tarik Benmarhnia, John P Pierce, Martha White, Jennie Kempster, Yuyan Shi, Dennis R. Trinidad, Karen Messer. PloS one, 2018. [pdf]
  • Tobacco control in California compared with the rest of the USA: trends in adult per capita cigarette consumption. John P Pierce, Yuyan Shi, Erik M. Hendrickson, Martha White, Madison L. Noble, Sheila Kealey, David R. Strong, Dennis R. Trinidad, Anne Marcia Hartman, Karen S Messer. Tobacco control, 2018. [pdf]
  • Income disparities in smoking cessation and the diffusion of smoke-free homes among U.S. smokers: Results from two longitudinal surveys. Maya Vijayaraghavan, Tarik Benmarhnia, John P Pierce, Martha White, Jennie Kempster, Yuyan Shi, Dennis R. Trinidad, Karen S Messer. PloS one, 2018. [pdf]
  • Improving Regression Performance with Distributional Losses. Ehsan Imani, Martha White. ICML, 2018. [pdf]
  • Actor-Expert: A Framework for using Action-Value Methods in Continuous Action Spaces. Sungsu Lim, Ajin Joseph, Lei Le, Yangchen Pan, Martha White. ArXiv, 2018. [pdf]
  • Association Between Receptivity to Tobacco Advertising and Progression to Tobacco Use in Youth and Young Adults in the PATH Study. John P Pierce, James Sargent, David B. Portnoy, Martha White, Madison L. Noble, Sheila Kealey, Nicolette Borek, Charles P Carusi, Kelvin Choi, Victoria R Green, Annette R Kaufman, Eric C. Leas, Michael J. Lewis, Katherine A. Margolis, Karen S Messer, Yuyan Shi, Marushka L Silveira, Kimberly Snyder, Cassandra A Stanton, Susanne E. Tanski, Maansi Bansal-Travers, Dennis R. Trinidad, Andrew J. Hyland. JAMA pediatrics, 2018. [pdf]
  • Effectiveness of Pharmaceutical Smoking Cessation Aids in a Nationally Representative Cohort of American Smokers. Eric C. Leas, John P Pierce, Tarik Benmarhnia, Martha White, Madison L. Noble, Dennis R. Trinidad, David R. Strong. Journal of the National Cancer Institute, 2018. [pdf]
  • High-confidence error estimates for learned value functions. Touqir Sajed, Wesley Chung, Martha White. UAI, 2018. [pdf]
  • Trends in lung cancer and cigarette smoking: California compared to the rest of the United States. John P Pierce, Yuyan Shi, Sara B McMenamin, Tarik Benmarhnia, Dennis R. Trinidad, David R. Strong, Martha White, Sheila Kealey, Erik M. Hendrickson, Matthew D. Stone, Adriana Villaseñor, S. L. Kwong, Xueying Zhang, Karen Messer. Cancer prevention research, 2018. [pdf]
  • Revisiting the Arcade Learning Environment: Evaluation Protocols and Open Problems for General Agents. Marlos C. Machado, Marc G. Bellemare, Erik Talvitie, Joel Veness, Matthew J. Hausknecht, Michael H. Bowling. J. Artif. Intell. Res., 2018. [pdf]
  • Actor-Critic Policy Optimization in Partially Observable Multiagent Environments. Sriram Srinivasan, Marc Lanctot, Vinícius Flores Zambaldi, Julien Pérolat, Karl Tuyls, Rémi Munos, Michael H. Bowling. NeurIPS, 2018. [pdf]
  • The Effect of Planning Shape on Dyna-style Planning in High-dimensional State Spaces. G. Zacharias Holland, Erik Talvitie, Michael H. Bowling. ArXiv, 2018. [pdf]
  • Count-Based Exploration with the Successor Representation. Marlos C. Machado, Marc G. Bellemare, Michael H. Bowling. ArXiv, 2018. [pdf]
  • Revisiting the Arcade Learning Environment: Evaluation Protocols and Open Problems for General Agents. Marlos C. Machado, Marc G. Bellemare, Erik Talvitie, Joel Veness, Matthew J. Hausknecht, Michael H. Bowling. IJCAI, 2018. [pdf]
  • Solving Large Extensive-Form Games with Strategy Constraints. Trevor Davis, Kevin Waugh, Michael H. Bowling. ArXiv, 2018. [pdf]
  • Variance Reduction in Monte Carlo Counterfactual Regret Minimization (VR-MCCFR) for Extensive Form Games using Baselines. Martin Schmid, Neil Burch, Marc Lanctot, Matej Moravcik, Rudolf Kadlec, Michael H. Bowling. ArXiv, 2018. [pdf]
  • Generalization and Regularization in DQN. Jesse Farebrother, Marlos C. Machado, Michael H. Bowling. ArXiv, 2018. [pdf]
  • Supervised autoencoders: Improving generalization performance with unsupervised regularizers. Lei Le, Andrew Patterson and Martha White. Advances in Neural Information Processing Systems (NIPS), 2018. [pdf]

2017

  • Some Recent Applications of Reinforcement Learning. Andrew G. Barto, Philip S. Thomas, Richard S. Sutton. 2017. [pdf]
  • A Deeper Look at Experience Replay. Shangtong Zhang, Richard S. Sutton. ArXiv, 2017. [pdf]
  • Crossprop: Learning Representations by Stochastic Meta-Gradient Descent in Neural Networks. Vivek Veeriah, Shangtong Zhang, Richard S. Sutton. ECML/PKDD, 2017. [pdf]
  • On Generalized Bellman Equations and Temporal-Difference Learning. Huizhen Yu, Ashique Rupam Mahmood, Richard S. Sutton. Journal of Machine Learning Research, 2017. [pdf]
  • P552 An online educational portal improves concerns of inflammatory bowel disease patients regarding pregnancy and medication. Richard S. Sutton, Kelsey Wierstra, Lucio D'Ambrosio, Levinus Albert Dieleman, Richard N Fedorak, Brendan Halloran, Karen Ivy Kroeker, Karen Wong, K-A Berga, Vivian Wai-Mei Huang. 2017. [pdf]
  • Integral Policy Iterations for Reinforcement Learning Problems in Continuous Time and Space. Jae Young Lee, Richard S. Sutton. ArXiv, 2017. [pdf]
  • A First Empirical Study of Emphatic Temporal Difference Learning. Sina Ghiassian, Banafsheh Rafiee, Richard S. Sutton. ArXiv, 2017. [pdf]
  • Multi-step Off-policy Learning Without Importance Sampling Ratios. Ashique Rupam Mahmood, Huizhen Yu, Richard S. Sutton. ArXiv, 2017. [pdf]
  • GQ(λ) Quick Reference and Implementation Guide. Adam White, Richard S. Sutton. ArXiv, 2017. [pdf]
  • Forward Actor-Critic for Nonlinear Function Approximation in Reinforcement Learning. Vivek Veeriah, Harm van Seijen, Richard S. Sutton. AAMAS, 2017. [pdf]
  • Associative Learning from Replayed Experience. Elliot A. Ludvig, Mahdieh S. Mirian, E. James Kehoe, Richard S. Sutton. 2017. [pdf]
  • Communicative Capital for Prosthetic Agents. Patrick M. Pilarski, Richard S. Sutton, Kory Wallace Mathewson, Craig Sherstan, Adam S. R. Parker, Ann L. Edwards. ArXiv, 2017. [pdf]
  • Generalized Conditional Gradient for Sparse Estimation. Yaoliang Yu, Xinhua Zhang, Dale Schuurmans. Journal of Machine Learning Research, 2017. [pdf]
  • Formalizing Anthropomorphism Through Games: A Study in Deep Neural Networks. Martin A. Zinkevich, Dale Schuurmans. AAAI Workshops, 2017. [pdf]
  • Resampled Proposal Distributions for Variational Inference and Learning. Aditya Grover, Ramki Gummadi, Miguel Lázaro-Gredilla, Dale Schuurmans, Stefano Ermon. 2017. [pdf]
  • Logistic Markov Decision Processes. Martin Mladenov, Craig Boutilier, Dale Schuurmans, Ofer Meshi, Gal Elidan, Tyler Lu. IJCAI, 2017. [pdf]
  • Holographic Feature Representations of Deep Networks. Martin A. Zinkevich, Alex Davies, Dale Schuurmans. UAI, 2017. [pdf]
  • Bridging the Gap Between Value and Policy Based Reinforcement Learning. Ofir Nachum, Mohammad Norouzi, Kelvin Xu, Dale Schuurmans. NIPS, 2017. [pdf]
  • Multi-view Matrix Factorization for Linear Dynamical System Estimation. Mahdi Karami, Martha White, Dale Schuurmans, Csaba Szepesvári. NIPS, 2017. [pdf]
  • Trust-PCL: An Off-Policy Trust Region Method for Continuous Control. Ofir Nachum, Mohammad Norouzi, Kelvin Xu, Dale Schuurmans. ArXiv, 2017. [pdf]
  • Safe Exploration for Identifying Linear Systems via Robust Optimization. Tyler Lu, Martin Zinkevich, Craig Boutilier, Binz Roy, Dale Schuurmans. ArXiv, 2017. [pdf]
  • Unsupervised Sequential Sensor Acquisition. Manjesh Kumar Hanawal, Csaba Szepesvári, Venkatesh Saligrama. AISTATS, 2017. [pdf]
  • Bandits with Delayed Anonymous Feedback. Ciara Pike-Burke, Shipra Agrawal, Csaba Szepesvári, Steffen Grünewälder. ArXiv, 2017.
  • Stochastic Rank-1 Bandits. Sumeet Katariya, Branislav Kveton, Csaba Szepesvári, Claire Vernade, Zheng Wen. AISTATS, 2017. [pdf]
  • Following the Leader and Fast Rates in Online Linear Prediction: Curved Constraint Sets and Other Regularities. Ruitong Huang, Tor Lattimore, András György, Csaba Szepesvári. Journal of Machine Learning Research, 2017. [pdf]
  • An a Priori Exponential Tail Bound for k-Folds Cross-Validation. Karim T. Abou-Moustafa, Csaba Szepesvári. ArXiv, 2017. [pdf]
  • Linear Stochastic Approximation: Constant Step-Size and Iterate Averaging. Chandrashekar Lakshminarayanan, Csaba Szepesvári. ArXiv, 2017. [pdf]
  • A Modular Analysis of Adaptive (Non-)Convex Optimization: Optimism, Composite Objectives, and Variational Bounds. Pooria Joulani, András György, Csaba Szepesvári. ALT, 2017. [pdf]
  • The End of Optimism? An Asymptotic Analysis of Finite-Armed Linear Bandits. Tor Lattimore, Csaba Szepesvári. AISTATS, 2017. [pdf]
  • Structured Best Arm Identification with Fixed Confidence. Ruitong Huang, Mohammad M. Ajallooeian, Csaba Szepesvári, Martin Müller. ALT, 2017. [pdf]
  • Stochastic Low-Rank Bandits. Branislav Kveton, Csaba Szepesvári, Anup Rao, Zheng Wen, Yasin Abbasi-Yadkori, S. Muthukrishnan. ArXiv, 2017. [pdf]
  • Online Learning to Rank in Stochastic Click Models. Mohammad Ghavamzadeh, Branislav Kveton, Csaba Szepesvári, Tomás Tunys, Zheng Wen, Masrour Zoghi. ICML, 2017. [pdf]
  • Bernoulli Rank-1 Bandits for Click Feedback. Sumeet Katariya, Branislav Kveton, Csaba Szepesvári, Claire Vernade, Zheng Wen. IJCAI, 2017. [pdf]
  • Crowdsourcing with Sparsely Interacting Workers. Yao Ma, Alexander Olshevsky, Venkatesh Saligrama, Csaba Szepesvári. ArXiv, 2017. [pdf]
  • Editorial: Peripheral Nervous System-Machine Interfaces (PNS-MI). Michael Wininger, Panagiotis K. Artemiadis, Claudio Castellini, Patrick M. Pilarski. Front. Neurorobot., 2017. [pdf]
  • Assessment of feature selection and classification methods for recognizing motor imagery tasks from electroencephalographic signals. Roberto Vega, Touqir Sajed, Kory Wallace Mathewson, Kriti Khare, Patrick M. Pilarski, Russ Greiner, Gildardo Sánchez-Ante, Javier Mauricio Antelis. Artif. Intell. Research, 2017. [pdf]
  • Representing high-dimensional data to intelligent prostheses and other wearable assistive robots: A first comparison of tile coding and selective Kanerva coding. Jaden B. Travnik, Patrick M. Pilarski. 2017 International Conference on Rehabilitation Robotics (ICORR), 2017. [pdf]
  • Learning from demonstration: Teaching a myoelectric prosthesis with an intact limb via reinforcement learning. Gautham Vasan, Patrick M. Pilarski. 2017 International Conference on Rehabilitation Robotics (ICORR), 2017. [pdf]
  • Actor-Critic Reinforcement Learning with Simultaneous Human Control and Feedback. Kory Wallace Mathewson, Patrick M. Pilarski. ArXiv, 2017. [pdf]
  • Development of the HANDi Hand: An Inexpensive, Multi-Articulating, Sensorized Hand for Machine Learning Research in Myoelectric Control. Jintao Shen, Michael Rory Dawson, Patrick M. Pilarski. 2017. [pdf]
  • Reinforcement Learning Based Embodied Agents Modelling Human Users Through Interaction and Multi-Sensory Perception. Kory Wallace Mathewson, Patrick M. Pilarski. AAAI Spring Symposia, 2017. [pdf]
  • Accelerated Gradient Temporal Difference Learning. Yangchen Pan, Adam M. White, Martha White. AAAI, 2017. [pdf]
  • Re-entry Prediction Uncertainties derived from Environmental and Observation considerations. Noelia Sánchez-Ortiz, Núria López, Ignacio H. López Grande, Luis Felipe Tabera, Adam White, Stijn Lemmens. 2017. [pdf]
  • Pre-adolescent Receptivity to Tobacco Marketing and Its Relationship to Acquiring Friends Who Smoke and Cigarette Smoking Initiation. David R. Strong, Karen S Messer, Sheri J Hartman, Jesse N. Nodora, Lisa E Vera, Martha White, Eric C. Leas, Nikolas Pharris-Ciurej, Nicolette Borek, John P Pierce. Annals of behavioral medicine : a publication of the Society of Behavioral Medicine, 2017. [pdf]
  • Susceptibility to tobacco product use among youth in wave 1 of the Population Assessment of Tobacco and Health (PATH) study. Dennis R. Trinidad, John P Pierce, James Sargent, Martha White, David R. Strong, David B. Portnoy, Victoria R Green, Cassandra A Stanton, Kelvin Choi, Maansi Bansal-Travers, Yuyan Shi, Jennifer L Pearson, Annette R Kaufman, Nicolette Borek, Blair N. Coleman, Andrew J. Hyland, Charles P Carusi, Sheila Kealey, Eric C. Leas, Madison L. Noble, Karen S Messer. Preventive medicine, 2017. [pdf]
  • Associations Between Cigarette Print Advertising and Smoking Initiation Among African Americans. Dennis R. Trinidad, Lyzette Blanco, Sherry L Emery, Pebbles Fagan, Martha White, Mark B. Reed. Journal of racial and ethnic health disparities, 2017. [pdf]
  • Learning Sparse Representations in Reinforcement Learning with Sparse Coding. Lei Le, Raksha Kumaraswamy, Martha White. IJCAI, 2017. [pdf]
  • Unifying task specification in reinforcement learning. Martha White. ICML, 2017. [pdf]
  • Receptivity to Tobacco Advertising and Susceptibility to Tobacco Products. John P Pierce, James Sargent, Martha White, Nicolette Borek, David B. Portnoy, Victoria R Green, Annette R Kaufman, Cassandra A Stanton, Maansi Bansal-Travers, David R. Strong, Jennifer L Pearson, Blair N. Coleman, Eric C. Leas, Madison L. Noble, Dennis R. Trinidad, Meghan B. Moran, Charles P Carusi, Andrew J. Hyland, Karen S Messer. Pediatrics, 2017. [pdf]
  • Exercise Decreases and Smoking Increases Bladder Cancer Mortality. Michael Andre Liss, Martha White, Loki Natarajan, J Kellogg Parsons. Clinical genitourinary cancer, 2017. [pdf]
  • Adapting Kernel Representations Online Using Submodular Maximization. Matthew Schlegel, Yangchen Pan, Jiecao Chen, Martha White. ICML, 2017. [pdf]
  • Recovering True Classifier Performance in Positive-Unlabeled Learning. Shantanu Jain, Martha White, Predrag Radivojac. AAAI, 2017. [pdf]
  • Effective sketching methods for value function approximation. Yangchen Pan, Erfan Sadeqi Azer, Martha White. UAI, 2017. [pdf]
  • AIVAT: A New Variance Reduction Technique for Agent Evaluation in Imperfect Information Games. Neil Burch, Martin Schmid, Matej Moravcik, Michael H. Bowling. AAAI Workshops, 2017. [pdf]
  • Heads-up limit hold'em poker is solved. Michael H. Bowling, Neil Burch, Michael Johanson, Oskari Tammelin. Commun. ACM, 2017. [pdf]
  • Water quality, compliance, and health outcomes among utilities implementing Water Safety Plans in France and Spain. K Setty, Georgia Lyn Kayser, Michael H. Bowling, Jérôme Enault, J. F. Loret, Claudia Puigdomenech Serra, Jordi Alonso, Arnau Pla Mateu, Jamie Bartram. International journal of hygiene and environmental health, 2017. [pdf]
  • Equilibrium Approximation Quality of Current No-Limit Poker Bots. Viliam Lisý, Michael H. Bowling. AAAI Workshops, 2017. [pdf]
  • A Laplacian Framework for Option Discovery in Reinforcement Learning. Marlos C. Machado, Marc G. Bellemare, Michael H. Bowling. ICML, 2017. [pdf]
  • Water, sanitation, and hygiene in schools: Status and implications of low coverage in Ethiopia, Kenya, Mozambique, Rwanda, Uganda, and Zambia. Camille Morgan, Michael H. Bowling, Jamie Bartram, Georgia Lyn Kayser. International journal of hygiene and environmental health, 2017. [pdf]
  • DeepStack: Expert-Level Artificial Intelligence in No-Limit Poker. Matej Moravčík, Martin Schmid, Neil Burch, Viliam Lisý, Dustin Morrill, Nolan Bard, Trevor Davis, Kevin Waugh, Michael Johanson, Michael H. Bowling. ArXiv, 2017. [pdf]

2016

  • True Online Temporal-Difference Learning. Harm van Seijen, Ashique Rupam Mahmood, Patrick M. Pilarski, Marlos C. Machado, Richard S. Sutton. Journal of Machine Learning Research, 2016. [pdf]
  • Learning representations through stochastic gradient descent in cross-validation error. Richard S. Sutton, Vivek Veeriah. ArXiv, 2016. [pdf]
  • Application of real-time machine learning to myoelectric prosthesis control: A case series in adaptive switching. Ann L. Edwards, Michael Rory Dawson, Jacqueline S. Hebert, Craig Sherstan, Richard S. Sutton, K. Ming Chan, Patrick M. Pilarski. Prosthetics and orthotics international, 2016. [pdf]
  • A Batch, Off-Policy, Actor-Critic Algorithm for Optimizing the Average Reward. Susan A. Murphy, Yanzhen Deng, Eric B. Laber, Hamid Reza Maei, Richard S. Sutton, Katie A Witkiewitz. ArXiv, 2016. [pdf]
  • An Emphatic Approach to the Problem of Off-policy Temporal-Difference Learning. Richard S. Sutton, Ashique Rupam Mahmood, Martha White. Journal of Machine Learning Research, 2016. [pdf]
  • Face valuing: Training user interfaces with facial expressions and reinforcement learning. Vivek Veeriah, Patrick M. Pilarski, Richard S. Sutton. ArXiv, 2016. [pdf]
  • Stochastic Neural Networks with Monotonic Activation Functions. Siamak Ravanbakhsh, Barnabás Póczos, Jeff G. Schneider, Dale Schuurmans, Russell Greiner. AISTATS, 2016. [pdf]
  • Deep Learning Games. Dale Schuurmans, Martin Zinkevich. NIPS, 2016. [pdf]
  • Learning with a Strong Adversary. Ruitong Huang, Bing Xu, Dale Schuurmans, Csaba Szepesvári. 2016. [pdf]
  • Scalable and Sound Low-Rank Tensor Learning. Hao Cheng, Yaoliang Yu, Xinhua Zhang, Eric P. Xing, Dale Schuurmans. AISTATS, 2016. [pdf]
  • Improving Policy Gradient by Exploring Under-appreciated Rewards. Ofir Nachum, Mohammad Norouzi, Dale Schuurmans. ArXiv, 2016. [pdf]
  • Reward Augmented Maximum Likelihood for Neural Structured Prediction. Mohammad Norouzi, Samy Bengio, Zhifeng Chen, Navdeep Jaitly, Mike Schuster, Yonghui Wu, Dale Schuurmans. NIPS, 2016. [pdf]
  • Chaining Bounds for Empirical Risk Minimization. Gábor Balázs, András György, Csaba Szepesvári. ArXiv, 2016. [pdf]
  • Shifting Regret, Mirror Descent, and Matrices. András György, Csaba Szepesvári. ICML, 2016. [pdf]
  • Cumulative Prospect Theory Meets Reinforcement Learning: Prediction and Control. Prashanth L. A., Cheng Jie, Michael C. Fu, Steven I. Marcus, Csaba Szepesvári. ICML, 2016. [pdf]
  • Regularized Policy Iteration with Nonparametric Function Spaces. Amir-massoud Farahmand, Mohammad Ghavamzadeh, Csaba Szepesvári, Shie Mannor. Journal of Machine Learning Research, 2016. [pdf]
  • (Bandit) Convex Optimization with Biased Noisy Gradient Oracles. Xiaowei Hu, Prashanth L. A., András György, Csaba Szepesvári. AISTATS, 2016. [pdf]
  • Sequential Learning without Feedback. Manjesh Kumar Hanawal, Csaba Szepesvári, Venkatesh Saligrama. ArXiv, 2016. [pdf]
  • Policy Error Bounds for Model-Based Reinforcement Learning with Factored Linear Models. Bernardo Ávila Pires, Csaba Szepesvári. COLT, 2016. [pdf]
  • Compressed Conditional Mean Embeddings for Model-Based Reinforcement Learning. Guy Lever, John Shawe-Taylor, Ronnie Stafford, Csaba Szepesvári. AAAI, 2016. [pdf]
  • DCM Bandits: Learning to Rank with Multiple Clicks. Sumeet Katariya, Branislav Kveton, Csaba Szepesvári, Zheng Wen. ICML, 2016. [pdf]
  • Conservative Bandits. Yifan Wu, Roshan Shariff, Tor Lattimore, Csaba Szepesvári. ICML, 2016. [pdf]
  • Max-affine estimators for convex stochastic programming. Gábor Balázs, András György, Csaba Szepesvári. ArXiv, 2016. [pdf]
  • SDP Relaxation with Randomized Rounding for Energy Disaggregation. Kiarash Shaloudegi, András György, Csaba Szepesvári, Wilsun Xu. NIPS, 2016. [pdf]
  • Delay-Tolerant Online Convex Optimization: Unified Analysis and Adaptive-Gradient Algorithms. Pooria Joulani, András György, Csaba Szepesvári. AAAI, 2016. [pdf]
  • Multiclass Classification Calibration Functions. Bernardo Ávila Pires, Csaba Szepesvári. ArXiv, 2016. [pdf]
  • Following the Leader and Fast Rates in Linear Prediction: Curved Constraint Sets and Other Regularities. Ruitong Huang, Tor Lattimore, András György, Csaba Szepesvári. NIPS, 2016. [pdf]
  • Machine learning and unlearning to autonomously switch between the functions of a myoelectric arm. Ann L. Edwards, Jacqueline S. Hebert, Patrick M. Pilarski. 2016 6th IEEE International Conference on Biomedical Robotics and Biomechatronics (BioRob), 2016. [pdf]
  • Steps toward knowledgeable neuroprostheses. Patrick M. Pilarski, Craig Sherstan. 2016 6th IEEE International Conference on Biomedical Robotics and Biomechatronics (BioRob), 2016. [pdf]
  • Simultaneous Control and Human Feedback in the Training of a Robotic Agent with Actor-Critic Reinforcement Learning. Kory Wallace Mathewson, Patrick M. Pilarski. ArXiv, 2016. [pdf]
  • Introspective Agents: Confidence Measures for General Value Functions. Craig Sherstan, Adam White, Marlos C. Machado, Patrick M. Pilarski. AGI, 2016. [pdf]
  • A Greedy Approach to Adapting the Trace Parameter for Temporal Difference Learning. Martha White, Adam M. White. AAMAS, 2016. [pdf]
  • Investigating practical, linear temporal difference learning. Adam M. White, Martha White. AAMAS, 2016. [pdf]
  • Design and optimization of a high performance isolated three phase AC/DC converter. Qiong Wang, Xuning Zhang, Rolando Burgos, Dushan Boroyevich, Adam M. White, Mustansir H. Kheraluwala. 2016 IEEE Energy Conversion Congress and Exposition (ECCE), 2016. [pdf]
  • Taking Control of Household IoT Device Privacy. Adam White, Martijn C. Willemsen. 2016. [pdf]
  • Semantic Entity Relationship Management. Thorsten H. Niebuhr, Adam White. 2016. [pdf]
  • A Popular History of Mammalia; Comprising a Familiar Account of Their Classification and Habits. Adam White. 2016. [pdf]
  • E-cigarette use and smoking reduction or cessation in the 2010/2011 TUS-CPS longitudinal cohort. Yuyan Shi, John P Pierce, Martha White, Maya Vijayaraghavan, Wilson M. Compton, Kevin P. Conway, Anne Marcia Hartman, Karen S Messer. BMC public health, 2016. [pdf]
  • Incremental Truncated LSTD. Clement Gehring, Martha White. IJCAI, 2016. [pdf]
  • Nonparametric semi-supervised learning of class proportions. Shantanu Jain, Martha White, Michael W. Trosset, Predrag Radivojac. ArXiv, 2016. [pdf]
  • Physical Activity Decreases Kidney Cancer Mortality. Michael Andre Liss, Loki Natarajan, Aws K Hasan, Jonathan L Noguchi, Martha White, J Kellogg Parsons. Current urology, 2016. [pdf]
  • Identifying global optimality for dictionary learning. Lei Le, Martha White. , 2016. [pdf]
  • Mobile and Wearable Device Features that Matter in Promoting Physical Activity. Julie B. Wang, Janine K Cataldo, Guadalupe Xochitl Ayala, Loki Natarajan, Lisa A Cadmus-Bertram, Martha White, Hala Madanat, Jeanne Nichols, John P Pierce. Journal of mobile technology in medicine, 2016. [pdf]
  • Global optimization of factor models using alternating minimization. Lei Le, Martha White. ArXiv, 2016. [pdf]
  • Estimating the class prior and posterior from noisy positives and unlabeled data. Shantanu Jain, Martha White, Predrag Radivojac. NIPS, 2016. [pdf]
  • Action Selection for Hammer Shots in Curling. Zaheen Farraz Ahmad, Robert C. Holte, Michael H. Bowling. IJCAI, 2016. [pdf]
  • Counterfactual Regret Minimization in Sequential Security Games. Viliam Lisý, Trevor Davis, Michael H. Bowling. AAAI, 2016. [pdf]
  • State of the Art Control of Atari Games Using Shallow Reinforcement Learning. Yitao Liang, Marlos C. Machado, Erik Talvitie, Michael H. Bowling. AAMAS 2016, 2016. [pdf]
  • Monte Carlo Tree Search in Continuous Action Spaces with Execution Uncertainty. Timothy Yee, Viliam Lisý, Michael H. Bowling. IJCAI, 2016. [pdf]
  • Learning Purposeful Behaviour in the Absence of Rewards. Marlos C. Machado, Michael H. Bowling. ArXiv, 2016. [pdf]
  • The Forget-me-not Process. Kieran Milan, Joel Veness, James Kirkpatrick, Michael H. Bowling, Anna Koop, Demis Hassabis. NIPS, 2016. [pdf]

2015

  • An Empirical Evaluation of True Online TD(λ). Harm van Seijen, Ashique Rupam Mahmood, Patrick M. Pilarski, Richard S. Sutton. , 2015. [pdf]
  • Off-policy learning based on weighted importance sampling with linear computational complexity. Ashique Rupam Mahmood, Richard S. Sutton. UAI, 2015. [pdf]
  • Emphatic Temporal-Difference Learning. Ashique Rupam Mahmood, Huizhen Yu, Martha White, Richard S. Sutton. ArXiv, 2015. [pdf]
  • Learning to Predict Independent of Span. Hado van Hasselt, Richard S. Sutton. ArXiv, 2015. [pdf]
  • An Empirical Evaluation of True Online TD(λ). Harm van Seijen, Ashique Rupam Mahmood, Patrick M. Pilarski, Richard S. Sutton. ArXiv, 2015. [pdf]
  • True Online Emphatic TD(λ): Quick Reference and Implementation Guide. Richard S. Sutton. ArXiv, 2015. [pdf]
  • Learning with Adversary. Ruitong Huang, Dale Schuurmans. , 2015. [pdf]
  • Towards Investigating Global Warming Impact on Human Health Using Derivatives of Photoplethysmogram Signals. Mohamed Elgendi, Ian Norton, Matt Brearley, R. R. Fletcher, Derek Abbott, Nigel H. Lovell, Dale Schuurmans, Paul B Tchounwou. International journal of environmental research and public health, 2015. [pdf]
  • Embedding inference for structured multilabel prediction. Farzaneh Mirzazadeh, Siamak Ravanbakhsh, Nan Ding, Dale Schuurmans. NIPS 2015, 2015. [pdf]
  • Learning with a Strong Adversary. Ruitong Huang, Bing Xu, Dale Schuurmans, Csaba Szepesvári. , 2015. [pdf]
  • On Time Domain Analysis of Photoplethysmogram Signals for Monitoring Heat Stress. Mohamed Elgendi, Richard Ribon Fletcher, Ian Norton, Matt Brearley, Derek Abbott, Nigel H. Lovell, Dale Schuurmans. Sensors, 2015. [pdf]
  • Semi-Supervised Zero-Shot Classification with Label Representation Learning. Xin Li, Yuhong Guo, Dale Schuurmans. 2015 IEEE International Conference on Computer Vision (ICCV), 2015. [pdf]
  • Learning with a Strong Adversary. Ruitong Huang, Bing Xu, Dale Schuurmans, Csaba Szepesvári. ArXiv, 2015. [pdf]
  • Variance Reduction via Antithetic Markov Chains. James Neufeld, Dale Schuurmans, Michael H. Bowling. AISTATS, 2015. [pdf]
  • Optimal Estimation of Multivariate ARMA Models. Martha White, Junfeng Wen, Michael H. Bowling, Dale Schuurmans. AAAI, 2015. [pdf]
  • Scalable Metric Learning for Co-Embedding. Farzaneh Mirzazadeh, Martha White, András György, Dale Schuurmans. ECML/PKDD, 2015. [pdf]
  • Frequency analysis of photoplethysmogram and its derivatives. Mohamed Elgendi, Richard Ribon Fletcher, Ian Norton, Matt Brearley, Derek Abbott, Nigel H. Lovell, Dale Schuurmans. Computer Methods and Programs in Biomedicine, 2015. [pdf]
  • The unique heart sound signature of children with pulmonary artery hypertension. Mohamed Elgendi, Prashant Raviprakash Bobhate, Shreepal Ambalal Jain, Long Bao Guo, Shine Kumar, Jennifer M. Rutledge, Yashu Coe, Roger J. Zemp, Dale Schuurmans, Ian Adatia. Pulmonary circulation, 2015. [pdf]
  • Correcting Covariate Shift with the Frank-Wolfe Algorithm. Junfeng Wen, Russell Greiner, Dale Schuurmans. IJCAI, 2015. [pdf]
  • Generalization in Unsupervised Learning. Karim T. Abou-Moustafa, Dale Schuurmans. ECML/PKDD, 2015. [pdf]
  • On Identifying Good Options under Combinatorially Structured Feedback in Finite Noisy Environments. Yifan Wu, András György, Csaba Szepesvári. ICML, 2015. [pdf]
  • Combinatorial Cascading Bandits. Branislav Kveton, Zheng Wen, Azin Ashkan, Csaba Szepesvári. NIPS, 2015. [pdf]
  • Markov Decision Processes under Bandit Feedback. Gergely Neu, András György, Csaba Szepesvári, András Antos. , 2015. [pdf]
  • Toward Minimax Off-policy Value Estimation. Lihong Li, Rémi Munos, Csaba Szepesvári. AISTATS, 2015. [pdf]
  • Tight Regret Bounds for Stochastic Combinatorial Semi-Bandits. Branislav Kveton, Zheng Wen, Azin Ashkan, Csaba Szepesvári. AISTATS, 2015. [pdf]
  • Mixing Time Estimation in Reversible Markov Chains from a Single Sample Path. Daniel J. Hsu, Aryeh Kontorovich, Csaba Szepesvári. NIPS, 2015. [pdf]
  • Exploiting Symmetries to Construct Efficient MCMC Algorithms With an Application to SLAM. Roshan Shariff, András György, Csaba Szepesvári. AISTATS, 2015. [pdf]
  • Pathological Effects of Variance on Classification-Based Policy Iteration. Bernardo Ávila Pires, Csaba Szepesvári. , 2015. [pdf]
  • Linear Multi-Resource Allocation with Semi-Bandit Feedback. Tor Lattimore, Koby Crammer, Csaba Szepesvári. NIPS, 2015. [pdf]
  • Cascading Bandits: Learning to Rank in the Cascade Model. Branislav Kveton, Csaba Szepesvári, Zheng Wen, Azin Ashkan. ICML, 2015. [pdf]
  • Fast cross-validation for incremental learning. Pooria Joulani, András György, Csaba Szepesvári. IJCAI, 2015. [pdf]
  • Cascading Bandits. Branislav Kveton, Csaba Szepesvári, Zheng Wen, Azin Ashkan. ArXiv, 2015.
  • Decision-theoretic Clustering of Strategies. Nolan Bard, Deon Nicholas, Csaba Szepesvári, Michael H. Bowling. AAMAS, 2015. [pdf]
  • Online Learning with Gaussian Payoffs and Side Observations. Yifan Wu, András György, Csaba Szepesvári. NIPS, 2015. [pdf]
  • Deterministic Independent Component Analysis. Ruitong Huang, András György, Csaba Szepesvári. ICML, 2015. [pdf]
  • Bayesian Optimal Control of Smoothly Parameterized Systems. Yasin Abbasi-Yadkori, Csaba Szepesvári. UAI, 2015. [pdf]
  • Near-optimal max-affine estimators for convex regression. Gábor Balázs, András György, Csaba Szepesvári. AISTATS, 2015. [pdf]
  • A Collaborative Approach to the Simultaneous Multi-joint Control of a Prosthetic Arm. Craig Sherstan, Joseph Modayil, Patrick M. Pilarski. 2015 IEEE International Conference on Rehabilitation Robotics (ICORR), 2015. [pdf]
  • Prosthetic Devices as Goal-Seeking Agents. Patrick M. Pilarski. , 2015. [pdf]
  • Intelligent laser welding through representation, prediction, and control learning: An architecture with deep neural networks and reinforcement learning. Johannes Günther, Patrick M. Pilarski, Gerhard Helfrich, Hao Shen, Klaus Diepold. , 2015. [pdf]
  • Design considerations for a high efficiency 3 kW LLC resonant DC/DC transformer. Qiong Wang, Xuning Zhang, Rolando Burgos, Dushan Boroyevich, Adam M. White, Mustansir H. Kheraluwala. 2015 IEEE Energy Conversion Congress and Exposition (ECCE), 2015. [pdf]
  • Design and implementation of interleaved Vienna rectifier with greater than 99% efficiency. Qiong Wang, Xuning Zhang, Rolando Burgos, Dushan Boroyevich, Adam M. White, Mustansir H. Kheraluwala. 2015 IEEE Applied Power Electronics Conference and Exposition (APEC), 2015. [pdf]
  • COM-Based SBCs: The Superior Architecture for Small Form Factor Embedded Systems. Adam White. , 2015. [pdf]
  • Leveraging Cellphones for Wayfinding and Journey Planning in Semi-formal Bus Systems: Lessons from Digital Matatus in Nairobi. Jacqueline M. Klopp, Sarah Williams, Peter Wagacha Waiganjo, Daniel Orwa, Adam White. , 2015. [pdf]
  • The digital matatu project: Using cell phones to create an open source data for Nairobi's semi-formal bus system. Sarah Williams, Adam White, Peter Wagacha Waiganjo, Daniel Orwa, Jacqueline M. Klopp. , 2015. [pdf]
  • Wearable Sensor/Device (Fitbit One) and SMS Text-Messaging Prompts to Increase Physical Activity in Overweight and Obese Adults A Randomized Controlled Trial. Julie B. Wang, Lisa A Cadmus-Bertram, Loki Natarajan, Martha White, Hala Madanat, Jeanne Nichols, Guadalupe Xochitl Ayala, John P Pierce. Telemedicine journal and e-health: the official journal of the American Telemedicine Association, 2015. [pdf]
  • Trends in use of little cigars or cigarillos and cigarettes among U.S. smokers, 2002-2011. Karen S Messer, Martha White, David R. Strong, Baoguang Wang, Yuyan Shi, Kevin P. Conway, John P Pierce. Nicotine & tobacco research: official journal of the Society for Research on Nicotine and Tobacco, 2015. [pdf]
  • Cigarette smoking cessation attempts among current US smokers who also use smokeless tobacco. Karen S Messer, Maya Vijayaraghavan, Martha White, Yuyan Shi, Cindy M C Chang, Kevin P. Conway, Anne Marcia Hartman, Megan J. Schroeder, Wilson M. Compton, John P Pierce. Addictive behaviors, 2015. [pdf]
  • Measurement of multiple nicotine dependence domains among cigarette, non-cigarette and poly-tobacco users: Insights from item response theory. David R. Strong, Karen S Messer, Sheri J Hartman, Kevin P. Conway, Allison C Hoffman, Nikolas Pharris-Ciurej, Martha White, Victoria R Green, Wilson M. Compton, John P Pierce. Drug and alcohol dependence, 2015. [pdf]
  • Predictive Validity of the Expanded Susceptibility to Smoke Index. David R. Strong, Sheri J Hartman, Jesse N. Nodora, Karen S Messer, Lisa James, Martha White, David B. Portnoy, Conrad J. Choinière, Genevieve C. Vullo, John P Pierce. Nicotine & tobacco research: official journal of the Society for Research on Nicotine and Tobacco, 2015. [pdf]
  • Safety of sublingual immunotherapy Timothy grass tablet in subjects with allergic rhinitis with or without conjunctivitis and history of asthma. Jennifer S Maloney, Stephen Durham, David Skoner, Ronald Dahl, Albrecht Bufe, David A. Bernstein, Kevin S Murphy, Susan Waserman, Gary D Berman, Martha White, Amarjot Kaur, Hendrik Nolte. Allergy, 2015. [pdf]
  • Improving Exploration in UCT Using Local Manifolds. Sriram Srinivasan, Erik Talvitie, Michael H. Bowling. AAAI, 2015. [pdf]
  • Pairwise Relative Offset Features for Atari 2600 Games. Erik Talvitie, Michael H. Bowling. AAAI Workshop: Learning for General Competency in Video Games, 2015. [pdf]
  • Policy Tree: Adaptive Representation for Policy Gradient. Ujjwal Das Gupta, Erik Talvitie, Michael H. Bowling. AAAI, 2015. [pdf]
  • Solving Games with Functional Regret Estimation. Kevin Waugh, Dustin Morrill, J. Andrew Bagnell, Michael H. Bowling. AAAI Workshop: Computer Poker and Imperfect Information, 2015. [pdf]
  • Approximate Linear Programming for Constrained Partially Observable Markov Decision Processes. Pascal Poupart, Aarti Malhotra, Pei Pei, Kee-Eung Kim, Bongseok Goh, Michael H. Bowling. AAAI, 2015. [pdf]
  • Domain-Independent Optimistic Initialization for Reinforcement Learning. Marlos C. Machado, Sriram Srinivasan, Michael H. Bowling. AAAI Workshop: Learning for General Competency in Video Games, 2015. [pdf]
  • Solving Heads-Up Limit Texas Hold'em. Oskari Tammelin, Neil Burch, Michael Johanson, Michael H. Bowling. IJCAI, 2015. [pdf]
  • Online Monte Carlo Counterfactual Regret Minimization for Search in Imperfect Information Games. Viliam Lisý, Marc Lanctot, Michael H. Bowling. AAMAS, 2015. [pdf]

2014

  • Time course of the rabbit's conditioned nictitating membrane movements during acquisition, extinction, and reacquisition. James L. Kehoe, Elliot A. Ludvig, Richard S. Sutton. , 2014. [pdf]
  • Off-policy TD(λ) with a true online equivalence. Hado van Hasselt, Ashique Rupam Mahmood, Richard S. Sutton. UAI, 2014. [pdf]
  • Universal Option Models. Hengshuai Yao, Csaba Szepesvári, Richard S. Sutton, Joseph Modayil, Shalabh Bhatnagar. NIPS, 2014. [pdf]
  • True Online TD(λ). Harm van Seijen, Richard S. Sutton. , 2014. [pdf]
  • A new Q(λ) with interim forward view and Monte Carlo equivalence. Richard S. Sutton, Ashique Rupam Mahmood. , 2014. [pdf]
  • Adaptive Switching in Practice: Improving Myoelectric Prosthesis Performance through Reinforcement Learning. Ann L. Edwards, Michael Rory Dawson, Jacqueline S. Hebert, Richard S. Sutton, K. Ming Chan, Patrick M. Pilarski. , 2014. [pdf]
  • A new Q(lambda) with interim forward view and Monte Carlo equivalence. Richard S. Sutton, Ashique Rupam Mahmood, Doina Precup, Hado van Hasselt. , 2014. [pdf]
  • True Online TD(lambda). Harm van Seijen, Richard S. Sutton. , 2014. [pdf]
  • Prediction Driven Behavior: Learning Predictions that Drive Fixed Responses. Joseph Modayil, Richard S. Sutton. , 2014. [pdf]
  • Weighted importance sampling for off-policy learning with linear function approximation. Ashique Rupam Mahmood, Hado van Hasselt, Richard S. Sutton. NIPS, 2014. [pdf]
  • Surprise and Curiosity for Big Data Robotics. Adam White, Joseph Modayil, Richard S. Sutton. , 2014. [pdf]
  • Multi-timescale Nexting in a Reinforcement Learning Robot. Joseph Modayil, Adam White, Richard S. Sutton. Adaptive Behaviour, 2014. [pdf]
  • Time course of the rabbit's conditioned nictitating membrane movements during acquisition, extinction, and reacquisition. E. James Kehoe, Elliot A. Ludvig, Richard S. Sutton. Learning & memory, 2014. [pdf]
  • Spectral analysis of the heart sounds in children with and without pulmonary artery hypertension. Mohamed Elgendi, Prashant Raviprakash Bobhate, Shreepal Ambalal Jain, Long Bao Guo, Jennifer M. Rutledge, Yashu Coe, Roger J. Zemp, Dale Schuurmans, Ian Adatia. International journal of cardiology, 2014. [pdf]
  • Adaptive Monte Carlo via Bandit Allocation. James Neufeld, András György, Dale Schuurmans, Csaba Szepesvári. ICML, 2014. [pdf]
  • Detection of a and b waves in the acceleration photoplethysmogram. Mohamed Elgendi, Ian Norton, Matt Brearley, Derek Abbott, Dale Schuurmans. Biomedical engineering online, 2014. [pdf]
  • Convex Co-embedding. Farzaneh Mirzazadeh, Yuhong Guo, Dale Schuurmans. AAAI, 2014. [pdf]
  • Time-domain analysis of heart sound intensity in children with and without pulmonary artery hypertension: a pilot study using a digital stethoscope. Mohamed Elgendi, Prashant Raviprakash Bobhate, Shreepal Ambalal Jain, Jennifer M. Rutledge, James Y. Coe, Roger J. Zemp, Dale Schuurmans, Ian Adatia. Pulmonary circulation, 2014. [pdf]
  • Convex Deep Learning via Normalized Kernels. Özlem Aslan, Xinhua Zhang, Dale Schuurmans. NIPS, 2014. [pdf]
  • On Minimax Optimal Offline Policy Evaluation. Lihong Li, Rémi Munos, Csaba Szepesvári. ArXiv, 2014. [pdf]
  • Guest Editors' introduction. Jyrki Kivinen, Csaba Szepesvári, Thomas Zeugmann. Theor. Comput. Sci., 2014. [pdf]
  • Generalization Bounds for Partially Linear Models. Ruitong Huang, Csaba Szepesvári. ISAIM, 2014. [pdf]
  • Bayesian Optimal Control of Smoothly Parameterized Systems: The Lazy Posterior Sampling Algorithm. Yasin Abbasi-Yadkori, Csaba Szepesvári. ArXiv, 2014. [pdf]
  • On Learning the Optimal Waiting Time. Tor Lattimore, András György, Csaba Szepesvári. ALT, 2014. [pdf]
  • Sequential Learning for Multi-Channel Wireless Network Monitoring With Channel Switching Costs. Thanh Le, Csaba Szepesvári, Rong Zheng. IEEE Transactions on Signal Processing, 2014. [pdf]
  • Online Learning in Markov Decision Processes with Changing Cost Sequences. Travis Dick, András György, Csaba Szepesvári. ICML, 2014. [pdf]
  • A Finite-Sample Generalization Bound for Semiparametric Regression: Partially Linear Models. Ruitong Huang, Csaba Szepesvári. AISTATS, 2014. [pdf]
  • Partial Monitoring - Classification, Regret Bounds, and Algorithms. Gábor Bartók, Dean P. Foster, Dávid Pál, Alexander Rakhlin, Csaba Szepesvári. Math. Oper. Res., 2014. [pdf]
  • Pseudo-MDPs and factored linear action models. Hengshuai Yao, Csaba Szepesvári, Bernardo Ávila Pires, Xinhua Zhang. 2014 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL), 2014. [pdf]
  • Optimal Resource Allocation with Semi-Bandit Feedback. Tor Lattimore, Koby Crammer, Csaba Szepesvári. UAI, 2014. [pdf]
  • Constructivist Foundations 9(2). Olivier L. Georgeon, Patrick M. Pilarski, Gordana Dodig-Crnkovic. , 2014. [pdf]
  • A genome-wide aberrant RNA splicing in patients with acute myeloid leukemia identifies novel potential disease markers and therapeutic targets. Sophia Adamia, Benjamin Haibe-Kains, Patrick M. Pilarski, Michal Bar-Natan, Samuel J. Pevzner, Hervé Avet-Loiseau, Laurence Lodé, Sigitas J. Verselis, Edward Alan Fox, John Burke, Ilene Galinsky, Ibiayi Dagogo-Jack, Martha Wadleigh, David P Steensma, Gabriela Motyckova, Daniel J. Deangelo, John Quackenbush, Richard M Stone, James D Griffin. Clinical cancer research: an official journal of the American Association for Cancer Research, 2014. [pdf]
  • Proceedings of the first workshop on Peripheral Machine Interfaces: going beyond traditional surface electromyography. Claudio Castellini, Panagiotis K. Artemiadis, Michael Wininger, Arash Ajoudani, Merkur Alimusaj, Antonio Bicchi, Barbara Caputo, William Craelius, Strahinja Dosen, Kevin B. Englehart, Dario Farina, Arjan Gijsberts, Sasha Blue Godfrey, Levi J. Hargrove, Mark Ison, Todd A. Kuiken, Marko Marković, Patrick M. Pilarski, Rüdiger Rupp, Erik J. Scheme. Front. Neurorobot., 2014. [pdf]
  • Learning by Experiencing versus Learning by Registering. Olivier L. Georgeon, Patrick M. Pilarski. , 2014. [pdf]
  • Using Learned Predictions as Feedback to Improve Control and Communication with an Artificial Limb: Preliminary Findings. Adam S. R. Parker, Ann L. Edwards, Patrick M. Pilarski. ArXiv, 2014. [pdf]
  • Multilayer General Value Functions for Robotic Prediction and Control. Craig Sherstan, Patrick M. Pilarski. , 2014. [pdf]
  • NOTCH2 and FLT3 gene mis-splicings are common events in patients with acute myeloid leukemia (AML): new potential targets in AML. Sophia Adamia, Michal Bar-Natan, Benjamin Haibe-Kains, Patrick M. Pilarski, Christian Bach, Samuel J. Pevzner, Teresa Calimeri, Hervé Avet-Loiseau, Laurence Lodé, Sigitas J. Verselis, Edward Alan Fox, Ilene Galinsky, Steven M. Mathews, Ibiayi Dagogo-Jack, Martha Wadleigh, David P Steensma, Gabriela Motyckova, Daniel J. Deangelo, John Quackenbush, Daniel G Tenen, Richard M Stone, James D Griffin. Blood, 2014. [pdf]
  • A Genome-Wide Aberrant RNA Splicing in Patients with Acute Myeloid Leukemia Identifies Novel Potential Disease Markers and Therapeutic Targets. Sophia Adamia, Benjamin Haibe-Kains, Patrick M. Pilarski, Michal Bar-Natan, Samuel J. Pevzner, Hervé Avet-Loiseau, Laurence Lodé, Sigitas J. Verselis, Edward Alan Fox, John Burke, Ilene Galinsky, Ibiayi Dagogo-Jack, Martha Wadleigh, David P Steensma, Gabriela Motyckova, Daniel J. Deangelo, John Quackenbush, Richard M Stone. , 2014. [pdf]
  • Development of the Bento Arm: An Improved Robotic Arm for Myoelectric Training and Research. Michael Rory Dawson, Craig Sherstan, Jason P. Carey, Jacqueline S. Hebert, Patrick M. Pilarski. , 2014. [pdf]
  • Dealing with Changing Contexts in Myoelectric Control. Anna Koop, Alexandra Kearney, Michael H. Bowling, Patrick M. Pilarski. , 2014. [pdf]
  • Efficiency evaluation of two-level and three-level bridgeless PFC boost rectifiers. Qiong Wang, Bo Wen, Rolando Burgos, Dushan Boroyevich, Adam M. White. 2014 IEEE Applied Power Electronics Conference and Exposition - APEC 2014, 2014. [pdf]
  • National trends in smoking behaviors among Mexican, Puerto Rican, and Cuban men and women in the United States. Lyzette Blanco, Robert Garcia, Eliseo J. Pérez-Stable, Martha White, Karen S Messer, John P Pierce, Dennis R. Trinidad. American journal of public health, 2014. [pdf]
  • The efficacy and safety of the short ragweed sublingual immunotherapy tablet MK-3641 is similar in asthmatic and nonasthmatic subjects treated for allergic rhinitis with/without conjunctivitis. Jennifer S Maloney, David I. Bernstein, Jacques Hebert, Martha White, R. N. Fisher, Thomas B. Casale, Amarjot Kaur, Hendrik Nolte. , 2014. [pdf]
  • Curiosity predicts smoking experimentation independent of susceptibility in a US national sample. Jesse N. Nodora, Sheri J Hartman, David R. Strong, Karen S Messer, Lisa E Vera, Martha White, David B. Portnoy, Conrad J. Choinière, Genevieve C. Vullo, John P Pierce. Addictive behaviors, 2014. [pdf]
  • Increases in light and intermittent smoking among Asian Americans and non-Hispanic Whites. Lyzette Blanco, Liesl A. Nydegger, Kari-Lyn Kobayakawa Sakuma, Elisa K. Tong, Martha White, Dennis R. Trinidad. Nicotine & tobacco research: official journal of the Society for Research on Nicotine and Tobacco, 2014. [pdf]
  • Differential use of other tobacco products among current and former cigarette smokers by income level. Maya Vijayaraghavan, John P Pierce, Martha White, Karen S Messer. Addictive behaviors, 2014. [pdf]
  • Integrating Representation Learning and Temporal Difference Learning: A Matrix Factorization Approach. Martha White. , 2014. [pdf]
  • A factorization perspective for learning representations in reinforcement learning. Martha White. , 2014. [pdf]
  • Search in Imperfect Information Games Using Online Monte Carlo Counterfactual Regret Minimization. Marc Lanctot, Viliam Lisý, Michael H. Bowling. , 2014. [pdf]
  • Asymmetric abstractions for adversarial settings. Nolan Bard, Michael Johanson, Michael H. Bowling. AAMAS, 2014. [pdf]
  • Do pokers players know how good they are? Accuracy of poker skill estimation in online and offline players. T. L. MacKay, Nolan Bard, Michael H. Bowling, D. C. Hodgins. Computers in Human Behavior, 2014. [pdf]
  • Solving Imperfect Information Games Using Decomposition. Neil Burch, Michael Johanson, Michael H. Bowling. AAAI, 2014. [pdf]
  • Using Response Functions to Measure Strategy Strength. Trevor Davis, Neil Burch, Michael H. Bowling. AAAI, 2014. [pdf]

2013

  • Efficient Planning in MDPs by Small Backups. Harm van Seijen, Richard S. Sutton. , 2013. [pdf]
  • Representation Search through Generate and Test. Ashique Rupam Mahmood, Richard S. Sutton. AAAI Workshop: Learning Rich Representations from Low-Level Sensors, 2013. [pdf]
  • Position Paper: Representation Search through Generate and Test. Ashique Rupam Mahmood, Richard S. Sutton. SARA, 2013. [pdf]
  • Efficient Planning in MDPs by Small Backups. Richard S. Sutton. , 2013. [pdf]
  • Temporal-Difference Learning to Assist Human Decision Making during the Control of an Artificial Limb. Ann L. Edwards, Alexandra Kearney, Michael Rory Dawson, Richard S. Sutton, Patrick M. Pilarski. ArXiv, 2013. [pdf]
  • Planning by Prioritized Sweeping with Small Backups. Harm van Seijen, Richard S. Sutton. ICML, 2013. [pdf]
  • Adaptive artificial limbs: a real-time approach to prediction and anticipation. Patrick M. Pilarski, Michael Rory Dawson, Thomas Degris, Jason P. Carey, K. Ming Chan, Jacqueline S. Hebert, Richard S. Sutton. IEEE Robotics & Automation Magazine, 2013. [pdf]
  • Timing and cue competition in conditioning of the nictitating membrane response of the rabbit (Oryctolagus cuniculus). E. James Kehoe, Elliot A. Ludvig, Richard S. Sutton. Learning & memory, 2013. [pdf]
  • Real-time prediction learning for the simultaneous actuation of multiple prosthetic joints. Patrick M. Pilarski, Travis B. Dick, Richard S. Sutton. 2013 IEEE 13th International Conference on Rehabilitation Robotics (ICORR), 2013. [pdf]
  • Exploiting Syntactic, Semantic, and Lexical Regularities in Language Modeling via Directed Markov Random Fields. Shaojun Wang, Shaomin Wang, Li Cheng, Russell Greiner, Dale Schuurmans. Computational Intelligence, 2013. [pdf]
  • Learning a Metric Space for Neighbourhood Topology Estimation: Application to Manifold Learning. Karim T. Abou-Moustafa, Dale Schuurmans, Frank P. Ferrie. ACML, 2013. [pdf]
  • Reinforcement Ranking. Hengshuai Yao, Dale Schuurmans. ArXiv, 2013. [pdf]
  • Polar Operators for Structured Sparse Estimation. Xinhua Zhang, Yaoliang Yu, Dale Schuurmans. NIPS, 2013. [pdf]
  • Protein-chemical Interaction Prediction via Kernelized Sparse Learning SVM. Yi Shi, Xinhua Zhang, Xiaoping Liao, Guohui Lin, Dale Schuurmans. Pacific Symposium on Biocomputing, 2013. [pdf]
  • Convex Two-Layer Modeling. Özlem Aslan, Hao Cheng, Xinhua Zhang, Dale Schuurmans. NIPS, 2013. [pdf]
  • Multi-label Classification with Output Kernels. Yuhong Guo, Dale Schuurmans. ECML/PKDD, 2013. [pdf]
  • Divergence based graph estimation for manifold learning. Karim T. Abou-Moustafa, Frank P. Ferrie, Dale Schuurmans. 2013 IEEE Global Conference on Signal and Information Processing, 2013. [pdf]
  • Convex Relaxations of Bregman Divergence Clustering. Hao Cheng, Xinhua Zhang, Dale Schuurmans. UAI, 2013. [pdf]
  • Characterizing the Representer Theorem. Yaoliang Yu, Hao Cheng, Dale Schuurmans, Csaba Szepesvári. ICML, 2013. [pdf]
  • Online Learning under Delayed Feedback. Pooria Joulani, András György, Csaba Szepesvári. ICML, 2013. [pdf]
  • Online Learning in Markov Decision Processes with Adversarially Chosen Transition Probability Distributions. Yasin Abbasi-Yadkori, Peter L. Bartlett, Csaba Szepesvári. NIPS, 2013. [pdf]
  • A Randomized Mirror Descent Algorithm for Large Scale Multiple Kernel Learning. Arash Afkanpour, András György, Csaba Szepesvári, Michael H. Bowling. ICML, 2013. [pdf]
  • Alignment based kernel learning with a continuous set of base kernels. Arash Afkanpour, Csaba Szepesvári, Michael H. Bowling. Machine Learning, 2013. [pdf]
  • Cost-sensitive Multiclass Classification Risk Bounds. Bernardo Ávila Pires, Csaba Szepesvári, Mohammad Ghavamzadeh. ICML, 2013. [pdf]
  • Toward a classification of finite partial-monitoring games. András Antos, Gábor Bartók, Dávid Pál, Csaba Szepesvári. Theor. Comput. Sci., 2013. [pdf]
  • Online Learning with Costly Features and Labels. Navid Zolghadr, Gábor Bartók, Russell Greiner, András György, Csaba Szepesvári. NIPS, 2013. [pdf]
  • Aberrant splicing, hyaluronan synthases and intracellular hyaluronan as drivers of oncogenesis and potential drug targets. Sophia Adamia, Patrick M. Pilarski, Andrew R. Belch, Linda M. Pilarski. Current cancer drug targets, 2013. [pdf]
  • Alternative splicing in chronic myeloid leukemia (CML): a novel therapeutic target? Sophia Adamia, Patrick M. Pilarski, Michal Bar-Natan, Richard M Stone, James D Griffin. Current cancer drug targets, 2013. [pdf]
  • Determining the Time until Muscle Fatigue using Temporally Extended Prediction Learning. Patrick M. Pilarski, Liping Qi, Martin Ferguson-Pell, Simon Grange. , 2013. [pdf]
  • Randomized controlled trial of ragweed allergy immunotherapy tablet efficacy and safety in North American adults. Hendrik Nolte, Jacques Hebert, Gary D Berman, Sandra M Gawchik, Martha White, Amarjot Kaur, Nancy Y. N. Liu, William Raymond Lumry, Jennifer S Maloney. Annals of allergy, asthma & immunology: official publication of the American College of Allergy, Asthma, & Immunology, 2013. [pdf]
  • The effectiveness of cigarette price and smoke-free homes on low-income smokers in the United States. Maya Vijayaraghavan, Karen S. Messer, Martha White, John P Pierce. American journal of public health, 2013. [pdf]
  • Partition Tree Weighting. Joel Veness, Martha White, Michael H. Bowling, András György. 2013 Data Compression Conference, 2013. [pdf]
  • Rating players in games with real-valued outcomes. Christopher Archibald, Neil Burch, Michael H. Bowling, Matthew Rutherford. AAMAS, 2013. [pdf]
  • Evaluating state-space abstractions in extensive-form games. Michael Johanson, Neil Burch, Richard Anthony Valenzano, Michael H. Bowling. AAMAS, 2013. [pdf]
  • Subset Selection of Search Heuristics. D. Chris Rayner, Nathan R. Sturtevant, Michael H. Bowling. IJCAI, 2013. [pdf]
  • CFR-D: Solving Imperfect Information Games Using Decomposition. Neil Burch, Michael H. Bowling. ArXiv, 2013. [pdf]
  • Automating Collusion Detection in Sequential Games. Parisa Mazrooei, Christopher Archibald, Michael H. Bowling. AAAI, 2013. [pdf]
  • Baseline: practical control variates for agent evaluation in zero-sum domains. Joshua Davidson, Christopher Archibald, Michael H. Bowling. AAMAS, 2013. [pdf]
  • Bayesian Learning of Recursively Factored Environments. Marc G. Bellemare, Joel Veness, Michael H. Bowling. ICML, 2013. [pdf]
  • Online implicit agent modelling. Nolan Bard, Michael Johanson, Neil Burch, Michael H. Bowling. AAMAS, 2013. [pdf]
  • The Arcade Learning Environment: An Evaluation Platform for General Agents. Marc G. Bellemare, Yavar Naddaf, Joel Veness, Michael H. Bowling. J. Artif. Intell. Res., 2013. [pdf]

2012

  • Temporal-difference search in computer Go. David Silver, Richard S. Sutton, Martin Müller. Machine Learning, 2012. [pdf]
  • Reinforcement Learning: An Introduction. Second edition, in progress. Richard S. Sutton, Andrew G. Barto. , 2012. [pdf]
  • Model-Free reinforcement learning with continuous action in practice. Thomas Degris, Patrick M. Pilarski, Richard S. Sutton. 2012 American Control Conference (ACC), 2012. [pdf]
  • Prediction and Anticipation for Adaptive Artificial Limbs. Patrick M. Pilarski, Michael Rory Dawson, Thomas Degris, Jason P. Carey, K. Ming Chan, Jacqueline S. Hebert, Richard S. Sutton. , 2012. [pdf]
  • Dynamic switching and real-time machine learning for improved human control of assistive biomedical robots. Patrick M. Pilarski, Michael Rory Dawson, Thomas Degris, Jason P. Carey, Richard S. Sutton. 2012 4th IEEE RAS & EMBS International Conference on Biomedical Robotics and Biomechatronics (BioRob), 2012. [pdf]
  • Evaluating the TD model of classical conditioning. Elliot A. Ludvig, Richard S. Sutton, E. James Kehoe. Learning & behavior, 2012. [pdf]
  • Multi-timescale Nexting in a Reinforcement Learning Robot. Joseph Modayil, Adam White, Richard S. Sutton. SAB, 2012. [pdf]
  • Tuning-free step-size adaptation. Ashique Rupam Mahmood, Richard S. Sutton, Thomas Degris, Patrick M. Pilarski. 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2012. [pdf]
  • Online Representation Search and Its Interactions with Unsupervised Learning. Ashique Rupam Mahmood, Richard S. Sutton. , 2012. [pdf]
  • Acquiring a broad range of empirical knowledge in real time by temporal-difference learning. Joseph Modayil, Adam White, Patrick M. Pilarski, Richard S. Sutton. 2012 IEEE International Conference on Systems, Man, and Cybernetics (SMC), 2012. [pdf]
  • Scaling life-long off-policy learning. Adam White, Joseph Modayil, Richard S. Sutton. 2012 IEEE International Conference on Development and Learning and Epigenetic Robotics (ICDL), 2012. [pdf]
  • Linear Off-Policy Actor-Critic. Thomas Degris, Martha White, Richard S. Sutton. ICML, 2012. [pdf]
  • Between Instruction and Reward: Human-Prompted Switching. Patrick M. Pilarski, Richard S. Sutton. AAAI 2012, 2012. [pdf]
  • Acquiring Diverse Predictive Knowledge in Real Time by Temporal-difference Learning. Joseph Modayil, Adam White, Patrick M. Pilarski, Richard S. Sutton. , 2012. [pdf]
  • Off-Policy Actor-Critic. Thomas Degris, Martha White, Richard S. Sutton. ArXiv, 2012. [pdf]
  • Regularizers versus Losses for Nonlinear Dimensionality Reduction: A Factored View with New Convex Relaxations. James Neufeld, Yaoliang Yu, Xinhua Zhang, Ryan Kiros, Dale Schuurmans. ICML, 2012. [pdf]
  • A Polynomial-time Form of Robust Regression. Yaoliang Yu, Özlem Aslan, Dale Schuurmans. NIPS, 2012. [pdf]
  • An experimental methodology for response surface optimization methods. Daniel J. Lizotte, Russell Greiner, Dale Schuurmans. J. Global Optimization, 2012. [pdf]
  • Linear Coherent Bi-Clustering via Beam Searching and Sample Set Clustering. Yi Shi, Maryam Hasan, Zhipeng Cai, Guohui Lin, Dale Schuurmans. Discrete Math., Alg. and Appl., 2012. [pdf]
  • An efficient algorithm for maximal margin clustering. Jiming Peng, Lopamudra Mukherjee, Vikas Singh, Dale Schuurmans, Linli Xu. J. Global Optimization, 2012. [pdf]
  • Regularizers versus Losses for Nonlinear Dimensionality Reduction. Yaoliang Yu, James Neufeld, Ryan Kiros, Xinhua Zhang, Dale Schuurmans. , 2012. [pdf]
  • Sparse Learning Based Linear Coherent Bi-clustering. Yi Shi, Xiaoping Liao, Xinhua Zhang, Guohui Lin, Dale Schuurmans. WABI, 2012. [pdf]
  • Semi-supervised Multi-label Classification - A Simultaneous Large-Margin, Subspace Learning Approach. Yuhong Guo, Dale Schuurmans. ECML/PKDD, 2012. [pdf]
  • Protein Phosphorylation Site Prediction via Feature Discovery Support Vector Machine. Yi Shi, Bo Yuan, Guohui Lin, Dale Schuurmans. , 2012. [pdf]
  • Accelerated Training for Matrix-norm Regularization: A Boosting Approach. Xinhua Zhang, Yaoliang Yu, Dale Schuurmans. NIPS, 2012. [pdf]
  • Generalized Optimal Reverse Prediction. Martha White, Dale Schuurmans. AISTATS, 2012. [pdf]
  • The Latent Maximum Entropy Principle. Shaojun Wang, Dale Schuurmans, Yunxin Zhao. TKDD, 2012. [pdf]
  • Convex Multi-view Subspace Learning. Martha White, Yaoliang Yu, Xinhua Zhang, Dale Schuurmans. NIPS, 2012. [pdf]
  • Statistical linear estimation with penalized estimators: an application to reinforcement learning. Bernardo Ávila Pires, Csaba Szepesvári. ICML, 2012. [pdf]
  • Evaluation and Analysis of the Performance of the EXP3 Algorithm in Stochastic Environments. Yevgeny Seldin, Csaba Szepesvári, Peter Auer, Yasin Abbasi-Yadkori. EWRL, 2012. [pdf]
  • Analysis of Kernel Mean Matching under Covariate Shift. Yaoliang Yu, Csaba Szepesvári. ICML, 2012. [pdf]
  • A Randomized Strategy for Learning to Combine Many Features. Arash Afkanpour, András György, Csaba Szepesvári, Michael H. Bowling. ArXiv, 2012. [pdf]
  • An adaptive algorithm for finite stochastic partial monitoring. Gábor Bartók, Navid Zolghadr, Csaba Szepesvári. ICML, 2012. [pdf]
  • Partial Monitoring with Side Information. Gábor Bartók, Csaba Szepesvári. ALT, 2012. [pdf]
  • The grand challenge of computer Go: Monte Carlo tree search and extensions. Sylvain Gelly, Levente Kocsis, Marc Schoenauer, Michèle Sebag, David Silver, Csaba Szepesvári, Olivier Teytaud. Commun. ACM, 2012. [pdf]
  • Deep Representations and Codes for Image Auto-Annotation. Ryan Kiros, Csaba Szepesvári. NIPS, 2012. [pdf]
  • The adversarial stochastic shortest path problem with unknown transition probabilities. Gergely Neu, András György, Csaba Szepesvári. AISTATS, 2012. [pdf]
  • Online-to-Confidence-Set Conversions and Application to Sparse Stochastic Bandits. Yasin Abbasi-Yadkori, Dávid Pál, Csaba Szepesvári. AISTATS, 2012. [pdf]
  • Approximate policy iteration with linear action models. Hengshuai Yao, Csaba Szepesvári. AAAI 2012, 2012. [pdf]
  • Recruiting a New Substrate for Triacylglycerol Synthesis in Plants: The Monoacylglycerol Acyltransferase Pathway. James Robertson Petrie, Thomas Vanhercke, Pushkar Shrestha, Anna El Tahchy, Adam White, Xue-Rong Zhou, Qing Liu, Maged Peter Mansour, Peter David Nichols, Surinder Pal Singh. PloS one, 2012. [pdf]
  • Clouds in Space: Scientific Computing using Windows Azure. Steven J. Johnston, Neil S. O'Brien, Hugh G. Lewis, Elizabeth E. Hart, Adam White, Simon J. Cox. Journal of Cloud Computing: Advances, Systems and Applications, 2012. [pdf]
  • Correction: Recruiting a New Substrate for Triacylglycerol Synthesis in Plants: The Monoacylglycerol Acyltransferase Pathway. James Robertson Petrie, Thomas Vanhercke, Pushkar Shrestha, Anna El Tahchy, Adam White, Xue-Rong Zhou, Qing Liu, Maged Peter Mansour, Peter David Nichols, Surinder Pal Singh. , 2012.
  • Hand Lesions: An Unusual Presentation to the Acute Medical Take. Adam White, Alison Bateson. , 2012. [pdf]
  • Quitlines and nicotine replacement for smoking cessation: do we need to change policy? John P Pierce, Sharon E Cummins, Martha White, Aimee Humphrey, Karen Messer. Annual review of public health, 2012. [pdf]
  • Context Tree Switching. Joel Veness, Kee Siong Ng, Marcus Hutter, Michael H. Bowling. 2012 Data Compression Conference, 2012. [pdf]
  • Finding Optimal Abstract Strategies in Extensive-Form Games. Michael Johanson, Nolan Bard, Neil Burch, Michael H. Bowling. AAAI, 2012. [pdf]
  • Efficient Nash equilibrium approximation through Monte Carlo counterfactual regret minimization. Michael Johanson, Nolan Bard, Marc Lanctot, Richard G. Gibson, Michael H. Bowling. AAMAS, 2012. [pdf]
  • Keynotes [abstracts of three keynote presentations]. Jeff Orkin, Gillian Smith, Michael H. Bowling. CIG, 2012. [pdf]
  • Investigating Contingency Awareness Using Atari 2600 Games. Marc G. Bellemare, Joel Veness, Michael H. Bowling. AAAI, 2012. [pdf]
  • Linear Fitted-Q Iteration with Multiple Reward Functions. Daniel J. Lizotte, Michael H. Bowling, Susan A. Murphy. ICAPS, 2012. [pdf]
  • On Local Regret. Michael H. Bowling, Martin Zinkevich. ICML, 2012. [pdf]
  • Sketch-Based Linear Value Function Approximation. Marc G. Bellemare, Joel Veness, Michael H. Bowling. NIPS, 2012. [pdf]
  • Tractable Objectives for Robust Policy Optimization. Katherine Chen, Michael H. Bowling. NIPS, 2012. [pdf]
  • Generalized sampling and variance in counterfactual regret minimization. Richard G. Gibson, Marc Lanctot, Neil Burch, Duane Szafron, Michael H. Bowling. AAAI 2012, 2012. [pdf]
  • No-Regret Learning in Extensive-Form Games with Imperfect Recall. Marc Lanctot, Richard G. Gibson, Neil Burch, Michael H. Bowling. ICML, 2012. [pdf]

2011

  • Online human training of a myoelectric prosthesis controller via actor-critic reinforcement learning. Patrick M. Pilarski, Michael Rory Dawson, Thomas Degris, Farbod Fahimi, Jason P. Carey, Richard S. Sutton. 2011 IEEE International Conference on Rehabilitation Robotics, 2011. [pdf]
  • Beyond Reward: The Problem of Knowledge and Data. Richard S. Sutton. ILP, 2011. [pdf]
  • Horde: a scalable real-time architecture for learning knowledge from unsupervised sensorimotor interaction. Richard S. Sutton, Joseph Modayil, Michael Delp, Thomas Degris, Patrick M. Pilarski, Adam White, Doina Precup. AAMAS, 2011. [pdf]
  • Real-Time Discriminative Background Subtraction. Li Cheng, Minglun Gong, Dale Schuurmans, Terry Caelli. IEEE Transactions on Image Processing, 2011. [pdf]
  • Adaptive large margin training for multilabel classification. Yuhong Guo, Dale Schuurmans. AAAI 2011, 2011. [pdf]
  • Modular community detection in networks. Wenye Li, Dale Schuurmans. IJCAI 2011, 2011. [pdf]
  • Convex Sparse Coding, Subspace Learning, and Semi-Supervised Extensions. Xinhua Zhang, Yaoliang Yu, Martha White, Ruitong Huang, Dale Schuurmans. AAAI, 2011. [pdf]
  • Advances in Large Margin Classifiers. Alexander J. Smola, Peter Bartlett, Bernhard Schölkopf, Dale Schuurmans. , 2011. [pdf]
  • Rank/Norm Regularization with Closed-Form Solutions: Application to Subspace Clustering. Yaoliang Yu, Dale Schuurmans. UAI, 2011. [pdf]
  • MapReduce for Parallel Reinforcement Learning. Yuxi Li, Dale Schuurmans. EWRL, 2011. [pdf]
  • Editors' Introduction. Jyrki Kivinen, Csaba Szepesvári, Esko Ukkonen, Thomas Zeugmann. ALT, 2011. [pdf]
  • Regret Bounds for the Adaptive Control of Linear Quadratic Systems. Yasin Abbasi-Yadkori, Csaba Szepesvári. COLT, 2011. [pdf]
  • Agnostic KWIK learning and efficient approximate reinforcement learning. István Szita, Csaba Szepesvári. COLT, 2011. [pdf]
  • Regularized Least-Squares Regression: Learning from a β-mixing Sequence. Amir-massoud Farahmand, Csaba Szepesvári. , 2011. [pdf]
  • Online Least Squares Estimation with Self-Normalized Processes: An Application to Bandit Problems. Yasin Abbasi-Yadkori, Dávid Pál, Csaba Szepesvári. ArXiv, 2011. [pdf]
  • PAC-Bayesian Policy Evaluation for Reinforcement Learning. Mahdi Milani Fard, Joelle Pineau, Csaba Szepesvári. UAI, 2011. [pdf]
  • Minimax Regret of Finite Partial-Monitoring Games in Stochastic Environments. Gábor Bartók, Dávid Pál, Csaba Szepesvári. COLT, 2011. [pdf]
  • Improved Algorithms for Linear Stochastic Bandits (extended version). Yasin Abbasi-Yadkori, Dávid Pál, Csaba Szepesvári. , 2011.
  • Non-trivial two-armed partial-monitoring games are bandits. András Antos, Gábor Bartók, Csaba Szepesvári. ArXiv, 2011. [pdf]
  • Least Squares Temporal Difference Learning and Galerkin's Method. Csaba Szepesvári. , 2011. [pdf]
  • Improved Algorithms for Linear Stochastic Bandits. Yasin Abbasi-Yadkori, Dávid Pál, Csaba Szepesvári. NIPS, 2011. [pdf]
  • Invited Talk: Towards Robust Reinforcement Learning Algorithms. Csaba Szepesvári. , 2011. [pdf]
  • Sequential learning for optimal monitoring of multi-channel wireless networks. Pallavi Arora, Csaba Szepesvári, Rong Zheng. 2011 Proceedings IEEE INFOCOM, 2011. [pdf]
  • X-Armed Bandits. Sébastien Bubeck, Rémi Munos, Gilles Stoltz, Csaba Szepesvári. Journal of Machine Learning Research, 2011. [pdf]
  • Complex Interactions and Ecosystem Function: Auto-regulation of an Insect Community in a Coffee Agroecosystem. Heidi Liere, John H. Vandermeer, Robyn J. Burnham, Ivette Perfecto, Braulio Chilel, Gabriel Dominguez, George R. Livingston, Carley J Kratz, Adam White, Stacy M. Philpott. , 2011. [pdf]
  • Prevalence and correlates of suicidal ideation among outpatients at a comprehensive cancer center. J. Michael Randall, Lyudmila Bazhenova, Martha White, Anjali A Bharne, Kelly Anne Shimabukuro, Yuri Matusov, Karen S Messer, Amy E. Lowery, Matthew J Loscalzo, Karen Lynn Clark, Wayne A. Bardwell. , 2011. [pdf]
  • Obesity increases and physical activity decreases lower urinary tract symptom risk in older men: the Osteoporotic Fractures in Men study. J Kellogg Parsons, Karen Messer, Martha White, Elizabeth Barrett-Connor, Doug Bauer, Lynn M Marshall. European urology, 2011. [pdf]
  • Prevalence of heavy smoking in California and the United States, 1965-2007. John P Pierce, Karen S. Messer, Martha White, David W. Cowling, David P Thomas. JAMA, 2011. [pdf]
  • TRY – a global database of plant traits. Jens Kattge, Sandra Díaz, Sandra Lavorel, I Colin Prentice, Paul W. Leadley, Gerhard Boenisch, Eric Garnier, Mark Westoby, Peter B. Reich, Ian J Wright, Johannes H C Cornelissen, Cyrille Violle, Sandy P Harrison, Peter M van Bodegom, Markus Reichstein, Brian J. Enquist, Nadejda A. Soudzilovskaia, David D. Ackerly, Madhur Anand, Owen K Atkin, Michael Bahn, Timothy R. Baker, D. Baldocchi, Renée Bekker, Carolina C. Blanco, Benjamin Blonder, William J Bond, R. A. Bradstock, Daniel E. Bunker, Fernando Casanoves, Jeannine Cavender-Bares, Jeffrey Q. Chambers, F. Stuart Chapin, Jérôme Chave, David Anthony Coomes, William K. Cornwell, Joseph M. Craine, Barbara H. Dobrin, Lina Duarte, Walter Durka, Jim Elser, Gabrielle Esser, M. Angels Estiarte, William F. Fagan, Jiacheng Fang, Felipe Fernández-Méndez, Anthony Fidelis, Bryan Finegan, Olivier Flores, Heather Ford, Daniel Frank, Grégoire T Freschet, Nikolaos M. Fyllas, Rachael V. Gallagher, William A. Green, Alvaro G. Gutiérrez, Thomas Hickler, Steven Ian Higgins, John G Hodgson, Adel Jalili, Stephanie Jansen, Carlos A. Joly, Andrew J. Kerkhoff, Donald Kirkup, Kaoru Kitajima, Michael Kleyer, Stefan Klotz, Johannes M. H. Knops, K Kramer, Ingolf Kühn, Hiroko Kurokawa, Dan Laughlin, Tsung D. Lee, Michelle R Leishman, Frederic Lens, Thomas Lenz, Simon L. Lewis, Jananee Lloyd, Joan Llusià, Frédérique Louault, Shuqing Ma, Miguel D. Mahecha, Pete Manning, Tara Joy Massad, Belinda E Medlyn, Julie Messier, Angela T. Moles, Sandra Cristina Müller, Karin Nadrowski, Shahid Naeem, Ülo Niinemets, S Nöllert, A Nüske, Romà Ogaya, Jacek Oleksyn, Vladimir G. Onipchenko, Yusuke Onoda, J. Ordonez, Gritt Overbeck, Wim A Ozinga, Sandra Patiño, Susana Paula, Juli G. Pausas, Josep Peñuelas, Oliver L Phillips, Valério D. Pillar, Hendrik Poorter, Lourens Poorter, Peter Poschlod, Andreas Prinzing, Raphaël Proulx, Anja Rammig, Sabine Reinsch, Björn Reu, Lawren Sack, Beatriz Salgado-Negret, Jordi Sardans, Satomi Shiodera, Bill Shipley, Andrew Siefert, Enio E. Sosinski, J. Soussana, Emily K. Swaine, Natalia Swenson, Ken Thompson, Pat Thornton, Michael Waldram, Evan Weiher, Martha White, Sarah White, S Joseph Wright, Benjamin Yguel, Sönke Zaehle, Amy E. Zanne, Chris Wirth. , 2011. [pdf]
  • A nationwide analysis of US racial/ethnic disparities in smoking behaviors, smoking cessation, and cessation-related factors. Dennis R. Trinidad, Eliseo J. Pérez-Stable, Martha White, Sherry L Emery, Karen S Messer. American journal of public health, 2011. [pdf]
  • Increasing hookah use in California. Joshua R. Smith, Steven Edland, Thomas E. Novotny, C. Richard Hofstetter, Martha White, Suzanne Lindsay, Wael K. Al-Delaimy. American journal of public health, 2011. [pdf]
  • Home smoking bans among U.S. households with children and smokers. Opportunities for intervention. Alice L Mills, Martha White, John P Pierce, Karen S. Messer. American journal of preventive medicine, 2011. [pdf]
  • The lemonade stand game competition: solving unsolvable games. Martin Zinkevich, Michael H. Bowling, Michael Wunder. SIGecom Exchanges, 2011. [pdf]
  • Variance Reduction in Monte-Carlo Tree Search. Joel Veness, Marc Lanctot, Michael H. Bowling. NIPS, 2011. [pdf]
  • Euclidean Heuristic Optimization. D. Chris Rayner, Michael H. Bowling, Nathan R. Sturtevant. AAAI, 2011. [pdf]
  • Accelerating Best Response Calculation in Large Extensive Games. Michael Johanson, Kevin Waugh, Michael H. Bowling, Martin Zinkevich. IJCAI, 2011. [pdf]

2010

  • Active Sequential Estimation of Object Dynamics with Tactile Sensory Feedback. Patrick M. Pilarski, Adam White, Thomas Degris, Richard S. Sutton. , 2010. [pdf]
  • Off-Policy Knowledge Maintenance for Robots. Joseph Modayil, Patrick M. Pilarski, Adam White, Thomas Degris, Richard S. Sutton. , 2010. [pdf]
  • GQ(λ): A general gradient algorithm for temporal-difference prediction learning with eligibility traces. Hamid Reza Maei, Richard S. Sutton. , 2010. [pdf]
  • Toward Off-Policy Learning Control with Function Approximation. Hamid Reza Maei, Csaba Szepesvári, Shalabh Bhatnagar, Richard S. Sutton. ICML, 2010. [pdf]
  • Timing in trace conditioning of the nictitating membrane response of the rabbit (Oryctolagus cuniculus): scalar, nonscalar, and adaptive features. E. James Kehoe, Elliot A. Ludvig, Richard S. Sutton. Learning & memory, 2010. [pdf]
  • Facility locations revisited: An efficient belief propagation approach. Wenye Li, Linli Xu, Dale Schuurmans. 2010 IEEE International Conference on Automation and Logistics, 2010. [pdf]
  • Improved Natural Language Learning via Variance-Regularization Support Vector Machines. Shane Bergsma, Dekang Lin, Dale Schuurmans. CoNLL, 2010. [pdf]
  • Distributed Flow Algorithms for Scalable Similarity Visualization. Novi Quadrianto, Dale Schuurmans, Alexander J. Smola. 2010 IEEE International Conference on Data Mining Workshops, 2010. [pdf]
  • Strictly Lexicalised Dependency Parsing. Qin Iris Wang, Dale Schuurmans, Dekang Lin. , 2010. [pdf]
  • Linear Coherent Bi-cluster Discovery via Beam Detection and Sample Set Clustering. Yi Shi, Maryam Hasan, Zhipeng Cai, Guohui Lin, Dale Schuurmans. COCOA, 2010. [pdf]
  • Relaxed Clipping: A Global Training Method for Robust Regression and Classification. Yaoliang Yu, Min Yang, Linli Xu, Martha White, Dale Schuurmans. NIPS, 2010. [pdf]
  • A Disease Classifier for Metabolic Profiles Based on Metabolic Pathway Knowledge. Vickie Baracos, Dale Schuurmans. , 2010. [pdf]
  • Prediction of Protein Domain-Types by Backpropagation. Csaba Szepesvári, Csamid Bachrati, Sándor Pongor. , 2010. [pdf]
  • Estimation of Rényi Entropy and Mutual Information Based on Generalized Nearest-Neighbor Graphs. Dávid Pál, Barnabás Póczos, Csaba Szepesvári. NIPS, 2010. [pdf]
  • Algorithms for Reinforcement Learning. Csaba Szepesvári. Algorithms for Reinforcement Learning, 2010. [pdf]
  • Models of active learning in group-structured state spaces. Gábor Bartók, Csaba Szepesvári, Sandra Zilles. Inf. Comput., 2010. [pdf]
  • Error Propagation for Approximate Policy and Value Iteration. Amir Massoud Farahmand, Rémi Munos, Csaba Szepesvári. NIPS, 2010. [pdf]
  • REGO: Rank-based Estimation of Renyi Information using Euclidean Graph Optimization. Barnabás Póczos, Sergey Kirshner, Csaba Szepesvári. AISTATS, 2010. [pdf]
  • A Markov-Chain Monte Carlo Approach to Simultaneous Localization and Mapping. Péter Torma, András György, Csaba Szepesvári. AISTATS 2010, 2010. [pdf]
  • Model-based reinforcement learning with nearly tight exploration complexity bounds. István Szita, Csaba Szepesvári. ICML, 2010. [pdf]
  • Online Markov Decision Processes Under Bandit Feedback. Gergely Neu, András György, Csaba Szepesvári, András Antos. IEEE Transactions on Automatic Control, 2010. [pdf]
  • Model Selection in Reinforcement Learning. Amir Massoud Farahmand, Csaba Szepesvári. Machine Learning, 2010. [pdf]
  • Active learning in heteroscedastic noise. András Antos, Varun Grover, Csaba Szepesvári. Theor. Comput. Sci., 2010. [pdf]
  • Parametric Bandits: The Generalized Linear Case. Sarah Filippi, Olivier Cappé, Aurélien Garivier, Csaba Szepesvári. NIPS, 2010. [pdf]
  • Extending rapidly-exploring random trees for asymptotically optimal anytime motion planning. Yasin Abbasi-Yadkori, Joseph Modayil, Csaba Szepesvári. IROS 2010, 2010. [pdf]
  • Budgeted Distribution Learning of Belief Net Parameters. Liuyang Li, Barnabás Póczos, Csaba Szepesvári, Russell Greiner. ICML, 2010. [pdf]
  • Toward a Classification of Finite Partial-Monitoring Games. András Antos, Gábor Bartók, Dávid Pál, Csaba Szepesvári. ALT, 2010. [pdf]
  • Multiple myeloma may include microvessel endothelial cells of malignant origin. Linda M. Pilarski, Patrick M. Pilarski, Andrew R. Belch. Leukemia & lymphoma, 2010. [pdf]
  • Towards robust cellular image classification: theoretical foundations for wide-angle scattering pattern analysis. Patrick M. Pilarski, Christopher J. Backhouse. Biomedical optics express, 2010. [pdf]
  • Interval Estimation for Reinforcement-Learning Algorithms in Continuous-State Domains. Martha White, Adam M. White. NIPS, 2010. [pdf]
  • Report on the 2008 Reinforcement Learning Competition. Shimon Whiteson, Brian Tanner, Adam M. White. , 2010. [pdf]
  • The Reinforcement Learning Competitions. Shimon Whiteson, Brian Tanner, Adam White. , 2010. [pdf]
  • A General Framework for Reducing Variance in Agent Evaluation. Martha White. , 2010. [pdf]
  • Twelve-week efficacy and safety study of mometasone furoate/formoterol 200/10 microg and 400/10 microg combination treatments in patients with persistent asthma previously receiving high-dose inhaled corticosteroids. Steven F. Weinstein, Jonathan Corren, Kevin S Murphy, Hendrik Nolte, Martha White. Allergy and asthma proceedings, 2010. [pdf]
  • Nanofiltered C1 inhibitor concentrate for treatment of hereditary angioedema. Bruce L Zuraw, Paula Jane Busse, Martha White, Joshua Jacobs, William Raymond Lumry, James Baker, Timothy Craig, J Andrew Grant, David S. Hurewitz, Leonard Bielory, William E. Cartwright, Majed Koleilat, Walter Ryan, Oren P. Schaefer, Michael Manning, Pragnesh A Patel, J A Bernstein, Roger A Friedman, Robert Wilkinson, David M Tanner, Gary Kohler, Glenne Gunther, Robyn J. Levy, James E. McClellan, Joseph Redhead, David M. Guss, Eugene R. Heyman, Brent A. Blumenstein, Ira N. Kalfus, Mike Frank. The New England journal of medicine, 2010. [pdf]
  • Forty years of faster decline in cigarette smoking in California explains current lower lung cancer rates. John P Pierce, Karen Messer, Martha White, Sheila Kealey, David W. Cowling. Cancer epidemiology, biomarkers & prevention : a publication of the American Association for Cancer Research, cosponsored by the American Society of Preventive Oncology, 2010. [pdf]
  • Menthol cigarettes and smoking cessation among racial/ethnic groups in the United States. Dennis R. Trinidad, Eliseo J. Pérez-Stable, Karen S Messer, Martha White, John P Pierce. Addiction, 2010. [pdf]
  • Camel No. 9 cigarette-marketing campaign targeted young teenage girls. John P Pierce, Karen Messer, Lisa E. James, Martha White, Sheila Kealey, Donna M Vallone, Cheryl G Healton. Pediatrics, 2010. [pdf]
  • The functionality of a budesonide/formoterol pressurized metered-dose inhaler with an integrated actuation counter. Shailen R Shah, Martha White, Tom Uryniak, C. Douglas O'Brien. Allergy and asthma proceedings, 2010. [pdf]
  • Intensive Case Management Before and After Prison Release is No More Effective Than Comprehensive Pre-Release Discharge Planning in Linking HIV-Infected Prisoners to Care: A Randomized Trial. David A Wohl, Anna M. Scheyett, Carol E. Golin, Becky White, Jeanine M Matuszewski, Michael H. Bowling, Paula Smith, Faye Duffin, David S. Rosen, Andrew Kaplan, Jo Anne Earp. AIDS and Behavior, 2010. [pdf]
  • Efficient Reinforcement Learning with Multiple Reward Functions for Randomized Controlled Trial Analysis. Daniel J. Lizotte, Michael H. Bowling, Susan A. Murphy. ICML, 2010. [pdf]

2009

  • Natural actor-critic algorithms. Shalabh Bhatnagar, Richard S. Sutton, Mohammad Ghavamzadeh, Mark Lee. Automatica, 2009. [pdf]
  • Natural Actor-Critic Algorithms. Shalabh Bhatnagar, Richard S. Sutton, Mohammad Ghavamzadeh, M. Lee. , 2009. [pdf]
  • Fast gradient-descent methods for temporal-difference learning with linear function approximation. Richard S. Sutton, Hamid Reza Maei, Doina Precup, Shalabh Bhatnagar, David Silver, Csaba Szepesvári, Eric Wiewiora. ICML, 2009. [pdf]
  • Multi-Step Dyna Planning for Policy Evaluation and Control. Hengshuai Yao, Richard S. Sutton, Shalabh Bhatnagar, Diao Dongcui, Csaba Szepesvári. , 2009. [pdf]
  • Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation. Hamid Reza Maei, Csaba Szepesvári, Shalabh Bhatnagar, Doina Precup, David Silver, Richard S. Sutton. NIPS, 2009. [pdf]
  • The Grand Challenge of Predictive Empirical Abstract Knowledge. Richard S. Sutton. , 2009. [pdf]
  • Scalar timing varies with response magnitude in classical conditioning of the nictitating membrane response of the rabbit (Oryctolagus cuniculus). E. James Kehoe, Kirk N. Olsen, Elliot A. Ludvig, Richard S. Sutton. Behavioral neuroscience, 2009. [pdf]
  • Magnitude and timing of conditioned responses in delay and trace classical conditioning of the nictitating membrane response of the rabbit (Oryctolagus cuniculus). E. James Kehoe, Elliot A. Ludvig, Richard S. Sutton. Behavioral neuroscience, 2009. [pdf]
  • Dual Temporal Difference Learning. Min Yang, Yuxi Li, Dale Schuurmans. AISTATS, 2009. [pdf]
  • Learning Exercise Policies for American Options. Yuxi Li, Csaba Szepesvári, Dale Schuurmans. AISTATS, 2009. [pdf]
  • Optimal reverse prediction: a unified perspective on supervised, unsupervised and semi-supervised learning. Linli Xu, Martha White, Dale Schuurmans. ICML, 2009. [pdf]
  • Linear Coherent Bi-cluster Discovery via Line Detection and Sample Majority Voting. Yi Shi, Zhipeng Cai, Guohui Lin, Dale Schuurmans. COCOA, 2009. [pdf]
  • Discriminative Maximum Margin Image Object Categorization with Exact Inference. Qinfeng Shi, Luping Zhou, Li Cheng, Dale Schuurmans. 2009 Fifth International Conference on Image and Graphics, 2009. [pdf]
  • A General Projection Property for Distribution Families. Yaoliang Yu, Yuxi Li, Dale Schuurmans, Csaba Szepesvári. NIPS, 2009. [pdf]
  • A Reformulation of Support Vector Machines for General Confidence Functions. Yuhong Guo, Dale Schuurmans. ACML, 2009. [pdf]
  • Convex Relaxation of Mixture Regression with Efficient Algorithms. Novi Quadrianto, Tibério S. Caetano, John Lim, Dale Schuurmans. NIPS, 2009. [pdf]
  • Inference of the structural credit risk model using MLE. Yuxi Li, Li Cheng, Dale Schuurmans. 2009 IEEE Symposium on Computational Intelligence for Financial Engineering, 2009. [pdf]
  • Fast normalized cut with linear constraints. Linli Xu, Wenye Li, Dale Schuurmans. 2009 IEEE Conference on Computer Vision and Pattern Recognition, 2009. [pdf]
  • Regularized Fitted Q-Iteration for planning in continuous-space Markovian decision problems. Amir Massoud Farahmand, Mohammad Ghavamzadeh, Csaba Szepesvári, Shie Mannor. 2009 American Control Conference, 2009. [pdf]
  • Model-based and Model-free Reinforcement Learning for Visual Servoing. Amir Massoud Farahmand, Azad Shademan, Martin Jägersand, Csaba Szepesvári. ICRA 2009, 2009. [pdf]
  • Exploration-exploitation tradeoff using variance estimates in multi-armed bandits. Jean-Yves Audibert, Csaba Szepesvári. Theor. Comput. Sci., 2009. [pdf]
  • Learning when to stop thinking and do something!. Barnabás Póczos, Yasin Abbasi-Yadkori, Csaba Szepesvári, Russell Greiner, Nathan R. Sturtevant. ICML, 2009. [pdf]
  • Training parsers by inverse reinforcement learning. Gergely Neu, Csaba Szepesvári. Machine Learning, 2009. [pdf]
  • Reinforcement Learning Algorithms for MDPs. Csaba Szepesvári. , 2009. [pdf]
  • Regularized Fitted Q-iteration: Application to Planning. Amir Massoud Farahmand, Mohammad Ghavamzadeh, Csaba Szepesvári, Shie Mannor. EWRL, 2009. [pdf]
  • Forced-Exploration Based Algorithms for Playing in Stochastic Linear Bandits. Yasin Abbasi-Yadkori, Csaba Szepesvári. , 2009. [pdf]
  • Learning to segment from a few well-selected training images. Alireza Farhangfar, Russell Greiner, Csaba Szepesvári. ICML, 2009. [pdf]
  • Regularized Fitted Q-iteration : Application to Bounded Resource Planning. Amir Massoud Farahmand, Mohammad Ghavamzadeh, Csaba Szepesvári, Shie Mannor. , 2009. [pdf]
  • LMS-2: Towards an algorithm that is as cheap as LMS and almost as efficient as RLS. Hengshuai Yao, Shalabh Bhatnagar, Csaba Szepesvári. Proceedings of the 48th IEEE Conference on Decision and Control (CDC) held jointly with 2009 28th Chinese Control Conference, 2009. [pdf]
  • Workshop summary: On-line learning with limited feedback. Jean-Yves Audibert, Peter Auer, Alessandro Lazaric, Rémi Munos, Daniil Ryabko, Csaba Szepesvári. ICML, 2009. [pdf]
  • Genetic abnormalities in Waldenström's macroglobulinemia. Sophia Adamia, Patrick M. Pilarski, Andrew R. Belch, Linda M. Pilarski. Clinical lymphoma & myeloma, 2009. [pdf]
  • Computational analysis of mitochondrial placement and aggregation effects on wide-angle cell scattering patterns. Patrick M. Pilarski, Xuantao Su, D. Moira Glerum, Christopher J. Backhouse. , 2009. [pdf]
  • RL-Glue: Language-Independent Software for Reinforcement-Learning Experiments. Brian Tanner, Adam M. White. Journal of Machine Learning Research, 2009. [pdf]
  • Changing age-specific patterns of cigarette consumption in the United States, 1992-2002: association with smoke-free homes and state-level tobacco control activity. John P Pierce, Martha White, Karen S. Messer. Nicotine & tobacco research : official journal of the Society for Research on Nicotine and Tobacco, 2009. [pdf]
  • Prevalence and correlates of fatigue among patients at a comprehensive cancer center. Anjali A Bharne, Kelly Anne Shimabukuro, Priscilla Vu, Martha White, Paul J. Mills, Wayne A. Bardwell, Karen S Messer, Lyudmila Bazhenova. , 2009. [pdf]
  • Learning a Value Analysis Tool for Agent Evaluation. Martha White, Michael H. Bowling. IJCAI, 2009. [pdf]
  • Young adult smoking behavior: implications for future population health. Elizabeth A. Gilpin, Victoria M White, Martha White, John P Pierce. American journal of health behavior, 2009. [pdf]
  • Intermittent and light daily smoking across racial/ethnic groups in the United States. Dennis R. Trinidad, Eliseo J. Pérez-Stable, Sherry L Emery, Martha White, Rachel A. Grana, Karen S Messer. Nicotine & tobacco research : official journal of the Society for Research on Nicotine and Tobacco, 2009. [pdf]
  • A demonstration of the Polaris poker system. Michael H. Bowling, Nicholas Abou Risk, Nolan Bard, Darse Billings, Neil Burch, Joshua Davidson, John Alexander Hawkin, Robert C. Holte, Michael Johanson, Morgan Kan, Bryce Paradis, Jonathan Schaeffer, David Schnizlein, Duane Szafron, Kevin Waugh, Martin Zinkevich. AAMAS 2009, 2009. [pdf]
  • A Practical Use of Imperfect Recall. Kevin Waugh, Martin Zinkevich, Michael Johanson, Morgan Kan, David Schnizlein, Michael H. Bowling. SARA, 2009. [pdf]
  • Abstraction pathologies in extensive games. Kevin Waugh, David Schnizlein, Michael H. Bowling, Duane Szafron. AAMAS, 2009. [pdf]
  • Strategy Grafting in Extensive Games. Kevin Waugh, Nolan Bard, Michael H. Bowling. NIPS 2009, 2009. [pdf]
  • Monte Carlo Sampling for Regret Minimization in Extensive Games. Marc Lanctot, Kevin Waugh, Martin Zinkevich, Michael H. Bowling. NIPS, 2009. [pdf]
  • Data Biased Robust Counter Strategies. Michael Johanson, Michael H. Bowling. AISTATS, 2009. [pdf]

2008

  • Sample-based learning and search with permanent and transient memories. David Silver, Richard S. Sutton, Martin Müller. ICML, 2008. [pdf]
  • Reinforcement Learning in Environments with Independent Delayed-sense Dynamics. Masoud Shahamiri, Richard S. Sutton, Martin Jägersand, Sirish Shah. , 2008. [pdf]
  • Agent Learning using Action-Dependent Learning Rates in Computer Role-Playing Games. Maria Cutumisu, Duane Szafron, Michael H. Bowling, Richard S. Sutton. , 2008. [pdf]
  • A Convergent O(n) Temporal-difference Algorithm for Off-policy Learning with Linear Function Approximation. Richard S. Sutton, Csaba Szepesvári, Hamid Reza Maei. , 2008. [pdf]
  • A computational model of hippocampal function in trace conditioning. Elliot A. Ludvig, Richard S. Sutton, Eric Verbeek, E. James Kehoe. NIPS, 2008. [pdf]
  • Dyna-Style Planning with Linear Function Approximation and Prioritized Sweeping. Richard S. Sutton, Csaba Szepesvári, Alborz Geramifard, Michael H. Bowling. UAI, 2008. [pdf]
  • Magnitude and timing of nictitating membrane movements during classical conditioning of the rabbit (Oryctolagus cuniculus). E. James Kehoe, Elliot A. Ludvig, Joanne Dudeney, James Neufeld, Richard S. Sutton. Behavioral neuroscience, 2008. [pdf]
  • Stimulus Representation and the Timing of Reward-Prediction Errors in Models of the Dopamine System. Elliot A. Ludvig, Richard S. Sutton, E. James Kehoe. Neural Computation, 2008. [pdf]
  • A Convergent O(n) Algorithm for Off-policy Temporal-difference Learning with Linear Function Approximation. Richard S. Sutton, Csaba Szepesvári, Hamid Reza Maei. , 2008. [pdf]
  • Selected Bibliography on Connectionism. Oliver G. Selfridge, Richard S. Sutton, Charles W. Anderson. , 2008. [pdf]
  • Efficient global optimization for exponential family PCA and low-rank matrix factorization. Yuhong Guo, Dale Schuurmans. 2008 46th Annual Allerton Conference on Communication, Control, and Computing, 2008. [pdf]
  • Policy Iteration for Learning an Exercise Policy for American Options. Yuxi Li, Dale Schuurmans. EWRL, 2008. [pdf]
  • Semi-Supervised Convex Training for Dependency Parsing. Qin Iris Wang, Dale Schuurmans, Dekang Lin. ACL 2008, 2008. [pdf]
  • Efficient Stopping Rules. Volodymyr Mnih, Csaba Szepesvári, Dale Schuurmans. , 2008. [pdf]
  • Online Optimization in X-Armed Bandits. Sébastien Bubeck, Rémi Munos, Gilles Stoltz, Csaba Szepesvári. NIPS, 2008. [pdf]
  • Empirical Bernstein stopping. Volodymyr Mnih, Csaba Szepesvári, Jean-Yves Audibert. ICML, 2008. [pdf]
  • Active Learning of Group-Structured Environments. Gábor Bartók, Csaba Szepesvári, Sandra Zilles. ALT, 2008. [pdf]
  • Active Learning in Multi-armed Bandits. András Antos, Varun Grover, Csaba Szepesvári. ALT, 2008. [pdf]
  • European Workshop on Reinforcement Learning. Amir Massoud Farahmand, Mohammad Ghavamzadeh, Csaba Szepesvári, Shie Mannor, Olivier Teytaud, Eric Moulines, Alessandra Russo, Peter Vrancx, Tom Croonenborghs. , 2008. [pdf]
  • Regularized Policy Iteration. Amir Massoud Farahmand, Mohammad Ghavamzadeh, Csaba Szepesvári, Shie Mannor. NIPS, 2008. [pdf]
  • Finite-Time Bounds for Fitted Value Iteration. Rémi Munos, Csaba Szepesvári. Journal of Machine Learning Research, 2008. [pdf]
  • Speeding Up Planning in Markov Decision Processes via Automatically Constructed Abstractions. Alejandro Isaza, Csaba Szepesvári, Vadim Bulitko, Russell Greiner. UAI, 2008. [pdf]
  • Inherited and acquired variations in the hyaluronan synthase 1 (HAS1) gene may contribute to disease progression in multiple myeloma and Waldenstrom macroglobulinemia. Sophia Adamia, Amanda A Reichert, Hemalatha Kuppusamy, Jitra Kriangkum, Anirban Ghosh, Jennifer J Hodges, Patrick M. Pilarski, Michael J. Mant, T. H. Reiman, Andrew R. Belch, Linda M. Pilarski. Blood, 2008. [pdf]
  • Multiple Myeloma Includes Phenotypically Defined Subsets of Clonotypic CD20+ B Cells that Persist During Treatment with Rituximab. Linda M. Pilarski, Eva Baigorri, Michael J. Mant, Patrick M. Pilarski, Penelope J Adamson, Heddy Zola, Andrew R. Belch. Clinical medicine. Oncology, 2008. [pdf]
  • Rapid simulation of wide-angle scattering from mitochondria in single cells. Patrick M. Pilarski, Xuantao Su, D. Moira Glerum, Christopher J. Backhouse. Optics express, 2008. [pdf]
  • Indirect interactions between ant-tended hemipterans, a dominant ant Azteca instabilis (Hymenoptera: Formicidae), and shade trees in a tropical agroecosystem. George F. Livingston, Adam M. White, Carley J Kratz. Environmental entomology, 2008. [pdf]
  • Sex differences in working memory. Ashley Harness, Lorri Jacot, Shauna Scherf, Adam White, Jason E. Warnick. Psychological reports, 2008. [pdf]
  • Enabling factors and barriers for the use of health impact assessment in decision-making processes. Balsam Ahmad, David Chappel, Tanja Pless-Mulloli, Martha White. Public health, 2008. [pdf]
  • The effect of smoke-free homes on smoking behavior in the U.S.. Karen S Messer, Alice L Mills, Martha White, John P Pierce. American journal of preventive medicine, 2008. [pdf]
  • Asthma exacerbations in African Americans treated for 1 year with combination fluticasone propionate and salmeterol or fluticasone propionate alone. William Carl Bailey, Mario Castro, Jonathan Matz, Martha White, Mark T. Dransfield, Steve Yancey, Hector G. Ortega. Current medical research and opinion, 2008. [pdf]
  • Smoking trends among Filipino adults in California, 1990-2002. Romina Anne Africa Romero, Karen S. Messer, Joshua H. West, Martha White, Dennis R. Trinidad. Preventive medicine, 2008. [pdf]
  • Sigma point policy iteration. Michael H. Bowling, Alborz Geramifard, David Wingate. AAMAS, 2008. [pdf]
  • Multidisciplinary students and instructors: a second-year games course. Nathan R. Sturtevant, H. James Hoover, Jonathan Schaeffer, Sean Gouglas, Michael H. Bowling, Finnegan Southey, Matthew Bouchard, Ghassan Zabaneh. SIGCSE, 2008. [pdf]
  • Strategy evaluation in extensive games with importance sampling. Michael H. Bowling, Michael Johanson, Neil Burch, Duane Szafron. ICML, 2008. [pdf]
  • Apprenticeship learning using linear programming. Umar Syed, Michael H. Bowling, Robert E. Schapire. ICML, 2008. [pdf]
  • Scalable Action Respecting Embedding. Michael Biggs, Ali Ghodsi, Dana F. Wilkinson, Michael H. Bowling. ISAIM, 2008. [pdf]
  • Autonomous geocaching: navigation and goal finding in outdoor domains. James Neufeld, Michael Sokolsky, Jason Roberts, Adam Milstein, Stephen Walsh, Michael H. Bowling. AAMAS, 2008. [pdf]

2007

  • Reinforcement Learning of Local Shape in the Game of Go. David Silver, Richard S. Sutton, Martin Müller. IJCAI, 2007. [pdf]
  • Learning to Maximize Rewards: A Review of “Reinforcement Learning: An Introduction” by. Richard S. Sutton, Andrew G. Barto, Rajesh P. N. Rao. , 2007. [pdf]
  • The PEAK Project. Richard S. Sutton. , 2007. [pdf]
  • Incremental Natural Actor-Critic Algorithms. Shalabh Bhatnagar, Richard S. Sutton, Mohammad Ghavamzadeh, Mark Lee. NIPS, 2007. [pdf]
  • Research Grant Renewal Proposal Reinforcement Learning and Artificial Intelligence chair. Richard S. Sutton. , 2007. [pdf]
  • Natural-Gradient Actor-Critic Algorithms. Shalabh Bhatnagar, Richard S. Sutton, Mohammad Ghavamzadeh, Mark Lee. , 2007. [pdf]
  • On the role of tracking in stationary environments. Richard S. Sutton, Anna Koop, David Silver. ICML, 2007. [pdf]
  • Investigating Experience: Temporal Coherence and Empirical Knowledge Representation. Richard S. Sutton, Marcia L. Spetch. , 2007. [pdf]
  • Model-based Reinforcement Learning. Leonid Kuvayev, Richard S. Sutton. , 2007. [pdf]
  • Characterizing the benefits of model-based vs. direct-control learning in exploration. Dale Schuurmans. , 2007. [pdf]
  • Discriminative Batch Mode Active Learning. Yuhong Guo, Dale Schuurmans. NIPS, 2007. [pdf]
  • Greedy importance sampling: A new Monte Carlo inference method. Dale Schuurmans. , 2007. [pdf]
  • Dual Representations for Dynamic Programming and Reinforcement Learning. Tao Wang, Michael H. Bowling, Dale Schuurmans. 2007 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning, 2007. [pdf]
  • Convex Relaxations of Latent Variable Training. Yuhong Guo, Dale Schuurmans. NIPS, 2007. [pdf]
  • Automatic gait optimization with Gaussian process regression. Daniel J. Lizotte, Tao Wang, Michael H. Bowling, Dale Schuurmans. IJCAI 2007, 2007. [pdf]
  • Simple training of dependency parsers via structured boosting. Qin Iris Wang, Dekang Lin, Dale Schuurmans. IJCAI 2007, 2007. [pdf]
  • Stable Dual Dynamic Programming. Tao Wang, Daniel J. Lizotte, Michael H. Bowling, Dale Schuurmans. NIPS, 2007. [pdf]
  • Learning Gene Regulatory Networks via Globally Regularized Risk Minimization. Yuhong Guo, Dale Schuurmans. RECOMB-CG, 2007. [pdf]
  • Fitted Q-iteration in continuous action-space MDPs. András Antos, Rémi Munos, Csaba Szepesvári. NIPS, 2007. [pdf]
  • Sequence Prediction Exploiting Similarity Information. István Bíró, Zoltán Szamonek, Csaba Szepesvári. IJCAI, 2007. [pdf]
  • Towards Manifold-Adaptive Learning. Amir Massoud Farahmand, Csaba Szepesvári. , 2007. [pdf]
  • Apprenticeship Learning using Inverse Reinforcement Learning and Gradient Methods. Gergely Neu, Csaba Szepesvári. UAI, 2007. [pdf]
  • Improved Rates for the Stochastic Continuum-Armed Bandit Problem. Peter Auer, Ronald Ortner, Csaba Szepesvári. COLT, 2007. [pdf]
  • Continuous Time Associative Bandit Problems. András György, Levente Kocsis, Ivett Szabó, Csaba Szepesvári. IJCAI, 2007. [pdf]
  • Tuning Bandit Algorithms in Stochastic Environments. Jean-Yves Audibert, Rémi Munos, Csaba Szepesvári. ALT, 2007. [pdf]
  • Variance estimates and exploration function in multi-armed bandit. Jean-Yves Audibert, Rémi Munos, Csaba Szepesvári. , 2007. [pdf]
  • Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path. András Antos, Csaba Szepesvári, Rémi Munos. Machine Learning, 2007. [pdf]
  • Manifold-Adaptive Dimension Estimation. Amir Massoud Farahmand, Csaba Szepesvári, Jean-Yves Audibert. ICML, 2007. [pdf]
  • FISH and chips: chromosomal analysis on microfluidic platforms. Vincent J. Sieben, Carina S. Debes Marun, Patrick M. Pilarski, Govind V. Kaigala, Linda M. Pilarski, Christopher J. Backhouse. IET nanobiotechnology, 2007. [pdf]
  • The California Tobacco Control Program's effect on adult smokers: (2) Daily cigarette consumption levels. Wael K. Al-Delaimy, John P Pierce, Karen S. Messer, Martha White, Dennis R. Trinidad, Elizabeth A. Gilpin. Tobacco control, 2007. [pdf]
  • Receptivity to tobacco advertising and promotions among young adolescents as a predictor of established smoking in young adulthood. Elizabeth A. Gilpin, Martha White, Karen S. Messer, John P Pierce. American journal of public health, 2007. [pdf]
  • The California Tobacco Control Program's effect on adult smokers: (3) Similar effects for African Americans across states. Dennis R. Trinidad, Karen S. Messer, Elizabeth A. Gilpin, Wael K. Al-Delaimy, Martha White, John P Pierce. Tobacco control, 2007. [pdf]
  • Clinical decision-making: Patients' preferences and experiences. Elizabeth Murray, Lance Pollack, Martha White, Bernard Lo. Patient education and counseling, 2007. [pdf]
  • IJCAI-07 Reviewers. Akinori Abe, Douglas Aberdeen, Neeharika Adabala, Alicia Ageno, Eneko Agirre, David W. Aha, Stuart Aitken, Shotaro Akaho, Osamu Akashi, Rama Akkiraju, Alexandre Albore, Klaus-Dieter Althoff, Jose R. Alvarez, Sanchez, Analía Amandi, Gianni Amati, Leïla Amgoud, Rema Ananthanarayanan, Rie Kubota Ando, Henrik Andreasson, Fabrizio Angiulli, Carlos Ansótegui, Douglas E. Appelt, Raghav Aras, Aluizio F. R. Araújo, J. Arcos, Kai Oliver Arras, Antonio Artés, Mehran Asadi, Naveen Ashish, Hideki Asoh, Gilles Audemard, Anne Auger, Chen Avin, Franz Baader, Fahiem Bacchus, Jorge Baier, Olivier Bailleux, Mark Baillie, Stuart Bain, Bram Bakker, Sreeram Balakrishnan, Christian Balkenius, Antonio Rubio, Bikramjit Banerjee, Subhashis Banerjee, Raju S. Bapi, Chitta Baral, Guilherme De A. Barreto, Anthony Barrett, Leliane Nunes de Barros, Roman Barták, Peter Baumgartner, Ramón Béjar, Boulle, Paolo Bouquet, Michael H. Bowling, Antonio Rafael Braga, Arthur M. B. Braga, Sebastian Brand, Felix Brandt, Karl Branting, Jonathan Bredin, Philippe Bretier, Gerhard Brewka, Christopher Brewster, Darin Brezeale, Derek Bridge, Will Briggs, Ismel Brito, Kendrick T. Brown, Brett Browning, Frank Broz, Daniel Bryce, Sabine Buchholz, Olivier Buffet, Hung Bui, Vadim Bulitko, Ernesto Burattini, Wolfram Burgard, R. Burke, Michael Buro, Roy. IJCAI, 2007. [pdf]
  • Computing Robust Counter-Strategies. Michael Johanson, Martin Zinkevich, Michael H. Bowling. NIPS 2007, 2007. [pdf]
  • Regret Minimization in Games with Incomplete Information. Martin Zinkevich, Michael Johanson, Michael H. Bowling, Carmelo Piccione. NIPS, 2007. [pdf]
  • A New Algorithm for Generating Equilibria in Massive Zero-Sum Games. Martin Zinkevich, Michael H. Bowling, Neil Burch. AAAI, 2007. [pdf]
  • Particle Filtering for Dynamic Agent Modelling in Simplified Poker. Nolan Bard, Michael H. Bowling. AAAI, 2007. [pdf]

2006

  • Incremental Least-Squares Temporal Difference Learning. Alborz Geramifard, Michael H. Bowling, Richard S. Sutton. AAAI, 2006. [pdf]
  • Gain Adaptation Beats Least Squares? Richard S. Sutton. , 2006. [pdf]
  • iLSTD: Eligibility Traces and Convergence Analysis. Alborz Geramifard, Michael H. Bowling, Martin Zinkevich, Richard S. Sutton. NIPS, 2006. [pdf]
  • Robust Support Vector Machine Training via Convex Outlier Ablation. Linli Xu, Koby Crammer, Dale Schuurmans. AAAI, 2006. [pdf]
  • Stochastic Analysis of Lexical and Semantic Enhanced Structural Language Model. Shaojun Wang, Shaomin Wang, Li Cheng, Russell Greiner, Dale Schuurmans. ICGI, 2006. [pdf]
  • Implicit Online Learning with Kernels. Li Cheng, S. V. N. Vishwanathan, Dale Schuurmans, Shaojun Wang, Terry Caelli. NIPS, 2006. [pdf]
  • Improved Large Margin Dependency Parsing via Local Constraints and Laplacian Regularization. Qin Iris Wang, Colin Cherry, Daniel J. Lizotte, Dale Schuurmans. CoNLL, 2006. [pdf]
  • Constraint-based optimization and utility elicitation using the minimax decision criterion. Craig Boutilier, Relu Patrascu, Pascal Poupart, Dale Schuurmans. Artif. Intell., 2006. [pdf]
  • Compact, Convex Upper Bound Iteration for Approximate POMDP Planning. Tao Wang, Pascal Poupart, Michael H. Bowling, Dale Schuurmans. AAAI, 2006. [pdf]
  • Learning to Model Spatial Dependency: Semi-Supervised Discriminative Random Fields. Chi-Hoon Lee, Shaojun Wang, Feng Jiao, Dale Schuurmans, Russell Greiner. NIPS, 2006. [pdf]
  • Information Marginalization on Subgraphs. Jiayuan Huang, Tingshao Zhu, Russell Greiner, Dengyong Zhou, Dale Schuurmans. PKDD, 2006. [pdf]
  • Graphical Models and Point Pattern Matching. Tibério S. Caetano, Terry Caelli, Dale Schuurmans, Dante Augusto Couto Barone. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2006. [pdf]
  • Convex Structure Learning for Bayesian Networks: Polynomial Feature Selection and Approximate Ordering. Yuhong Guo, Dale Schuurmans. UAI, 2006. [pdf]
  • An Online Discriminative Approach to Background Subtraction. Li Cheng, Shaojun Wang, Dale Schuurmans, Terry Caelli, S. V. N. Vishwanathan. 2006 IEEE International Conference on Video and Signal Based Surveillance, 2006. [pdf]
  • Protein fold recognition using the gradient boost algorithm. Feng Jiao, Jinbo Xu, Libo Yu, Dale Schuurmans. Computational systems bioinformatics. Computational Systems Bioinformatics Conference, 2006. [pdf]
  • Discriminative unsupervised learning of structured predictors. Linli Xu, Dana F. Wilkinson, Finnegan Southey, Dale Schuurmans. ICML 2006, 2006. [pdf]
  • Web communities identification from random walks. Jiayuan Huang, Tingshao Zhu, Dale Schuurmans. PKDD, 2006. [pdf]
  • Semi-Supervised Conditional Random Fields for Improved Sequence Segmentation and Labeling. Feng Jiao, Shaojun Wang, Chi-Hoon Lee, Russell Greiner, Dale Schuurmans. ACL, 2006. [pdf]
  • Local Importance Sampling: A Novel Technique to Enhance Particle Filtering. Péter Torma, Csaba Szepesvári. Journal of Multimedia, 2006. [pdf]
  • Universal parameter optimisation in games based on SPSA. Levente Kocsis, Csaba Szepesvári. Machine Learning, 2006. [pdf]
  • RSPSA: Enhanced Parameter Optimization in Games. Levente Kocsis, Csaba Szepesvári, Mark H. M. Winands. ACG, 2006. [pdf]
  • Improved Monte-Carlo Search. Levente Kocsis, Csaba Szepesvári. , 2006. [pdf]
  • Learning Near-Optimal Policies with Bellman-Residual Minimization Based Fitted Policy Iteration and a Single Sample Path. András Antos, Csaba Szepesvári, Rémi Munos. COLT, 2006. [pdf]
  • Use of variance estimation in the multi-armed bandit problem. Jean-Yves Audibert, Rémi Munos, Csaba Szepesvári. , 2006. [pdf]
  • Bandit Based Monte-Carlo Planning. Levente Kocsis, Csaba Szepesvári. ECML, 2006. [pdf]
  • A method for cytometric image parameterization. Patrick M. Pilarski, Christopher J. Backhouse. Optics express, 2006. [pdf]
  • Feature construction for reinforcement learning in hearts. Nathan R. Sturtevant, Adam M. White. Computers and Games, 2006. [pdf]
  • Mercury ITIL Foundation : Mapping ITIL To The Real World. Adam White. , 2006. [pdf]
  • Factors associated with non-participation in a physical activity promotion trial. David J Chinn, Martha White, Denise Howel, J O E Harland, Chris Drinkwater. Public health, 2006. [pdf]
  • Cigarette promotional offers: who takes advantage? Victoria M White, Martha White, Karen Freeman, Elizabeth A. Gilpin, John P Pierce. American journal of preventive medicine, 2006. [pdf]
  • Trends in smoking among Hispanic women in California: Relationship to English language use. Dennis R. Trinidad, Elizabeth A. Gilpin, Karen S. Messer, Martha White, John P Pierce. American journal of preventive medicine, 2006. [pdf]
  • What contributed to the major decline in per capita cigarette consumption during California's comprehensive tobacco control programme? Elizabeth A. Gilpin, Karen S. Messer, Martha White, John P Pierce. Tobacco control, 2006. [pdf]
  • Adolescents' perceptions about quitting and nicotine replacement therapy: findings from the California Tobacco Survey. Wael K. Al-Delaimy, Martha White, John P Pierce. The Journal of adolescent health : official publication of the Society for Adolescent Medicine, 2006. [pdf]
  • Optimal Unbiased Estimators for Evaluating Agent Performance. Martin Zinkevich, Michael H. Bowling, Nolan Bard, Morgan Kan, Darse Billings. AAAI, 2006. [pdf]
  • Boosting Expert Ensembles for Rapid Concept Recall. Achim Rettinger, Martin Zinkevich, Michael H. Bowling. AAAI, 2006. [pdf]
  • Bayesian Calibration for Monte Carlo Localization. Armita Kaboli, Michael H. Bowling, Petr Musílek. AAAI, 2006. [pdf]
  • Machine learning and games. Michael H. Bowling, Johannes Fürnkranz, Thore Graepel, Ron Musick. Machine Learning, 2006. [pdf]
  • Robust game play against unknown opponents. Nathan R. Sturtevant, Michael H. Bowling. AAMAS, 2006. [pdf]
  • Prob-Maxn: Playing N-Player Games with Opponent Models. Nathan R. Sturtevant, Martin Zinkevich, Michael H. Bowling. AAAI, 2006. [pdf]
  • Learning predictive state representations using non-blind policies. Michael H. Bowling, Peter McCracken, Michael James, James Neufeld, Dana F. Wilkinson. ICML, 2006. [pdf]

2005

  • Reinforcement Learning for RoboCup Soccer Keepaway. Peter Stone, Richard S. Sutton, Gregory Kuhlmann. Adaptive Behaviour, 2005. [pdf]
  • TD(lambda) networks: temporal-difference networks with eligibility traces. Brian Tanner, Richard S. Sutton. ICML, 2005. [pdf]
  • Temporal Abstraction in TD Networks. Richard S. Sutton, Eddie J. Rafols, Anna Koop. , 2005. [pdf]
  • Using Predictive Representations to Improve Generalization in Reinforcement Learning. Eddie J. Rafols, Mark B. Ring, Richard S. Sutton, Brian Tanner. IJCAI, 2005. [pdf]
  • Off-policy Learning with Options and Recognizers. Doina Precup, Richard S. Sutton, Cosmin Paduraru, Anna Koop, Satinder P. Singh. NIPS, 2005. [pdf]
  • Temporal-Difference Networks with History. Brian Tanner, Richard S. Sutton. IJCAI, 2005. [pdf]
  • Temporal Abstraction in Temporal-difference Networks. Richard S. Sutton, Eddie J. Rafols, Anna Koop. NIPS, 2005. [pdf]
  • Variational Bayesian image modelling. Li Cheng, Feng Jiao, Dale Schuurmans, Shaojun Wang. ICML, 2005. [pdf]
  • Unsupervised and Semi-Supervised Multi-Class Support Vector Machines. Linli Xu, Dale Schuurmans. AAAI, 2005. [pdf]
  • Combining Statistical Language Models via the Latent Maximum Entropy Principle. Shaojun Wang, Dale Schuurmans, Fuchun Peng, Yunxin Zhao. Machine Learning, 2005. [pdf]
  • Learning Coordination Classifiers. Yuhong Guo, Russell Greiner, Dale Schuurmans. IJCAI, 2005. [pdf]
  • Improved estimation for unsupervised part-of-speech tagging. Q.I. Wang, Dale Schuurmans. 2005 International Conference on Natural Language Processing and Knowledge Engineering, 2005. [pdf]
  • Bayesian sparse sampling for on-line reward optimization. Tao Wang, Daniel J. Lizotte, Michael H. Bowling, Dale Schuurmans. ICML, 2005. [pdf]
  • Tangent-corrected embedding. Ali Ghodsi, Jiayuan Huang, Finnegan Southey, Dale Schuurmans. 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), 2005. [pdf]
  • Strictly Lexical Dependency Parsing. Qin Iris Wang, Dale Schuurmans, Dekang Lin. IWPT, 2005. [pdf]
  • Regret-based Utility Elicitation in Constraint-based Decision Problems. Craig Boutilier, Relu Patrascu, Pascal Poupart, Dale Schuurmans. IJCAI, 2005. [pdf]
  • Maximum Margin Bayesian Networks. Yuhong Guo, Dana F. Wilkinson, Dale Schuurmans. UAI, 2005. [pdf]
  • Reduced Variance Payoff Estimation in Adversarial Bandit Problems. Csaba Szepesvári. , 2005. [pdf]
  • Strict Feedback Systems. Mark French, Csaba Szepesvári, Eric Rogers. , 2005. [pdf]
  • Finite time bounds for sampling based fitted value iteration. Csaba Szepesvári, Rémi Munos. ICML, 2005. [pdf]
  • Learning near-optimal policies with fitted policy iteration and a single sample path. András Antos, Csaba Szepesvári. , 2005. [pdf]
  • On using likelihood-adjusted proposals in particle filtering: local importance sampling. Péter Torma, Csaba Szepesvári. ISPA 2005. Proceedings of the 4th International Symposium on Image and Signal Processing and Analysis, 2005. [pdf]
  • The Chain of Integrators. Mark French, Csaba Szepesvári, Eric Rogers. , 2005. [pdf]
  • Maximum Margin Discriminant Analysis based Face Recognition. Kornél Kovács, András Kocsor, Csaba Szepesvári. , 2005. [pdf]
  • X-mHMM: an efficient algorithm for training mixtures of HMMs when the number of mixtures is unknown. Zoltán Szamonek, Csaba Szepesvári. Fifth IEEE International Conference on Data Mining (ICDM'05), 2005. [pdf]
  • A Swarm-Based System for Object Recognition. Tanya Mirzayans, Nitin Parimi, Patrick M. Pilarski, Chris Backhouse, Loren Wyard-Scott, Petr Musílek. , 2005. [pdf]
  • An adaptable microvalving system for on-chip polymerase chain reactions. Patrick M. Pilarski, Sophia Adamia, Christopher J. Backhouse. Journal of immunological methods, 2005. [pdf]
  • Phylogenetic analysis of Pinguicula (Lentibulariaceae): chloroplast DNA sequences and morphology support several geographically distinct radiations. Thomas Cieslak, Jai Santosh Polepalli, Adam White, Kai J. Müller, Thomas Borsch, W Barthlott, Juerg Steiger, Adam D. Marchant, Laurent Legendre. American journal of botany, 2005. [pdf]
  • Nutritional status and energy expenditure in children pre-bone-marrow-transplant. Martha White, Alexia J Murphy, Yvonne Hastings, Jill Shergold, Jim Young, C. Franklin Montgomery, Peter Sw Davies, Laura P. Lockwood. Bone Marrow Transplantation, 2005. [pdf]
  • Facilitating adolescent smoking: who provides the cigarettes? Martha White, Elizabeth A. Gilpin, Sherry L Emery, John P Pierce. American journal of health promotion : AJHP, 2005. [pdf]
  • How do smokers control their cigarette expenditures? Victoria M White, Elizabeth A. Gilpin, Martha White, John P Pierce. Nicotine & tobacco research : official journal of the Society for Research on Nicotine and Tobacco, 2005. [pdf]
  • Adolescent smoking decline during California's tobacco control programme. John P Pierce, Martha White, Elizabeth A. Gilpin. Tobacco control, 2005. [pdf]
  • Why does adult African-American smoking prevalence in California remain higher than for non-Hispanic whites? Dennis R. Trinidad, Elizabeth A. Gilpin, Martha White, John P Pierce. Ethnicity & disease, 2005. [pdf]
  • Subjective Localization with Action Respecting Embedding. Michael H. Bowling, Dana F. Wilkinson, Ali Ghodsi, Adam Milstein. ISRR, 2005. [pdf]
  • Bayes' Bluff: Opponent Modelling in Poker. Finnegan Southey, Michael H. Bowling, Bryce Larson, Carmelo Piccione, Neil Burch, Darse Billings, D. Chris Rayner. UAI, 2005. [pdf]
  • Coordination and Adaptation in Impromptu Teams. Michael H. Bowling, Peter McCracken. AAAI, 2005. [pdf]
  • Multiagent Planning in the Presence of Multiple Goals. Michael H. Bowling, Rune M. Jensen, Manuela M. Veloso. , 2005. [pdf]
  • Online Discovery and Learning of Predictive State Representations. Peter McCracken, Michael H. Bowling. NIPS, 2005. [pdf]
  • Learning Subjective Representations for Planning. Dana F. Wilkinson, Michael H. Bowling, Ali Ghodsi. IJCAI, 2005. [pdf]
  • Action respecting embedding. Michael H. Bowling, Ali Ghodsi, Dana F. Wilkinson. ICML, 2005. [pdf]

2004

  • TD Networks. Richard S. Sutton, Brian Tanner. , 2004. [pdf]
  • Introduction: The challenge of reinforcement learning. Richard S. Sutton. Machine Learning, 2004. [pdf]
  • Associative search network: A reinforcement learning associative memory. Andrew G. Barto, Richard S. Sutton, Peter S. Brouwer. Biological Cybernetics, 2004. [pdf]
  • Reinforcement learning with replacing eligibility traces. Satinder P. Singh, Richard S. Sutton. Machine Learning, 2004. [pdf]
  • Temporal-Difference Networks. Richard S. Sutton, Brian Tanner. NIPS, 2004. [pdf]
  • Gain Adaptation Beats Least Squares. Richard S. Sutton. Proceedings of the Seventh Yale Workshop on Adaptive and Learning Systems, 2004. [pdf]
  • Exploiting syntactic, semantic and lexical regularities in language modeling via directed Markov random fields. Shaojun Wang, Shaomin Wang, Russell Greiner, Dale Schuurmans, Li Cheng. ISCSLP, 2004. [pdf]
  • Learning mixture models with the regularized latent maximum entropy principle. Shaojun Wang, Dale Schuurmans, Fuchun Peng, Yunxin Zhao. IEEE Transactions on Neural Networks, 2004. [pdf]
  • Dynamic Web log session identification with statistical language models. Xiangji Huang, Fuchun Peng, Aijun An, Dale Schuurmans. JASIST, 2004. [pdf]
  • Transformation-Invariant Embedding for Image Analysis. Ali Ghodsi, Jiayuan Huang, Dale Schuurmans. ECCV, 2004. [pdf]
  • Maximum Margin Clustering. Linli Xu, James Neufeld, Bryce Larson, Dale Schuurmans. NIPS, 2004. [pdf]
  • Augmenting Naive Bayes Classifiers with Statistical Language Models. Fuchun Peng, Dale Schuurmans, Shaojun Wang. Information Retrieval, 2004. [pdf]
  • Computer Aided Diagnosis of Clustered Microcalcifications Using Artificial Neural Nets. Erich Sorantin, Ferdinand Schmidt, Heinz Mayer, Michael Becker, Csaba Szepesvári, E. Graif, Peter Winkler. , 2004. [pdf]
  • Shortest Path Discovery Problems: A Framework, Algorithms and Experimental Results. Csaba Szepesvári. AAAI, 2004. [pdf]
  • Interpolation-based Q-learning. Csaba Szepesvári, William D. Smart. ICML, 2004. [pdf]
  • Kernel Machine Based Feature Extraction Algorithms for Regression Problems. Csaba Szepesvári, András Kocsor, Kornél Kovács. ECAI, 2004. [pdf]
  • Enhancing Particle Filters Using Local Likelihood Sampling. Péter Torma, Csaba Szepesvári. ECCV 2004, 2004. [pdf]
  • Margin Maximizing Discriminant Analysis. András Kocsor, Kornél Kovács, Csaba Szepesvári. ECML, 2004. [pdf]
  • Hyaluronan Synthases and RHAMM as Synergistic Mediators of Malignancy in B Lineage Cancers. Linda M. Pilarski, Sophia Adamia, Christopher A Maxwell, Patrick M. Pilarski, T. H. Reiman, Andrew R. Belch. , 2004. [pdf]
  • Improved Diagnosis and Monitoring of Cancer Using Portable Microfluidics Platforms. Linda M. Pilarski, Sophia Adamia, Patrick M. Pilarski, Ranjit Prakash, Jana Lauzon, Christopher J. Backhouse. 2004 International Conference on MEMS, NANO and Smart Systems (ICMENS'04), 2004. [pdf]
  • OSS through Java as an Implementation of NGOSS A White Paper. Adam White. , 2004. [pdf]
  • Efficacy and safety of fluticasone propionate/salmeterol HFA 134A MDI in patients with mild-to-moderate persistent asthma. David S. Pearlman, David Peden, John J. Condemi, Steven Weinstein, Martha White, Leslie A Baitinger, Catherine Scott, Suzanne Ho, Karen House, Paul M Dorinsky. The Journal of asthma : official journal of the Association for the Care of Asthma, 2004. [pdf]
  • Patient-centered communication: do patients really prefer it? Sara L. Swenson, Stephanie Buell, Patti Zettler, Martha White, Delaney C. Ruston, Bernard Lo. Journal of general internal medicine, 2004. [pdf]
  • STP: Skills, tactics and plays for multi-robot control in adversarial environments. Brett Browning, James Bruce, Michael H. Bowling, Manuela M. Veloso. , 2004. [pdf]
  • Game-Tree Search with Adaptation in Stochastic Imperfect-Information Games. Darse Billings, Aaron Davidson, Terence Schauenberg, Neil Burch, Michael H. Bowling, Robert C. Holte, Jonathan Schaeffer, Duane Szafron. Computers and Games, 2004. [pdf]
  • Safe Strategies for Agent Modelling in Games. Peter McCracken, Michael H. Bowling. , 2004. [pdf]
  • Convergence and No-Regret in Multiagent Learning. Michael H. Bowling. NIPS, 2004. [pdf]
  • Existence of Multiagent Equilibria with Limited Agents. Michael H. Bowling, Manuela M. Veloso. J. Artif. Intell. Res., 2004. [pdf]
  • Plays as Effective Multiagent Plans Enabling Opponent-Adaptive Play Selection. Michael H. Bowling, Brett Browning, Manuela M. Veloso. ICAPS, 2004. [pdf]

2003

  • iCORE Research Grant Proposal Reinforcement Learning and Artificial Intelligence. Richard S. Sutton. , 2003. [pdf]
  • Applying Machine Learning to Text Segmentation for Information Retrieval. Xiangji Huang, Fuchun Peng, Dale Schuurmans, Nick Cercone, Stephen E. Robertson. Information Retrieval, 2003. [pdf]
  • Language and task independent text categorization with simple language models. Fuchun Peng, Dale Schuurmans, Shaojun Wang. HLT-NAACL 2003, 2003. [pdf]
  • Automatic basis selection techniques for RBF networks. Ali Ghodsi, Dale Schuurmans. Neural Networks, 2003. [pdf]
  • Semantic n-gram language modeling with the latent maximum entropy principle. Shaojun Wang, Dale Schuurmans, Fuchun Peng, Yunxin Zhao. ICASSP, 2003. [pdf]
  • Learning Mixture Models with the Latent Maximum Entropy Principle. Shaojun Wang, Dale Schuurmans, Fuchun Peng, Yunxin Zhao. ICML, 2003. [pdf]
  • Model-Based Least-Squares Policy Evaluation. Fletcher Lu, Dale Schuurmans. , 2003. [pdf]
  • Learning Continuous Latent Variable Models with Bregman Divergences. Shaojun Wang, Dale Schuurmans. ALT, 2003. [pdf]
  • Face Alignment Using Statistical Models and Wavelet Features. Feng Jiao, Stan Z. Li, Harry Shum, Dale Schuurmans. CVPR, 2003. [pdf]
  • Session Boundary Detection for Association Rule Learning Using n-Gram Language Models. Xiangji Huang, Fuchun Peng, Aijun An, Dale Schuurmans, Nick Cercone. Canadian Conference on AI, 2003. [pdf]
  • Constraint-Based Optimization with the Minimax Decision Criterion. Craig Boutilier, Relu Patrascu, Pascal Poupart, Dale Schuurmans. CP, 2003. [pdf]
  • Boltzmann Machine Learning with the Latent Maximum Entropy Principle. Shaojun Wang, Dale Schuurmans, Fuchun Peng, Yunxin Zhao. UAI, 2003. [pdf]
  • Text classification in Asian languages without word segmentation. Fuchun Peng, Xiangji Huang, Dale Schuurmans, Shaojun Wang. IRAL, 2003. [pdf]
  • Latent Maximum Entropy Approach for Semantic N-gram Language Modeling. Shaojun Wang, Dale Schuurmans, Fuchun Peng. AISTATS, 2003. [pdf]
  • Language Independent Authorship Attribution with Character Level N-Grams. Fuchun Peng, Dale Schuurmans, Vlado Keselj, Shaojun Wang. EACL, 2003. [pdf]
  • Combining Naive Bayes and n-Gram Language Models for Text Classification. Fuchun Peng, Dale Schuurmans. ECIR, 2003. [pdf]
  • Monte Carlo Matrix Inversion Policy Evaluation. Fletcher Lu, Dale Schuurmans. UAI, 2003. [pdf]
  • Sequential Importance Sampling for Visual Tracking Reconsidered. Péter Torma, Csaba Szepesvári. AISTATS, 2003. [pdf]
  • Aortic valve calcification on computed tomography predicts the severity of aortic stenosis. S Joanna Cowell, David E. Newby, James Burton, Adam White, David B. Northridge, Nicholas A. Boon, John Z. Reid. Clinical radiology, 2003. [pdf]
  • Correlation between religion and happiness: a replication. Leslie J. Francis, Mandy Robbins, Adam White. Psychological reports, 2003. [pdf]
  • Changes in youth smoking participation in California in the 1990s. Elizabeth A. Gilpin, Sherry L Emery, Martha White, John P Pierce. Cancer Causes & Control, 2003. [pdf]
  • A Formalization of Equilibria for Multiagent Planning. Michael H. Bowling, Rune M. Jensen, Manuela M. Veloso. IJCAI, 2003. [pdf]
  • Simultaneous adversarial multi-robot learning. Michael H. Bowling, Manuela M. Veloso. IJCAI 2003, 2003. [pdf]
  • Multi-robot team response to a multi-robot opponent team. James Bruce, Michael H. Bowling, Brett Browning, Manuela M. Veloso. ICRA, 2003. [pdf]
  • Plays as Team Plans for Coordination and Adaptation. Michael H. Bowling, Brett Browning, Allen Chang, Manuela M. Veloso. RoboCup, 2003. [pdf]