Abstract: Machine learning algorithms are now widely used in many fields, and their performance depends directly on their hyperparameters. However, hyperparameter tuning relies on professional knowledge and expert experience. To address this problem, we propose an automatic hyperparameter optimization method based on reinforcement learning. The method treats hyperparameter optimization as a sequential decision problem and models it as a Markov decision process (MDP). A reinforcement learning agent automatically selects the hyperparameters of a machine learning algorithm, and the accuracy of the resulting model on the validation set is used as the reward. To reduce variance during training, we design a data boot pool technique. We conduct a series of experiments that tune the hyperparameters of random forest and XGBoost, and compare our method with five optimization methods (random search, Bayesian optimization, TPE, CMA-ES, and SMAC) on five datasets. The experimental results show that the proposed method achieves the best performance on 90% of the tasks. In addition, ablation experiments verify the effectiveness of the agent structure and the data boot pool.
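The MDP formulation can be illustrated with a minimal sketch. It is not the agent described in this paper. It assumes a tabular softmax policy trained with REINFORCE, a state that is only the index of the current decision step, and a hypothetical three-parameter search space. The function `validation_accuracy` is a synthetic stand-in for training a model and scoring it on validation data, and the data boot pool is omitted because its details are not given here.

```python
import math
import random

# Hypothetical search space: the agent picks one value per step of an episode.
SEARCH_SPACE = [
    ("n_estimators", [50, 100, 200]),
    ("max_depth", [3, 6, 9]),
    ("max_features", [0.3, 0.6, 1.0]),
]


def validation_accuracy(config):
    """Synthetic reward (assumption): stands in for a real fit-and-validate step."""
    target = {"n_estimators": 200, "max_depth": 6, "max_features": 0.6}
    hits = sum(config[name] == target[name] for name in target)
    return 0.7 + 0.1 * hits


def softmax(logits):
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_policy(episodes=3000, learning_rate=0.5, seed=0):
    rng = random.Random(seed)
    # One logit vector per decision step (state = step index only).
    logits = [[0.0] * len(choices) for _, choices in SEARCH_SPACE]
    baseline = None  # running mean of rewards, used to reduce variance
    for _ in range(episodes):
        # Roll out one episode: sample one hyperparameter value per step.
        actions = []
        for step_logits in logits:
            probs = softmax(step_logits)
            actions.append(rng.choices(range(len(probs)), weights=probs)[0])
        config = {
            name: choices[a]
            for (name, choices), a in zip(SEARCH_SPACE, actions)
        }
        reward = validation_accuracy(config)
        if baseline is None:
            baseline = reward
        advantage = reward - baseline
        baseline += 0.05 * (reward - baseline)
        # REINFORCE: d log pi(a) / d logit_i = 1[i == a] - pi(i).
        for step_logits, a in zip(logits, actions):
            probs = softmax(step_logits)
            for i in range(len(step_logits)):
                grad = (1.0 if i == a else 0.0) - probs[i]
                step_logits[i] += learning_rate * advantage * grad
    return logits


def greedy_config(logits):
    """Pick the highest-logit value at every step."""
    return {
        name: choices[max(range(len(choices)), key=lambda i: step[i])]
        for (name, choices), step in zip(SEARCH_SPACE, logits)
    }


if __name__ == "__main__":
    best = greedy_config(train_policy())
    print(best, validation_accuracy(best))
```

In a real setting, `validation_accuracy` would train the target algorithm (for example random forest or XGBoost) with the chosen configuration and return its validation accuracy, which makes each episode far more expensive than in this toy version.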