Abstract [eng] |
The Compact Muon Solenoid (CMS) is one of the general-purpose detectors at the CERN Large Hadron Collider (LHC) and collects enormous amounts of physics data. Before the final physics analysis can proceed, the data has to be checked for quality (certified) by passing a number of automatic steps (such as physics object reconstruction and histogram preparation) and manual steps (checking, comparison and decision making). The last manual step, decision making, is very important, error-prone and demands a lot of manpower. Automating this decision making (certification) is currently an active research topic that applies recent advances in computer science, specifically machine learning (ML). Ultimately, CMS data certification is a binary classification task, and various ML techniques are being investigated for their applicability to it. As in any other ML task, hyper-parameter tuning is a difficult problem: there is no golden rule and each use case is different. This study explored the applicability of meta-learning, a technique for finding hyper-parameters in which the algorithm learns them from previous training experiments. An evolutionary genetic algorithm was used to tune the hyper-parameters of a neural network, such as the number of hidden layers, the number of neurons per layer, the activation functions, dropouts, the training batch size and the optimizer. Initially, the genetic algorithm takes a manually specified set of hyper-parameters and then evolves towards a near-optimal solution. The stochastic genetic operators, crossover and mutation, were applied to avoid local optima. This study shows that by carefully seeding the initial solution, the optimum is likely to be found. The proposed solution improved the AUC score of the neural network used for CERN CMS data certification. A similar algorithm can be applied to other machine learning models for hyper-parameter optimization. |