Real-time object recognition is becoming an essential part of many emerging services, such as augmented reality, which require accurate inference with low delay. We consider an edge-assisted object recognition system whose configuration options have diverse impacts on these key performance criteria, namely accuracy and delay. Our goal is to design an online algorithm that learns the optimal system configuration by observing the outcomes of previously applied configurations. We leverage the structure of the problem and combine a Gaussian process with a multi-armed bandit framework to solve it efficiently. Our results indicate that our solution makes better configuration choices than other bandit algorithms, resulting in lower regret.