Monitoring your NVidia GPU with Prometheus
I have a deep learning box that runs various computations on both CPU and GPU. The ability to off-load expensive computations from my laptop is fantastic.
I created a small tool, nvidia-smi-prometheus, which runs as a service, reads the NVidia system stats every 10 seconds, and exports them for Prometheus.
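To give a feel for what such an exporter involves, here is a minimal sketch in Python using only the standard library. It is not the actual nvidia-smi-prometheus code: the metric names, the port 9101, and the helper names (read_nvidia_smi, to_prometheus, serve) are assumptions made for illustration. It polls nvidia-smi with its CSV query flags, converts each row to the Prometheus text format, and serves the latest reading over HTTP.

```python
import subprocess
import threading
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

# nvidia-smi query fields mapped to hypothetical metric names (assumed, not from the tool).
FIELDS = {
    "utilization.gpu": "nvidia_gpu_utilization_percent",
    "memory.used": "nvidia_gpu_memory_used_mib",
    "temperature.gpu": "nvidia_gpu_temperature_celsius",
}


def read_nvidia_smi():
    """Run nvidia-smi and return one CSV row per GPU (index first, then FIELDS)."""
    query = "index," + ",".join(FIELDS)
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={query}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def to_prometheus(csv_text):
    """Convert nvidia-smi CSV rows into Prometheus text-format sample lines."""
    lines = []
    for row in csv_text.strip().splitlines():
        index, *values = [cell.strip() for cell in row.split(",")]
        for name, value in zip(FIELDS.values(), values):
            try:
                number = float(value)
            except ValueError:
                continue  # skip non-numeric cells such as "[N/A]"
            lines.append(f'{name}{{gpu="{index}"}} {number}')
    return "".join(line + "\n" for line in lines)


def serve(port=9101, interval=10):
    """Refresh the stats every `interval` seconds and serve them on `port` (both assumed)."""
    cache = {"body": ""}

    def refresh():
        while True:
            try:
                cache["body"] = to_prometheus(read_nvidia_smi())
            except (OSError, subprocess.CalledProcessError):
                pass  # keep the last good reading if nvidia-smi fails
            time.sleep(interval)

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            body = cache["body"].encode()
            self.send_response(200)
            self.send_header("Content-Type", "text/plain; version=0.0.4")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    threading.Thread(target=refresh, daemon=True).start()
    HTTPServer(("", port), Handler).serve_forever()
```

Calling serve() on the GPU box and pointing a Prometheus scrape job at that port would then collect the samples. Caching the last reading keeps each scrape cheap, since nvidia-smi only runs once per interval rather than once per request.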
(insert image from next time)