Job Title: Grafana Architect
Location: Coppell, TX or Tampa, FL or Jersey City, NJ (Hybrid)
Employment Type: CTH
Department: IT Infrastructure / DevOps / Observability

About the Role

We are seeking an experienced Grafana Architect to design, implement, and optimize observability solutions across our infrastructure and application ecosystem. The ideal candidate will have deep expertise in Grafana, Prometheus, and related monitoring tools, along with a strong understanding of data visualization, time-series databases, and alerting systems. You will collaborate closely with DevOps, SRE, and application teams to build scalable dashboards and ensure the reliability, availability, and performance of our systems.

Key Responsibilities

Design, implement, and maintain Grafana-based observability architectures across multi-environment infrastructures (cloud, hybrid, on-premises).
Develop advanced dashboards, alerts, and data visualizations for infrastructure and application monitoring.
Integrate Grafana with Prometheus, Loki, Tempo, InfluxDB, Elasticsearch, and other data sources.
Define and implement monitoring standards, best practices, and governance models.
Collaborate with application, DevOps, and SRE teams to ensure comprehensive monitoring coverage.
Automate provisioning, configuration, and deployment of Grafana environments using Terraform, Ansible, or Helm.
Optimize performance, scalability, and data retention for time-series data.
Evaluate and integrate Grafana Enterprise / Cloud capabilities where applicable.
Provide technical guidance, documentation, and training to teams on observability practices.
Participate in incident response and post-incident analysis to improve observability and alerting.

Required Qualifications

Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent experience).
5+ years of experience in IT infrastructure, DevOps, or SRE, including at least 2 years focused on Grafana architecture or administration.
Strong hands-on experience with Grafana, Prometheus, and Loki.
Solid understanding of metrics, logs, and traces, and how they interrelate.
Experience integrating Grafana with diverse data sources (SQL/NoSQL, Elasticsearch, AWS CloudWatch, Azure Monitor, etc.).
Proficiency in Linux, Docker, Kubernetes, and CI/CD pipelines.
Familiarity with scripting or automation tools (Python, Bash, Terraform, Ansible).
Excellent analytical, troubleshooting, and communication skills.

Preferred Qualifications

Experience with Grafana Enterprise / Grafana Cloud features (Alerting, Reporting, RBAC, Teams).
Knowledge of OpenTelemetry and distributed tracing tools.
Certification(s) in Grafana Labs, Kubernetes, or cloud platforms (AWS, GCP, Azure).
Experience architecting observability platforms in large-scale environments.