Job Title: Site Reliability Engineer
Location: Austin, Texas, United States
Type: Full time

Job Summary
Our client is seeking an experienced Site Reliability Engineer to champion reliability, scalability, and operational excellence across large-scale enterprise platforms. This role focuses on applying engineering principles to operations, building automation, and ensuring highly available systems that support critical business applications.

Key Responsibilities
Promote and apply Site Reliability Engineering principles by driving automation and repeatable solutions to operational challenges.
Design and develop tools and scripts that automate operational workflows and integrate seamlessly with infrastructure and platforms.
Identify opportunities to enhance system reliability, performance, and observability for enterprise applications.
Partner with engineering, Agile, and operations teams to provide technical guidance and support initiatives focused on system stability and availability.
Respond to alerts, troubleshoot complex production issues, and manage changes to minimize risk and downtime.
Build frameworks, instrumentation, and monitoring solutions to improve deployment success and application health.
Support capacity planning efforts and contribute to CI/CD orchestration to improve software delivery efficiency.
Diagnose and resolve real-time issues in mission-critical application environments and feed insights back into development teams.
Participate in on-call rotations to support production systems.

Required Qualifications
Six to eight years of experience supporting and administering enterprise-scale systems and applications.
Extensive background in automation scripting, proactive monitoring, dashboard development, and alerting strategies.
Strong understanding of the software development lifecycle and experience driving process improvements.
Hands-on experience with enterprise system administration, deployment, and monitoring in both Linux and Windows environments.
Experience configuring, migrating, and supporting cloud-based applications; exposure to GCP or Pivotal Cloud Foundry is a plus.
Solid understanding of networking fundamentals, including DNS, DHCP, firewalls, and routing concepts.
Experience working with distributed systems and high-availability architectures.
Development or scripting experience using technologies such as .NET, PowerShell, Java, Python, or Bash.
Familiarity with relational and NoSQL databases, including SQL-based systems, Oracle, or MongoDB.
Experience with messaging platforms and streaming technologies such as Kafka, RabbitMQ, Solace, or IBM MQ.
Hands-on experience with monitoring and observability tools such as Splunk, AppDynamics, or similar platforms.
Bachelor's degree in Computer Science or a related technical discipline.

Preferred Qualifications
Prior experience in the financial services industry.
Experience working within Agile or iterative delivery frameworks.

Personal Attributes
Strong customer-focused mindset with a proactive approach to ownership and accountability.
High level of commitment to diagnosing and resolving complex issues in distributed environments.
Natural problem-solver with the persistence to investigate deeply and identify root causes.
Self-motivated professional capable of working independently while delivering consistent, high-quality results.