Job Description
Paradyme, a CATHEXIS Company, has partnered with an industry leader in enterprise Artificial Intelligence software and is seeking a highly skilled Site Reliability Engineer (SRE) to join our team to manage, monitor, and optimize our clusters on Kubernetes. Together we’re accelerating our client’s digital transformation by building and deploying data-driven, scalable AI solutions. The ideal candidate will have a deep understanding of Kubernetes, Cloud Infrastructure, and Infrastructure as Code (IaC) practices. You will be responsible for ensuring the reliability and scalability of our Kubernetes clusters and Cloud Infrastructure. Active SECRET clearance or higher is required for consideration.
Responsibilities:
- Monitor and Manage Kubernetes Clusters: Ensure the stability, health, and scalability of Kubernetes Clusters, deploying applications and services on Kubernetes.
- Kubernetes Management: Deploy, monitor, and scale applications on Kubernetes clusters. Maintain Helm charts, manage services, and ensure resource allocation for optimal cluster performance.
- Cloud Infrastructure Management: Work with leading Cloud Platforms (AWS, GCP, Azure) to set up, configure, and manage infrastructure resources using Infrastructure as Code (Terraform, CloudFormation, etc.).
- Monitoring & Incident Response: Set up monitoring solutions, define alerts, and manage the incident response process for any issues related to Jenkins or Kubernetes clusters.
- Automate Infrastructure Processes: Build automation tools for scaling, monitoring, and maintaining infrastructure using modern tools like Terraform, Ansible, or equivalent.
- Collaborate Across Teams: Work closely with development, services, and operations teams to ensure a seamless integration between application development and infrastructure.
- Security & Compliance: Ensure all systems follow best practices in terms of security and compliance with relevant regulations. This includes role-based access, encryption, and automated vulnerability scanning.
Requirements:
- Active SECRET clearance or higher is required for consideration.
- Bachelor’s degree (or equivalent) in computer science or a related discipline
- A minimum of two (2) years of experience working with on-premises and off-premises cloud environments
- Experience with AWS, Azure, and/or GCP
- Ability to program (structured and OOP) using one or more high-level languages, such as Python, Java, C/C++, Ruby, and JavaScript
- Experience with distributed storage technologies such as NFS, HDFS, Ceph, and Amazon S3, as well as dynamic resource management frameworks (Apache Mesos, Kubernetes, Yarn)
- Proactive approach to identifying problems, performance bottlenecks, and areas for improvement
- Agile/Scrum experience