Summary:
We are seeking a skilled and dynamic DevOps Engineer with a strong foundation in Linux Administration, hands-on experience in Kubernetes, and a passion for learning new technologies. The ideal candidate will bring expertise in DevOps practices and an understanding of High-Performance Computing (HPC) systems, ensuring the delivery of scalable and reliable infrastructure solutions.
Key Responsibilities:
- Design, implement, and manage DevOps pipelines to support continuous integration and deployment (CI/CD) processes.
- Administer and maintain Linux-based systems, ensuring optimal performance and security.
- Deploy, manage, and optimize Kubernetes clusters for containerized applications.
- Collaborate with development and operations teams to streamline workflows and improve system reliability.
- Monitor, troubleshoot, and resolve system and infrastructure issues proactively.
- Contribute to the design and implementation of HPC systems, ensuring they meet the organization's performance requirements.
- Evaluate and adopt emerging technologies to enhance the infrastructure and streamline processes.
- Develop and maintain detailed documentation for system configurations, processes, and workflows.
Qualifications and Skills:
- Minimum of 2 years of experience in Linux administration, with a strong understanding of system internals and shell scripting.
- Proven experience in DevOps practices, including automation, configuration management, and CI/CD workflows.
- Hands-on expertise with Kubernetes, including deployment, scaling, and monitoring of clusters.
- Ability to quickly learn and adapt to new tools and technologies in a fast-paced environment.
- Broad understanding of HPC systems and their operational requirements is a strong plus.
- Knowledge of cloud platforms (AWS, Azure, or GCP) is desirable but not mandatory.
- Strong problem-solving skills and the ability to work independently as well as collaboratively.
- Familiarity with tools such as Docker, Ansible, Terraform, and Jenkins.
- Experience with monitoring and logging tools such as Prometheus, Grafana, or the ELK stack.
- Excellent communication skills to convey technical concepts to non-technical stakeholders.