About the Role:
We are seeking an experienced Architecture Expert to lead the design, strategy, and evolution of our High-Performance Computing (HPC) environment. This role involves defining the architectural vision for our HPC infrastructure, guiding the selection and implementation of cutting-edge technologies (compute, storage, interconnects, schedulers), and ensuring the platform effectively supports demanding computational research, simulations, and data analysis workloads relevant to our software and industrial applications.
Responsibilities:
Lead the architectural design, planning, and implementation of scalable, reliable, and efficient HPC systems.
Develop and maintain the strategic roadmap for HPC infrastructure, including hardware, software, storage, and networking components.
Provide technical leadership and mentorship to HPC architects, engineers, and administrators.
Evaluate emerging HPC technologies and trends, making recommendations for adoption.
Collaborate with researchers, data scientists, software engineers, and application owners to understand computational requirements and design appropriate HPC solutions.
Define standards and best practices for HPC system configuration, job scheduling, resource management, and performance optimization.
Ensure the security and integrity of the HPC environment.
Oversee architectural aspects of system upgrades, expansions, and technology refreshes.
Work with vendors to evaluate and select HPC hardware and software components.
Contribute to capacity planning and performance tuning efforts.
Qualifications:
Minimum Qualifications:
Bachelor's degree in Computer Science, Engineering, Physics, or a related computational field; PhD or Master's degree preferred.
8+ years of experience working with High-Performance Computing (HPC) systems.
3+ years of experience in designing, architecting, or leading technical implementations of complex HPC environments.
Deep understanding of HPC architectures, parallel file systems (e.g., Lustre, GPFS), high-speed interconnects (e.g., InfiniBand, Slingshot), job schedulers (e.g., Slurm, LSF), and MPI/parallel programming concepts.
Experience leading technical projects or initiatives.
Experience applying HPC within specific industrial contexts (e.g., Energy, Aerospace, Life Sciences, Finance).
Experience with cloud-based HPC or hybrid HPC environments.
Programming skills (e.g., Python, C/C++, Fortran) and experience with scientific computing applications.