The Institute for Health Metrics and Evaluation (IHME) is an independent research center at the University of Washington focused on expanding the quantitative evidence base for health. A core research area for IHME is the Global Burden of Diseases, Injuries, and Risk Factors (GBD) enterprise. A systematic, scientific effort to quantify the comparative magnitude of health loss due to diseases, injuries, and risk factors by age, sex, and geography over time, the GBD is the largest and most comprehensive effort to date to measure epidemiological levels and trends worldwide. We are expanding the GBD to include both forecasts and detailed geospatial estimates, which will require data pipelines and analytic tools capable of processing hundreds of terabytes of data in parallel.
IHME is looking for an Infrastructure Software Engineer to work closely with our Infrastructure DevOps team to develop and support the processes and tools researchers, analysts, and developers need to produce estimates in a timely and efficient manner. The Infrastructure Software Engineer will play an important role in creating a more flexible, automated, and streamlined computing environment to support the increasingly complex computational needs of IHME's research. They will work to understand users' computational needs and then collaborate with scientific computing and research software engineers to design and implement new systems. These systems will provide a powerful platform for users to experiment with and scale their projects while maintaining software development best practices across the institute.
A successful candidate will be passionate about infrastructure as code, automation, software build tools, and big data; able to communicate with users of varying levels of expertise to understand their needs; and experienced in the full software development life cycle. This position is contingent upon grant funding.
- Identify common needs across research teams and lead the development of shared software libraries to ensure quality and standardization.
- Build APIs and web services that give researchers and engineers tools for accessing infrastructure.
- Work closely with developers to troubleshoot infrastructure and load issues and implement solutions in both development and production environments.
- Implement configuration and automation tools to simplify the process of deploying and upgrading software across our infrastructure.
- Aid in bridging the gap between the development of new scientific software and its implementation as production-ready systems, ensuring that new experimental software is developed with scale in mind and facilitating that growth.
- Define and develop system reporting and infrastructure health monitoring tools to be used by the Scientific Computing team.
- Keep up with industry trends to ensure we are using the best tools and services.
- Provide input on timelines for delivering both iterative milestones and completed products.
- Develop, use, and train others on best practices for deployment and upgrades.
- Become familiar with the different components of IHME’s analytic process and their purpose.
- Other duties as assigned.
Bachelor's degree in Computer Science or a related field plus 3 years of related work experience, or an equivalent combination of education and experience.
- Experience writing new code, modifying existing code, and designing applications in relevant languages (Java, Bash, C, Python, PHP).
- Experience designing modular, reusable code.
- Experience operating in a large-scale environment.
- Strong Linux command line skills
- Experience with build, continuous integration, and deployment systems such as Jenkins/Hudson, Make, Maven, Ant, Docker, Rancher, Kubernetes, VMware, Salt, and Terraform.
- Experience with source control systems such as Git, Perforce, Subversion, and Mercurial.
- Experience with SQL and MySQL
- Ability to use a wide variety of open source technologies and cloud services
- Familiarity with deployment and automation tools and concepts
- Demonstrated ability to innovate and work independently and successfully in a fast-paced, deadline driven environment
- Experience working and communicating effectively with academic and scientific researchers.
- Strong sense of focus and attention to detail
- Experience with Hadoop, Spark, or other big data solutions
- Knowledge of cluster management and scheduling systems such as SGE/UGE, Torque, or Slurm.
- Experience with monitoring and metrics-gathering tools (ELK stack, Nagios, Sensu).
Conditions of Employment: Weekend and evening work may be required. On-call responsibilities may be assigned some evenings and weekends every other month.