Location: Remote - AU/NZ only
Job Type: Full Time
The vision of the Azure Site Reliability Engineering group is to make it easy for everyone to create, consume, and manage planetary-scale, reliable cloud production services and infrastructure to achieve more. As a team, we bring together significant and complementary capabilities in tooling, infrastructure, monitoring, and insights in new ways that broaden our perspective. Our diversity of knowledge and experience comes together for the benefit of our users, our colleagues, our business, and ourselves.
If you enjoy analysing complicated problems, coming up with creative solutions, and working in focused teams to build reliable and novel solutions, we want you to join our Site Reliability Engineering (SRE) team.
The Azure SRE team is looking for engineers with broad experience in distributed systems. SREs are curious people who take engineering-based approaches to solving operations problems: we like infrastructure, we like seeing how big, complicated things work, and most importantly, we gain great satisfaction from making them better.
You will be working across Azure with a focus on increasing quality, performance, and reliability of the most essential services within Azure.
Our team has a wide variety of backgrounds, from Computer Science, Mathematics, and Engineering to Physics, Philosophy, Psychology, and English. Our diversity of knowledge and experience comes together for the benefit of our billions of daily users, our business, our colleagues, and ourselves.
If you are excited by this type of challenge, and you love to work in groups of people who are similarly excited, come join us. We value the input of people who aren't afraid to be learning all the time, who embrace mistakes because they show the way forward, and who are excited to continuously improve both services and themselves. We strongly believe that diverse experiences and backgrounds, together with an environment where everyone can feel safe to contribute their own insights in a data-driven, objective, and supportive way, are the key to making the best workplace possible, and the best workplace makes the best products and services. Not only is it the smart thing, it's the right thing.
- Work across Azure’s internal systems and services to design, develop, and improve platforms and processes that result in improved end-to-end reliability and maintainability for all.
- Work across Azure SRE to drive tools that help deliver insights and automation to simplify the complex world of planetary-scale services.
- Communicate effectively and partner well with other disciplines of the project team to deliver high quality solutions from ideas to production code.
- Write clean and thorough design documents and code that exemplify quality, simplicity, and maintainability.
- Be a mentor for design reviews, code, and test cases.
- Design systems that prioritize the customer perspective and experience.
- Quickly adapt and apply new technologies, tools, methods, and processes from both internal and external sources.
- Influence design, implementation, and architectural direction.
- Drive architectural consolidation and simplification.
- Exemplify the Microsoft values of leveraging the work of others and helping others be successful through your behaviours and actions.
- Bachelor of Science degree in Computer Science, or 5+ years in software development.
- 3+ years of software development in distributed systems or automation.
- 3+ years of experience using languages such as C, C++, C# or Java (others are acceptable).
- 2+ years of design, build, or implementation of distributed service health and telemetry.
- Collaboration to accomplish large projects with excellent communication and demonstrated initiative.
- Awareness of, and ability to reason about, modern software & systems architectures, including load-balancing, queueing, caching, distributed systems failure modes, and microservices.
- Associated troubleshooting skills, including the ability to follow service dependency chains across arbitrary network steps.
- Experience running large-scale cloud systems.
- Ability to analyse, understand, and solve complex problems by leveraging and extending existing technology.
- Willingness and ability to respectfully challenge the status quo.
- Able to operate in ambiguity and drive clarity through partnerships.
Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings: Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.
Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, color, family or medical care leave, gender identity or expression, genetic information, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran status, race, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable laws, regulations and ordinances. We also consider qualified applicants regardless of criminal histories, consistent with legal requirements. If you need assistance and/or a reasonable accommodation due to a disability during the application or the recruiting process, please send a request via the Accommodation request form.
Benefits/perks listed below may vary depending on the nature of your employment with Microsoft and the country where you work.