Location: Remote - AU/NZ only
Job Type: Full Time
We are the Azure Reliability team. We are a multidisciplinary engineering organization tasked with leading reliability holistically across the Azure platform – our goal is to Make Azure the World’s Safest and Most Reliable Cloud.
The Azure Incident Management Team are responsible for driving complex multiservice outages to resolution in a timely and effective manner through coordination of internal Azure service teams and key stakeholders. This requires effectiveness under pressure, broad technical, analytical, and problem-solving expertise, the ability to confidently collaborate with varied partners, and great written and spoken communication.
As an Azure Security Incident Manager, you are also responsible for building and evolving the practice of Security Incident Management across Azure, closely partnering with Azure service teams and key stakeholders to drive security remediation programs to closure. As part of this work, you capture insights, attend Post Incident Reviews, and develop processes to drive platform improvements globally. You will also be expected to drive or contribute to cross-team projects resulting from incidents you manage. Your work in this role will use cutting-edge technologies and industry concepts to directly prevent customer impact arising from security events.
Our people have a wide variety of professional experiences, and we are interested in meeting candidates both with and without traditional engineering backgrounds. Some of us are industry veterans, while others joined quite recently. Together we form a varied and talented team, and we want to continue building our diversity with our new hires. We strongly believe that diversity, and an environment where everyone can feel safe to contribute their own insights, are key to making the best workplace possible. We know that the best workplace makes the best products and services: building one is not only the smart thing to do, but also the right thing.
We are not looking for people who know it all; we are looking for people who want to learn it all.
If you are excited by this type of challenge and you love to work in groups of people who are similarly excited, come join us! We value the input of people who aren’t afraid to be learning all the time and who embrace mistakes as they continuously improve both our services and themselves.
Responsibilities:
- Be passionate about cloud, customer-focused, and a technologist at heart.
- Effectively coordinate production incidents across multiple organizations within Microsoft.
- Drive swift, continuous momentum towards mitigation by asking leading technical questions and offering suggestions on troubleshooting direction.
- Provide excellent incident communication to stakeholders.
- Identify opportunities for, and take ownership of, automation and/or continuous improvement of Incident Management processes and best practices.
- Lead and/or participate in Post Incident Review and Problem Management meetings with key stakeholders and service owners to review events and opportunities for ongoing improvement.
- Work within a “Follow the Sun” global shift rotation, covering local daytime hours, including holidays and weekends, on a rotational basis.
Required Qualifications:
- Bachelor’s degree or 5 years of equivalent work experience.
- Must be technically literate and able to articulate technical issues in a meaningful way to both engineers and executive-level management.
- The ability to communicate confidently and clearly on conference calls, in meetings, and via email, at all levels of the organization, is essential.
- Crisis management skills: able to set priorities, pursue multiple threads at the same time, accurately reflect current state and drive towards desired state.
- Ability to maintain calm during stressful situations; demonstrated leadership skills in fast-paced, highly dynamic situations.
- Strong design, scripting, problem-solving, and debugging skills.
- Strong collaboration skills: working across teams and organizations is necessary to be successful.
- Experience managing complex projects spanning multiple teams and organizations.
- Must be able to participate in a multi-location on-call rotation.
- Security-related experience detecting, collecting, and mitigating threats at scale.
- Candidates must pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.
Preferred Qualifications:
- Software/System Engineering/Solution Development/Technical Program Management experience.
- Knowledge of Microsoft Azure, AWS, GCP, or similar cloud computing platforms.
- Expertise building, delivering, and supporting extensible, high-scale service platforms.
- Expertise in debugging and remediating issues in large-scale distributed systems.
- PMP, ITIL, or Six Sigma, with demonstrated application towards service improvement.
- Confident in collaborating, building trust and respect with people outside of the immediate team.
- Excellent project/program management skills with great attention to detail.
- Demonstrates technical excellence by applying engineering principles to solve complex problems.
- Strong reporting and analytics experience.
Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, color, family or medical care leave, gender identity or expression, genetic information, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran status, race, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable laws, regulations and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application or the recruiting process, please send a request via the Accommodation request form.
Benefits/perks listed below may vary depending on the nature of your employment with Microsoft and the country where you work.