Job Type: Full Time
Amazon is looking to hire highly motivated, best-in-class Network Operations Engineers for our Network Operations team to help drive the stability and sustainability of our next-generation networks and to discover innovative ways to automate and scale our network as we expand.
Amazon doesn’t have a traditional NOC team. Instead, we rely on our automation tools for first-time resolution of raised tickets. Our engineers are most concerned with fixing automation failures and improving our workflows so that our software tools can automatically solve problems as fast as possible. Amazon Operations works under a follow-the-sun model across the Sydney, Seattle and Dublin offices. No shift work is required.
The ideal candidate will be expected to provide high-quality second- and third-tier network and network event management for Amazon's worldwide network operations. He/she will demonstrate an excellent knowledge of networking and scripting, and have experience participating in operational network support for large-scale IP networks. He/she will have a proven track record of success in driving complex issues to resolution autonomously and/or collaboratively. A love for working with new technologies and pushing the envelope on existing technology is required.
This is an excellent opportunity to join Amazon's world-class technical teams, working with some of the best and brightest engineers while also developing your skills and furthering your career within one of the most innovative and progressive technology companies anywhere.
Here are some of your daily responsibilities:
- Help manage the largest network in the world
- Provide critical network operations support to AWS’s internal and external customers to diagnose, remediate and troubleshoot large-scale networking events
- Support and maintain our next-generation data-center networks
- Deliver simple, sustainable and repeatable solutions and processes to mitigate complex issues
- Cooperate with our broader Technical Operations organization to reduce operational burden by enhancing auto-remediation tools and workflows.
- Work closely with our Network Engineering teams to ensure fast, smooth roll-out of new designs and products
- Promote standards across all layers of the network and ensure that we are fully compliant with those standards and policies
- Monitor and manage communications during large-scale events utilizing an established Event Management process
- Collaborate with level-two and level-three resolvers
- Identify and troubleshoot recurring platform issues with a focus on removing manual actions from mitigation activities
- Escalate issues effectively to mid- and senior-level management for full resolution
- Create and review documentation and processes for recurring issues, new standard operating procedures, knowledge transfer material, etc.
- Troubleshoot networking, routing and interconnectivity issues, including network device configuration and low-level application interaction
- Complete customer requests via internal trouble ticketing systems
- Identify and take ownership of opportunities to automate repeatable networking tasks by creating and maintaining scripts and tools and enhancing existing workflows
- Excellent Ethernet and IP networking knowledge and extensive experience in the application of IP protocols
- Substantial background in large-scale datacenter network implementations and support
- Significant past experience with the following protocols and concepts: BGP, OSPF, TCP/IP, TCP/IP flow control, UDP, HSRP/VRRP/GLBP, DNS, ARP, HTTP, SSL, STP, trunking and port channeling, IPsec, and TLS
- Software development/scripting skills are required; Python or shell (bash) is desirable, but other languages are welcome
- Must be comfortable working in a Linux/Unix environment
- Knowledge or awareness of Operational Excellence concepts is a plus
- Excellent analytical skills
- At least three years of relevant experience in a large-scale network operations environment is preferred
- In-depth knowledge of and experience with major Internet routing protocols (BGP, OSPF)
- Knowledge of Agile/Scrum and/or comfort working with those principles