Microsoft Corporation Senior Software Engineer in Redmond, Washington
Singularity is a globally distributed, multi-tenant service that provides robust, cost-effective and competitive AI infrastructure (compute, networking and storage) for AI training and inferencing. At Singularity, our mission is to unlock the possibility of Artificial Intelligence and change the future of computing. We believe that building a planet-scale AI Supercomputer from the ground up, one that addresses the fundamental pain points of data scientists and AI practitioners and takes AI to unprecedented scale, is an opportunity of a lifetime. If you share the same dream as us, come join us!
As a member of the Singularity Infrastructure and Fundamentals team, you will build the service and engineering infrastructure for Singularity, and focus on solving difficult problems associated with service availability, observability, performance, scalability, security and compliance. In this role you will have ample opportunity to work with engineering leaders from within and across teams to define our mission, chart our path and enable the service for unparalleled growth in the coming months and years.
Design, implement and deliver service infrastructure to support service expansion in regions and clouds; strategize and codify capacity management to meet customer demand.
Deliver world-class monitoring systems and telemetry pipelines to enhance service and job observability for both end-users and operators.
Design and implement release and deployment infrastructure to scale service deployments to thousands of clusters while continuing to increase our release cadence and agility.
Design and build change management systems that orchestrate and automatically ensure the safety and correctness of any change made to the production system.
Codify security and compliance requirements by building and strengthening system defenses against malicious attacks and exploits.
Use data-driven and machine learning approaches to build quality and operational insights; leverage insights to drive quality and operational excellence across pre- and post-production pipelines.
Design and implement performance and scalability infrastructure that focuses on methodically calibrating data at scale to ensure meaningful characterizations and comparisons.
Leverage performance and profiling tools to identify hot spots and bottlenecks across hardware and software boundaries, from CPU, GPU, microcode, OS and networking to product code, and drive end-to-end job performance.
5+ years of experience with coding in one of C#, Java, C or C++.
At least 5 years of experience building and shipping production software or services.
Experience with improving service operations or engineering fundamentals.
Excellent collaboration skills.
Proven ability to create componentized and well-architected software.
Prior experience in building large scale cloud services, distributed systems, or operating systems.
Understanding of TensorFlow and PyTorch runtimes - a plus
Experience programming GPUs (graphics processing units), CUDA/cuDNN/NCCL - a plus
Ability to meet Microsoft, customer and/or government security screening requirements is required for this role. These requirements include but are not limited to the following specialized security screenings: Microsoft Cloud Background Check. This position will be required to pass the Microsoft Cloud Background Check upon hire/transfer and every two years thereafter.
Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, color, family or medical care leave, gender identity or expression, genetic information, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran status, race, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable laws, regulations and ordinances. We also consider qualified applicants regardless of criminal histories, consistent with legal requirements. If you need assistance and/or a reasonable accommodation due to a disability during the application or the recruiting process, please send a request via the Accommodation request form (https://careers.microsoft.com/us/en/accommodationrequest).
Benefits/perks listed below may vary depending on the nature of your employment with Microsoft and the country where you work.