Job Information
Microsoft Corporation Principal Hardware Quality Engineer in Austin, Texas
Microsoft Silicon, Cloud Hardware, and Infrastructure Engineering (SCHIE) is the team behind Microsoft’s expanding Cloud Infrastructure and is responsible for powering Microsoft’s “Intelligent Cloud” mission. SCHIE delivers the core infrastructure and foundational technologies for more than 200 Microsoft online businesses globally, including Bing, MSN, Office 365, Xbox Live, Teams, OneDrive, and the Microsoft Azure platform, with our server and data center infrastructure, security and compliance, operations, globalization, and manageability solutions.
We are looking for a Principal Hardware Quality Engineer to join the team.
As Microsoft's cloud business continues to grow, the ability to deploy new offerings and hardware infrastructure on time, in high volume, with high quality, and at the lowest cost is of paramount importance. To achieve this goal, the Hardware, Infrastructure Management, and Fundamentals Engineering (HIFE) team is instrumental in defining and delivering operational measures of success for hardware manufacturing, and in improving the planning process, quality, delivery, scale, and sustainability related to Microsoft cloud hardware.
Microsoft’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond. In alignment with our Microsoft values, we are committed to cultivating an inclusive work environment for all employees to positively impact our culture every day.
Responsibilities
Lead an effective and robust supplier quality management strategy to ensure the data center hardware is manufactured at the highest level of quality standards.
Lead quality issues at the system level, conduct debug and failure analysis for any issues in the Azure fleet, including GPU issues, and drive resolution with partners and suppliers.
Provide system-level technical guidance to SI and various internal stakeholders, and lead them through complex problems.
Drive the continuous improvement process based on Root Cause Analysis (RCA) and identified opportunities.
Deliver quality readouts based on telemetry data analysis to bring clarity across the organization on status, actions, and next steps for issue resolution.
Establish Critical-to-Quality performance metrics to measure and improve product quality.
Act as the voice of quality in the hardware change management process, ensuring quality requirements are considered, met, and improved.
Mentor and develop team members, fostering a culture of excellence and innovation.
Embody our Culture (https://www.microsoft.com/en-us/about/corporate-values) and Values (https://careers.microsoft.com/us/en/culture)
Qualifications
Required Qualifications:
Bachelor's Degree in Reliability Engineering, Electrical Engineering, or related field AND 8+ years technical engineering experience
OR Master's Degree in Reliability Engineering, Electrical Engineering, or related field AND 7+ years technical engineering experience
OR Doctorate Degree in Reliability Engineering, Electrical Engineering, or related field AND 5+ years technical engineering experience.
5+ years of experience working with modern server architectures and/or their subsystems (including GPU, CPU, AI hardware, memory, and motherboards) and with methods for root cause analysis and debugging.
3+ years of experience leading a large-scale taskforce to resolve technical problems and deliver solutions.
Other Requirements:
Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings: Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud Background Check upon hire/transfer and every two years thereafter.
Preferred Qualifications:
Master’s degree in Electrical Engineering, Computer Hardware, or System Engineering.
Leadership skills and ability to collaborate with diverse teams and drive a call to action.
10+ years of experience working with modern server architectures and/or their subsystems (including GPU, CPU, AI hardware, and memory) and with methods for root cause analysis and debugging.
5+ years of experience leading a large-scale taskforce to resolve technical problems and deliver solutions.
Reliability Engineering IC5 - The typical base pay range for this role across the U.S. is USD $137,600 - $267,000 per year. There is a different range applicable to specific work locations, within the San Francisco Bay area and New York City metropolitan area, and the base pay range for this role in those locations is USD $180,400 - $294,000 per year. Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here: https://careers.microsoft.com/us/en/us-corporate-pay
Microsoft will accept applications and process offers for these roles on an ongoing basis.
Microsoft is an equal opportunity employer. Consistent with applicable law, all qualified applicants will receive consideration for employment without regard to age, ancestry, citizenship, color, family or medical care leave, gender identity or expression, genetic information, immigration status, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran or military status, race, ethnicity, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable local laws, regulations and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application process, read more about requesting accommodations (https://careers.microsoft.com/v2/global/en/accessibility.html).