This is a contract role, 6-12 months. Candidate must be able to work hybrid in our Austin, Texas office.
Here at Arm, we are building the Future of Computing. Together. For Everyone. This is a technical engineering role based in the US and part of the rapidly growing Arm Infrastructure Line of Business. Our Neoverse cores are leading a technology disruption. We need a Senior Platform Validation Engineer to work in the Infrastructure Line of Business with stakeholders from different groups, including but not limited to Arm Solution Engineering, Central Engineering, and Infrastructure LOB partners, and to support the growing demand for Neoverse CSS in the Cloud and Datacenter. You will be an individual contributor engaging and collaborating with internal and external partners to enable and support the Arm Infrastructure hardware and software ecosystem. Join us as we deliver solutions across Cloud, Enterprise, Edge, 5G, and Networking segments.
Responsibilities:
- Hardware platform testing, pre- and post-silicon validation, methodologies and tools for bringup and beyond, and debugging, particularly of SoC peripherals.
- Develop and implement system level tests for SoC and peripherals, hardware system testing, reporting progress and issues, and conducting operating system tests with a deep understanding of Kernel concepts and internals (Processor, Memory Management, Virtualization, Scheduling, Networking, Security, etc.).
- Excel in debugging system and silicon issues using scopes, logic analyzers, debuggers, and other advanced debug techniques. Debug and resolve hardware issues during bringup and beyond.
- Develop and enable system-level stress workloads on Linux distros for various Arm-based Infrastructure devices.
- Maintain scalability and deployment of production firmware, the validation OS, and the test suite for the device screening process.
- PCIe/CXL and DDR interfaces with endpoint hardware validation, including but not limited to DRAM, SSDs, NICs, DPUs, and GPUs. Driver development for endpoint hardware for various systems and configurations.
- Implement and build up a system process to package, maintain, and deploy the validation test suite to qualify and enable validation platforms, OCP reference platforms, and other enterprise platforms.
- Collaborate with the post-silicon validation team and others on debug and test enablement efforts on both validation and reference platforms.
- Listen to different perspectives, evaluate, persuade, and carefully craft your work to deliver impactful results.
Required Skills and Experience:
- A degree or equivalent experience in Computer Science, Electrical Engineering, or a related field.
- 5+ years of experience specifically focused on validating, configuring, tuning, and troubleshooting data center/enterprise servers.
- Experience with GPU and SoC architecture, microprocessor cores, x86 platform architecture, and AI/HPC platform architecture is highly desired.
- Demonstrated experience as a Firmware Engineer, Hardware Test Engineer, and/or in a similar role.
- Proficient in creating new test cases using programming languages such as C, C++, assembly, Python, and scripting languages; capable of both manual and automated test case execution.
- Strong skills in debugging and root-causing silicon issues. Excellent problem-solving skills and attention to detail.
- Proven time-management skills to prioritize and accomplish your work, along with excellent verbal and written communication.
- Strive to achieve winning solutions while you build positive relationships founded upon mutual trust, respect for others, open communication, and sharing of information and success.
- Familiar with Agile and DevOps practices, including Continuous Integration (CI), Continuous Deployment (CD), and Continuous Testing (CT).
"Nice To Have":
- Knowledge of the Arm SystemReady compliance program.
- Experience with Arm system architecture specifications, such as the Base Boot Requirements (BBR), Base System Architecture (BSA), the Server Base System Architecture (SBSA), and the Server Base Manageability Requirements (SBMR).
- Familiarity with industry standards such as UEFI, DMTF, OCP, OPI, DDR, PCIe, CXL, and UCIe.
- Experience installing and debugging Linux and/or Windows operating systems for use in Cloud and Datacenter solutions.
- Demonstrated knowledge of or experience with Cloud and Datacenter infrastructure and software solutions.
- Experience with AI/HPC software stacks such as ROCm, TensorFlow, or PyTorch is a plus for AI applications.