Sr. SDE, AGI Inference - GenAI

2 hours ago


WorkFromHome, Italy Amazon Full-time

Overview

Job ID: | Amazon.com Services LLC

The Sensory Inference team at AGI is a group of innovative developers working on ground-breaking multi-modal inference solutions that revolutionize how AI systems perceive and interact with the world. We push the limits of inference performance to provide the best possible experience for our users across a wide range of applications and devices. We are looking for talented, passionate, and dedicated Inference Engineers to join our team and build innovative, mission-critical, high-volume production systems that will shape the future of AI. This role offers the exciting chance to work in a highly technical domain at the boundary between fundamental AI research and production engineering such as Quantization, Speculative Decoding, and Long Context for inference efficiency.

Responsibilities

• Develop high-performance inference software for a diverse set of neural models, typically in C / C++
• Design, prototype, and evaluate new inference engines and optimization techniques
• Participate in deep-diving analysis and profiling of production code
• Collaborate closely with research scientists to bring next-generation neural models to life
• Partner with internal and external hardware teams to maximize platform utilization
• Work in an Agile environment to deliver high-quality software against tight schedules
• Hold a high bar for technical excellence within the team and across the organization

Basic Qualifications

• 5+ years of non-internship professional software development experience
• 5+ years of programming with at least one software programming language experience
• 5+ years of leading design or architecture (design patterns, reliability and scaling) of new and existing systems experience
• Experience as a mentor, tech lead or leading an engineering team
• Experience with inference frameworks such as PyTorch, TensorFlow, ONNXRuntime, TensorRT, LLaMA.cpp, etc.

Preferred Qualifications

• 5+ years of full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations experience
• Experience with inference frameworks such as PyTorch, TensorFlow, ONNXRuntime, TensorRT, LLaMA.cpp
• Proficiency in performance optimization for CPU, GPU, or AI hardware
• Proficiency in kernel programming for accelerated hardware using programming models such as CUDA, OpenMP, OpenCL, Vulkan, and Metal
• Experience with latency-sensitive optimizations and real-time inference
• Knowledge of model compression techniques (quantization, pruning, distillation, etc.)
• Experience with LLM efficiency techniques like speculative decoding and long context

Equal Opportunity & Inclusion

Amazon is an equal opportunity employer and does not discriminate on the basis of protected veteran status, disability, or other legally protected status.

Los Angeles County applicants: Job duties for this position include: work safely and cooperatively with other employees, supervisors, and staff; adhere to standards of excellence despite stressful conditions; communicate effectively and respectfully with employees, supervisors, and staff to ensure exceptional customer service; and follow all federal, state, and local laws and company policies. Criminal history may have a direct, adverse, and negative relationship with some of the material job duties of this position. Pursuant to the Los Angeles County Fair Chance Ordinance, we will consider for employment qualified applicants with arrest and conviction records.

Accommodation & Benefits

Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit for more information.
If the country / region you’re applying in isn’t listed, please contact your Recruiting Partner.

Compensation & Benefits

The base salary range for this position is listed below. Your Amazon package will include sign-on payments and restricted stock units (RSUs). Final compensation will be determined based on factors including experience, qualifications, and location. Amazon also offers comprehensive benefits including health insurance (medical, dental, vision, prescription, Basic Life & AD&D insurance and option for supplemental life plans, EAP, mental health support, medical advice line, flexible spending accounts, adoption and surrogacy reimbursement coverage), 401(k) matching, paid time off, and parental leave. Learn more about our benefits at

USA, CA, Sunnyvale - 193,300.00 - 261,500.00 USD annually



  • WorkFromHome, Italy Amazon Full-time

    A leading technology company is seeking Inference Engineers to develop high-performance inference software for neural models. Responsibilities include designing new inference engines, collaborating with scientists, and maximizing platform utilization. Candidates should have over 5 years of software development experience, proficiency in C/C++, and...


  • WorkFromHome, Italy Amazon Full-time

    A leading e-commerce company in Pisa is seeking a Sr. Applied Scientist to work on innovative GenAI solutions for enhancing the Trustworthy Shopping Experience. The ideal candidate will have expertise in machine learning, programming in Java or Python, and experience with neural networks. Responsibilities include designing experimental approaches for complex...


  • WorkFromHome, Italy Amazon Full-time

    Amazon Web Services (AWS) is building a central pipeline of Software Development Engineer (SDE) talent for anticipated roles in 2026. This requisition supports hiring across all AWS SDE positions, from fungible SDE roles to specialized engineering positions in areas including Embedded Systems, Game Development, Compiler Engineering, Artificial...


  • WorkFromHome, Italy Amazon Full-time

    Sr. Applied Scientist, Trustworthy Shopping Experience (TSE) Are you excited about solving complex business problems at scale through GenAI? Are you fascinated by the application of Agentic AI and LLMs to real-life scenarios? Are you looking to invent solutions that drive Autonomous Artificial Intelligence? If so, we are looking for you to fill a...

  • Sr. Machine Learning

    2 weeks ago


    WorkFromHome, Italy Amazon Full-time

    Job ID: | Amazon.com Services LLC The Product: AWS Machine Learning accelerators are at the forefront of AWS innovation and one of several AWS tools used for building Generative AI on AWS. The Inferentia chip delivers best-in‑class ML inference performance at the lowest cost in cloud. Trainium will deliver the best-in‑class ML training performance with...


  • WorkFromHome, Italy Amazon Full-time

    Sr. Solutions Architect, Guidance Lead, AWS Cloud Optimization Job ID: | AWS EMEA SARL (UK Branch) AWS Sales, Marketing, and Global Services (SMGS) is responsible for driving revenue, adoption, and growth from the largest and fastest growing small- and mid-market accounts to enterprise-level customers including public sector. The AWS Global Support team...


  • WorkFromHome, Italy Amazon Full-time

    Senior Applied Scientist, Generative AI Innovation Center Job ID: | Amazon (China) Holding Company Limited - D24 Are you looking to work at the forefront of Machine Learning and AI? Would you be excited to apply Generative AI algorithms to solve real world problems with significant impact? The Generative AI Innovation Center helps AWS customers implement...


  • WorkFromHome, Italy Amazon Full-time

    ML Kernel Performance Engineer, AWS Neuron, Annapurna Labs The Annapurna Labs team at Amazon Web Services (AWS) builds AWS Neuron, the software development kit used to accelerate deep learning and GenAI workloads on Amazon’s custom machine learning accelerators, Inferentia and Trainium. The Acceleration Kernel Library team is at the forefront of maximizing...