Current jobs related to AI Engineer - Napoli, Campania - Aieng

  • Senior Software Engineer

    4 weeks ago


    Casalnuovo di Napoli, Italy · Stellar AI · Full-time

    Overview: At Stellar AI (Permanent / Contractor). Remote policy: Global remote. We are seeking experienced Software Engineers to contribute to projects across a wide range of technologies and programming languages, including JavaScript, Python, Go, C++, Ruby, and more. This is an open-ended contract opportunity, structured around project-based work...

AI Engineer

4 days ago


Napoli, Campania, Italy · Aieng · Full-time

Aieng is a high-growth innovative startup focused on delivering next-generation engineering solutions. We bridge the gap between ambitious vision and technical reality by investing heavily in research, development, and top-tier talent. Currently in a dynamic phase of expansion, we pride ourselves on our agility and our ability to navigate complex industrial challenges. At Aieng, we don't just follow industry trends: we aim to set them. Join us as we build the infrastructure of tomorrow.

Your Role

As an AI Engineer (Inference & RAG Architect), you will design, optimize, and operate local, production-grade LLM systems, owning the full lifecycle from low-level inference performance to high-level semantic memory and agent orchestration.

Key Responsibilities

  • Architect and optimize high-throughput LLM inference pipelines
  • Design and implement enterprise-grade RAG systems
  • Benchmark, validate, and fine-tune open-source models for domain-specific workloads
  • Build agentic AI systems with deterministic, auditable behavior
  • Ensure scalability, observability, and reliability of AI systems in production

Technical Skills (Hard Skills)

LLM Inference & Systems Optimization

  • Advanced configuration and tuning of vLLM, Ollama, and TGI (a serving sketch follows this list)
  • Deep understanding of PagedAttention, continuous batching, and KV-cache optimization
  • Model quantization techniques (INT8, INT4, GPTQ, AWQ, GGUF)
  • GPU scheduling, VRAM optimization, multi-GPU and multi-node inference
  • CUDA-aware performance tuning (conceptual and practical)
  • Deployment of LLMs in on-prem, edge, and air-gapped environments
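
For illustration only (not part of the listing): a minimal vLLM serving sketch in the spirit of the tuning knobs listed above. The model ID is a hypothetical placeholder for an AWQ-quantized checkpoint, and the tensor-parallel degree, VRAM budget, and context length are assumptions to be sized for the target hardware.

```python
# Hypothetical offline vLLM engine configuration; all values are illustrative.
from vllm import LLM, SamplingParams

llm = LLM(
    model="your-org/llama-3-8b-instruct-awq",  # placeholder: a pre-quantized AWQ checkpoint
    quantization="awq",                        # must match the checkpoint's quantization format
    tensor_parallel_size=2,                    # shard weights across 2 GPUs
    gpu_memory_utilization=0.90,               # VRAM budget for weights plus KV cache
    max_model_len=8192,                        # context window ceiling
    enable_prefix_caching=True,                # reuse KV cache for shared prompt prefixes
)

params = SamplingParams(temperature=0.2, max_tokens=256)
outputs = llm.generate(["Summarise the incident report in three bullet points."], params)
print(outputs[0].outputs[0].text)
```

PagedAttention and continuous batching are handled inside the engine; the tuning work named in the listing is largely about sizing these parameters and the quantization format for the available GPUs.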

Retrieval-Augmented Generation (RAG) & Knowledge Systems

  • Design of multi-stage RAG pipelines
  • Integration with vector databases (Qdrant, Weaviate, FAISS)
  • Hybrid retrieval strategies (dense, sparse, BM25)
  • Re-ranking using cross-encoders and LLM-based rankers (see the retrieval sketch after this list)
  • Metadata-driven access control and document-level security
  • Chunking, embedding strategy design, and context window optimization
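
Purely as a sketch of the retrieval leg of such a pipeline: dense search in Qdrant with a metadata filter for document-level access control, followed by cross-encoder re-ranking. The collection name, payload fields, and model choices are assumptions, and the sparse/BM25 branch of a hybrid setup is omitted for brevity.

```python
# Illustrative retrieve-and-rerank step; the collection "docs" and its payload
# schema (fields "text" and "department") are assumed, not given in the listing.
from qdrant_client import QdrantClient
from qdrant_client.models import Filter, FieldCondition, MatchValue
from sentence_transformers import SentenceTransformer, CrossEncoder

embedder = SentenceTransformer("all-MiniLM-L6-v2")               # dense query encoder
reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")  # re-ranking model
client = QdrantClient(url="http://localhost:6333")

def retrieve(query: str, department: str, top_k: int = 20, final_k: int = 5) -> list[str]:
    # Dense retrieval, restricted by metadata (access control at the document level).
    hits = client.search(
        collection_name="docs",
        query_vector=embedder.encode(query).tolist(),
        query_filter=Filter(must=[
            FieldCondition(key="department", match=MatchValue(value=department)),
        ]),
        limit=top_k,
    )
    # Cross-encoder re-ranking of the candidate chunks.
    candidates = [hit.payload["text"] for hit in hits]
    scores = reranker.predict([(query, text) for text in candidates])
    ranked = sorted(zip(candidates, scores), key=lambda pair: pair[1], reverse=True)
    return [text for text, _ in ranked[:final_k]]
```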

Model Lifecycle & Evaluation

  • Evaluation and benchmarking of models such as Llama 3, Mistral, Phi, Mixtral
  • Domain adaptation via LoRA / QLoRA (see the sketch after this list)
  • Prompt and system prompt engineering with reproducibility guarantees
  • Offline and online evaluation frameworks (faithfulness, groundedness, latency, cost)
  • Versioning and rollback strategies for models and prompts
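
As a hedged sketch of the LoRA / QLoRA item: loading a 4-bit (NF4) base model and attaching LoRA adapters with PEFT. The base model, rank, and target modules are illustrative choices, and the actual training loop and evaluation harness are omitted.

```python
# QLoRA-style domain adaptation sketch; base model and adapter hyperparameters
# are illustrative, not requirements from the listing.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb = BitsAndBytesConfig(
    load_in_4bit=True,                       # quantize base weights to 4-bit NF4 (QLoRA)
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
base = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.2",    # example base model
    quantization_config=bnb,
    device_map="auto",
)
lora = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora)
model.print_trainable_parameters()           # only the LoRA adapters are trainable
```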

Agentic Architectures & Orchestration

  • Design of agent-based systems with tool use, memory, and planning
  • Development using Semantic Kernel, LangGraph, or custom agent frameworks (see the sketch after this list)
  • Deterministic execution, guardrails, and fallback strategies
  • Implementation in Python and C#
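
A minimal, illustrative LangGraph-style sketch of the deterministic guardrail-and-fallback pattern: the state fields, the guardrail rule, and the stubbed LLM call are assumptions, not part of the role description.

```python
# Deterministic routing with an explicit retry budget and a safe fallback path.
from typing import TypedDict
from langgraph.graph import StateGraph, END

class AgentState(TypedDict):
    question: str
    draft: str
    attempts: int

def plan(state: AgentState) -> dict:
    # Placeholder for a real LLM/tool call; here we just echo the question.
    return {"draft": f"Answer to: {state['question']}", "attempts": state["attempts"] + 1}

def route(state: AgentState) -> str:
    # Deterministic guardrail: accept short, non-empty drafts; retry at most twice;
    # otherwise hand off to a fallback node.
    if state["draft"] and len(state["draft"]) < 500:
        return "accept"
    return "retry" if state["attempts"] < 3 else "fallback"

def fallback(state: AgentState) -> dict:
    return {"draft": "Unable to produce a grounded answer; escalating to a human."}

graph = StateGraph(AgentState)
graph.add_node("plan", plan)
graph.add_node("fallback", fallback)
graph.set_entry_point("plan")
graph.add_conditional_edges("plan", route,
                            {"accept": END, "retry": "plan", "fallback": "fallback"})
graph.add_edge("fallback", END)
app = graph.compile()

print(app.invoke({"question": "What is the SLA for GPU node restarts?",
                  "draft": "", "attempts": 0})["draft"])
```

The same loop can be expressed in Semantic Kernel or a hand-rolled framework; the point is that routing decisions stay in plain, auditable code rather than in model output.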

MLOps, DevOps & Observability

  • Containerization with Docker and orchestration via Kubernetes
  • CI/CD for AI systems
  • Monitoring of latency, throughput, hallucination rates, and failures
  • Logging, tracing, and observability for LLM pipelines (see the instrumentation sketch after this list)
  • Infrastructure-as-Code (Terraform or equivalent)
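
One possible way to cover the latency and failure-monitoring items, shown only as a sketch: wrapping the inference handler with prometheus_client metrics. Metric names, the port, and the stubbed generate() call are assumptions.

```python
# Illustrative observability wrapper for an LLM request path.
import time
from prometheus_client import Counter, Histogram, start_http_server

REQUESTS = Counter("llm_requests_total", "Total LLM requests", ["status"])
LATENCY = Histogram("llm_request_latency_seconds", "End-to-end request latency")
TOKENS = Counter("llm_tokens_generated_total", "Tokens generated")

def generate(prompt: str) -> str:
    # Placeholder for the real inference call (vLLM, TGI, ...).
    return "stub completion"

def handle(prompt: str) -> str:
    start = time.perf_counter()
    try:
        completion = generate(prompt)
        REQUESTS.labels(status="ok").inc()
        TOKENS.inc(len(completion.split()))
        return completion
    except Exception:
        REQUESTS.labels(status="error").inc()
        raise
    finally:
        LATENCY.observe(time.perf_counter() - start)

if __name__ == "__main__":
    start_http_server(9100)          # exposes /metrics for Prometheus scraping
    print(handle("ping"))
```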

Soft Skills

  • Strong system-level thinking and architectural mindset
  • Obsession with performance, reliability, and correctness
  • Ability to translate business requirements into technical architectures
  • Clear communicator in cross-functional, high-complexity environments
  • Ownership mentality and engineering rigor

Experience & Education

  • 3+ years of experience in AI Engineering, ML Systems, or Platform Engineering
  • Strong academic background in Computer Science, Engineering, or related fields
  • Proven experience deploying self-hosted LLMs in production
  • Exposure to enterprise constraints (security, compliance, scalability)