Production Support Engineering LMTS

Own Company

Customer Service

San Francisco, CA, USA

Posted on May 2, 2026

Description

Opportunity & Product

Join an agile team with deep startup roots. Following our recent acquisition, we operate as a high-velocity ‘startup-within-Salesforce.’ You’ll be managed by the same founders and engineers who built the original company, offering the autonomy of a small team backed by the global scale and trust of Salesforce.

We have successfully moved past the "0 to 1" phase. We have a product that works, customers who love it, and the backing of Salesforce. Now, we are entering the "1 to 100" phase: scaling our architecture to handle global demand, hardening our systems for enterprise-grade resilience, and integrating deeply with the Agentforce ecosystem. This is your chance to help lead that transition.

What You’ll Do

As a Production Support Engineer (LMTS), you will be a senior technical lead within our embedded reliability team. You aren’t building the foundation alone—you’ll work alongside a group of engineers and product owners to ensure the Agentforce for Supply Chain platform is the most reliable AI-powered engine in the industry.

This is a role for an engineer who loves the "scaling" problem. You will focus on production excellence, performance tuning, and infrastructure automation. Because you are embedded in the product organization, you’ll have a seat at the table during design reviews, ensuring that as we add new agentic capabilities, they are built to scale from day one.

Responsibilities

  • Scaling & Reliability: Own the reliability roadmap for major product areas, working to transition our systems from startup-speed architectures to highly available, global-scale enterprise solutions.

  • Collaborative Leadership: Partner with PMTS-level engineers to refine our infrastructure strategy, contributing senior-level perspectives on system design, capacity planning, and bottleneck identification.

  • Infrastructure as Code: Maintain and evolve our automated environments, focusing on making our "infrastructure-as-plugins" model more robust and developer-friendly.

  • AI Operations (AIOps): Support the scaling of our AI/ML infrastructure, ensuring our models have the GPU resources and data pipelines required to deliver real-time supply chain insights.

  • Production Excellence: Lead the "1 to 100" hardening of our observability stack. You won’t just respond to incidents; you’ll build the tooling that prevents them and the telemetry that explains them.

  • Performance Engineering: Deep-dive into SQL optimization, API latency, and cross-service communication to ensure our data-intensive supply chain platform remains performant under heavy load.

  • AI-First Workflow: Lean into the future of engineering by using AI tools (Claude Code, etc.) to automate routine operational tasks and accelerate infrastructure delivery.

  • Contribute to building and maintaining the shared system context, an explicit repository of system designs, constraints, and standards that enables AI to operate accurately and reliably.

  • Critically evaluate code (human- or AI-generated) for correctness, quality, security, and performance.

Required Qualifications

  • 5+ years of experience in SRE, Production Engineering, or Backend Engineering with a heavy focus on operations and scale.

  • Proven Scaling Experience: You have previously helped take a product through a high-growth phase (the "1 to 100" journey), dealing with the technical debt and architectural shifts that come with it.

  • Technical Breadth: Strong proficiency in Kubernetes, Terraform/OpenTofu, and AWS/GCP/Azure.

  • Coding Mastery: Ability to write and review production-level code in Golang, TypeScript, or Python—you view automation as a software engineering problem.

  • Systems Expert: Deep understanding of distributed systems, including how to debug complex interactions between microservices, databases, and AI agents.

  • Low-Ego Collaboration: Experience working within a senior team of Principal engineers, capable of both leading specific initiatives and supporting the broader group’s technical vision.

  • A demonstrated, genuine AI-first approach to engineering: using AI to move faster, build fluency across the stack, and contribute well beyond your core specialty.

  • Experience using AI tools (e.g., Claude Code, GitHub Copilot, Codex, Cursor) in development workflows.

  • Advanced prompt engineering skills: the ability to write precise, structured prompts and to cultivate the system context that makes AI outputs reliable, secure, and production-ready.

Preferred Qualifications

  • M.S. in Computer Science or equivalent practical experience.

  • Database Specialist: Strong experience with PostgreSQL at scale (partitioning, indexing, query tuning).

  • Distributed Systems: Advanced knowledge of microservice orchestration and durability patterns, including hands-on experience with Temporal for workflow reliability and service mesh for secure, observable service-to-service communication in high-growth SaaS environments.

  • Supply Chain/Logistics: Experience with the unique data constraints and reliability requirements of manufacturing or global logistics.

  • Salesforce Knowledge: Familiarity with Salesforce infrastructure, Hyperforce, or Data Cloud is a plus.

  • Public Cloud Expertise: Deep knowledge of networking, security, and identity management within major cloud providers.

For roles in San Francisco and Los Angeles: Pursuant to the San Francisco Fair Chance Ordinance and the Los Angeles Fair Chance Initiative for Hiring, Salesforce will consider for employment qualified applicants with arrest and conviction records.