Description
Overview of the Role:
As a Software Engineer on the Data Cloud Big Data Compute Platform team at Salesforce, you will help own the compute infrastructure powering large-scale Spark workloads. You will optimize core Spark performance, solve complex distributed systems challenges, and build scalable AI infrastructure, including smaller language model systems. You will collaborate closely with engineers across Data Cloud to deliver reliable, high-performance systems at scale.
Responsibilities:
Design, develop, and optimize compute infrastructure supporting large-scale distributed Spark workloads.
Solve complex distributed systems challenges including scalability, fault tolerance, and consistency trade-offs.
Build and maintain scalable AI infrastructure, including smaller language model systems.
Partner with engineers across Data Cloud to improve system observability, telemetry pipelines, and monitoring platforms.
Required Qualifications:
4+ years of backend software development experience building large-scale distributed systems.
Strong programming skills in Java; proficiency in Python or Rust is a plus.
Hands-on experience with cloud platforms (AWS or GCP) and Kubernetes/container orchestration.
Experience with Agile and Test-Driven Development (TDD) methodologies.
Preferred Qualifications:
Experience with distributed data technologies such as Spark, Flink, Kafka, Trino, or HBase.
Familiarity with observability systems, telemetry pipelines, or monitoring platforms.
Knowledge of AI/ML infrastructure and language model systems.