Sr. Platform Engineer at Comcast responsible for optimizing Kubernetes infrastructure and managing large-scale Spark workloads. Collaborating with teams to ensure performance and reliability in data processing environments.
Responsibilities
Building, managing, and optimizing the underlying infrastructure and tools for large-scale data processing workloads.
Designing systems for collecting metrics (Prometheus) and visualizing data (Grafana).
Architecting and managing the platforms where Spark runs, such as Kubernetes clusters or cloud services like AWS (EKS).
Packaging Spark workloads and integrating them with orchestration systems.
Deploying infrastructure via Terraform/Ansible and troubleshooting job failures.
Building automation and tools in languages like Python, Java, or Scala, and in Linux shell scripting (Bash).
Implementing and maintaining systems for monitoring, logging, and alerting.
Developing and optimizing the data catalog platform (e.g., Apache Iceberg).
Collaborating with Data Stewards, Analysts, and Scientists to address data needs and issues.
Creating and maintaining documentation for Kubernetes infrastructure and providing training to team members.
Requirements
Bachelor's degree in computer science or a related field, or equivalent experience, typically 7 years in a DevOps or Systems Engineering role.
Expertise in Apache Spark: Deep understanding of Spark architecture, including RDDs, DataFrames, execution hierarchy, lazy evaluation, shuffling, and fault tolerance.
Proficiency in languages used for Spark development and automation, such as Python, PySpark, and Scala/Java.
Proficiency in Linux shell scripting (Bash).
Proficient in writing SQL.
Experience with CI/CD tools and GitHub.
Experience setting up and using observability tools such as Prometheus and Grafana.
Strong knowledge of networking (TCP/IP, DNS, load balancers, etc.) and hardware components.
Automation via Terraform/Ansible.
Hands-on experience with on-prem environments and major cloud providers (AWS, Azure, GCP), and with container tools like Docker and Kubernetes.
Hands-on experience setting up IAM, VPC, EC2, etc.
Familiarity with related technologies and formats like Delta Lake, Apache Iceberg, Apache Kafka, Hadoop, and various data storage systems (S3, HDFS, etc.).
Hands-on experience with Databricks, Snowflake, Apache Iceberg, Unity Catalog, or similar tools.
Solid understanding of data lakes and governance.
Experience setting up and maintaining caching layers like Alluxio.
Strong analytical skills for debugging complex distributed systems issues.
Strong communication and collaboration abilities.
Benefits
Best-in-class benefits for eligible employees
Expert guidance and always-on tools
Physical, financial, and emotional support during big milestones and in everyday life
ML/AI Platform Engineer at ZEIT Verlagsgruppe developing and maintaining AI and ML solutions. Collaborating with various teams on creating scalable and secure platforms.
Engineering leader improving performance and scalability of Lodgify’s Runtime platform. Leading cross-team initiatives and mentoring Engineering Managers to boost productivity and deliver value.
Senior AI Platform Engineer at Honeywell supporting AI solutions for enhanced decision-making in Phoenix, AZ or Charlotte, NC with hybrid work schedule.
Senior Dynamics & Power Platform Developer designing and optimizing solutions across Microsoft’s Power Platform ecosystem. Collaborating within a dynamic technology team to deliver enterprise-grade solutions.
Senior Platform Engineer responsible for database health and performance in Ovoko's data infrastructure. Building, maintaining, and optimizing systems within a fast-scaling e-commerce environment.
Data Platform Engineer responsible for reviewing and stabilizing platform implementations at AVL Maroc SARL AU. Involves mentoring the development team and ensuring best practices in data engineering.
Platform Engineer at IGT specializing in build systems, continuous integration, and automation. Enhancing software delivery processes while collaborating with engineering teams.
Power Platform Engineer responsible for secure, scalable Power Platform solutions. Collaborating with technology teams and ensuring compliance within a global law firm environment.
Senior Principal Platform Engineer at Navy Federal designing and maintaining scalable infrastructure platforms for application delivery and operations. Leading complex projects with significant business impact in IT systems administration.