DevOps Engineer at VarSome.com, building and managing cloud and on-premises infrastructure, collaborating with software engineers, and handling CI/CD pipelines and network architectures.
Responsibilities
Design, build and maintain our cloud and on-premises infrastructure using IaC with Terraform and Configuration Management with Ansible
Work closely with software engineers to design, deploy and manage applications running on Linux servers in GCP, OCI and on-premises environments
Design, implement and manage secure and scalable network architectures, including VPCs, subnets, firewall rules and load balancing in our cloud environments
Develop and maintain our CI/CD pipelines using GitHub Workflows or Cloud Build for seamless application delivery
Manage code repositories and collaboration through GitHub
Proactively troubleshoot production issues, perform root cause analysis and implement remediation fixes to ensure business continuity with minimum downtime
Contribute to project planning, including task estimation and the creation of comprehensive technical documentation
Continuously investigate and suggest improvements to enhance system performance, scalability and cost-effectiveness
Collaborate with peers and architects to ensure compliance with company standards and security best practices
Design, implement, and manage scalable logging, monitoring, and alerting systems
Utilize Python and Bash scripting to automate operational tasks and improve efficiency
Configure and manage our API ecosystem using APISIX Gateway
Provide on-call support according to the scheduled rotation
Requirements
University degree in Computer Science or a related field
4+ years of work experience as a DevOps Engineer or in a similar role
Minimum 4 years of work experience with UNIX/Linux systems, including configuration, troubleshooting and scripting with Python and/or Bash
Strong, hands-on experience with:
- Agile methodologies, including frameworks like Scrumban
- Terraform and Ansible
- Docker containerization
- HashiCorp stack, including Nomad for orchestration, Consul for service discovery and Packer for image building
- Version control systems, particularly Git and GitHub
- Networking principles (TCP/IP, DNS, HTTP/S) and cloud networking concepts (VPCs, firewalls, load balancers)
Good knowledge of:
- API management with gateways like APISIX
- Building and managing CI/CD pipelines, preferably with GitHub Workflows or Cloud Build
- Logging and monitoring tools (e.g., Elasticsearch, Kibana, Grafana or cloud-native solutions)
Fluency in English (written and spoken)
Strong communication and collaboration skills, with experience working in Agile environments using Jira and Confluence.
GCP or OCI certifications are nice to have.
Proactive team player with demonstrated ability in self-organization, task prioritization, planning, and estimation.
Proactive mindset with a passion for continuous improvement and learning new technologies.
Benefits
A competitive compensation package combined with additional benefits
Remote work if you are based outside Athens
Hybrid work (1 day per week at the office) if you are based in Athens