Onsite Lead Architect

Posted 4 hours ago

About the role

  • Hands-on Lead Architect focusing on SRE practices, cloud technology implementation, and automation, working specifically within AWS architecture and observability frameworks.

Responsibilities

  • 7A lead architect (hands-on only) in the SRE practice (Observability and DevOps).

Requirements

  • AWS architecture: VPC, Subnets, Routing, NAT, Security Groups, NACLs, Transit Gateway
  • Compute & container orchestration: EC2, ECS, EKS (Kubernetes), Fargate, Lambda
  • Storage & data: S3, EBS/EFS, RDS/Aurora, DynamoDB, ElastiCache
  • Networking & edge: ALB/NLB, API Gateway, Route 53, CloudFront, Global Accelerator
  • Identity & access: IAM policies/roles, STS, Organizations, Control Tower, SCPs
  • Reliability patterns: multi-AZ/region HA, DR (Pilot Light/Warm Standby/Active-Active), backup/restore automation
  • AWS stack: CloudWatch (metrics/logs/alarms), CloudTrail (audit), X-Ray (tracing), Config (drift/compliance)
  • Metrics & tracing: Prometheus, Grafana, Jaeger, OpenTelemetry (OTLP, SDKs, collectors)
  • Log aggregation & search: ELK/Elastic Stack (Elasticsearch, Logstash, Kibana), Fluentd/Fluent Bit, Splunk
  • APM tools: Datadog, New Relic, AppDynamics, Dynatrace (bonus)
  • SLO/SLI/SLA design, error budgets, golden signals, alert hygiene & runbook quality
  • Pipeline design: GitHub Actions, GitLab CI, Jenkins, AWS CodePipeline/CodeBuild/CodeDeploy
  • Deployment strategies: Blue/Green, Canary, Rolling, Feature Flags; automated rollbacks
  • Artifact & dependency management: Docker registries (ECR), SBOM, supply chain security
  • Release governance: trunk-based development, GitOps (Argo CD/Flux), approvals & gates
  • Terraform (modules, workspaces, remote state, data sources), AWS CloudFormation/CDK (TypeScript/Python), nested stacks, custom resources
  • Ansible (playbooks, roles, vault), Packer (AMI pipelines), Helm charts for Kubernetes
  • Cluster lifecycle: node groups, CNI (Amazon VPC CNI/Calico), storage classes, ingress controllers
  • Service mesh: Istio/Linkerd (optional), mTLS, traffic policies, sidecars
  • Workload ops: HPA/VPA, pod disruption budgets, resource quotas/requests/limits
  • Observability for K8s: kube-state-metrics, Prometheus Operator, Grafana dashboards
  • Multi-tenancy, namespaces, RBAC, network policies; admission controllers & policy-as-code
  • Incident management: on-call practices, escalation, blameless postmortems, RCA depth
  • Chaos engineering: fault injection (chaos-mesh/litmus), game days, resilience scoring
  • Capacity planning & performance tuning: autoscaling, throughput/latency profiling, caching strategies
  • Availability engineering: circuit breakers, retries/backoff, bulkheads, graceful degradation
  • Cloud security: IAM least privilege, Secrets Manager/Parameter Store, KMS, VPC endpoints
  • Container security: image scanning (Trivy/Grype), runtime policies (Falco), admission controls
  • Policy-as-code: AWS Config rules, GuardDuty, Security Hub
  • Compliance: audit trails, encryption in transit/at rest, CIS/NIST/ISO mappings
  • Cost governance: tagging standards, cost allocation, savings plans/reserved instances, rightsizing
  • Strong programming for tooling/automation: Python / Go (preferred), Bash
  • Event-driven ops: Lambda/Step Functions for remediation; webhooks & bots (ChatOps)
  • API-first mindset: AWS SDK/CLI, tool integrations, custom exporters/collectors

Job title

Lead Architect

Experience level

Senior

Salary

Not specified

Degree requirement

Bachelor's Degree
