Responsible for the management and coordination of day-to-day and strategic operations of our log analysis framework
Develop L0-L2 SOPs related to the operational support of the logging framework
Collect and report relevant KPIs that clearly show value/ROI and progression of the log analysis service
Stay abreast of emerging technology advancements in the current logging platform and/or open-source alternatives, including implementation of pilots and/or POCs/POVs
Recognize and onboard new data sources into Splunk, analyze data for anomalies and trends, and build relevant dashboards/alerts that improve visibility
Responsible for the installation, configuration, and ongoing administration of Cribl environments
Collaborate with cross-functional teams to optimize log pipelines and maintain system reliability
Ensure secure, efficient, and compliant data flows to support organizational observability and security needs
Develop/refine the organization's pattern-based automated log ingestion via tight integration with existing/emerging technology pipelines and/or create a robust and repeatable onboarding process
Ensure proper operation and performance of the Splunk indexer cluster, search heads, other backend components, universal forwarders, modules/plug-ins, and connectors
Standardize Splunk agent deployment, configuration, and maintenance across multiple configuration management systems
Develop, manage, and maintain the organization's Event Management Framework
Administer and maintain Grafana environments, ensuring reliable dashboard performance and secure user access
Design and develop interactive Grafana dashboards for real-time data visualization and monitoring
Manage and optimize ClickHouse database clusters to ensure high performance, availability, and data integrity
Use ClickHouse for efficient querying and analysis of large-scale datasets to support business insights
Educate/mentor junior team members to grow their capabilities and skills
Requirements
4+ years in a role supporting the operational needs of a relevant enterprise log analysis framework
Bachelor's degree in Computer Science or a related discipline, or equivalent work experience
In-depth experience installing, configuring, and maintaining log analysis, visualization, and next-gen pipeline tools such as Splunk, Grafana, ClickHouse, and Cribl
Basic familiarity with a wide array of IT monitoring tools, ITIL and DevOps frameworks, and ITSM tools
Proficiency in leveraging regular expression patterns
Understanding of Windows Server and Linux operating system administration
Hands-on, practical experience with log aggregation for cloud platforms, serverless compute, and microservices (Lambda, Docker, SSM, RDS)
Benefits
World-class benefits
Highly competitive compensation
Disproportionate rewards for top performers
Flexibility and support for a hybrid work environment