Software Engineer, Data Acquisition at Mistral AI | Hybrid Hired

About the role

Develop and maintain web crawlers using Python libraries such as Beautiful Soup to extract data from target websites.
Use headless browsers, driven through tooling such as the Chrome DevTools Protocol, to automate and optimize data collection.
Collaborate with cross-functional teams to identify, scrape, and integrate data from APIs to support business objectives.
Create and implement efficient parsing patterns using regular expressions, XPaths, and CSS selectors to ensure accurate data extraction.
Design and manage distributed job queues using technologies such as Redis, Kubernetes, and Postgres to handle large-scale data processing tasks.
Develop strategies to monitor and ensure data quality, accuracy, and integrity throughout the crawling and indexing process.
Continuously improve and optimize existing web crawling infrastructure to maximize efficiency and adapt to new challenges.

Requirements

Proficiency in Python, Java, or C++.
Strong understanding of HTTP/HTTPS protocols and web communication.
Knowledge of HTML, CSS, and JavaScript for parsing and navigating web content.
Mastery of queues, stacks, hash maps, and other data structures for efficient data handling.
Ability to design and optimize algorithms for large-scale web crawling.
Hands-on experience with web scraping libraries and frameworks (e.g., Scrapy, Beautiful Soup, Selenium, Playwright).
Understanding of how search engines work and best practices for web crawling optimization.
Experience with SQL and/or NoSQL databases (e.g., PostgreSQL, MongoDB) for storing and managing crawled data.
Familiarity with data warehousing and scalable storage solutions.
Knowledge of distributed systems (e.g., Hadoop, Spark) for processing large datasets.
Proficiency in Pandas, NumPy, and Matplotlib for analyzing and visualizing scraped data.
Experience applying Machine Learning to improve crawling efficiency or accuracy.
Familiarity with cloud platforms (AWS, GCP) and containerization (Docker) for deployment.

Benefits

💰 Competitive salary and equity
🧑‍⚕️ Health insurance
🚴 Transportation allowance
🥎 Sport allowance
🥕 Meal vouchers
💰 Private pension plan
🍼 Generous parental leave policy
🌎 Visa sponsorship
