
Senior Software Engineer (Big data/AI)

About Scrapinghub:

Founded in 2010, Scrapinghub is a fast-growing, diverse technology business turning web content into useful data with a cloud-based web crawling platform, off-the-shelf datasets, and turn-key web scraping services.

We’re a globally distributed team of over 180 Shubbers working from over 30 countries who are passionate about scraping, web crawling, and data science.

About the Job:

Scrapinghub is looking for a Senior Backend Engineer to develop and grow a new web crawling and extraction SaaS.

The new SaaS will include our recently released AutoExtract, which provides an API for automated e-commerce and article extraction from web pages using Machine Learning. AutoExtract is a distributed application written in Java, Scala and Python; its components communicate via Apache Kafka and HTTP and are orchestrated using Kubernetes.

You will design and implement distributed systems: a large-scale web crawling platform, Deep Learning-based web data extraction components, queue algorithms, large datasets, a development platform for other company departments, and more. This is going to be a challenging journey for any backend engineer!

As a Senior Backend Engineer, you will have a large impact on the system we’re building; the new SaaS is still in the early stages of development.

Job Responsibilities:

  • Work on the core platform: develop and troubleshoot a Kafka-based distributed application, and write and modify components implemented in Java, Scala and Python.
  • Work on new features, including design and implementation. You should be able to own and be responsible for the complete lifecycle of your features and code.
  • Solve distributed systems problems, such as scalability, transparency, failure handling, security, multi-tenancy.

Requirements

  • 3+ years of experience building large-scale data processing systems or high-load services.
  • Strong background in algorithms and data structures.
  • Strong track record in at least two of these technologies: Java, Scala, Python.
  • 3+ years of experience with at least one of them.
  • Experience working with Linux and Docker.
  • Good communication skills in English.
  • Computer Science or other engineering degree.

Bonus points for:

  • Kubernetes experience.
  • Apache Kafka experience.
  • Experience building event-driven architectures.
  • Understanding of web browser internals.
  • Good knowledge of at least one RDBMS.
  • Knowledge of today’s cloud provider offerings: GCP, AWS, etc.
  • Web data extraction experience: web crawling, web scraping.
  • Experience with web data processing tasks: finding similar items, mining data streams, link analysis, etc.
  • History of open source contributions.

Benefits

As a new Shubber, you will:

Become part of a self-motivated, progressive, multi-cultural team.

Have the freedom and flexibility to work from where you do your best work.

Attend conferences and meet with team members from across the globe.

Work with cutting-edge open source technologies and tools.

Receive paid time off.

Enrol in Scrapinghub's Share Option Programme.

Posted on Wednesday, April 01, 2020
Apply Now