Data Engineer

Allozymes is a deep tech company based in Singapore. We are revolutionising the way industry uses enzymes for manufacturing chemicals and natural compounds. Our rapid discovery and evolution of custom-designed enzymes enables breakthrough developments in the sustainable production of ingredients for pharmaceuticals, cosmetics, chemicals, food and beverages.

We’re hiring a highly capable data engineer to join our Data team, which is responsible for developing and implementing state-of-the-art approaches for evolving enzymes and microbes to enhance the production of chemicals and natural compounds. The engineer will contribute to the development of Allozymes’ cloud-based data infrastructure, work on upgrading our enzyme and strain optimization platform, and collaborate in the development of customer-ready solutions. Working in a highly collaborative and dynamic environment, the engineer will interact with scientists and with automation and process engineers to achieve these goals.

Responsibilities

  • Contribute to the design and improvement of Allozymes’ cloud-based data infrastructure.
  • Develop, implement and deploy cloud-based data analysis pipelines with a variety of data science, cloud computing and analytics methods.
  • Create automated and reliable data ETL pipelines.
  • Manage, store and process proprietary datasets and data lakes to find opportunities for process optimization.
  • Apply predictive models to test the effectiveness of different courses of action.
  • Collaborate in the development and deployment of custom UI solutions.

Requirements & Qualifications

  • MS or PhD in statistics, mathematics, applied mathematics, computer science, bioinformatics or a related quantitative field with a focus on data science and data engineering.
  • 2+ years of relevant industry experience.
  • Proficient in Python.
  • Experience with cloud service providers, such as AWS, and cloud-based data warehouses.
  • Experience with SQL/NoSQL.
  • Expertise in using Python libraries such as pandas, NumPy, SciPy, Matplotlib, seaborn and scikit-learn.
  • Strong experience in handling big data, extracting information from data, and building and deploying data science/ETL pipelines.
  • Ability to work in a fast-paced, collaborative and cross-functional environment and communicate results effectively to management.
  • Experience building RESTful APIs (Django, FastAPI, etc.) is a plus.
  • Experience with front-end development is a plus.
  • Knowledge of and experience with statistical and data mining techniques is a plus.
  • Experience in biology, bioinformatics or computational biology is a plus.
  • Experience with version control and platforms such as GitHub/GitLab is a plus.
  • Experience with CI/CD processes is a plus.
  • Experience with containers and container orchestration tools is a plus.
  • Experience writing production-ready code (version-controlled, scalable, well-documented, testable, deployment-ready) is a plus.

Candidates who fulfill the above-mentioned criteria are encouraged to apply.

Job Category: Data
Job Type: Full Time
Job Location: Remote
