Allozymes is a deep tech company based in Singapore. We are revolutionising the way industry uses enzymes to manufacture chemicals and natural compounds. Our rapid discovery and evolution of custom-designed enzymes enables breakthrough developments in the sustainable production of ingredients for pharmaceuticals, cosmetics, chemicals, food and beverages.

We’re hiring a highly capable data engineer for our Data team. This team is responsible for developing and implementing state-of-the-art approaches for evolving enzymes and microbes to enhance the production of chemicals and natural compounds. The engineer will contribute to the development of Allozymes’ cloud-based data infrastructure, work on upgrading our enzyme and strain optimization platform, and collaborate in the development of customer-ready solutions. Working in a highly collaborative and dynamic environment, the engineer will have the opportunity to interact with scientists and with automation and process engineers to achieve these goals.

Responsibilities

Contribute to the design and improvement of Allozymes’ cloud-based data infrastructure.
Develop, implement and deploy cloud-based data analysis pipelines with a variety of data science, cloud computing and analytics methods.
Create automated and reliable ETL data pipelines.
Manage, store and process proprietary datasets and data lakes to find opportunities for process optimization.
Apply predictive models to test the effectiveness of different courses of action.
Collaborate in the development and deployment of custom UI solutions.

Requirements & Qualifications

MS or PhD in statistics, mathematics, applied mathematics, computer science, bioinformatics or a related quantitative field with a focus on data science and data engineering.
2+ years of relevant industry experience.
Proficient in Python.
Experience with cloud service providers, such as AWS, and cloud-based data warehouses.
Experience with SQL/NoSQL.
Expertise in using Python libraries such as pandas, NumPy, SciPy, Matplotlib, seaborn and scikit-learn.
Strong experience in handling big data, extracting information from data, and building and deploying data science/ETL pipelines.
Ability to work in a fast-paced, collaborative and cross-functional environment and communicate results effectively to management.
Experience building RESTful APIs (Django, FastAPI, etc.) is a plus.
Experience with front-end development is a plus.
Knowledge of and experience with statistical and data mining techniques is a plus.
Experience in biology, bioinformatics or computational biology is a plus.
Experience with version control and platforms such as GitHub/GitLab is a plus.
Experience with CI/CD processes is a plus.
Experience with containers and container orchestration tools is a plus.
Experience writing production-ready code (version-controlled, scalable, well-documented, testable, deployment-ready) is a plus.

Candidates who fulfill the above-mentioned criteria are encouraged to apply.

Data Engineer