The successful candidate will have experience creating data pipelines to provision data from a variety of sources, using tools such as Spark, Azure Data Factory, and Kafka.
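By way of illustration only, the sketch below shows the kind of pipeline this might involve: a minimal PySpark job that ingests a Kafka topic into data lake storage. The broker address, topic name, and paths are hypothetical, and running it assumes the spark-sql-kafka connector is available.

    from pyspark.sql import SparkSession

    # Minimal sketch: broker, topic, and path names below are hypothetical.
    spark = SparkSession.builder.appName("kafka-ingest-sketch").getOrCreate()

    # Read a Kafka topic as a streaming DataFrame
    # (requires the spark-sql-kafka connector on the classpath).
    raw = (
        spark.readStream
        .format("kafka")
        .option("kafka.bootstrap.servers", "broker:9092")  # hypothetical broker
        .option("subscribe", "orders")                     # hypothetical topic
        .load()
    )

    # Kafka delivers key/value as binary; cast to strings for downstream parsing.
    events = raw.selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)")

    # Land the stream in data lake storage as Parquet, with checkpointing
    # so the job can recover its position after a restart.
    query = (
        events.writeStream
        .format("parquet")
        .option("path", "/lake/raw/orders")                # hypothetical path
        .option("checkpointLocation", "/lake/_chk/orders")
        .start()
    )
    query.awaitTermination()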
You will be able to identify appropriate data processes, structures, and models to meet customer objectives, and you will have experience performing data transformations that serve customer and organisational needs.
You will be proficient at coding in languages such as Python, PySpark, and Scala, and at writing SQL (however, we are not looking for people who just write code).
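As a rough sketch of what that day-to-day coding can look like, the example below combines PySpark and SQL to turn raw landed data into a curated output; the table, column, and path names are hypothetical.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("transform-sketch").getOrCreate()

    # Hypothetical source data landed by an upstream pipeline.
    orders = spark.read.parquet("/lake/raw/orders")
    orders.createOrReplaceTempView("orders")

    # A typical SQL transformation: aggregate raw events into a curated view.
    daily_totals = spark.sql("""
        SELECT order_date,
               customer_id,
               SUM(amount) AS total_amount
        FROM orders
        GROUP BY order_date, customer_id
    """)

    # Write the curated output back to the lake, partitioned for consumers.
    daily_totals.write.mode("overwrite").partitionBy("order_date").parquet(
        "/lake/curated/daily_totals"
    )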
The client encourages people with curious minds to bring their outside experience, to listen to and challenge their peers, and to get the best possible results. In a nutshell, they like people who ask 'why?' The best solutions come from teams of people with diverse skills working together with mutual trust.
A candidate's ability to work in teams, often remotely, is critical; this means knowing when to listen and when to speak up.
Databases, modelling, and data flows: Common relational, non-relational, and spatial databases; workflow and ETL/ELT tools and methods.
Cloud: Azure, AWS, or GCP; Docker, Kubernetes, Helm, and data lake storage.