Data Engineer, Data Science
Conshohocken, Pennsylvania, US
6-month contract-to-hire.
As part of the data science team, you will:
Design and build infrastructure that integrates data from various sources for use by data scientists and software developers.
Create automated production pipelines to run data science models and build the tools to monitor their progress and ensure their completion.
Ingest complex, semi-structured data from a variety of real-world sources that dynamically evolve.
Evaluate the technical trade-offs of tools to build simple yet robust architectures that survive hardware failures.
Build scalable, integrated, and automated systems that comply with the development team's data models.
Promote the use of the latest Data Science and Big Data technologies.
Help build a brand-new data science platform.
What you need to succeed (must haves):
2+ years of professional experience with data development
Passion for working with data and interest in big data technologies.
Experience with ETL and data manipulation, for example through SQL technologies (Hive/PostgreSQL/MSSQL/MySQL/Cassandra/Redshift/Oracle).
Experience with data modeling.
Shell scripting (Bash or PowerShell)
Cloud-based environments (AWS or Azure)
Experience with Linux
Skills that are not required, but will set you apart:
Python development experience
Experience with Pandas and NumPy
Experience with Apache Spark (Scala or PySpark)
Familiarity with container technologies (Docker/Kubernetes)
Experience with Big Data or container pipeline/workflow tools (Oozie/Airflow)
Some DevOps experience, such as CI/CD, deployment, or networking.
Release tools (Jenkins/TeamCity/Travis/VSTS)