Data Engineer at Motion Recruitment Arlington, TX
Hybrid
Contract
Junior Level (1-3 years)
Job Description
Position Overview
We have an immediate 6-month contract-to-hire opportunity for a Data Engineer at Motion Recruitment in Arlington, TX. This role requires working onsite in Arlington two days per week (Tuesdays and Wednesdays preferred) and involves developing scalable data systems for processing large semi-structured and unstructured data sets. The position supports both off-line and in-line machine learning training, as well as search-engine-based analytics, through batch and streaming data transformation processes.
Key Responsibilities
- Troubleshoot complex problems and work across teams to meet commitments.
- Contribute to the evaluation, research, and experimentation of batch and streaming data engineering technologies.
- Collaborate with data engineering groups to showcase and adopt emerging technologies.
- Define and refine processes and procedures for the data engineering practice.
- Work with data scientists, data architects, ETL developers, and business partners to capture and format data from diverse sources.
- Code, test, deploy, monitor, document, and troubleshoot data processing systems and associated automation.
- Comply with all company policies and procedures.
Required Qualifications
- Bachelor’s Degree in a related field or equivalent work experience.
- 4-6+ years of experience in data engineering.
- 3+ years of Python experience, including DataFrame manipulation and transformation logic.
- Strong SQL knowledge.
- Experience with unstructured/semi-structured data (e.g., JSON, XML).
- Experience with Databricks.
- Knowledge of Kimball star-schema dimensional modeling.
- 3-5 years of hands-on experience processing large data sets.
- 3-5 years of hands-on experience with SQL, data modeling, and working with relational and/or NoSQL databases.
- Strong interpersonal, verbal, and writing skills.
Preferred Qualifications
- Experience with processing large data sets using Hadoop, HDFS, Spark, Kafka, Flume, or similar distributed systems.
- Experience with ingesting various data formats such as JSON, Parquet, SequenceFile, and working with cloud databases.
- Experience with cloud technologies (Azure, AWS, GCP) and native toolsets such as Azure ARM templates, HashiCorp Terraform, or AWS CloudFormation.
- Understanding of cloud computing technologies, business drivers, and emerging trends.
- Familiarity with hybrid cloud computing models, virtualization technologies, and various cloud delivery models (IaaS, PaaS, SaaS).
- Working knowledge of object storage technologies such as Azure Data Lake Storage Gen2 (ADLS), S3, MinIO, or Ceph.
- Experience with containerization including Docker, Kubernetes, Spark on Kubernetes, or Spark Operator.
- Familiarity with Agile development frameworks (SAFe, Scrum) and Application Lifecycle Management.
- Experience with source control management systems, build systems, code quality tools, artifact repositories, and CI/CD pipelines.
- Experience with NoSQL data stores such as Azure Cosmos DB, MongoDB, Cassandra, or Redis, including technologies that integrate search capabilities.
- Experience in creating and maintaining ETL processes.
- Knowledge of IT governance and privacy compliance best practices.
- Experience with Adobe solutions (e.g., Adobe Experience Platform, DTM/Launch) and REST APIs.
- Proficiency in digital data collection and familiarity with digital technology solutions (DMPs, CDPs, Tag Management Platforms, etc.).
- Understanding of real-time CDP and journey analytics solutions.
- Knowledge of big data platforms, data stream processing pipelines, data lake architectures, and data lakehouses.
- Strong SQL querying skills with the ability to derive actionable insights.
- Understanding of cloud solutions and architectures on platforms such as Google Cloud Platform, Microsoft Azure, and AWS.
- Familiarity with GDPR, privacy, and security topics.
Required Skills
Data Modeling
SQL
Stream Processing
Databricks
Batch Processing
ETL Processes
Data Engineering
Semi-Structured Data Handling
Python