Data Engineer at Motion Recruitment, Arlington, TX

Motion Recruitment
3 months ago
Arlington, TX, United States
Hybrid
Contract
Junior Level (1-3 years)

Job Description

Position Overview

We have an immediate 6-month contract-to-hire opportunity for a Data Engineer. This role requires working onsite in Arlington, TX two days per week (Tuesdays and Wednesdays preferred) and involves developing scalable data systems for processing large semi-structured or unstructured data sets. The position supports both offline and inline machine learning training, as well as search-engine-based analytics, through batch and streaming data transformation processes.

Key Responsibilities

  • Troubleshoot complex problems and work across teams to meet commitments.
  • Contribute to the evaluation, research, and experimentation of batch and streaming data engineering technologies.
  • Collaborate with data engineering groups to showcase and adopt emerging technologies.
  • Define and refine processes and procedures for the data engineering practice.
  • Work with data scientists, data architects, ETL developers, and business partners to capture and format data from diverse sources.
  • Code, test, deploy, monitor, document, and troubleshoot data processing systems and associated automation.
  • Comply with all company policies and procedures.

Required Qualifications

  • Bachelor’s Degree in a related field or equivalent work experience.
  • 4-6+ years of experience in data engineering.
  • 3+ years of Python experience, including manipulation of DataFrames and transformation logic.
  • Strong SQL knowledge.
  • Experience with unstructured/semi-structured data (e.g., JSON, XML).
  • Experience with Databricks.
  • Knowledge of Ralph Kimball star schema modeling.
  • 3-5 years of hands-on experience processing large data sets.
  • 3-5 years of hands-on experience with SQL, data modeling, and working with relational and/or NoSQL databases.
  • Strong interpersonal, verbal, and writing skills.

Preferred Qualifications

  • Experience with processing large data sets using Hadoop, HDFS, Spark, Kafka, Flume, or similar distributed systems.
  • Experience with ingesting various data formats such as JSON, Parquet, SequenceFile, and working with cloud databases.
  • Experience with cloud technologies (Azure, AWS, GCP) and native toolsets like Azure ARM Templates, HashiCorp Terraform, or AWS CloudFormation.
  • Understanding of cloud computing technologies, business drivers, and emerging trends.
  • Familiarity with hybrid cloud computing models, virtualization technologies, and various cloud delivery models (IaaS, PaaS, SaaS).
  • Working knowledge of object storage technologies such as Azure Data Lake Storage Gen2 (ADLS), S3, MinIO, or Ceph.
  • Experience with containerization including Docker, Kubernetes, Spark on Kubernetes, or Spark Operator.
  • Familiarity with Agile development frameworks (SAFe, Scrum) and Application Lifecycle Management.
  • Experience with source control management systems, build systems, code quality tools, artifact repositories, and CI/CD pipelines.
  • Experience with NoSQL data stores such as Cosmos DB, MongoDB, Cassandra, Redis, or related technologies integrating search capabilities.
  • Experience in creating and maintaining ETL processes.
  • Knowledge of IT governance and privacy compliance best practices.
  • Experience with Adobe solutions (e.g., Adobe Experience Platform, DTM/Launch) and REST APIs.
  • Proficiency in digital data collection and familiarity with digital technology solutions (DMPs, CDPs, Tag Management Platforms, etc.).
  • Understanding of real-time CDP and journey analytics solutions.
  • Knowledge of big data platforms, data stream processing pipelines, data lake architectures, and data lakehouses.
  • Strong SQL querying skills with the ability to derive actionable insights.
  • Understanding of cloud solutions and architectures on platforms like Google Cloud Platform, Microsoft Azure, and Amazon Web Services (AWS).
  • Familiarity with GDPR, privacy, and security topics.

Required Skills

Data Modeling
SQL
Stream Processing
Databricks
Batch Processing
ETL Processes
Data Engineering
Semi-Structured Data Handling
Python