Google Cloud Platform Data Engineer (Locals Only)

Jobs via Dice · 6 months ago
Mountain View, California, United States
On-site
Full-time
Junior Level (1-3 years)

Job Description

Position Overview

Dice is the leading career destination for tech experts. Our client, VeridianTech, is seeking a Google Cloud Platform Data Engineer to join their team on-site in Mountain View, CA for a 12+ month engagement. In this role, you’ll develop and enhance Python frameworks and libraries and design robust data pipelines that power data processing, data quality, and machine learning operations. Apply via Dice today!

Key Responsibilities

  • Develop and enhance Python frameworks and libraries to support data processing, quality, lineage, governance, analysis, and machine learning operations.
  • Design, build, and maintain scalable and efficient data pipelines on Google Cloud Platform.
  • Implement robust monitoring, logging, and alerting systems to ensure the reliability and stability of data infrastructure.
  • Build scalable batch pipelines on Google Cloud Platform using BigQuery, Dataflow, and the Airflow/Cloud Composer scheduling and execution framework (a minimal sketch follows this list).
  • Build streaming data pipelines using Scala, Pub/Sub, Akka, and Dataflow on Google Cloud Platform (see the second sketch after this list).
  • Design data models for efficient storage and retrieval in support of machine learning workloads, using technologies such as Bigtable and Vertex AI Feature Store.
  • Contribute to shared Data Engineering tooling and standards to improve productivity and quality for the team.
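
The bullets above describe the shape of the work rather than a specific implementation, but a minimal sketch may make the batch case concrete: an Airflow DAG, runnable on Cloud Composer, that schedules a daily BigQuery aggregation. Every project, dataset, table, and column name here is a hypothetical placeholder, not a detail from the posting.

```python
# Hypothetical illustration only; assumes Airflow 2.4+ with the
# apache-airflow-providers-google package installed.
from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.operators.bigquery import (
    BigQueryInsertJobOperator,
)

with DAG(
    dag_id="daily_events_rollup",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    # Cloud Composer triggers this DAG on the daily schedule; the operator
    # submits a query job to BigQuery and waits for it to complete.
    rollup = BigQueryInsertJobOperator(
        task_id="rollup_daily_events",
        configuration={
            "query": {
                "query": (
                    "SELECT user_id, COUNT(*) AS event_count "
                    "FROM `example-project.analytics.events` "
                    "WHERE DATE(event_ts) = '{{ ds }}' "  # Airflow logical date
                    "GROUP BY user_id"
                ),
                "useLegacySql": False,
            }
        },
    )
```

The same pattern extends to multi-step batch jobs by chaining additional operators in the DAG.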

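For the streaming responsibility, the posting names Scala, Akka, Pub/Sub, and Dataflow; to keep a single language in these sketches, here is the equivalent pipeline shape in Python with Apache Beam (the SDK behind Dataflow): read a hypothetical Pub/Sub topic, parse JSON events, and count events per user in one-minute windows. Again, all names are placeholders.

```python
# Hypothetical illustration only: the topic path and message fields are
# placeholders. Run with DataflowRunner to execute on Google Cloud Dataflow.
import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions
from apache_beam.transforms.window import FixedWindows

def run():
    # streaming=True is required for unbounded Pub/Sub sources.
    options = PipelineOptions(streaming=True)
    with beam.Pipeline(options=options) as pipeline:
        (
            pipeline
            | "ReadEvents" >> beam.io.ReadFromPubSub(
                topic="projects/example-project/topics/events"
            )
            | "Parse" >> beam.Map(lambda msg: json.loads(msg.decode("utf-8")))
            | "KeyByUser" >> beam.Map(lambda event: (event["user_id"], 1))
            | "Window" >> beam.WindowInto(FixedWindows(60))  # 60-second windows
            | "CountPerUser" >> beam.CombinePerKey(sum)
            | "Log" >> beam.Map(print)  # placeholder sink for the sketch
        )

if __name__ == "__main__":
    run()
```
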
Required Qualifications

  • Python Expertise: Proficient in Python, with experience writing, maintaining, and optimizing frameworks and libraries for data processing and integration.
  • Code Management: Use Git and GitHub for source control, code reviews, and version management.
  • Google Cloud Platform Proficiency: Extensive experience working with GCP services (e.g., BigQuery, Cloud Dataflow, Pub/Sub, Cloud Storage).
  • Software Engineering: Strong understanding of best practices including version control, collaborative development, code reviews, and CI/CD.
  • Data Management: Deep knowledge of data modeling, ETL/ELT, and data warehousing concepts.
  • Problem-Solving: Excellent problem-solving skills with the ability to tackle complex data engineering challenges.
  • Communication: Ability to explain complex technical details to non-technical stakeholders.
  • Data Science Stack: Proficiency in data analysis with tools such as Jupyter Notebook, pandas, and NumPy.
  • Frameworks/Tools: Familiarity with machine learning and data processing tools such as TensorFlow, Apache Spark, and scikit-learn.
  • Education: Bachelor’s or master’s degree in Computer Science, Engineering, Computer Information Systems, Mathematics, Physics, or a related field, or equivalent software development training.

Required Skills

Pub/Sub
pandas
Machine Learning
Apache Spark
GitHub
Git
ETL/ELT
TensorFlow
Cloud Dataflow
Akka
Jupyter Notebook
Python
BigQuery
scikit-learn
NumPy
Bigtable
Google Cloud Platform
Data Modeling
Scala
Airflow/Composer