Google Cloud Platform Data Engineer (Locals Only)

Jobs via Dice · 6 months ago
Mountain View, California, United States
On-site
Full-time
Junior Level (1-3 years)

Job Description

Position Overview

Dice is the leading career destination for tech experts. Our client, VeridianTech, is seeking a Google Cloud Platform Data Engineer to join their team on-site in Mountain View, CA for a 12+ month engagement. In this role, you’ll develop and enhance Python frameworks and libraries and design robust data pipelines that power data processing, data quality, and machine learning operations. Apply via Dice today!

Key Responsibilities

  • Develop and enhance Python frameworks and libraries to support data processing, quality, lineage, governance, analysis, and machine learning operations.
  • Design, build, and maintain scalable and efficient data pipelines on Google Cloud Platform.
  • Implement robust monitoring, logging, and alerting systems to ensure the reliability and stability of data infrastructure.
  • Build scalable batch pipelines on Google Cloud Platform using BigQuery, Dataflow, and the Airflow/Cloud Composer scheduling and execution framework (a minimal sketch follows this list).
  • Build streaming data pipelines using Scala, Pub/Sub, Akka, and Dataflow on Google Cloud Platform (see the second sketch after this list).
  • Design data models for efficient storage and retrieval in support of machine learning workloads, using technologies such as Bigtable and Vertex AI Feature Store.
  • Contribute to shared Data Engineering tooling and standards to improve productivity and quality for the team.
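
The bullets above describe the shape of the work rather than a specific implementation, but a minimal sketch may make the batch case concrete: an Airflow DAG, runnable on Cloud Composer, that schedules a daily BigQuery aggregation. Every project, dataset, table, and column name here is a hypothetical placeholder, not a detail from the posting.

```python
# Hypothetical illustration only; assumes Airflow 2.4+ with the
# apache-airflow-providers-google package installed.
from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.operators.bigquery import (
    BigQueryInsertJobOperator,
)

with DAG(
    dag_id="daily_events_rollup",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    # Cloud Composer triggers this DAG on the daily schedule; the operator
    # submits a query job to BigQuery and waits for it to complete.
    rollup = BigQueryInsertJobOperator(
        task_id="rollup_daily_events",
        configuration={
            "query": {
                "query": (
                    "SELECT user_id, COUNT(*) AS event_count "
                    "FROM `example-project.analytics.events` "
                    "WHERE DATE(event_ts) = '{{ ds }}' "  # Airflow logical date
                    "GROUP BY user_id"
                ),
                "useLegacySql": False,
            }
        },
    )
```

The same pattern extends to multi-step batch jobs by chaining additional operators in the DAG.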

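For the streaming responsibility, the posting names Scala, Akka, Pub/Sub, and Dataflow; to keep a single language in these sketches, here is the equivalent pipeline shape in Python with Apache Beam (the SDK behind Dataflow): read a hypothetical Pub/Sub topic, parse JSON events, and count events per user in one-minute windows. Again, all names are placeholders.

```python
# Hypothetical illustration only: the topic path and message fields are
# placeholders. Run with DataflowRunner to execute on Google Cloud Dataflow.
import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions
from apache_beam.transforms.window import FixedWindows

def run():
    # streaming=True is required for unbounded Pub/Sub sources.
    options = PipelineOptions(streaming=True)
    with beam.Pipeline(options=options) as pipeline:
        (
            pipeline
            | "ReadEvents" >> beam.io.ReadFromPubSub(
                topic="projects/example-project/topics/events"
            )
            | "Parse" >> beam.Map(lambda msg: json.loads(msg.decode("utf-8")))
            | "KeyByUser" >> beam.Map(lambda event: (event["user_id"], 1))
            | "Window" >> beam.WindowInto(FixedWindows(60))  # 60-second windows
            | "CountPerUser" >> beam.CombinePerKey(sum)
            | "Log" >> beam.Map(print)  # placeholder sink for the sketch
        )

if __name__ == "__main__":
    run()
```
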
Required Qualifications

  • Python Expertise: Proficient in Python, with experience writing, maintaining, and optimizing frameworks and libraries for data processing and integration.
  • Code Management: Use Git and GitHub for source control, code reviews, and version management.
  • Google Cloud Platform Proficiency: Extensive experience working with GCP services (e.g., BigQuery, Cloud Dataflow, Pub/Sub, Cloud Storage).
  • Software Engineering: Strong understanding of best practices including version control, collaborative development, code reviews, and CI/CD.
  • Data Management: Deep knowledge of data modeling, ETL/ELT, and data warehousing concepts.
  • Problem-Solving: Excellent problem-solving skills with the ability to tackle complex data engineering challenges.
  • Communication: Ability to explain complex technical details to non-technical stakeholders.
  • Data Science Stack: Proficiency in data analysis with tools such as Jupyter Notebook, pandas, and NumPy.
  • Frameworks/Tools: Familiarity with machine learning and data processing tools such as TensorFlow, Apache Spark, and scikit-learn.
  • Education: Bachelor’s or master’s degree in Computer Science, Engineering, Computer Information Systems, Mathematics, Physics, or a related field, or equivalent software development training.

Required Skills

Pub/Sub
pandas
Machine Learning
Apache Spark
GitHub
Git
ETL/ELT
TensorFlow
Cloud Dataflow
Akka
Jupyter Notebook
Python
BigQuery
scikit-learn
NumPy
Bigtable
Google Cloud Platform
Data Modeling
Scala
Airflow/Composer