SpotHero is seeking a Staff/Principal Data Engineer to join the Data Engineering squad. This squad partners with data consumers such as Data Science, Marketing, Engineering, and Business Analysts to provide data platform solutions that meet their day-to-day needs and long-term vision.
As a Staff/Principal Data Engineer, you’ll focus heavily on backend application development, building reusable infrastructure services that enable our stakeholders to model, store, access, process, and analyze SpotHero’s data. You’ll also design, instantiate, observe, and maintain infrastructure services, both AWS-managed and open source. In this role, you’ll influence the technology choices and patterns established for data-heavy workloads at SpotHero.
Who we are:
SpotHero is a parking reservation service that helps drivers find and reserve parking at thousands of lots and garages in all major cities. We are on a mission to bring the parking industry into the future through technology. Drivers in the U.S. and Canada use the SpotHero mobile app or website to reserve convenient, affordable parking on-the-go or in advance, and parking companies rely on us to help them reach new customers while optimizing their business. We combine hard-won industry knowledge, a large parking demand and supply dataset, and solid software engineering to serve both sides of the parking market. SpotHero offers a flexible daily working environment, and this role is open to candidates based in Illinois, New York, California, Washington State, Texas, Maryland, Ontario, British Columbia, and Nova Scotia. Full-time remote candidates in any of these locations are encouraged to apply!
What will you do:
- Work with our analytics, marketing and data science teams to understand our data processing needs.
- Be a key hands-on contributor to the design and implementation of our data platform solutions from the infrastructure layer up to the API.
- Model and architect our data in a way that will scale with the increasingly complex ways we’re analyzing it.
- Build robust pipelines that make sure data is where it needs to be, when it needs to be there.
- Build frameworks and tools to help our software engineers, data analysts, and data scientists design and build their own data pipelines in a self-service manner.
- Perform performance testing and engineering to ensure that our systems always scale to meet our needs.
- Be a key member of the team focused on pure hands-on contribution to the implementation and operation of our data platform.
- Contribute to more junior team members' development via thoughtful reviews and your own exemplary work.
Data Modeling/Architecting:
- Design data models with a broader understanding of underlying systems.
- Create approachable, thorough documentation of data models describing how to access their data in a performant way.
- Build performant models, designed with quality in mind, that remain consistent with their accompanying documentation.
- Consult with stakeholders on the best practices for creation and deployment of data models and data flows.
Data Processing:
- Define and enforce service level agreements between the products we own and their stakeholders, including configuration of monitoring and alerting.
- Understand data lineage and dependencies between data pipelines.
- Design, implement, and maintain complex data processing pipelines which involve multiple integration points, including those which rely on distributed systems like Kafka and Spark.
- Influence data processing and infrastructure practices across all of SpotHero.
- Determine the best architecture, batch or streaming, for each application being built.
Working with Infrastructure:
- Evaluate different architectures for new systems or changes to the company’s existing systems, and propose thorough, specific designs for implementing those architectures.
- Provision new infrastructure in cloud environments.
- Deploy and manage containerized applications running in Kubernetes.
- Identify and remedy security, cost, and maintainability issues in the team’s infrastructure.
- Manage and integrate autoscaling, logging, monitoring and alerting for the team’s systems.
- Write infrastructure as code that is environment agnostic.
We care about your abilities, not how you gained them.
You might demonstrate the capabilities below through any combination of relevant professional experience, experience in a research setting, formal education, self-guided learning, open source contributions, or public speaking / writing / teaching experience.
- You are able to design and implement high-quality software in Python and at least one JVM language (we use Kotlin, but Java or Scala experience works).
- You have strong SQL skills and data modeling experience.
- You have experience provisioning and managing infrastructure with infrastructure-as-code tools (we use Terraform, but experience with similar tools like CloudFormation, Pulumi, or SaltStack is ok).
- Hands-on experience using multiple data platforms and tools (e.g. Airflow, Hive, Kafka, Postgres, Redshift, S3, Spark, Trino), and experience deploying, monitoring, and maintaining some of them.
- Experience designing and implementing software (pipelines, services and client libraries) that is run in Docker containers, automatically tested on a continuous integration (CI) system, and versioned in git. You have experience writing shell scripts, Makefiles, or other configuration to glue together these components.
- Ability to deploy containerized software in Kubernetes, or sufficient experience in similar technologies like Apache Mesos or Amazon ECS.
- Experience designing and implementing architectures that rely on cloud compute, networking, storage, and security services (we’re an AWS shop, but similar experience in Azure or GCP is ok if you’re willing to learn).
- Demonstrated experience designing and supporting technology intended to be used by other stakeholders.
- Passion for ensuring that the timeliness, availability, and quality of our highest-value datasets meet established SLOs.
- Comfortable working on a small team with minimal direction.
- Demonstrated experience measuring the impact of technology solutions.
- Strong ability to communicate on both business and technical subjects.
Nice to Haves:
- Experience with message-driven or streaming architectures, such as those built on Kafka, Spark, or Flink.
- Postgres, MySQL, or other RDBMS experience.
- Redshift, Presto, or other MPP database experience.
- Experience with cloud data pipeline services like dbt Cloud, Fivetran, or Hightouch.
- Airflow, Luigi, or other ETL scheduling tool experience.
- Experience contributing to open source projects that are relevant to data engineering and data science.
Technology we use:
- Our Android Stack is: Kotlin and XML (standard for Android apps) using MVI architecture (still working on refactoring old views), our database layer is built in Realm. Bitrise for CI/CD. We also make heavy use of Dagger, RxJava, Espresso (testing). Network stack uses Retrofit.
- Our iOS Stack is: Swift using MVC architecture, CoreData for Local Storage, XCUI for UI Testing, XCTest for Unit testing, SPM for Package Management, Fastlane for app automation and build scripts, Bitrise for CI/CD, and Sentry for crash reporting.
- Our Back End Stack is: a monolith using Django/Python/PostgreSQL. We are moving our monolith to a modular monolith using Domain-Driven Design. When relevant, we extract specific domains into services, currently using Java, Kotlin, and Go. We also use Docker and deploy our apps via Kubernetes. We use Kafka for asynchronous and gRPC for synchronous service-to-service communication. Our integrations are on .NET Core, moving to Kotlin.
- Our Front End Stack is: React/Redux, Sass, Jest/React Testing Library/Cypress, and Webpack. We maintain a private npm repository with shareable UI components, utility functions, Babel/ESLint/Prettier configurations, and custom tasks.
- Our Data Stack is: Postgres for our monolith database, with Redis for caching. We also use Redshift as our data warehouse and S3 as our data lake, which is queried using Presto. We use Airflow and Spark for ETL, and do some stream processing (Kafka Streams and Spark at the moment). Our model pipeline uses scikit-learn and pandas. Our analysts use Looker as our business intelligence tool, and we use QuickSight for dashboards on our external data products.
- Our Dev Tools Stack is: AWS + Kubernetes for hosting, Terraform + Helm charts for IaC/deployment, ConcourseCI for CI/CD, and Prometheus/Alertmanager/VictorOps for team alerting. We’re starting to work on multi-region availability for our services.
What we are offering:
- Career game changer – A truly unique experience to work for a fast-growing startup in a role with unlimited potential for growth.
- Excellent benefits –
- In the US we cover up to 95% of Medical Premiums, 50% of Dental & Vision Premiums, company sponsored Life Insurance, 401K, and generous parental leave.
- In Canada we offer Medical (prescription drug and paramedical coverage), Dental, Vision, Life Insurance, STD, and LTD.
- Flexible PTO policy and great work/life balance – We value and support each individual team member.
- Annual parking stipend – we help people park!
- The opportunity to collaborate with fun, innovative, and passionate people in a casual, yet highly productive atmosphere.
- A workplace recognized as the Best Consumer Web Company by Built in Chicago, Top Company Culture by Entrepreneur, a Top Workplace by Chicago Tribune, and one of Chicago’s Best Places to Work for Women Under 35 by Crain’s Chicago Business.
Steps to apply: Please include any GitHub account, LinkedIn profile, and any project that you’re particularly proud of. We love seeing work that others loved working on.
At SpotHero, we Respect Fellow Drivers by providing an inclusive interview experience for everyone, including people with disabilities. We are happy to provide reasonable accommodations to candidates in need of individualized support during the hiring process. Please let our team know of your need when you apply or as you begin interviewing with our team.
Additionally, because we want to Remember to Signal, if you choose to provide us personal information in connection with a job application, please review our Applicant Privacy Notice which provides details about what information we collect and process about you in order to consider your candidacy.
SpotHero is an equal opportunity employer. We know that a diverse workforce is the strongest workforce, and are committed to building and supporting an inclusive environment for all.