Factored
Company
Data Engineer (Databricks)
Job Description
Factored was conceived in Palo Alto, California by Andrew Ng and a team of highly experienced AI researchers, educators, and engineers to help address the significant global shortage of qualified AI and machine-learning engineers. We know that exceptional technical aptitude, intelligence, communication skills, and passion are equally distributed around the world, and we are committed to testing, vetting, and nurturing the most talented engineers for our program and on behalf of our clients.
We are currently looking for an exceptionally talented Data Engineer to join our team. Your responsibilities will range from data aggregation, scraping, validation, and transformation to quality assurance and DataOps administration of both structured and unstructured datasets. Ideally, you have experience optimizing data architecture, building data pipelines, and wrangling data to suit the needs of our algorithms and application functionality.
Functional Responsibilities:
- Develop and maintain ETL (Extract, Transform, Load) processes using Python.
- Design, build, and optimize large-scale data pipelines on Databricks.
- Write efficient SQL queries to extract, manipulate, and analyze data from various databases.
- Design and develop optimal data processing techniques: automating manual processes, data delivery, data validation, and data augmentation.
- Collaborate with stakeholders to understand data needs and translate them into scalable solutions.
- Design and develop API integrations to feed different data models.
- Architect and implement new features from scratch, partnering with AI/ML engineers to identify data sources, gaps and dependencies.
- Identify bugs and performance issues across the stack, using performance monitoring and testing tools to ensure data integrity and a quality user experience.
- Build a highly scalable infrastructure using SQL and AWS big data technologies.
- Keep data secure and compliant with international data handling rules.
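As a rough illustration of the ETL work described above, here is a minimal extract-transform-load sketch in plain Python using only the standard library (sqlite3 stands in for a production warehouse such as Databricks; the table, column names, and sample rows are hypothetical):

```python
import sqlite3

# Hypothetical raw records, standing in for data scraped or aggregated upstream.
raw_rows = [
    {"user_id": "1", "amount": "19.99", "country": "pe"},
    {"user_id": "2", "amount": "bad",   "country": "co"},  # fails validation
    {"user_id": "3", "amount": "5.00",  "country": "mx"},
]

def extract(rows):
    """Yield raw records one at a time."""
    yield from rows

def transform(rows):
    """Validate and normalize each record, dropping rows that fail parsing."""
    for row in rows:
        try:
            yield (int(row["user_id"]), float(row["amount"]), row["country"].upper())
        except ValueError:
            continue  # in production this row would go to a dead-letter table instead

def load(conn, rows):
    """Write cleaned rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS payments (user_id INTEGER, amount REAL, country TEXT)"
    )
    conn.executemany("INSERT INTO payments VALUES (?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(conn, transform(extract(raw_rows)))
print(conn.execute("SELECT COUNT(*) FROM payments").fetchone())
# The invalid row is dropped, leaving two clean rows in the table.
```

In a real pipeline the same extract/transform/load stages would typically be expressed as Spark jobs on Databricks and scheduled by an orchestrator, but the validation-then-load shape stays the same.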
Qualifications:
- 3-5+ years of professional experience shipping high-quality, production-ready code.
- Strong computer science foundations, including data structures and algorithms, operating systems, computer networks, databases, and object-oriented programming.
- Experience with Databricks.
- Experience in Python.
- Experience setting up data pipelines using relational SQL and NoSQL databases such as Postgres, Cassandra, or MongoDB.
- Experience with cloud services for handling data infrastructure, such as Snowflake (preferred), Azure, Databricks, and/or AWS.
- Experience with orchestration tools such as Airflow.
- Proven success manipulating, processing, and extracting value from large datasets.
- Experience with Big Data tools, including Hadoop, Spark, Kafka, etc.
- Expertise with version control systems, such as Git.
- Strong analytic skills related to working with unstructured datasets.
- Excellent verbal and written communication skills in English.
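To make the "working with unstructured datasets" qualification concrete, here is a small sketch of extracting structured records from free-form text with the standard library (the log format, field names, and sample lines are hypothetical):

```python
import re
from collections import Counter

# Hypothetical unstructured log lines, standing in for a raw text dataset.
logs = [
    "2024-05-01 12:00:03 INFO user=42 action=login",
    "2024-05-01 12:00:07 ERROR user=42 action=checkout",
    "2024-05-01 12:01:11 INFO user=7 action=login",
    "not a log line at all",
]

LINE_RE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) \d{2}:\d{2}:\d{2} "
    r"(?P<level>\w+) user=(?P<user>\d+) action=(?P<action>\w+)$"
)

def parse(lines):
    """Turn matching lines into dicts; skip lines that don't fit the pattern."""
    for line in lines:
        m = LINE_RE.match(line)
        if m:
            yield m.groupdict()

records = list(parse(logs))
actions = Counter(r["action"] for r in records)
print(actions)  # two logins, one checkout; the malformed line is skipped
```

The same pattern, parse-what-matches and route the rest aside, scales up when the line-by-line loop is replaced with a distributed job over the full dataset.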