About The Position

We are seeking a highly skilled and motivated Data Engineer to join our growing AI/ML team. This role is ideal for someone passionate about building scalable data pipelines, enabling machine learning workflows, and integrating cutting-edge Large Language Models (LLMs) into production systems. You will work closely with data scientists, ML engineers, and software developers to design and implement robust data infrastructure that powers intelligent applications.

Requirements

  • Bachelor's or Master's degree in Computer Science, Data Engineering, or a related field.
  • 3 years of experience in data engineering or backend development.
  • Strong proficiency in Python and libraries such as Pandas, NumPy, and PySpark.
  • Experience with AI/ML frameworks (e.g., TensorFlow, PyTorch, Scikit-learn).
  • Hands-on experience with LLMs and NLP tools (e.g., LangChain, Hugging Face, OpenAI API).
  • Proficiency in SQL and working with relational and NoSQL databases.
  • Familiarity with cloud platforms (AWS, GCP, Azure) and containerization (Docker, Kubernetes).
  • Knowledge of CI/CD pipelines and version control (Git).

Nice To Haves

  • Experience with MLOps tools (MLflow, Airflow, Kubeflow).
  • Understanding of data privacy and security best practices.
  • Exposure to vector databases (e.g., Pinecone, FAISS, Weaviate).
  • Experience with real-time data processing (Kafka, Flink).

Responsibilities

  • Design, build, and maintain scalable and efficient ETL/ELT pipelines using Python and modern data engineering tools.
  • Collaborate with AI/ML teams to support model training, evaluation, and deployment workflows.
  • Develop and optimize data schemas, storage solutions, and APIs for structured and unstructured data.
  • Integrate and fine-tune LLMs (e.g., OpenAI, Hugging Face Transformers) for various business use cases.
  • Ensure data quality, governance, and compliance across all data systems.
  • Monitor and troubleshoot data workflows and model performance in production.
  • Automate data ingestion from diverse sources including APIs, databases, and cloud storage.
  • Contribute to the development of internal tools and libraries for ML experimentation and deployment.
© 2024 Teal Labs, Inc