(Remote) Senior AI / Knowledge Graph Engineer (m/f/d)
Pinnipedia Technologies GmbH
engineering
full-time
senior
Berlin
Posted an hour ago
Remote, Engineering, Professional / experienced
About the Role
Pinnipedia is a new Berlin startup building a cloud platform that automates and assists the creation of audit-ready IT-security concepts (e.g., BSI-Grundschutz, C5). We’re IGP-funded (2025/26) and co-develop with FU Berlin and pilot users from industry and security consulting. We’re hiring an AI Engineer to turn messy inputs into structured knowledge and reliable answers.
Your Mission
Own the end-to-end pipeline that turns unstructured documents into a validated, queryable knowledge graph. You are accountable for extraction quality, graph integrity, and the data layer that backs the product's read path.
Tasks
• LLM extraction pipelines: document chunking, property and relationship extraction, cross-chunk reconciliation, gap detection. Built with structured-output LLM agents orchestrated by durable workflows.
• Knowledge graph: schema design as typed Pydantic models, Cypher access patterns and indexing strategy, graph operations, schema evolution and migration. Scope ends at the graph boundary: API contracts and query abstractions exposed to consumers belong to the full-stack engineer.
• Deterministic rule engines: table-driven evaluators for cases where code beats LLM judgment; clear contracts between deterministic and probabilistic components.
• Data validation & quality: schema enforcement, required-property contracts, audit trails, eval harnesses (expert review, unsupervised checks, synthetic fixtures, LLM-as-judge).
• Live data ops: backfills, coordinated migrations across relational + graph stores, observability on extraction throughput and quality, incident response.
Requirements
Must-have
• 5+ years shipping data/AI systems to production with real customers; has been on-call for live pipelines and knows what breaks at 2am.
• Strong Python (typed, modern) and SQL. Comfortable with PostgreSQL under load.
• Production experience with at least one graph database (Neo4j preferred; Neptune, ArangoDB, TigerGraph acceptable): schema design and query tuning, not toy use.
• Production LLM pipeline experience: structured output, agent orchestration, prompt and version management, evaluation frameworks. PydanticAI, LangChain, DSPy, or Instructor all welcome.
• Durable workflow orchestration in production (DBOS, Temporal, Airflow, Prefect, Dagster).
• Test-first discipline: integration tests against real datastores (Testcontainers or equivalent), not mock-heavy unit tests.
• Fluent English skills.
Nice-to-have
• Experience with regulated, compliance-driven, or standards-heavy extraction domains (legal, medical, financial, security/audit).
• Designed deterministic evaluators alongside LLM components and knows when to reach for which.
• Contributions to data contracts, schema governance, or ontology work.
• German language skills.
Benefits
• Remote, full-time with flexible scheduling. CET (Berlin) timezone availability expected.
• Relocation is possible once a successful working relationship has been established.
• Competitive salary: 32.000–42.000 € base (premium for exceptional senior profiles).
• Small, focused team; direct collaboration with the Product Owner and Full-Stack Engineer.
• Modern tooling, real ownership, and a learning budget for role-relevant training.
• Impact: help SMEs meet rising security requirements with less friction.
Application
Apply on JOIN with your CV (PDF) and a short note (max 200 words) describing how you would design a KG-backed RAG pipeline (the ontology scope, indexing, retrieval, and evaluation you’d use).
Process: 20-min intro → 90-min practical (graph modeling + retrieval evaluation) → 45-min team chat → references. We review applications within 5 business days.