🌐 Federated Data Spaces

Domain-specific ecosystems for secure, governed data sharing without central data lakes

What It Is

A Federated Data Space is a decentralized ecosystem where organizations share data under common governance rules, without moving data to a central repository. Participants retain sovereignty over their data: they decide who accesses what, for how long, and under which conditions. Think "data marketplace" meets "trust framework."

Core Principle: Data stays at the source. Instead of copying data to data lakes, participants expose data products (APIs, query endpoints) that others can access with permission. Governance, identity, and access control are federated across organizational boundaries.

Inspired by Europe's GAIA-X and International Data Spaces (IDS) initiatives, Finnish data spaces focus on domain-specific use cases: healthcare (patient data continuity), education (credential portability), and business (supply chain transparency). Each space has its own rules but interoperates with the others via shared standards.

Technical Architecture

Core Components

Component | Technology | Purpose
Data Connector | IDS Connector, Eclipse Dataspace Components (EDC) | Gateway for secure data exchange between participants
Identity Provider | Decentralized Identifiers (DIDs), X.509 certificates | Authenticate participants (organizations, systems)
Usage Control | ODRL (Open Digital Rights Language) | Enforce data usage policies (e.g., "no redistribution", "delete after 30 days")
Data Catalog | DCAT (Data Catalog Vocabulary), CKAN | Discover available datasets/APIs within the space
Vocabulary Hub | Suomi.fi Interoperability Platform (SKOS, OWL) | Shared semantic layer for data interoperability
Clearing House | Blockchain or distributed ledger | Audit trail for data transactions (who accessed what, when)
Trust Anchor | PKI infrastructure, Trust Registry | Verify participant credentials (e.g., healthcare provider license)
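
To make the Data Catalog row concrete, here is a minimal sketch of a catalog entry expressed as DCAT JSON-LD and built in Python. The vocabulary terms (dcat:Dataset, dcat:distribution, dcat:accessURL, dct:title, odrl:hasPolicy) come from the W3C vocabularies; the function name, the dataset identifier, and the policy URL are assumptions made for illustration, not part of any Finnish data space specification.

```python
import json


def dcat_dataset(dataset_id, title, publisher_did, access_url, policy_url):
    """Build a DCAT dataset description that a provider could publish to a catalog."""
    return {
        "@context": {
            "dcat": "http://www.w3.org/ns/dcat#",
            "dct": "http://purl.org/dc/terms/",
            "odrl": "http://www.w3.org/ns/odrl/2/",
        },
        "@id": dataset_id,
        "@type": "dcat:Dataset",
        "dct:title": title,
        "dct:publisher": {"@id": publisher_did},
        # Link to the usage policy so consumers can read the terms before requesting data.
        "odrl:hasPolicy": {"@id": policy_url},
        "dcat:distribution": [
            {"@type": "dcat:Distribution", "dcat:accessURL": {"@id": access_url}}
        ],
    }


# Hypothetical identifiers, reusing the hospital.fi example from this page.
entry = dcat_dataset(
    dataset_id="https://hospital.fi/catalog/patient-records",
    title="Patient records (query endpoint)",
    publisher_did="did:web:hospital.fi",
    access_url="https://hospital.fi/api/patient-records",
    policy_url="https://hospital.fi/policies/data-sharing-policy",
)
print(json.dumps(entry, indent=2))
```

The entry describes where the data product can be reached and under which policy; the data itself stays with the provider.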

Data Space Architecture (IDS Model)

┌────────────────────────────────────────────────────────────────┐
│                      Federated Data Space                      │
│               (e.g., Finnish Health Data Space)                │
└────────────────────────────────────────────────────────────────┘

┌─────────────────┐                            ┌─────────────────┐
│  DATA PROVIDER  │                            │  DATA CONSUMER  │
│  (Hospital A)   │                            │ (Research Inst) │
│                 │                            │                 │
│ Health Records  │                            │ Analysis System │
│ Database        │                            │                 │
└────────┬────────┘                            └────────▲────────┘
         │                                              │
┌────────▼────────┐       Data Exchange        ┌────────┴────────┐
│  IDS Connector  │◄──────────────────────────►│  IDS Connector  │
│  (Provider)     │     Encrypted channel      │  (Consumer)     │
│                 │    (TLS + usage policy)    │                 │
└────────┬────────┘                            └────────┬────────┘
         │                                              │
         └──────────────────────┬───────────────────────┘
                                │
                      ┌─────────▼─────────┐
                      │  Metadata Broker  │
                      │  (Data Catalog)   │
                      │  - Who has what   │
                      │  - Access terms   │
                      │  - Schemas        │
                      └─────────┬─────────┘
                                │
                      ┌─────────▼─────────┐
                      │  Clearing House   │
                      │  (Audit Log)      │
                      │  - Transaction ID │
                      │  - Timestamp      │
                      │  - Data hash      │
                      └───────────────────┘

Key Flow:
1. Consumer discovers dataset via Metadata Broker
2. Consumer requests data from Provider's connector
3. Provider's connector checks:
   - Is consumer authorized? (Trust Registry)
   - Does usage comply with policy? (ODRL rules)
4. If approved → data transferred (encrypted)
5. Transaction logged in Clearing House (audit trail)
6. Usage policy enforced at consumer's connector (e.g., auto-delete after 30 days)
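
Steps 3 to 5 of the key flow can be sketched as a small decision function on the provider side. This is an illustrative model only: the ProviderConnector class, its fields, and the in-memory audit_log (standing in for the Clearing House) are assumptions, and transport encryption is deliberately left out.

```python
import hashlib
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ProviderConnector:
    trust_registry: set      # DIDs of participants admitted to the space
    allowed_purposes: set    # purposes permitted by the usage policy
    audit_log: list = field(default_factory=list)  # stand-in for the Clearing House

    def handle_request(self, consumer_did: str, purpose: str, payload: bytes):
        # Step 3a: is the consumer authorized? (Trust Registry lookup)
        if consumer_did not in self.trust_registry:
            return None
        # Step 3b: does the declared usage comply with the policy?
        if purpose not in self.allowed_purposes:
            return None
        # Step 5: record who received what and when; only a hash of the data is logged.
        self.audit_log.append({
            "consumer": consumer_did,
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "data_hash": hashlib.sha256(payload).hexdigest(),
        })
        # Step 4: release the data (a real connector would send it over TLS).
        return payload


connector = ProviderConnector(
    trust_registry={"did:web:research.example"},   # hypothetical consumer DID
    allowed_purposes={"medical-research"},
)
granted = connector.handle_request("did:web:research.example", "medical-research", b"records")
denied = connector.handle_request("did:web:unknown.example", "medical-research", b"records")
```

Logging a hash rather than the payload lets an auditor confirm which data was exchanged without the Clearing House ever holding the data.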

Data Connectors & Usage Control

IDS Connector Architecture

The IDS Connector is a secure gateway that enforces data sovereignty. It sits between internal systems and the data space, acting as both firewall and policy enforcement point.

# IDS Connector Deployment (Docker example)
docker run -d \
  --name ids-connector-hospital \
  -e CONNECTOR_ID=did:web:hospital.fi \
  -e CLEARING_HOUSE_URL=https://clearing.healthspace.fi \
  -e METADATA_BROKER=https://broker.healthspace.fi \
  -e DAPS_URL=https://daps.gaia-x.eu \
  -v /path/to/certs:/certs \
  -v /path/to/policies:/policies \
  idsconnector:latest

# Configuration: /policies/data-sharing-policy.json
# The elapsedTime constraint (P30D) expresses "auto-delete after 30 days".
{
  "policy": {
    "@type": "odrl:Set",
    "odrl:permission": [{
      "odrl:target": "https://hospital.fi/api/patient-records",
      "odrl:action": "odrl:read",
      "odrl:constraint": [
        {
          "odrl:leftOperand": "odrl:purpose",
          "odrl:operator": "odrl:eq",
          "odrl:rightOperand": "medical-research"
        },
        {
          "odrl:leftOperand": "odrl:elapsedTime",
          "odrl:operator": "odrl:lteq",
          "odrl:rightOperand": "P30D"
        }
      ],
      "odrl:duty": [{
        "odrl:action": "odrl:delete",
        "odrl:constraint": [{
          "odrl:leftOperand": "odrl:dateTime",
          "odrl:operator": "odrl:gt",
          "odrl:rightOperand": { "@value": "2026-04-15", "@type": "xsd:date" }
        }]
      }]
    }]
  }
}
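
How a connector might evaluate the two permission constraints in the policy above can be sketched as follows. This is not the evaluation logic of any particular IDS connector: the function names are hypothetical, only durations of the form PnD are parsed, and only the operators used in the example policy are supported.

```python
import re
from datetime import timedelta


def parse_day_duration(value: str) -> timedelta:
    """Parse an ISO 8601 duration limited to whole days, e.g. 'P30D'."""
    match = re.fullmatch(r"P(\d+)D", value)
    if match is None:
        raise ValueError(f"unsupported duration: {value!r}")
    return timedelta(days=int(match.group(1)))


OPERATORS = {
    "odrl:eq": lambda left, right: left == right,
    "odrl:lteq": lambda left, right: left <= right,
    "odrl:gt": lambda left, right: left > right,
}


def constraints_hold(permission: dict, context: dict) -> bool:
    """Return True only if every constraint of the permission is satisfied."""
    for constraint in permission["odrl:constraint"]:
        operand = constraint["odrl:leftOperand"]
        right = constraint["odrl:rightOperand"]
        if operand == "odrl:elapsedTime":
            right = parse_day_duration(right)
        if operand not in context:
            return False  # missing information is treated as a refusal
        if not OPERATORS[constraint["odrl:operator"]](context[operand], right):
            return False
    return True


permission = {
    "odrl:constraint": [
        {"odrl:leftOperand": "odrl:purpose", "odrl:operator": "odrl:eq",
         "odrl:rightOperand": "medical-research"},
        {"odrl:leftOperand": "odrl:elapsedTime", "odrl:operator": "odrl:lteq",
         "odrl:rightOperand": "P30D"},
    ]
}
request = {"odrl:purpose": "medical-research", "odrl:elapsedTime": timedelta(days=10)}
allowed = constraints_hold(permission, request)
```

Treating a missing context value as a refusal keeps the sketch fail-closed, which seems the safer default for a policy enforcement point.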

Usage Control Enforcement

Policy Type | Example Rule | Enforcement Mechanism
Purpose Limitation | "Data may only be used for cancer research" | Consumer declares purpose; provider validates against whitelist
Time-to-Live | "Delete data after 30 days" | Consumer's connector enforces automatic deletion
Anonymization | "No PII may be revealed" | Provider's connector strips identifiers before transfer
No Redistribution | "Data cannot be shared with third parties" | Consumer's connector blocks outbound sharing
Consent-Based | "Patient must have given explicit consent" | Provider checks consent log before data release
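
Two of the mechanisms in the table, identifier stripping on the provider side and time-to-live deletion on the consumer side, can be illustrated with a short sketch. The field names in DIRECT_IDENTIFIERS and the received_at key are assumptions; removing direct identifiers alone does not amount to full anonymization.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical set of direct identifiers a provider connector would remove.
DIRECT_IDENTIFIERS = {"name", "national_id", "address"}


def strip_identifiers(record: dict) -> dict:
    """Provider side: drop direct identifiers before the record is transferred."""
    return {key: value for key, value in record.items() if key not in DIRECT_IDENTIFIERS}


def purge_expired(items: list, now: datetime, ttl: timedelta = timedelta(days=30)) -> list:
    """Consumer side: keep only items received within the time-to-live window."""
    return [item for item in items if now - item["received_at"] <= ttl]


record = {"name": "Example Patient", "national_id": "000000-0000", "diagnosis": "C50"}
shared = strip_identifiers(record)

now = datetime(2026, 3, 1, tzinfo=timezone.utc)
store = [
    {"id": 1, "received_at": datetime(2026, 2, 20, tzinfo=timezone.utc)},
    {"id": 2, "received_at": datetime(2026, 1, 1, tzinfo=timezone.utc)},
]
store = purge_expired(store, now)
```

In a real deployment the purge would run on a schedule inside the consumer's connector, which is exactly the component whose integrity the provider has to trust.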

Domain-Specific Data Spaces

1. Finnish Health Data Space

Challenge: Patient data scattered across hospitals, health centers, labs. No unified view.

Solution: Health providers expose patient records via IDS connectors. Authorized clinicians query federated data (with patient consent). Data never leaves source systems; queries are federated at runtime.

Example: Patient visits Specialist B (never seen before). Specialist queries Health Data Space:
  • Health Center A shares: medication history (patient consented)
  • Hospital C shares: lab results from 2024 (patient consented)
  • Pharmacy chain shares: prescription fills (automatic, legal basis)
Result: Complete health history assembled in seconds, zero manual data transfer.
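
The specialist's query in this example can be sketched as a fan-out over sources that respects consent. The source names, the legal_basis flag, and the fetch callables are all hypothetical stand-ins for calls that would go through each provider's connector.

```python
def federated_query(patient_id: str, sources: dict, consents: set) -> dict:
    """Collect records from every source the requester may read for this patient."""
    results = {}
    for name, source in sources.items():
        # A source answers if it shares on a legal basis or the patient consented.
        if source["legal_basis"] or (patient_id, name) in consents:
            results[name] = source["fetch"](patient_id)
    return results


# Hypothetical sources mirroring the example above.
sources = {
    "health-center-a": {"legal_basis": False, "fetch": lambda pid: ["medication history"]},
    "hospital-c": {"legal_basis": False, "fetch": lambda pid: ["lab results 2024"]},
    "pharmacy-chain": {"legal_basis": True, "fetch": lambda pid: ["prescription fills"]},
}
consents = {("patient-1", "health-center-a"), ("patient-1", "hospital-c")}
history = federated_query("patient-1", sources, consents)
```

Each fetch runs against the source system at query time, so nothing is copied into a central store; a patient who has withdrawn consent for one source simply gets a smaller result.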

2. Finnish Education Data Space

Challenge: Students change schools, universities. Transcripts, certificates manually requested.

Solution: Educational institutions share credentials (grades, degrees) as verifiable credentials within a federated space. Students control who accesses their records.

Example: Student applies to Master's program:
  • University requests: "Share your Bachelor's degree credential"
  • Student's wallet presents VC (issued by previous university)
  • University's connector verifies signature against Trust Registry
  • Admission decision automated (no transcript courier needed)
Result: Lifelong learning passports, with credentials that are portable and instantly verifiable.
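
The verification step in this example (look up the issuer in the Trust Registry, then check the proof) can be sketched as below. Important assumption: real verifiable credentials use asymmetric signatures with keys resolved from the issuer's DID. To stay within the Python standard library, this sketch substitutes HMAC-SHA256 with a shared key, so it shows the control flow only and is not a usable credential format. All names and the registry layout are hypothetical.

```python
import hashlib
import hmac
import json


def sign(credential: dict, key: bytes) -> str:
    """Stand-in for the issuer's signature: HMAC over a canonical JSON encoding."""
    body = json.dumps(credential, sort_keys=True).encode()
    return hmac.new(key, body, hashlib.sha256).hexdigest()


def verify_presentation(presentation: dict, trust_registry: dict) -> bool:
    """Accept a credential only if its issuer is registered and the proof matches."""
    credential = presentation["credential"]
    key = trust_registry.get(credential["issuer"])
    if key is None:
        return False  # issuer is not a recognized participant
    return hmac.compare_digest(sign(credential, key), presentation["proof"])


# Hypothetical issuer and credential.
registry = {"did:web:university.example": b"demo-key"}
credential = {"issuer": "did:web:university.example", "degree": "Bachelor of Science"}
presentation = {"credential": credential, "proof": sign(credential, b"demo-key")}
accepted = verify_presentation(presentation, registry)
```

Any change to the credential after issuance changes the recomputed proof, so a tampered degree or an unregistered issuer is rejected.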

3. Finnish Mobility Data Space

Challenge: Transportation data siloed (trains, buses, taxis). No unified planning possible.

Solution: Transport operators share real-time data (schedules, capacity, routes) via data space. Journey planners, cities, researchers access data for optimization.

Example: City of Helsinki optimizes bus routes:
  • Queries: passenger flows (VR trains), e-scooter usage (Voi), taxi demand (metered trips)
  • Data anonymized by connectors (GDPR-compliant aggregation)
  • AI model identifies underserved routes → proposes new bus line
Result: Data-driven urban planning without central data ownership.
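
The aggregation step performed by the connectors can be illustrated with a count per route that suppresses small groups. The trip fields and the min_count threshold are assumptions; thresholded aggregation is one common technique and does not by itself guarantee GDPR compliance.

```python
from collections import Counter


def aggregate_trips(trips: list, min_count: int = 5) -> dict:
    """Count trips per (origin, destination) and drop routes with too few trips."""
    counts = Counter((trip["origin"], trip["destination"]) for trip in trips)
    # Small groups are suppressed because they could single out individual travellers.
    return {route: count for route, count in counts.items() if count >= min_count}


# Hypothetical trip records held by one operator.
trips = (
    [{"origin": "Pasila", "destination": "Kamppi"}] * 6
    + [{"origin": "Pasila", "destination": "Lauttasaari"}] * 2
)
flows = aggregate_trips(trips)
```

Only the aggregated flows leave the operator's connector; the individual trip records stay at the source.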

Data Space Governance

Governance Model

Governance Layer | Mechanism | Example
Participation Rules | Onboarding process + technical certification | Healthcare providers must prove ISO 27001 compliance to join Health Data Space
Data Quality Standards | Mandatory schemas + validation | Patient records must conform to HL7 FHIR (validated at connector)
Usage Policies | ODRL policy templates (sector-specific) | Health data: "Research use only, no commercial exploitation"
Dispute Resolution | Data Space Authority (legal entity) | If participant violates rules → suspension from space
Audit & Compliance | Clearing House logs + periodic audits | Annual review: did participants honor usage policies?
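
For the Audit & Compliance row, one way to make Clearing House logs tamper-evident is a hash chain, in which each record commits to the one before it. This is a simplified stand-in for the blockchain or distributed ledger named in the components table; the record layout and function names are assumptions.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash that precedes the first record


def _digest(prev_hash: str, entry: dict) -> str:
    body = json.dumps(entry, sort_keys=True)
    return hashlib.sha256((prev_hash + body).encode()).hexdigest()


def append_entry(log: list, entry: dict) -> None:
    """Append a transaction record that is linked to the previous record's hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"entry": entry, "prev": prev_hash, "hash": _digest(prev_hash, entry)})


def verify_log(log: list) -> bool:
    """Recompute the chain; any edited or removed record breaks it."""
    prev_hash = GENESIS
    for record in log:
        if record["prev"] != prev_hash or record["hash"] != _digest(prev_hash, record["entry"]):
            return False
        prev_hash = record["hash"]
    return True


log = []
append_entry(log, {"consumer": "did:web:research.example", "dataset": "lab-results"})
append_entry(log, {"consumer": "did:web:research.example", "dataset": "patient-records"})
intact = verify_log(log)
```

An auditor running verify_log during the periodic review can detect after-the-fact edits, although a single operator could still rewrite the whole chain, which is why a distributed ledger is the stronger option.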

Trust Framework

# Example: Verifying a participant's identity and authorization
1. Organization requests to join Health Data Space
2. Identity verification:
   - Provides organization DID (did:web:hospital.fi)
   - Presents credentials: business registration, healthcare license
3. Technical certification:
   - Connector passes security scan (ISO 27001 audit)
   - Data encryption validated (AES-256, TLS 1.3)
4. Trust Registry entry created:
   {
     "participantId": "did:web:hospital.fi",
     "role": "data-provider",
     "certifications": ["ISO27001", "GDPR-compliant"],
     "allowedDataTypes": ["patient-records", "lab-results"],
     "validUntil": "2027-12-31"
   }
5. Participant's connector can now authenticate to other participants
6. Other connectors verify Trust Registry before accepting connections
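
Step 6 above, the check another connector runs before accepting a connection, can be sketched against the Trust Registry entry from step 4. The function name and its parameters are hypothetical; the entry fields are the ones shown in the example.

```python
from datetime import date


def is_trusted(entry: dict, data_type: str, today: date,
               required_role: str = "data-provider") -> bool:
    """Check role, permitted data type, and expiry of a Trust Registry entry."""
    return (
        entry["role"] == required_role
        and data_type in entry["allowedDataTypes"]
        and today <= date.fromisoformat(entry["validUntil"])
    )


entry = {
    "participantId": "did:web:hospital.fi",
    "role": "data-provider",
    "certifications": ["ISO27001", "GDPR-compliant"],
    "allowedDataTypes": ["patient-records", "lab-results"],
    "validUntil": "2027-12-31",
}
ok = is_trusted(entry, "lab-results", today=date(2026, 6, 1))
```

Passing today in as a parameter keeps the check deterministic and easy to test; a connector would supply the current date at call time.
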
Challenge: Enforcing usage policies after data leaves the provider's connector is difficult. Once data is at the consumer's site, technical enforcement relies on the integrity of the consumer's connector. Legal contracts and audits are fallback mechanisms (deterrence, not prevention).

Key Standards & Technologies

Standard | Organization | Role in Data Spaces
IDS Reference Architecture | International Data Spaces Association | Blueprint for connector design, trust model
GAIA-X | Gaia-X European Association for Data and Cloud AISBL | Federated cloud infrastructure, self-descriptions
ODRL (Usage Control) | W3C | Express data usage policies (permissions, duties, prohibitions)
DCAT (Data Catalog) | W3C | Metadata format for dataset discovery
HL7 FHIR | Health Level Seven International (HL7) | Healthcare data interoperability (patient records, observations)
Eclipse Dataspace Components (EDC) | Eclipse Foundation | Open-source IDS connector implementation

Deployment Roadmap

Phase | Timeframe | Deliverables
Pilots | 2024-2025 | Health Data Space pilot (3 hospitals), Education pilot (universities)
National Data Spaces | 2026 | Health, Education, Mobility spaces operational in Finland
Cross-Domain Interop | 2027 | Shared trust framework (participants can join multiple spaces)
EU Federation | 2028+ | Finnish Health Data Space interoperates with German, Dutch equivalents
