SAFEWORDS: Ethical, Privacy-Preserving and Trustworthy Language Technologies within the HumanAIze Project

SAFEWORDS is a core component of the HumanAIze project, a coordinated research initiative aimed at developing the next generation of human-centred, trustworthy and multilingual large language models (LLMs) for Europe. Within this framework, SAFEWORDS provides the ethical, legal and data governance backbone that ensures all technological developments are fully aligned with European fundamental rights, democratic values and regulatory requirements.

Role within HumanAIze

HumanAIze brings together a multidisciplinary consortium combining expertise in artificial intelligence, computational linguistics, knowledge representation, reasoning, evaluation and regulatory compliance. Its work is organised into work packages (WPs), including:

  • WP2 – Governance and Ethical Frameworks: Establishes the ethical, legal and sustainability guidelines for model development. SAFEWORDS plays a central role in codifying privacy, fairness and compliance principles.

  • WP3 – Data Resources and Management: Focuses on dataset curation, multilingual resources, anonymisation and bias assessment. SAFEWORDS leads privacy-preserving data governance and compliance mechanisms.

  • WP4 – Base Models and Knowledge Integration: Develops and extends multilingual foundation models, incorporating structured knowledge and reasoning capabilities.

  • WP5 – Alignment, Safety and Trustworthiness: Aligns models with human values, mitigates bias, integrates reasoning strategies, and enhances explainability and reliability.

  • WP6 – Evaluation and Use Cases: Applies and evaluates models in real-world domains such as administrative-legal and biomedical scenarios.

SAFEWORDS Contribution

Within HumanAIze, SAFEWORDS ensures that innovation in LLM development is built on a robust data governance framework covering data sourcing, classification, access control and lifecycle management. It develops and validates privacy-preserving anonymisation pipelines that detect personally identifiable information (PII) and mitigate its exposure while preserving data utility.
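As a rough illustration of what detecting and masking PII involves, the following minimal Python sketch replaces e-mail addresses and phone-like numbers with placeholder tags. It is not the SAFEWORDS pipeline: the patterns, the `PII_PATTERNS` table and the `redact_pii` function are hypothetical names chosen for this example, and a production system would typically combine such rules with named-entity recognition and human review.

```python
import re

# Hypothetical, deliberately simple patterns for illustration only.
# They will miss many real-world PII formats.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "PHONE": re.compile(r"\+?\d[\d\s-]{7,}\d"),
}


def redact_pii(text):
    """Replace each match with a [LABEL] tag and count replacements per label."""
    counts = {}
    for label, pattern in PII_PATTERNS.items():
        text, n = pattern.subn(f"[{label}]", text)
        counts[label] = n
    return text, counts


if __name__ == "__main__":
    redacted, counts = redact_pii("Contact ana@example.org or +34 600 123 456.")
    print(redacted)
    print(counts)
```

Replacing matches with typed tags rather than deleting them keeps the sentence structure intact, which is one simple way to preserve some data utility for downstream model training.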

SAFEWORDS works to ensure compliance with the General Data Protection Regulation (GDPR) and the EU Artificial Intelligence Act (AI Act), while establishing methodologies for bias detection, mitigation and fairness evaluation. It also contributes to the development of benchmarks and evaluation protocols that promote transparency, explainability and accountability.
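To make the idea of fairness evaluation concrete, the sketch below computes a demographic parity gap, one common group-fairness metric. It is offered only as an example of the kind of measurement involved; the text does not state which metrics SAFEWORDS adopts, and the function name and toy data are assumptions made for this illustration.

```python
from collections import defaultdict


def demographic_parity_gap(predictions, groups):
    """Return (gap, rates): the spread in positive-prediction rates across groups.

    predictions: iterable of 0/1 model decisions (assumed binary).
    groups: iterable of group labels, aligned with predictions.
    """
    totals = defaultdict(int)
    positives = defaultdict(int)
    for pred, group in zip(predictions, groups):
        totals[group] += 1
        positives[group] += int(pred == 1)
    rates = {g: positives[g] / totals[g] for g in totals}
    return max(rates.values()) - min(rates.values()), rates


if __name__ == "__main__":
    # Toy data: group "a" receives positive decisions more often than group "b".
    gap, rates = demographic_parity_gap([1, 0, 1, 1, 0, 0],
                                        ["a", "a", "a", "b", "b", "b"])
    print(rates, gap)
```

A gap of zero means every group receives positive predictions at the same rate; larger values flag a disparity that would warrant closer inspection rather than proving unfairness on its own.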

By integrating legal, ethical and technical expertise, SAFEWORDS strengthens the HumanAIze ecosystem, ensuring that the resulting language technologies are trustworthy, sustainable, socially responsible and compliant with European regulation.

European Added Value

HumanAIze, through SAFEWORDS and the broader consortium, directly supports European strategic priorities in:

  • Trustworthy and Human-Centric AI

  • Digital Sovereignty and Multilingual AI

  • Responsible Data Spaces

  • Ethical and Sustainable AI Development

Together, the consortium advances Europe’s capacity to develop competitive AI systems that respect fundamental rights while fostering innovation and societal impact.