IR Seminar: “Introducing the EU Foreign Interference Dataset: AI Assisted, Machine-Readable Ontology for Multilingual FIMI Attributions”, H. Akın Ünver, 12:30, April 16, 2026 (EN)

Title: “Introducing the EU Foreign Interference Dataset: AI Assisted, Machine-Readable Ontology for Multilingual FIMI Attributions”

Date and Time: April 16, 2026, 12:30
Venue: A130 Seminar Room, FEASS

Speaker: H. Akın Ünver, Özyeğin University

Abstract: Research on European foreign information manipulation and interference (FIMI) in international relations is constrained by a basic measurement problem: the most detailed empirical evidence is produced as narrative assessments by EU institutions, agencies, and civil-society monitors, but these reports are heterogeneous in terminology, attribution standards, evidentiary thresholds, language, and unit of analysis. As a result, scholars face persistent tradeoffs between comparability and nuance, and between scale and validity. This talk introduces ongoing work within the Horizon Europe project DE-CONSPIRATOR (Grant agreement ID: 101132671) to build the EU Foreign Interference Dataset (EU-FID), an ontology-driven effort to transform multi-source FIMI detection and attribution reports into a single, universal, machine-readable, and multilingual dataset designed explicitly for cumulative, reproducible inference in empirical and comparative inquiry.

EU-FID helps break this deadlock in three ways. First, it formalizes a standardized ontology that separates (a) observed content and dissemination behavior, (b) hypothesized actor intent and coordination, and (c) attribution claims and confidence, allowing researchers to model uncertainty rather than collapse it. Second, it provides a transparent source-to-schema mapping that preserves provenance (which organization asserts what, when, and on what basis) and enables cross-source triangulation instead of forced harmonization. Third, it operationalizes an AI-assisted extraction and normalization pipeline for converting unstructured multilingual text into structured case records, while maintaining human-auditable traces for error analysis and iterative refinement. The talk emphasizes design choices that directly affect inference: defining “cases” versus “episodes,” representing partially observed campaigns, encoding competing attributions, handling translation drift and multilingual synonymy, and mitigating institutional and selection biases inherent in monitor-generated data. We present a validation strategy that combines expert adjudication, inter-coder reliability, schema-consistency checks, and cross-source agreement metrics, alongside benchmarks for extraction quality and robustness under distribution shift. We conclude by outlining research applications enabled by EU-FID, including comparative campaign typologies, actor–tactic–target regularities, temporal dynamics around crises and elections, and evaluation of counter-FIMI interventions. We also specify the dataset’s limits and the conditions under which causal claims remain unwarranted. In doing so, the project contributes directly to the frontier of policy research under the Common Foreign and Security Policy (CFSP) and the upcoming EU Defense Shield.
