Frame Induction at SemEval | Behrang Q. Zadeh

SemEval-2019 Task 2: Unsupervised Lexical Frame Induction

Behrang QasemiZadeh, Miriam R. L. Petruck, Regina Stodden, Laura Kallmeyer, and Marie Candito. SemEval-2019 Task 2: Unsupervised Lexical Frame Induction.
In Proceedings of the 13th International Workshop on Semantic Evaluation (SemEval-2019), pages 16–30. ACL Anthology | DOI: 10.18653/v1/S19-2003

The task focuses on automatically discovering semantic frames — groups of verbs and their argument structures that describe similar situations — without supervision.

It’s inspired by FrameNet and VerbNet, but participants must induce frames directly from raw linguistic data (syntactic and morphological information only, with no semantic annotations).

+ The CodaLab page of the task is available at SemEval 2019 Task 2 on Unsupervised Lexical Frame Induction
+ The scorer for the task is available for download from http://pars.ie/lr/semeval2019-task2/semeval-2019-task2-scorer.zip
+ The public trial data is available from http://pars.ie/lr/semeval2019-task2/trial-public.zip

Task Setup

Subtasks

Subtask	Description	Gold Reference
A. Verb Clustering	Cluster verb usages into groups that correspond to FrameNet frames.	FrameNet 1.7
B. Argument Clustering	Cluster arguments into semantic roles.	B.1: FrameNet core frame elements; B.2: VerbNet semantic roles

Input

Sentences with syntactic dependencies and lemmas.
No frame labels (unsupervised).

Output

Clusters of verbs or argument slots that align with gold semantic frames or roles.

Conceptual Diagram

Raw Text Corpus → Verb Instances → Verb Clustering (Task A) → Argument Extraction → Argument Clustering (Task B) → Induced Semantic Frames → Evaluation wrt FrameNet & VerbNet

Data and Evaluation

Source data: Sentences with dependency parses and morphological annotations.
Gold standards: FrameNet 1.7 and VerbNet 3.2 annotations (for evaluation only).
Evaluation metric: Clustering metrics comparing system outputs to gold frames/roles (e.g., B-Cubed F-score).
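
The B-Cubed F-score named above can be sketched in a few lines of Python. This is a minimal illustration, not the official task scorer: the function name `bcubed`, the instance ids, and the toy labels are assumptions made for the example. For each instance, precision is the share of its predicted cluster that carries the same gold label, recall is the share of its gold class that landed in the same predicted cluster, and both are averaged over all instances.

```python
from collections import Counter


def bcubed(gold, pred):
    """Return (precision, recall, f1) for two {instance_id: label} mappings.

    `gold` maps each instance to its gold class, `pred` to its induced
    cluster. Both mappings must cover exactly the same instances.
    """
    if set(gold) != set(pred):
        raise ValueError("gold and pred must label the same instances")
    if not gold:
        raise ValueError("no instances to score")

    items = list(gold)
    gold_sizes = Counter(gold.values())   # size of each gold class
    pred_sizes = Counter(pred.values())   # size of each induced cluster
    # Number of instances sharing both a gold class and an induced cluster.
    joint = Counter((gold[i], pred[i]) for i in items)

    precision = sum(
        joint[(gold[i], pred[i])] / pred_sizes[pred[i]] for i in items
    ) / len(items)
    recall = sum(
        joint[(gold[i], pred[i])] / gold_sizes[gold[i]] for i in items
    ) / len(items)
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical toy data: four verb instances, two gold frames.
toy_gold = {"v1": "Motion", "v2": "Motion",
            "v3": "Commerce_buy", "v4": "Commerce_buy"}
# A system output that wrongly puts v3 in the same cluster as v1 and v2.
toy_pred = {"v1": 0, "v2": 0, "v3": 0, "v4": 1}

p, r, f = bcubed(toy_gold, toy_pred)
print(f"B-Cubed precision={p:.3f} recall={r:.3f} F1={f:.3f}")
```

In this toy case one instance sits in the wrong cluster, so precision is 2/3 and recall is 3/4, which gives an F-score of 12/17.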

(Data can be obtained from LDC)

Notable Results

1. HHMM Team (Anwar et al., 2019)

Paper: “HHMM at SemEval‑2019 Task 2: Unsupervised Frame Induction using Contextualized Word Embeddings.”
[PDF – University of Hamburg]
Top system in Subtask B.1 (FrameNet roles) and strong results for Task A (verb clustering).
Method: combined syntactic dependency information with contextualized ELMo embeddings and hierarchical clustering.

2. L2F / INESC‑ID Team (Ribeiro et al., 2019)

Paper: “L2F/INESC‑ID at SemEval‑2019 Task 2: Unsupervised Lexical Semantic Frame Induction using Contextualized Word Representations.”
[PDF – SciSpace]
Used contextual embeddings (ELMo) + graph‑based clustering over verb‑argument pairs.
Demonstrated the benefits of contextual similarity for frame induction.
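
As a rough illustration of the hierarchical clustering idea mentioned for the HHMM system, the sketch below runs average-link agglomerative clustering with cosine distance over toy vectors. It is not either team's implementation: the two-dimensional vectors stand in for contextualized embeddings of verb instances, and the function names and the 0.3 stopping threshold are assumptions chosen for the example.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity; assumes both vectors are non-zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def average_link(c1, c2, vectors):
    """Mean pairwise distance between the members of two clusters."""
    total = sum(cosine_distance(vectors[i], vectors[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def agglomerative(vectors, threshold):
    """Merge the closest pair of clusters until none is within `threshold`.

    Returns a list of clusters, each a list of indices into `vectors`.
    """
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > 1:
        best = None  # (distance, index_a, index_b) of the closest pair
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = average_link(clusters[a], clusters[b], vectors)
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        if d > threshold:
            break  # remaining clusters are too far apart to merge
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


# Hypothetical stand-ins for embeddings of four verb instances.
toy_vectors = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]]
print(agglomerative(toy_vectors, threshold=0.3))
```

The threshold acts as the cut through the merge hierarchy: a lower value yields more, smaller clusters, and a higher value yields fewer, larger ones.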

Insights & Challenges

Argument clustering (especially for VerbNet roles) remains difficult because of higher semantic ambiguity and role overlap.
Contextual embeddings (like ELMo, later BERT) significantly improved results but did not fully solve the problem.
The task showed that syntax plays an important role alongside embeddings for structured semantics.

Follow‑up and Influence

The task paper has been cited numerous times, influencing research in:

Unsupervised frame induction and semantic role discovery
Representation learning for semantic parsing
Cross‑lingual frame induction and multilingual semantic clustering

Notable follow‑up works include:

Unsupervised Semantic Frame Induction Revisited (IWCS 2021)
FrameBERT: Contextualized Frame Induction Using Transformer Models (arXiv 2023)
From Syntax to Semantics: Role Discovery in Low‑Resource Languages (ACL 2024)

Follow-up work on unsupervised semantic structure learning commonly uses the task dataset as a benchmark.

📚 Selected References

Task description:
QasemiZadeh, B., Petruck, M. R. L., Stodden, R., Kallmeyer, L., & Candito, M. (2019). SemEval-2019 Task 2: Unsupervised Lexical Frame Induction.
[ACL Anthology]

HHMM system:
Anwar, U., Ustalov, D., Arefyev, N., Ponzetto, S. P., & Biemann, C. (2019).
HHMM at SemEval-2019 Task 2: Unsupervised Frame Induction using Contextualized Word Embeddings.
[ACL Anthology]

L2F/INESC-ID system:
Ribeiro, R., & Mendonça, F. (2019).
L2F/INESC-ID at SemEval-2019 Task 2: Unsupervised Lexical Semantic Frame Induction using Contextualized Word Representations.
[ACL Anthology]


This page last edited on 06 October 2025.