first entry - project update
As a test/demo for my first blog entry on this site, a few findings from the past few weeks and an update on my current projects seemed like a reasonable place to start.
There are currently three major projects in the works:
- Metadata extraction and semantic search for criminal investigations (CCIQ)
- Unsupervised sentence summarization/compression
- Coreference resolution (CorefUD 1.1)
And of course, for the past two days, this blog, along with a new birthday page for the institute: https://idi-birthday.web.app/
Focus has primarily been on Project 1, CCIQ.
The core finding has been that Event Extraction (EE) and similar fields are largely undefined and have little practical use for criminal cases, as there is no ground truth. When the data consists of claimed facts, treating those claims as events is not necessarily correct, and this needs to be accounted for. Detecting them is still interesting, but judging from previous police work, a solid foundation for semantic search seems to be a better fit, as we can then search for the meaning behind a claimed fact.
Maarten Grootendorst has done interesting work in this field, and is behind open-source software like BERTopic.
Some use cases for EE were investigated before setting further development aside, and the following papers were considered:
- Beyond Bag-of-Concepts: Vectors of Locally Aggregated Concepts
- A Survey on Deep Learning Event Extraction: Approaches and Applications
- R2E: Rule-based Event Extractor
- Textual Entailment for Event Argument Extraction: Zero- and Few-Shot with Multi-Source Learning, along with the GitHub page: https://github.com/osainz59/Ask2Transformers
- Towards High Performance Multilingual Event Extraction: Language Specific Issue and Feature Exploration
- CLEVE: Contrastive Pre-training for Event Extraction
- EventGraph: Event Extraction as Semantic Graph Parsing - HIGHLY relevant
