tollef.web

first entry - project update

April 01 2023 @ 00:20

As a test/demo for the first blog entry on this site, a few findings from the past few weeks and an update on my current projects seemed reasonable.

There are currently three major projects in the works:

  • Metadata extraction and semantic search for criminal investigations (CCIQ)
  • Unsupervised sentence summarization/compression
  • Coreference resolution (CorefUD 1.1)

And of course, this blog, along with a new birthday page for the institute (https://idi-birthday.web.app/), which together have taken up the past two days.

Focus has primarily been on Project 1, CCIQ.

The core finding so far is that Event Extraction (EE) and similar fields are loosely defined and have little practical use in criminal cases, since there are no ground truths. The data consists of claimed facts, and treating a claim as an event that actually happened is not necessarily correct, so this needs to be accounted for. Detecting such claims is interesting in itself, but judging from previous police work, semantic search seems to be a better foundation, since it lets us search for the meaning behind a claimed fact.
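To make the idea a bit more concrete, here is a minimal sketch of the ranking step in a semantic search over claimed facts. It is not the CCIQ implementation: the `Claim` type, its `source` field, and the example sentences are made up for illustration, and `embed` is a toy word-count stand-in for a real sentence encoder, so it only matches overlapping words rather than actual meaning. The point is the shape: every statement stays attached to whoever claimed it, and the search ranks claims against a query instead of deciding which events "really" happened.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Claim:
    """A statement from a case document, kept as a claim, not as a verified event."""
    source: str  # who made the claim (hypothetical field for this sketch)
    text: str


def embed(text: str) -> Counter:
    """Toy stand-in for a sentence encoder: lowercase word counts.

    Assumption: a real system would return a dense vector from a trained model.
    """
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query: str, claims: list[Claim], top_k: int = 3) -> list[tuple[float, Claim]]:
    """Return the top_k claims most similar to the query, best first."""
    q = embed(query)
    scored = [(cosine(q, embed(c.text)), c) for c in claims]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Made-up example data, purely for illustration.
claims = [
    Claim("witness A", "The suspect was seen near the harbour at night"),
    Claim("suspect", "I was at home all evening"),
    Claim("witness B", "A car left the harbour shortly after midnight"),
]

results = search("who was at the harbour", claims, top_k=2)
for score, claim in results:
    print(f"{score:.2f}  [{claim.source}] {claim.text}")
```

Swapping `embed` for a proper sentence-embedding model would leave `search` unchanged apart from the vector type that `cosine` operates on, which is part of why this seems like a sturdier base than a fixed event schema.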

Maarten Grootendorst has done interesting work in this area, and is behind open-source software such as BERTopic.

While investigating some use cases for EE, before setting further development aside, the following papers were considered:

[Image: 2023-03-31T22:20:36.728Z.png]