Support Aggregate Engines #6

Open
6 tasks
DrDub opened this issue Feb 6, 2024 · 6 comments

@DrDub
Contributor

DrDub commented Feb 6, 2024

Is your feature request related to a problem? Please describe.
In the UIMA framework, an analysis engine is either a primitive engine (an annotator) or an aggregate engine composed of other engines. At present, the C++ version of the framework can handle only primitive engines, not aggregates.

An example of a primitive annotator descriptor is the SimpleTextSegmenter.xml. It refers to the annotator itself, SimpleTextSegmenter.cpp.

The aggregate descriptors are discussed in the Apache UIMA Reference. An example descriptor from the Java framework is the NamesAndGovernmentOfficials_TAE.xml.

The Java UIMA aggregate analysis engine implementation involves the class AggregateAnalysisEngine_impl.java and many others.

Describe the solution you'd like
The UIMACPP framework should be able to load and execute Aggregate Engines in XML format composed of other aggregate engines or primitive engines implemented in C++.

This includes parsing the XML descriptors and routing the annotations (as part of the Common Analysis Structure, or CAS) among the different annotators. Note that aggregate engines shield annotators based on the input and output annotations declared in their descriptors.
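
To make the routing and shielding idea concrete, here is a minimal sketch. It does not use the UIMACPP API: `ToyCas`, `ToyDelegate`, `runAggregate`, and `exampleRun` are hypothetical names, the CAS is reduced to a map from annotation type name to string values, and "shielding" is read here as "a delegate only sees its declared input types, and only its declared output types are merged back". That reading is an assumption, not something confirmed in this issue.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a CAS: annotation type name -> annotation values.
using ToyCas = std::map<std::string, std::vector<std::string>>;

// Hypothetical delegate: declared capabilities plus a processing function.
struct ToyDelegate {
    std::string key;
    std::set<std::string> inputs;    // types the delegate is allowed to read
    std::set<std::string> outputs;   // types the delegate is allowed to add
    std::function<ToyCas(const ToyCas&)> process;
};

// Runs the delegates in fixed order over one shared CAS (assumed behaviour).
ToyCas runAggregate(const std::vector<ToyDelegate>& flow, ToyCas cas) {
    for (const ToyDelegate& d : flow) {
        // Build the shielded view: only declared input types are visible.
        ToyCas visible;
        for (const auto& entry : cas)
            if (d.inputs.count(entry.first)) visible.insert(entry);

        ToyCas produced = d.process(visible);

        // Merge back only the declared output types; drop everything else.
        for (const auto& entry : produced) {
            if (!d.outputs.count(entry.first)) continue;
            std::vector<std::string>& dest = cas[entry.first];
            dest.insert(dest.end(), entry.second.begin(), entry.second.end());
        }
    }
    return cas;
}

// Example: a whitespace tokenizer followed by a token counter.
ToyCas exampleRun() {
    ToyDelegate tokenizer{"tokenizer", {"Document"}, {"Token"},
        [](const ToyCas& in) {
            ToyCas out;
            auto docs = in.find("Document");
            if (docs != in.end()) {
                for (const std::string& doc : docs->second) {
                    std::string word;
                    for (char c : doc) {
                        if (c == ' ') {
                            if (!word.empty()) out["Token"].push_back(word);
                            word.clear();
                        } else {
                            word += c;
                        }
                    }
                    if (!word.empty()) out["Token"].push_back(word);
                }
            }
            out["Debug"].push_back("undeclared type, dropped by the aggregate");
            return out;
        }};
    ToyDelegate counter{"counter", {"Token"}, {"TokenCount"},
        [](const ToyCas& in) {
            ToyCas out;
            auto tokens = in.find("Token");
            std::size_t n = (tokens == in.end()) ? 0 : tokens->second.size();
            out["TokenCount"].push_back(std::to_string(n));
            return out;
        }};

    ToyCas cas;
    cas["Document"].push_back("hello aggregate world");
    return runAggregate({tokenizer, counter}, cas);
}
```

In this sketch, `exampleRun()` returns a CAS holding the original `Document`, three `Token` values, and one `TokenCount`, while the undeclared `Debug` type never reaches the shared CAS. A real implementation would work on typed feature structures and result specifications rather than strings.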

Describe alternatives you've considered
Using UIMA-AS it was possible to interoperate between Java and C++, but the UIMA-AS framework has been retired.

Additional context
This has been discussed as one of the main roadblocks in using the C++ version of the framework by its users: https://lists.apache.org/thread/f1r3sghgn2oqhvzz27y26zg6j3olv8qq

Tasks

  • Initial classes for aggregate descriptors.
  • Parse and validate aggregate descriptor XML files.
  • Test cases for base aggregate functionality.
  • Base aggregate execution functionality.
  • Flow controller (optional/undecided for this issue, might file another in the future).
  • SofA mappers (optional/undecided for this issue, might file another in the future).
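
As an illustration of the "parse and validate" task, the sketch below checks an aggregate descriptor after it has been parsed. `AggregateSpec` and `validate` are hypothetical names and not part of UIMACPP. The fields mirror elements of the UIMA descriptor format (`primitive`, the `key` attribute of `delegateAnalysisEngine`, and the `node` entries of `fixedFlow`); the three checks are assumptions about what a validator might enforce, not the framework's actual rules.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical in-memory form of an already-parsed aggregate descriptor.
struct AggregateSpec {
    bool primitive = false;                  // value of <primitive>
    std::vector<std::string> delegateKeys;   // key="..." of each <delegateAnalysisEngine>
    std::vector<std::string> fixedFlow;      // <node> values inside <fixedFlow>
};

// Returns a list of human-readable problems; an empty list means the spec passed.
std::vector<std::string> validate(const AggregateSpec& spec) {
    std::vector<std::string> errors;
    if (spec.primitive)
        errors.push_back("aggregate descriptor must not be marked primitive");
    if (spec.delegateKeys.empty())
        errors.push_back("aggregate declares no delegates");

    // Delegate keys must be unique so that flow nodes are unambiguous.
    std::set<std::string> known;
    for (const std::string& key : spec.delegateKeys)
        if (!known.insert(key).second)
            errors.push_back("duplicate delegate key: " + key);

    // Every flow node must refer to a declared delegate.
    for (const std::string& node : spec.fixedFlow)
        if (!known.count(node))
            errors.push_back("flow node has no matching delegate: " + node);
    return errors;
}
```

For example, a spec with delegates `Tokenizer` and `NameFinder` and a flow listing both in order yields no errors, while adding a flow node named `Missing` yields exactly one.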
@ShaiviAgarwal2

@DrDub Hi, I would like to work on this issue!

@DrDub
Contributor Author

DrDub commented Mar 26, 2024

Hi @ShaiviAgarwal2, do you mind checking whether the instructions in the new readme at #15 work? I'd really appreciate it.

Also, if you're considering applying to the GSoC please send me an email to drdub@apache.org to further discuss.

@ShaiviAgarwal2

ShaiviAgarwal2 commented Mar 26, 2024

I sent you the email. Could you please check it?

@ShaiviAgarwal2

@DrDub I checked the instructions in the new readme at #15. They work fine :)

@mac-op
Contributor

mac-op commented May 30, 2024

Hi, I'll be working on this issue

@mac-op
Contributor

mac-op commented Jun 5, 2024

We have identified what has already been implemented (in internal_aggregate_engine.cpp, annotator_mgr.cpp and related files) as well as what is currently missing (e.g. when a delegate is a CAS Multiplier, the CASIter for Aggregates does not return the children CASes).

I will create test cases to examine the capabilities of the XML Parser to see if it conforms to the entire spec and then move on to the Aggregate Engine functionalities.
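
To illustrate the CAS Multiplier gap mentioned above, here is a toy sketch of an aggregate-level iterator that does hand back child CASes. Everything in it is hypothetical (`ToyCas`, `ToyDelegate`, `ToyAggregateCasIterator`, `exampleChildren`) and it does not reflect the real UIMACPP classes. It also assumes that a child CAS continues through the delegates that follow the multiplier before being returned, which is only one possible policy.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <deque>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for a CAS: just the document text.
struct ToyCas { std::string text; };

// Hypothetical delegate. A plain annotator returns no children;
// a multiplier-like delegate returns the new CASes it created.
using ToyDelegate = std::function<std::vector<ToyCas>(ToyCas&)>;

// Hypothetical aggregate-level iterator over the child CASes.
class ToyAggregateCasIterator {
public:
    ToyAggregateCasIterator(const std::vector<ToyDelegate>& flow, ToyCas& input) {
        run(flow, 0, input);
    }
    bool hasNext() const { return !children_.empty(); }
    ToyCas next() {
        ToyCas front = children_.front();
        children_.pop_front();
        return front;
    }

private:
    // Runs cas through flow[start..]. Each child continues with the delegates
    // after the one that produced it and is then queued for the caller.
    void run(const std::vector<ToyDelegate>& flow, std::size_t start, ToyCas& cas) {
        for (std::size_t i = start; i < flow.size(); ++i) {
            std::vector<ToyCas> made = flow[i](cas);
            for (ToyCas& child : made) {
                run(flow, i + 1, child);
                children_.push_back(child);
            }
        }
    }
    std::deque<ToyCas> children_;
};

// Example: a line splitter (multiplier) followed by an upper-casing annotator.
std::vector<std::string> exampleChildren() {
    ToyDelegate splitter = [](ToyCas& cas) {
        std::vector<ToyCas> out;
        std::string line;
        for (char c : cas.text) {
            if (c == '\n') {
                if (!line.empty()) out.push_back(ToyCas{line});
                line.clear();
            } else {
                line += c;
            }
        }
        if (!line.empty()) out.push_back(ToyCas{line});
        return out;
    };
    ToyDelegate upper = [](ToyCas& cas) {
        for (char& c : cas.text)
            c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
        return std::vector<ToyCas>{};
    };

    ToyCas input{"first\nsecond"};
    ToyAggregateCasIterator it({splitter, upper}, input);
    std::vector<std::string> texts;
    while (it.hasNext()) texts.push_back(it.next().text);
    return texts;
}
```

In this toy setup the iterator yields one child per input line, each already processed by the later delegate. A test case for the real aggregate could follow the same shape: feed a multi-segment document through an aggregate containing a multiplier and assert on the number and content of the children returned.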
