Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inconsistency with API on empty strings #228

Open
nkkarpov opened this issue Sep 7, 2024 · 2 comments
Open

Inconsistency with API on empty strings #228

nkkarpov opened this issue Sep 7, 2024 · 2 comments
Labels

Comments

@nkkarpov
Copy link

nkkarpov commented Sep 7, 2024

For example, edlib.align("ACGT", "", k=1) returns 4 instead of -1. Additionally, it appears that the CIGAR string and locations are unreliable when the query string is empty, as they contain None even when k is larger than the edit distance.

edlib.align("ACGT", "", k=5, task="path")
{'editDistance': 4,
 'alphabetLength': 4,
 'locations': [(None, -1)],
 'cigar': None}
@nkkarpov nkkarpov added the bug label Sep 7, 2024
@Martinsos
Copy link
Owner

Thanks for reporting this!

Steps I think we should take:

  1. It seems you reported issue with python version of Edlib -> I wonder if this is issue also with C/C++ edlib, so the issue is originating from there, or this is happening only in the python binding in the logic that wraps C/C++ edlib.
  2. "Easy fix" for this is to update docs/comments that empty strings are not acceptable + throw runtime error on empty strings.
  3. Proper fix would be to make sure that edlib correctly handles empty string. We would first need to define what behaviour we expect, and then implement it. Also add tests for these cases.

I am short with time at the moment + this is easy to handle on the user side so I won't be prioritizing it at the moment, but I would be happy to accept PRs, as long as they follow the steps above.

@nkkarpov
Copy link
Author

The main reason this happens is that C++ code considers empty strings to be a special case. The code without these particular cases would produce the expected outputs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants