This repository has been archived by the owner on Aug 2, 2023. It is now read-only.

Breakpoint with Unicode chars in filename doesn't work in 3.5 #1124

Closed
int19h opened this issue Jan 26, 2019 · 1 comment

@int19h
Contributor

int19h commented Jan 26, 2019

Manifests in tests/func/test_breakpoints.py::test_path_with_unicode[file-attach_socket_import] on Python 3.5:

tests/func/test_breakpoints.py::test_path_with_unicode[file-attach_socket_import]
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ Timeout ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Captured stdout ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@00.000000: New debug session with method 'launch'
@00.000000: Initializing debug session for ptvsd#5678
@00.000000: ptvsd: c:\git\ptvsd\.tox\py35\lib\site-packages\ptvsd\__init__.py
@00.000000: Start method: attach_socket_import
@00.000000: Target: (file) C:\git\ptvsd\tests\func\testfiles\bp\\u0ca8\u0ca8\u0ccd\u0ca8_\u0cb8\u0ccd\u0c95\u0ccd\u0cb0\u0cbf\u0caa\u0ccd\u0c9f\u0ccd.py
@00.000000: Current directory: None
@00.000000: PYTHONPATH: c:\git\ptvsd\.tox\py35\lib\site-packages;C:\git\ptvsd\tests\helpers\debuggee
@00.000000: Spawning ['c:\\git\\ptvsd\\.tox\\py35\\scripts\\python.exe', 'C:\\git\\ptvsd\\tests\\func\\testfiles\\bp\\\u0ca8\u0ca8\u0ccd\u0ca8_\u0cb8\u0ccd\u0c95\u0ccd\u0cb0\u0cbf\u0caa\u0ccd\u0c9f\u0ccd.py']
@00.016000: Trying to connect to ptvsd#5678
@01.016000: Successfully connected to ptvsd#5678
@01.016000: ptvsd#5678 --> {"event": "output", "type": "event", "body": {"data": {"version": "4.2.1+3.gece703c"}, "output": "ptvsd", "category": "telemetry"}, "seq": 0}
@01.110000: ptvsd#5678 has pid=1648
@01.110000: Waiting for next Event('output', ANY)
@01.110000: Realized (1!Mark('begin',) >> Event('output', ANY)):
    Event('output', ANY) by 2!Event('output', {'data': {'version': '4.2.1+3.gece703c'}, 'output': 'ptvsd', 'category': 'telemetry'})
@01.110000: ptvsd#5678 <-- {"arguments": {"adapterID": "test"}, "type": "request", "seq": 1, "command": "initialize"}
@01.110000: Waiting for Response(3!Request('initialize', {'adapterID': 'test'}), ANY)
@01.110000: ptvsd#5678 --> {"type": "response", "body": {"exceptionBreakpointFilters": [{"filter": "raised", "default": false, "label": "Raised Exceptions"}, {"filter": "uncaught", "default": true, "label": "Uncaught Exceptions"}], "supportsSetExpression": true, "supportsModulesRequest": true, "supportsLogPoints": true, "supportsValueFormattingOptions": true, "supportsEvaluateForHovers": true, "supportsConditionalBreakpoints": true, "supportsExceptionInfoRequest": true, "supportsConfigurationDoneRequest": true, "supportTerminateDebuggee": true, "supportsCompletionsRequest": true, "supportsExceptionOptions": true, "supportsDebuggerProperties": true, "supportsSetVariable": true, "supportsHitConditionalBreakpoints": true}, "message": "", "request_seq": 1, "seq": 1, "success": true, "command": "initialize"}
@01.110000: ptvsd#5678 --> {"event": "initialized", "type": "event", "body": {}, "seq": 2}
@01.125000: Realized Response(3!Request('initialize', {'adapterID': 'test'}), ANY):
    Response(3!Request('initialize', {'adapterID': 'test'}), ANY) by 4!Response(3!Request('initialize', {'adapterID': 'test'}), {'exceptionBreakpointFilters': [{'filter': 'raised', 'default': False, 'label': 'Raised Exceptions'}, {'filter': 'uncaught', 'default': True, 'label': 'Uncaught Exceptions'}], 'supportsSetExpression': True, 'supportsModulesRequest': True, 'supportsLogPoints': True, 'supportsValueFormattingOptions': True, 'supportsEvaluateForHovers': True, 'supportsConditionalBreakpoints': True, 'supportsExceptionInfoRequest': True, 'supportsConfigurationDoneRequest': True, 'supportTerminateDebuggee': True, 'supportsCompletionsRequest': True, 'supportsExceptionOptions': True, 'supportsDebuggerProperties': True, 'supportsSetVariable': True, 'supportsHitConditionalBreakpoints': True})
@01.125000: Waiting for next Event('initialized', {})
@01.125000: Realized (2!Event('output', {'data': {'version': '4.2.1+3.gece703c'}, 'output': 'ptvsd', 'category': 'telemetry'}) >> Event('initialized', {})):
    Event('initialized', {}) by 5!Event('initialized', {})
@01.125000: ptvsd#5678 <-- {"arguments": {"debugOptions": ["RedirectOutput"], "pathMappings": []}, "type": "request", "seq": 2, "command": "attach"}
@01.125000: Waiting for Response(6!Request('attach', {'debugOptions': ['RedirectOutput'], 'pathMappings': []}), ANY)
@01.125000: ptvsd#5678 --> {"type": "response", "body": {}, "message": "", "request_seq": 2, "seq": 3, "success": true, "command": "attach"}
@01.125000: Realized Response(6!Request('attach', {'debugOptions': ['RedirectOutput'], 'pathMappings': []}), ANY):
    Response(6!Request('attach', {'debugOptions': ['RedirectOutput'], 'pathMappings': []}), ANY) by 7!Response(6!Request('attach', {'debugOptions': ['RedirectOutput'], 'pathMappings': []}), {})
@01.125000: ptvsd#5678 <-- {"type": "request", "seq": 3, "command": "threads"}
@01.125000: Waiting for (8!Request('threads', None) >> Event('thread', ANY))
@01.125000: ptvsd#5678 --> {"event": "thread", "type": "event", "body": {"reason": "started", "threadId": 1}, "seq": 4}
@01.125000: ptvsd#5678 --> {"type": "response", "body": {"threads": [{"id": 1, "name": "MainThread"}]}, "message": "", "request_seq": 3, "seq": 5, "success": true, "command": "threads"}
@01.125000: Realized (8!Request('threads', None) >> Event('thread', ANY)):
    Event('thread', ANY) by 9!Event('thread', {'reason': 'started', 'threadId': 1})
@01.125000: Waiting for Response(8!Request('threads', None), ANY)
@01.125000: Realized Response(8!Request('threads', None), ANY):
    Response(8!Request('threads', None), ANY) by 10!Response(8!Request('threads', None), {'threads': [{'id': 1, 'name': 'MainThread'}]})
@01.125000: ptvsd#5678 <-- {"arguments": {"breakpoints": [{"line": 6}], "source": {"path": "C:\\git\\ptvsd\\tests\\func\\testfiles\\bp\\\u0ca8\u0ca8\u0ccd\u0ca8_\u0cb8\u0ccd\u0c95\u0ccd\u0cb0\u0cbf\u0caa\u0ccd\u0c9f\u0ccd.py"}}, "type": "request", "seq": 4, "command": "setBreakpoints"}
@01.125000: Waiting for Response(11!Request('setBreakpoints', {'breakpoints': [{'line': 6}], 'source': {'path': 'C:\\git\\ptvsd\\tests\\func\\testfiles\\bp\\\u0ca8\u0ca8\u0ccd\u0ca8_\u0cb8\u0ccd\u0c95\u0ccd\u0cb0\u0cbf\u0caa\u0ccd\u0c9f\u0ccd.py'}}), ANY)
@01.125000: ptvsd#5678 --> {"type": "response", "body": {"breakpoints": [{"id": 0, "line": 6, "verified": true}]}, "message": "", "request_seq": 4, "seq": 6, "success": true, "command": "setBreakpoints"}
@01.141000: Realized Response(11!Request('setBreakpoints', {'breakpoints': [{'line': 6}], 'source': {'path': 'C:\\git\\ptvsd\\tests\\func\\testfiles\\bp\\\u0ca8\u0ca8\u0ccd\u0ca8_\u0cb8\u0ccd\u0c95\u0ccd\u0cb0\u0cbf\u0caa\u0ccd\u0c9f\u0ccd.py'}}), ANY):
    Response(11!Request('setBreakpoints', {'breakpoints': [{'line': 6}], 'source': {'path': 'C:\\git\\ptvsd\\tests\\func\\testfiles\\bp\\\u0ca8\u0ca8\u0ccd\u0ca8_\u0cb8\u0ccd\u0c95\u0ccd\u0cb0\u0cbf\u0caa\u0ccd\u0c9f\u0ccd.py'}}), ANY) by 12!Response(11!Request('setBreakpoints', {'breakpoints': [{'line': 6}], 'source': {'path': 'C:\\git\\ptvsd\\tests\\func\\testfiles\\bp\\\u0ca8\u0ca8\u0ccd\u0ca8_\u0cb8\u0ccd\u0c95\u0ccd\u0cb0\u0cbf\u0caa\u0ccd\u0c9f\u0ccd.py'}}), {'breakpoints': [{'id': 0, 'line': 6, 'verified': True}]})
@01.141000: ptvsd#5678 <-- {"type": "request", "seq": 5, "command": "configurationDone"}
@01.141000: Waiting for next (Event('process', ANY) & Response(13!Request('configurationDone', None), ANY))
@01.141000: ptvsd#5678 OUT b'??? ??????'
@01.141000: ptvsd#5678 --> {"event": "process", "type": "event", "body": {"systemProcessId": 1648, "startMethod": "attach", "isLocalProcess": true, "name": "C:\\git\\ptvsd\\tests\\func\\testfiles\\bp\\\u0ca8\u0ca8\u0ccd\u0ca8_\u0cb8\u0ccd\u0c95\u0ccd\u0cb0\u0cbf\u0caa\u0ccd\u0c9f\u0ccd.py"}, "seq": 7}
@01.141000: ptvsd#5678 --> {"type": "response", "body": {}, "message": "", "request_seq": 5, "seq": 8, "success": true, "command": "configurationDone"}
@01.141000: Realized (12!Response(11!Request('setBreakpoints', {'breakpoints': [{'line': 6}], 'source': {'path': 'C:\\git\\ptvsd\\tests\\func\\testfiles\\bp\\\u0ca8\u0ca8\u0ccd\u0ca8_\u0cb8\u0ccd\u0c95\u0ccd\u0cb0\u0cbf\u0caa\u0ccd\u0c9f\u0ccd.py'}}), {'breakpoints': [{'id': 0, 'line': 6, 'verified': True}]}) >> (Event('process', ANY) & Response(13!Request('configurationDone', None), ANY))):
    Event('process', ANY) by 14!Event('process', {'systemProcessId': 1648, 'startMethod': 'attach', 'isLocalProcess': True, 'name': 'C:\\git\\ptvsd\\tests\\func\\testfiles\\bp\\\u0ca8\u0ca8\u0ccd\u0ca8_\u0cb8\u0ccd\u0c95\u0ccd\u0cb0\u0cbf\u0caa\u0ccd\u0c9f\u0ccd.py'})
    Response(13!Request('configurationDone', None), ANY) by 15!Response(13!Request('configurationDone', None), {})
@01.141000: Realized Event('process', {'startMethod': 'attach', 'systemProcessId': 1648, 'isLocalProcess': True, 'name': ANY.str}):
    Event('process', {'startMethod': 'attach', 'systemProcessId': 1648, 'isLocalProcess': True, 'name': ANY.str}) by 14!Event('process', {'systemProcessId': 1648, 'startMethod': 'attach', 'isLocalProcess': True, 'name': 'C:\\git\\ptvsd\\tests\\func\\testfiles\\bp\\\u0ca8\u0ca8\u0ccd\u0ca8_\u0cb8\u0ccd\u0c95\u0ccd\u0cb0\u0cbf\u0caa\u0ccd\u0c9f\u0ccd.py'})
@01.141000: Waiting for next Event('stopped', {'reason': 'breakpoint', ...})
@01.141000: ptvsd#5678 --> {"event": "output", "type": "event", "body": {"output": "b'??? ??????'", "category": "stdout"}, "seq": 9}
@01.141000: ptvsd#5678 --> {"event": "output", "type": "event", "body": {"output": "\n", "category": "stdout"}, "seq": 10}
@01.141000: ptvsd#5678 --> {"event": "thread", "type": "event", "body": {"reason": "exited", "threadId": 1}, "seq": 11}
@01.282000: ptvsd#5678 --> {"event": "exited", "type": "event", "body": {"exitCode": 0}, "seq": 12}
@01.282000: ptvsd#5678 --> {"event": "terminated", "type": "event", "body": {}, "seq": 13}

The test times out waiting for the "stopped" event:

  File "C:\git\ptvsd\tests\func\test_breakpoints.py", line 56, in test_path_with_unicode
    hit = session.wait_for_thread_stopped('breakpoint')

When pytest-xdist is used (which is the default unless running with -n0, so this includes all automated runs), the error message is different:

2019-01-26T01:52:18.9629438Z Traceback (most recent call last):
2019-01-26T01:52:18.9629644Z   File "d:\a\1\s\.tox\py35\lib\site-packages\pytest_timeout.py", line 344, in timeout_timer
2019-01-26T01:52:18.9629820Z     write(stdout)
2019-01-26T01:52:18.9630019Z   File "d:\a\1\s\.tox\py35\lib\site-packages\pytest_timeout.py", line 397, in write
2019-01-26T01:52:18.9630193Z     stream.write(text)
2019-01-26T01:52:18.9630546Z   File "d:\a\1\s\.tox\py35\lib\encodings\cp1252.py", line 19, in encode
2019-01-26T01:52:18.9630726Z     return codecs.charmap_encode(input,self.errors,encoding_table)[0]
2019-01-26T01:52:18.9630927Z UnicodeEncodeError: 'charmap' codec can't encode characters in position 290-293: character maps to <undefined>

This is due to a bug in pytest-xdist when handling captured test output containing characters that the output stream's encoding (cp1252 here) cannot represent. The underlying failure is the same timeout shown above.
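For illustration, the UnicodeEncodeError in that traceback can be reproduced in isolation. This is a minimal sketch, not the pytest-xdist or pytest-timeout code path: `encode_for_stream` is a hypothetical helper that only mimics what a text stream with a cp1252 encoding does on write, and the sample text is the first few Kannada characters of the test file name.

```python
# First four characters of the test file name from the log above.
text = "\u0ca8\u0ca8\u0ccd\u0ca8"


def encode_for_stream(s, encoding, errors="strict"):
    """Hypothetical helper: encode s as a text stream with this encoding would."""
    return s.encode(encoding, errors)


# Strict encoding (the default for most streams) raises, as in the traceback.
try:
    encode_for_stream(text, "cp1252")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# An escaping error handler keeps the write from raising, at the cost of
# showing escape sequences instead of the original characters.
escaped = encode_for_stream(text, "cp1252", errors="backslashreplace")

print(strict_ok)
print(escaped)
```

Whether such an error handler can actually be applied to the stream that pytest-timeout writes to is a separate question that this sketch does not answer.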

@int19h
Contributor Author

int19h commented Jan 31, 2019

The root cause here is the way Python before 3.6 handles script filenames internally. While __file__ of a module and co_filename of a code object are Unicode strings in 3.x, the filename seems to pass through a decoding/encoding step at some point, which uses sys.getfilesystemencoding(). On Windows, that corresponds to the "ANSI" code page for the current locale. As a result, if the filename contains any non-ASCII characters that cannot be represented in that encoding, the code object ends up with co_filename == '<encoding error>', which, of course, doesn't match the proper file name in the breakpoint. In 3.6+, PEP 529 changed the Windows filesystem encoding to UTF-8, which is why this issue no longer exists there (unless legacy mode is enabled, e.g. via PYTHONLEGACYWINDOWSFSENCODING).
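A quick way to tell whether a given path would be affected is to check whether it survives a strict encode with the filesystem encoding. The sketch below is only that check; it does not reproduce the '<encoding error>' value itself, and `representable_in_fs_encoding` is a hypothetical name, not part of ptvsd or the test suite.

```python
import sys


def representable_in_fs_encoding(path, encoding=None):
    """Return True if path can be encoded losslessly with the given encoding.

    Defaults to sys.getfilesystemencoding(), which on Windows before
    Python 3.6 is the locale's "ANSI" code page.
    """
    if encoding is None:
        encoding = sys.getfilesystemencoding()
    try:
        path.encode(encoding, "strict")
    except UnicodeEncodeError:
        return False
    return True


# File name used by test_path_with_unicode, per the log above.
script = "\u0ca8\u0ca8\u0ccd\u0ca8_\u0cb8\u0ccd\u0c95\u0ccd\u0cb0\u0cbf\u0caa\u0ccd\u0c9f\u0ccd.py"

print(representable_in_fs_encoding(script, "cp1252"))
print(representable_in_fs_encoding(script, "utf-8"))
```

Passing the encoding explicitly, as in the two calls above, makes it possible to compare a pre-3.6 style code page with UTF-8 on any machine.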

So far as I can tell, the only way to mitigate this is to force a different locale that can accommodate the characters in the test. But it doesn't look like that can be changed easily just for the test run, and otherwise we generally can't assume what the locale is on the system which runs the test. So, we'll just have to disable this test for 3.5, and generally recommend that people who want to use Unicode paths in Python on Windows should migrate to 3.6+.
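Disabling the test for 3.5 could look roughly like the following. This is a sketch under assumptions: `skip_unicode_path_test` is a hypothetical helper, and the actual change made in the repository may differ.

```python
import sys


def skip_unicode_path_test(version_info=None):
    """Return True on interpreters older than 3.6, where Unicode script
    paths may end up as '<encoding error>' in co_filename.
    """
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) < (3, 6)


print(skip_unicode_path_test((3, 5, 2)))
print(skip_unicode_path_test((3, 6, 0)))
```

In a pytest suite, the result could be passed as the condition to `pytest.mark.skipif(..., reason=...)` on test_path_with_unicode.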
