Invalid multi-byte sequences cause internal file translations (to UTF8) to fail #13150

Closed
Colengms opened this issue Jan 14, 2025 · 1 comment
Labels: bug, fixed, Language Service, world ready


Colengms commented Jan 14, 2025

Opening this to track an issue I noticed while investigating: #13078

If files.encoding is set to a multi-byte encoding and an invalid multi-byte sequence is detected, the translation fails. It should instead skip the invalid sequence and continue translating the rest of the file.
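
To illustrate the "skip and continue" behavior described above, here is a minimal sketch. It is not the cpptools implementation: it uses UTF-8 as the multi-byte encoding only because its validity rules are compact, and the names `validUtf8Length` and `skipInvalidSequences` are hypothetical. The idea is that an invalid lead or trailing byte causes one byte to be dropped, after which decoding resumes at the next byte instead of aborting.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not from cpptools): returns the length in bytes of the
// valid UTF-8 sequence starting at s[i], or 0 if the bytes there are invalid.
static std::size_t validUtf8Length(const std::string& s, std::size_t i) {
    const unsigned char b0 = static_cast<unsigned char>(s[i]);
    if (b0 < 0x80) return 1;                     // single-byte ASCII

    std::size_t len = 0;
    if (b0 >= 0xC2 && b0 <= 0xDF) len = 2;
    else if (b0 >= 0xE0 && b0 <= 0xEF) len = 3;
    else if (b0 >= 0xF0 && b0 <= 0xF4) len = 4;
    else return 0;                               // stray continuation or bad lead byte

    if (i + len > s.size()) return 0;            // sequence is truncated

    // Every trailing byte must look like 10xxxxxx.
    for (std::size_t k = 1; k < len; ++k) {
        const unsigned char b = static_cast<unsigned char>(s[i + k]);
        if ((b & 0xC0) != 0x80) return 0;
    }

    // Reject overlong forms, UTF-16 surrogates, and values above U+10FFFF.
    const unsigned char b1 = static_cast<unsigned char>(s[i + 1]);
    if (b0 == 0xE0 && b1 < 0xA0) return 0;
    if (b0 == 0xED && b1 > 0x9F) return 0;
    if (b0 == 0xF0 && b1 < 0x90) return 0;
    if (b0 == 0xF4 && b1 > 0x8F) return 0;
    return len;
}

// Hypothetical sketch: copies valid sequences to the output and skips
// invalid bytes one at a time, rather than failing the whole translation.
std::string skipInvalidSequences(const std::string& input) {
    std::string out;
    out.reserve(input.size());
    std::size_t i = 0;
    while (i < input.size()) {
        const std::size_t len = validUtf8Length(input, i);
        if (len == 0) {
            ++i;                                 // drop one bad byte, then resume
            continue;
        }
        out.append(input, i, len);
        i += len;
    }
    return out;
}
```

For the bytes 41 C3 28 42, this sketch returns `A(B`: the stray 0xC3 is dropped and the remaining bytes are kept. Skipping a single byte (rather than the whole presumed sequence) is a design choice that avoids swallowing a valid character that follows a bad lead byte.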

Colengms added the bug, Language Service, and world ready labels on Jan 14, 2025
Colengms added this to the 1.23 milestone on Jan 14, 2025
Colengms self-assigned this on Jan 14, 2025
Colengms moved this to Pull Request in cpptools on Jan 14, 2025
sean-mcmanus modified the milestones: 1.23, 1.23.4 on Jan 15, 2025
sean-mcmanus moved this from Pull Request to Done in cpptools on Jan 15, 2025
sean-mcmanus added the fixed label on Jan 15, 2025
@v-frankwang

@Colengms @sean-mcmanus We verified this issue in the latest C++ extension, 1.24.1 (pre-release), using the steps below. Does the gif show the expected result?
If our validation steps are not correct, could you share the detailed validation steps?

Verification Steps:

  1. Create a Demo folder on the PC.
  2. Open Demo with VS Code, then add a .cpp file and paste the following code:
#include <iostream>
 
int main() {
    
    const char* invalidSequence = "\xC3\x28"; // Invalid UTF-8 sequence
    std::cout << "Invalid sequence: " << invalidSequence << std::endl;
    return 0;
}
  3. Navigate to the bottom toolbar and select the GB18030 encoding.
  4. Run the .cpp file.

Verification Results: The invalid sequence is skipped and the file runs successfully.

(Image: gif of the verification steps)
