Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make sure we pass UTF-16 code unit offsets in all the LSP types #1113

Open
Xanewok opened this issue Nov 3, 2018 · 5 comments
Open

Make sure we pass UTF-16 code unit offsets in all the LSP types #1113

Xanewok opened this issue Nov 3, 2018 · 5 comments

Comments

@Xanewok
Copy link
Member

@Xanewok Xanewok commented Nov 3, 2018

cc #1112
cc microsoft/language-server-protocol#376

This causes problems with displaying correct diagnostic span and code suggestion spans (here).

@Xanewok
Copy link
Member Author

@Xanewok Xanewok commented Nov 30, 2018

Currently LSP specifies all the text offset to use the UTF-16 code unit ("Text Documents section in the LSP specification) and so that's what Range type is expected to pass.

However, RLS uses its own rls_span::Range (from rls-span crate, used both by the rustc and rls), which has text unit offset specified as the unicode scalar values (think Rust char and chars()), which we naively transform to Range using rls_to_range: (bad!)

pub fn rls_to_range(r: span::Range<span::ZeroIndexed>) -> Range {

For lines it doesn't matter, but we should only be able to make the UTF-16 code units <> Unicode scalar value offset conversion given a source line that the range operates on.

It might make sense to create a method on the VFS (https://github.com/rust-dev-tools/rls-vfs) to convert between given spans or columns.

See #1112 and rust-dev-tools/rls-vfs#24 for related changes

@lijinpei
Copy link
Contributor

@lijinpei lijinpei commented Feb 13, 2019

Maybe we should ignore this problem, and wait (or make ?) M$ to change that to utf-8?

@Xanewok
Copy link
Member Author

@Xanewok Xanewok commented Feb 13, 2019

@mawww
Copy link

@mawww mawww commented Feb 14, 2019

If enough client/servers disregard the spec and unify on a sane alternative (byte or codepoint count), VSCode and the spec will eventually adapt. I suspect most tools use byte or codepoint counts until an issue gets opened due to a strange interaction with another lsp tool, at which point somebody reads the spec, re-reads it again, and goes through the various stages of grief...

Microsoft has control of the spec, but we, as tools writers, have no obligation to follow it to the letter, provided we unify on alternative behaviours and make it known.

@Xanewok Xanewok added the bug label Mar 3, 2019
@soc
Copy link

@soc soc commented Mar 26, 2019

@mawww This is exactly what I intend to do for the (non-Rust) LSP implementation I'm planning to write in the coming months.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
4 participants
You can’t perform that action at this time.