-
Notifications
You must be signed in to change notification settings - Fork 39
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow options to be enclosed in '' "" or || #551
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It took me sometime to emulate the string parsing in my mind... Looks good to me.
@Ionic: do you want to take a second look?
Yes, I also think we should remove any strings that are nestable as parseable delimiters to avoid ambiguity. I'll look into this later. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having thought even more about this, I still maintain that we shouldn't allow delimiters for which the start and end characters are different (like {}, [] and <>), but actually decide on one fixed delimiter.
I don't like |
because it's non-standard and weird in this context.
From the remaining characters '
and "
, "
is probably most natural.
I'd go with supporting "
only.
char *realvalue = NULL; | ||
char *next = fullstring + strlen(string); | ||
|
||
if ((delimiter = index(string, '='))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
strchr
instead of index
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, can do that. The original used rindex (which is wrong IMHO). I can see posix has deprecated both of them.
{ | ||
*delimiter = 0; | ||
value = delimiter + 1; | ||
for (int i = 0; i < strlen(startchars); i++) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could use sizeof(startchars) - 1
here to avoid the overhead of strlen()
, but not really important.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
else | ||
{ | ||
/* option without value */ | ||
value = NULL; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Useless, value will already be NULL
here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, will fix that.
char *valstart = fullstring + (value - string) + 1; | ||
if ((next = strchr(valstart, endchars[i]))) | ||
{ | ||
realvalue = strndup(valstart, next - valstart); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's hope that strndup(s, 0)
is legit... on the platforms I tested, it does what a human would expect, but no idea if that's universally true.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"If s is longer than n, only n bytes are copied, and a terminating null byte ('\0') is added."
It's not clearly stated what happens for n=0 but it is also not stated that 0 is illegal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that was my point, too.
delimiter = rindex(string, ':'); | ||
|
||
if (delimiter) | ||
if ((delimiter = rindex(string, ':'))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change to strrchr()
if touching this anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there are some more uses of (r)index() in Binder.c, should also be fixed, but separately
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, there are a lot of other cases that use deprecated functionality, but we can update what we need to touch anyway "for free".
current = NULL; | ||
} | ||
} | ||
if (current == NULL) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
else
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, current can also become NULL inside the if clause. Using an else would skip the check in that case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, right.
@@ -1545,45 +1545,138 @@ static void nxagentParseOptions(char *name, char *value) | |||
free(argv[0]); | |||
} | |||
|
|||
/* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/*
* Please use
* the standard
* multi-line
* comment format
*/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
char *valstart = fullstring + (value - string) + 1; | ||
if ((next = strchr(valstart, endchars[i]))) | ||
{ | ||
realvalue = strndup(valstart, next - valstart); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An option string such as foo="b,ar"baz
will be parsed wrongly, with only b,ar
passed as the value. It's especially problematic that the remainder baz
will be handled as a key in the next iteration.
I guess that more work is necessary here.
And it can get even more complex, think of a string like foo="b,ar"b"az"
as well. We actually need to scan for "quotes" multiple times until we either hit another comma or a NULL
character.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. Quoting characters are only allowed at the beginning and at the end of the string. Everything else is considered unsupported and will/can produce unexpected results.
However, what about extending that a bit by only excepting the end character it it is directly followed by ',' or '\0'?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That at least prohibits unexpected/wrong further parsing and looks correct. I guess that's good enough under the constraint you mentioned.
So let us participate in your thoughts. Why so you want to limit that? Having more possibilities should not hurt anyone. The rule is: Do not use quotes that conflict with the content or you will get wrong results.
Vom Smartphone gesendet.
----- Ursprüngliche Nachricht -----
Von: "Mihai Moldovan" <notifications@github.com>
Gesendet: 18.11.2017 06:28
An: "ArcticaProject/nx-libs" <nx-libs@noreply.github.com>
Cc: "Ulrich Sibiller" <uli42@gmx.de>; "Author" <author@noreply.github.com>
Betreff: Re: [ArcticaProject/nx-libs] Allow options to be enclosed in '' "" ||{} () [] or <> (#551)
@Ionic requested changes on this pull request.
Having thought even more about this, I still maintain that we shouldn't allow delimiters for which the start and end characters are different (like {}, [] and <>), but actually decide on one fixed delimiter.
I don't like | because it's non-standard and weird in this context.
From the remaining characters ' and ", " is probably most natural.
I'd go with supporting " only.
In nx-X11/programs/Xserver/hw/nxagent/Args.c:
+ If the function finds it needs more characters it will take
+ those from fullstring. It will return the position in fullstring
+ after the comma following the current processed option
+
+*/
+static char *processoption(char *string, char *fullstring)
+{
+ const char startchars[] = "'\"|{([<";
+ const char endchars[] = "'\"|})]>";
+
+ char *delimiter = NULL;
+ char *value = NULL;
+ char *realvalue = NULL;
+ char *next = fullstring + strlen(string);
+
+ if ((delimiter = index(string, '=')))
strchr instead of index.
In nx-X11/programs/Xserver/hw/nxagent/Args.c:
+*/
+static char *processoption(char *string, char *fullstring)
+{
+ const char startchars[] = "'\"|{([<";
+ const char endchars[] = "'\"|})]>";
+
+ char *delimiter = NULL;
+ char *value = NULL;
+ char *realvalue = NULL;
+ char *next = fullstring + strlen(string);
+
+ if ((delimiter = index(string, '=')))
+ {
+ *delimiter = 0;
+ value = delimiter + 1;
+ for (int i = 0; i < strlen(startchars); i++)
Could use sizeof(startchars) - 1 here to avoid the overhead of strlen(), but not really important.
In nx-X11/programs/Xserver/hw/nxagent/Args.c:
+ realvalue = strndup(valstart, next - valstart);
+ next++; /* skip endchar */
+ }
+ else
+ {
+ fprintf(stderr, "matching [%c] not found in value for option [%s]\n",
+ endchars[i], string);
+ }
+ break;
+ }
+ }
+ }
+ else
+ {
+ /* option without value */
+ value = NULL;
Useless, value will already be NULL here.
In nx-X11/programs/Xserver/hw/nxagent/Args.c:
+ char *value = NULL;
+ char *realvalue = NULL;
+ char *next = fullstring + strlen(string);
+
+ if ((delimiter = index(string, '=')))
+ {
+ *delimiter = 0;
+ value = delimiter + 1;
+ for (int i = 0; i < strlen(startchars); i++)
+ {
+ if (*value == startchars[i])
+ {
+ char *valstart = fullstring + (value - string) + 1;
+ if ((next = strchr(valstart, endchars[i])))
+ {
+ realvalue = strndup(valstart, next - valstart);
Let's hope that strndup(s, 0) is legit... on the platforms I tested, it does what a human would expect, but no idea if that's universally true.
In nx-X11/programs/Xserver/hw/nxagent/Args.c:
char *delimiter = NULL;
/*
* Remove the port specification.
*/
- delimiter = rindex(string, ':');
-
- if (delimiter)
+ if ((delimiter = rindex(string, ':')))
Change to strrchr() if touching this anyway.
In nx-X11/programs/Xserver/hw/nxagent/Args.c:
+ if (*current == '\0')
+ {
+ break;
+ }
+ else if (*current == ',')
+ {
+ /* skip the comma */
+ current++;
+ }
+ else
+ {
+ fprintf(stderr, "unexpected character '%c' at position %ld of options string\n",*current, (current - string));
+ current = NULL;
+ }
+ }
+ if (current == NULL)
else?
In nx-X11/programs/Xserver/hw/nxagent/Args.c:
@@ -1545,45 +1545,138 @@ static void nxagentParseOptions(char *name, char *value)
free(argv[0]);
}
+/*
/*
* Please use
* the standard
* multi-line
* comment format
*/
In nx-X11/programs/Xserver/hw/nxagent/Args.c:
+ char *value = NULL;
+ char *realvalue = NULL;
+ char *next = fullstring + strlen(string);
+
+ if ((delimiter = index(string, '=')))
+ {
+ *delimiter = 0;
+ value = delimiter + 1;
+ for (int i = 0; i < strlen(startchars); i++)
+ {
+ if (*value == startchars[i])
+ {
+ char *valstart = fullstring + (value - string) + 1;
+ if ((next = strchr(valstart, endchars[i])))
+ {
+ realvalue = strndup(valstart, next - valstart);
An option string such as foo="b,ar"baz will be parsed wrongly, with only b,ar passed as the value. It's especially problematic that the remainder baz will be handled as a key in the next iteration.
I guess that more work is necessary here.
And it can get even more complex, think of a string like foo="b,ar"b"az" as well. We actually need to scan for "quotes" multiple times until we either hit another comma or a NULL character.
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub, or mute the thread.
|
I'd exclude delimiters for which start and end characters do not match ( This leaves My point against Leaves Of course we could allow both of them, but these are also slightly ambiguous. In most cases where quoting is allowed via quote characters, double and single quotes behave differently. In such cases, having two options available makes sense, since, for instance, one option (mostly single quotes) treats the quotes text literally, while the other option (mostly double quotes) allows substitutions within the literal text if specific markers are used. In our case, we don't really have such a connotation in mind - or a need for it, or functionality available. Using |
The whole purpose of all this is allowing value containing commas. Depending on the character set in the value you must choose a quoting character that is not contained. So having more to choose makes it more probable that the user (t.i. the programmer of the client application that sets up the option file) finds a suitable one. Sticking to " will make it impossible to quote anything that contains " and , in the value. Further, I don't see anyone using nested quotes, simply because there's no use case IMHO. Can you come up with a parameter example that would require nested quoting? As long as we do not offer escaping here I'd say there is none. The character set I chose for quoting includes the ones I consider as being the best regarding readability. So I included the | character as you can clearly see where it begins and where it ends. I came across all this when implementing rmlvo keyboard definition for the options string because in that case commas can occur. So there really is a need for that. Instead of enclosing the whole string in quotes or brackets I also thought about using an escaping character instead which would really only be required for escaping commas. However, I consider this being less readable and more complicated to handle from the user's standpoint (I have an nx client script written in bash; it is easier/clearer to add quoting than to escape stuff in bash strings). One clean approach would be a new format of the options file: instead of having one long line with comma separated values the option file should consist of several lines, one for each option. Quoting/escaping would not be required as long as we restrict values to not contain any line break characters. But: I don't have an idea how to handle values with commas in the DISPLAY or in the case where you pass a string instead of a filename via -options. Maybe we should make a big step here and restrict those cases to the most common/required ones, the rest has to be passed via options file. Update: improved explanation |
8a1b2dd
to
13854c0
Compare
pushed updated version: replaced index() calls, use sizeof(), abort if quoting characters appear in the middle of the value |
13854c0
to
58a4b9e
Compare
58a4b9e
to
32e06a6
Compare
Right, and that's why I recommend not supporting
Well... then let's keep |
On Di 21 Nov 2017 03:25:34 CET, Mihai Moldovan wrote:
> Further, I don't see anyone using nested quotes, simply because
> there's no use case IMHO. Can you come up with a parameter example
> that would require nested quoting? As long as we do not offer
> escaping here I'd say there is none.
Right, and that's why I recommend not supporting `{}`, `()`, `[]` or
`<>`. They *look* nestable, but we don't want to support it and we
don't have any use case either.
> Sticking to " will make it impossible to quote anything that
> contains " and , in the value.
Well... then let's keep `''`, `""` and `||`. Especially the last one
should be uncommon enough to not accidentally be part of a quoted
string.
Reminds me a bit of ``sed``... Possibly, more chars make sense: ``@``,
``/``, ``whatnot-supported-by-sed``...
Greets,
Mike
--
DAS-NETZWERKTEAM
mike gabriel, herweg 7, 24357 fleckeby
mobile: +49 (1520) 1976 148
landline: +49 (4354) 8390 139
GnuPG Fingerprint: 9BFB AEE8 6C0A A5FF BF22 0782 9AF4 6B30 2577 1B31
mail: mike.gabriel@das-netzwerkteam.de, http://das-netzwerkteam.de
|
f842054
to
a5ebee4
Compare
new version with only ", ' and | and manpage. |
Yes, We cannot directly mimic sed here, since
Say, the string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a few doc fixups.
session startup and at runtime (i.e. when resuming a suspended session) | ||
by so-called nx/nx options. | ||
Other than the command line options, \fBnxagent\fR can be configured | ||
at session startup and at runtime (i.e. when resuming a suspended |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i.e.,
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd rather leave it that way. We are not uniform in this respect, there are occurances of both forms in the manpage and in the sourcecode.
https://jakubmarian.com/comma-after-i-e-and-e-g/ (I prefer British here)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, your choice (and likely makes sense since the statement is enclosed in parentheses.)
Other than the command line options, \fBnxagent\fR can be configured | ||
at session startup and at runtime (i.e. when resuming a suspended | ||
session) by so-called nx/nx options. These options are always passed | ||
as a (sometime looong) options string consisting of option=value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sometimes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
right
as a (sometime looong) options string consisting of option=value | ||
pairs, separated by commas. \fBnxagent\fR will extract that | ||
options string from the DISPLAY variable (see below) or an options | ||
file (provided via the \-\fIoptions\fR command line option). To |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or an options string passed on the command line?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have left that out on purpose:
- the text is already confusing, I did not want to add more confusion.
- the manpage does not include any hint that this is possible at all (which should probably be changed; I have opened an issue for that: nxagent manpage does not mention options string for -options #565)
- an options string passed via cmdline is mainly usable for non-persistent sessions and testing. So it is nothing for the majority of users
Further, I'am wondering if we should drop even "options via DISPLAY" out of the equation, see #564.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes and no... the rules for this stuff are quite complicated and currently not documented - mainly the overrides.
I guess we can leave that out of this PR and only mention the most commonly used scheme, yes.
options string from the DISPLAY variable (see below) or an options | ||
file (provided via the \-\fIoptions\fR command line option). To | ||
overcome problems in case a value contains the comma character a | ||
simple quoting is supported. The quoting character must immediately |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
simple quoting scheme
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like that!
file (provided via the \-\fIoptions\fR command line option). To | ||
overcome problems in case a value contains the comma character a | ||
simple quoting is supported. The quoting character must immediately | ||
follow the = character and again directly precede the next |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Leave out again
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
right. I have just pushed a new version.
2b7f5b6
to
8a7942d
Compare
Do you want to rebase the second commit into the first one prior to merging? |
Please do not merge now - need to adapt nxcomp's option handling in the same way first! |
On Di 05 Dez 2017 08:55:39 CET, Ulrich Sibiller wrote:
Please do not merge now - need to adapt nxcomp's option handling in
the same way first!
Noted... ACK.
Mike
--
DAS-NETZWERKTEAM
mike gabriel, herweg 7, 24357 fleckeby
mobile: +49 (1520) 1976 148
landline: +49 (4354) 8390 139
GnuPG Fingerprint: 9BFB AEE8 6C0A A5FF BF22 0782 9AF4 6B30 2577 1B31
mail: mike.gabriel@das-netzwerkteam.de, http://das-netzwerkteam.de
|
Note: Nesting and escaping is not supported currently! Quoting is is only allowed around the complete value part of an option. Quoting must start directly after the '=' and end directly before ',' or EOS. This is not perfect, e.g. the following string will be processed erroneously: key="v,a,l",u,e1",key2=value2 However, it will help if some value needs to contain a comma.
8a7942d
to
905fd39
Compare
check: are the changes to nxcomp included already? |
On Mi 22 Aug 2018 15:41:11 CEST, Ulrich Sibiller wrote:
check: are the changes to nxcomp included already?
Don't think so. Actually, I think that's why you wanted me to stand by
on this.
Mike
--
DAS-NETZWERKTEAM
mike gabriel, herweg 7, 24357 fleckeby
mobile: +49 (1520) 1976 148
landline: +49 (4354) 8390 139
GnuPG Fingerprint: 9BFB AEE8 6C0A A5FF BF22 0782 9AF4 6B30 2577 1B31
mail: mike.gabriel@das-netzwerkteam.de, http://das-netzwerkteam.de
|
Looking at the code in nxcomp/src/Loop.cpp I found that nxcomp supports typical URL encoding techniques in the option values (see function |
I have implemented URL decoding in #725. That solves the underlying problem in a very elegant way. |
Superceded by #725, thus closing... |
Note: Nesting and escaping is not supported currently!
This is not perfect, e.g. the following string will be processed
wrong:
key=<v,a,l>,u,e1>,key2=value2
However, it will help if some value needs to contain a comma (which will be required by one of my keyboard patches to pass full RMLVO keyboard definition via the
keyboard
option.