Python: Add StringConstCompare BarrierGuard#4700
Conversation
Don't know why it would ever have been under default sanitizers :D
`not` is not working properly, but otherwise pretty good
In the future, I could imagine we would have something like this, but for now,
I'm just keeping it simple.
```codeql
/**
* A collection of common guards that ensure the checked value cannot have arbitrary
* values.
*
* Currently only supports comparison with constant string value, but could also
* include checking whether all characters are alphanumeric, or whether a regex is
* matched against the value.
*
* Such guards will be useful for many taint-tracking queries, but not necessarily
* all, which is why you need to opt into these manually.
*/
class CommonNonArbitraryGuard extends BarrierGuard {
CommonNonArbitraryGuard() {
this instanceof StringConstCompare
}
override predicate checks(ControlFlowNode node, boolean branch) {
this.(StringConstCompare).checks(node, branch)
}
}
```
|
This is turning into a bit of a Frankensteins monster... sorry about that 👹 😞 |
Also moved file to reflect that. Added tests of + `!=` + `in` + `not in`
|
I think I tried to cut too many corners for My implementation doesn't try to handle variable references at all though 😞 We could do that, but I'm not sure it's the most important thing right now. Let me know if you think differently 😛 |
tausbn
left a comment
There was a problem hiding this comment.
A few questions/comments, otherwise this looks good to me. I like the test cases. 🙂
| str_const_iterable instanceof SequenceNode | ||
| or | ||
| str_const_iterable instanceof SetNode | ||
| ) and | ||
| forall(ControlFlowNode elem | | ||
| elem = str_const_iterable.(SequenceNode).getAnElement() | ||
| or | ||
| elem = str_const_iterable.(SetNode).getAnElement() | ||
| | | ||
| elem.getNode() instanceof StrConst |
There was a problem hiding this comment.
This bit bugs me a bit, because it seems strangely inverted. Would it perhaps improve matters to define a separate abstract class of "constant iterables containing strings" and then extending it with subclasses for sequences and sets?
In particular, one thing that's unfortunate is that this doesn't account for local flow, which might be useful. One could extend the above to take local flow into consideration, but it would really blow up the size (in term of lines of code) of the characteristic predicate.
This also makes me think that perhaps these things should not live in DataFlowPublic (which is getting a bit crowded), but perhaps in a file dedicated solely to barrier guards?
There was a problem hiding this comment.
I agree. that code did not turn out that nice. I've tried to rewrite, let me know what you think :)
I'm specifically not using an abstract class, since that is an anti-pattern: github#4357 (comment) (I'm still trying to wrap my head fully aroudn this)
tausbn
left a comment
There was a problem hiding this comment.
Thanks for the changes. I think this looks good! 👍
I made a few other changes, since I was exploring how we handle
BarrierGuards in general, and saw that I had not updated the logical test for sanitizers.As mentioned in the commit message: In the future, I could imagine we would have something like this, but for now, I'm just keeping it simple.