Hacker News Clone

Unexpected security footguns in Go's parsers

by ingve on 6/18/2025, 11:35:12 AM with 86 comments

by octo888 on 6/21/2025, 10:35:11 AM
What is "IsAdmin" doing in the "create user" request DTO in the first place? The examples seem to indicate inappropriate re-use of data models.
Would it not be better to:
```
  type CreateUserRequest struct {
    Username string
    Password string
  }

  type UserView struct {
    Username string
    IsAdmin boolean
  }
```
etc?
No need to just have just 1 model that maps 1:1 to your DB row. This applies to all languages
by fainpul on 6/21/2025, 12:49:49 PM
I didn't see this mentioned in the article: wouldn't the obvious way be to make the private field private (lowercase)?
(I'm not a Go developer, just tried the language casually).
```
    type User struct {
     Username string `json:"username_json_key,omitempty"`
     Password string `json:"password"`
     isAdmin  bool
    }
```
https://go.dev/play/p/1m-6hO93Xce
by em-bee on 6/21/2025, 12:45:46 PM
i am not aware of any parser that does that differently, but i would also argue that this is not the job of parsers. after parsing (or before exporting) there should be a data validation step based on whitelists.
so the user can send in unknown fields all they want, the code will only accept the username and firstname strings, and ignore the other ones.
same with fetching data and sending it to the user. i fetch only the fields i want and create the correct datastructures before invoking the marshaling step.
there are no footguns. if you expect your parser to protect you you are using it wrong. they were not designed for that.
input -> parse -> extract the fields we want, which are valid -> create a data-structure with those fields.
data -> get fields i want -> create datastructures with only wanted fields -> write to output format
by glenjamin on 6/21/2025, 1:18:25 PM
It’s worth noting that if you DisallowUnknownFields it makes it much harder to handle forward/backward compatible API changes - which is a very common and usually desirable pattern
by anitil on 6/20/2025, 1:03:27 AM
This was all very interesting, but that polyglot json/yaml/xml payload was a big surprise to me! I had no idea that go's default xml parser would accept proceeding and trailing garbage. I'd always thought of json as one of the simpler formats to parse, but I suppose the real world would beg to differ.
It's interesting that decisions made about seemingly-innocuous conditions like 'what if there are duplicate keys' have a long tail of consequences
by e_y_ on 6/21/2025, 8:44:42 AM
As someone who isn't a Go programmer, on the face of it using strings (struct tags) for field metadata seems pretty backwards compared to Rust macros (which parses the metadata at compile time) or Java annotations (which are processed at runtime but at least don't require parsing a string to break apart options).
The accidental omitempty and - are a good example of the weirdness even if they might not cause problems in practice.
by asimops on 6/21/2025, 1:09:43 PM
In the case of Attack scenario 2, I do not get why in a secure design you would ever forward the client originating data to the auth service. This is more of a broken best practise then a footgun to me.
The logic should be "Parse, don't validate"[0] and after that you work on those parsed data.
[0]: https://hn.algolia.com/?q=https%3A%2F%2Flexi-lambda.github.i...
by tptacek on 6/21/2025, 5:08:42 PM
These kinds of issues (parser differentials in particular) are why you shouldn't trust Go SAML implementations that use `encoding/xml`, which was never designed for that application to begin with; I just wrote my own for my SAML.
(I mean, don't use SAML to begin with, but.)
by aintnolove on 6/21/2025, 12:31:48 PM
I know these problems are easily avoidable... but I'm finally starting to see the appeal of protocol buffers.
Just to have the assurance that, regardless of programming language, you're guaranteed a consistent ser/de experience.
by piinbinary on 6/21/2025, 4:07:36 PM
Kudos to the author for making this very clear and easy to understand. More technical writing should be like this.
On another note, it's mind-blowing that a single string can parse as XML, JSON, and YAML.
by jerf on 6/21/2025, 2:11:56 PM
There's a lot of good information in here, but if you think this is a time to go all language supremicist about how much better your language is and how this proves Go sucks, you might want to double-check the CVE database real quick first. A lot of these issues are more attributable to plain ol' user error and the fundamental messiness of JSON and XML than Go qua Go and I've seen many similar issues everywhere.
For instance, there simply isn't a "correct" way for a parser to handle duplicate keys. Because the problem they have is different layers seeing them differently, you can have the problem anywhere duplicate keys are treated differently, and it's not like Go is the only thing to implement "last wins". It doesn't matter what you do. Last wins? Varies from the many "first wins" implementations. First wins? Varies from the many "last wins" implementations. Nondeterministically choose? Now you conflict with everyone, even yourself, sometimes. Crash or throw an exception or otherwise fail? Now you've got a potential DOS. There's no way for a parser to win here, in any langauge. The code using the parser has some decisions to make.
Another example, the streaming JSON decoder "accepts" trailing garbage data because by the act of using the streaming JSON decoder you have indicated a willingness to potentially decode more JSON data. You can use this to handle newline-separated JSON, or other interesting JSON protocols where you're not committing to the string being just one JSON value and absolutely nothing else. It's not "an issue they're not planning on fixing", it's a feature with an absolutely unavoidable side effect in the context of streams. The JSON parser stops reading the stream at the end of the complete JSON object, by design, and anything else would be wrong because it would be consuming a value to "check" whether the next thing is JSON or not, when you may not even be "claiming" that the "next thing" is JSON, and whatever input it consumed to verify a claim that nobody is even making would itself be a bug.
Accepting user input into sensitive variables is a common mistake I've seen multiple times in a number of langauges. The root problem there is more the tension between convenience and security than languages themselves; any language can make it so convenient to load data that developers accidentally load more data than they realize.
Etc. The best lesson to take away from this is that there is more than meets the eye with JSON and XML and they're harder to use safely than its ease-of-use suggests.
Although in the interests of fairness I also consider case insensitivity in the JSON field names to be a mistake; maybe it should be an option, JSON can get messy in the real world, but it's a bad default. I have other quibbles but most of them are things where there isn't a correct answer where you can unambiguously say that some choice is wrong. JSON is really quite fundamentally messier than people realize, and XML, while generally more tightly specified at the grammer level than JSON is, is generally quite messy in the protocols people build on top of it.
by neuroelectron on 6/21/2025, 8:26:34 AM
Been seeing these same problems in services for decades now. It's almost like they made these protocol languages exploitable on purpose.
by vlowther on 6/21/2025, 3:27:07 PM
Code by vibes and copy/paste instead of R'ing TFM for your language and libraries and understanding your problem domain, introduce security vulnerabilities. Film at 11.
Seriously, the only one that is arguably bad design is the case insensitive JSON keys thing. Everything else is just a reasonable engineering tradeoff given the intersection of the marshalling format design constraints and Go's design. Parsing untrusted data always has the capacity to introduce security vulnerabilities.