• by muldvarp on 12/25/2023, 12:25:03 AM

    I did research on parser differentials for my bachelor's thesis. My initial hope was that I would find a few mismatches for formats without a formal specification. I found mismatches for _every single_ pair of parsers I looked at, including formats with formal specifications. My personal takeaway was "If you use one parser for validation and another parser for evaluation, you're fucked. No exceptions."
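
    For illustration, here's a minimal Python sketch of that failure mode (hypothetical toy parsers, not any real pair I tested), using duplicate JSON keys: the check the validator performs no longer holds for the value the evaluator sees.

        # Two toy parsers for the same JSON document that disagree on
        # duplicate keys: the "validator" keeps the first occurrence,
        # the "evaluator" (plain json.loads) keeps the last.
        import json

        def validate_role(raw: bytes) -> bool:
            pairs = json.loads(raw, object_pairs_hook=lambda kv: kv)
            first_wins = {}
            for key, value in pairs:
                first_wins.setdefault(key, value)
            return first_wins.get("role") != "admin"

        def evaluate_role(raw: bytes) -> str:
            return json.loads(raw).get("role", "guest")

        payload = b'{"role": "guest", "role": "admin"}'
        assert validate_role(payload)     # validator sees "guest": allowed
        print(evaluate_role(payload))     # evaluator sees "admin"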

  • by SAI_Peregrinus on 12/24/2023, 11:35:02 PM

    As the article mentions, Postel's Law is likely to create vulnerabilities. It makes individual systems more robust, but the whole becomes fragile.

  • by Turing_Machine on 12/25/2023, 12:36:17 AM

    > Well, these browsers "helpfully" fix the URL to change backslashes into regular forward slashes, I suppose because people sometimes type in URLs and get their forward and back slashes confused.

    More likely because Windows has historically used \ rather than the / that's standard in Unixish systems. Windows people are used to typing \, so it's indeed somewhat helpful for the browser to accept either (e.g., in file:// URLs).
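
    A concrete sketch of the mismatch that creates (using Python's urllib as the "other" parser; the hostnames are made up):

        # Python's urllib leaves the backslash inside the authority, while
        # WHATWG-style browser parsers treat "\" like "/" for http(s) URLs.
        from urllib.parse import urlparse

        url = "https://good.example\\@evil.example/path"
        print(urlparse(url).hostname)   # urllib: 'evil.example'

        # A browser normalizes this to https://good.example/@evil.example/path,
        # i.e. host 'good.example', so a urllib-based check and a browser-style
        # fetch disagree about where the URL points.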

  • by cjbprime on 12/25/2023, 2:43:49 AM

    Odd that the article doesn't use the more standard term "parser differential", with "differential fuzzing" as the fuzzing community's method for finding those.
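
    Roughly, differential fuzzing just means throwing the same generated inputs at two parsers and reporting any disagreement. A bare-bones sketch (the "strict" parser here is made up):

        # Feed identical random inputs to two parsers and flag every case
        # where they disagree (accept vs. reject, or different values).
        import random

        def lenient_parse(s):
            try:
                return int(s)        # accepts "+7", " 7 ", "1_000", ...
            except ValueError:
                return None

        def strict_parse(s):
            return int(s) if s.isascii() and s.isdigit() else None

        random.seed(0)
        alphabet = "0123456789+_ "
        for _ in range(10_000):
            s = "".join(random.choices(alphabet, k=random.randint(1, 5)))
            if lenient_parse(s) != strict_parse(s):
                print(f"differential on {s!r}: "
                      f"{lenient_parse(s)} vs {strict_parse(s)}")
                break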

  • by ngneer on 12/25/2023, 2:09:28 AM

    This is a LANGSEC concept. A broader survey can be found at: https://www.computer.org/csdl/proceedings-article/spw/2023/1...

  • by dandanua on 12/25/2023, 10:07:23 AM

    I guess if we added up all the problems in IT caused by bugs and poor design in parsers and serialization (SQL injection, XSS, null-byte vulnerabilities, etc.), we'd get billions of human hours in damages.

    What we should have instead is an absolutely clear serialization into a byte string for ANY data structure that has to be processed by two different programs.

    Parsers are programs; they should "parse" bytes, not strings the way we humans do.
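
    One hypothetical sketch of what that could look like: length-prefixed fields, so there are no delimiters or escape rules for two implementations to interpret differently.

        # Every field is a 4-byte big-endian length followed by that many raw
        # bytes. NUL bytes, quotes, "../" etc. are just payload; both sides
        # have to read the exact same field boundaries.
        import struct

        def encode(fields):
            return b"".join(struct.pack(">I", len(f)) + f for f in fields)

        def decode(data):
            fields, offset = [], 0
            while offset < len(data):
                (length,) = struct.unpack_from(">I", data, offset)
                offset += 4
                fields.append(data[offset:offset + length])
                offset += length
            return fields

        record = [b"alice", b"pass\x00word", b'say "hi"; DROP TABLE users']
        assert decode(encode(record)) == record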

  • by conartist6 on 12/25/2023, 12:54:12 PM

    If BABLR succeeds in creating a shared instruction set for defining parsers, you'd just have portable parser grammars running on compatible parser VMs.

  • by o11c on 12/24/2023, 10:55:11 PM

    Usually? A result of the parser not having a machine-readable specification.

    For parsing proper, `bison --xml` is useful if you're allergic to code generation. I don't know of an equivalent for lexing.

  • by RicoElectrico on 12/25/2023, 11:33:35 AM

    Honestly, we should have a name for this class of bugs. It's not an "I didn't know" kind of mistake. Anyone sufficiently capable of programming should be able to figure out on their own that having two parser implementations can cause all sorts of undesired consequences.

  • by sylware on 12/23/2023, 10:52:26 AM

    Usually, some external input text that wasn't verified and sanitized thoroughly enough manages to get into some complex and often brain-damaged text parser (printf, SQL, etc.).
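
    The standard fix for the SQL case is to keep the external input out of the text the parser ever sees, e.g. bound parameters (a sketch with Python's sqlite3):

        # Untrusted input spliced into the SQL text vs. passed out-of-band as
        # a bound parameter, so the SQL parser never sees it as syntax.
        import sqlite3

        conn = sqlite3.connect(":memory:")
        conn.execute("CREATE TABLE users (name TEXT)")
        conn.execute("INSERT INTO users VALUES ('alice')")

        name = "' OR '1'='1"  # untrusted external input

        bad = conn.execute(f"SELECT * FROM users WHERE name = '{name}'").fetchall()
        ok = conn.execute("SELECT * FROM users WHERE name = ?", (name,)).fetchall()
        print(bad, ok)        # bad matches every row, ok matches none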