by rwmj on 9/2/2025, 9:37:07 AM
by pavlov on 9/2/2025, 7:05:24 AM
The sample set contains:
{
  "causal_relation": {
    "cause": {
      "concept": "boom"
    },
    "effect": {
      "concept": "bust"
    }
  }
}
It's practically a hedge-fund-in-a-box.
by kruffalon on 9/2/2025, 1:12:13 PM
I read it as "casual" rather than "causal" and got very disappointed while reading the article!
An inventory of casual knowledge would be really fun, although it's hard to think what it would consist of now that I think about it...
There is this concept of "hidden knowledge": all the things you know at work that no one really thinks of as knowledge, which makes it hard to pass on to newcomers.
But that does sound different than "casual knowledge", and so does "trivia".
Oh well!
by tgv on 9/2/2025, 7:42:58 AM
This makes little sense to me. Ontologies and all that have been tried and have always been found to be too brittle. Take the examples from the front page (which I expect to be among the best in their set): human_activity => climate_change. Those are such broad concepts that the pair is practically useless. Or disease => death. There's no nuance at all. There isn't even a definition of what "disease" is, let alone a way to express that myxomatosis is lethal only for European rabbits, not for humans or goldfish.
by mark_l_watson on 9/2/2025, 2:15:23 PM
This might be of at least some value for augmenting LLM training? I spent a lot of time in the 1980s and early 1990s using symbolic AI techniques: conceptual dependency, NLP, expert systems, etc. While two large and well-funded expert system projects I worked on (paid for by DARPA and PacBell) worked well, symbolic AI was mostly brittle and required what seemed like an infinite amount of human labor.
LLMs are such a huge improvement that the only practical use I see for projects like CauseNet, the defunct OpenCyc project, etc. might be as a little extra training data.
by refactor_master on 9/2/2025, 7:34:54 AM
Might as well go ahead and add https://tylervigen.com/spurious-correlations?page=135 from the looks of it.
by TofuLover on 9/2/2025, 7:53:01 AM
This reminds me of an article I read that was posted on HN only a few days ago: Uncertain<T>[1]. I think that a causality graph like this necessarily needs a concept of uncertainty to preserve nuance. I don't know whether this would be practical in terms of compute, but I'd think combining traditional NLP techniques with LLM analysis may make it so?
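Even just attaching an explicit confidence and provenance to each record would preserve some nuance. A minimal sketch in Python (the extra fields and values here are made up, not CauseNet's actual schema):

    causal_relation = {
        "cause": {"concept": "boom"},
        "effect": {"concept": "bust"},
        "confidence": 0.35,  # how sure we are this is causal rather than mere belief/correlation
        "support": ["https://example.org/econ-article"],  # provenance, so the claim can be audited
    }

Downstream consumers could then threshold or weight relations instead of treating every extracted pair as equally true.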
by bbor on 9/2/2025, 7:11:25 AM
> CauseNet aims at creating a causal knowledge base that comprises all human causal knowledge and to separate it from mere causal beliefs
Pretty bold to use a picture of philosophers as your splash page and then make a casual claim like this. To say the least, this is an impossible task!
The tech looks cool and I'm excited to see how I might be able to work it into my stuff and/or contribute. But I'd encourage the authors to rein in the rhetoric...
by larodi on 9/2/2025, 10:29:39 AM
Why not use Prolog then? It is the essence of cause and effect in programming, and it can also express syllogisms.
by rhizome on 9/2/2025, 7:42:36 AM
"The map is not the territory" ensures that bias and mistakes are inextricable from the entire AI project. I don't want to get all Jaron Lanier about it, but they're fundamental terms in the vocabulary of simulated intelligence.
by circlemaker on 9/2/2025, 9:16:16 PM
This made me think of a much more interesting project. A compendium of information automatically extracted from research articles.
Essentially one totalizing meta analysis.
E.g., if it reads an article about the relationship between height and various life outcomes in Indonesian men, then first, it would store the average height of Indonesian men, the relationship between the average height of Indonesian men and each life outcome in Indonesian men, the type of relationship (e.g. Pearson's correlation), the relationship values (r value), etc. It would store the entity, the relationship, the relationship values, and the doi source.
Something like a quantitative Wikipedia.
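Roughly, each extracted finding could be one record along these lines (all names and values below are invented for illustration):

    from dataclasses import dataclass

    @dataclass
    class ExtractedFinding:
        entity: str          # e.g. "height"
        outcome: str         # e.g. "income"
        population: str      # e.g. "Indonesian men"
        relation_type: str   # e.g. "pearson_correlation"
        value: float         # e.g. the r value
        sample_size: int
        doi: str             # source article

    finding = ExtractedFinding(
        entity="height", outcome="income", population="Indonesian men",
        relation_type="pearson_correlation", value=0.21, sample_size=1200,
        doi="10.0000/placeholder")

With the doi attached, the "meta analysis" stays traceable back to the individual studies.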
by jack_riminton on 9/2/2025, 7:39:33 AM
Reminds me of the early attempts at hand categorising knowledge for AI
by koliber on 9/2/2025, 7:54:06 AM
I wonder how they will quantize causality. Sometimes a particular cause has different, and even opposite, effects.
Alcohol causes anxiety. At the same time it causes relaxation. These effects depend on time frame, and many individual circumstances.
This is a single example but the world is full of them. Codifying causality will involve a certain amount of bias and belief. That does not lead to a better world.
by fohara on 9/2/2025, 3:10:33 PM
The associated paper references Judea Pearl's theories on causality, but curiously doesn't mention the DoWhy implementation [0], which seems to have some recognition in the causal inference space.
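For anyone curious, the basic DoWhy flow is roughly this (a sketch from memory, so treat the details as approximate):

    import numpy as np
    import pandas as pd
    from dowhy import CausalModel

    # toy data: a binary treatment and an outcome, both driven by a confounder w
    n = 1000
    w = np.random.normal(size=n)
    treatment = ((w + np.random.normal(size=n)) > 0).astype(int)
    outcome = 2.0 * treatment + w + np.random.normal(size=n)
    df = pd.DataFrame({"w": w, "treatment": treatment, "outcome": outcome})

    model = CausalModel(data=df, treatment="treatment", outcome="outcome",
                        common_causes=["w"])
    estimand = model.identify_effect()
    estimate = model.estimate_effect(estimand,
                                     method_name="backdoor.linear_regression")
    print(estimate.value)  # should come out near the true effect of 2.0

That's a very different notion of "causal knowledge" than a bare concept-to-concept edge.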
by sinuhe69 on 9/2/2025, 7:23:04 PM
I find the simple expression of "a causes b", as in this database, without any qualification, not very helpful. At a minimum, we need causal graphs [0] / causal loop diagrams to describe these causal relationships better.
[0] https://en.wikipedia.org/wiki/Causal_graph
Harvard has a free course about it: https://www.edx.org/learn/data-analysis/harvard-university-c...
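Even just a graph with qualified edges would help. A sketch (qualifier names are made up), using the myxomatosis example from upthread:

    import networkx as nx

    g = nx.MultiDiGraph()  # parallel edges let the same pair carry different qualifiers
    g.add_edge("myxomatosis", "death",
               population="European rabbits",  # who the effect actually applies to
               probability=0.99)
    g.add_edge("myxomatosis", "death",
               population="humans",
               probability=0.0)

    for cause, effect, qualifiers in g.edges(data=True):
        print(cause, "->", effect, qualifiers)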
by maweki on 9/2/2025, 7:17:50 AM
It's nice to see more semantic web experiments. I always wanted to do more reasoning with ontologies, etc., and it's such an amazing idea to reference objects/persons/locations/concepts from the real world with URIs and just add labeled arrows between them.
This is such a cool schemaless approach and has so much potential for open data linking, classical reasoning, LLM reasoning. But open data (together with RSS) has been dead for a while as all big companies have become just data hoarders. And frankly, while the concept and the possibilities are so cool, the graph databases are just not that fast and also not fun to program.
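For anyone who hasn't tried it, the core idea fits in a few lines of rdflib (the URIs here are just placeholders):

    from rdflib import Graph, Namespace

    ex = Namespace("http://example.org/")
    g = Graph()
    # a "labeled arrow" between two real-world concepts, all three parts identified by URIs
    g.add((ex.smoking, ex.causes, ex.lung_cancer))
    g.add((ex.lung_cancer, ex.mayCause, ex.death))

    # query the arrows back out by pattern
    for s, p, o in g.triples((None, ex.causes, None)):
        print(s, p, o)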
by ivape on 9/2/2025, 8:16:06 AM
I don’t know if it’s inadvertent, but it’s headed toward just becoming an engine for overfitted generalizations. Each causal pair will just emerge based on frequency, which will reinforce itself by preemptively and prematurely classifying all future information.
Unfortunately, frequency is the primary way AI works, but it will never be accurate for causality because causality always has the dynamic that things can happen just “because”. It’s hacked into LLMs via deliberate randomness in next-token prediction.
by growingkittens on 9/2/2025, 4:46:49 PM
Organizing all knowledge requires a flexible system of organization (starting with how the categories are organized and accessed, not the data).
Random thoughts about organizing knowledge:
- Categories need fractal structures.
- Categories need to be available as subcategories to other categories as a pattern.
- Words need to be broken down into base concepts and used as patterns.
- Social information and context alter the meaning of words in many cases, so any semantic web without a control system has limited use as an organization tool.
by kgrizzle on 9/2/2025, 6:24:40 PM
Reminds me of the cyc project. https://en.wikipedia.org/wiki/Cyc
by pfdietz on 9/2/2025, 5:28:44 PM
This is difficult, but then I just had someone earnestly inform me that the covid virus doesn't cause covid, so I think there's a need here, if only to have an automated way of identifying idiots.
by athrowaway3z on 9/2/2025, 9:57:47 AM
A cool idea, in desperate need of an example use case.
by lwansbrough on 9/2/2025, 8:10:48 AM
I was hoping this would be actual normalized time series data and correlation ratios. Such a dataset would be interesting for forecasting.
by thicknavyrain on 9/2/2025, 7:24:30 AM
I know it's a reductive take to point to a single mistake and act like the whole project might be a bit futile (maybe it's a rarity) but this example in their sample is really quite awful if the idea is to give AI better epistemics:
{
  "causal_relation": {
    "cause": {
      "concept": "vaccines"
    },
    "effect": {
      "concept": "autism"
    }
  }
},
... seriously? Then again, they do say these are just "causal beliefs" expressed on the internet, but it seems like some stronger filtering of which beliefs to adopt ought to be exercised for any downstream use case.
by quirk on 9/2/2025, 8:01:40 PM
The fact that they are using Wikipedia for a primary data source exempts them from any further serious consideration.
by huragok on 9/2/2025, 8:40:26 AM
The Cyc of this current AI winter.
by amelius on 9/2/2025, 12:52:16 PM
Can't an LLM extract this type of information with reasonably high accuracy?
by bbstats on 9/2/2025, 11:49:54 AM
Causality is literally impossible to deduce...
by AlienRobot on 9/2/2025, 10:26:32 AM
I wonder what is this for.
by MangoToupe on 9/2/2025, 5:22:34 PM
Wittgenstein is calling
by daloodewi on 9/2/2025, 9:12:33 AM
this will be super cool if it can be done!
Isn't this like Cyc? There have been a couple of interesting articles about that on HN:
https://news.ycombinator.com/item?id=43625474 "Obituary for Cyc"
https://news.ycombinator.com/item?id=40069298 "Cyc: History's Forgotten AI Project"