by mmmeff on 12/10/2021, 4:53:31 AM
by tristanz on 12/8/2021, 5:01:23 PM
Collaborative, incremental improvement of models would be extremely disruptive. It already happens through research, but that process is massively inefficient, particularly as pretrained models get larger and span multiple modalities.
by amznbyebyebye on 12/10/2021, 6:44:34 AM
There is definitely a problem re: large-parameter models; the issue is that I don't think throwing software dev tools at it is the right solution.
The constraint is largely hardware. The incremental post-training done via transfer learning generally isn't applicable to a broad range of other use cases.
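To make that concrete, the usual incremental post-training looks roughly like the sketch below: freeze a pretrained backbone and train a small task-specific head on top. This is a minimal, illustrative PyTorch example with a tiny placeholder network standing in for the real (much larger) pretrained model and made-up dimensions; the resulting head is tied to one narrow task, which is why it rarely carries over to other use cases.

    import torch
    import torch.nn as nn

    # Placeholder for a large pretrained backbone (in practice loaded
    # from a multi-GB checkpoint, not defined inline like this).
    backbone = nn.Sequential(
        nn.Linear(512, 512),
        nn.ReLU(),
        nn.Linear(512, 256),
    )

    # Freeze the pretrained weights; only the new head gets trained.
    for p in backbone.parameters():
        p.requires_grad = False

    head = nn.Linear(256, 10)  # task-specific head (assumes 10 classes)
    model = nn.Sequential(backbone, head)

    optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()

    # One illustrative training step on dummy data.
    x = torch.randn(32, 512)
    y = torch.randint(0, 10, (32,))
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()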
by sharemywin on 12/10/2021, 3:42:24 PM
I'm curious how sparse and modular architectures like DeepMind's Perceiver and Google's Switch Transformer (a mixture-of-experts model) might play into managing an open, distributed model.
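The relevant idea in Switch-style mixture-of-experts is that each token gets routed to a single expert, so the experts become fairly self-contained units that could in principle be sharded across machines or swapped out independently. Here's a minimal top-1 routing sketch (illustrative only, with made-up dimensions and no load-balancing loss or capacity limits, which real implementations need):

    import torch
    import torch.nn as nn

    class Top1MoE(nn.Module):
        """Minimal Switch-style top-1 mixture-of-experts layer (sketch)."""
        def __init__(self, d_model=256, d_ff=1024, n_experts=4):
            super().__init__()
            self.router = nn.Linear(d_model, n_experts)
            self.experts = nn.ModuleList(
                nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(),
                              nn.Linear(d_ff, d_model))
                for _ in range(n_experts)
            )

        def forward(self, x):                       # x: (tokens, d_model)
            gates = self.router(x).softmax(dim=-1)  # routing probabilities
            top_gate, top_idx = gates.max(dim=-1)   # each token picks one expert
            out = torch.zeros_like(x)
            for i, expert in enumerate(self.experts):
                mask = top_idx == i                 # tokens routed to expert i
                if mask.any():
                    out[mask] = top_gate[mask, None] * expert(x[mask])
            return out

    tokens = torch.randn(8, 256)
    print(Top1MoE()(tokens).shape)  # torch.Size([8, 256])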
I feel like there are several industries that are practically computer science yet don't utilize open source effectively. Data science is definitely one, but the video game industry also comes to mind.
You could argue game engines are notoriously complex, but the Linux kernel would like a word.