• by swyx on 3/21/2024, 7:23:31 PM

    congrats on launching! i think my continuing struggle with looking at Ragas as a company/library rather than a very successful mental model is that the core of it is like 8 metrics (https://github.com/explodinggradients/ragas/tree/main/src/ra...) that are each 1-200 LOC. i can inline that easily in my app and retain full control, or model that in langchain or haystack or whatever.

    why is Ragas a library and a company, rather than an overall "standard" or philosophy (eg like Heroku's 12 Factor Apps) that could maybe be more universally adopted without using the library?

    (just giving an opp to pitch some underappreciated benefits of using this library)
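
    (for illustration, a minimal sketch of what inlining one such LLM-judge metric could look like; the prompt and the plain OpenAI-client usage here are hypothetical, not Ragas's actual implementation)

      # hypothetical hand-rolled "faithfulness"-style judge metric; a sketch of
      # inlining one metric, not Ragas's actual code
      from openai import OpenAI

      client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

      JUDGE_PROMPT = (
          "You are grading whether an answer is supported by the given context.\n"
          "Context:\n{context}\n\nAnswer:\n{answer}\n\n"
          "Reply with a single number between 0 and 1, where 1 means fully supported."
      )

      def faithfulness_score(answer: str, context: str, model: str = "gpt-4o-mini") -> float:
          """Ask a judge model how well `answer` is grounded in `context`."""
          resp = client.chat.completions.create(
              model=model,
              temperature=0,
              messages=[{"role": "user",
                         "content": JUDGE_PROMPT.format(context=context, answer=answer)}],
          )
          try:
              return float(resp.choices[0].message.content.strip())
          except ValueError:
              return 0.0  # judge replied with something unparseable; treat as unsupported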

  • by dataexporter on 3/21/2024, 6:11:22 PM

    Based on our initial analysis of RAGAS a few months ago, it didn't provide the results our team was expecting and required a lot of customisation on top of it. Nevertheless, it's a pretty solid library.

  • by pawanapg on 3/21/2024, 9:05:07 PM

    Also check out DeepEval... our team has been using it for a while, and it's been working well for us because we can evaluate any LLM, something this library doesn't seem to support (https://github.com/confident-ai/deepeval).

  • by AndrewCook71 on 3/22/2024, 4:39:57 AM

    This is nice, we've got new open-source LLM evaluation libraries coming out more and more often.

    We're using DeepEval (https://github.com/confident-ai/deepeval) currently. How is this different from that?

  • by redskyluan on 3/21/2024, 9:51:09 PM

    Great product and great progress.

    The first step to building RAG is always to evaluate.

    Besides all the current evaluations, cost and performance should also be part of the evaluation.
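
    A rough sketch of what that could look like per sample (rag_pipeline and quality_score below are placeholders for your own pipeline and metric):

      import time

      # record latency and token usage next to the quality score for each sample
      # so cost/perf can be reported alongside the existing metrics;
      # `rag_pipeline` and `quality_score` are placeholders for your own code
      def evaluate_sample(question: str, rag_pipeline, quality_score) -> dict:
          start = time.perf_counter()
          answer, usage = rag_pipeline(question)  # assume the pipeline also returns token usage
          latency_s = time.perf_counter() - start
          return {
              "score": quality_score(question, answer),
              "latency_s": round(latency_s, 3),
              "prompt_tokens": usage["prompt_tokens"],
              "completion_tokens": usage["completion_tokens"],
          }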

  • by rhogar on 3/21/2024, 8:37:42 PM

    Congratulations on the launch! Personally I would love to see rough estimates of the expected number of requests and tokens required to run tasks like synthetic data generation for different amounts of data. Though this is likely highly variable, it would be nice to have a loose idea of the possible costs and execution time.
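
    In the meantime, a back-of-envelope sketch is easy to put together; every number below is a placeholder, so plug in your own model pricing and empirically measured per-call token counts:

      # back-of-envelope estimate for an LLM-heavy job such as synthetic test set
      # generation; all defaults are placeholders, not measured Ragas numbers
      def estimate_cost(num_docs: int,
                        calls_per_doc: int = 3,        # e.g. question gen + evolution + answer
                        tokens_per_call: int = 1500,   # prompt + completion, measured empirically
                        usd_per_1k_tokens: float = 0.002) -> dict:
          requests = num_docs * calls_per_doc
          tokens = requests * tokens_per_call
          return {
              "requests": requests,
              "tokens": tokens,
              "usd": round(tokens / 1000 * usd_per_1k_tokens, 2),
          }

      print(estimate_cost(num_docs=500))
      # {'requests': 1500, 'tokens': 2250000, 'usd': 4.5}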

  • by nkko on 3/21/2024, 8:15:49 PM

    Congratulations on the launch of Ragas! This looks like an incredibly valuable tool for the LLM community. As the library continues to evolve, it will be interesting to see how it adapts to handle the growing diversity of LLM architectures and use cases.

  • by jfisher4024 on 3/21/2024, 7:55:47 PM

    Congratulations on the launch! I was unable to use this library: I was trying to evaluate different non-OpenAI models and it consistently failed due to malformed JSONs coming from the model.

    Any thoughts about using different models? Is this just a langchain limitation?
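
    For reference, the usual client-side workaround (independent of Ragas or langchain; the helper below is just a sketch) is to salvage whatever JSON the judge model did produce and retry otherwise:

      import json
      import re

      def parse_judge_json(raw: str):
          """Best-effort extraction of a JSON object from an LLM response.

          Weaker judge models often wrap JSON in prose or markdown fences,
          which breaks strict json.loads(); pull out the first {...} span
          and let the caller retry the request if parsing still fails.
          """
          cleaned = re.sub(r"```(?:json)?", "", raw).strip()  # drop markdown fences
          match = re.search(r"\{.*\}", cleaned, flags=re.DOTALL)
          candidate = match.group(0) if match else cleaned
          try:
              return json.loads(candidate)
          except json.JSONDecodeError:
              return None  # caller should retry or skip this sample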

  • by retrovrv on 3/21/2024, 7:51:56 PM

    Phenomenal to see how Ragas has progressed. Congratulations on the launch!