Joined 11/2/2010, 4:46:10 AM has 317 karma
EvalGen: Helping Developers Create LLM Evals Aligned to Their Preferences
Semantic Commit: Helping Users Update Intent Specifications for AI Memory
What AI Engineers Can Learn from Qualitative Research Methods
DocETL: A tool for creating LLM-powered data processing pipelines
Aligning LLM-as-a-Judge with Human Preferences
LLM Wrapper Papers Are Hurting HCI Research
Who Validates the Validators? Aligning LLM-Assisted Evaluation of LLM Outputs
If in a Crowdsourced Data Annotation Pipeline, a GPT-4
Antagonistic AI
How to Compare Prompts with ChainForge [video]
AI for ChainForge Beta
ChatGPT does not have seasonal affective disorder
There is no "seasonal affective disorder" of ChatGPT
There will never be fully automated prompt engineering
ChainForge: A Visual Toolkit for Prompt Engineering and LLM Hypothesis Testing
Ask HN: Have LLM API Updates or Deprecations Impacted You?
Apple's ML model and dataset introspection API