This post describes a research project of mine that ended up requiring domain-specific natural language processing (NLP).
The project started out with learning Bayesian networks to estimate the probabilities of maintenance actions on machines. But once we explored the data, we realized extensive pre-processing was needed: most of the information about actions lived in free-form text descriptions, not in numeric tables.
So the choice was between a large language model (LLM), run remotely or locally, and a more rustic NLP approach. I chose the latter for four reasons:
1. Explainability. Smaller models are easier to decompose and analyze.
2. Security. Using LLMs would likely have required cloud-based or off-premise infrastructure, which posed an additional security hurdle in our case.
3. Speed. This was a very domain-specific dataset with relatively few training instances. Fine-tuning an LLM on it would have taken time (data collection plus training).
4. Performance. The use case was not text generation but text retrieval for Bayesian reasoning. We could get away with NLP models that could adequately infer similarity between texts (see the sketch after this list).
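To make "infer similarity between texts" concrete, here is a minimal sketch of similarity-based retrieval over free-form maintenance descriptions. It uses TF-IDF vectors and cosine similarity from scikit-learn; the corpus and query strings are made up for illustration, and this is just one simple way to do such retrieval, not necessarily the model the project ended up using.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical free-form maintenance descriptions (illustrative only).
corpus = [
    "replaced worn bearing on conveyor motor",
    "lubricated gearbox and checked oil level",
    "tightened loose bolts on pump housing",
]

# Fit a TF-IDF vectorizer on the corpus of action descriptions.
vectorizer = TfidfVectorizer()
doc_vectors = vectorizer.fit_transform(corpus)

# Vectorize a new description and retrieve the most similar known action.
query = "motor bearing replacement"
query_vector = vectorizer.transform([query])
scores = cosine_similarity(query_vector, doc_vectors).ravel()
best = scores.argmax()
print(f"Closest match: {corpus[best]!r} (score={scores[best]:.2f})")
```

A model like this is trivially explainable (each dimension is a word weight) and runs entirely on-premise, which is exactly the trade-off the list above argues for.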