• by Imanari on 8/5/2020, 1:56:50 PM

    In your example, the primary clue for finding matches seems to be the name, i.e. 'ABC' + Corp/Des/etc. So how about doing some fuzzy string matching? Once you have done this, you can identify the edge cases and additionally group by date or whatever else helps.
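
    A rough sketch of what I mean, using only Python's standard-library difflib. The sample names, the helper names and the 0.5 threshold are made up for illustration, not taken from your data, so treat it as a starting point rather than a recipe:

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def candidate_matches(name, candidates, threshold=0.5):
    """Return (candidate, score) pairs scoring at least `threshold`, best first.

    `threshold` is a hypothetical cutoff; tune it against real records.
    """
    scored = [(c, similarity(name, c)) for c in candidates]
    kept = [(c, s) for c, s in scored if s >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Hypothetical names standing in for the records in S.
s_names = ["ABC Corp", "ABC Des", "XYZ Ltd"]
matches = candidate_matches("ABC", s_names)
for candidate, score in matches:
    print(f"{candidate}: {score:.2f}")
```

    Anything scoring close to the threshold is a good candidate for the manual edge-case review.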

    So you would have 'ABC' in L and a selection of candidate matches in S. If not all of the matches in S actually belong to the ABC in L, you are faced with the Knapsack Problem[0], which you can solve with various methods (sorry, no expert here).

    [0] https://en.wikipedia.org/wiki/Knapsack_problem

  • by doonesbury on 8/4/2020, 3:29:54 PM

    You mean comparing data? For what purpose (knowing that would help assess a solution) ... and why ML? Surely a rules engine is much more practical.