OCRing Music from YouTube with Common Lisp

by superdisk on 1/5/2025, 12:29:48 PM with 18 comments
  • by notpublic on 1/6/2025, 12:10:57 PM

    Instead of doing a diff, I'm curious whether normalized compression distance (NCD)[1] would yield a better result. It is a very simple algorithm:

    To compare two images, i1 and i2:

      l1  = length(gzip(i1))
      l2  = length(gzip(i2))
      l12 = length(gzip(concatenate(i1, i2)))
    
      ncd = (l12 - min(l1, l2))/max(l1, l2)
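    As a rough sketch, the formula above could be written in Python like this. The function name `ncd` and the choice to pass each image as raw bytes are assumptions for illustration, not something taken from the article. One caveat: DEFLATE only looks back 32 KiB, so for inputs much larger than that gzip cannot see the first image while compressing the second, and the distance becomes less informative.

```python
import gzip


def ncd(i1: bytes, i2: bytes) -> float:
    """Normalized compression distance between two byte strings.

    Values near 0 suggest the inputs share most of their content;
    values near 1 suggest they share almost nothing.
    """
    # Compressed size of each input on its own.
    l1 = len(gzip.compress(i1))
    l2 = len(gzip.compress(i2))
    # Compressed size of the two inputs concatenated.
    l12 = len(gzip.compress(i1 + i2))
    return (l12 - min(l1, l2)) / max(l1, l2)
```

    To compare two video frames you would pass their pixel data as bytes, e.g. `ncd(frame_a, frame_b)`, and treat a small result as "probably the same page of the score". The threshold would need tuning by hand.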
    
    Here is a nice article where I found out about this long ago.

    https://yieldthought.com/post/95722882055/machine-learning-t...

    From the article:

    "Basically it states that the degree of similarity between two objects can be approximated by the degree to which you can better compress them by concatenating them into one object rather than compressing them individually."

    [1] https://en.wikipedia.org/wiki/Normalized_compression_distanc...

  • by varjag on 1/6/2025, 9:59:59 AM

    If you're also getting a 500:

    https://web.archive.org/web/20250106075631/https://nickfa.ro...

  • by xenonite on 1/6/2025, 12:17:00 PM

    To OCR music scores, see e.g., https://digitalcollection.zhaw.ch/items/276365b9-0a20-4286-a...

  • by rcarmo on 1/6/2025, 8:58:28 AM

    Holy cow.

  • by kanwisher on 1/6/2025, 8:03:11 AM

    honestly this would be better with an AI model