Other articles


  1. Basics of Near Duplicate Detection

    Finding duplicate files is easy; anyone can do it. Finding files that are almost identical is harder, but it is useful for tasks such as detecting plagiarism. In this article, I'll present a simple Python program that calculates the textual similarity of two documents.

    The basic idea is to reduce …

    read more
  2. Are Link-Sharing Services Irrelevant?

    You can use RSS to easily follow a few high-profile websites, and use link-sharing services like Slashdot or Digg to discover popular web content. But that's like reading a classic newspaper and some magazines: The information provided may have a higher chance of being relevant to you, but there's still …

    read more
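
To give a rough idea of what "textual similarity of two documents" can mean in the first teaser above, here is a minimal sketch. It is an assumption for illustration only, not necessarily the approach the article takes: it breaks each text into overlapping word windows ("shingles") and compares the two sets with the Jaccard index. The function names `shingles` and `jaccard_similarity` and the window size `k=3` are hypothetical choices.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word windows) in text."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < k:
        # Texts shorter than k words yield a single shingle, or none if empty.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard_similarity(a, b, k=3):
    """Return |A & B| / |A | B| for the shingle sets of texts a and b."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa and not sb:
        # Two empty texts are treated as identical (an arbitrary convention).
        return 1.0
    return len(sa & sb) / len(sa | sb)


if __name__ == "__main__":
    doc1 = "the quick brown fox jumps"
    doc2 = "the quick brown fox leaps"
    print(jaccard_similarity(doc1, doc2))
```

A score of 1.0 means the shingle sets are identical and 0.0 means they share nothing; where to draw the "near duplicate" line between those is a judgment call that depends on the documents.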
