About: This is a technology blog by Matthias Friedrich, a software developer and architect from Karlsruhe, Germany.
Tag Archives: computer science
I’ve been playing with scikit-learn recently, a machine learning package for Python. While there’s great documentation on many topics, feature extraction isn’t one of them. My use case was to turn article tags (like the ones I use on my blog) …
Finding duplicate files is easy; anyone can do it. Finding files that are almost identical is more difficult, but it’s useful for tasks like detecting plagiarism. In this article, I’ll present a simple Python program that calculates the textual …
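To give a rough idea of what "almost identical" can mean in code, here is a minimal sketch of one common approach: split each text into overlapping word n-grams ("shingles") and compare the two sets with the Jaccard coefficient. This is an assumption for illustration only and not necessarily the method used in the article; the function names `shingles` and `jaccard_similarity` and the default shingle size of 3 are hypothetical.

```python
def shingles(text, size=3):
    """Return the set of overlapping word n-grams (shingles) in text."""
    words = text.lower().split()
    if len(words) < size:
        # Text shorter than one shingle: treat it as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard_similarity(a, b, size=3):
    """Jaccard coefficient of the shingle sets of a and b, in [0.0, 1.0]."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa and not sb:
        return 1.0  # two empty texts count as identical
    return len(sa & sb) / len(sa | sb)


if __name__ == "__main__":
    first = "the quick brown fox jumps"
    second = "the quick brown fox sleeps"
    print(jaccard_similarity(first, second))
```

A score of 1.0 means the shingle sets are equal, and 0.0 means they share nothing. Larger shingle sizes make the measure stricter about word order.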
Computer science and software development are two entirely different things. The former is a science, the latter is mostly craftsmanship, still struggling to become an engineering discipline in its own right. Being a good computer scientist doesn’t make you a …
You can use RSS to easily follow a few high-profile websites, and link-sharing services like Slashdot or Digg to discover popular web content. But that’s like reading a classic newspaper and some magazines: The information provided may have a …
Going through old CACM issues, I discovered a paper (PDF) on stream processing. A common problem in this field is to find frequent items in a data stream when you only get one pass through the data and you need …
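The one-pass constraint can be illustrated with a small sketch. The excerpt does not say which algorithm the paper covers, so the code below uses the well-known Misra-Gries summary as an assumed example: it keeps at most k - 1 counters and guarantees that every item occurring more than n/k times in a stream of n items is still among the candidates at the end. The function name `misra_gries` and the parameter `k` are hypothetical.

```python
def misra_gries(stream, k):
    """One-pass frequent-item candidates using at most k - 1 counters.

    Every item that occurs more than n/k times (n = stream length) is
    guaranteed to be a key of the returned dict. The stored counts are
    lower bounds, not exact frequencies, and the result may also contain
    items that are not actually frequent.
    """
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k - 1:
            counters[item] = 1
        else:
            # No free counter: decrement all and drop those that reach zero.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


if __name__ == "__main__":
    # "a" makes up 6 of 10 items, so it must survive with k = 2.
    print(misra_gries("aabacadaea", k=2))
```

Because the summary can contain false positives, a second pass over the data is needed if exact counts matter; when only one pass is possible, the result has to be treated as a candidate list.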