Sort Ubuntu packages by popularity

Use case

So you want to install a Twitter client, but you don't know which one. You type in apt-cache search twitter client, which gives something like this:

choqok - KDE micro-blogging client
gwibber - Open source social networking client for GNOME
hotot - lightweight microblogging client
pidgin-microblog - Microblogging plugins for Pidgin
smuxi - graphical IRC client
tircd - ircd proxy to the twitter API
ttytter - console Twitter client
turpial - Light, fast, and fully functional Twitter client written in Python
twidge - Unix Command-Line Twitter and Identica Client
twittering-mode - Twitter client for Emacs

As usual in the open-source world, there are several Twitter clients. Which one should you use?

The most popular

Ubuntu and Debian have a popularity contest for packages. If you install the popularity-contest (popcon) package, it periodically submits the list of installed packages to Ubuntu. This data is aggregated and published in a list, so that you can see which packages are the most popular. This can be used to sort the output of apt-cache by popularity.
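The published list has to be on disk before it can be used. Here is a minimal sketch of fetching it, assuming the list is offered as a gzip-compressed file at the address below. The URL and the function names are assumptions for illustration and are not taken from the popcon documentation, so check the popcon site for the current location.

```python
import gzip
import urllib.request

# Assumed location of the compressed list; verify it before relying on it.
POPCON_URL = "http://popcon.ubuntu.com/by_inst.gz"

def unpack_by_inst(compressed):
    """Decompress the gzip payload and decode it to text."""
    # Maintainer names may not be valid UTF-8, so undecodable bytes are replaced.
    return gzip.decompress(compressed).decode("utf-8", errors="replace")

def download_by_inst(url=POPCON_URL, target="by_inst"):
    """Fetch the list and store it uncompressed in the file `target`."""
    with urllib.request.urlopen(url) as response:
        text = unpack_by_inst(response.read())
    with open(target, "w", encoding="utf-8") as out:
        out.write(text)
    return target
```

Calling download_by_inst() once should leave a file named by_inst in the current directory. The list changes slowly, so there is little reason to download it on every search.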

This script reads the packages ranked by popularity from the file by_inst in the current directory. It also reads the output of apt-cache from standard input and prints the packages sorted by popularity:


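import sys

# Map each package name to its rank in the popularity contest.
package_ranks = {}
# Maintainer names in by_inst may not be valid UTF-8, so replace bad bytes.
with open('by_inst', encoding='utf-8', errors='replace') as popfile:
    for line in popfile:
        # Skip the comment header and the separator line before the totals.
        if line.startswith('#') or line.startswith('----'):
            continue
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        package_ranks[parts[1]] = int(parts[0])

def get_rank(package_line):
    # apt-cache prints "name - description"; the name is the lookup key.
    package_name = package_line.split(' - ')[0]
    # Packages that are missing from the ranking sort last.
    return package_ranks.get(package_name, sys.maxsize)

# Read the apt-cache output from standard input and print it by rank.
for line in sorted(sys.stdin.readlines(), key=get_rank):
    print(line, end='')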
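To see the sorting step in isolation, here is a small sketch that needs neither by_inst nor apt-cache. The ranks in the dictionary are invented for illustration only; real values come from the popcon list.

```python
import sys

# Invented ranks, for illustration only (lower number = more popular).
package_ranks = {"gwibber": 120, "choqok": 850, "turpial": 2300}

def get_rank(package_line):
    """Return the rank of the package named at the start of an apt-cache line."""
    name = package_line.split(" - ", 1)[0].strip()
    # Unknown packages get the largest possible key, so they sort last.
    return package_ranks.get(name, sys.maxsize)

search_output = [
    "choqok - KDE micro-blogging client",
    "ttytter - console Twitter client",
    "turpial - Light, fast, and fully functional Twitter client written in Python",
    "gwibber - Open source social networking client for GNOME",
]

ranked = sorted(search_output, key=get_rank)
for line in ranked:
    print(line)
```

With these invented ranks, gwibber comes first and ttytter, which has no rank, comes last. With the real script you would pipe the search into it, for example apt-cache search twitter client piped into the script saved under a name of your choice, such as the hypothetical sortpop.py.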