20:09:44 <MadameZou> #startmeeting
20:09:44 <MeetBot> Meeting started Thu Dec 16 20:09:44 2010 UTC. The chair is MadameZou. Information about MeetBot at http://wiki.debian.org/MeetBot.
20:09:44 <MeetBot> Useful Commands: #action #agreed #help #info #idea #link #topic.
20:09:52 <enrico> Hello
20:10:01 <MadameZou> #topic Debian package information
20:10:18 <enrico> I'm Enrico Zini and I'll take you on a trip through Debian package information
20:10:28 <enrico> If you have any questions, please direct them to #dw-question, and prefix them with "QUESTION: "
20:10:34 <enrico> Feel free to ask any kind of question, even if it might seem silly -- the answer could help other people
20:10:39 <enrico> aghisla will take care of posting them here, when it's most appropriate
20:10:56 <enrico> I am going to talk about information about Debian packages
20:10:58 <enrico> there is a lot of it
20:11:17 <enrico> some we see every day, some we can't even begin to suspect could possibly exist, but it is there
20:11:42 <enrico> The thing you see every day in Debian is packages
20:11:59 <enrico> there are loads of them, we usually say in the order of 25000 or so
20:12:20 <enrico> we install and remove packages from our systems, upload new versions of them and so on
20:12:35 <enrico> we're probably all used to seeing package information with apt-cache
20:12:48 <enrico> for example, "apt-cache show debtags" shows information about the package "debtags"
20:15:54 <enrico> Every package has a name, the format of the name is defined by the Debian policy: for example, it cannot contain underscores, but it can contain dashes
20:15:54 <enrico> Then there is a version, with a more interesting format.
The policy defines it as well as how to compare two versions, which is a remarkably interesting problem
20:15:54 <enrico> Then there's all the rest that we're used to seeing, like dependencies, descriptions, maintainers and so on
20:15:54 <enrico> Information about packages is used for many different tasks, some are performed by machines and some by humans
20:15:55 <enrico> so you have dependencies, that package managers such as apt or software-center use to decide what is needed for a package to work
20:15:55 <enrico> and you have descriptions, which are used by people to decide whether they'd like to install a package or not
20:16:44 <enrico> these tasks can be nontrivial: dependency resolution is a complex task (so complex there are research centers devoted to studying the problem, which is great because they hire Debian people :)
20:16:55 <enrico> and another complex task is to find the packages you need
20:17:14 <enrico> very often we really really need a package that is in Debian but we don't know how to find it
20:17:23 <enrico> so a good description is important
20:17:45 <enrico> not only to find a package, but to evaluate it, and to compare it with its alternatives before installing it, and so on
20:17:50 <enrico> you're probably familiar with it
20:18:05 <enrico> There are other interesting things in the output of apt-cache
20:18:46 <enrico> like "how big it is".
Maybe nowadays we don't care anymore how big the software we install on an average desktop is, but it does make sense on smaller systems
20:19:16 <enrico> it'd be nice to have a package manager able to compute the space that would be used by a package and all its dependencies; at the moment we don't have that
20:19:34 <enrico> (talking about package information is a good way to get cool ideas for package managers :)
20:20:01 <enrico> Recently there are new fields in the output of apt-cache: Homepage: and Tag:
20:20:31 <enrico> Homepage is nice: we can learn more about a package by just visiting its website. It's a simple addition that makes package managers much more useful
20:21:04 <magellanino> enrico: but wasn't this field present before?
20:21:07 <enrico> (it'd be nice to have a system that automatically checks the Homepage: fields for broken links: I'm not aware of it existing yet)
20:21:25 <enrico> magellanino: please ask questions in #dw-question, and prefix them with "QUESTION: ": aghisla will take care of posting them here
20:21:37 <magellanino> enrico: ah ok sorry
20:22:53 <enrico> anyway, Homepage and Tag are recent additions. Recent as in, 2 or 3 years IIRC
20:22:54 <enrico> "Tag:" holds categories for packages. There are lots of them available for use, we'll come back to them later when I cover debtags
20:23:33 <enrico> A useful thing about Tag: seen together with the package descriptions is that it gives you lots of extra information like "what programming language is this written in?" "what UI toolkit does it use?"
that could be interesting but should really not be in the package descriptions
20:24:10 <enrico> The information you see in "apt-cache show debtags" comes from "Packages files"
20:24:33 <enrico> they are found in Debian mirrors and CDs and acquired by Apt when you do "apt-get update"
20:24:55 <enrico> if you do /var/lib/apt/lists/ you can see your local copies of Packages files acquired by apt
20:25:04 <enrico> ...if you do "ls /var/lib/apt/lists/" sorry
20:25:49 <enrico> Here is where you find Packages files on mirrors: http://ftp.debian.org/debian/dists/squeeze/main/binary-armel
20:25:55 <enrico> (that is for the people who run armel)
20:26:27 <enrico> Every combination of distribution, suite and architecture has a different Packages file
20:26:53 <enrico> so in any computer, apt needs to download at least 2 of them: the one for your architecture and the one for the "all" architecture
20:27:33 <enrico> then it does some merging and indexing and builds the .bin files in /var/cache/apt that it uses to access the package information efficiently
20:27:39 <enrico> Any questions so far?
20:27:51 <MadameZou> there's the one by magellanino
20:28:18 <enrico> MadameZou: I think I have answered it
20:28:22 <MadameZou> :)
20:28:44 <MadameZou> so, we are ok
20:28:50 <enrico> Ok. The information we have seen so far is about "binary" packages
20:30:09 <enrico> a binary package is the one you install on your machine. It's called binary because it's been made ready for use by the computer.
It is not the source that you download from the package author: it's been compiled and somehow preinstalled so that it can be unpacked in your system
20:30:21 <enrico> in Debian we also have "Source" packages
20:30:49 <enrico> that is, you can download the source code of any package in Debian
20:31:10 <enrico> if you do "apt-cache showsrc debtags" you find information about the sources of Debtags
20:31:27 <enrico> it doesn't work on all systems: you need to have source entries in /etc/apt/sources.list
20:31:50 <enrico> Something like "deb-src http://ftp.uk.debian.org/debian/ sid main"
20:32:11 <enrico> if you have "deb-src" sources, apt will download Sources files from the mirrors, and make them available for you when you do "apt-cache showsrc"
20:32:30 <enrico> In http://ftp.debian.org/debian/dists/squeeze/main/source/ you can see the Sources files in the mirror
20:33:35 <enrico> You have a different Sources file per combination of (distribution, suite). But you have a single source package for all architectures. The source package will be compiled once per architecture to build the various binary packages
20:34:36 <enrico> Let's see an example of source package information. You can run "apt-cache showsrc debtags"; I've pasted the output to http://paste.debian.net/102567/ in case you don't have sources available in your /etc/apt/sources.list
20:35:22 <enrico> Some information, like the package name, version and maintainers, is similar. Some is different: for example we have "Build-Depends" instead of "Depends".
20:35:43 <enrico> Build-Depends are the binary packages you need to build this source package
20:36:11 <enrico> They are usually different from Depends: for example you need "gcc" to compile many packages, but not to run them.
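[Editor's note] The "Build-Depends" versus "Depends" distinction above can be made concrete with a small sketch. The Python below is only an illustration, not how apt does it: it parses one "Field: value" stanza of the kind printed by "apt-cache showsrc" and lists the package names in a dependency field. The stanza, the package names in it, and the helper names parse_stanza and dependency_names are all invented for this example.

```python
# Hypothetical stanza, invented for illustration (not real package data).
SAMPLE = """\
Package: example-src
Version: 1.0-1
Build-Depends: debhelper (>= 7), gcc | clang,
 libexample-dev
"""


def parse_stanza(text):
    """Parse one 'Field: value' stanza into a dict.

    Lines starting with whitespace continue the previous field.
    """
    fields = {}
    last = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if line[0] in " \t" and last is not None:
            fields[last] += " " + line.strip()
        else:
            name, _, value = line.partition(":")
            last = name.strip()
            fields[last] = value.strip()
    return fields


def dependency_names(value):
    """Return bare package names from a Depends-style field.

    Version constraints such as '(>= 7)' are dropped, and alternatives
    separated by '|' are listed one after the other.
    """
    names = []
    for clause in value.split(","):
        for alternative in clause.split("|"):
            alternative = alternative.strip()
            if alternative:
                names.append(alternative.split()[0].split("(")[0])
    return names


if __name__ == "__main__":
    stanza = parse_stanza(SAMPLE)
    print(dependency_names(stanza["Build-Depends"]))
```

Running the same helper on the "Depends" field of a binary package stanza would show the run-time packages instead of the build-time ones.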
20:36:54 <enrico> "Vcs-Browser:" and the other "Vcs-*" fields are another very welcome recent addition: they tell you where you can find the sources of the package in a version control system
20:37:29 <enrico> suppose you find a bug in a package, you can use "apt-cache showsrc" to see where its code is, check it out and start hacking on it
20:38:23 <dapal> enrico: QUESTION: is that (Vcs-*) an upstream source or a debian source?
20:38:51 <enrico> IIRC it's the *debian* source, but please correct me if I remember wrong
20:39:14 <dapal> (yup, it's the Debian source)
20:39:31 <enrico> there is a difference because often the Debian developers have a version control system where they do the packaging, which is not necessarily the same one used by the software author
20:39:59 <enrico> In the description of _binary_ packages you have an interesting header which doesn't always show, and it tells you the name of the source package
20:40:10 <enrico> it's not always the same: one source package can generate many binary packages
20:40:38 <enrico> If you do, for example, "apt-cache show libc6" you'll see "Source: eglibc"
20:41:03 <enrico> there is no "libc6" source package: "libc6" is generated by the "eglibc" sources
20:41:26 <enrico> so apt tells you that if you want to see the sources of libc6, you need to get the "eglibc" source package
20:41:37 <enrico> the "Source:" header is omitted when the names of the source and binary packages are the same
20:42:08 <enrico> "apt-cache showsrc libc6" is smart enough to see the "Source:" header and show you the right source package anyway
20:42:59 <enrico> you have the opposite header in "apt-cache showsrc": for example, "apt-cache showsrc eglibc" has: "Binary: libc-bin, libc-dev-bin, glibc-doc, eglibc-source, locales, locales-all, [...]"
20:43:50 <enrico> eglibc is a source package that generates many binary packages :)
20:43:50 <enrico> So we've seen binary packages and source packages
20:43:50 <enrico> Any questions?
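[Editor's note] The rule just described (use "Source:" when present, otherwise fall back to the binary package name) fits in a few lines. This is a minimal Python sketch, assuming the stanza has already been parsed into a dict of field names to values; the function name source_package is made up here. It also assumes that a "Source:" value may carry a version in parentheses after the name, which the sketch strips.

```python
def source_package(fields):
    """Return the source package name for a binary package stanza.

    'fields' is a dict such as {"Package": "libc6", "Source": "eglibc"}.
    When there is no Source field, the source package has the same
    name as the binary package.
    """
    source = fields.get("Source")
    if source is None:
        return fields["Package"]
    # Drop an optional "(version)" suffix, keeping only the name.
    return source.split("(")[0].strip()


if __name__ == "__main__":
    print(source_package({"Package": "libc6", "Source": "eglibc"}))
    print(source_package({"Package": "debtags"}))
```

The first call follows the libc6 example from the talk; the second shows the fallback when the header is omitted.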
20:44:37 <MadameZou> QUESTION: one source package can generate many binary packages.. I don't understand this
20:44:45 <enrico> Good question
20:45:09 <enrico> Think of a source package as the real software you find on the internet
20:45:14 <enrico> for example, "Open Office"
20:45:23 <enrico> or "Firefox"
20:45:33 <enrico> we normally have one source package for them
20:45:52 <enrico> but after compiling it, their build system generates lots of different packages
20:46:13 <enrico> because we don't always want to install all of Open^WLibre Office, or all the translations of Firefox
20:46:49 <enrico> so if you run "apt-cache search openoffice.org" you'll find lots of binary packages, and likely they're all pieces of the single big source
20:47:12 <enrico> does it make sense?
20:47:36 <magellanino> yes yes I understand now, thanks enrico
20:47:41 <enrico> :)
20:48:01 <MadameZou> no more questions, enrico
20:48:17 <enrico> So we've seen source packages and binary packages. Maybe it would make sense to count the size of Debian in terms of source packages, but that'd make Debian feel much smaller :)
20:48:34 <enrico> like only 15000 packages or so <grin>
20:48:43 <enrico> which is still a lot
20:49:20 <enrico> Let's see that "Tag:" header
20:49:31 <enrico> it's been introduced to help deal with a large number of packages
20:50:00 <enrico> in the past there was only the "Section:" header, which still exists: you also see it in "apt-cache show"
20:50:13 <enrico> Section is limited, in that one package can only be in one section
20:50:45 <enrico> would you put Evolution in the "mail" section or in the "gnome" section?
20:51:33 <magellanino> mail
20:51:36 <enrico> Both would be appropriate.
So we started working on "Debtags" as a way to have a far better category system
20:52:17 <enrico> You can see that the Tag: header has several tags, not just one
20:52:26 <enrico> but Debtags is not just "multiple sections"
20:53:20 <enrico> Every tag is made of two parts, separated by "::"
20:53:33 <enrico> For example, debtags is "role::program"
20:54:05 <enrico> the first part identifies a group of similar tags, which, following the habits of library science, is called a "facet"
20:54:36 <enrico> so role::* is the group of all roles one package can have in the system
20:54:44 <enrico> program, library, documentation, plugin and so on
20:55:17 <enrico> there is a fantastic web page to browse all available tags, let me find the URL
20:56:01 <enrico> http://debtags.alioth.debian.org/vocabulary/
20:56:23 <enrico> You can see all the facets (groups of tags) and, clicking on a facet, you can see all the options for that group
20:56:29 <enrico> ...all the tags for that group
20:57:09 <enrico> there are 620 different tags available at the moment
20:57:10 <enrico> quite a lot
20:57:23 <enrico> if we didn't have the groups, it'd be really complicated to keep track of them
20:57:38 <enrico> each group is a different "point of view" from which we look at Debian
20:58:15 <enrico> this is called "Faceted Classification" and the Debtags simplification of it is described here: http://debtags.alioth.debian.org/paper-debtags.html#debtags-theoretical-foundations
20:58:46 <enrico> the theory behind it is fascinating, but I won't go into it now
20:59:18 <enrico> There are many things in the Debtags project that are worth looking into
20:59:37 <enrico> one of them is the idea of looking at Debian from different points of view
20:59:49 <enrico> we like to say that "Debian is the universal operating system"
21:00:34 <enrico> but saying that you can do everything with Debian is not really helpful if somebody has a specific need
21:00:51 <enrico> so by using a group of tags we can give examples
of what is available for a given field
21:01:02 <enrico> See the "Accessibility Support" group of tags
21:01:34 <enrico> "Biology", "Software Development", "Games and Amusement", "Security", "World Wide Web"...
21:01:59 <enrico> they are all examples of how rich Debian is
21:02:26 <enrico> Debtags is designed so that there are at least 7 packages for each tag
21:02:47 <enrico> this makes tags very concrete, they really represent some bit of Debian
21:03:07 <enrico> (7 comes from http://en.wikipedia.org/wiki/The_Magical_Number_Seven,_Plus_or_Minus_Two)
21:03:28 <dapal> enrico, QUESTION: is there a command to search for tags? Say, like if I search for a media player I do: "apt-cache search audio"
21:03:42 <enrico> dapal: very good question
21:03:57 <enrico> having so many (620) different tags calls for a search system for tags
21:04:36 <enrico> over time we put together some interestingly scary smart algorithms to find tags
21:05:14 <enrico> one you can see in "axi-cache": if you have it installed, you can run, for example, "axi-cache search --tags image editor"
21:05:24 <enrico> and it will give you a list of tags that could be related to those keywords
21:05:43 <enrico> axi-cache comes with the package "apt-xapian-index": it is installed by default in many systems, but not all of them
21:05:47 <enrico> more of that later
21:06:20 <enrico> "goplay" is a wonderful little program that shows off the "many different points of view" idea
21:06:43 <enrico> thanks to Miriam Ruiz
21:06:48 <enrico> Here is a screenshot: http://www.miriamruiz.es/img/goplay-1.0_screenshot.png
21:06:58 <enrico> it is a program to find packages, but only *game* packages
21:07:15 <enrico> it can show screenshots, and allows filtering by game categories
21:07:40 <enrico> it's a program that does more with less: it *hides* some information to show only the information that really matters in a given field
21:08:37 <dapal> enrico: <dunetna> QUESTION: Do you think, in the future, Section: header could
disappear (and only use debtags)?
21:08:40 <enrico> in the package goplay there are also goadmin, golearn, gosafe and goweb, which are similar to goplay but show a different point of view (for example, system administration)
21:08:51 <enrico> thanks, good question
21:08:54 <enrico> Probably not
21:09:08 <enrico> Sections will be around for quite a while
21:09:33 <enrico> showing an example in a sec
21:09:40 <dapal> enrico: (I also have another question queued, regarding debtags+UDD, do you want it now or later?)
21:10:19 <dapal> in the meanwhile:
21:10:20 <dapal> <valhalla_> QUESTION: the goplay screenshot shows Sexual Content and Violence Content facets, but I can't find them in the Debtags - Vocabulary Browser, why?
21:11:05 <enrico> For example "Section: oldlibs" is used to automatically track packages that need to be ported to newer libraries
21:11:53 <enrico> There is a big difference between "Section" and "Tag": Section is maintained by ftp-master and Tag is maintained by developers and users
21:13:03 <enrico> so Section is a field that can be used to make important decisions about a package, because its editing is much more controlled
21:13:40 <enrico> Section is going to be something that's used to sort of track the state of a package in Debian, and Tag something used to find a package in Debian
21:13:51 <enrico> I see them evolving in different directions
21:14:08 <enrico> although I reckon this is a rather subtle distinction at this stage
21:14:17 <enrico> more on the "how Debtags is maintained" later
21:14:42 <enrico> about "Sexual Content and Violence Content", that was an experiment by Miriam
21:14:47 <enrico> a very big work, actually
21:15:18 <enrico> debtags allows external tag sources, listed in /etc/debtags/sources.list
21:15:31 <enrico> it will download them and merge them similarly to what apt does with package information
21:15:56 <enrico> this can be used to provide tags that Debian cannot maintain in a standard way
21:16:25 <enrico>
for example, many people disagree on the methods to rate a game by violence or sexual content
21:17:15 <enrico> while I don't feel confident in picking one method and making it Universal by adding the information in vanilla Debtags, I'm very happy to allow the content to be merged into a system if the user wants
21:17:26 <enrico> http://www.miriamruiz.es/weblog/?p=69 is some information from Miriam about the project
21:18:23 <enrico> I personally haven't heard news about the game rating project for quite a while and I lost the link to the debtags source to use for it
21:19:07 <enrico> we had the idea to ship the ratings in a Debian package one can install, which provides the extra bit of configuration for Debtags
21:19:56 <enrico> (someone should chase Miriam up, and maybe offer her help: it was the main example of external tag data that can be optionally included in a Debian system and I'd hate to lose it)
21:20:43 <enrico> Another example use of external tag sources is to make Debian scale *down* to an organisation
21:21:45 <enrico> for example, a network of schools can maintain its own tag database with things like "school::teacher" "school::primary-education" "school::science-lab" and so on
21:22:32 <enrico> I think the Fuss project played with the idea some time ago, but I don't remember if they eventually deployed it
21:22:44 <enrico> (http://fuss.bz.it/)
21:23:13 <enrico> (The Fuss project is a Debian blend for the Italian speaking minority schools in the German speaking area of Italy)
21:23:39 <enrico> Let's move to how Debtags information is edited
21:23:56 <enrico> Obviously we cannot ask Debian developers to learn how to use 620 tags for their packages
21:24:11 <enrico> We could ask them to, but we can't expect them to actually do it well
21:24:36 <enrico> also, users can be better taggers than developers, because they can be field experts in a way the developer is not
21:24:57 <enrico> every IT person who worked with very specialised customers is well aware
of this
21:25:46 <enrico> it's common to be asked to write, debug or package software that does things that one cannot understand
21:25:53 <enrico> (at least, it happens to me a lot)
21:26:05 <enrico> so tagging is done in a wiki-like way
21:26:24 <enrico> If you go to http://debtags.alioth.debian.org/todo.html you see a list of packages that need tagging
21:26:31 <enrico> click on a package and you'll have the tag editor
21:26:58 <enrico> the editor is a web application that allows anybody to edit the tags of a package
21:27:25 <enrico> it has interesting features, like it tries to suggest tags or ways to improve the classification of a package
21:28:24 <enrico> Debian Developers are of course encouraged to have a look at their packages: in http://qa.debian.org/developer.php?login=enrico for example you can find a "Debtags" link that takes you to a per-developer tagging TODO-list page
21:28:38 <enrico> http://debtags.alioth.debian.org/todo.html?maint=enrico%40debian.org is mine
21:28:56 <enrico> oh dear the interface is telling me off, I should fix some of them
21:29:19 <enrico> note it says things like "There is a 95.4% chance that the tag devel::library is missing"
21:29:25 <dapal> <dunetna> QUESTION: I see the debtag devel::lang:c. Is "lang" a kind of "subfacet"?
21:29:43 <enrico> it uses the same algorithms used by supermarkets to suggest you products to buy :) but I digress
21:29:57 <enrico> dunetna: well spotted
21:31:03 <enrico> I really want to keep the structure of debtags as just 2 levels: facet and tag. We tried trees and gave up because they are extremely difficult to maintain
21:31:56 <enrico> but sometimes we end up having little groups inside a facet, like in devel::lang:c ; it's convenient in that case, but not something I'd like to encourage
21:32:18 <enrico> so I don't like to think of "subfacets" or "subtags"
21:32:48 <dapal> <hlf> QUESTION: if anybody can add tags, is there no SPAM, or false tags?
21:33:08 <dapal> (*cough*)
21:33:12 <enrico> hlf: thanks, good question
21:33:14 <enrico> dapal: :)
21:33:31 <enrico> indeed everybody can edit tags: go to http://debtags.alioth.debian.org/edit.html pick a package and play with it
21:34:00 <enrico> there is a "When done: [Submit]" button that does just that: it saves your edits in the Debtags database
21:34:19 <enrico> and no authentication of any kind: the idea is, you see an issue in the tagging of the package, go there and fix it
21:34:45 <enrico> SPAM is not an issue, because there is no way to send email or even to enter text contents like advertisement: the only thing you can do is add and remove tags
21:35:02 <enrico> there is an issue of quality of course, and possibly vandalism
21:36:04 <enrico> (although if somebody wanted to vandalise debtags, I'd be impressed: there are far more visible and more rewarding things worth messing with :)
21:36:47 <enrico> in case of vandalism, we have daily backups going back to the beginning of the Debtags project: the dataset is small, so backups are cheap :)
21:37:28 <dapal> <dunetna> QUESTION: Can you have two facets with more than one tag for a package? (I'm thinking of works-with-format::)
21:37:30 <enrico> the issue is indeed quality. Sometimes people play with the interface by clicking at random and accidentally submit
21:37:59 <enrico> dunetna: yes.
I'll add details in a moment
21:38:04 <dapal> (oops, sorry, thought you had finished)
21:38:27 <enrico> for ensuring quality, what happens is that all submissions are manually reviewed before entering Debian proper
21:38:50 <enrico> they are somehow aggregated so that they are easier to review
21:39:00 <enrico> the review is done by me and dapal
21:39:11 <enrico> big applause to dapal for helping there
21:39:16 <enrico> \o/
21:39:16 <dapal> \o/
21:39:20 * dapal thanks everybody
21:39:50 <enrico> the plan is to design some interface to allow debian maintainers to review submissions for their own packages
21:40:05 <TetsuyO> \o/
21:40:06 <enrico> something like "people think the tagging of your packages should be changed this way:"
21:40:55 <enrico> that interface is technically feasible, and we have a decently good idea of how to build it, but it still needs to be written
21:41:07 <enrico> I see it happening in a year or so, to give a rough timeframe
21:41:28 <enrico> with regard to two facets with more than one tag per package, you can, indeed
21:41:52 <enrico> another example is the "use::" facet, and the fact that a package can have many uses (think a web browser)
21:42:33 <enrico> in fact, any attempt to add restrictions to the way tags can be used has succeeded in showing a sizable number of unexpected corner cases where the rule would need to be broken
21:42:49 <enrico> therefore it just makes sense to have no restrictions except common sense
21:43:18 <dapal> enrico: two (three) questions in queue
21:43:25 <enrico> dapal: go ahead
21:43:48 <dapal> so, the in-topic one:
21:43:48 <enrico> let's do all the questions before moving on
21:43:49 <dapal> <MadameZou> QUESTION: are you looking for volunteers to review submissions?
21:43:52 <dapal> nice
21:44:09 <enrico> Always looking for volunteers there :)
21:44:20 <enrico> beware the current procedure is...
special
21:45:24 <enrico> so I'm not too actively advertising the need for volunteers because I'm not sure I feel comfortable asking people to do it the way I do it, and I can't think of any better way that can be quickly put into place
21:45:53 <enrico> for that reason I'm very interested in building new "allow people to review" interfaces
21:46:19 <dapal> enrico: next question, <komozo> QUESTION: Could you give us one example (a name or a link) of algo used to implement facets ?
21:46:36 <enrico> MadameZou: but in the meantime, by all means if you'd like to get your hands dirty in it you'd make me very happy
21:47:09 <enrico> komozo: what do you mean by "algorithm used to implement facets"?
21:47:12 <MadameZou> enrico: thanks ;)
21:48:07 <enrico> The "supermarket suggestion" algorithm used to give some tagging suggestions is here: http://www.borgelt.net/apriori.html
21:48:53 <enrico> and http://www.enricozini.org/2007/debtags/axi-query-tags/ has the algorithm used for the smart way of searching tags used in "axi-cache search --tags"
21:49:48 <komozo> enrico: thanks
21:49:51 <enrico> komozo: could those be examples of what you're looking for? If not, please ask more :)
21:50:21 <dapal> enrico: next, <valhalla_> QUESTION: is partial tagging better than no tagging, or is it better not to add a few tags to a package if one is not sure it is missing some other tag?
21:50:22 <enrico> more questions?
21:50:29 <dapal> enrico: (that one, and one more)
21:50:51 <enrico> valhalla_: partial tagging is better than no tagging
21:51:04 <enrico> valhalla_: the wiki philosophy works: you do your bit, someone else will do their bit
21:51:28 <enrico> valhalla_: there are "special::not-yet-tagged" tags in the web interface, removing those means one considers the package acceptably tagged
21:51:41 <enrico> valhalla_: worst case you can add some tags but leave it as "not yet tagged"
21:52:06 <enrico> Another interesting bit of the not-yet-tagged tags is that they are used to keep robots away
21:52:31 <enrico> there are tagging "robots" that use heuristics on package information to decide that some tags could be added
21:52:44 <enrico> but they only work on packages that have "not-yet-tagged" tags attached
21:53:12 <enrico> only a human would remove the "not-yet-tagged" tags, so the tagging robots will respect the superior intelligence of humans and stop interfering :)
21:53:23 <TetsuyO> cool :)
21:53:35 <dapal> enrico: QUESTION: <hlf> can we use udd to search for tags like Implemented in C
21:53:58 <enrico> hlf1: I believe there is a debtags table in UDD, yes
21:54:29 <enrico> (http://wiki.debian.org/UltimateDebianDatabase is the page describing UDD)
21:54:38 <enrico> (for those who haven't heard of it)
21:54:41 <dapal> (queue empty :))
21:54:49 <enrico> it's the Ultimate Debian Database, a big source of information about Debian
21:55:16 <enrico> I'll move on with the trip, possibly a bit quicker (that is, going into a bit less detail) because there is more
21:55:45 <enrico> An interesting newish piece of software is apt-xapian-index, that we quickly mentioned earlier because of axi-cache
21:56:17 <enrico> apt-xapian-index maintains another index of package information in your system, in /var/lib/apt-xapian-index/
21:56:45 <enrico> it does not replace apt's index, but it adds to it: it's designed to support higher-level queries
21:57:05 <enrico> it cannot, however, be used for installing
packages because it cannot do dependency resolution (apt does that well, why reimplement it)
21:57:11 <enrico> axi-cache is a tool that uses apt-xapian-index
21:57:27 <enrico> for example, "axi-cache search image editor" will show you image editors
21:57:44 <enrico> http://paste.debian.net/102573/ is an example in my system
21:58:23 <enrico> It will also suggest terms to improve the search, show a little tag cloud of extra tags you could use (text only, so somewhat simplified)
21:58:28 <enrico> and it will also do spell checking
21:58:45 <enrico> axi-cache search firefax -> Did you mean: firefox ?
21:59:22 <enrico> it has really nice tab completion (dapal being the bash-completion maintainer as well as an extremely helpful fellow)
21:59:36 <enrico> axi-cache search <TAB> will start suggesting you tags
21:59:46 <enrico> axi-cache search image <TAB> will suggest image-related keywords, and so on
22:00:26 <enrico> a really interesting feature of apt-xapian-index is that it can index all sorts of package information, even things that are not found in the Packages file
22:01:04 <enrico> one can implement more indexing features via plugins
22:01:29 <dapal> (another applause?)
22:01:34 <enrico> it's also self-documenting: every indexing run generates an updated version of /var/lib/apt-xapian-index/README which documents what is in the index
22:02:49 <enrico> so debtags tags are indexed for fast lookup in the apt-xapian-index index
22:03:02 <enrico> that's why axi-cache can generate tag clouds and suggest tags so quickly
22:03:29 <enrico> (I want a tag cloud in every graphical package manager! We're almost in 2011!)
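[Editor's note] To give a rough feel for what a text-only tag cloud could be, here is a deliberately naive Python sketch. It is not the algorithm used by axi-cache or apt-xapian-index: it simply assumes that the "cloud" is the list of tags that occur most often among the packages matching a search. The function name tag_cloud, the package names and the tag assignments are invented for illustration.

```python
from collections import Counter

# Hypothetical search results: package name -> tags (illustrative only).
SAMPLE_RESULTS = {
    "pkg-a": ["use::editing", "works-with::image", "role::program"],
    "pkg-b": ["works-with::image", "role::program"],
    "pkg-c": ["works-with::image"],
}


def tag_cloud(results, limit=5):
    """Return the 'limit' most frequent tags as (tag, count) pairs."""
    counts = Counter(tag for tags in results.values() for tag in tags)
    return counts.most_common(limit)


if __name__ == "__main__":
    for tag, count in tag_cloud(SAMPLE_RESULTS, limit=3):
        print(count, tag)
```

A real implementation would likely weigh tags by how well they discriminate between results rather than by raw frequency; that refinement is left out here.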
22:04:31 <enrico> I was looking for a blog post where I show the algorithm for computing tag clouds but I can't find it right away
22:04:48 <enrico> extra information you find in apt-xapian-index:
22:04:55 <enrico> - "newness of a package"
22:05:29 <enrico> - "GUI menu entries for applications provided by this package" (and their icons)
22:05:40 <enrico> - translated package descriptions
22:06:14 <enrico> for example, you can look for "all packages that provide an application in a menu entry"
22:06:46 <enrico> I used this feature to implement fuss-launcher (http://www.enricozini.org/2010/debian/fuss-launcher/)
22:07:12 <enrico> which was interesting, because it uses Debian package information to look, not for packages, but for programs to run
22:07:55 <enrico> ideally you could write an application launcher that shows, grayed, matching applications that are not installed; then you could ask for information about them, and ask it to install them
22:08:06 <enrico> all the data is there, indexed and queryable in a very fast way
22:08:22 <enrico> "newness" of a package is a very new feature
22:09:13 <enrico> in a nutshell, every time apt-xapian-index sees a package that wasn't there before, it takes note of the date
22:09:31 <enrico> so you could search or sort packages by "how recently they appeared in my system"
22:09:46 <enrico> like the "New packages" view of aptitude, but with history
22:10:00 <enrico> "what was that package that was new last week?"
22:10:24 <enrico> there are currently no UIs I know of that use this information, but the data is there
22:10:46 <enrico> ready to be used
22:11:17 <enrico> "newness" is not information about a package per se, but more like information about a package in a specific system
22:11:28 <enrico> other similar information is "is the package installed?"
22:11:43 <enrico> or "was the package installed automatically or was it explicitly requested by the user?"
22:12:00 <enrico> these you usually find in aptitude or apt
22:12:09 <enrico> there is more
22:12:29 <enrico> if you have popularity-contest installed, you get /var/log/popularity-contest with information about when you last used every package in your system
22:12:50 <enrico> it'd be trivial to write a script that shows you the packages you have installed but never used, using that information
22:13:31 <enrico> (I need a plugin to get that information into apt-xapian-index, so that one can sort packages by "when did I last use it" in the axi-cache results)
22:14:32 <enrico> I mentioned apt-xapian-index knows what applications are provided by a package, even for packages that are not installed: it can do so thanks to the information provided in the "app-install-data" package, which contains a copy of the .desktop files contained in any package in Debian
22:14:56 <enrico> it's used to implement "find more applications for this menu" kind of features
22:16:02 <enrico> There is obviously more information about packages: most of it you can find in UDD (http://wiki.debian.org/UltimateDebianDatabase) if you know SQL
22:16:10 <enrico> for example: bug reports
22:16:21 <enrico> or all sorts of information collected by the Debian-QA project
22:17:31 <enrico> ok, that's a general idea of information about Debian packages
22:17:59 <enrico> There is also quite a bit of information about packagers :)
22:18:11 <enrico> http://wiki.debian.org/DDPortfolio is a very good index
22:18:28 <enrico> you can use it to look up everything known about a Debian Developer
22:18:38 <enrico> (people in Front Desk use it quite a bit :)
22:19:29 <enrico> I notice now that I have another page in my notes
22:19:46 <enrico> I could:
22:19:50 <enrico> 1. keep talking for another hour
22:19:57 <enrico> 2. quick fire links about more information
22:20:05 <enrico> 3.
keep the rest for another session 22:20:42 <dapal> enrico: in the meanwhile, question 22:20:49 <enrico> unfortunately I can't see from IRC whether you're all listening keenly or snoring loudly :/ 22:20:56 <dapal> <nadir> QUESTION: the xapian in apt-xapian-index has got a meaning? I got problems to remember the name... knowing something about xapian might help. 22:21:29 <enrico> that is a very good point 22:21:50 <enrico> it's called Xapian because it's built on the Xapian indexing system http://xapian.org/ 22:22:14 <enrico> unfortunately I don't know why they chose that name for their project 22:22:42 <enrico> in hindsight, apt-xapian-index should have had some more memorable name 22:23:10 <enrico> the idea was to not require users to install that package explicitly, but to have it as a dependency of high level package managers 22:23:17 <enrico> for example, goplay depends on apt-xapian-index 22:24:50 <enrico> Ok, so I got two votes for option 1 and none for 2 and 3 22:24:59 <enrico> so there are at least 2 people not snoring loudly :) 22:25:17 <enrico> More package information: popularity contest 22:25:26 <enrico> see http://popcon.debian.org/ 22:25:37 <enrico> for example: http://qa.debian.org/popcon.php?package=debtags 22:25:58 <enrico> it shows some statistics of how many people have that package installed 22:26:24 <enrico> it has all sorts of biases, but it's a way to implement a "sort by popularity" feature in a package manager 22:27:07 <enrico> such a feature has not yet happened because there is still no proper way to acquire that information in a Debian system 22:28:07 <enrico> I'd like to have a way to have it done at "apt-get update" time, maybe with a file in the mirrors next to the Packages file; that would be the proper way to do it, but it would require coordination among several busy people in Debian 22:28:24 <enrico> still, it's in my wishlist of things to maybe tackle at some Debconf 22:28:50 <enrico> Another data source, a really cute one, the EDOS 
Debian Weather: http://edos.debian.net/weather/ 22:29:35 <enrico> It's a research project studying package dependencies 22:30:08 <enrico> they put together some really smart algorithms for checking dependencies, and as a demo they compute how "installable" Debian is on any given day 22:30:23 <enrico> if most packages can be installed fine, they show a sunny icon 22:30:40 <enrico> if there are so and so packages that are uninstallable due to broken dependencies, they show rain 22:31:15 <enrico> if there are a lot of broken packages today, maybe because there is some transition mess going on in sid, they show a thunderstorm icon 22:31:25 <enrico> so you can check what the weather is like before running dist-upgrade 22:31:28 <enrico> genius! 22:32:13 <enrico> I wanted them to make an applet with the Debian Weather to add to my panel, but I'm not aware it has been made yet :-/ 22:32:40 <enrico> Another information source: apt-file 22:32:51 <enrico> you can use it to search the contents of packages 22:33:34 <enrico> for example, you hear a friend say "ah, you can do that by running foo". You run "foo" and you get "Command not found": what package contains foo? 22:33:43 <enrico> "apt-file search foo" will tell you 22:33:52 <enrico> it uses the Contents files in the Debian mirrors 22:34:17 <enrico> if you look at http://ftp.debian.org/debian/dists/squeeze/ you'll see the Contents files 22:34:34 <enrico> they're very big, as they list the name of every file provided by every package 22:34:54 <enrico> in order to run apt-file, you need to run "apt-file update", which will download the right Contents files for your system 22:35:18 <enrico> if you're in a hurry, you can also use "rapt-file", which is also in the "apt-file" package. 
The "r" stands for remote 22:36:10 <enrico> so if you want to find out what is the package that provides GNU R, and "apt-cache search r" is not very helpful, you can use "rapt-file search bin/R" 22:36:36 <enrico> (alternatively, you can use "axi-cache search r" and wow yes it is that smart, it does the right thing) 22:37:18 <enrico> Questions so far? 22:37:47 <dapal> enrico: none in #dw-question 22:38:06 <enrico> If you're interested in tracking what happens in a package, there is also the Package Tracking System 22:38:10 <dapal> err 22:38:13 <dapal> there's a question 22:38:18 <enrico> at http://packages.qa.debian.org 22:38:36 <enrico> dapal: I'll take the question 22:38:47 <dapal> <nadir> Question: i ran across special signs, where apt-cache failed (often a + sign). Is axi-cache a way out? 22:39:24 <enrico> Good question. 22:39:59 <enrico> axi-cache delegates most of indexing and query parsing to Xapian, so it boils down to how Xapian treats special signs 22:40:20 <enrico> it looks like the + sign is handled properly: at least "axi-cache search a+" finds the A+ programming language 22:41:28 <enrico> I wouldn't know for sure about other characters, at least not without looking up the documentation of Xapian's TermGenerator and QueryParser 22:42:07 <enrico> talking about QueryParser documentation, http://xapian.org/docs/queryparser.html is a good piece of documentation for axi-cache 22:42:31 <enrico> you can for example do "axi-cache search mail AND NOT implemented::php" 22:43:15 <enrico> (...implemented-in::php) 22:43:56 <enrico> back to the package tracking system 22:44:40 <enrico> The Package Tracking System (packages.qa.debian.org) is a tool to track everything about a package 22:45:20 <enrico> If you look for example at http://packages.qa.debian.org/d/debtags.html you'll find a page with the package status and all sorts of links to every possible information available about it 22:46:05 <enrico> and in the bottom left of the page there is a little half hidden 
box where you can add your e-mail address to be kept "in the loop" about many things that happen to the package 22:46:50 <enrico> the little selection next to the email field has three options: sub/unsub/opts 22:47:10 <enrico> sub for subscribe, unsub for unsubscribe. Opts for subscription options 22:47:45 <enrico> http://paste.debian.net/102578/ is a list of all available subscription options 22:48:02 <enrico> it is a really nice tool 22:48:29 <enrico> you can get for example a copy of all mails reporting a new bug in a package 22:48:50 <enrico> or a mail with the changelog of every new version of the package uploaded in Debian 22:50:07 <enrico> Finally, still a bit work in progress, we have Debian Data Export: http://dde.debian.net/dde/ which is a web application to make it easy to download information about Debian packages 22:50:27 <enrico> it is currently used as the remote backend for rapt-file 22:51:41 <enrico> I'm looking for an example URL, a sec... 22:52:24 <enrico> For example, http://dde.debian.net/dde/q/bts/bynumber/123456 will give you all available information about Debian bug 123456 22:52:42 <enrico> by default, it shows a page with a bit of documentation 22:53:07 <enrico> but you can add ?t=FORMAT and it will give you the same information in a format of your choice: for example, http://dde.debian.net/dde/q/bts/bynumber/123456?t=json 22:53:45 <enrico> http://dde.debian.net/dde/ lists the formats that are available: currently JSON, YAML, CSV and Python Pickled objects 22:54:11 <enrico> the JSON export is interesting: a DDE plugin can become the backend for a Javascript web application 22:54:50 <enrico> that is, incidentally, how I intend to implement the interface for maintainers to approve changes to the Debtags tags of their packages 22:55:46 <enrico> This more or less brings us to the end of my notes 22:56:23 <enrico> I'd do the final questions 22:56:34 <enrico> (if any) 22:57:12 <enrico> Personal reflection of mine: we have way more information 
than we currently show 22:57:49 <dapal> seems like there are no questions? 22:57:59 <enrico> there is an incredible amount of neat applications that can be built on it 22:58:36 <enrico> I hope this trip can inspire more such applications to appear :) 22:59:01 <dapal> MadameZou: time to #endmeeting ? :) 22:59:13 <MadameZou> dapal: yes sir 22:59:29 <MadameZou> #endmeeting
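The ?t=FORMAT convention described in the session for Debian Data Export can be sketched as a small URL builder. Only the base URL, the bts/bynumber/123456 query path and the "json" value come from the talk; the helper name dde_url is hypothetical, and the identifiers for the other formats (YAML, CSV, pickle) are not given in the log, so the function passes through whatever string it is handed. The sketch only builds the URL and does not contact the service.

```python
from urllib.parse import urlencode

# Query base taken from the example URLs quoted in the session.
DDE_BASE = "http://dde.debian.net/dde/q"


def dde_url(path, fmt=None, base=DDE_BASE):
    """Build a DDE query URL (hypothetical helper for illustration).

    path is the query path such as "bts/bynumber/123456".
    fmt, if given, is appended as the t= parameter, e.g. "json".
    """
    url = base.rstrip("/") + "/" + path.strip("/")
    if fmt is not None:
        url += "?" + urlencode({"t": fmt})
    return url
```

Called as dde_url("bts/bynumber/123456", "json"), it reproduces the JSON example URL quoted in the session; leaving fmt out gives the plain URL that shows the documentation page.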