Wouldn't it be nice ...

if life were perfect?

8/3/07 11:48 pm - Improving quality in Gentoo

I recently posted about making Gentoo a better tool. A requirement for being a good tool is being a tool that doesn't break—thus, we need to improve our quality to a more reliable level. I'm going to mention a few ideas to start this discussion, which I hope the rest of our community will participate in.

First, we essentially have no code review. About the only time any code in Gentoo is reviewed is before and during a developer's training, with a notable exception being the requirement to post eclasses to the gentoo-dev list. More code review ought to improve our quality, sharpen our ability to justify code in words, and build a stronger community of contributors.

How do we increase code review? One idea is to require reviewer approval prior to committing, but this isn't the best answer for Gentoo. We've always been a pretty open community. Developers aren't prohibited by ACL from committing anywhere in our ebuild repository, so I don't think they would accept additional requirements that increased the burden of contributing.

Instead, let's create a gentoo-commits mailing list or RSS feed(s), with full diffs. We should use this tool in many different ways.
  • Each team should use it internally to review all commits to its packages.

  • Mentors should continue to follow their mentees' commits well after they're granted commit access (6 months minimum, and I recommend forever).

  • Mentees should also review their mentors' commits, first to learn and later to review.

  • Every developer should have at least one reviewer and review at least one other developer. This should be formal and documented to ensure it's happening.

These uses will require that the commit diffs be easily filterable by both committer and files affected. RSS feeds could be made available based on developer or herd, and e-mail lists could contain the information in e-mail subjects or headers.
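As a rough illustration of what "filterable by both committer and files affected" could mean, here's a minimal Python sketch. Everything in it is an assumption made for the example: the Commit record, its fields, and the filter_commits() helper are hypothetical, and a real gentoo-commits list or feed would get its data from the version-control system rather than from hand-built records.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class Commit:
    """Hypothetical commit record; the field names are illustrative only."""
    committer: str
    files: Tuple[str, ...]
    summary: str


def filter_commits(
    commits: Iterable[Commit],
    committer: Optional[str] = None,
    path_glob: Optional[str] = None,
) -> List[Commit]:
    """Keep commits by the given committer and/or touching a matching path.

    A filter left as None is not applied. Input order is preserved.
    """
    selected = []
    for commit in commits:
        if committer is not None and commit.committer != committer:
            continue
        if path_glob is not None and not any(
            fnmatchcase(path, path_glob) for path in commit.files
        ):
            continue
        selected.append(commit)
    return selected
```

A mentor could then follow one mentee with filter_commits(feed, committer="mentee"), while a team could watch its own packages with a glob such as "x11-base/*".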

Second, we should improve our unit testing, where the units are individual packages. This can be both automated and performed by developers and arch testers. Although a number of packages have a useful, working test suite, most lack one. For these packages, we should attempt to provide something automatable in src_test() even when a test suite is absent. Failing that, we should print out a checklist in src_test() of tests to perform before stabilizing a package. There should never be an empty src_test().

Another package-level testing approach is to create solid, automated tinderboxes. This remains unrealistic for our entire database of 10,000+ packages, but we should at least get this going for our "system" set and perhaps for some of the most common sets of packages for servers and desktops. Exactly how to set this up remains a question, since there's a lot of tinderbox code floating around. Bonsaikitten has some almost-working code based on swegener's work; Catalyst has some tinderboxing capability; or we could look into using Mozilla's tinderbox.

Third, we should improve our integration testing, on the entire repository level. Our main source of testing here will be our users, because they have infinitely more combinations of build options and hardware than we can reproduce on Gentoo infrastructure. But how can we take advantage of this testing to improve our quality? By creating an additional, time-lagged set of rsync mirrors with additional QA checks, we could allow users who want to test the latest and greatest software to help those who want stable and solid software.

We already have keywords for ~arch and arch, but they're still too mixed. A problem in ~arch ebuilds can break the entire tree for all users. They really need a stronger separation. Perhaps the separate repositories should be ~arch versus stable. But another way to do it is to add a delay to the second set of repos, anywhere from 24 hours to a week. This delay allows us time to encounter major problems in the fast-sync repos, fix them, and carry the fixes over to the slow-sync repo. But we'll need a way to make this really easy to do. It feels like branching with periodic merges, along with cherry picks of major bugfixes, is the right way to do this. Unfortunately, CVS sucks at this. We may need to migrate to a more capable version-control system before this option becomes realistic. In addition to the user testing, we could add a tinderbox into the slow-sync repos to require that they build with the most common configurations.
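To make the delay idea concrete, here's a small Python sketch of the selection rule for a slow-sync repo: carry over anything older than the delay, plus any major bugfix regardless of age. This is only a model of the policy under assumed names (the Commit record, its is_major_fix flag, and select_for_slow_sync() are all hypothetical). It isn't how the mirrors work today, and a real setup would more likely be branches and merges in the version-control system.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, List


@dataclass(frozen=True)
class Commit:
    """Hypothetical commit record for the fast-sync repo."""
    commit_id: str
    committed_at: datetime
    is_major_fix: bool = False  # assumed flag marking a cherry-picked fix


def select_for_slow_sync(
    commits: Iterable[Commit],
    now: datetime,
    delay: timedelta = timedelta(hours=24),
) -> List[Commit]:
    """Return commits that should appear in the slow-sync repo.

    A commit qualifies if it has aged past the delay, or if it is
    flagged as a major fix and so skips the waiting period.
    """
    cutoff = now - delay
    return [
        commit
        for commit in commits
        if commit.is_major_fix or commit.committed_at <= cutoff
    ]
```

The delay parameter covers the range mentioned above, from timedelta(hours=24) up to timedelta(days=7).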

To sum up, I want to increase code review, unit testing, and integration testing. These three things will strengthen Gentoo's quality, reputation, and community.

11/24/06 12:35 am - It's been a while ...

As that famous song [YouTube, last.fm] says, it's been a while. Since last I blogged, that is. Lots of stuff going on in my world, although I haven't been spending enough time on Gentoo lately.

I've joined the Web 2.0 trend, using Google Reader and saving my bookmarks on del.icio.us via the wonderful Firefox plugin. Next thing you know, I'll be reading Digg or another equally trendy Slashdot replacement. The only thing like that I read now is the superb LWN. I just added the Planet Conary feed (thanks ferringb!), because I think there's a lot Gentoo can learn from rPath, since it's got a similar base.

My Gentoo activity is probably best illustrated via the CIA commit stats — only 9 commits this week and 41 this month. A large part of my drop in commit activity lately is thanks to Joshua Baergen (Josh_B on IRC), who's really started to take over X maintenance with double my commits this month, mostly in preparation for X.Org 7.2 as well as the new input-hotplug work for X.Org 7.3.

In Gentoo, we plan to show you a mixture of 7.2 and 7.3. What we try to do is mix and match the latest individual X component releases wherever they're compatible, regardless of which "official release" they come from. So you may already have a number of input-hotplug components, and the only changes you'd need to make are the server and drivers. This mirrors what you saw with 7.0 and 7.1, where the server and drivers lagged behind on 7.0 waiting for Nvidia and ATI while all the other components jumped to 7.1.

I'd like to publicly thank Diego Pettenò (look, I got the accent right!) for his contributions to XCB, both in my overlay and upstream. On that note, I encourage anyone using my overlay to send me patches for anything that doesn't work. There's no reason a personal overlay should only hold commits from that person.

In the past month, I've gotten in touch with two new, exciting ventures using Gentoo. Engine Yard is a Ruby on Rails deployment provider that allows you to purchase virtual clusters, and SiCortex is an innovative HPC cluster creator that uses Gentoo on clusters with 5,800 nodes. Check out the videos on the Engine Yard site; they've got one specifically about their use of Gentoo.

I've also taken on the job of creating a monthly newsletter for the OSEL, which aims to get more students involved in open source at OSU and liaise with the academic side of the university, while the LUG interfaces with the local community and the OSL connects with the broader, outside community. This is really exciting for me because I've got a significant journalism background [PDF] (and no, that contact information is no longer accurate), but I haven't had a chance to use it for a couple of years. I'll share the first issue with all of you once I finish it.

10/26/06 10:10 pm - Current projects

For anyone who's interested, here are the projects I've got going right now. Many of them could use some help, so take a look and let me know if you're interested in any. Roughly in my order of interest:

  • Add the Sugar desktop environment for OLPC — it's in my overlay, but it segfaults on startup of sugar-emulator somewhere in sugar-shell code. Try it out and see whether you can come up with a fix.

  • Port LTSP to Gentoo — pioto, straaken and perhaps another person or two are working with me on this. This involves changes to the client-building plugins, init scripts, and adding some ebuilds. Also, probably creating Seeds for the client and server.

  • Get the rest of the system-config-* GUI tools from Red Hat working — some remain masked. Would appreciate testing and fixing on any that remain masked.

  • Add virt-manager into the main tree from my overlay — haven't got a Xen instance to test it with yet. If anyone would like to test this and let me know, that'd be great.

  • Fix our X init scripts to be more like upstream intended, then fix upstream to be current. havner is taking the lead on this, and I look forward to seeing his work.

  • Add some new science packages, including KiNG and friends from the Richardson lab, CCP4MG, CCTBX and more.

  • The infamous bug #44132 — make multiple MPI implementations simultaneously installable.

  • Resume my occasional series of blog posts on Gentoo in the enterprise, embedded, cluster, and other environments. One post I want to make is how to use the Gentoo installer's CLI frontend to make large, automated installations easy.


And of course these are beyond the usual ongoing maintenance of X, science packages and cluster packages.

9/11/06 12:27 am - [Gentoo] Focusing Gentoo without forking it

Mark Shuttleworth posted (thanks to Steve O'Grady for the link) about how Ubuntu focuses on a few specific areas, while Debian is a more general plateau. One can trivially draw the parallel between Gentoo and Debian, so his points are equally applicable to us. Most Gentoo developers draw the most pleasure from working near the bleeding edge, not from trying to backport fixes and fix old stable software.

Perhaps this is because it requires more creativity and less monotony. I certainly feel more challenged and fulfilled by packaging new software (such as the system-config-* utilities I did last weekend) than by fixing some simple bugs for random stable packages.

But this raises some new questions: Can Gentoo develop specific "peaks" in conflicting areas, without forcing new subdistributions to form that focus on them? If so, how? Stuart Herbert and I threw around some ideas shortly after I started a discussion about whether democracy works for Gentoo, and our lack of goals.

Stuart's idea, which I like, is preparing specific "releases" for certain vertical markets. Yeah, I said "vertical markets." WTH is that? Just a given group of people using Gentoo for a certain purpose, such as a LAMP stack, an HPC cluster or a development workstation. One could create a LiveCD with an installer image tailored to, and preconfigured for, a LAMP server. The key here, as Stuart pointed out in our discussion, is making things "just work," not just installing the packages and leaving the user to set everything up. But we'd need more than just the LiveCD, because clearly people want to maintain the installation. Perhaps adding a series of profiles for these vertical markets could do the trick. Some developers have already tested this concept with GNAP, the Gentoo Network APpliance, but not in a formalized way that pushes into a number of different areas.

8/3/06 09:13 pm - [Gentoo] Gentoo in the enterprise 2: Infrastructures ~= Clusters

Last night, I checked out our server project, which aims to make Gentoo more suitable for the enterprise. The page links to an intriguing paper on infrastructures.org that suggests treating an entire infrastructure of hundreds of machines as a single "virtual machine."

What does this remind you of? Oh yeah, a cluster! Modern cluster architectures consist of a collection of roles, each role fulfilled by any number of machines. Roles include master node to serve out jobs, compute node, file server, and so forth. In the paper, they envisioned an infrastructure as a collection of computers filling various roles: NFS servers, Web servers, clients, etc. that would be imaged out from a gold server. In both cases, the goal is to avoid ever dealing with configuration changes on individual machines but instead find a way to centrally administer the entire group.

Conclusion? Clusters and infrastructure are merging, and expertise in one grows more and more applicable to the other. Clustering and infrastructure people should collaborate more to develop methods to manage the madness.

8/2/06 12:34 pm - [Gentoo] Gentoo in the enterprise, part 1

My Google news feed for Gentoo just turned this up: "GEMS aims to make large-scale Gentoo Linux management easier." The homepage is here if you'd prefer to skip the news story.

It's an interesting overlap with Gentoo's own SCIRE project, the "Systems Configuration, Installation, and Replication Environment", which aims to create a distro-neutral administrative portal for large networks of Linux machines. By the way, SCIRE is looking for more developers. If you're good at Python and could use something like this in your enterprise installation, check out the homepage, or stop by Freenode IRC on #gentoo-scire.

7/31/06 10:54 pm - [Gentoo] Gentoo in ... (Enterprise|Clustering|Embedded|Development|...)

I plan to write a few articles about how to use Gentoo in widely varying areas, from the enterprise to the development box. I'd appreciate some suggestions on good areas to pick, any tips about any of these areas that you use, or anything else relevant.

2/2/06 11:30 pm - [Gentoo] E-Trade and Gentoo

I've come across an interesting eWeek article, via LWN, on E-Trade's use of open source. It's intriguing to see some of the reasoning that goes on in the minds of people like vice president of architecture Lee Thompson.

A couple of Gentoo-relevant excerpts follow my comments. The main gist is that the ability of your system to survive a larger rate of change is what makes it a survivor, that he wants a Gentoo server distro, and that his goal is to balance the aggressive change of the first point against the stability of the second point.

OK, so you know the phenomenon—the phenomena is, the amount of change
that you are sustaining on a Gentoo system is orders of magnitude larger
than the amount of change that a typical proprietary operating system
from anybody—Solaris, HP-UX, mainframes, whatever—[would go through].

Whatever operating system, the rate of patches coming out of the vendor
is much lower than what you enjoy on, you know, my Gentoo laptop or your
Gentoo machine.

And then I started looking, kind of watching this, obviously, from a
technology management perspective. … If you can sustain change faster
than somebody else, you're going to survive, and the person who can't
sustain the change is not going to evolve, and they're going to die off.
This is almost more important a realization than the direct cost
savings, which is still phenomenal.

Yeah, I've been running Gentoo for the 2002 to 2003 time frame, and I've
had several issues. I've said to myself, well, you know, the change rate
is worth it. Change destabilizes, but change is good, and that's kind of
a classic problem. I don't want to suffer from innovator's dilemma at
E-Trade. I want to keep pushing this company very, very hard. So I want
to drive change. The downside of that is if you try to change, you can
destabilize the system.

[Gentoo's Chief Architect] Daniel Robbins always wanted to do a server
variant of Gentoo, which the project, I don't think, ever started, but
it's always been something that was kind of on the mind of the Gentoo
community—that there should be the top-of-tree distro, and behind it
something a little more stable, almost exactly mirroring what the Fedora
community project is and the Red Hat AS series of servers.

So, here I am, the guy who's trying to push change. I work on a Gentoo
box, while our production system is Red Hat AS 3.4, which is very
stable. And so that's kind of a good way of balancing aggressive change
and stability, in our mind.