Donnie Berkholz ([info]spyderous) wrote,
@ 2007-08-03 23:48:00
Entry tags:communication, development, enterprise, gentoo

Improving quality in Gentoo
I recently posted about making Gentoo a better tool. A requirement for being a good tool is being a tool that doesn't break—thus, we need to improve our quality to a more reliable level. I'm going to mention a few ideas to start this discussion, which I hope the rest of our community will participate in.

First, we essentially have no code review. About the only time any code in Gentoo is reviewed is before and during a developer's training, with a notable exception being the requirement to post eclasses to the gentoo-dev list. Increasing our code review ought to result in an increase in quality, in ability to justify code in words, and in a stronger community of contributors.

How do we increase code review? One idea is to require reviewer approval prior to committing, but this isn't the best answer for Gentoo. We've always been a pretty open community. Developers aren't prohibited by ACL from committing anywhere in our ebuild repository, so I don't think they would accept additional requirements that increased the burden of contributing.

Instead, let's create a gentoo-commits mailing list or RSS feed(s), with full diffs. We should use this tool in many different ways.

  • Each team should use it internally to review all commits to its packages.

  • Mentors should continue to follow their mentees' commits well after they're granted commit access (6 months minimum, and I recommend forever).

  • Mentees should also review their mentors' commits, first to learn and later to review.

  • Every developer should have at least one reviewer and review at least one other developer. This should be formal and documented to ensure it's happening.

These uses will require that the commit diffs be easily filterable by both committer and files affected. RSS feeds could be made available based on developer or herd, and e-mail lists could contain the information in e-mail subjects or headers.
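To make the filtering requirement concrete, here is a minimal Python sketch of selecting commits by committer and by affected files. It is not an existing Gentoo tool: the `Commit` record, the `filter_commits` helper, the committer names, and the file paths are all hypothetical, and a real feed would build these records from the commit notifications themselves.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Commit:
    """One commit notification; the field names are assumed for this sketch."""
    committer: str
    files: tuple  # paths relative to the repository root
    diff: str = ""


def filter_commits(commits, committers=None, path_globs=None):
    """Return commits by any listed committer that touch any listed path glob.

    A filter left as None is not applied.
    """
    selected = []
    for commit in commits:
        if committers is not None and commit.committer not in committers:
            continue
        if path_globs is not None and not any(
            fnmatchcase(path, glob)
            for path in commit.files
            for glob in path_globs
        ):
            continue
        selected.append(commit)
    return selected


# Hypothetical sample data, for illustration only.
commits = [
    Commit("alice", ("x11-base/xorg-server/xorg-server-1.0.ebuild",)),
    Commit("bob", ("dev-java/ant/ant-1.0.ebuild",)),
    Commit("alice", ("eclass/example.eclass",)),
]

# A team feed: everything touching categories that start with "x11-".
x11_feed = filter_commits(commits, path_globs=["x11-*/*"])

# A mentor feed: everything committed by one mentee.
mentee_feed = filter_commits(commits, committers={"alice"})
```

The same two filters could back either delivery method: an RSS feed per developer or herd, or a mail header that local filters match on.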

Second, we should improve our unit testing, where the units are individual packages. This can be both automated and performed by developers and arch testers. Although a number of packages have a useful, working test suite, most lack one. For these packages, we should attempt to provide something automatable in src_test() even when a test suite is absent. Failing that, we should print out a checklist in src_test() of tests to perform before stabilizing a package. There should never be an empty src_test().
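One way to track progress toward "never an empty src_test()" would be a simple QA scan over ebuild text. The following is a rough Python sketch under stated assumptions, not an existing QA tool: `classify_src_test` and its category names are hypothetical, and the check is purely textual. In particular, an ebuild reported as "missing" may still get a test phase from an eclass or from the package manager's default behavior, so the result is a hint for a human reviewer rather than a verdict.

```python
import re

# Matches a src_test() definition and captures its body. Assumes the closing
# brace sits at the start of a line, as in conventional ebuild formatting;
# a one-line definition would be reported as "missing".
SRC_TEST_RE = re.compile(r"^src_test\(\)\s*\{(.*?)^\}", re.MULTILINE | re.DOTALL)

# Matches a RESTRICT assignment that mentions "test" on the same line.
RESTRICT_RE = re.compile(r"^RESTRICT=.*\btest\b", re.MULTILINE)


def classify_src_test(ebuild_text):
    """Classify ebuild text as 'restricted', 'missing', 'empty' or 'present'."""
    if RESTRICT_RE.search(ebuild_text):
        return "restricted"
    match = SRC_TEST_RE.search(ebuild_text)
    if match is None:
        return "missing"
    # Ignore blank lines and comments; a body of only ':' or 'true' is a no-op.
    body = [
        line.strip()
        for line in match.group(1).splitlines()
        if line.strip() and not line.strip().startswith("#")
    ]
    if not body or all(line in (":", "true") for line in body):
        return "empty"
    return "present"


# Hypothetical ebuild fragments, for illustration only.
with_tests = 'src_test() {\n\t./selftest || die "tests failed"\n}\n'
no_op = "src_test() {\n\t# nothing yet\n\t:\n}\n"
restricted = 'RESTRICT="test"\n' + with_tests
no_function = "src_compile() {\n\temake\n}\n"

results = [
    classify_src_test(text)
    for text in (with_tests, no_op, restricted, no_function)
]
```

A report built from this kind of scan would show which packages still need either an automated test or a printed checklist.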

Another package-level testing approach is to create solid, automated tinderboxes. This remains unrealistic for our entire database of 10,000+ packages, but we should at least get this going for our "system" set and perhaps for some of the most common sets of packages for servers and desktops. Exactly how to set this up remains a question, since there's a lot of tinderbox code floating around. Bonsaikitten has some almost-working code based on swegener's work; Catalyst has some tinderboxing capability; or we could look into using Mozilla's tinderbox.

Third, we should improve our integration testing, on the entire repository level. Our main source of testing here will be our users, because they have infinitely more combinations of build options and hardware than we can reproduce on Gentoo infrastructure. But how can we take advantage of this testing to improve our quality? By creating an additional, time-lagged set of rsync mirrors with additional QA checks, we could allow users who want to test the latest and greatest software to help those who want stable and solid software.

We already have keywords for ~arch and arch, but they're still too mixed. A problem in ~arch ebuilds can break the entire tree for all users. They really need a stronger separation. Perhaps the separate repositories should be ~arch versus stable. But another way to do it is to add a delay to the second set of repos, anywhere from 24 hours to a week. This delay allows us time to encounter major problems in the fast-sync repos, fix them, and carry the fixes over to the slow-sync repo. But we'll need a way to make this really easy to do. It feels like branching with periodic merges, along with cherry picks of major bugfixes, is the right way to do this. Unfortunately, CVS sucks at this. We may need to migrate to a more capable version-control system before this option becomes realistic. In addition to the user testing, we could add a tinderbox into the slow-sync repos to require that they build with the most common configurations.
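The promotion rule for the slow-sync repos can be sketched in a few lines of Python. This is only an illustration of the policy described above, not a proposal for the actual tooling: the `Commit` fields, the `promotable` helper, and the way a commit gets flagged as broken or as a major fix are all assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Commit:
    """A fast-sync commit; the fields are assumed for this sketch."""
    commit_id: str
    committed_at: datetime
    is_major_fix: bool = False  # candidate for an immediate cherry-pick
    known_broken: bool = False  # flagged by users of the fast-sync repos


def promotable(commits, now, delay=timedelta(hours=24)):
    """Return the commits that may be carried over to the slow-sync repos.

    A commit qualifies once it has aged past the delay, or immediately if it
    is a major bugfix. Commits flagged as broken are always held back.
    """
    ready = []
    for commit in commits:
        if commit.known_broken:
            continue
        aged = now - commit.committed_at >= delay
        if aged or commit.is_major_fix:
            ready.append(commit)
    return ready


# Hypothetical sample data, for illustration only.
now = datetime(2007, 8, 4, 12, 0)
commits = [
    Commit("a1", now - timedelta(days=3)),
    Commit("a2", now - timedelta(hours=2)),
    Commit("a3", now - timedelta(hours=1), is_major_fix=True),
    Commit("a4", now - timedelta(days=2), known_broken=True),
]

ready_ids = [c.commit_id for c in promotable(commits, now)]
```

The `delay` parameter covers the whole range discussed here, from 24 hours to a week, and a tinderbox result could be added as one more condition before a commit is promoted.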

To sum up, I want to increase code review, unit testing, and integration testing. These three things will strengthen Gentoo's quality, reputation, and community.




About tinderboxes
(Anonymous)
2007-08-04 08:11 am UTC (link)
Maybe there could be distributed tinderboxes that spread individual packages out to be compiled on donating users' systems via icecc or similar (gentoo_arch_folding@home, heh)?

I bet there would be a lot of people willing to donate CPU cycles only, as they don't want to get more involved in things like coding and testing...

Just a thought. ;-)


Re: About tinderboxes
[info]spyderous
2007-08-04 06:24 pm UTC (link)
I'm a bit concerned about this idea because we would have a hard time reproducing the build environment.


Great thoughts
(Anonymous)
2007-08-04 08:11 am UTC (link)
But this requires a significant amount of manpower and an increased investment of time from every developer. Do we have the resources to tackle this? Wiser heads than mine may know...

Perhaps it would help to launch a new recruitment drive to pull in devs whose primary job is QA. Then again, it's hard enough to find people just to maintain many of the most important ebuilds.

Your second suggestion, which involves (mostly) one-time costs rather than ongoing commitments of resources, sounds very reasonable. I hope to see src_test become far more widespread in the near future.


Re: Great thoughts
[info]spyderous
2007-08-04 06:26 pm UTC (link)
Code review is something everyone should do; it shouldn't be restricted to a certain few.


Re: Great thoughts
(Anonymous)
2007-08-04 06:46 pm UTC (link)
Indeed, properly implemented code review is as much about *education* as it is about *quality*. Of course, education indirectly results in quality.



[info]chewi
2007-08-04 09:38 am UTC (link)
I've written a fair few ebuilds but I've never written src_test. If the package has no test target in the Makefile, what sort of thing should we be testing for? That the program runs? That it produces the expected output?



(Anonymous)
2007-08-04 03:33 pm UTC (link)
I can only second that. A lot of packages don't have a test target, so whatever you're going to test, it's (unfortunately) going to be a pain unless you don't have a life. Supporting a few packages myself, I'm already often out of time.



[info]spyderous
2007-08-04 06:32 pm UTC (link)
The kinds of things an arch team would need to test to determine whether a package is ready to mark stable.


fsteinel
(Anonymous)
2007-08-04 10:21 am UTC (link)
For the rss feeds, cia.vc comes to mind.
http://cia.vc/stats/project/gentoo
The Gentoo vc script needs some updating; compare with
http://cia.vc/stats/project/GNOME


Re: fsteinel
[info]spyderous
2007-08-04 06:38 pm UTC (link)
Yeah, it's nice that it shows subprojects, but it doesn't get as far as actual diffs. I wonder how much work that would be to make happen.


Re: fsteinel
(Anonymous)
2007-08-05 11:11 am UTC (link)
Change the line
ChangeLog
to
ChangeLog
:-)
The ciabot_svn.py script supports this already.

http://cia.vc/clients/svn/ciabot_svn.py
http://cia.vc/doc/clients/
http://cia.vc/blog/2007/04/run-your-own-cia-virtual-machine/
http://svn.navi.cx/misc/trunk/cia/


Re: fsteinel
(Anonymous)
2007-08-05 11:14 am UTC (link)
<file action="modify" uri="http://svn.gnome.org/viewsvn/deskbar-applet/trunk/ChangeLog">ChangeLog</file>
to
<file action="modify" uri="http://svn.gnome.org/viewcvs/deskbar-applet/trunk/ChangeLog?r1=1430&r2=1431">ChangeLog</file>


Re: fsteinel
[info]spyderous
2007-08-07 08:01 am UTC (link)
Problem is, we use CVS. =)


Unit Tests
(Anonymous)
2007-08-04 10:49 am UTC (link)
Having src_test() used more would be great, especially since there are already a lot of packages in the tree that fail src_test() (paludis runs tests automatically, and it's really annoying when a package fails).


Code review and the commit list
(Anonymous)
2007-08-04 12:54 pm UTC (link)
"gentoo-commits mailing list or RSS feed" -- That should definitely be an "and", create both!
We could go even further. Filtering a commit list with local procmail rules is easy and can be done by any dev. But what about non-devs who are not on procmail systems? How do we handle package moves with all those local settings?
After creating such a list, I'd say it would be time for a system similar to Debian's QA, where you can subscribe to bugs (and commits) on a per-package basis. Filtering CVS commits by directory should not be a problem on our side. Bugs could be harder, but maybe we want to solve that bugzilla-metadata problem sometime?

(rbu)


Re: Code review and the commit list
[info]spyderous
2007-08-04 06:27 pm UTC (link)
Everyone with any kind of decent email setup can filter mail somehow. Evolution, Thunderbird et al have built-in filters. It would be nice to have a decent bugzilla setup, but that's a bit beyond the scope of these proposals.


Re: Code review and the commit list
(Anonymous)
2007-08-07 08:36 pm UTC (link)
That still does not account for:
* Package moves
* Server load - sending out every mail to every user just to have him/her receive 0.1% of all changes clearly makes little sense.

A lot of what is being discussed on the list (RSS diffs, common inboxes, digests) even makes filtering hard to impossible, which would be sad and would break your "one guy reviews the other guy" idea.

(rbu)



(Anonymous)
2007-08-04 03:29 pm UTC (link)
I think you have some great ideas!
I really like the idea of review, and "gentoo-commit" would be a nice way to learn and to watch packages (so flameyes does not have to be mad in retrospect at people touching pam.d, for example. :-P)
Different time-lagged servers would, I think, be more trouble than they are worth, mostly when it comes to security updates. Should those be pushed past this time lag?
And I think many will just use the early rsync mirrors (as many use ~arch today) without any real need, just to be able to call themselves cutting edge.

About src_test():
Great idea! But we have to fix the failing tests first before we can move on. And remove all those RESTRICT="test" lines from ebuilds!
As GCC seems to be getting a little bit more unstable (look at GMP's homepage, for example), src_test needs to fail ONLY when there really is a problem, and then really fail (and not, like glibc/gcc, only tell you that tests failed and that it sucks).
Other QA things would be to really support MAKEOPTS="-jsomething" in as many packages as possible, so people with SMP/SMT/distcc can really enjoy their systems.
Rework things like ebuild to utilize Gentoo to its fullest.



[info]spyderous
2007-08-04 06:31 pm UTC (link)
If as many people used the early mirrors as use ~arch today, that still leaves a whole lot of people on the time-lagged ones. Do you think anywhere near the majority of our users are on ~arch? I don't.



(Anonymous)
2007-08-06 07:55 am UTC (link)
Maybe not. I do not really know if that is still the case, but for some problems I have encountered, the response I got was "go ~arch". I have not seen that response for a while, but it is one of the reasons why at least one of my machines is on ~arch.

But maybe having a set of "stabler" rsync mirrors and a set of "unstabler" ones is not such a bad idea. However, I do not think time-lagged mirrors with a lag of up to one week are a good idea (security-wise, as pointed out).

I like most of your ideas, but I think some of them concern problems already upon us that need to be addressed first (like failing test cases), and then we can move on to others (like creating our own test cases).


Testing
(Anonymous)
2007-08-06 06:14 am UTC (link)
Hi Donnie,

mlangc (x86 AT), pylon and I are working on the Gentoo Arch Testing Tool, which takes a stabilisation/keywording bug number as its argument, puts all needed packages in p.keywords, emerges them, and can create scripts that will do all the commit work (handy for big stabilisations). A USE flag iteration is planned, so that compilation is tested with several USE flag combinations (how many is configurable).
Next, a mandatory src_test would be nice, but for the Emacs team I see problems implementing it automatically. We provide test plans in our wiki for a lot of the packages we maintain. Maybe this should be centralized.
http://overlays.gentoo.org/proj/emacs/wiki/test%20plans

Christian, opfer(at)gentoo.org


Re: Testing
[info]spyderous
2007-08-06 08:17 pm UTC (link)
That tool sounds quite useful and should save some time, but I'm not quite clear on its relationship to QA.

Yes, I think that testing materials should accompany the ebuilds themselves, just like documentation.


Nice post
[info]markusle
2007-08-09 03:03 pm UTC (link)
Thanks for the nice and insightful post! One point I would like to add, which bugs me all the time, is the inability to test on all ARCHES for which a particular ebuild is keyworded due to a lack of hardware access. Particularly for scientific packages this makes it often very difficult to really deliver quality ebuilds that run fine on all ARCHES and also complicates bugfixing.



(Anonymous)
2007-08-10 11:18 pm UTC (link)
Why not try a branching strategy? Branch new, unchecked code for dev. Once it has been reviewed, merge it to the trunk of the repository. That way it is available for review. In order to track reviews, I would suggest looking at Request Tracker. A ticket could be opened for new commits/dev and then used to track progress etc.

Using a centralised ticketing system means reviewers can see exactly what is waiting for a review.





(Anonymous)
2007-08-31 12:32 pm UTC (link)
Hi Donnie,

About your post:

* I like the idea of an RSS feed with full diffs. It will certainly make the process of code review easier. I don't think that, in an open-source community of volunteers, forcing people to do code review will have any real effect, apart from pissing a substantial part of your volunteers off. Give them the tools to do so and encourage them to review the code of others and ask questions about it.

* It is unclear what you want to test in the unit testing of individual packages.
The package, the ebuild, or both? Providing test cases or test checklists for packages sounds like a task worthy of its own project / community. Given the currently understaffed Gentoo developer pool, I doubt this is realistic within Gentoo.

I like the idea of automated tinderboxing for a restricted set of packages.
I can see that would really help especially combined with releases.

It would be fairly easy to create a set of automated tinderbox servers which prebuild most of the release stages, and maybe a separate set of stage4 catalyst builds for kde/gnome etc.

Suggestion number 3 sounds like a good business case for a small company which could pay a subset of developers to work on extra QA checks under the GPL.
These QA checks could be donated to the community, and the business could sustain itself by operating the time-lagged servers.

Grtz Ramon van Alteren


Tool
(Anonymous)
2008-03-01 08:53 pm UTC (link)
Sounds like a good tool but how is QA involved?


Re: Tool
[info]spyderous
2008-03-04 07:00 am UTC (link)
Could you explain this in more detail? I'm not sure which tool you're talking about, and why you don't think QA is involved.


