Archive for May, 2008

User’s feedback

Sunday, May 25th, 2008

I got some feedback from users about managing software.

Raúl Moratalla. who has been blogging about using 11.0 ZYpp/YaST in 10.3, suggests two changes:

  • Updated packages filter.

Adding that feature is really easy. However, 11.0 is almost out.

  • New packages filter

This shouldn’t be impossible, but I don’t have an idea on how to do it. I think we don’t have the information on when a package appears in a repository, and even worse, we don’t keep information about the previous refresh when finishing one.

John Tomas asks and suggests an easier way to remove applications. I think the use case is valid, but I don’t like the exact proposal, specifically, I don’t like user interfaces with trees. And adding more check boxes for repo removal clutters the interface more. Moreover, I think this usecase is just part of a more generic one, how to install applications, without worrying about “packages”. This problem is already being solved by, for example, Ubuntu Appinstaller.

This problem could be solved in various way.

  • Packages could be “tagged” as such.

However that requires changing the current world, therefore it won’t work.

  • Do a .desktop file -> package mapping.

Ubuntu appinstaller does this by scanning the complete distribution archive, and generating special metadata from it. This metadata is distributed as a package called appinstaller-data. This method is sane, but the approach of scanning the distribution archive won’t work with 3rd party repositories.

Stephan Binner has a prototype for the same concept on openSUSE, called app-installer.

Any other ideas on how to solve this problem?

The greatest unknown openSUSE 11.0 package management feature

Wednesday, May 21st, 2008

During the development of openSUSE 11.0, we have been reporting in real time cool improvements like the fast installation, how YaST became sexy, how YaST/ZYpp/zypper became fast, how YaST/ZYpp/zypper performs better than others and even that our solver is also really smart.

However, there is something else…

Interoperability

Background

One of the features of our stack is the availability of patches and patterns. The first provide updates in a sense of “fix for a problem” (which can mean various, or none updated packages), while patterns are intelligent groups that can recommend, require and suggest packages in order to make certain functionality available, without being too strict in the specific packages to install.

Unlike in other systems where groups and updates are handled as special entities, ZYpp patterns and patches are just objects with dependencies like packages, and the solver threats them in the same way.

Because rpm installed packages database does not know about patterns and patches, in openSUSE 10.x (and SLE10) those objects are installed in a separate database, only viewable to libzypp. This is hidden to the users, but does not allow for easy management using 3rd party tools.

In addition to that, the patch metadata format is our own extension to the metadata handled by yum, the tool used by Fedora and Centos. That means, even if other distribution provide similar concepts, they will mostly ignore our extended metadata.

This is sad, if we share 90% of the metadata format, why not go further?. Sometimes it is no worth to wait that others do steps in becoming more interoperable with you, so what about doing those steps ourselves?

At this time, Fedora was implementing updates metadata by using a yum plugin and a updateinfo.xml description. Metadata for deltarpm availability is handled via the yum-presto plugin.

Sharing tools and data, a step for Interoperability

Metadata format

In openSUSE 11.0, ZYpp reads patches from updateinfo.xml too! (check 11.0 update repo!). Not only that, our delta rpm availability metadata will be in the same format yum-presto (with some modifications agreed with its author).

How will this benefit you?

  • You will be able to use yum stack with updates out of the box with updates and deltarpms, and they will just work.
  • You will be able to generate custom patches for openSUSE using existing tools like Bodhi, from Fedora, or generate custom deltarpm information using the yum-presto included tools.
  • We are working hard to get the ZYpp/YaST stack to build on Fedora (and other distros), in a near future, you will be able to enjoy ZYpp performance and features on Fedora with their own repositories!
  • We decoupled the delta-rpm information from patches, so we may start adding delta-rpm to normal factory packages and it will start to work out of the box!
  • Much more!

Handling of patches and patterns

As we mentioned, in 10.x codebase we used to install patterns and patches in a special database. This is no longer the case.

Patterns and patches are no longer installed, which means your system is rpm only! Patches are shown now as (un)satisfied (and (un)relevant). Which means you have all the requirements to consider them present.

All the information of patches and patterns (and products) is extra information that openSUSE applications use to add more value to you. So if you for example remove a repository offering patches, then you just lose the information about which patches do you have, the real information is the rpm packages you have installed. When you re-add the update repository, and you can immediately see which patches published affect you, which ones are irrelevant, and which ones are relevant but you don’t need because your packages are up to date.

Patterns and patches become “advice” and “value”, not extra non-compatible information.

How will this benefit you?

  • Simpler system.
  • No conflicts because using 3rd party tools, “rpm by hand” or our native tools.

openSUSE, PackageKit enabled

PackageKit is the new actor in the package management world. It is a thin layer that provides applications access to the package management system as a DBUS service. You may heard about it because Fedora 9 is coming PackageKit enabled. How it benefits you?

  • Role based (non-root) package management, via PolicyKit.
  • Sharing of upstream tools across distributions.
  • Gives the desktop the chance to integrate with software manipulation.

So, openSUSE 11.0 is fully PackageKit enabled. You will be able to use all PackageKit compatible applications on openSUSE and they will use the ZYpp stack underneaths. Not only that it is enabled, but our hackers Scott Reeves and Stefan Haas did an amazing job on the backend, I would dare to say it is one of the most robust backend implementations, and it fully benefits from the ZYpp speed and features.

Future

All this improvements are available now. May be you are already enjoying them in Factory. However this opens the door for new possibilities, just a few examples come to mind:

  • The openSUSE Build Service, the great software building platform from our openSUSE team, builds packages for all major distributions since long time. The build service could allow to enter and generate patch information for fixed bugs and the update/patch information will be compatible across yum/Fedora and openSUSE. Same with deltarpm information.
  • We could extend ZYpp parser to understand Fedora groups stored in comps.xml and threat them as patterns.
  • Do you have more ideas?

Community involvement

We welcome any help on creating more interoperability possibilities, especially about building the ZYpp stack and YaST on Fedora, Mandriva and others. There are already some packages building in the build service, but we still have a long way to go.

Solving the famous “smart” case 3

Saturday, May 17th, 2008

In my last post, I showed how sat-solver (solver’s ZYpp uses) would solve correctly the “case 2″ included in smart’s README, an exercise smart uses to claim its “smartness” (because apt and yum can’t handle them).

How does it with case 3?

That’s another interesting case which was tested with APT-RPM and YUM.

In this case, there’s a package A version 1.0 installed in the system, and there are two versions available for upgrading: 1.5 and 2.0. Version 1.5 may be installed without problems, but version 2.0 has a dependency on B, which is not available anywhere.

In this case, the best possibility is upgrading to 1.5, since upgrading to 2.0 is not an option.

Here both yum and apt fail:

Just like APT, YUM selected version 2.0 and didn’t consider the availability of an intermediate version.

But smart does the right thing:

Smart correctly selects the intermediate version 1.5, which is the only viable possibility given the current options.

So, we setup a repository packages.xml with A version 1.5 and 2.0 ( 2.0 requiring an non-existing B) and a system.xml representing the installed packages, with only A 1.0 installed. We put all together in a test.xml file, and select a “upgrade” action.

We run the already introduced deptestomatic:

# deptestomatic test.xml
>!> Solution #1:
>!> upgrade A-1.0-1.noarch => A-1.5-1.noarch[test]
>!> !unflag A-2.0-1.noarch[test]
>!> installs=0, upgrades=1, uninstalls=0

And we confirm, that ZYpp (sat-solver) would also select A 1.5 as expected.

Solving the famous “smart” case 2

Saturday, May 17th, 2008

In my recent post about ZYpp, yum and smart speed and memory usage, I got this comment:

It would be also interesting to know how the new libzypp manages the “Case Studies” from http://svn.labix.org/smart/trunk/README

I don’t know, but could be that yum fixed “Case 2″ and libzypp “fails”? So the extra time and memory over libzypp taken by yum to make the dep resolution would be justified.

At first glance I thought the case was very vague, but actually it was really simple and concrete, so it was a good candidate to be tested with our test suite tools. Let’s look at it:

This is another real case, and is being reproduced in a controlled environment for tests with YUM, APT-RPM, and Smart.

The issue is, a package named A requires package BCD explicitly, and RPM detects implicit dependencies between A and libB, libC, and libD. Package BCD provides libB, libC, and libD, but additionally there is a package B providing libB, a package C providing libC, and a package D providing libD.

In other words, there’s a package A which requires four different symbols, and one of these symbols is provided by a single package BCD, which happens to provide all symbols needed by A. There are also packages B, C, and D, that provide some of the symbols required by A, but can’t satisfy all dependencies without BCD.

So, what is so special about this case?

The expected behavior for an operation asking to install A is obviously selecting BCD to satisfy A’s dependencies, on the other hand, YUM and APT fail to deliver that as a guaranteed operation, as is shown below.

Ok, there are a couple of things that must be said about this example.

  • The dependencies of the package A are broken. Unless the package A depends on the libraries libA, libB and libC and some runtime “thing” provided by it, then BCD should be split in three packages. But as none of B, C, D provides any runtime “thing”, but they just “replace” it, this is not true. So either fix BCD or fix A to depend on the libraries only, the explicit BCD dependency is bogus.

  • Debian’s apt-get won’t succeed here, because AFAIK Debian only has dependencies across packages, not on libraries. Actually I still wonder how smart could succeed here using a debian backend. I ignore if apt-rpm does support library dependencies.

  • The result yum produces makes no sense too.

  • So yes, this is the expected result, but the example is a really dumb package.

How to simulate dependencies using the ZYpp sat solver?. The magic is call deptestomatic. There are two versions of this tool, one provided by satsolver-tools, and another one provided by libzypp-testsuite. The one in sat-solver tools should be sufficient. The one in libzypp-testsuite has extra features like graphical display of dependencies.

So, we use the helix format for testcases, which was the format of the original RedCarpet testuite. A testcase is a xml file refering to one or more xml files describing repositories, and one repository represents the installed packages (in this case we can simulate with no installed packages and just one repository).

We start by creating a packages.xml.

And now, we define the test.xml describing the testcase.

If there are conflicts, the xml language has features to select solutions, but that is advanced stuff documented here. For our case, the test is really, simple, just install A.

Now run the test:

# deptestomatic test.xml

And you get the result:

>!> Installing A from channel test
>!> Solution #1:
>!> install A-2.0-1.noarch[test]
>!> install BCD-2.0-1.noarch[test]
>!> installs=2, upgrades=0, uninstalls=0

Which is the “smart” result. It only installs A and BCD, and not B, C and D (which is what yum does).

Learning to develop testcases and run them is an excellent way to help the development team to fix bugs without having to request extra information, so you can play yourself with the simulation before actually reporting it.

You can also generate testcases for day to day operations in a very simple way (no need to write xml):

  • in the YaST (Qt) package selector, go to Extras, and choose generate solver testcase (after selecting your transactions, like installing or removing packages).
  • Use zypper command –debug-solver (like zypper install –debug-solver foo )

Then you can run them with deptestomatic by referencing the test xml file. Note that you can reuse package lists (repositories) across tests. libzypp-testsuite and the sat-solver source includes a big list of testcases representing simple operations, distributions, upgrades, and distribution upgrades.

yum and ZYpp speed / memory usage

Friday, May 16th, 2008

Michael Zucchi complains about yum memory usage, and points python as guilty.

Yum isn’t so yummy after-all. Re-enabled python so i could run yum. Wow 120mb of vm to install a couple of packages. Not bad considering the box only has 128mb. This is crap.

Hmm, should I try xubuntu - or will it be just as crappy and bloated and blighted by python poo?

Since our efforts to make the ZYpp really fast, by incorporating and integrating Michael Schroeder’s sat solver together with Michael Matz’s great work on the solv files and data storage, I never took the time to make a “quick comparison” on speed or memory usage. So lets have a quick look.

These are my repositories:

Software configuration management (openSUSE_10.3)
10.3 - Main Repository (NON-OSS) 
10.3 - Packman
openSUSE-10.3-Updates
Virtualization:VirtualBox
home:dgollub
KDE:KDE3
Mozilla based projects (openSUSE_10.3)
ZYPP SVN Builds (openSUSE_10.3) 
ZYPP SVN Builds (openSUSE_10.3)
home:prusnak
10.3 - VideoLan
openSUSE.org tools (openSUSE_10.3)
SUSE Feature Tracking Tool (openSUSE_10.3)
psmt's Home Project (openSUSE_10.3)
openSUSE:10.3
Duncan Mac-Vicar SUSE rpms (openSUSE_10.3)
Latest YaST svn snapshots (openSUSE_10.3)
building/openSUSE_10.3         

All these repos together are about 41.000 packages.

What I did was to symlink ZYpp repositories to the yum repo path so they use the same repos.

# rm -rf /etc/yum.repos.d/
# ln -s /etc/zypp/repos.d /etc/yum.repos.d
*NOTE:* I tested with yum 3.2.4. I know 3.2.14 is available, but that is what I had installed when doing the test. After doing this tests I upgraded to 3.2.14 but it did not accept my .repo file because the character “:” in repo names. However the changelog of yum since 3.2.4 shows: If using latest yum would invalidate this numbers (not as in 1 second, but as in an order of magnitude), let me know and I will repeat them when I make them work with my repo files.

Update 14.05.2008 : I did add yum 3.2.14. However it performed even worse, except for memory usage.

Update 15.05.2008 : added smart 0.52 numbers

libzypp is the one you see in factory since some days: 4.21.1.

yum and ZYpp behave differently, as yum downloads and parses filelists.xml and other.xml which we ignore. There fore I skipped the download metadata part and just timed the cache building process.

# yum clean dbcache
...
19 sqlite files removed

# time yum makecache
...
Metadata Cache Created

real    9m41.036s
user    2m34.766s
sys     0m11.545s

Almost 10 minutes. As this time includes parsing the two big files we ignore. I did it again, pressing Ctrl-C After yum finished with the primary data, which is what ZYpp uses:

# time yum makecache
...
Exiting on user cancel

real    4m6.730s
user    0m34.058s
sys     0m3.080s

Now, zypper’s turn:

# time zypper ref -B
...
All repositories have been refreshed.

real    0m18.472s
user    0m16.029s
sys     0m2.024s

So yum takes 13 times the ZYpp needs technically (primary 1:1 comparison), but 30 times the time the end user sees.

Now, installing a package. Times were measured till the “continue? yes/no” prompt, or till the first interactive question.

# time yum install fate
...
Is this ok [y/N]: n
Exiting on user Command
Complete!

real    0m19.143s
user    0m14.057s
sys     0m1.920s

zypper’s turn:

# time zypper in fate
...
Continue? [YES/no]: n

real    0m9.796s
user    0m8.509s
sys     0m0.624s

This time, ZYpp is only twice as fast as yum. Only ;-)

What happens when you want to upgrade your packages?

# time yum upgrade
...
real    0m45.152s
user    0m36.894s
sys     0m7.476s

(Note: yum did not even found a solution here).

# time zypper update
...
Continue? [YES/no]: n

real    0m8.988s
user    0m7.820s
sys     0m0.596s

yum needs 4 times the time ZYpp needs to calculate the upgrade.

Update 14.05.2008 : I was comparing update to upgrade, I fixed those numbers in the chart. However, I don’t have the update value for the old yum.

Summary:

Now, how much memory does each one need? For this, I just tested the install command with one package using valgrind massif, a heap profiler.

yum memory usage:

yum memory usage

ZYpp memory usage:

zypp memory usage

Update 14.05.2008 : memory usage chart for yum 3.2.14

yum 3.2.14 memory usage

Update 15.05.2008 : memory usage chart for smart 0.52

smart 0.52 memory usage

Here you can appreciate ZYpp goes a little bit over 20M, while yum goes over 180M, so yum uses about 9 times more memory. Update 14.05.2008 : yum 3.2.14 uses around 160 in the worst point of time.

I would be interested in tracking cpu usage too, but that will come later. What do you think about it?