NodeManager feature request

Marc requested the ability to turn off bwlimits in NM. Also, since some nodes will have gigE connections to the outside world, measuring byte limits in MB rather than KB would probably be a good idea.
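
A rough sketch of what the request might amount to, assuming a limit that is stored in KB today and a simple on/off switch. The names `kb_to_mb` and `effective_cap_mb` are made up for illustration and are not taken from NM's bwmon code; using 1024 KB per MB is also an assumption.

```python
# Hypothetical sketch: not NodeManager's actual bwmon code.
KB_PER_MB = 1024  # assumption: binary units


def kb_to_mb(limit_kb):
    """Convert a byte limit expressed in KB to MB."""
    return limit_kb / float(KB_PER_MB)


def effective_cap_mb(limit_kb, enabled=True):
    """Return the cap in MB, or None when bwlimits are turned off."""
    if not enabled:
        return None  # None stands for "no limit enforced"
    return kb_to_mb(limit_kb)
```

With this shape, a caller can treat `None` as "skip enforcement" instead of inventing a very large sentinel value.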

NM in HEAD is currently broken. IIRC, dhozac was the last to play with it, and I'm not certain what he did and didn't do. I've been careful to merge changes I've made on the branch into HEAD so, hopefully, the code isn't THAT different. Regardless, I foresee at least a week of testing and tweaking before I can sign off on it.

Trac is up. Please create a password.

SVN and Trac, both running on poppins, use LDAP for authentication. I've created user accounts for most of PLC, if not all. There are no API calls to grab passwords from the PL API, and making a backdoor to sync PL passwords with LDAP passwords doesn't really appeal to me. So, in order for PLC to log in to Trac, please do the following:

$ ssh root@poppins
# passwd <your username>
Changing password for user tmack.
New UNIX password:
Retype new UNIX password:
LDAP password information changed for <your username>
passwd: all authentication tokens updated successfully

You should now be able to log in to http://svn.planet-lab.org/ and create/edit content.

i hate drupal


SVN cheat sheet

To Check Out:

(over ssh)

$ svn co svn+ssh://svn.planet-lab.org/svn/NodeManager/trunk NodeManager

(over https)

$ svn co https://svn.planet-lab.org/svn/NodeManager/trunk NodeManager


NodeManager bug

About a hundred nodes had hung NMs. On those nodes, /var/log/nm<.gz.*> looked like the following:

Sat Sep 15 10:01:50 2007: bwmon:  Found 271 running HTBs
Sat Sep 15 10:01:50 2007: bwmon: Found 1 new slices
Sat Sep 15 10:01:50 2007: bwmon: Found 0 slices that have htbs but not in dat.
Sat Sep 15 10:01:50 2007: bwmon Slice utah_elab_31230 doesn't have xid. Must be delegated. Skipping.
Sat Sep 15 10:01:50 2007: bwmon: Found 1 dead slices
Sat Sep 15 10:01:50 2007: bwmon: removing dead slice 1186
Sat Sep 15 10:01:51 2007: bwmon: now 270 running HTBs
Sat Sep 15 10:02:08 2007: bwmon: Saving 270 slices in /var/lib/misc/bwmon.dat
Sat Sep 15 10:20:19 2007: Traceback (most recent call last):
File "/usr/share/NodeManager/nm.py", line 82, in run
File "/usr/share/NodeManager/nm.py", line 36, in GetSlivers
data = plc.GetSlivers()
File "/data/build/tmp/NodeManager-1.5-4.planetlab-root//usr/share/NodeManager/plcapi.py", line 86, in w
return function(*params)
File "/usr/lib/python2.4/xmlrpclib.py", line 1096, in __call__
return self.__send(self.__name, args)
File "/usr/lib/python2.4/xmlrpclib.py", line 1383, in __request
File "/data/build/tmp/NodeManager-1.5-4.planetlab-root//usr/share/NodeManager/safexmlrpc.py", line 21, in request
raise xmlrpclib.ProtocolError(host + handler, -1, str(e), '')
ProtocolError: <ProtocolError for boot.planet-lab.org:443//PLCAPI//: -1 >
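
One way to pick out affected nodes is to check whether a node's NM log stops on this error. A minimal sketch, assuming that a hung NM's log ends with a `ProtocolError:` line as in the excerpt above; the function names are hypothetical, and this says nothing about why NM hangs.

```python
# Hypothetical helper: flags logs whose last entry is a ProtocolError.
def ends_with_protocol_error(lines):
    """True if the last non-blank line is a ProtocolError message."""
    last = ""
    for line in lines:
        stripped = line.strip()
        if stripped:
            last = stripped  # remember the most recent non-blank line
    return last.startswith("ProtocolError:")


def check_log(path):
    """Apply the check to an uncompressed log file at the given path."""
    with open(path) as log:
        return ends_with_protocol_error(log)
```

For example, `check_log("/var/log/nm")` would return True on a node whose log ends like the one shown; rotated .gz logs would need to be decompressed first.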


SVN structure

After perusing OneLab's SVN and talking to dhozac on IRC, I like the idea of separating the modules in our repository from their vanilla upstream versions and from what exists in the trunk. So, the preliminary structure for modules in SVN will be as follows.



4.1 rc2 Complete. Downtime announced.

I've given OneLab notice that we're taking the CVS down in order to upgrade cvs.planet-lab.org to something a little more modern. I'm taking the machine down after 5am UTC to begin the upgrade.

Marc weeded through CVS and removed the need to build PL using myplc-devel, which provides a chroot Fedora Core 4 environment to build against. My major task for the next week is moving from CVS to SVN and attic'ing old projects that are still in our repo. I had a conversation with Thierry a few months ago about how we tag all of PlanetLab for every release, and how we should move to a scheme where we tag only a single build package and specify the versions of the subpackages needed to build a particular tag. This will move us away from rebuilding all of PlanetLab with each release and allow us to backtrack and bug-fix without overriding tags when HEAD has moved forward (as was the case when there was a bug in 4.0 while Daniel was here and the wrong code was pushed, causing slivers to be deleted). This requires a large amount of plumbing to be rerouted, but work now will save us headaches in the future. I'm going to tentatively say that this should take a week, but this may change.
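
To make the tagging idea concrete, here is a minimal sketch of a single build tag that pins subpackage versions. Everything in it is an assumption for illustration: the `<module>/tags/<version>` layout, the build tag name, and `ExampleModule` are hypothetical; only the SVN host and the NodeManager version string come from elsewhere in this blog.

```python
# Hypothetical sketch: one build tag pins the versions of its subpackages.
SVN_ROOT = "https://svn.planet-lab.org/svn"  # host from the SVN cheat sheet

# Assumption: each module keeps its tags under <module>/tags/<version>.
BUILD_TAGS = {
    "example-build-1": {             # hypothetical build tag
        "NodeManager": "1.5-4",      # version seen in the traceback paths
        "ExampleModule": "0.1",      # hypothetical subpackage
    },
}


def module_urls(build_tag, manifest=BUILD_TAGS, root=SVN_ROOT):
    """Map each pinned subpackage to the SVN URL of its tagged version."""
    pins = manifest[build_tag]
    return dict(
        (module, "%s/%s/tags/%s" % (root, module, version))
        for module, version in pins.items()
    )
```

Fixing a bug in an old release then means adding a new build tag that bumps one pin, rather than retagging every module.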

Consolidated old blog

July 12, 2007 PLC server maintenance, et al.

I've been working on making all PLC services redundant and have been pretty successful, except for the database. PostgreSQL does not support the notion of a "database cluster" out of the box. Third-party patches claim to have solved this problem, but we have yet to investigate them.

I'm moving www to the backup database machine (cassidy). When we ordered new machines, I went ahead and asked for 1 extra to serve as a hot spare (for the database because of the above problem). The rationale is that if a machine misbehaves, I can grab the latest database dump and mix and match services and machines as needed.


PlanetLab software consists of the core system that runs at PLC and on individual nodes (available via SVN, with ongoing efforts outlined in a development roadmap), user-contributed software tools that can be used to manage PlanetLab slices, and user-contributed services that support a user community.
