How to get Condor running on Mac OS X

This post was written by Randy Julian on June 14, 2011
Posted Under: Developer Corner,Technologies

Indigo delivers high-performance data analytics for diagnostic laboratory operations. Most of the heavy lifting is done with the Condor High Throughput Computing system from the Condor Team at the University of Wisconsin. Condor is a powerful piece of software that simplifies parallel computing for “embarrassingly parallel” problems. Automated instrumental data analysis can usually be pipelined into such an architecture with excellent performance.

We use Condor both on Amazon EC2 with Ubuntu and on customer systems (mostly VMware). There are several good starting images for Condor on Amazon EC2, and getting Condor running on most Linux distributions is simple. We do most of our development on Macs, so I wanted a simple Condor master for developing and testing that could accept flocking nodes if I needed more compute power.

Here’s how I set my own system up:

Use the /etc/launchd.conf file to set the path to the Condor executables and the required CONDOR_CONFIG environment variable. This file might not exist on your machine, so create it if you don’t have one already.

setenv PATH /export/condor/bin:/export/condor/sbin:$PATH
setenv CONDOR_CONFIG /export/condor/etc/condor_config

When you reboot your machine, the global path will be properly set.

I run Condor as a normal user (me) with the condor_master command, so I want all the Condor binaries on my execution path. I also want Condor to use my own condor_config and condor_config.local, so I set the CONDOR_CONFIG environment variable to the main configuration file, which in turn points to condor_config.local.
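After the reboot, a quick sanity check from a terminal can confirm that both settings reached your shell. The following is only a sketch: the function name check_condor_env is made up for illustration, and the default install root is the /export/condor path used in this post.

```shell
# Check that the Condor settings from /etc/launchd.conf reached this shell.
# The install root defaults to the path used in this post; pass another to override.
# (check_condor_env is a hypothetical helper name, not part of Condor.)
check_condor_env() {
    condor_root="${1:-/export/condor}"
    rc=0
    if [ -z "${CONDOR_CONFIG:-}" ]; then
        echo "CONDOR_CONFIG is not set"
        rc=1
    else
        echo "CONDOR_CONFIG=$CONDOR_CONFIG"
    fi
    # Look for each Condor directory as a whole PATH entry.
    for dir in "$condor_root/bin" "$condor_root/sbin"; do
        case ":$PATH:" in
            *":$dir:"*) echo "PATH contains $dir" ;;
            *)          echo "PATH is missing $dir"; rc=1 ;;
        esac
    done
    return "$rc"
}
```

Calling check_condor_env with no arguments checks the default location; a non-zero return means at least one setting is missing.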

I expanded the installation tarball (condor-7.5.6-x86_macos_10.4-stripped.tar.gz from the Condor Download Center) into a directory; I used /export/condor. You can put it wherever you like, but this directory is entered into the configuration file, so make sure the path is consistent throughout. Later, I will show that I put all the configuration and logging directories in the same place. You don’t have to make all these changes, but I wanted everything in one neat place.
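The unpacking step could be scripted roughly as follows. This is a sketch under one assumption that the post does not confirm: that the archive wraps its contents in a single top-level directory, which --strip-components=1 removes. If your tarball is laid out differently, drop that option. The helper name unpack_condor is hypothetical.

```shell
# Unpack a Condor tarball into an install root (default: /export/condor).
# Assumption: the archive has one top-level directory; --strip-components=1
# removes it so that bin/ and sbin/ land directly under the install root.
# (unpack_condor is a hypothetical helper name.)
unpack_condor() {
    tarball="$1"
    root="${2:-/export/condor}"
    mkdir -p "$root" && tar -xzf "$tarball" -C "$root" --strip-components=1
}
```

For example, unpack_condor condor-7.5.6-x86_macos_10.4-stripped.tar.gz would unpack into /export/condor, provided you have write access there.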

Next, I put the condor_config and condor_config.local files in the /export/condor/etc directory.

You can use the vanilla condor_config that comes with the distribution, but you have to change the following sections. The first is around line 56 (you can leave CONDOR_HOST = $(FULL_HOSTNAME) as it is):

##--------------------------------------------------------------------
##  Pathnames:
##--------------------------------------------------------------------
##  Where have you installed the bin, sbin and lib condor directories?
RELEASE_DIR       = /export/condor  

##  Where is the local condor directory for each host?
##  This is where the local config file(s), logs and
##  spool/execute directories are located
LOCAL_DIR         = /export/condor/etc  

##  Where is the machine-specific local config file for each host?
LOCAL_CONFIG_FILE = /export/condor/etc/condor_config.local

Just put in the location for your installation here.

I also changed my execution settings to something like the Condor “TESTINGMODE”. That just means that I don’t want jobs suspended, killed or stopped if I use my computer:

# When should we only consider SUSPEND instead of PREEMPT?
WANT_SUSPEND      = False

# When should we preempt gracefully instead of hard-killing?
WANT_VACATE       = False

##  When is this machine willing to start a job?
START             = True

##  When to suspend a job?
SUSPEND           = False

##  When to resume a suspended job?
CONTINUE          = True

##  When to nicely stop a job?
##  (as opposed to killing it instantaneously)
PREEMPT           = False

##  When to instantaneously kill a preempting job
##  (e.g. if a job is in the pre-empting stage for too long)
KILL              = False

PERIODIC_CHECKPOINT     = False
PREEMPTION_REQUIREMENTS = False
PREEMPTION_RANK         = 0
CLAIM_WORKLIFE          = 1200

I also found that it was easier to just create the following directories:

/export/condor/etc/spool
/export/condor/etc/run
/export/condor/etc/execute
/export/condor/etc/log
/export/condor/etc/lock

and change the following in “Part 2”

LOCK        = $(LOCAL_DIR)/lock

and “Part 4” of the configuration file:

######################################################################
##  Daemon-wide settings:
######################################################################

##  Pathnames

RUN         = $(LOCAL_DIR)/run
LOG         = $(LOCAL_DIR)/log
SPOOL       = $(LOCAL_DIR)/spool
EXECUTE     = $(LOCAL_DIR)/execute
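The five directories listed above can be created in one pass. A minimal sketch, assuming the same LOCAL_DIR as in this post (/export/condor/etc) unless another path is given; make_condor_dirs is a hypothetical helper name.

```shell
# Create the working directories that LOCK, RUN, LOG, SPOOL and EXECUTE point to.
# Defaults to the LOCAL_DIR used in this post; pass another path to override.
# (make_condor_dirs is a hypothetical helper name.)
make_condor_dirs() {
    local_dir="${1:-/export/condor/etc}"
    for sub in spool run execute log lock; do
        mkdir -p "$local_dir/$sub" || return 1
    done
}
```

Run it as the user who will start condor_master, so that the daemons can write to these directories.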

Also, if you don’t want the system to run benchmarks (I turned this off), you can comment out the RunBenchmarks lines lower down in Part 4:

##  When a machine is unclaimed, when should it run benchmarks?
##  LastBenchmark is initialized to 0, so this expression says as soon
##  as we're unclaimed, run the benchmarks.  Thereafter, if we're
##  unclaimed and it's been at least 4 hours since we ran the last
##  benchmarks, run them again.  The startd keeps a weighted average
##  of the benchmark results to provide more accurate values.
##  Note, if you don't want any benchmarks run at all, either comment
##  RunBenchmarks out, or set it to "False".
#BenchmarkTimer = (time() - LastBenchmark)
#RunBenchmarks : (LastBenchmark == 0 ) || ($(BenchmarkTimer) >= (4 * $(HOUR)))
#RunBenchmarks : False

Next, I don’t want to create a special “condor” user on my laptop; I just want to run Condor as “me”. To do this, I need my user and group IDs. Running the id command at the command prompt prints them:

uid=501(me) gid=20(staff) groups=20(staff),…
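If you would rather not read the numbers off by hand, the id command can print them separately with its -u and -g options. The sketch below formats them in the uid.gid form that CONDOR_IDS expects; the function name condor_ids_line is made up for illustration.

```shell
# Print a CONDOR_IDS line for the current user in uid.gid form.
# (condor_ids_line is a hypothetical helper name.)
condor_ids_line() {
    printf 'CONDOR_IDS = %s.%s\n' "$(id -u)" "$(id -g)"
}
```

For the id output shown above, this would print CONDOR_IDS = 501.20, ready to paste into condor_config.local.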

I then use the following condor_config.local:

CONDOR_IDS = 501.20
START_MASTER = True
START_DAEMONS = True
START = TRUE
FLOCK_FROM = *
HOSTALLOW_READ = *
HOSTALLOW_WRITE = *
CONDOR_HOST = $(FULL_HOSTNAME)
DAEMON_LIST = MASTER, SCHEDD, STARTD, NEGOTIATOR, COLLECTOR
TRUST_UID_DOMAIN = TRUE

I use condor_config.local to override anything I didn’t fix (or forgot to fix) in the condor_config file. The CONDOR_IDS setting lets me run condor_master as me, and FLOCK_FROM = * allows other nodes to flock to my machine as a master.

If everything went right, you should be able to run the condor_master command followed by the condor_status command. On my MacBook Pro I get:

Name           OpSys      Arch   State     Activity LoadAv Mem   ActvtyTime
slot1@rkj      OSX        X86_64 Unclaimed Idle     0.310  2048  0+00:28:31
slot2@rkj      OSX        X86_64 Unclaimed Idle     0.000  2048  0+00:28:32
slot3@rkj      OSX        X86_64 Unclaimed Idle     0.000  2048  0+00:28:33
slot4@rkj      OSX        X86_64 Unclaimed Idle     0.000  2048  0+00:28:34

               Total Owner Claimed Unclaimed Matched Preempting Backfill

    X86_64/OSX     4     0       0         4       0          0        0

         Total     4     0       0         4       0          0        0

Your display will vary by memory, cores, and how busy your machine is at the moment.
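If condor_status shows nothing, or a daemon fails to start, it can help to see which values Condor actually resolved from the two configuration files. The condor_config_val tool that ships with Condor prints the value of a named setting. The wrapper below is only a sketch: show_condor_settings is a hypothetical name, and the list of settings is simply the ones changed in this post.

```shell
# Print the values Condor resolves for the settings changed in this post.
# Returns non-zero if condor_config_val is not on the PATH.
# (show_condor_settings is a hypothetical helper name.)
show_condor_settings() {
    if ! command -v condor_config_val >/dev/null 2>&1; then
        echo "condor_config_val not found; check PATH" >&2
        return 1
    fi
    for name in RELEASE_DIR LOCAL_DIR LOCAL_CONFIG_FILE LOG SPOOL EXECUTE LOCK CONDOR_IDS DAEMON_LIST; do
        printf '%s = %s\n' "$name" "$(condor_config_val "$name")"
    done
}
```

A path that does not match your install directory, or a CONDOR_IDS value that differs from your id output, is a likely place to start looking.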

Your system should now be ready for condor_submit.
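A small test job is an easy way to confirm that submission works end to end. The sketch below writes a minimal vanilla-universe submit description that runs /bin/hostname; the helper name write_test_submit and the file names (hostname.sub, hostname.out, hostname.err, hostname.log) are illustrative choices, not anything Condor requires.

```shell
# Write a minimal vanilla-universe submit description that runs /bin/hostname.
# The target directory defaults to the current one and must already exist.
# (write_test_submit and the hostname.* file names are hypothetical choices.)
write_test_submit() {
    job_dir="${1:-.}"
    cat > "$job_dir/hostname.sub" <<'EOF'
universe   = vanilla
executable = /bin/hostname
output     = hostname.out
error      = hostname.err
log        = hostname.log
queue
EOF
}
```

After writing the file, running condor_submit hostname.sub from that directory queues the job, and condor_q shows its state. Once a slot picks it up, the machine name should appear in hostname.out.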
