Thursday, January 21, 2010

Linear Ordering Problems

Linear Ordering Problems (LOP) are encountered in a wide range of applications, such as ranking sports teams, triangulation of matrices (i.e., finding a reordering of the rows and columns that maximizes the total above or below the diagonal), or ordering lists of individual preferences. Recently I came across an interesting application for the scheduling of printing tasks. The data consisted of a array of entries $x_{ij}$  which give the time saved if task i is completed before task j.  The times savings were attributed to setup work done for the first task that would benefit subsequent tasks.  The problem was to find an ordering of tasks that maximize the overall time savings.

Like many problems in Operations Research, the LOP has a rich history of study and algorithm development.  LOP are  known to be NP hard, but state of the art algorithms provide for many practical uses. A somewhat older MILP formulation is easily implemented in GMPL/GLPK as shown in the Scheduling: Linear Ordering of Tasks example.

Wednesday, January 6, 2010

Setting up a Mac for Computational Work

It is not difficult to set up your Mac for work in the scientific/technical arena.  Here are some thoughts and advice based on my own experience.

1. Backup/Collaboration

First thing is to establish a backup/collaboration strategy. Obviously TimeMachine is the simplest alternative.  However, I've been using on-line services for this, in particular SugarSync and Dropbox.  I use SugarSync to maintain an on-line mirror of the Documents folder, and Dropbox to facilitate sharing of documents with collaborators. A nice advantage of these services is that they also provide web and mobile access to your files.

You likely maintain files on other servers. MacFusion (which requires MacFuse) allows you to access files on other servers as if they were local volumes.

2. Technical Documents

It's hard to beat Latex for the preparation of scientific and technical documents.  Moreover, mathematical software packages sometimes require the installation of latex to format output.

MacTeX is an excellent free distribution of the core tools needed to prepare a wide range of documents including papers, theses, and presentations (using Beamer).  Everything you need is available in one large download and simple installation. The tools include BibDesk for managing bibliographies, TexShop and TexWorks for editing and preparing documents, and the Excalibur spell checker.  Use the TeXLiveUtility to maintain the package.

Many scientific Mac users recommend Papers to manage bibliographies.  Having tried alternatives like Papers, I personally prefer BibDesk.  Though not quite as polished, by working directly with bibtex databases BiBDesk interfaces beautifully with LyX.

LyX is an almost WYSIWYG front end to Latex that generally masks the underlying markup language. If you want to use Latex without having to work directly with Latex commands, then LyX is for you.

The MacOS, of course, comes with an indispensable pdf and graphics previewer in Preview.  Skim is a freeware pdf viewer and notetaker that provides an additional capability to annotate pdf documents and enhanced presentation features.

3. General Software Utilities

Apple provides tools for software development in a separate package called Xcode that includes text editors, compilers, and other components. Many of the software tools you download from the internet require the tools contained in Xcode.   The most recent release of Xcode can be obtained at no cost by joining the Apple Developer's Connection.

There are many 'programmer's text editors' available for the Mac, including TextEdit that comes with MacOS.  TextMate is low-cost (with free trial) with bundles that adapt the editor for use in wide variety of applications.

MacPorts is a tool for finding and installing open-source software packages. MacPorts simplifies the installation of those packages which have been adapted to the Mac. MacPorts is a simple command line tool, Porticus provides a GUI interface to MacPorts.

Unfortunately, you'll occasionally come across software packages that simply are not available for the Mac.  One way to deal with this is to install Windows or Linux under one of the available virtual environments for the Mac such as Apple's Bootcamp, or Parallels, VMware Fusion, or VirtualBox. If you go that route you'll need to acquire and maintain a copy of a second operating system.  In many cases, however, the WINE package will provide enough support to run Windows software within the Mac environment.

4. Mathematical Software

Finally this brings us to the tools for actually doing computational science on a Mac.  Matlab and Mathematica are examples of commercial packages for doing mathematics on a Mac. There is an enormous variety of tools.

Here are some of the tools I find useful:
  • General mathematical computation
    • Matlab. Licensed for download to University affiliated students and faculty.
    • Octave.  A GPL licensed package with syntax similar to Matlab. Provided you have MacPorts installed, you can install Octave with the command line: sudo port install octave
    • Sage.  This is a big package based on Python that provides an enormous range of computational and visualization tools.
  • Statistical Computation.
    • Matlab statistics and econometrics toolboxes
    • R
    • Quantlib
  • Optimization
    • Gusek/GLPK (runs under WINE)
    • lpsolveIDE (runs under WINE)
    • Sage provides access to GLPK, the COIN-OR CBC package, and CVXOPT.
    • CVX, a Matlab class for convex optimization
    • YALMIP, a Matlab class for optimization with a variety of solvers
  • Python Development
    • Enthought Python distribution (using TextMate)
    • Sage

Wednesday, December 30, 2009

Modeling OR Problems in SAGE

Sage is an open source project providing a comprehensive environment for mathematical computing. It's a big package with big ambitions "Creating a viable free open source alternative to Magma, Maple, Mathematica and Matlab."  See SIAM Review (Vol 41, No. 4, 2009) for a  recent review.

Sage incorporates Python as a scripting language from which one can generate objects MixedIntegerLinearProgram objects.  Both GLPK and CBC from the COIN-OR project can be installed as optional Sage packages and used as solvers.  Coupled with the native features of Python, this approaches provides an interesting alternative to developing models using AMPL/GMPL.
# Dictionary of task durations indexed by (Job, Machine) tuples
dur = {
    ('Paper 1','Blue')   : 45,
    ('Paper 1','Yellow') : 10,
    ('Paper 2','Blue')   : 20,
    ('Paper 2','Green')  : 10,
    ('Paper 2','Yellow') : 34,
    ('Paper 3','Blue')   : 12,
    ('Paper 3','Green')  : 17,
    ('Paper 3','Yellow') : 28 }

# Task sequencing indicated by pairs of (Job, Machine) tuples
order = [
    (('Paper 1','Blue'),  ('Paper 1','Yellow')),
    (('Paper 2','Green'), ('Paper 2','Blue')),
    (('Paper 2','Blue'),  ('Paper 2','Yellow')),
    (('Paper 3','Yellow'),('Paper 3','Blue')),
    (('Paper 3','Blue'),  ('Paper 3','Green')) ]

# TASKS are a list of (job,machine) tuples found from the keys for dur{}
TASKS = dur.keys()

# JOBS and MACHS are unduplicated lists of jobs and machines 
JOBS = sorted(list(set(zip(*TASKS)[0])))
MACHS = sorted(list(set(zip(*TASKS)[1])))

# Create MILP
JobShop = MixedIntegerLinearProgram(maximization=False)

# The objective is to minimize Makespan
MakeSpan = JobShop.new_variable()
JobShop.set_objective(MakeSpan[0])

# MakeSpan is an upper bound on all tasks
x = JobShop.new_variable()
[JobShop.add_constraint(x[t] + dur[t] - MakeSpan[0], max = 0) for t in TASKS]

# Tasks must be complete in the specified order 
[JobShop.add_constraint(x[s] + dur[s] - x[t], max=0) for s,t in order]

# Disjunctive constraints to avoid machine conflicts. This uses a
# 'Big M' technique and a binary variable where y[s][t] = 0 implies
# task s must finish before task t can start and y[s][t] = 1 implies
# task t finishes before task s starts.

CONFLICTS = [(s,t) for s in TASKS for t in TASKS if s<t and s[1]==t[1]]

y = JobShop.new_variable(dim=2,vtype=JobShop.__BINARY)

BigM = sum(dur[t] for t in TASKS)

[JobShop.add_constraint(BigM*y[s][t] + x[t] - x[s] - dur[s], min=0)
    for s,t in CONFLICTS]
[JobShop.add_constraint(BigM*(1-y[s][t]) + x[s] - x[t] - dur[t], min=0)
    for s,t in CONFLICTS]
JobShop.solve('Coin')

Friday, December 18, 2009

Gambling with Stochastic Dynamic Programming

Stochastic dynamic programming is a technique for multi-period decision making under uncertainty. A classic illustration is the case of a risk neutral gambler entering a game with a stake x and a goal of leaving the game with a stake N > x.  The gambler places a wager from his or her current stake, then either wins an equal amount with probability p or loses the wager with probability q. The game continues until the gambler either reaches the goal or goes bust. The gambler cares only about the expected value and not about the range of possible outcomes and therefore is said to be risk neutral.  The problem is to find a betting strategy maximizing the expected value of the game.

The usual setup for solving this problem involves Bellman's 'principle of optimality'. The optimality criterion establishes a recursive expression for the expected value of the game that is a function of the stake. The 'Bellman equation' can be solved for an optimal strategy by a well-known technique called 'policy iteration'.  Alternatively, the same equation can be solved by linear programming as demonstrated in the GMPL file RNGambling.mod.

A related example is the case of the risk averse gambler.  The risk averse gambler enters the game for a predetermined number of rounds with a goal of maximizing the expected 'utility' of the final outcome.  Utility is a concave function of wealth that more heavily penalizes losses than gains. The GMPL model RAGambling.mod demonstrates that this problem can also be solved with linear programming. This example is a closely related to the well known 'Kelly Criterion' (or here for a popular account).

Thursday, December 17, 2009

Example of Business Planning through Stochastic Programming

The final project/exam for the course concerned a business planning problem that could be formulated as a two stage stochastic program.  I've posted here a variation of that problem as an additional example of GMPL modeling: BusinessPlanning.mod.

Monday, December 7, 2009

Final Exam, Assignments, and Assignment 7

Given the collision of so many end-of-semester activities, let's forego any new assignments for this week. Assignment 7 will be due Wednesday followed by a take-home final that will be due noon on Thursday, December 17th.  We'll take more about the final during Wednesday's class session.

Regarding Assignment 7, please feel free to use the GMPL model for portfolio optimization using the MAD criterion.  You'll need to adapt it to the problem data given in the assignment, and in particular deal with the portfolio constraint in part (b) of the problem.

Friday, December 4, 2009

Portfolio Optimization with a Mean Absolute Deviation Criterion

A question came up after class on whether GLPK could be used for portfolio optimization.  At first glance the answer is no because the classic Markowitz formulation for portfolio optimization with a  mean-variance criterion leads to a quadratic program.  GLPK is a linear programming solver that doesn't generally handle quadratic programs. There are workarounds of limited utility, which is demonstrated in a post below.  But in general, GLPK is not suited to quadratic programming.

However, changing the optimization criterion leads to some interesting and useful tools for portfolio optimization that can be expressed as a linear programs. An example is a criterion based on the mean of the absolute deviation (MAD) of return as a risk measure, an idea attributed to Konno and Yamazaki (1991).  This approach also has some other interesting features, including the direct use of return data either from simulation or historical records. Solutions for the MAD criterion exhibit second order stochastic dominance - a a theoretically important feature for risk measures that is not shared by the Markowitz formulation.

To demonstrate the concept, I've prepared a sample GMPL file PortfolioMAD.mod that computes an optimal portfolio using the MAD criterion.  The calculation shows how to compute random samples from multivariable Normal distribution in GMPL.  This requires an implementation of a Cholesky factorization in GMPL which is given in the code.

Thursday, December 3, 2009

Notes on Assignment 6: Australian Motors Case Study

I've been responding to questions regarding the Australian Motors (AM) case study and wanted to pass on a couple of observations.

First, the purpose of this exercise is to implement the business model outlined in Exhibit 8.  AM leases trucks on 2, 3, 4, and 5 month contracts. It makes money by renting those trucks to customers for short term needs.  AM is like a rental car company that obtains it cars through leasing.  Your task is to implement a business model that AM can use to schedule leases for maximum profit.

Second, the model described in Exhibit 8 assigns no cost to AM for leasing surplus vehicles for its fleet that don't, in turn, get rented out to customers. This unrealistic assumption leads to a trivial solution.  You'll get a more realistic solution if you modify the objective function to include an additional cost of $500 per month for each surplus vehicle in the inventory.

Take some time to look over your solution.  In particular, the marginal values on the inequality constraints provides much insight into this business case.  For example, which months and demand segments would your prioritize for a marketing effort intended to increased profits?  Do you see months and demand segments where increasing demand by one unit would increase profits by more than the monthly contribution of that unit?  Why does that occur?

Notes on Assignment 7: Portfolio Optimization in GMPL/GLPK

I was reminded that not all of you are experienced with Matlab, and was asked whether it would be possible to do the portfolio optimization problem in Assignment 7 using GMPL/GLPK.  The answer is a qualified yes.

The issue is that the Mean-Variance (MV) formulation for portfolio optimization leads to a quadratic programming problem.  GLPK is for linear programs, thus the quadratic objective cannot be directly introduced to GLPK.  For the first part of this assignment, however, you can work around the issue by using the first order necessary conditions that define a solution for the MV portfolio.

Look at page 30 of the notes for Lecture 7 entitled "General Case". There you will find linear equations which constrain solutions to the MV problem.  Two of these are simple equality constraints that can be directly translated into GMPL using portfolio weights as the decision variables.  The remaining n linear equations involves the weights and two additional decision variables for the Lagrange multipliers.  Adding those two decision variables to the model allows you to translate these equations into a system of GMPL constraints.  You won't need an objective function because the linear constraints are sufficient to determine a unique solution for the MV portfolio.

For your convenience, I've put a GMPL data file called PortfolioMVO.dat on the downloads list with the necessary data for Assignment 7.

The trouble you'll face is that the second part of Assignment introduces an additional inequality constraint. This change requires you to go back and establish KKT necessary conditions that include this new constraint.  In the end you will have have a somewhat more complex GMPL model with additional decision variables and some discrete logic.  If you're interested in doing this, let me know and I'd be glad to work you through the mechanics.

Friday, November 20, 2009

Lecture 4: Project Management Notes and files





I've uploaded notes for lecture 4 along with a polished up version of the GMPL model for doing the Critical Path Method (CPM) calculations.  Hopefully the notation is more consistent throughout the notes, and translates directly to the GMPL file.
We're becoming somewhat more sophisticated users of GMPL, so you'll note that the model and data files are now separate.  For the assignment you should be able to create a new data file describing the problem and obtain needed results with few (if any) changes to the model.
Project management is an industry of its own, and you'll find a wide variety of software tools to support project management.  Software ranges from simple charting all the way to tools for planning, scheduling, and controlling mission-critical corporate-scale endeavors.  If you're interested in this topic, you might try to try your hand at GanntProject, which is a cross-platform, GPL licensed tool for small scale desktop project management.  One option is to run it right from the web, so you can try it without having to fiddle with software installations.

Wednesday, November 18, 2009

Lecture 3 Notes, Assignment, and Examples

I've posted the notes for Lecture 3 and the GMPL file for the Network/Transportation example we developed in class.  I polished up the class example with comments, attributions, cleaned up variable names and so forth so that it is a good example of GMPL programming practice. Be sure to understand and master the basic techniques demonstrated in this example.

Assignment 2 is due Friday, and Assignment 3 -- the "Custom Vans, Inc." case study will be due next Tuesday.  Assignments 4 and 5 can wait until after the Thanksgiving holiday. We're moving pretty rapidly, so we're trying to maintain an appropriate balance between workload and staying current with lecture topics.

Finally, I've added some additional GMPL examples for Lecture 2.  These are mainly demonstrations of techniques in GMPL including some rather exotic computations one can do with sets.  You may find these useful if you find yourself trying to do complicated manipulations with sets.

For what it's worth, I'm finding the Greenhouse to be a good room for teaching.  Unless there are objections, let's try to meet there for upcoming class sessions.

Tuesday, November 17, 2009

Using GLPK/GMPL with GUSEK ...

In class I demonstrated a minimalistic, bare-bones installation of GLPK. The tools I used were a text editor (such as notepad on Windows machine, or textedit on a Mac) and the GLPK command glpsol.  These simple (and free) tools can handle a wide variety of optimization problems, but are definitely 'old school'. The good news is that integrated user interfaces are available which you may find more convenient for general development work in GMPL.

For Windows you have several choices, including the recently developed GUSEK.  My first reaction is that GUSEK appears to be a very useful tool for the development of GMPL models as used in this course. It's easy to download and install.  Go to http://sourceforge.net/projects/gusek/ and push the download button. Unzip the download file to any folder.  In that folder open the gusek.exe file and you should be good to go. Type your GMPL model in the first tab, save the file, then run the file using the 'Tools>Go' menu item.  If you select the 'Tools>Generate Output File on Go' option, then the optimization results are shown in a separate tab which can also be saved and printed.

For the Mac  I was able to get GUSEK working under the WINE system that allows one to use Windows software in a Mac OS X or Linux environment.  Here's a two step process that should you get you up and running (i.e., it worked for me):
  1. Download and install WINE.  The easiest way is to get a precompiled version from the web site http://winebottler.kronenberg.org/ (The download button is the little text item with the downward pointing arrow labeled 'download').  The file should download and unzip automatically then open a window for installation.  Drag the 'Wine' icon to the application folder (you may need to provide your Mac password). 
  2. Next download GUSEK from  http://sourceforge.net/projects/gusek/.  It should download and unzip automatically.  Put the gusek folder somewhere convenient (on your desktop, for example).  Go to that folder and double-click gusek.exe.  It's a little slow the first time as WINE configures itself to run the file. There may also be several windows that open up as WINE goes through the startup - this is part of it's configuration process for the first time you use GUSEK. A little wine glass icon will appear in the upper left status bar of your Mac - that's your interface to the WINE software.
  3. At this point you're in business.  You can either open an existing GMPL model or create one from scratch.  Use the Tools menu to run and display the results of the optimization.
Hopefully these instructions are enough to get both PC and Mac users going with GMPL and GUSEK. Let me know if there are problems.  Feel free to add comments to this post if you have any advice or experiences to share regarding the installation of GUSEK.