Tuesday, December 09, 2008

Google Translate for Trac timeline comments

JavaScript that translates changeset comments in the Trac timeline using the Google Language API. How to use it with Trac is described on this Trac-Hacks page. Here I just want to note that the code can be reused for other HTML pages.

This code is used at http://farmanager.rainforce.org/timeline

Just do not forget to add jQuery and the following line to your header.

<script type="text/javascript" src="http://www.google.com/jsapi"></script>

Here goes the main code:

// Translate changeset comments in Trac timeline
// Using Google Language API version 1

// Code in public domain
// techtonik // rainforce.org


// wrap the code to use $ as jQuery function name
(function($){


// setup language parameters

var srclang = "ru";
var tgtlang = "en";

var gtrans_btn_text = "Translate Russian comments";

var srclang_a_z = {};
srclang_a_z["en"] = /[a-z]+/i;
srclang_a_z["ru"] = /[а-я]+/i;


// load Google API

google.load("language", "1");


// make this anonymous function executed by jQuery when document is loaded
$(document).ready(function() {

  // add "powered by Google" logo into footer
  $('<p id="googleattr" class="left"></p>').insertAfter("#footer p.left:last");
  google.language.getBranding('googleattr');

  // add trac_timeline_translate() button to timeline menu
  var gtrans_btn = $("<input type='button'/>")
    .attr({
      id: "gtrans_btn",
      value: gtrans_btn_text
    })
    .click(function() {
      trac_timeline_translate();
      gtrans_btn.attr("disabled", "disabled");
      return false;
    });

  $("#prefs").append("<hr/>");
  $("#prefs")
    .append(
      $("<div align='center'></div>").append(gtrans_btn)
    );
});


function trac_timeline_translate() {
  // translate changeset messages
  g_trans_nodes( $("dd.changeset") );
}


function g_trans_nodes(nodes) {
  // translate nodes one by one
  // todo: group many small nodes into one packet < 5000 bytes

  var transTexts = [];
  var transNodes = [];

  for (var i = 0; i < nodes.size(); i++) {
    var text = nodes[i].innerHTML;

    if (text.match(srclang_a_z[srclang])) {
      transTexts.push(text);
      transNodes.push(nodes[i]);
    }
  }

  function makeTransNodeClosure(target_node) {
    // closure to keep the node parameter accessible
    // in the callback function
    var node = target_node; // local variable
    var trans_callback = function(result) {
      node.innerHTML = result.translation;
    };
    return trans_callback; // return function
  }

  for (var i = 0; i < transTexts.length; i++) {
    google.language.translate(transTexts[i], srclang, tgtlang,
                              makeTransNodeClosure(transNodes[i]));
  }
}


})(jQuery);

Friday, September 26, 2008

Python in Industrial Automation

Before celebrating Google's 10th birthday I would like to make a gesture of goodwill to fellow comrades from the industrial automation field, who over the years have battled with a very special kind of the most conservative software in the world.

Industrial Automation Software

It is the software that directly puts people's lives at stake. A little bug may kill people and inflict tremendous physical damage to buildings, vehicles and equipment. No wonder that industrial automation software is among the most slowly developed technologies in the world. People fear to change things that work when the cost of error is so high. But using outdated software that nobody is capable of maintaining carries even greater risks. Just as a Windows box connected to the Internet without updates is brought down by viruses in minutes, the enterprise is brought down by unmaintained legacy systems that just await their time to explode. Industrial software ought to be stable, error free and resistant to all kinds of threats, but in most cases it is not. Most industrial software products lack public exposure, and so far I haven't seen any serious industrial product that didn't crash.

Industrial software is the classic example of vendor lock-in. It happens just everywhere. If you have some hardware sensors in the field and would like to collect signals from them, most likely you would have to buy an industrial controller, or PLC, which will be on 24/7/365/10 to gather your data. PLCs were mostly used solely for control automation without the need to share data with external systems, but in the IT age you will need that data sooner or later. To get it from the PLC you would need to buy an expensive OPC server capable of polling the PLC over its own protocol and providing you with data in OPC format. OPC stood for OLE for Process Control and, as you can guess from the name, was vendor-locked to Microsoft from the start. It was based on OLE, COM and DCOM - Microsoft's proprietary technology. To read one byte from your PLC you would have to buy Windows, buy an OPC server and pay a specialist to write an OPC client to access that one byte from your sensor. Even after that some companies dared to include a "Lower TCO" slogan in their marketing campaigns. There was no alternative in those days, though. MS SQL was the only promoted "realtime database", even if there was a word "near" before the quotes, but it has support.

Now there are plenty of open protocols and standards for data communication, many databases and time-tested solutions from other IT areas, but things are moving slowly in the industrial automation world, where "open protocol" still means that you are free to implement a specification but need to pay for the texts, and that is considered ok. DCOM is slowly being replaced by .NET, and new consortiums appear just to put their "approved" label on community accepted standards. Industrial automation is a business with enterprises that are primary sources of capital directly extracted from raw materials, so there is nothing exceptional in the constant buzz generated by the efforts of players to bite off a piece of this cake. But even more don't want to lose their share. The players are huge, and so are the problems with industrial applications. It takes a lot of effort to prove something different when it goes against mature and experienced players with a big sum of arguments in their bank accounts.

Industrial automation applications are meant to be error free and stable, but there are still many errors, flakiness and instability there. The high cost of specialized instruments to debug and troubleshoot problems, low community response, closed protocols and formats, and sales-oriented support that is not allowed to give developers any feedback - all these things are the flip side of modern software development trends. Sooner or later the situation will change, and industrial software will undergo the same change that affected the internet and made it enjoy the fruits of open source developers' collaboration. In the meanwhile let me show an example of how to "Lower your TCO" for the task of accessing data from Siemens S7 controllers.

Introducing Python for Automation World

You can control everything in Python. Starting from virtual game characters and application objects to real robots and trains. Just like you can control them in C. That means that Python is capable of directly using C libraries. Even if they are not really suitable for such things, it gives you the freedom of choice. Pulling a slow snake's tail may be more efficient than guarding against fast mosquitoes aiming to suck your memory away. Python gives you a finer degree of control by focusing your efforts on program logic, while in C about 30% of the code deals with memory allocation and type conversion. Python has a clearer syntax than C and, what is more important, the same syntax is used to control speedy C libraries.

Controller Data - Cornerstone of Automation

Data is everything. In automation the primary source of data is sensors - any kind of device capable of measuring something, in common terms. Sensors are usually made simple to keep them free from errors, so they often provide their data as electrical signals. These signals are usually transmitted in a rough industrial environment where electrical disturbances are so common that direct connection of such sensors to computers (through a COM port, for example) is a cause of frequent PC component replacements. That's why specialized devices exist to poll sensor data. These are usually PLCs (programmable logic controllers) and distributed periphery devices that have a higher degree of protection against physical threats than common PC stations. Data inside the PLC is the source of information used by industrial panels to visualize and control technical process parameters - the combination of a PLC and a visualization panel is commonly referred to as a SCADA solution. Access to data hidden inside the PLC from a panel is made using proprietary protocols even if the medium is Ethernet. Communication with upper levels requires the aforementioned industrial software provided for a price. But some protocols have alternative open source C libraries or drivers for direct PLC data access, and I am going to show how to use such a library with Python to run and stop a PLC.

Parts of Assembly: libnodave, SWIG and Python

libnodave is a communication library for exchanging data with Siemens PLCs, released under the LGPL license (you are welcome to support this project if you find it useful). Python is a general-purpose scripting language. SWIG is a tool to bind C and Python. Another required thing is a C compiler. Because automation designers are mostly Windows folks, we will use GCC from the MinGW project. A minimalistic user interface is done with the wxWidgets library, which is also accessible from Python as wxPython (created with SWIG, by the way). The total cost of all tools is zero. The performance of the libnodave solution still has to be measured. The Siemens site contains performance data that can be useful for comparison.

libnodave allows communicating with a PLC through an API described on its site. To start/stop the controller you need to compile and execute the following C program:


#include <stdlib.h>
#include <stdio.h>
#include <string.h>

#include "nodavesimple.h"
#include "openSocket.h"

#ifdef LINUX
#include <unistd.h>
#include <sys/time.h>
#include <fcntl.h>
#define UNIX_STYLE
#endif

#ifdef BCCWIN
#include <time.h>
void usage(void);
void wait(void);
#define WIN_STYLE
#endif

#include <errno.h>

int main(int argc, char **argv) {
    int useProtocol, useSlot;

    daveInterface * di;
    daveConnection * dc;
    _daveOSserialType fds;

    useProtocol = daveProtoISOTCP;
    useSlot = 2;

    fds.rfd = openSocket(102, "192.168.1.244");
    fds.wfd = fds.rfd;

    if (fds.rfd > 0) {
        di = daveNewInterface(fds, "IF1", 0, useProtocol, daveSpeed187k);
        daveSetTimeout(di, 5000000);
        dc = daveNewConnection(di, 2, 0, useSlot);

        if (0 == daveConnectPLC(dc)) {
            daveStart(dc);
            return 0;
        } else {
            return -2;
        }
    } else {
        return -1;
    }
}


What I am trying to achieve is the following equivalent in Python:

import nodave
plc = nodave.NoDaveTCPConnect('192.168.1.244')
plc.start()


To make this happen the libnodave library should be made available to Python as a module. A module is imported to make its functions available to Python scripters. To make a module for libnodave we need to compile the library, generate the wrapper source for the module with SWIG, and compile the module itself.
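
Just to make the goal concrete before the follow-up, here is a rough sketch of what such a wrapper could look like, assuming SWIG exposes the C names from the program above one-to-one (the module and attribute names here are assumptions, not working code):

import nodave  # hypothetical SWIG-generated module

class NoDaveTCPConnect:
    # thin wrapper over raw libnodave calls, mirroring the C program above
    def __init__(self, host, port=102, slot=2):
        fds = nodave.daveOSserialType()  # actual SWIG name may differ
        fds.rfd = nodave.openSocket(port, host)
        fds.wfd = fds.rfd
        if fds.rfd <= 0:
            raise IOError("can't open connection to %s" % host)
        di = nodave.daveNewInterface(fds, "IF1", 0,
                                     nodave.daveProtoISOTCP,
                                     nodave.daveSpeed187k)
        nodave.daveSetTimeout(di, 5000000)
        self.dc = nodave.daveNewConnection(di, 2, 0, slot)
        if nodave.daveConnectPLC(self.dc) != 0:
            raise IOError("PLC refused connection")

    def start(self):
        nodave.daveStart(self.dc)

    def stop(self):
        nodave.daveStop(self.dc)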

[To be continued..]

Wednesday, August 27, 2008

Bazaar Experience

Not so long ago I decided to play with Launchpad - a site for open source collaboration. It is built around the decentralized Bazaar version control system, which has reached version 1.6 by this moment. I had read that centralized VCS are bad and DVCS are good, but never had enough free time to check this in practice, and finally..

The first thing I liked in Bazaar is that it is written in Python - that means that bugs can be found and fixed rather easily. This was also the thing that I disliked, because the standalone bzr distribution includes Python and takes 14MB, and the package for an installed Python distribution doesn't add bzr to PATH.

Accessibility. I often work through tunnels, proxies, firewalls and other stuff that kind people tend to put here and there in each Local Area Network. Not only do I have to deal with them myself, but sometimes I also need to teach others. In SVN proxy settings are stored in a documented configuration file called 'servers'. In Bazaar there is no such file, and all configuration is done with environment variables and Python settings that are not documented. To use Bazaar through a proxy, set environment variables:

set http_proxy=127.0.0.1:1080
set https_proxy=127.0.0.1:1080

These proxy settings are rather common for Linux/Unix machines, but cause particular pain on Windows machines in a domain. While SVN is able to transparently authenticate against a domain proxy without asking for sensitive passwords, Bazaar lacks such a feature and needs another authorization proxy server (NTLMAPS) as middleware.

Usability. In general distributed version control systems should be better than centralized ones. A lot of articles and tutorials describe the theory very well, but in practice there are many things that are not that brilliant. For example, I needed to fix the stubborn DevPak plugin for Code::Blocks that didn't want to install one specific version of the curl library. In SVN I would do a partial checkout, make modifications and upload a patch to the forum. Not very convenient. In Bazaar I would have to make a branch of the whole Code::Blocks repository, start patching the plugin and then merge the whole branch back. But branching the whole repository is overkill for such a little plugin, and you can't make a branch out of a subtree in Bazaar. More than that - Bazaar doesn't even allow exporting a subtree of a repository. Download either the whole thing or nothing. Not too effective, especially when traffic costs 0.02 cents per MB.

So, I manually created a directory structure for the plugin similar to the existing project and started a new Bazaar repository from scratch. After a couple of revisions I released the fixed plugin for stable Code::Blocks and wanted to switch to the latest development code and merge the changes made to the plugin in trunk into my version. Again, Bazaar disappointed me. I knew that SVN wouldn't allow merging with a different repository, but at least I could grab files from the source tree and merge them manually. Bazaar didn't allow me even that. With almost 100 commands it just couldn't export some files.

It may be that Bazaar is a good system for some small projects, but in my case it appeared unwieldy and too expensive, even being open and free.

Wednesday, July 02, 2008

Applying Unified Diffs with Python

Windows has a lot of annoyances for developers. One of them is that it lacks some precious tools - namely "diff" and "patch". They can be downloaded from the Internet, but when the latest patch binary from the Win32 ports, version 2.5.9, refused to apply a patch built with "svn diff" and exited with an error, I decided to write my own version in Python. If it is included in the standard Python distribution as a logical complement to the Scripts\diff.py utility, then at least for people with Python there will be no problem applying patches on Windows. One limitation though - the script parses only the most popular patch format - unified diff.



To start out I've outlined the structure of a unified diff using information from Guido van Rossum's blog and Wikipedia.

Parsing logic is implemented using a brute-force regex approach to avoid dependencies on parsing libraries (like pyparsing etc.). I took this approach to compare the code with the different techniques from Text Processing in Python by David Mertz and to learn how I can improve it.
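
For illustration, here is a simplified sketch of the kind of regex matching involved; the hunk header format follows the unified diff convention, while the real script handles more details:

import re

# a hunk header looks like: @@ -srcline[,srclen] +tgtline[,tgtlen] @@
re_hunk = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")

def parse_hunk_header(line):
    match = re_hunk.match(line)
    if not match:
        return None
    # a missing length defaults to 1 by unified diff convention
    return tuple(int(group or 1) for group in match.groups())

print parse_hunk_header("@@ -1,5 +1,6 @@")  # (1, 5, 1, 6)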



Linefeeds are handled in automagic mode. The proper line ending is detected during scanning of the source file. If the source file has mixed line endings, lines from the patch file are not transformed and are written "as is". If all lines in the source file end with the same sequence, lines from the patch file are stripped of their own line endings and the detected ending is applied.
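
In pseudocode, the detection could look like this (a simplified sketch, not the actual code of the script):

def detect_eol(lines):
    # return the single line ending used in a file, or None if mixed
    endings = set()
    for line in lines:
        if line.endswith("\r\n"):
            endings.add("\r\n")
        elif line.endswith("\r"):
            endings.add("\r")
        elif line.endswith("\n"):
            endings.add("\n")
    if len(endings) == 1:
        return endings.pop()
    return None  # mixed or no line endings - write patch lines "as is"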



The project doesn't have all the UNIX patch options, but should be useful even without them. You can find it, with sources (MIT license), at http://code.google.com/p/python-patch/

Tuesday, June 10, 2008

Yet another CVS to SVN transition

Some time ago I compiled a short comparison of CVS vs SVN. It should not be hard to get the idea from it that SVN is better - easier to understand, easier to set up and to use. While there are many new decentralized systems around like Git, Darcs and Mercurial, SVN still has the best working Windows clients, which are even able to transparently authenticate against domain proxies. That's why SVN is the most accessible way of getting sources.

Many projects realize that and make the switch to SVN to expose their public repositories to a wider audience, even if some conservative developers are unwilling to change their habits and tools. Still, the convenience brought by SVN is worth the effort of rewriting tools and moving the legacy codebase over to new rails. CVS deserves to be honored with a special award in the history of software development, but forward-looking open source projects realize that evolution never stops and the time has come to replace legacy CVS with tools that value developers' time. Here is a list of such projects that made the decision, to help with yours:
To migrate from CVS there is the cvs2svn tool, written in Python and hosted on tigris.org, which can move a CVS repository as big as Mozilla or FreeBSD to SVN or Git.

Monday, May 26, 2008

CGI Python scripts on Apache for Windows

Just a quick tip on how to make a CGI script run on Windows under Apache. The main problem with CGI scripts is that their first line usually contains a path to the executable program - the Python interpreter - and while on a *nix platform this line is more or less the same:

#!/usr/bin/env python

on Windows it usually differs from one installation to another. If in a *nix environment you do not have to change this line most of the time, on Windows most of the time you will need to. Luckily, Apache developers invented an option to look up the path to the script interpreter in the registry by following the file extension association. Look at this .htaccess for example:

Options +ExecCGI
AddHandler cgi-script .py

<FilesMatch "\.py$">
# Use the interpreter found in registry by file association
ScriptInterpreterSource Registry

</FilesMatch>

Here ScriptInterpreterSource Registry is the magic phrase to turn on the lookup. It is well described in the Apache manual. FilesMatch is another directive, used to limit the scope of the lookup to .py files only.

Finally, a reminder of what a minimal CGI script in Python looks like.

#!/usr/bin/env python

print "Content-Type: text/html" # HTML is following
print # blank line, end of headers

print "xxx"

Tuesday, May 06, 2008

Writing Far Manager plugin in C

Far Manager is a file manager for Windows; however, most of the functionality in Far is contained in plugins. Here is a short introduction on how to create one.

You'll need Far Manager with the Development Pack, the GCC compiler and the Developer's Encyclopedia at hand. The Encyclopedia is also available in the Development Pack in .chm format.

A plugin is a .dll file compiled from .c source. For the minimal example you'll need to create a .c file with at least one function, GetPluginInfo(), that tells Far about the plugin's capabilities.



This Sequence Diagram illustrates the communication of Far with a plugin that does absolutely nothing. The source of the plugin is below:

#include "plugin.hpp"

void WINAPI GetPluginInfo(struct PluginInfo *Info) {
    Info->StructSize = sizeof(struct PluginInfo);
}

The Encyclopedia states that StructSize should be filled with the size of the PluginInfo structure to maintain backwards compatibility in case of future API changes. Because the plugin really does nothing, it is impossible to tell if it works or not. To prove it really works, let's add logging to a file.

#include <stdio.h>
#include "plugin.hpp"

void WINAPI GetPluginInfo(struct PluginInfo *Info) {
    FILE *file;
    file = fopen("minilog.txt", "a+");
    fprintf(file, "%s\n", "Fired.");
    fclose(file);

    Info->StructSize = sizeof(struct PluginInfo);
}

Copy "plugin.hpp" near to mini.c and execute GCC to compile the plugin:

gcc -shared -o mini.dll mini.c -Wl,--kill-at

Place mini.dll into the plugins path or execute Far with the /p parameter pointing to the directory with mini.dll. Press Ctrl-R and look for a "minilog.txt" file in the current dir containing burning proof that the plugin works. This is enough to get started and follow the Developer's Encyclopedia on your own; still, there are some technical details that may answer questions about GCC options and .dll writing not covered in the Encyclopedia.


DLL, exports and GCC options


The GCC parameters shown above instruct it to compile mini.c into the shared library mini.dll. To be recognized as a plugin, the .dll should make the GetPluginInfo() function visible to Far (i.e. exported from the library), the same way as any other function that Far calls in plugins. By default all functions in a .dll are exported, and this is visible in the .dll's export table, which can be checked, for example, with BIEW. The list of exported functions can (and usually should) be narrowed to increase performance and keep the API clean, either with a .DEF file or with a __declspec(dllexport) addition to the function prototype. Before going on with the example there is one more thing left to explain - the -Wl,--kill-at parameter.

By default function names are exported with an "at" suffix like GetPluginInfo@4, where 4 denotes the number of bytes the arguments take. When looking for plugins, Far looks for clean names without @4. The last option to gcc, -Wl,--kill-at, is required to strip the "at" suffix from exported function names. Read this link for more details about this calling convention.

To illustrate how __declspec(dllexport) works, we move the logging code into a separate function and compile it with:

gcc -shared -o mini.dll mini.c

#include <stdio.h>
#include "plugin.hpp"

void logfile(const char* msg) {
    FILE *file;
    file = fopen("minilog.txt", "a+");
    fprintf(file, "%s\n", msg);
    fclose(file);
}

void WINAPI GetPluginInfo(struct PluginInfo *Info) {
    Info->StructSize = sizeof(struct PluginInfo);
    logfile("GetPluginInfo called.");
}

If you now launch BIEW on mini.dll and press Alt-F3 to get to the export table, you'll see two exported functions - logfile and GetPluginInfo, the latter with an @ suffix. The @ is added by the WINAPI calling convention.



To remove the suffix, recompile the plugin with:
gcc -shared -o mini.dll mini.c -Wl,--kill-at

To leave only the GetPluginInfo() function in the export table, add __declspec(dllexport) to its definition. If __declspec(dllexport) is present in at least one function's definition, all other functions that don't have this tag will be excluded from the export.

#include <stdio.h>
#include "plugin.hpp"

void logfile(const char* msg) {
    FILE *file;
    file = fopen("minilog.txt", "a+");
    fprintf(file, "%s\n", msg);
    fclose(file);
}

void WINAPI __declspec(dllexport) GetPluginInfo(struct PluginInfo *Info) {
    Info->StructSize = sizeof(struct PluginInfo);
    logfile("GetPluginInfo called.");
}

Inspect the .dll to see that the "at" suffix is gone (thanks to GCC options) and only one function is present in the export table (the one marked with __declspec).
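
If BIEW is not at hand, the export table can also be checked from Python with the standard ctypes module. A small sketch (ctypes resolves exports by exact name via GetProcAddress, so the decorated GetPluginInfo@4 would not be found under its clean name):

import ctypes

dll = ctypes.CDLL("mini.dll")  # load the plugin as a plain DLL
# attribute lookup fails unless the undecorated name is exported
print hasattr(dll, "GetPluginInfo")  # True after -Wl,--kill-at
print hasattr(dll, "logfile")        # False once __declspec hides it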



Further Steps


There are four basic exported functions in the Far Plugin API that come in handy at the start. Plugins export these functions to let Far call them, supply information to the plugin and gather data about it. The functions (if present) are invoked in a particular order, which is illustrated in the following table.


Name                 Required  Order  Comment
GetMinFarVersion()   no        1      called first
SetStartupInfo()     no        2      always called if present - good for extra initialization code
GetPluginInfo()      yes       3      cached
OpenPlugin()         no        4      not required, but without it plugin is pretty useless



With this yet another lame Sequence Diagram there should be enough useful information to get started. Among the improvements that could be made to the mini.c code are replacing stdio.h library calls with native Windows API and Far API calls, and adding help, menus and language files. However, this falls outside the scope of this post, so the best way to move further is to check out the Encyclopedia. Good luck!

Tuesday, April 29, 2008

CVS vs SVN

Open Source for many developers means getting their hands dirty in many projects simultaneously, sending patches here and there as they polish their own way through the code to make sure the bug encountered today will not pester anyone in the future. Everybody seems to be interested in reporting bugs and sharing patches, but in reality it doesn't happen too often, mostly because of the lack of time necessary to contribute. Following the modern enterprise tendency to define various measures and estimates, let me call this parameter the Time-To-Contribute (TTC) value. TTC is directly influenced by such factors as the activity of developers and the availability of the code, as well as the usability of and knowledge about development tools.

It would be interesting to get deep into the details of code contributions, but to keep the long story short and justify the title, let's compare SVN and CVS. These tools provide access to source code, and a poor decision may affect developers' experience and desire to contribute in the future. This post is based on an RFC submitted to the PHPDOC community a year ago, which may be useful for other conservative CVS parties out there like MinGW.

Facts only.

++++ Accessibility:
CVS usually blocked by proxies (needs dedicated port)
SVN works over HTTP and HTTPS via WebDAV

CVS doesn't work behind a proxy
SVN works with proxies

CVS checkout is complicated, it is hard to remember all the prerequisites
cvs -d :pserver:cvsread@cvs.php.net:/repository login
cvs -z3 -d :pserver:cvsread@cvs.php.net:/repository checkout -P phpdoc
SVN project checkout command is easy to remember
svn co http://svn.php.net/repository/phpdoc

CVS is complicated to learn
SVN has a perfect book

CVS is abandoned by developers
SVN is supported


++++ Security:
CVS password is transmitted over network in cleartext with simple rot13-like translation
SVN works over HTTPS, supports Apache authentication schemes


++++ Usability:
Command set is mostly the same

CVS fetches previous revision online to build diff of changes
SVN builds diff offline

CVS takes half the disk space
SVN stores full copies of checked out files for comparison

CVS screws linefeeds
SVN doesn't screw linefeeds

CVS is file based - history is separated for each file
SVN is atomic - modification of group of files is a whole

CVS maintains independent revision numbers for each file
SVN revisions are global for repository

CVS leaves deleted directories in repository tree
SVN keeps directory tree tidy

CVS has convenient concept of branches/tags
SVN branches/tags are just directory copies in repository

CVS branches/tags concept is complicated
SVN branches/tags are easy to understand


I think that for most of us the choice is rather obvious, unless there is a practice of using $Id$ to track how many changes a file has undergone since the last time, or people are too addicted to CVS branches. Nevertheless there are still many projects that use CVS, and the true reasons why people do this are: old habits, absence of time to learn something new, and dependencies on CVS hard-coded in legacy scripts. Of course, laziness is also a reason, and it's funny that this laziness sometimes pushes people to seek more convenient tools.

Wednesday, April 09, 2008

503 Service Temporarily Unavailable

Just a typical 503 page source, to know what to look for in web site monitoring scripts. This one is usually displayed by Apache when the Tomcat behind it is unavailable.

--cut-[503.html]-
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<HTML><HEAD>
<TITLE>503 Service Temporarily Unavailable</TITLE>
</HEAD><BODY>
<H1>Service Temporarily Unavailable</H1>
The server is temporarily unable to service your
request due to maintenance downtime or capacity
problems. Please try again later.
</BODY></HTML>
-----------------
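
By the way, in a monitoring script the HTTP status code is more reliable than matching this HTML. A minimal sketch in Python (the URL is a placeholder):

import urllib2

def service_is_up(url):
    # returns False when the server answers 503, True on success
    try:
        urllib2.urlopen(url)
        return True
    except urllib2.HTTPError, e:
        if e.code == 503:
            return False
        raise
    except urllib2.URLError:
        # connection refused, DNS failure etc.
        return False

print service_is_up("http://example.com/")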

Friday, March 14, 2008

Binding PHP to Windows GUI

WinBinder, a great tool for creating Windows GUI applications with PHP, was once my favourite toy. Supplied with very easy to follow examples and a great-looking form designer, it seemed capable of becoming not only a standard for making Windows interfaces for PHP, but also a tool of choice for Rapid Application Development. But progress goes on, and Windows is no longer the preferred desktop system among developers. Many choose Linux, some folks stick with BSD, and of course a lot of people fall in love with MacOS. Windows is not as popular as before, and everybody understands that making your application with PHP + WinBinder limits its widespread usage. It is a pity to know that such a great toy may have no future, as every effort needs support, and maintaining PHP to C bindings for the Windows API takes a lot of time and requires a lot of specific knowledge. Although the code of WinBinder is very well structured, the smaller the developer base, the fewer chances the project has to get its bugs fixed in time. Most WB users are PHP folks, and there are not many with the required C experience among them. Thanks to Rubem Pechansky for inventing this fabulous toy, and let's hope that one day cross-platform GUI building toolkits like wxWidgets will let us create similarly nice, fast and nifty applications.

Tuesday, March 04, 2008

Bloodless partition backup

Sometimes it may happen that a system black box arrives at your desk together with a person begging you to do something to make it work again. It may have been a hardware failure, a power outage, a virus or just a prank that made this panicked person pull the plug out of the outlet and shamefully admit: "I am afraid to turn it on after that!!" What should be the first reaction? Back up what is left. There are tons of backup software available, but most of it just copies files from one place to another on a regular basis. What you need in this situation is to copy the whole disk to a network drive without booting the system, preferably on the same hardware to avoid remounting HDDs. This is a task for disk cloning tools.

And while it seems logical to think about Acronis True Image or Norton Ghost, on second thought they are not really great when you need a quick burn-and-use ISO image, which they are unlikely to provide in their downloads section. So the best way is to check what open source has to offer. There are several tools:

Mondo Rescue
Trinity Rescue Kit (TRK)
Linbox Rescue Server
FOG
g4u
Clonezilla
Ghost for Linux (g4l)
PartImage

The list of 8 products seems enough to choose one. Let me remind you that we need a ready-to-run Live CD capable of saving a partition or disk image to a remote network location. Throw out Mondo - it doesn't provide a Live CD. TRK 3.2 also goes to the shelf - this very nice Boot CD doesn't include tools to backup/restore a partition to a network drive. A pity. Linbox Rescue Server - the free GPLed version can't backup NTFS partitions - won't work. FOG - a project named "Free Ghost" that doesn't have a Live CD! and even with such a promising name is designed solely for distributing the same disk image to multiple workstations over the network. Removed. So what is left:

g4u
Clonezilla
Ghost for Linux (g4l)
PartImage (on the PING and SystemRescueCd Live CDs)

After browsing these sites for a bit, the first intention was to get rid of tools with ugly web pages and a short feature list. And this is where g4u beats them all. How can two floppy images, 3MB in total, compete with the raw power of the 90MB image of Clonezilla? Well, let's see. It was very easy to download and run g4u 2.3 in VirtualBox. The boot message gives almost all the necessary information to go on, except it doesn't say that you need a user "install" on your FTP server. FTP is the only supported method for upload/download, but that's ok. Because of its simplicity this tool had all the chances to be the chosen one of this review. Even on Windows you can easily set up an FTP server given Far Manager, Python and this plugin. g4u is a tool that does the job of backing up a partition or whole drive and compressing it via GZIP, but!:

1. There is no command history - you have to retype the whole command every time you've made a mistake
2. There is no anonymous FTP login option - extra hassle to set up the "install" FTP user
3. An empty password is not allowed
4. No progress bar with total MB/percentage completed, not speaking of estimated time
5. Rather obscure GZIP option, which basically means compression level (default is 9 - the highest)

To wrap up with g4u: an ideal approach for this awesome but minimal tool with limited usability would be to grow its ISO to a 10MB disk image by including Python and providing some wizards in .py

Next is Clonezilla - a wizard interface. No FTP. Only an SSH, SAMBA or NFS server that needs to be mounted for an image file. For folks without Linux experience the most complicated stuff is probably to understand the "mount clonezilla image directory" dialog, which basically means attaching a "share" where image files can be stored (sorry for the lame comment). Saving a partition image worked like a charm - not only did it display a progress window, it also detected the space in use! and saved only the used part. Magic! I just wonder if I'll be able to restore the image manually in case Clonezilla fails. In the directory with the image file there are text files with disk and partition parameters, but the image file itself is in a custom format. In the case of an NTFS system - in ntfsclone 2.0 format. To conclude - the Clonezilla version 1.0.9-10 CD is nice and usable, but a single mistake during the wizard requires starting the process anew. Let's skim over the alternatives.

Ghost for Linux doesn't have its own web page - only a standard SourceForge project template, but it provides an ISO image for download. The project itself is similar to g4u - the same FTP upload method with compression, but a much better interface. Version 0.24 is GPLed and Linux-based; in comparison, g4u is BSD-licensed and NetBSD-based. Ok, about the contents. Multiple boot options, differing mostly by Linux kernel. Weird. I do not want to try every one to find out which of them will do the job. A good reference is shown right after startup documenting two options, but a lazy one would prefer only the first - to enter "g4l" at the command line to start the wizard interface. This preferably could be just a one line reminder before the system proposes a shell. The nice thing about g4l wizards is that they allow going back to correct options. Nevertheless, the real test failed on VirtualBox, because there was no default 'img' directory on the FTP, and even though g4l created it, it hung afterwards. After a manual restart and mkdir 'img' on the FTP the process completed successfully. It could be an FTP or VirtualBox bug - no idea, but with this minimal correction the test passed. Summary: setup is minimal and intuitive, though partition selection doesn't give any info about partition type. Ideal for FTP backup, but for other network access types I would still prefer Clonezilla. g4l, like Clonezilla, includes the ntfsclone utility for NTFS backup, but its version is slightly outdated and has to be selected explicitly. And if you remember - there is no partition information displayed.

PartImage doesn't have its own Live CD, but seems to be the tool of choice used on the PING "Backup and Restore Disk Partitions" Live CD and on SystemRescueCd. PING 2.01.10 is even simpler than Clonezilla, includes a wizard interface, supports only SMB shares for network storage and requires that a predefined set of directories is already created on the destination share. This predefined set is only described on the official site, and that's strange, because Clonezilla manages to create all required directories automatically. So PING is not an option for a burn-and-boot solution. Another annoyance is that Alt-Tab pressed within VirtualBox ends the PING session prematurely. Too bad. Let's see the last one. At last!

SystemRescueCd is a whole bunch of useful tools for plumbing works. However, even with the nice help displayed at startup, knowledge of *nix and the tools is still required. If I hadn't used Clonezilla or g4l, I would have had no chance to know that I needed ntfsclone. In addition, I must admit I still do not remember what command I have to type to mount an SMB share. SystemRescueCd may be the best rescue Live CD, but it is not as convenient and intuitive as more specialized tools. As a typical lame Windows user, I failed to use it properly, i.e. for backup, but there is still an awesome utility named 'gparted' for disk partitioning that I would recommend to everyone. In fact, there is a very awesome project that combines SystemRescueCD and Clonezilla Live CD on one disk. Grab it here - you won't regret it. =)

The verdict.
To backup your disk or partition via SSH, SAMBA or NFS use Clonezilla. For convenient backup to FTP use g4l.

That's all. Enjoy!

Thursday, January 24, 2008

Make Java program work through the proxy

It is not surprising that not many users know how to make a Java program work through a domain proxy if there is no place to enter proxy settings. Just because they are not developers, they do not know that it is enough to launch the program with the following command line:

java -Dhttp.proxyHost=192.168.1.1 -Dhttp.proxyPort=3128 -jar SoftWare.jar %*

This works for anonymous or domain proxies only, because there are no password settings. To work through a password-authenticated proxy, you will have to set up an additional local or personal proxy that uses user/pass settings to authenticate and pass traffic to the upstream one. Unfortunately, I can't name any software for user/pass authentication because I've never had to deal with this problem, but I know that at least Privoxy is capable of forwarding requests.

Wednesday, January 09, 2008

Compiling Python extension with GCC

In my previous post I described how to make C code accessible from Python. I used the Visual C++ compiler cl.exe to build an extension (or module) for Python. This follow-up shows how to compile the same extension for Windows using GCC. I bet you already know what GCC is and that it is available from a MinGW install as a result of the install procedure described long ago.

Grab the source from the previous post - it won't change. Everything that is going to change is the compiler .exe and its command line options. Save the source as farpython.c and start GCC to compile it:

gcc farpython.c

As usual, this won't produce anything useful except errors.

farpython.c:14:20: Python.h: No such file or directory
farpython.c:18: error: syntax error before '*' token
farpython.c:19: error: syntax error before '*' token
...

An additional include search path is specified using the -I option in GCC.

gcc -IE:\ENV\Python25\include farpython.c

A different picture, but the output is still grim.

D:\Temp/ccyOaaaa.o(.text+0x1c):farpython.c: undefined reference to `_imp__PyArg_ParseTuple'
D:\Temp/ccyOaaaa.o(.text+0x4c):farpython.c: undefined reference to `_imp__Py_BuildValue'
D:\Temp/ccyOaaaa.o(.text+0x88):farpython.c: undefined reference to `_imp__Py_InitModule4'
D:\Temp/ccyOaaaa.o(.text+0xbe):farpython.c: undefined reference to `_imp__Py_InitModule4'
E:/ENV/MSYS/mingw/bin/../lib/gcc/mingw32/3.4.2/../../../libmingw32.a(main.o)(.text+0x106):main.c: undefined reference to `
WinMain@16'
collect2: ld returned 1 exit status

Luckily these errors are not concerned with the code. They come from the linker (ld) complaining that it could not find the library with binaries for the functions defined in Python.h. The last one, about an undefined reference to WinMain, is different, but let's skip it until we deal with the missing libraries. Python libraries are located at E:\ENV\Python25\libs, and the option to pass this path to GCC is -L.

gcc -IE:\ENV\Python25\include -LE:\ENV\Python25\libs farpython.c

The output is still the same. The problem here is that the linker doesn't know which specific library we need to link with to get the binary bits for the Python.h functions. cl.exe from Visual C++ was able to detect the correct library somehow, but for GCC we have to specify its name explicitly with the -l option. Note that this option goes after the names of all compiled .c files. That is because params up to and including the .c files are for the compiler component, and everything that goes after can be treated as the linker's.

gcc -IE:\ENV\Python25\include -LE:\ENV\Python25\libs farpython.c -lpython25

Check the output.

E:/ENV/MSYS/mingw/bin/../lib/gcc/mingw32/3.4.2/../../../libmingw32.a(main.o)(.text+0x106):main.c: undefined reference to `
WinMain@16'
collect2: ld returned 1 exit status

WinMain is the entry point, or starting point, of any program on the Windows platform, but a Python extension is not a program that starts execution itself. A .pyd is a .dll, or shared library, with functions to be called by other programs. To tell that to GCC we add the -shared switch to the command line.

gcc -IE:\ENV\Python25\include -LE:\ENV\Python25\libs farpython.c -lpython25 -shared

Now everything seems fine, but instead of far.pyd or farpython.pyd we've got a.exe. The default output filename is easily corrected with yet another option, -o.

gcc -IE:\ENV\Python25\include -LE:\ENV\Python25\libs farpython.c -lpython25 -shared -o far.pyd

Test.

E:\>python
Python 2.5.1 (r251:54863, Apr 18 2007, 08:51:08) [MSC v.1310 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import far
>>> far.example("echo")
ECHO is on.
0
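
As a footnote, once the right set of options is known it can be recorded in a distutils setup script, so nobody has to retype the gcc command line. A minimal sketch, assuming the source file is named farpython.c as above:

# setup.py - build far.pyd without retyping the gcc options
from distutils.core import setup, Extension

setup(name="far",
      ext_modules=[Extension("far", ["farpython.c"])])

Running "python setup.py build_ext --compiler=mingw32" then picks up the Python include and library paths automatically and produces far.pyd.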