another day another vice another roll of the dice: 2015

Thursday, September 03, 2015

SCons build targets

SCons is awesome. Just saying. If you want to know (or troubleshoot) how SCons selects targets to be built, add this snippet at the end of your SConstruct:

def dump_targets(targets):
  for t in targets:
    if type(t) == str:
      name = t
    else:
      name = t.name
    print("  <" + str(t.__class__.__name__) + "> " + name)

print("[*] Default targets:")
dump_targets(DEFAULT_TARGETS)

print("[*] Command line targets:")
dump_targets(COMMAND_LINE_TARGETS)
print("[*] All build targets:")
dump_targets(BUILD_TARGETS)

For my copy of Wesnoth, 'scons .' produces this output:

[*] Default targets:
  <Alias> wesnoth
  <Alias> wesnothd
[*] Command line targets:
  <str> .
[*] All build targets:
  <str> .

And if you want to know how to specify targets or what do they mean, read the second page of SCons man documentation. Just for convenience I quote it here.

scons is normally executed in a top-level directory containing a SConstruct file, optionally specifying as command-line arguments the target file or files to be built.

By default, the command

scons

will build all target files in or below the current directory. Explicit default targets (to be built when no targets are specified on the command line) may be defined the SConscript file(s) using the Default() function, described below.

Even when Default() targets are specified in the SConscript file(s), all target files in or below the current directory may be built by explicitly specifying the current directory (.) as a command-line target:

scons .

Building all target files, including any files outside of the current directory, may be specified by supplying a command-line target of the root directory (on POSIX systems):

scons /

or the path name(s) of the volume(s) in which all the targets should be built (on Windows systems):

scons C:\ D:\

To build only specific targets, supply them as command-line arguments:

scons foo bar

in which case only the specified targets will be built (along with any derived files on which they depend).

Specifying "cleanup" targets in SConscript files is not usually necessary. The -c flag removes all files necessary to build the specified target:

scons -c .

to remove all target files, or:

scons -c build export

to remove target files under build and export. Additional files or directories to remove can be specified using the Clean() function. Conversely, targets that would normally be removed by the -c invocation can be prevented from being removed by using the NoClean() function.

A subset of a hierarchical tree may be built by remaining at the top-level directory (where the SConstruct file lives) and specifying the subdirectory as the target to be built:

scons src/subdir

or by changing directory and invoking scons with the -u option, which traverses up the directory hierarchy until it finds the SConstruct file, and then builds targets relatively to the current subdirectory:

cd src/subdir
scons -u .

Sunday, August 16, 2015

Technical Debts for Python Community

Debts and credits is a new age slavery. Technical debts are better, because they are just a disease. It can fatal if not treated, but for a strong community it is not a problem. Still a good working plan and a vague roadmap is needed to coordinate many eyes to go in the same direction and push things forward where it is hard to reach a consensus.

Dangers of Technical Debts

Technical debt is hard to understand and identify, but once your know about it, it will be easy to spot those things early. The most common symptom of technical debt is stolen time, which if not treated becomes a paralysis. You may know how your system works, and all the components, and where the files are, but you also have the previous experience about how much time it takes to modify and properly test the system, and ensure that everything is correct, and you just know that you just don't have the time, because your family and friends are missing you. We are all volunteers, etc.

Recipe: Simplify Your Systems, Reduce the Complexity, Automate Things and Think in 15 Minutes Slots.

Thinking in 15 Minutes Slots

This may not be so important for enterprise projects where people exchange their time for money, but my opinion is that it is extremely important for open source projects where time is distributed over many people.

Have you ever wondered what will a highly talented person be able to do for Python if we are lucky enough to get 15 minutes of his or her attention?

I am afraid that the sole "contribution" would be making a checkout. Even correcting a mistake in documentation requires that. Maybe the most he or she could accomplish would end in setup of Git or Mercurial. And if the checkout is a long and boring, the person may lose interest and switch to something different even before 15 minutes are expired.

I will continue with how technical debts are evolving into Competence Debt, but first.. let me ~~take a selfie~~ tell a story. Just to complete the 15min section, I once accepted a challenge to make a design fix for Launchpad. I spent 15 minutes 4 times - an hour - and couldn't even get the checkout. Here are the sessions:

https://launchpad.net/~inkscape.dev/+archive/ubuntu/stable - fix dialog here

open https://dev.launchpad.net/

open https://dev.launchpad.net/Getting

FAIL: edit wiki - move bzr instructions to the top

send a letter to https://lists.launchpad.net/launchpad-dev/ that wiki pages are not editable

Yes, I got distracted during the first session, but I want to make the world better by fixing things on the way. Of course, I'd like those problem not to appear in the first place. Let's see how a next session ended.

open https://dev.launchpad.net/Getting

# create LXD container for experiments

$ lxc init ubuntu lp

$ lxc start lp

$ lxc exec lp -- bash

# apt-get install bzr

# bzr branch lp:launchpad

open https://dev.launchpad.net/Running

The second slot was all about reading instructions and setting up "virtualenv for Linux" to install all the prerequisites without polluting my main system (and drop them without consequences). I already knew about LXD, so my competence here was already high to save some time on learning that. BTW, LXD rocks. Just try it.

# cd launchpad

# apt-get install postgresql make

# ./utilities/launchpad-database-setup $USER

# make schema

FAIL: many errors

reading https://dev.launchpad.net/Running

# utilities/update-sourcecode

These 15 minutes left me in confusing state without any working instance to get some positive feedback on what I am doing. At this moment I already have a strong desire to just drop everything. And yet after some time I get back to spend another 15 minutes slot trying to tackle the problem.

# drop old LXD container

lxc delete lp

# create new LXD container

lxc init ubuntu lp

lxc start lp

lxc exec lp -- bash

# install basic dependencies

apt-get install bzr make postgresql wget

# talking in https://lists.launchpad.net/launchpad-dev/

wget https://git.launchpad.net/launchpad/plain/utilities/rocketfuel-setup

bash rocketfuel-setup

# ^ need to enter name, and click Y on Apache install prompt

FAIL:

Making local branch of Launchpad trunk, this may take a while...

You have not informed bzr of your Launchpad ID, and you must do this to

write to Launchpad or access private data. See "bzr help launchpad-login".

bzr: ERROR: Connection error: Couldn't resolve host 'bazaar.launchpad.net' [Errno -2] Name or service not known

ERROR: Unable to create local copy of Rocketfuel trunk

# attempt to repeat the script

bash rocketfuel-setup

FAIL:

bzr: ERROR: No WorkingTree exists for "file:///root/launchpad/lp-branches/devel/.bzr/checkout/".

ERROR: Your trunk branch in /root/launchpad/lp-branches/devel is corrupted.

Please delete /root/launchpad/lp-branches/devel and run rocketfuel-setup again.

Now I really drop it. So after that user experience I doubt I will ever get hacking on LaunchPad again. So, if your technical debt provides poor experience for onboarding users, you're likely to lose them for a lifetime.

The Debt of Competence

Competence Debt is a chronic phase of the Technical Debt illness. When time for operation become more than your daily limit, there comes paralysis, and after that the worst thing that can happen is when you no longer know how your systems work and lack skills to restore the picture.

From that moment your project is entering the death spiral and it is only a matter of time when it will be dead. I've seen several examples where programmers was treated like a replaceable material, but the truth is that program lives as long as its code is alive in the heads of its maintainers. There is no such thing as "a software product" anymore - software is more about support and development, than about selling products on a local market.

For open source projects competence debt usually results in various rewrites and long term stale issues. Many attempts to fix them, many hours wasted just to hit the wall with new heads over and over. The power of open source is a little time and little effort that is distributed over many people to create momentum. That was the original idea behind the rainforce.org domain when it all started many years ago. And it is also the result of OpenStreetMap success - small and clear activities that don't require much time and competence to accomplish. This scales well and provides a good gameplay.

Recipe: Invest in Visualization and Learn Visual Tools (SVG, D3.js, Inkscape) to Explore Ways to Transfer Your Competence to Other People

Text is not a natural way for people to consume and produce information. We learn how to read and write, and it takes more than a month to get used to it. But learning to play games like World of Tanks just takes a few minutes. The new generation that I am a part of is used to watch YouTube lectures on 2x speed, read only first 150 letters of the messages and scan long texts without actually thoroughly reading them. That's why I highlight the key points in this post. We were developing tools for audio/visual communication naturally over all these many years - 3D graphics, demoscene, virtual reality, and now deep learning networks, but we still find it hard to produce visual material for communicate other ideas. Because we've been taught to write text, not to produce beautiful art that just works. Learn to draw. It makes people happy learning something new.

OpenStreetMap - The Earth as The Outline

Here is a success story. No, it won't teach how to remove technical debts, but rather give an idea how to restructure it, so that a thousand eyes could make an impact.

The OpenStreetMap has a reference model - it is our Earth. We just copy what we see into vector form to draw a map and everybody could validate with their own eyes. With open source project it is all the same, except that you need to create that reference model and that should be actionable - split into many pieces that people can validate in parallel separately. Think about specification where every is independent enough to serve both as an entrypoint and a clause to put a checkmark next to it if a condition is true. Think about a canvas, where everybody can draw the common vision and then see who has drawn the components that they can reuse in their own sketches. The reference model is what you need to know where to push so that your small effort could contribute to a greater goal. The Roadmap also tell you where your skills will be more useful, and ensures that your efforts will not be wasted

The Role of Foundations

People think of foundation as of fund. That's not effective. I need about $600/m to cover shelter and food expenses. Travelling, buying clothes and stuff, covering medical expenses raises the plank to $1000/m, girlfriend may add another $500/m, building a house another $500/m, and I don't even want to think about children. There is no chance I will be able to afford this. So at a bare minimum a foundation should provide $600/m per person to deal with EPIC issues that nobody could deal with in their spare time. Forgot the taxes. Add another $600/m on top of that, and that's just one person, and you need at least two of them. So, $2400/m just to make router for https://bugs.python.org/ so that we can add more URLs endpoints via extensions for interactive frontends and REST API to it. Nobody will ever pay for that. We tried to hack the problem with Gratipay, but got a flashback from a protection mechanism of U.S. financial system. It is clearly a dead end to fight the World owned by corporations (the World as it was already 100 years ago).

Instead, the role of the foundation is to explain to corporation the above mathematics of time and effort, to enable people in these companies give more time to contribute their professional expertise to deal with complexity and reduce competence debt for the community. Applying the recipes to reduce the technical debt to inspire people. Employing art for documenting the systems and structures, so that people could digest the information easily. Organising in-house sprints to deal with important matters we alone, with less time, but many hands can not tackle on our own.

The role of foundations is not to empower individuals, but collect the data about obstacles, foresee and communicate about them on the path ahead, and organize clean up efforts where they are needed, so that anybody who got those precious 15 minutes knew what to do, and could spend those 15min most effectively to bring their contribution to the common stream that benefits everyone.

Sunday, June 21, 2015

Lifehack - import directly from command line

If you're obsessed with UX as I am, and long repetitive commands seem to break your flow, you may appreciate this little trick for invoking Python from command line.

I invoke python from console/terminal to check some behavior that I tend to forget after a day in JavaScript or other programming languages. It is often way faster than reading the docs. It also helps to quickly try new PyPI libraries after installing with pip (saves on mouse clicking in IDE etc.).

Today I found a time saver that makes me happy. This example is typed for Windows, but there should be no problem for any Linux hacker to port it to their favorites. So instead of doing this:

> python [ENTER]
>>> import os [ENTER]
>>> os.name [ENTER]
'nt'

I can just type:

> import os [ENTER]
>>> os.name [ENTER]
'nt'

It should be now easy to understand how it works, so I will complicate it a bit. The trick is this little import.bat script, placed in system PATH:

@echo off
py -2 -c "import sys; print(sys.version)"
py -2 -i -c "import %*"

py here is a py.exe launcher that can call either Python 2 or Python 3 depending on argument. It comes installed with Windows Python versions. I like to know which Python I work with, so I've added version into to the command, and to avoid polluting the global space of new interpreter, the version is printed by separate command.

Sunday, May 03, 2015

ANN: patch 1.14.2 - utility to apply unified diffs

Google Code shutdown kicked me to finally release the Python patch utility on PyPI.

Python 2 only for now (issue #10)
single patch.py file that can be dropped directly into your repository
command line tool
importable diff parser library
doesn't create and remove files (issue #3)

It can be run in from command line standard way

    python patch.py <patch.diff>

or directly from .zip file:

    python patch-1.14.zip <patch.diff>

Get it from PyPI: https://pypi.python.org/pypi/patch

Monday, March 09, 2015

Python CLA, Community and Node.js

I was trying to write on CLA multiple times, since the day that I sent "My CLA" letter in hope for some explanation to python lists. Now search for "My CLA" leads to new model of Mercedes Benz released in 2014 and I wonder, why do I waste my time arguing on the things that should not matter to me? Why do they matter at all? If I could afford a Mercedes Benz and had time to enjoy it, everything else probably didn't matter.

https://www.joyent.com/blog/broadening-node-js-contributions

The post by Bryan Cantrill is all good, but this quote in particular got my attention:

While node.js is a Joyent-led project, I also believe that communities must make their own decisions—and a CLA is a sufficiently nuanced issue that reasonable people can disagree on its ultimate merits.

I don't remember that steps were taken to explain the necessity of CLA or ask people what they think about it. There was no community process that I've seen and my requests to get some clarification did not went good. It took months of non-constructive critics and diplomatic skills of Ezio to at least add electronic signature to the paper form. There was no incident, nothing in public to cover that. Just somebody decided to do this and then there was a lot of peer pressure to force you to comply, because "lawyer know better". I can hardly name it a community process. OpenStack fights for it, to keep process open and inclusive, tries to analyze problems, to seek solution in open way, at least it is visible.

http://lists.openstack.org/pipermail/openstack-dev/2015-February/056551.html

Python was an open source language, and a good and unique one that deserves its own license. Guido is always open and sincere behind the community of language supporters, but are languages supporters evolve the same, do they possess the same understanding into the complicated nature of human processes to take the baton? Did the core of community become a closed elitist circle of people with good relationships? Are they able to handle the insane amount of ideas and communication that coordination around core development and surrounded infrastructure is needed? Is new generation involved in solving these challenges or all they do now is dreaming about startups? Is it a community problem or economy problem already with all these CLA and other issues that are impossible to hide?

https://mail.python.org/pipermail/distutils-sig/2015-March/025807.html

Monday, February 16, 2015

ANN: hexdump 3.2 - packaged and pretty executable

https://pypi.python.org/pypi/hexdump

The new version doesn't add much to the API, but completes a significant research about executable .zip packages with proof of concept distribution that is both an installable Python package that you can upload to PyPI and an executable .zip archive.

The benefit is that you may drop .zip package downloaded from PyPI into your source code repository and invoke it from your helper scripts as python hexdump-3.2.zip Keeping dependencies together with application source code in repository may seem like a bad idea, but we find it useful at Gratipay to setup project from checkout in offline mode with no internet connection.

The solution required three modifications to distutils setup.py (see sources):

force sdist command to produce .zip file
strip extra dir created at PyPI package root
add __main__.py to make .zip executable

If you think that it is cool and should be promoted further, feel free to test Gratipay to send me 0.01+, and I'll get the signal to spend more time on reversing native distutils/packaging gotchas.

Thursday, January 08, 2015

Shipping Python tools in executable .zip packages

UP201501: Found out how to force setup.py create .zip instead of .tar.gz

This is about how to make Python source packages executable. As a side effect, this also explains, how to run invoke's tasks.py with a copy of invoke .zip shipped with your source code, but without installing it.

You know, Python can execute .zip files. As simply as:

$ wget https://pypi.python.org/packages/source/h/hexdump/hexdump-3.1.zip
$ python hexdump-3.1.zip
/usr/bin/python: can't find '__main__.py' in 'hexdump-3.1.zip'

Well, you need to place '__main__.py' into the root. It will then import what it needs and do something. The only problem with .zip source packages is that their structure (if created with standard Python tools) includes one additional directory at the top:

`-- hexdump-3.1
    |-- PKG-INFO
    |-- README.txt
    |-- hexdump.py
    |-- hexfile.bin
    `-- setup.py

So, even if you place `__main__.py` into the root, it won't be able to import file from `hexdump-3.1` subdirectory.. unless you prepare for it. Here is the solution:

import os
import sys

# add package .zip to python lookup path
__dir__ = os.path.dirname(__file__)
path = os.path.join(__dir__, 'hexdump-3.1')
sys.path.insert(0, path)

import hexdump
msg = hexdump.dehex("48 65 6C 6C 6F 2C 20 57 6F 72 6C 64 21")
print(msg)

Now if you run it, it will print the obvious.

$ python hexdump-3.1.zip
Hello, World!

In Python 3, the result will be wrapped into b'' string. But that's it!

The next hexdump version will likely to ship as executable .zip package

Where you may need it?

It may be useful for scripts automation if you're "vendorizing" dependencies (ship them with your code). It started with need to fix a broken link on Gratipay site. The Gratipay codebase is in public domain clear of any NDA, CLA and all other BS, and that makes it great for people to learn, reuse and enhance. I wish Python internal projects possessed these properties, but PSF politics is another topic.

Gratipay inside site uses make. It doesn't need powerful build tool, such as scons, so I tried to replace it with some simple task automation utility to be more cross-platform. I chose invoke, because of my previous experience with fabric for remote control, and because I saw previous attempts in another Gratipay repository. The necessary condition for new tool is that user experience should not degrade for make users, so main invoke's tasks.py script needed to be self-executable.

I completed the proof of concept by making invoke .zip package executable with the method above, placed it into vendor/ directory, and tuned tasks.py to reexecute itself with invoke .zip This also solved a potential problem with currently unstable invoke API.

tasks.py was modified as following:

import sys
sys.path.insert(0, 'vendor/invoke-0.9.0.zip')

from invoke import task, run

# ... tasks go here

if __name__ == '__main__':
  import sys
  from invoke import cli
  cli.main()

Bonus points

There is one more thing, which is better explained with hexdump example. hexdump package ships `hexfile.bin` data file used for testing. hexdump.py looks for it in its directory (__file__ dirname), and this lookup fails when hexdump.py is in archive. I'd say Python could provide some kind of an API to deal with "virtual import filesystem" (with watchers to track who and when modifies sys.path), but I digress. If you modify the `__main__.py` from the first chapter above to run tests - just call hexdump.runtest() - the execution will finally fail with the following error:

Traceback (most recent call last):
  File "/usr/lib/python2.6/runpy.py", line 122, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.6/runpy.py", line 34, in _run_code
    exec code in run_globals
  File "hexdump-3.1.zip/__main__.py", line 13, in
  File "hexdump-3.1.zip/hexdump-3.1/hexdump.py", line 311, in runtest
IOError: [Errno 20] Not a directory: '/root/hexdump-3.1.zip/hexdump-3.1/hexfile.bin'

(yes, I am hexdumping under root). The problem above was explained by Radomir, and I feel like the post is already tool long to write about it myself.

More bonus points

In theory it is possible to hack setup.py to produce executable source .zip package. By default, setup.py sdist command on Linux creates tar.gz files. So, first it needs to be forced to create .zip file. This can be done by overriding default --formats option:

# Override sdist to always produce .zip archive
from distutils.command.sdist import sdist as _sdist
class sdistzip(_sdist):
    def initialize_options(self):
        _sdist.initialize_options(self)
        self.formats = 'zip'

setup(
    ...
    cmdclass={'sdist': sdistzip},
)

I posted relevant documentation links to this StackOverflow question. The rest - how to inject __main__.py into the root of packed .zip - is yet to be covered, and I am not ready to research it just yet. Some hints may be provided by these answers.

Usability conclusion

I am actually quite happy with the ability to execute python .zip packages (which gave a lot of motivation to write this post), because previously I had to care about how to tell people that a tool is a single script, which can be downloaded from repository or from some dedicated download site (I mean Google Code, which does not support this anymore). I also don't trust my own server for downloads, because one day I found some "benzonasos" running there and I don't even have a slightest idea how it got there.

But now it becomes possible to just download and run the stuff from PyPI to test how it works without messing with creating and activating virtualenvs, and installation of package inside. Of course, if your .zip package has dependencies, they still need to be present on your system, but that's still a time saver. Oh, and that means that my tools can now consist of several modules inside.