Friday, March 28, 2014

Python C API/ABI compatibility report

UPDATE: There is now an official thread on python-dev.

Upstream Tracker is an open source (GPL) tool that tracks API/ABI changes between releases of C/C++ libraries. I asked Andrey Ponomarenko, the main maintainer of the project, to add Python to the list, and here is the result:

http://upstream-tracker.org/versions/python.html

Hope you like it.

Monday, February 17, 2014

ANN: xtrace 0.5 - indented function trace in Xdebug format

Xdebug is a PHP tool that lets you trace how PHP code is executed. Today I am releasing a tool called xtrace to the public (more specifically, into the public domain) that produces the same (or at least very similar) output, but for Python.
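To give an idea of what an indented function trace means, here is a minimal sketch built on the standard sys.settrace() hook. It is not xtrace's code and it does not emit the Xdebug format; the helper name trace_calls and the "->" marker are made up for illustration.

```python
import sys


def trace_calls(func, *args, **kwargs):
    """Run func and collect one indented line per Python function call.

    Illustrative helper only: not part of xtrace, not Xdebug output.
    """
    lines = []
    depth = 0

    def tracer(frame, event, arg):
        nonlocal depth
        if event == "call":
            # A new frame was entered: record it at the current depth.
            lines.append("  " * depth + "-> " + frame.f_code.co_name + "()")
            depth += 1
        elif event == "return":
            # The frame is being left (normally or through an exception).
            depth -= 1
        return tracer  # keep receiving events for this frame

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args, **kwargs)
    finally:
        sys.settrace(previous)  # always restore the previous hook
    return result, lines


def inner():
    return 1


def outer():
    return inner() + 1


if __name__ == "__main__":
    _, trace = trace_calls(outer)
    print("\n".join(trace))
```

Calls into C functions do not create Python frames, so they do not show up in such a trace.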

A few years ago the inability to trace function calls in Python was a showstopper for me, so I decided to write a familiar tool for Python. More than that, I wanted to integrate it into Spyder. But a year ago xtrace itself ran into a showstopper: the behavior of the execfile function, which I could not get right at the time because of the docs, my expectations and my poor knowledge of English. Maybe there is a flaw in my cognitive abilities, but I attacked this problem several times and failed. Then recently some hackers from the ZenSecurity team brought a concept known as pyjail to my attention. The challenge of proving that the pyjail concept is impossible let me concentrate on the gory details of how execfile works, and knowing that the documentation is totally confusing for me, I found the time to set up my own experiments. You can read them at the link I've given above, along with some analysis of why documentation that actually includes all the details can still be bad and confusing.

xtrace was basically broken for three years, starting from version 0.2, the day I moved the execfile() call from the root of the xtrace module into the main() function. This placement changed the execfile() behavior, and while trying to debug that I also ran into the confusing dynamic behavior of the dictionary returned by locals(). The opened can of worms let those parasites completely consume my brain, causing much anger and frustration around the execfile() and locals() concepts to spill over into Python lists. It is kind of a relief that I can now name all the problems, analyse them and look back enlightened. Being jobless, I had plenty of time to investigate, but I really don't want anyone to enter the state of confusion and helplessness that I was in a year ago.

Hopefully, my experience with xtrace will clear up the confusion for those who try to use the exec-type abilities of Python to develop their own tools. Maybe it will result in a better Python API in the future, with better documentation and position-independent behavior.
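One way such position dependence can show up is sketched below for Python 3, where execfile() is gone and exec(compile(...)) takes its place. This is an illustration, not necessarily the exact problem xtrace had: the script source and the helper names run_split and run_single are made up. When the executed code gets separate globals and locals dictionaries, functions defined by the script cannot see the script's own top-level names; passing one explicit dictionary behaves the same no matter where the call is placed.

```python
# A tiny "script" whose function reads a top-level name.
SOURCE = "x = 1\ndef f():\n    return x\ny = f()\n"


def run_split(source):
    """Execute with separate globals and locals dictionaries.

    Top-level names such as x land in the locals dict, but f() looks
    names up in the globals dict, so calling f() raises NameError.
    """
    exec(compile(source, "<script>", "exec"), {}, {})


def run_single(source):
    """Execute with one explicit namespace, like a real module would have."""
    namespace = {"__name__": "__main__"}
    exec(compile(source, "<script>", "exec"), namespace)
    return namespace


if __name__ == "__main__":
    try:
        run_split(SOURCE)
    except NameError as exc:
        print("split namespaces failed:", exc)
    print("single namespace: y =", run_single(SOURCE)["y"])
```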

I am interested in the feedback that you can leave in the xtrace tracker, such as whether the output really matches PHP behavior, whether it is accepted by PHP tools and how it behaves in different scopes of Python. It would be interesting to convert it into a Spyder plugin and see it used in other tools, but I realize that I may not have time for that. The next focus for me is to add an easy API to xtrace so that people can write their own tracers more easily. Focus on UX and everything else will come.

Sunday, February 02, 2014

ANN: hexdump 2.0 - view/edit your binary with hex tool

https://pypi.python.org/pypi/hexdump

Finally, a product that can be called feature-complete for release. It is cross-platform, meaning it should run the same on Windows (tested), Linux, and OS X. It is Python 2 and Python 3 compatible. And it is released into the public domain, so you won't have any problems reusing it for your commercial and non-commercial hacking.

For those who are unaware of what hexdump is: a hexdump is a representation of any binary data in human-readable form. This form is good for hacking, inspecting and debugging binary data and protocols, but it is also good for editing such data. I am not pasting the output of the tool, to encourage you to play with it yourself.

It can be used as a command line tool and as a library. The simplest way is to use it as a tool:
# install
$ python -m pip install hexdump

# dump
$ python -m hexdump binary.bin > dump.txt
...

# restore
$ python -m hexdump --restore dump.txt
...
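The dump/restore round trip can also be sketched with the standard library alone. This is not the hexdump package's code, API or output format; the function names and the line layout below are invented just to show the idea of turning bytes into editable text and back.

```python
def dump_lines(data, width=16):
    """Return text lines: an 8-digit hex offset, then up to `width` hex pairs.

    Invented layout for illustration, not the hexdump package's format.
    """
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join("%02X" % byte for byte in chunk)
        lines.append("%08X: %s" % (offset, hex_part))
    return lines


def restore_lines(lines):
    """Rebuild the original bytes from the lines made by dump_lines()."""
    out = bytearray()
    for line in lines:
        # Drop the offset column; bytes.fromhex() skips the spaces.
        _, _, hex_part = line.partition(": ")
        out += bytes.fromhex(hex_part)
    return bytes(out)


if __name__ == "__main__":
    sample = bytes(range(20))
    text = dump_lines(sample)
    print("\n".join(text))
    print(restore_lines(text) == sample)
```

Because the dump is plain text, it can be edited in any editor before it is restored.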

P.S. I don't mind including `hexdump` as a provisional package in the Python standard library if anyone is able to convince the PSF to accept public domain, CC0 or MIT licensed code.

Friday, January 10, 2014

Draw a pixel with PySDL2

This is minimal code to draw a pixel on the screen using PySDL2.

UPD: (March 2014) Updated for PySDL2 0.9.0 (RenderContext renamed to Renderer)
#!/usr/bin/env python
"""
The code is placed into public domain
by anatoly techtonik <techtonik@gmail.com>
"""
import sdl2
import sdl2.ext as lib

lib.init()

window = lib.Window('', size=(300, 100))
window.show()

renderer = lib.Renderer(window)
renderer.draw_point([10,10], lib.Color(255,255,255))
renderer.present()

running = True
while running:
  for e in lib.get_events():
    if e.type == sdl2.SDL_QUIT:
      running = False
      break
    if e.type == sdl2.SDL_KEYDOWN:
      if e.key.keysym.sym == sdl2.SDLK_ESCAPE:
        running = False
        break

Tuesday, January 07, 2014

Open Source / Free Standards vs ISO/IEC

Intro

While trying to use a Galaxy Note 10.1 as a tablet and remote control device for my Windows and Linux stations, I discovered the awesome MIT-licensed GfxTablet project (draw on your PC via your Android device):
GfxTablet shall make it possible to use your Android device (especially tablets) like a graphics tablet.
It consists of two components:
  • the GfxTablet Android app
  • the input driver for your PC
The GfxTablet app sends motion and touch events via UDP to a specified host on port 40118.
It was so awesome in its simplicity and protocol that I couldn't resist building a Python client for it. It didn't take long (well, a day maybe) before I noticed two errors in its protocol description. First, the byte order for 2-byte fields was not described and turned out to be big endian (while I had assumed the opposite). Second, one of the fields described as a 2-byte ushort was actually a 1-byte octet. After reading the source I confirmed the mistakes and edited the protocol description on the web to fix them, which resulted in this (already merged, yay!) pull request.
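Both kinds of mistake are easy to see with Python's struct module. The sketch below does not reproduce the real GfxTablet packet: the three-field layout (one octet followed by two 2-byte values) and the field names are hypothetical, chosen only to show how the byte order and a 1-byte versus 2-byte field change the decoding.

```python
import struct

# The same two bytes read as a 2-byte unsigned value in both byte orders.
RAW = b"\x01\x02"
AS_BIG_ENDIAN = struct.unpack(">H", RAW)[0]     # 0x0102
AS_LITTLE_ENDIAN = struct.unpack("<H", RAW)[0]  # 0x0201

# Hypothetical layout, NOT the real GfxTablet packet:
# one octet ("B") followed by two big-endian 2-byte values ("H").
EVENT = struct.Struct(">BHH")


def decode_event(payload):
    """Decode the hypothetical 5-byte event into a dict."""
    kind, x, y = EVENT.unpack(payload[:EVENT.size])
    return {"kind": kind, "x": x, "y": y}


if __name__ == "__main__":
    print(AS_BIG_ENDIAN, AS_LITTLE_ENDIAN)
    print(EVENT.size)
    print(decode_event(b"\x01\x01\x02\xff\xff"))
```

Reading the octet as a ushort would shift every following field by one byte, which is why a wrong field size in a protocol description is so easy to trip over.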

Standard on Byte Size

I've got an interesting comment on my pull request:
"octet" is more accurate than "byte" because there are systems and programming languages which have a byte size that is not 8 bits
My natural reaction was "No way! That can't be true.", but Wikipedia said I was wrong. Luckily, it also said that:
The de facto standard of eight bits is a convenient power of two permitting the values 0 through 255 for one byte. The international standard IEC 80000-13 codified this common meaning.
This was the first time I thought that ISO did something right, so I decided to take a look at this standard myself. I found two copies - IEC and ISO. The ISO site has a better SEO department, so they've got a better Google position for their shop. Yes, ISO and IEC are shops - the price to get the official size of a byte is CHF 154,00 at ISO or CHF 150 at IEC. They really like to have accounts in Swiss banks for some reason. I'd advise selling those standards for Bitcoins instead - it is more profitable in the long term.

ISO/IEC as Commercial De-Facto Authorities

Why do we need organizations like ISO/IEC that put their name on, and limit access to, de-facto standards that, more than any other information, want to be free? I'd say that our awesome decentralized and independent approach - develop what you feel like and support what you want - is just not widely exposed to those conservative old-school bureaucrats, who still live in their own world of central authorities that should dictate to people what to do.

Don't get me wrong. There are conflicting points where you DO need to set a standard, and an enforcing organization like ISO/IEC is required (enforcing, because the market forces businesses to comply no matter how "recommended" the standard is). The cost of dealing with conflicting parties and convincing them is high, and that's why they put a price on papers (calculating whether the price is fair is thankfully out of scope for this post). But the thing that bothers me more is that they put a price on access to facts that are de-facto standards and common knowledge.

The problem here is that we, as separate individuals, don't have a tool to say something as a whole. If anybody tries to speak for the whole net, the net will resist, and that's natural, because people tend to say too much in one phrase and they are too smart. The way out is to keep the facts short, free the voices from responsibility and make it all countable and strictly out of politics.

What can be fixed here?

Usually people sign petitions. I propose to extend this, just for fun. Make a technical statement that countries should agree on (no politics, please) and give people an opportunity to support it openly or, if they don't want to support it openly, to support it in a closed manner (?respect privacy). The same goes for disapprovals. Disapprovals may carry a reason. Once this data is in place, let people upvote and see what happens.

The statement - "the byte is 8 bits".

Then build a list of countries that have nationally accepted this statement. Then name this initiative somehow - it is important to keep it strictly technical to shoot the zombies.

Once a statement reaches some degree of exposure and human votes per country, the country can decide to accept it by placing an officially signed statement online. Over time, the statements can be combined into free will packs and signed too. This will make syncing possible.

Arguments. I am not sure they are needed for de-facto standards, because you reach consensus not by persuasion, but by collecting the overall feeling. However, if there are problems with de-facto ways, and people feel there is something wrong, they should have the ability to "opt in for a change" and to upvote/downvote such arguments too. People should not be ashamed to set a value on the "my butthurt" meter when voting or proposing counter-arguments, because we are irrational by design, and technical problems with standards need more feedback on the butthurt effect than any other area of development.

So: simple statements, public voting, open process, feedback, realtime status and a summary of national adoptions.

Who should do this?

I'd be interested in working on this if I had a place of my own to live in. I'd start by contacting the guys from Stack Overflow to reuse the open source parts of their experience. Quite boring, right? Well, I am not saying that I want or plan to work on this alone. I am just saying that I am not in a position to take the role of coordinator. I just want to say that if you like the idea, maybe even in some crippled variant, have found the resources to go, and want to try, feel free to ping me.

One of the tools that came really close and impressed me was the (now defunct) http://hammerprinciple.com/ which helped me discover bad things about my favorite version control system and programming language without too much butthurt injury. Hopefully, it will strike back again.

Thursday, November 28, 2013

Roundup Tracker: Create Issues by Email

There is one thing about bugs.python.org and other Roundup issue tracker instances that is not widely known: you can create new issues and update old ones directly from your email client, without visiting the web interface at all.

As much as I hate Debian's email-only tracker, I must admit that having an email control feature in addition to the web interface can save some time, especially if you constantly forget passwords for different trackers like me.

So, to create a new issue, just send an email to the address that the tracker uses to send mail to you. Well-known addresses of Python trackers:


Note that your email address needs to be present in the tracker database for it to accept your request, so you might need to create your account first.

You can also update existing issues by adding suffixes like [status=closed;resolution=invalid] to the subject field of your replies. I just closed issue19825 to test this method. You can try it too the next time you don't feel like leaving your mailbox.
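As a rough sketch, such a reply can be composed with Python's standard email package. Only the bracketed suffix syntax comes from the example above; the helper name build_update, both addresses and the issue title are placeholders, and the message is only built here, not sent.

```python
from email.message import EmailMessage


def build_update(reply_subject, properties, sender, tracker_address, body=""):
    """Build a reply whose subject ends with a [name=value;...] suffix.

    Hypothetical helper: the addresses are whatever you pass in.
    """
    suffix = "[" + ";".join(
        "%s=%s" % (name, value) for name, value in properties.items()
    ) + "]"
    message = EmailMessage()
    message["From"] = sender
    message["To"] = tracker_address
    message["Subject"] = reply_subject + " " + suffix
    message.set_content(body)
    return message


if __name__ == "__main__":
    msg = build_update(
        "Re: [issue19825] placeholder title",   # placeholder subject
        {"status": "closed", "resolution": "invalid"},
        "me@example.org",                       # placeholder sender
        "tracker@example.org",                  # placeholder tracker address
        "Closing by email.",
    )
    print(msg["Subject"])
```

Sending it is then a matter of handing the message to smtplib or to your mail client.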

This stuff is actually documented in the official Roundup docs, but who reads the docs, anyway.

Thursday, November 21, 2013

Mercurial UX: Undo/Redo Wanted

This is a mail to the Mercurial mailing list, adapted into a blog post to have it for reference.

For those of you who were born in the Github era, Mercurial is an alternative version control system with transparent, pythonic internals. Because all my projects are escaping to Github, I took the chance to go over my knowledge of HG again and see what I have missed over the years of using it. This is just one idea.

This year, Mercurial introduced the new ChangesetEvolution concept, which lets you safely mess with repository history. I decided to take a look and started with the 'hg fold' command. Quite soon I got into the usual state of a missed RTFM evening (you know, the evening when you have plenty of time to read a book with a cup of coffee). I couldn't understand what had happened, but I clearly knew it was not what I wanted, so I wanted to get my repo back into its initial state. There are a lot of commands like 'rollback', 'revert', 'update -C', 'backout', 'strip' to revert the state after some command, but the real problem is choosing the right one. So I thought that this is something that is missing.


In Mercurial (and in other version control systems as well) there is no concept of an "operational transaction". In databases, no matter what you do, if a transaction is not committed, the state is reverted. These are called atomic transactions. Before Subversion there was CVS with non-atomic commits - if there was an error with some file (merge or something), you got half of the files committed and half not. Awful, right? Since SVN all commits are atomic - if something is wrong, nothing is committed. Atomicity is important for user operations too. If something goes wrong, I want to get back to where I started. In Mercurial that works by making a backup copy of your repo. I guess for Git it's the same.
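The database behavior is easy to reproduce with the sqlite3 module from the standard library. In this small sketch the table and file names are made up: an error in the middle of a transaction leaves no half-committed rows behind.

```python
import sqlite3


def commit_files(conn, names, fail_after=None):
    """Insert all names in one transaction; optionally fail part-way."""
    # Used as a context manager, the connection commits on success
    # and rolls back if an exception escapes the block.
    with conn:
        for index, name in enumerate(names):
            if fail_after is not None and index == fail_after:
                raise RuntimeError("simulated error on " + name)
            conn.execute("INSERT INTO files (name) VALUES (?)", (name,))


def count_files(conn):
    return conn.execute("SELECT COUNT(*) FROM files").fetchone()[0]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE files (name TEXT)")
    conn.commit()
    try:
        commit_files(conn, ["a.txt", "b.txt"], fail_after=1)
    except RuntimeError as exc:
        print("rolled back:", exc)
    print(count_files(conn))  # a.txt was not kept
```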

So, there is no obvious command to revert the last operation and no atomicity at the operation level. This makes me feel unsafe and unsure about what I can do in my clone if I am too lazy to make a copy. And I thought that the next step in Mercurial evolution would be going from the "user command" to the "user operation" concept.


A "user command" is a command like `hg inc` that users type in the command line. It may or may not affect the state of the repository.

A "user operation" is a command or several commands that change the state of the repository. A "user operation" has the property of being "revertible" or "not". The granularity of changes to the repository (how many commands make up one operation) is decided by the high-level user goal of undoing and redoing these operations. For example, 'hg fold' is a command that can be undone. It is a separate "user operation" and an entry in the "undo history".

A "user command" that modifies state may have a "reverse command" that brings the changes back to the initial state. But maintaining this at the command level is too fragile, and pairs like "commit/rollback" are hard to remember. A "user operation" may not have a "reverse command" - it may just be reverted without a dedicated reverse command (like when you replace a clone with your backup copy). And for that you need "undo history".

"Undo history" is a stack of "user operations". These can be revertible or not - it depends on the logic. And it is not a commit log - it is an operations log. The direct analogy is the GIMP undo history dialog.



Now that the concept of the wanted feature is clear, here are some blueprints for a start.

From the usability POV, a Mercurial operations history dialog is a list, where each entry contains:
- operation name
- if it can be undone
  - if not, state the reason
    the reason is necessary to understand either:
       1. current condition of repository
           - what should be adjusted to enable undo
           - why adjustment can not be automated
       2. what should be written in hg itself to make it possible
           - pointer to dev docs and status page
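The history list described above could be modelled roughly as follows. This is a toy sketch and not Mercurial code: the class and method names are invented, and undo/redo are plain callables standing in for whatever would really restore the repository state.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Operation:
    """One entry of the undo history (all names here are hypothetical)."""
    name: str
    undo: Optional[Callable[[], None]] = None
    redo: Optional[Callable[[], None]] = None
    reason: str = ""  # why the entry cannot be undone, if it cannot

    @property
    def can_undo(self) -> bool:
        return self.undo is not None


class UndoHistory:
    """A stack of user operations with undo and redo."""

    def __init__(self) -> None:
        self._done: List[Operation] = []
        self._undone: List[Operation] = []

    def record(self, operation: Operation) -> None:
        self._done.append(operation)
        self._undone.clear()  # a new operation invalidates the redo stack

    def undo(self) -> Operation:
        if not self._done:
            raise IndexError("nothing to undo")
        operation = self._done[-1]
        if not operation.can_undo:
            # Frozen entry: tell the user why instead of guessing.
            raise RuntimeError(
                "cannot undo %r: %s" % (operation.name, operation.reason))
        operation.undo()
        self._undone.append(self._done.pop())
        return operation

    def redo(self) -> Operation:
        if not self._undone:
            raise IndexError("nothing to redo")
        operation = self._undone.pop()
        if operation.redo is not None:
            operation.redo()
        self._done.append(operation)
        return operation

    def listing(self) -> List[str]:
        rows = []
        for operation in self._done:
            if operation.can_undo:
                status = "can be undone"
            else:
                status = "frozen, " + operation.reason
            rows.append("%s (%s)" % (operation.name, status))
        return rows
```

A frozen entry blocks everything below it in the stack, which is exactly why the entry has to carry a reason.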

Summary:
 * user command ('hg inc', 'hg ci', ...)
 * user operation (hg command that changes state)
 * undo history (stack of latest user operations)
   * undo history items are frozen if reverting is impossible
   * undo history is local
 * state explanation between operations

Links:
https://www.google.by/search?q=undo+pattern  - command and memento patterns can help
https://bitbucket.org/hstuart/hg-multiundo  - some work on the topic was done by Henrik Stuart

The final test:
 hg undo
 hg redo
 hg undo --list


If you have something to say, but are not subscribed and so cannot continue the thread on the official mercurial@selenic.com mailing list, then I guess it's safe to leave comments here.