Saturday, May 23, 2015

Writing a programming book? Don't compose a utility library!

I came across two books recently in which the authors decided to write a utility library. The first book was Python in Practice, by Mark Summerfield (my opinion about the book can be found here), and the second, which I'm still reading, is Doing Bayesian Data Analysis, Second Edition, by John Kruschke. A separate review will be added when I finish reading it.
The books are different in nature: one is about python programming, while the other is about statistical methods and uses the R programming language for hands-on examples and exercises; the first book is of average quality overall (IMHO) and the second is absolutely amazing! However, I believe that I can criticize the utility libraries that came with both books in the same manner: Don't do this!

And why?

Installation process breaks conventions 

When I need an external tool in a python project I know I have pypi to rely on for finding packages. I have pip to easily install the package and I prefer to work with virtualenv whenever possible. This set of tools helps me maintain a sane codebase and reduces the effort of managing the dependencies on my own.
There is no chance that I will copy an external module into my project and put it under source control unless I have to, so why use such a module in an educational project in the first place?
I really don't know what the convention is for installing external R packages, but I believe that Kruschke's suggestion of sourcing his supplied scripts is not the proper way to do this (enlighten me if I'm wrong).

Package maintenance / code quality

Before I install an external package I tend to look into its quality. The first thing I check is how many stars the package has on github and how many times it was downloaded from pypi.
And there is a reason behind it: I can rely on packages that are used often to have better code quality; through github I can browse the package's issues and latest commits and make sure that it is still maintained.
I'm sure that book authors invest a large amount of time in writing their utility libraries. But bug-free code doesn't exist, and I prefer to know that the codebase is maintained before I use it (again, without distinction between educational and "real" projects).

Not specific enough

If your utility library is a mix of different solutions for different problems, it might not be worth keeping in our toolbox. This is probably more relevant to Python in Practice than to Doing Bayesian Data Analysis, but I think it's still worth mentioning.

Documentation

When I choose a tool to work with I want its documentation to be top notch! Take django for example. The project's documentation is nothing less than perfect, including a great tutorial for beginners. I really don't want to go looking for the book when I'm interested in using some less obvious function from a utility library.


What I'm expecting from authors instead

  • If you think that your utility functions are worth it, package them and publish them like any other package.
  • I really don't mind reading one or two additional pages of code in your book, if there's something interesting in it. Again, if the code deserves to be mentioned in your book, it may also deserve to be discussed explicitly.
  • If this functionality exists elsewhere you should reference it, and advise the reader to use it. I've never written code in R, but I was ready to learn how to work with its ecosystem. I expected Kruschke to teach me that, instead of showing me how to source his supplied scripts.
 

Late disclaimer

Don't get me wrong, supplying code as part of your book is great! But there are different ways to do it: David Beazley's Python Cookbook is full of code snippets, fully commented and explained; in Test-Driven Development with Python, Harry Percival guides the reader through developing a web app, with reference code available at github.
Don't get me wrong 2: The above doesn't mean that the books are bad.

Edit:

Don't miss Kruschke's comment below! He sheds light on the above topics from a different angle and supplies great arguments for his decisions.

Friday, May 22, 2015

Scipy_lecture_notes.epub

I bought an ereader today. Shipping takes the usual 3 weeks, but I'm already trying to put together a library for when I get the device.

Some of the books I'm interested in, or already own, are freely accessible. One example is the "Python Scientific Lecture Notes". However, I couldn't find a decent epub of it, only a pdf.

I've built the book in the epub format, from source, and it would be great if someone else enjoys it too!

Scipy_lecture_notes.epub

Monday, November 17, 2014

Python readings

I usually learn anything new by reading books. In fact, I got almost all of my python knowledge (which is not a lot, I'm just an apprentice programmer) by reading python books.
A year ago I started to learn web development from Udi Oron in Hackita (my impressions here), and shortly after started to work with him as a python teaching assistant in his courses. A few months ago I got a permanent position in one of the companies we taught in, and the reputation of someone who can answer everybody's python questions still sticks to me there. Among those questions are how to get started with python and where to find information about specific topics.
Hence, here are my thoughts about the books that helped, and are still helping, me learn python.

Disclaimers

- I'm new to python and the world of programming.
- Books don't work for everybody.
- When I first started to learn python I never thought I would end up making my living out of it, so I learned python 3 (which is the preferable language IMHO). However, most of the industry still uses python 2. All of the books below are for python 3. It doesn't mean that they won't help you learn python 2 as well, but you will have to find the differences by yourself.

Beginners books


The Quick Python Book, Second Edition (Naomi R. Ceder)


After trying different books for python (Think Python, Dive into Python 3 and Head First Python) I found this one to be the best learning book for someone who has already seen some code, but is definitely not an experienced programmer.
Part 1 is a short introduction that may also be used as a quick reference. Part 2 is a very organized tutorial for the language. It contains most of the essentials and will give you the feeling that you can continue learning on your own (or with more specialized books / tutorials). Part 3 is much less cohesive than part 2. It seems that the chapter about regular expressions could have gone into part 2, but the rest of the part is too esoteric and there are some mistakes throughout (for example, it refers you to the appendix for more information that is not there).
I didn't read part 4 completely. I've only read the information about working with databases in chapter 24 and it is very well written.
Summary: For part 2 I would give 5 stars without hesitation. But part 3, although less significant, doesn't deserve it. All in all, the book is highly recommended.

More advanced / intermediate books


Python in Practice (Mark Summerfield)


I bought this book primarily for its chapters about design patterns as well as the concurrency and the networking chapters (1 to 3, 4 and 6 respectively). The book isn't meant to be read from start to finish, but to serve as a reference and guide to each topic separately. I think I've already read most of the content of the above chapters, as well as the chapter about GUI with tkinter. I have nothing to say, though, about the two remaining chapters (extending python and 3d graphics).
The best chapter of this book is the one about high-level concurrency. In this chapter Summerfield explains in detail the difference between CPU-bound and I/O-bound concurrency and makes strong suggestions regarding the tools to use for concurrency with python 3. Namely, the suggestion is to use the threading, multiprocessing and concurrent.futures modules and never use locks or other lower-level synchronization primitives explicitly; use queues and futures instead. The examples are good, although I found the code unnecessarily complex sometimes.
On the other hand, I found the chapters about design patterns to be much less fruitful. The author's approach is too object oriented for me in places where things could be done much more easily using a decorator or two. The code examples, too, are complex and non-pythonic.
I'm sure that there are much better approaches to high-level networking than those described in this book. The author implements a remote procedure call server and client. Simple examples can be done in a simpler manner than the suggested code, and advanced use cases may call for higher-level 3rd party libraries and frameworks that remove much of the boilerplate (e.g. Django + DRF for a REST server + a requests-based client).
Summary: The high-level concurrency chapter is really great and deserves 5 stars, but the rest of the book ranges between 2 and 3.
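To make the concurrency advice mentioned above a bit more concrete, here is a minimal sketch of the "futures instead of explicit locks" style. It is my own illustration, not code from the book; the word_count and count_all names are made up for the example.

```python
import concurrent.futures


def word_count(text):
    """Hypothetical stand-in for a real I/O-bound job (e.g. a download)."""
    return len(text.split())


def count_all(texts, max_workers=4):
    """Run word_count over all texts concurrently, keeping the input order."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Each submit() returns a Future. No explicit lock is needed because
        # every task works on its own input and hands its result back
        # through its future.
        futures = [executor.submit(word_count, text) for text in texts]
        return [future.result() for future in futures]


if __name__ == "__main__":
    print(count_all(["a b c", "hello", ""]))
```

For CPU-bound work the same code should run on separate processes by swapping ThreadPoolExecutor for concurrent.futures.ProcessPoolExecutor, which exposes the same interface.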

The Python Cookbook, 3rd edition (David Beazley)


After being disappointed by "Python in Practice" I came across this book as one with a similar scope, namely a design patterns book, organized into chapters by topic that you can read in any order. In addition, this one is also a great reference book, since most of the suggested patterns are described in a short, self-contained manner.
This is a really great book! Beazley's approach is so pythonic, that is: readable, simple, DRY, less OO and more functional whenever possible, with smart usage of the standard library and high-level 3rd party libraries.
Most of the chapters of the book flow well and are easy to read. I found the meta-programming and the object oriented chapters a bit more complex, as the ideas demonstrated there are a bit too advanced for my background, but they were still great after the 2nd or 3rd read.
Summary: Assuming you already know python (don't read it if you don't) I think that this book is a must have. 5 stars are barely enough.

Scientific computing with python


As already noted, I never thought that I would find myself programming python in a full time job. Originally, I decided to learn python as a data analysis tool for my MA research. These are the main sources I've used to get the necessary knowledge.

Python for Data Analysis (Wes McKinney)


It's not a bad book, but if you are looking for a good book for scientific computing with python you will probably be disappointed.
The book covers mostly the pandas library. It doesn't give much information about numpy and matplotlib, and says nothing at all about scipy, which are all more essential for scientific computing than pandas as far as I understand the topic.
On the other hand, pandas is your go-to tool if you need to work with spreadsheet oriented data (the library highlights page summarizes its strengths pretty well).
This book was one of the first python books I read, together with The Quick Python Book above. It explains pandas in a very introductory way (pretty slowly), which makes recommending this book even harder: if you are a beginner, this book is written at the right level, but with the wrong content; if you are a more advanced programmer looking to learn a bit of pandas you may find the tutorials here comprehensive enough.
Summary: Pandas is a great tool, use it! But I don't think that this book is a good way to learn data analysis with python, whether you are a beginner or not.

Python Scientific Lecture Notes


I have to admit, I've read only the first section of the "lecture notes", but if you are looking for an introduction to scientific computing with python this "book" is definitely worth reading. It covers the basics of numpy, matplotlib and scipy very concisely, with lots of short but working code examples.

Web development with python


Two Scoops of Django: Best Practices for Django 1.6 (Daniel Greenfeld - AKA pydanny, and his wife Audrey Roy)


Can't say I've finished reading this book. It is more like a reference you open anytime you need some extra help on a topic, with emphasis on best practices.
Be aware that this book is not for beginners! But if you want to progress with python + django you're going to appreciate the suggestions found there. For django starters, go through the really good tutorial and write another django app before reading any of the suggestions in this book. It won't help you if you don't.
There are two editions of this book, for django versions 1.5 and 1.6. According to the authors there will be no more editions of this book, so don't wait for one. Take the latest, as it has much more content.
Beyond the general recommendation and the versions stuff, I will add that I don't like the "theme" of the book. The code examples themselves are great, but there are lots of illustrations that don't really help in explaining the concepts nor in remembering them.
Summary: If you take django development seriously just get yourself a copy, you won't regret it!

TDD with python (Harry J. W. Percival)


I started to read this book only recently, so I'm still in the middle of it (somewhere around chapter 17), and my very warm recommendations apply to the chapters I've read.
Percival does a great job of explaining and demonstrating the TDD discipline, introducing web development with django on the way. Although I am already familiar with django I found the author's introductory approach more than appropriate, and it let me concentrate on the TDD side rather than on understanding the framework. On the other hand, there are lots of developers who prefer a more straightforward approach, with less text and more working code snippets, so bear in mind that this is not the case with this one. Here, lots of code examples are written iteratively throughout the test cycles and across several pages. I like it!
Beyond introducing TDD, it's the first time I managed to deploy an app to a real server (I've deployed some apps to heroku before, but it is different). I would surely recommend those chapters as a standalone tutorial for deployment (chapters 8 & 9 + appendix C).
The only downside I can think of applies if you are not interested in web development at all: it will be too much work to translate the concepts in this book into a completely different subject.
Summary: Great introduction to the discipline of TDD for web development. Highly recommended. And you can even read it online for free here.

Ending words


I would really like to hear your thoughts about the recommendations, whether you agree with me, and even more so if you don't :-).
You are also welcome to contact me with any question about these books / other python resources and I will do my best to answer.

Monday, October 27, 2014

Participants movement tracking animations from my MA experiment #2

The following animated renditions are a byproduct of the video tracking and analysis of the second experiment of my MA thesis.


The figure above shows a schematic diagram of the experiment design. The videos are of sessions 1 to 3 of each of the groups (the last session wasn't analyzed). They have been of great help in gaining insights about the social interactions between the participants themselves and between the participants and the system components.







The analysis repository can be found at github.
Additional information about the research can be found here.

Friday, September 19, 2014

Create teams easily with Xteams!

I've been playing volleyball recently with a group of amateur players. In the last two months the size of our group has increased so much that it became very hard to create teams. And if you think that size is the only issue I can assure you that there are many more:

- How can one create teams when Dana doesn't want to play with Haim, who must play with Jacob but not with Yossi... You get the idea.

- No one will ever want to help in creating teams as he may end up insulting a not-so-good player by choosing him last.

- Maybe you have too many players around for one game, but just enough for a tournament of 4 teams.

In order to solve these inconveniences I've created Xteams, a web app with one goal in mind:

Create teams automatically based on discrete scores of the players

Using Xteams, group managers can give scores to players in the management panel. Players of the group can't access this panel but can see the list of players, mark which of them showed up for the game and create teams easily.

At the time of writing, the algorithm behind the team allocation is pretty simple. It takes all of the available players and the number of teams to create, and tries to find teams with equal or close to equal strength (the sum of the players' scores) by generating several random allocations and choosing the best of them.
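For the curious, the idea described above can be sketched roughly as follows. This is a simplified illustration, not the actual Xteams code: the allocate_teams name, the attempts parameter and the choice of "best" as the smallest gap between the strongest and weakest team are assumptions made for the example.

```python
import random


def allocate_teams(scores, n_teams, attempts=100, rng=None):
    """Try several random allocations and keep the most balanced one.

    scores maps a player name to a numeric score. Returns (teams, spread),
    where spread is the gap between the strongest and the weakest team.
    """
    rng = rng or random.Random()
    players = list(scores)
    best_teams, best_spread = None, None
    for _ in range(attempts):
        rng.shuffle(players)
        # Deal the shuffled players round-robin, so team sizes differ
        # by at most one player.
        teams = [players[i::n_teams] for i in range(n_teams)]
        strengths = [sum(scores[p] for p in team) for team in teams]
        spread = max(strengths) - min(strengths)
        if best_spread is None or spread < best_spread:
            best_teams, best_spread = teams, spread
    return best_teams, best_spread


if __name__ == "__main__":
    demo = {"Dana": 5, "Haim": 3, "Jacob": 4, "Yossi": 2, "Noa": 4, "Avi": 1}
    teams, spread = allocate_teams(demo, n_teams=2)
    print(teams, spread)
```

More attempts raise the chance of finding a well balanced split, at the cost of run time; for small groups even a modest number of attempts usually gets close.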

For devs

The app is still under development (aren't they all?), and many more modifications, improvements and features are considered. Any help in the development process is more than welcome (github repo).

Thanks

To the players of Nahlaot Veshut volleyball team, who consistently help with new ideas for features and additional improvements.

Wednesday, April 9, 2014

My research proposal - the most comprehensive text about the project

I've recently submitted my MA research proposal, titled: "Audio-Only Augmented Reality System for Social Interaction".
Usually, research proposals aim to present the subject and describe the intent of the research. This one is a bit more comprehensive, presenting a fully operational system I've developed, preliminary results of the system evaluation, and the exact design for a future experiment.
Feedback is always welcome.

LaTeX source can be found @ github.

Sunday, April 6, 2014

Some experiments with SimpleCV - object detection by color

Computer vision is far from my daily interests. But last weekend I participated in a semi-hackathon, developing code that aims to detect and track cards by their color.



Credit goes to this guy. I've used his code as a reference and a starting point.
This is my first experience with SimpleCV. All in all, I think the code works pretty well, despite the awful documentation of SimpleCV and some hard time working with the library.
As always, the code is written in python and is available at github. Any thoughts about the mini-project, the code and alternative computer vision libraries are always welcome.