Archive for the ‘Python’ Category

Google Style Guide

November 10th, 2009 1 comment

So, I’m reading through Peter Seibel’s “Coders at Work”, which both Joel Spolsky and Jeff Atwood blogged about. At some point I want to write more about the actual interviews, which are fantastic. But today I’m just sharing a link to something I came across in the book. In Seibel’s interview with Brad Fitzpatrick, Brad mentions the style guides that Google published. I found the link to them and passed it around my office, and it started some pretty interesting conversations. Anyway, I thought it was worth highlighting this link since it’s definitely worth a read-through.

http://google-styleguide.googlecode.com/svn/trunk/pyguide.html

The reason I even thought about looking up the link is that there was a discrepancy in the way people were naming Python modules in the project I’m working on, which was making me a bit uncomfortable. I was wondering what the Google folks had to say about it. With Guido working there, of course, their opinion has some clout in the Python world…

I’m only on the fourth interview and I’ve already been exposed to lots of great material. One thing that I finally did, after years of wanting to take the plunge, is order The Art of Computer Programming books (“volumes”) by Knuth. I have *almost* bought these books every single year since 1999, and now it just seemed time… if you read the “Coders at Work” book, you’ll see why… One great thing is that they put out a new version of the books in paperback, and they’re MUCH cheaper than previous versions. So that made the decision to finally buy them very easy.

Categories: Programming, Python Tags:

crontab shell

August 10th, 2009 1 comment

I wrote a Python script that I want to run every five minutes through a crontab. The script ran fine and found my local libraries when I ran it by hand, but once it ran through the crontab it couldn’t find them. After a bit of thought, I realized that the crontab was not running in the shell environment I expected. Apparently the shell is set in the /etc/crontab file, and there it was set to bash. That in turn was calling the wrong version of Python, and that was why my local Python libraries weren’t being found.

There are four different ways around this:
1) Modify /etc/crontab to hit the right shell; in my case (first line modified):
/etc/crontab

SHELL=/bin/tcsh
PATH=/sbin:/bin:/usr/sbin:/usr/bin
MAILTO=root
HOME=/

# run-parts
01 * * * * root run-parts /etc/cron.hourly
02 4 * * * root run-parts /etc/cron.daily
22 4 * * 0 root run-parts /etc/cron.weekly
42 4 1 * * root run-parts /etc/cron.monthly

crontab -e

*/5 * * * * python /my/script.py

2) Add the shell value to the top of the custom crontab (crontab -e); in my case:

SHELL=/bin/tcsh

*/5 * * * * python /my/script.py

3) Run the actual command through tcsh like so:

*/5 * * * * tcsh -c "python /my/script.py"

4) Directly request the right version of Python:

*/5 * * * * /tools/bin/python /my/script.py
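
Whichever option you pick, it helps to confirm what cron is actually running. The following is a minimal sketch, not part of the original script: the function name describe_environment is made up for illustration, and it only reports the interpreter, shell, and search paths that the process sees.

```python
import os
import sys


def describe_environment():
    """Collect the details that usually differ between cron and a login shell."""
    return {
        "executable": sys.executable,          # the interpreter actually running
        "version": sys.version.split()[0],
        "shell": os.environ.get("SHELL", ""),  # empty if SHELL is not set
        "path": os.environ.get("PATH", ""),
        "sys_path": list(sys.path),            # where imports are searched
    }


if __name__ == "__main__":
    for key, value in sorted(describe_environment().items()):
        print("%s: %s" % (key, value))
```

Run it once from your prompt and once from a crontab entry with the output redirected to a log file, then compare the two. If "executable" differs, that is the wrong-Python problem described above.
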
Categories: Linux, Python Tags:

os module gotcha

August 4th, 2009 No comments

Apparently documentation is important sometimes… I’m writing a Python script that runs on both Linux and Windows. My development happens on a Linux box, so until I move on to testing my scripts on Windows, I don’t notice the discrepancies. I wanted to get the username of the current user, and there is a function in the os module called getlogin() which worked perfectly on my Linux box. On Windows it raises an exception. I checked the documentation and it says that getlogin() is only supported on Unix, go figure… The way it recommends to get the username is from the os environment variable dictionary, using the “LOGNAME” key. I’ve found that “LOGNAME” is not set on Windows, but “USERNAME” was set on both of my machines (it is not guaranteed to be set on every Unix system, so check yours)…

import os
os.getenv("USERNAME")
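
An alternative worth a look is the standard library’s getpass module. This is a small sketch rather than what my script used; the wrapper name current_username is just for illustration. getpass.getuser() checks the LOGNAME, USER, LNAME and USERNAME environment variables in that order, so it covers both the Unix and the Windows variable names.

```python
import getpass


def current_username():
    """Return the login name without hard-coding one environment variable."""
    # getuser() returns the first non-empty value among LOGNAME, USER,
    # LNAME and USERNAME, and only then falls back to other lookups.
    return getpass.getuser()


if __name__ == "__main__":
    print(current_username())
```
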

Anyways, tread lightly and always look at the documentation, even when something seems to be named so appropriately in such a mainstream language/module. One of the things that I really love about Python is that it usually does what you expect it to do. I guess nothing’s perfect though, and it pays to always be on your toes.

Categories: Python Tags:

Python documentation

April 29th, 2009 2 comments

Was looking into auto-generated documentation for my Python code and came across the following projects: pydoc, Epydoc, and Sphinx. Pydoc was pretty limited and didn’t output the most useful data. Epydoc is quite easy to use and does a great job at showing all your classes, functions, and variables. And then there was Sphinx. Sphinx is really the end-all and be-all documentation tool. It does a lot and lets you do a lot, BUT you have to put a little time into it. So if the documentation is just for your own purposes, it might not be worth it. If the documentation’s for an open source project, it’s probably worth the fuss, because it’s really not that bad and makes some really good-looking docs. My one further point on this is that the other tools are very limited in scope compared to Sphinx. You use Sphinx as a tool to write documentation outside the documentation in your code. The other tools pretty much strip the docstrings from your code and make them pretty. Definitely worth looking into Sphinx for any serious open source project.

As an aside, I always wondered what Django used to make their documentation, and now it seems it was Sphinx all along. http://docs.djangoproject.com/en/dev/internals/documentation/

Epydoc: http://epydoc.sourceforge.net/
epydoc --html python_file_1 python_file_2 python_file_3 version -o packrat_docs

Sphinx: http://sphinx.pocoo.org/
By far the nicest of the three tools is Sphinx.
The Python documentation was done using Sphinx: http://docs.python.org/dev/
Sphinx was a little confusing to set up. First off there was a dependency on jinja2 that really wasn’t spelled out (beyond the errors that came up during install).

So after installing ‘jinja2’ it was as simple as running
sudo easy_install -U Sphinx

pydoc: http://docs.python.org/library/pydoc.html
Command looks something like this:
pydoc -w python_file-1
Seems to want the data to fit a certain setup.

Categories: Python Tags:

Ignore Python bytecode files in svn

April 27th, 2009 No comments

When running Subversion’s status command it is quite annoying to see files with the “pyc” extension in the list of files to look through. What you can do is set up rules within Subversion to ignore these files. Subversion calls these settings “properties”, and you can set them at different levels of your directory tree (which can actually be confusing). What happened in my situation is that one of the folders I was working in did not have this property set, so that folder’s bytecode files were showing up but not any of the others.

The command is pretty simple:
svn propset svn:ignore "*.pyc" folder_name/

To view those properties that are set to a folder simply do:
svn proplist folder_name/

To delete the “svn:ignore” property do:
svn propdel svn:ignore folder_name/

Categories: Python, Subversion Tags:

qrc file?

March 23rd, 2009 1 comment

Just started working on this new project and ran into a file type (and an issue) I’ve never run into before. The file’s extension is “qrc” and its doctype is “RCC”, but it looks very much like a straight-up XML file. I did a little lookup and found out it’s Qt’s “resource system.” Qt is a set of libraries written in C++ that includes GUI, networking, threading, and many other utilities. The resource system lets you embed different types of files into your programs more easily. These “resources” are then “part of” your program… Instead of having to figure out a way to store an image somewhere and then make the connection to your program, the image actually becomes part of your software.

I had to take the “resources.qrc” file that was in the codebase and convert it to python (since I’m using PyQt). The command to convert “resources.qrc” to “resources.py” is as simple as:

pyrcc4 -o resources.py resources.qrc

PyQT 4 Ref

Categories: Python Tags:

Revisiting long running processes in Django

March 2nd, 2009 7 comments

So in my last post on this topic I discussed different ways to handle the problem of running code that would otherwise cripple your web server, by pushing it off to another process.

It was mentioned that there may be other ways to attack the problem, either through Python threads or Django’s signals. This post is a review of those two suggestions.


Django’s signals

I’ll attack the simpler option first, since the response to whether this solution is viable takes all of one word: no. Signals are not asynchronous by nature and therefore they do not work as a solution to the problem. It is true that you can set up signals to work asynchronously, but you’d have to employ one of the methods already discussed to do that. So if you were to use signals for some asynchronous work, all they would really be doing is changing the layout of that code. You’d still need something else to actually handle the asynchronous work.

I’ll offer up a simple example of how signal code looks just for completeness.

There are three pieces to signals.
1. The signal itself
2. The sending of the signal
3. The listeners of the signal

1. Setting up the signal is as simple as including a signals.py file in your application.

import django.dispatch
test_sig = django.dispatch.Signal()

2. All you need to do to send out the request to all the listeners is call “send” off the signal. In the following code snippet I’m calling the send from one of the view functions.

from django.shortcuts import render_to_response
from django.template import RequestContext
from my_project.my_app import signals
def home(request):
    signals.test_sig.send(sender=None)
    return render_to_response('home.html',
                              context_instance=RequestContext(request),)

3. To set up a listener you call the connect function on the signal to link it to the listener, which can be an instance method or a function. Beware: the connect call has to come after the function definition. This code can really sit anywhere within your project, as long as it gets imported.

from my_project.my_app.signals import test_sig
def signal_func(*args, **kwargs):
    print("Do something here")
# connect() must come after signal_func is defined
test_sig.connect(receiver=signal_func)
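
To see why signals alone don’t solve the problem, here is a toy sketch. ToySignal is a made-up stand-in, not django.dispatch.Signal, and it only mimics the one property that matters here: send() calls each receiver in turn in the caller’s thread, so it doesn’t return until the slow receiver is done.

```python
import time


class ToySignal(object):
    """Hypothetical stand-in for a signal; it is NOT django.dispatch.Signal."""

    def __init__(self):
        self.receivers = []

    def connect(self, receiver):
        self.receivers.append(receiver)

    def send(self, sender, **kwargs):
        # Receivers run one after another in the caller's thread, so send()
        # only returns once every receiver has finished.
        return [(receiver, receiver(sender=sender, **kwargs))
                for receiver in self.receivers]


events = []


def slow_receiver(sender, **kwargs):
    events.append("receiver start")
    time.sleep(0.1)  # stands in for the long-running work
    events.append("receiver end")


toy_sig = ToySignal()
toy_sig.connect(slow_receiver)

events.append("before send")
toy_sig.send(sender=None)
events.append("after send")
print(events)
```

The "after send" entry always lands after "receiver end", which is exactly the blocking behavior a view would see.
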

Python’s threads

(Update: See Malcom’s comments below, which kind of nullify the positive things I wrote about threads and this problem.)

Threads are actually a very fitting way to handle this scenario. The thought of using threads did cross my mind when first attacking this problem, but it’s the kind of option I’ve ignored because I know you can get yourself in trouble when you start fiddling with an animal like this. At the same time, threads are very powerful and useful if used properly.

It’s really a very simple way to resolve this problem: just push off the work to another thread within the same Python application. And in Python making use of threads is actually quite simple.

Here’s a simple intro to Python threads. There are two levels of thread libraries. The lower-level module is called ‘thread’ and the higher-level module is called ‘threading’. In most cases you will probably be working with the ‘threading’ module.

Another thing to know about is the GIL (global interpreter lock). Threads appear to all be working at the same time, so there has to be a way to ensure that they are not using the same interpreter objects at the same moment. The GIL locks the interpreter so that only one thread runs Python bytecode at a time, which keeps Python’s internal data structures safe. The lock moves between threads pretty frequently (in CPython 2.x the interpreter checks whether to switch every 100 bytecode instructions by default).

Important functions to know about from threading:
__init__: initializes thread
start: starts the thread
run: code that actually runs when thread is activated
join: when called waits for thread to finish before continuing

A very simple example:

from django.shortcuts import render_to_response
from django.template import RequestContext
import threading
import time

class TestThread(threading.Thread):
    def run(self):
        # run() executes in the new thread once start() is called
        print("%s starts" % (self.getName(),))
        time.sleep(5)
        print("%s ends" % (self.getName(),))

def threadView(request):
    testThread = TestThread()
    testThread.start()
#    testThread.join() # if you remove the first pound sign on this line the view becomes synchronous.
    print("Prints right when requested just to show that the other thread is off on its own")

    return render_to_response('test.html',
                              context_instance=RequestContext(request),)
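
If you want to try the start/join difference without a Django project, here is a standalone sketch. The names Worker and handle_request are made up for illustration, and it only demonstrates ordering; it says nothing about whether threads are safe inside your web server (see the update above).

```python
import threading
import time

results = []


class Worker(threading.Thread):
    def run(self):
        time.sleep(0.2)  # stands in for the slow job
        results.append("worker done")


def handle_request(wait):
    """Start a worker; optionally block on it before 'responding'."""
    worker = Worker()
    worker.start()
    if wait:
        worker.join()  # block until the worker finishes
    results.append("response sent")
    return worker


if __name__ == "__main__":
    handle_request(wait=False).join()  # join here only so the script exits cleanly
    print(results)
```

With wait=False the response is recorded before the worker finishes; with wait=True the order flips, which is the synchronous case from the commented-out join() line.
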
Categories: Django, Python Tags:

Running long processes in Django

February 2nd, 2009 8 comments

My original issue was that I had this piece of code in my Django project that was taking way too long to run, sometimes even leading to a page load timeout. The particular situation happens very infrequently (at most once a week), so it was just a matter of getting the process to run successfully. Along the way, though, I learned a lot about spawning processes, distributed computing, and the different options we have in the Python community. The different approaches are all just ways to get the processing done outside the Django process, of course.


cron and queue

cron
I will start with the recommended way of taking care of this issue. You can set up a cron job that checks every minute to see if there’s anything to process and then runs the appropriate code. Cron jobs are pretty simple to set up and pretty effective. All you need to do is edit the crontab with the time intervals at which you want the job to run, and your code takes care of the rest.

django-queue-service
The cron part of this solution takes care of when the processing happens, but what decides whether there is anything to process? For that you’ll need some way to know when there is work to be done. There are of course multiple ways to handle this: update a table in your database, update a file, or a folder… One way is to use django-queue-service. This method requires you to run the queue service as another Django instance and then make requests to it. The sample code from the project’s page looks like this:

import httplib2
client = httplib2.Http()
body_string = "message=%s" % (message_to_be_inserted_in_queue,)
uri = 'http://dqshost:8000/q/queue_name_here/put'
(response, content) = client.request(uri=uri,
                                     method='POST',
                                     body=body_string)
# response.status is an int; response['status'] is the string '200'
if response.status != 200:
    print("Queue didn't accept the input")

While this method does make the most use of Django of all the methods I’ll discuss, I really have problems with it. It’s heavy-handed, unnecessary, and I can’t even tell if there are security concerns. Let’s say that someone set this method up improperly and exposed this Django instance to the outside world…

queue: Python module
There was a Python module I came across, which I can’t seem to find now. When I find it I’ll post a link, but it worked as a file-based queue: it would add files to folders, and there were five different folders based on the status of the item in the queue (ready, active, complete…). In my opinion this was a better way to handle the queue than django-queue-service, but it still seemed a bit uncomfortable.
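
Since I can’t point at the module itself, here is a rough sketch of the general idea only. Everything in it is an assumption: the function names are made up, and it uses just the three folder names mentioned above rather than all five. An item is a file, and its status is whichever folder it currently sits in.

```python
import os
import tempfile

STATES = ("ready", "active", "complete")  # hypothetical folder names


def init_queue(root):
    """Create one folder per state under root."""
    for state in STATES:
        path = os.path.join(root, state)
        if not os.path.isdir(path):
            os.makedirs(path)


def put(root, name, payload):
    """Add an item by writing a file into the 'ready' folder."""
    with open(os.path.join(root, "ready", name), "w") as handle:
        handle.write(payload)


def process_next(root, work):
    """Take the first ready item in name order, run work on it, mark it complete."""
    ready = sorted(os.listdir(os.path.join(root, "ready")))
    if not ready:
        return None
    name = ready[0]
    active = os.path.join(root, "active", name)
    # Moving the file is what changes the item's status.
    os.rename(os.path.join(root, "ready", name), active)
    with open(active) as handle:
        result = work(handle.read())
    os.rename(active, os.path.join(root, "complete", name))
    return result


if __name__ == "__main__":
    root = tempfile.mkdtemp()
    init_queue(root)
    put(root, "0001.job", "hello")
    print(process_next(root, lambda text: text.upper()))
```

A cron job like the one described earlier could call process_next() each time it runs, and the Django view would only need to call put().
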


Categories: Django, Python Tags:

Format Python Decimal

September 2nd, 2008 No comments

This shows you how to take a Python Decimal number and format it so that it shows two places after the decimal point (like dollar values).

>>> from decimal import Decimal
>>> num = Decimal("5")
>>> num.quantize(Decimal('.01'))
Decimal('5.00')
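
One thing to watch with dollar values is the rounding mode. quantize() uses the current context’s rounding, which is ROUND_HALF_EVEN unless you have changed it. Here is a small sketch, with a made-up helper name to_dollars, that asks for half-up rounding explicitly.

```python
from decimal import Decimal, ROUND_HALF_UP

CENTS = Decimal("0.01")


def to_dollars(value):
    """Round a Decimal to two places, rounding halves up."""
    return value.quantize(CENTS, rounding=ROUND_HALF_UP)


if __name__ == "__main__":
    print(to_dollars(Decimal("5")))
    print(to_dollars(Decimal("2.665")))
```
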

Categories: Python Tags:

Append to PYTHONPATH in Ubuntu

August 13th, 2008 2 comments

Not sure how other distros have this setup, but I know this works with Ubuntu…

First off, to view your Python search path, load the Python shell by just running “python” from your favorite prompt. Then enter the following.


>>> import sys
>>> for line in sys.path: print(line)

Modifying it is as simple as adding a path file (such as “myproject.pth”) to this folder:

/usr/local/lib/pythonx.y/site-packages

Then within the file “myproject.pth” put the path to the folder of interest.
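
If you’d like to try this without touching the real site-packages folder, here is a small sketch. The helper add_pth_entry is made up for illustration, and the temporary folders stand in for site-packages and for the folder of interest. site.addsitedir() processes the .pth files in a directory the same way Python does for site-packages at startup.

```python
import os
import site
import sys
import tempfile


def add_pth_entry(site_dir, pth_name, target_dir):
    """Write a one-line .pth file into site_dir pointing at target_dir."""
    with open(os.path.join(site_dir, pth_name), "w") as handle:
        handle.write(target_dir + "\n")


if __name__ == "__main__":
    site_dir = tempfile.mkdtemp()     # stands in for site-packages
    project_dir = tempfile.mkdtemp()  # stands in for the folder of interest
    add_pth_entry(site_dir, "myproject.pth", project_dir)
    site.addsitedir(site_dir)         # reads the .pth files in site_dir
    print(os.path.abspath(project_dir) in sys.path)
```

Note that a line in a .pth file is only added if the folder it names actually exists.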

Important aside:
Although this is useful to know, the reason I ended up figuring this out was to add the path of my project so that my own project could access a certain folder within the project. What I really missed was just adding an __init__.py within the folder I added to my project, which is why I couldn’t treat the folder as a module… Arg!

Categories: Linux, Python, Ubuntu Tags: