  • O'Reilly's Python in a Nutshell
  • The Python language reference


Python 3
Including Language Reference.
Python 2.7
Other versions of Python are available [1]
Variants and distributions
  • ipython
  • Jupyter — The Jupyter Notebook is a web application that allows you to create and share documents that contain live code, equations, visualizations and explanatory text.
  • Anaconda
Coding style
  • References:
Nice example of generating / testing regex in Python (with nice / small test framework)
  • seaborn is a powerful python toolkit to visualize statistical data.
# As simple as
py-spy --pid 12345                         # Display activity of given pid in real-time!


Very clear, terse, covering many topics (strings, regex)


In a command shell, use pydoc to get help:

pydoc repr               # Get help on 'repr' command

Same can be achieved in python interpreter:

help()                 # Interactive help
help('repr')           # Same as typing 'repr' in interactive help
help(repr)             # Help on repr builtin


Virtual Environments

A Virtual Environment is a tool to keep the dependencies required by different projects in separate places, by creating virtual Python environments for them.


Update Python

 ❗  It is not recommended to update the system Python

Some links:

Install pip and setuptools

To install setuptools, the easiest is to use pip, which comes pre-installed in later versions of Python:

pip install -U setuptools

To bootstrap setuptools on a bare installation:

cd /path/to/your/python
wget -O - | ./python
wget -O - | sudo ./python       # System-wide
wget -O - | ./python - --user   # User-local path

See Install pip setuptools and wheels for more information.

Install module online

Python comes with a wide range of libraries, called modules. There are several ways to install these modules.

Using the distribution
  • For instance, in Debian:
apt-cache search --names-only python-       # View available modules
sudo apt-get install python-pyscard         # Install the pyscard module
Using pip

pip is the new way to install modules. It uses the wheel format.

sudo pip install Pygments

This is equivalent to:

sudo python -m pip install Pygments

This last form makes explicit which Python runtime is used:

sudo /path/to/your/python -m pip install Pygments

Use --user to install for user only:

pip install --user Pygments

Use --target SITE to specify manually the target SITE:

pip install --target SITE Pygments

See tip below on how to obtain the default site.

Using easy_install

easy_install is the old way to install modules. It uses the egg format.

sudo easy_install Pygments
Using the source

Download and unpack the package

tar -xvzf pyscard-1.6.12.tar.gz
cd pyscard-1.6.12

To install globally (in /usr/local/lib/python2.7/dist-packages or similar):

sudo ./ install

To install locally (in ~/.local/lib/python2.7/site-packages), use --user:

sudo ./ install --user

One can also use pip to install from source:

sudo pip install .       # Global install
pip install --user .     # Local install

Install modules offline

To install a Python module on a machine that has no connection to the Internet [3]:

  • On a machine with internet connection
# For instance, to install package neovim
mkdir tmp && cd tmp
pip download neovim
  • On the offline machine, which has access to tmp/:
# For instance, to install package neovim
cd tmp
pip install --no-index --find-links ./ neovim

Interactive mode

Python can be run interactively, which is a very powerful way to develop new applications.


To import an existing module, use import as usual:

import mymod             # Import module in current session
from mymod import *      # Idem, but remove mymod. prefix to symbols

iPython / Jupyter

To import an existing module, use import as above or command run:

run mymod

Python variants


Use iPy (ipython) to get an interactive shell with auto-completion, instant help...

%magic                    # Get help on %magic commands (%run,...)
?run                      # Get help on %run magic
%run            # Run given script
%run -i         # ... with inspect mode on
%run -i -e      # ... ... and ignore sys.exit() call
!cmd                      # Run shell command 'cmd', for instance ...
!ls                       # ... List file in current directory


PyPy is a fast, compliant alternative implementation of the Python language, which usually runs python programs faster thanks to its Just-in-Time compiler.

On Lucid 64-bit, the easiest is to download the dedicated tarball:
tar -xvjf pypy-2.2.1-linux64.tar.bz2
Install virtualenv, then install pypy as virtual environment my-pypy-env
sudo apt-get install python-virtualenv
virtualenv -p pypy-2.2.1-linux64/bin/pypy my-pypy-env
Modules must be installed separately for this virtual environment. For instance
./my-pypy-env/bin/pip install libnum
Run python programs using python or pypy

Python 3 Reference

Source: Python reference, w3schools python tutorial and O'Reilly Python in a Nutshell


False      await      else       import     pass
None       break      except     in         raise
True       class      finally    is         return
and        continue   for        lambda     try
as         def        from       nonlocal   while
assert     del        global     not        with
async      elif       if         or         yield

In addition, the following have special meaning:

  • _* names are not imported by from module import *. Also, _ holds the last evaluation result in interactive mode.
  • __*__ system-defined names.
  • __* class-private names (rewritten as mangled form by the compiler).


See Literals in Python reference and Python in a Nutshell.

42         # Integer literal
3.14       # Floating-point literal
3.14e-10   # Floating-point literal
1.0j       # Imaginary literal

[42, 3.14, 'hello']    # List
[]                     # Empty list
100, 200, 300          # Tuple
()                     # Empty tuple
{'x':42, 'y':3.14}     # Dictionary
{}                     # Empty dictionary
{1, 2, 4, 8, 'string'} # Set
# There is no literal to denote an empty set; use set() instead
string literals (str objects)
night"""               # Triple-quoted string literal
r"\b\x"                # raw -- ignore escape sequences
R"\b\x"                # raw -- ignore escape sequences
f"name is {name!r}"    # formatted string literals
bytes literals (bytes objects)
rb"abc\x81\x82"        # raw -- ignore escape sequences
RB"abc\x81\x82"        # raw -- ignore escape sequences
formatted string literals (3.6)
f'His name is {name!r}'     # !r conversion, applies repr()
f'His name is {repr(name)}' # equivalent
                            # !s does str(), !a does ascii()
f'length is {len(name)}'    # expression
width=8; prec=3;
f'{3.14159:{width}.{prec}}' # nested fields: width and precision
number = 1024
f'{number:#0x}'             # integer formatting
today = datetime(year=2017, month=1, day=27)
f'{today:%B %d, %Y}'        # date format specifier
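
The format specifiers listed above can be checked with a short script. This is only a sketch: the sample values (name, width, prec, number) are illustrative assumptions, and the date uses numeric strftime codes so the result does not depend on the locale.

```python
from datetime import datetime

# Illustrative sample values (assumptions for this sketch)
name = "Fred"
width, prec = 8, 3
number = 1024
today = datetime(year=2017, month=1, day=27)

with_repr = f"His name is {name!r}"      # !r applies repr(), so the quotes are kept
nested = f"{3.14159:{width}.{prec}}"     # nested fields build the spec "8.3"
as_hex = f"{number:#0x}"                 # '#' adds the 0x prefix
as_date = f"{today:%Y-%m-%d}"            # datetime accepts strftime codes

print(with_repr)
print(repr(nested))
print(as_hex)
print(as_date)
```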


+       -       *       **      /       //      %      @
<<      >>      &       |       ^       ~       :=
<       >       <=      >=      ==      !=

Operators and their evaluation order, from highest to lowest:

(...) [...] {...}                     # Tuple, list & dict creation (`...` string conversion is Python 2 only)
s[i] s[i:j] s.attr f(...)             # indexing & slicing; attributes, function calls
+x, -x, ~x                            # Unary operators
x**y                                  # Power
x*y x/y x%y                           # mult, division, modulo
x+y x-y                               # addition, subtraction
x<<y   x>>y                           # Bit shifting
x&y                                   # Bitwise "and"; also intersection of sets
x^y                                   # Bitwise exclusive or
x|y                                   # Bitwise "or"; also union of sets
x<y  x<=y  x>y  x>=y  x==y x!=y       # Comparison (x<>y is Python 2 only),
x is y   x is not y                   # identity,
x in s   x not in s                   # membership
not x                                 # boolean negation
x and y                               # boolean and
x or y                                # boolean or
lambda args: expr                     # anonymous function
Arithmetic operators
1//2                                  # Floor division (PEP-238)
ternary operator
x_sign = 'positive' if (x>=0) else 'negative'
  • Use is or not for testing None
if p.poll() is None:           # Use 'is' to test for None specifically
    print("None")
if not p.poll():               # ... or 'not', which is also true for 0, '' and empty containers
    print("None")
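
The two tests are not equivalent, which matters for values such as exit codes. The sketch below assumes a poll()-style result that is None while a process runs and an integer once it has exited; describe is a hypothetical helper, not part of any library.

```python
def describe(returncode):
    """Classify a poll()-style result: None, 0, or a non-zero exit code."""
    if returncode is None:       # identity test: matches None only
        return "still running"
    if not returncode:           # truthiness test: matches 0 here
        return "exited cleanly"
    return "failed"

for value in (None, 0, 1):
    print(value, "->", describe(value))
```

With only `if not returncode`, the "still running" and "exited cleanly" cases could not be told apart.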


(       )       [       ]       {       }
,       :       .       ;       @       =       ->
+=      -=      *=      /=      //=     %=      @=
&=      |=      ^=      >>=     <<=     **=

Characters with special meanings as part of other tokens:

' " # \

Data types


True            # constant for true
False           # constant for false
bool(x)         # To convert to bool built-in type

Avoid unnecessary call to bool(x).

if x:                     # GOOD
if bool(x):               # BAD
if x is True:             # BAD
if x == True:             # BAD
if bool(x) == True:       # BAD

A valid use:

def count_trues(seq): return sum(bool(x) for x in seq)   # Ensure each item is counted either as 0 or 1


Strings in Python are immutable objects. There are many differences between Python2 and Python3.

Python 2

There are two types of strings:

  • str (like 'foo'), which are bytestrings, i.e. arrays of bytes.
# <type 'str'>
  • unicode (like u'foo'), which are text strings (Unicode).
# <type 'unicode'>

Python 3

There are two types of strings:

  • str (like 'foo'), which are text strings (Unicode).
# <class 'str'>
  • bytes (like b'foo'), which are bytestrings, i.e. arrays of bytes.
# <class 'bytes'>

So Python3's 'foo' is Python2's u'foo', and Python2's 'bar' is Python3's b'bar'.

Python 2:
  • decode() converts bytes to str (unicode).
  • encode() converts str (unicode) to bytes.
Python 3:
  • decode() exists on bytes only. Converts bytes to str (unicode).
  • encode() exists on str only. Converts str (unicode) to bytes.
b'hello' == 'hello'.encode()          # str to bytes
'hello' == b'hello'.decode()          # bytes to str
"def" in "abcdefgh"                   # substring
s.upper()                             # Change 'uppercase' to 'UPPERCASE'
', '.join(set_3)                      # Join a sequence
map(ord, hex_data)                    # [0xDE, 0xAD, 0xBE, 0xEF]


See Bitstring module.


Nice tutorial:

print a[1]                     # 3

a=[0] * 1000                   # Array with 1000 elements
len(a)                         # Number of elements

b=a                            # This only copies the REFERENCE
b[0]+=1                        # ... this also changes a[0]
b=a[:]                         # This makes a NEW COPY
b=a.copy()                     # Python >= 3.3

a[:]=a[::-1]                   # Reassign element in the list (here in reverse order)
a=a[::-1]                      # Idem, but create a new object

a.append(12);                  # Create object before appending
a[len(a):] = [13];             # Same as appending

list("abc")                    # ['a', 'b', 'c']

line = '1234567890'
n = 2
[line[i:i+n] for i in range(0, len(line), n)]   # ['12', '34', '56', '78', '90']

def shiftRow(word, n):
    return word[n:]+word[0:n]
state[i::4] = shiftRow(state[i::4],i)      # Apply shiftRow on 4 bytes distant of 4 each

alist = map(lambda b: sbox[b],alist)

state[:] = [ a ^ b for a,b in zip(state,roundKey) ]    # Ex-oring 2 lists of integers

# Multi-dimensional list
matrix = [[0 for _ in range(5)] for _ in range(5)]     # Initialize bi-dimensional array
matrix = [[0]*5 for _ in range(5)]                     # faster way
# matrix = 5*[5*[0]]                                   # WRONG - 5 times copy of same

# Compare - simply use ==
[1,2,3] == [1,2,3]                                     # True
[1,2,3] == [1,2,3,4]                                   # False
[1,2,3] == ['a','b']                                   # False

# ... to remove order and duplicates, use set()
set([1,2,3]) == set([2,1,3,3])                         # True

# Sort

# Sum
sum(x <= 10 for x in a)
sum(1 for x in a if x <= 10)                          # Generator expression

def count(iterable):
    return sum(1 for _ in iterable)
sub10Count = count(x for x in a if x <= 10)           # Cheap (doesn't create useless list) and readable

# Adding two lists element-wise (timings as reported by the source; izip is Python 2 only)
                            [sum(x) for x in zip(list1, list2)]     # 177ms
from itertools import izip; [sum(x) for x in izip(list1, list2)]    # 139ms
                            [a + b for a, b in zip(list1, list2)]   # 112ms, most pythonic
from itertools import izip; [a + b for a, b in izip(list1, list2)]  #  71ms, pythonic
from operator import add;   map(add, list1, list2)                  #  44ms

import numpy as np
vector1 = np.array([1, 2, 3])
vector2 = np.array([4, 5, 6])
sum_vector = vector1 + vector2                                      # 25x faster

# Find *first* matching item
["foo", "bar", "baz"].index("bar")                                  # 1  !!! Throws ValueError if item not found

# Find all items
[i for i, e in enumerate([1, 2, 1]) if e == 1]                      # [0, 2]
g = (i for i, e in enumerate([1, 2, 1]) if e == 1)
next(g)                                                             # 0
next(g)                                                             # 2
# Sort based on object attribute
ut.sort(key=lambda x: x.count, reverse=True)   # To sort the list in place...
newlist = sorted(ut, key=lambda x: x.count, reverse=True)  # To return a new list, use the sorted() built-in function...
(From stackoverflow [4])
for c in list(sha256.digest()):
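
The multi-dimensional pitfall flagged as WRONG in the listing above can be observed directly. A minimal sketch, using 3x3 matrices for brevity:

```python
# Correct: the comprehension builds a separate inner list for each row
good = [[0] * 3 for _ in range(3)]
good[0][0] = 1

# Pitfall: the outer multiplication repeats a reference to ONE inner list
bad = 3 * [3 * [0]]
bad[0][0] = 1

print(good)                 # only the first row changed
print(bad)                  # every "row" shows the change
print(bad[0] is bad[1])     # the rows are the same object
```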


D = { 'x':42, 'y':3.14, 'z':7 }
D['x']                              # 42
del D[k]                            # Removes from dictionary D the item whose key is k
# Sparse matrix
Matrix = {}
Matrix[1,2] = 15                    # This works because 1,2 -- a tuple -- is used as a key

for key in d:                       # Loop over keys in dictionary d
for key, value in d.iteritems():    # Loop over keys and values in dictionary d (Python 2; use d.items() in Python 3)
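
A runnable Python 3 version of the dictionary operations above might look like the following sketch. Using get() with a default of 0 for the sparse matrix is an assumption about how missing cells should read, not something the notes prescribe.

```python
D = {'x': 42, 'y': 3.14, 'z': 7}

for key in D:                      # keys only
    print(key)
for key, value in D.items():       # Python 3 replacement for iteritems()
    print(key, value)

del D['z']                         # remove the item whose key is 'z'

# Sparse matrix: tuple keys, with get() supplying the implicit zero
matrix = {}
matrix[1, 2] = 15
print(matrix.get((1, 2), 0))
print(matrix.get((0, 0), 0))
```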

Control flow statements


if x < 0: print('x is negative')
elif x % 2: print('x is positive and odd')
else: print('x is even and non-negative')

# Better style (PEP 8):
if x < 0:
    print('x is negative')
elif x % 2:
    print('x is positive and odd')
else:
    print('x is even and non-negative')


count = 0
while x > 0:
    x //= 2              # truncating division
    count += 1
print('The approximate log2 is', count)


for letter in 'ciao':
    print('give me a', letter, '...')

# target can be a tuple
for key, value in d.items():
    if key and value:        # print only true keys and values
        print(key, value)

# ... or something else (LHS expression)
prototype = [1, 'placemarker', 3]
for prototype[1] in 'xyz': print(prototype)
# prints [1, 'x', 3], then [1, 'y', 3], then [1, 'z', 3]

# Using range():
for i in range(10):
for i in range(5,10):

#Using list comprehension:
result1 = [x+1 for x in some_sequence]
#... same as:
result2 = []
for x in some_sequence:
    result2.append(x+1)
# A list comprehension may have 'if', or nested 'for' clauses
result3 = [x+1 for x in some_sequence if x>23]
result5 = [x for sublist in listoflists for x in sublist]

# Dict comprehension
d = {n:n//2 for n in range(5)}
print(d) # prints: {0: 0, 1: 0, 2: 1, 3: 1, 4: 2}


while True:               # this loop can never terminate naturally
    x = get_next()
    y = preprocess(x)
    if not keep_looping(x, y): break
    process(x, y)


for x in some_container:
    if not seems_ok(x): continue

for-else and while-else

for x in some_container:
    if is_ok(x): break # item x is satisfactory, terminate loop
else:                  # executed only if the loop ended without break
    print('Beware: no satisfactory item was found in container')
    x = None
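
As a self-contained sketch of the for-else pattern, the hypothetical helper first_ok below wraps the loop in a function so it can be tried on sample data:

```python
def first_ok(container, is_ok):
    """Return the first item accepted by is_ok, or None if there is none."""
    for x in container:
        if is_ok(x):
            break              # leaving via break skips the else clause
    else:
        x = None               # runs only when the loop was not broken
    return x

print(first_ok([1, 3, 4, 5], lambda v: v % 2 == 0))
print(first_ok([1, 3], lambda v: v % 2 == 0))
```

The else clause also runs for an empty container, which is why x is always bound when the function returns.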


if condition1(x):
    process1(x)           # placeholder handler
elif x>23 or condition2(x) and x<5:
    pass                  # nothing to be done in this case
elif condition3(x):
    process3(x)           # placeholder handler

try-except-finally-else / raise

try:
  print(x)
except NameError:      # Several except clauses can be given
  print("Variable x is not defined")
except Exception:      # Catches any other error
  print("Something else went wrong")
else:                  # Executed only if the try block raised no exception
  print("try block finished")
finally:               # Executed no matter what
  print("The 'try except' is finished")

raise Exception("Sorry, that was wrong")


The with statement is the Python embodiment of the well-known C++ idiom "resource acquisition is initialization" (RAII).

with expression [as varname]:
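
To illustrate the acquire/release guarantee, here is a minimal sketch built on contextlib.contextmanager. The resource function and the events list are made up for the demonstration.

```python
from contextlib import contextmanager

events = []   # records the order of operations for this demo

@contextmanager
def resource(name):
    events.append(f"acquire {name}")
    try:
        yield name                          # value bound by 'as'
    finally:
        events.append(f"release {name}")    # runs even if the body raises

try:
    with resource("db") as r:
        events.append(f"use {r}")
        raise ValueError("boom")
except ValueError:
    events.append("handled")

print(events)
```

The release step is recorded before the exception reaches the outer handler, which is the behaviour the RAII comparison refers to.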


a = 'global'
def afunction():
    global a                         # Use 'global' to change scope of a variable
    a = 'still using global'
    b = 'local'


def toh(cls,s):
    """ Convert a (binary) string into a hexadecimal string.
    >>> mc.toh('ABCD')
    '41424344'
    >>> mc.toh('mycrypto')
    '6d7963727970746f'
    """
    return s.encode('hex')

Docstrings can also be defined at module level.

Use module doctest to test examples in docstrings:

# Check docstring examples on exec (not on import)
if __name__ == "__main__":
    import doctest
    doctest.testmod()


An empty class:

class Empty(object):
    pass

A class with constructor and data members:

class Basic(object):
    __param = None                           # __* denotes a class-private member

    def __init__(self, param):
        self.__param = param
        print("Basic is born with param %s" % param)

A class that inherits:

class Child(Parent):
    __param = None

    def __init__(self, param):
        Parent.__init__(self)                # Must call EXPLICITLY parent constructor
        self.__param = param

Class members can be defined as properties:

class Rectangle(object):
    def __init__(self, width, height):
        self.width = width
        self.height = height
    @property
    def area(self):
        '''area of the rectangle'''
        return self.width * self.height
    @area.setter
    def area(self, value):
        scale = math.sqrt(value/self.area)
        self.width *= scale
        self.height *= scale
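
A runnable sketch of a read/write property follows, assuming the Rectangle class above is meant to use the built-in property decorator and its setter. The sample dimensions are arbitrary.

```python
import math

class Rectangle:
    def __init__(self, width, height):
        self.width = width
        self.height = height

    @property
    def area(self):
        """area of the rectangle"""
        return self.width * self.height

    @area.setter
    def area(self, value):
        scale = math.sqrt(value / self.area)   # same factor applied to both sides
        self.width *= scale
        self.height *= scale

r = Rectangle(2, 3)
r.area = 24                    # looks like an attribute assignment, calls the setter
print(r.width, r.height, r.area)
```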

Classes may have static methods and class methods [5]:

class Rectangle(object):
    max_area = 10       # A class variable shared by all instances

    def __init__(self, width, height):
        self.width = width
        self.height = height

    @staticmethod
    def give_height(area,width):
        return area / width

    @classmethod
    def get_max_height(cls,max_area):
        return cls.max_area

Using modules

Assume we have a module named

import module;               # Import everything in module.* namespace
from module import *;        # Import everything in current namespace

import sys
import module;               # Import module from a custom path

Use built-in __import__ to import a module whose name is in a string:

mymodule = __import__('mymodule')  # Import module from string
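
The standard library also offers importlib.import_module for the same purpose. The sketch below compares the two, using the stdlib modules json and xml.dom as stand-ins for a module name held in a string.

```python
import importlib

module_name = "json"                              # stand-in for a name read at run time
mod_a = __import__(module_name)
mod_b = importlib.import_module(module_name)

print(mod_a is mod_b)                             # both return the same module object
print(mod_b.dumps({"x": 42}))

# For dotted names, __import__ returns the top-level package,
# while import_module returns the submodule itself.
top = __import__("xml.dom")
sub = importlib.import_module("xml.dom")
print(top.__name__, sub.__name__)
```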

Python 2 reference


for i in range(10):
    print i

# Add a comma to remove carriage return
for i in range(10):
    print i,                     # 0 1 2 3 4 5 6 7 8 9

To enable Python 3 print function:

from __future__ import print_function        # Enable v3 print in Python 2.x

Basic I/O in Python

Source: O'Reilly Python in a Nutshell.

String formatting with format

Available since Python 2.6, and in Python 3.

# v3 - String formatting
# '{[selector][conversion]:[format_specifier]}'.format(value)
'First: {} second: {}'.format(1, 'two')
'Second: {1} first: {0}'.format(1, 'two')                        # Give positional for all 
'a: {a}, 1st: {}, 2nd: {}, a again: {a}'.format(1, 'two', a=3)   # Give name for some
'a: {a} first:{0} second: {1} first: {0}'.format(1, 'two', a=3)  # Can mix name and positional

# Using sequences and composites:
'p0[1]: {[1]} p1[0]: {[0]}'.format(('zero', 'one'), ('two', 'three'))
'p1[0]: {1[0]} p0[1]: {0[1]}'.format(('zero', 'one'), ('two', 'three'))
'{} {} {a[2]}'.format(1, 2, a=(5, 4, 3))
'First r: {.real} Second i: {a.imag}'.format(1+2j, a=3+4j)

# Field width

# Precision specification
'as f: {:.4f}'.format(x)
'as g: {:.4g}'.format(x)
'as s: {:.6s}'.format(s)

See Python in Nutshell, chapter 8 for more information.

String formatting with %

Available in Python 2 and 3.

# format % values
'result = %d' % x               # %d - decimal
'answers: %d %f' % (x, y)       # %f - float; several values must be given as a tuple
'%x' % hexval                   # Print hex
'File not found %r' % filename  # !!! USE %r to log possibly erroneous strings !!!

Input parsing

# Using built-ins

# Using ast.literal_eval()
import ast
print(ast.literal_eval('23'))     # 23
print(ast.literal_eval('[2,3]'))  # [2, 3]
print(ast.literal_eval('2+3'))    # raises ValueError
print(ast.literal_eval('2+'))     # raises SyntaxError
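
Since literal_eval signals bad input with either ValueError or SyntaxError, a small wrapper can be handy when parsing untrusted text. parse_literal is a hypothetical helper written for this sketch.

```python
import ast

def parse_literal(text, default=None):
    """Return the Python literal contained in text, or default if it is not one."""
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return default

print(parse_literal('23'))
print(parse_literal('[2,3]'))
print(parse_literal('2+'))                    # syntax error, so the default is returned
print(parse_literal('__import__("os")'))      # function calls are rejected
```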

Text output

print(value, ..., sep=' ', end='\n', file=sys.stdout, flush=False)

for i in range(10):
    print(i,"",end="")             # 0 1 2 3 4 5 6 7 8 9 
for i in range(10):
    print(f"{i} ",end="")          # 0 1 2 3 4 5 6 7 8 9 

import sys;
sys.stdout                         # Standard output
sys.stderr                         # Standard error

# Output to a file
sys.stdout.write(...)              # Using write with stdout

Text input

import sys;
sys.stdin                          # Standard input

# Input (from stdin only)
input(prompt='')                   # v3: same as v2 raw_input; v2: same as eval(raw_input(prompt))
raw_input(prompt='')               # v2 only

# read a file
f = open("demofile3.txt", "rU")    # "r" optional, "U" for universal line ending
File input
# Context manager - txt is closed automatically
with open("test.txt", "U") as txt: # "U" for universal line ending
    for line in txt:
        print(line.rstrip('\n'))   # Or rstrip() to right strip all blanks (no need for "U" then)

# Even more compact
for line in open('test.txt', 'U'): # file will be closed when object out of scope
    print(line.rstrip())           # Or rstrip() to right strip all blanks (no need for "U" then)
import fileinput

# Iterate over all files in sys.argv or stdin
for line in fileinput.input():
    print(line.rstrip())            # Right-strip all blanks (CR,LF,SPC)

# Can override list of files -- here explicit use as context manager
with fileinput.input(files=('spam.txt', 'eggs.txt'),mode="U") as f:
    for line in f:

Standard Library

sys module


sys.argv, len(sys.argv)          # Argument list, number of arguments ([0] -> exec name)
if ("-h" in sys.argv) or ("--help" in sys.argv):
for a in range(len(sys.argv)):
    if sys.argv[a] == "-e":
        # handler



io module

To open a file:

# - mode can be 'r', 'w', 'a', 'r+', 'w+', 'a+', ...
#   Default is text 't', add 'b' for binary, 'U' for universal line ending (deprecated; removed in Python 3.11)
open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True, opener=None)

with as f:            # PYTHONIC way, open is a manager
    # ...

for line in          # PYTHONIC way to read line by line, file close automatically
    # ...

f =                   # BAD. No guarantee that f gets closed

File operations:

str =              # bytestring in binary mode, text string otherwise.
str = f.readline(size=-1)
lst = f.readlines(size=-1)
f.writelines(lst)                  # Same as: for line in lst: f.write(line)


for line in f:
    # ...                          # !!! 'break' and 'next(t)' interferes with file's position
                                   # f.readline() is ok.

os and filesystem operations

import os
os.remove(path)                 # Remove a file
os.unlink(path)                 # ... idem
os.rmdir(path)                  # Remove an (empty) directory

import shutil
shutil.rmtree(path, ignore_errors=False, onerror=None)   # Remove a directory and all its content
import os.path
os.path.isfile(fname)            # True if fname exists and is a file

if not os.path.exists(directory):
    os.makedirs(directory)       # Create directory if it does not exist

try:                             # Avoid race condition if directory created by another process
    os.makedirs(path)            # But we could fix solution above as well
except OSError:                  # Raised whenever the directory already exists
    if not os.path.isdir(path):
        raise                    # Re-raise unless the path is an existing directory
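
On Python 3.2 and later, os.makedirs accepts exist_ok=True, which handles the already-exists case in a single call. A minimal sketch, using a temporary directory so it can be run anywhere:

```python
import os
import tempfile

base = tempfile.mkdtemp()                       # scratch area for the demo
target = os.path.join(base, "a", "b")

os.makedirs(target, exist_ok=True)              # creates a/ and a/b/
os.makedirs(target, exist_ok=True)              # second call does not raise
print(os.path.isdir(target))
```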

Scanning a directory

import glob
tests = glob.glob('tests/tests_*.py')
for t in tests:
    print("tests %s" % t)

random module

import random
IV = []
for i in range(16):
    IV.append(random.randint(0, 255))

datetime module

import datetime
print    # similar, but possibly more accurate
print        # date only

bitstring module

from bitstring import *
s  = Bits('0x8081828384858687')
s  = Bits(hex='8081828384858687')
s  = Bits(bytes=b'\x80\x81\x82\x83\x84\x85\x86\x87')
sa = BitArray('0x8081828384858687')    # same as Bits, but mutable

s << 8                           # Logical shift
s[8:] + '0x00'                   # ... same as above
s <<= 8                          # ... (with mutation)
sa.rol(8)                        # Cyclic shift (with mutation)
s[8:] + s[:8]                    # ... same as above

See bitstring [6] (manual)

regex module

Source: w3schools

import re
re.search("^The.*Spain$", "The rain in Spain")    # re.Match object

re.findall(".ai", "The rain in Spain")            # ['rai', 'pai']
re.split(r"\s", "The rain in Spain")              # ['The', 'rain', 'in', 'Spain']
re.sub(r"\s", "_", "The rain in Spain")           # The_rain_in_Spain
re.sub(r"\s", "_", "The rain in Spain", 2)        # The_rain_in Spain

# re.Match object
re.search(".ai", "The rain in Spain").group()     # rai
re.search(".ai", "The rain in Spain").span()      # (4, 7)
re.search(".ai", "The rain in Spain").string      # The rain in Spain


from Crypto.Cipher import AES
def toh(s):
    return s.encode('hex')
def tos(h):
    return h.replace(' ','').decode('hex')
def aes(k,p):
    return toh(a.encrypt(tos(p)))
def aesinv(k,c):
    return toh(a.decrypt(tos(c)))
def sxor(h1,h2):
    return toh(''.join(chr(ord(a) ^ ord(b)) for a,b in zip(tos(h1),tos(h2))))

Example of use:

run mycrypto                    # Assuming script in current dir and named ''
key='00112233 44556677 8899aabb ccddeeff'
p0='00000100 80000000 00000000 00000000'
p1='aaaaaaaa bbbbbbbb cccccccc dddddddd'
Modular inverse [7]
# Using gmpy - FASTEST
import gmpy
gmpy.invert(1234567, p)                      # 1000000 loops, best of 3: 737 ns per loop (p 1024-bit)
gmpy.divm(1, 1234567, p)                     # 1000000 loops, best of 3: 933 ns per loop (p 1024-bit)

# Using egcd function - NO DEPS, BUT SLOWER
def egcd(a, b):
    if a == 0:
        return (b, 0, 1)
    else:
        g, y, x = egcd(b % a, a)
        return (g, x - (b // a) * y, y)

def modinv(a, m):
    g, x, y = egcd(a, m)
    if g != 1:
        raise Exception('modular inverse does not exist')
    else:
        return x % m
timeit modinv(1234567,p)                     # 100000 loops, best of 3: 13.6 us per loop (p 1024-bit)

timeit pow(1234567,p-2,p)                    # 100 loops, best of 3: 4.22 ms per loop
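
For a dependency-free check of these approaches, the sketch below uses an iterative extended Euclid (which avoids deep recursion on large moduli) and compares it with the built-in pow. The three-argument pow with exponent -1 assumes Python 3.8 or later, and the Mersenne prime 2**127 - 1 is just a convenient sample modulus.

```python
def egcd(a, b):
    """Return (g, x, y) such that a*x + b*y == g == gcd(a, b)."""
    x0, y0, x1, y1 = 1, 0, 0, 1
    while b:
        q, a, b = a // b, b, a % b
        x0, x1 = x1, x0 - q * x1
        y0, y1 = y1, y0 - q * y1
    return a, x0, y0

def modinv(a, m):
    g, x, _ = egcd(a % m, m)
    if g != 1:
        raise ValueError('modular inverse does not exist')
    return x % m

p = 2**127 - 1                        # sample prime modulus (assumption for this demo)
a = 1234567
inv = modinv(a, p)

print(inv * a % p)                    # 1 if inv is a valid inverse
print(inv == pow(a, p - 2, p))        # Fermat's little theorem, p prime
print(inv == pow(a, -1, p))           # built-in modular inverse, Python 3.8+
```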
modular exponentiation
from gmpy import mpz
def power_mod(a, b, n):
    return long(pow(mpz(a),b,n))


The doctest module searches for pieces of text that look like interactive Python sessions, and then executes those sessions to verify that they work exactly as shown.

See example below.

# file

def toh(s):
    """ Convert a (binary) string into a hexadecimal string.
    >>> toh('DOH!')
    '444f4821'
    """
    return s.encode('hex')

if __name__ == "__main__":
    import doctest
    doctest.testmod()

Run the tests with:



Simple HTTP Server

It's very easy to setup an ad-hoc HTTP server with Python. Just open a shell in a folder with some contents to share, and type:

python -m SimpleHTTPServer      # Python 2
python3 -m http.server          # Python 3 equivalent

More available at (see BaseHTTPServer and CGIHTTPServer).

Detect interactive mode

References: [8], [9]

First method:

import __main__ as main
print hasattr(main, '__file__')

Second method:

def in_ipython():
    try:
        __IPYTHON__
    except NameError:
        return False
    return True

Third method:

import sys
print hasattr(sys, 'ps1')

Fourth method:

import sys
print bool(sys.flags.interactive)

Started with                          First   Second  Third   Fourth
python                                True    -       -       -
python -i                             True    -       -       True
python then import mymod              -       -       True    -
ipython                               True    True    -       -
ipython -i                            True    True    -       -
ipython then run                      True    True    -       -
ipython then run -i                   True    True    -       -
ipython then import mymod             -       True    -       -
ipython -i then import mymod          -       True    -       -

Find duplicates in list

From stackoverflow [10]

import collections

def fastest():                         # 134 us - Fastest
    seen = set()
    seen_add = seen.add                                            # To avoid looking up 'add' every time an item is inserted
    seen_twice = set( x for x in l if x in seen or seen_add(x) )   # adds all elements it doesn't know yet to seen and all other to seen_twice
    return list( seen_twice )                                      # turn the set into a list (as requested)

def compact():                         # 415 us
    return [x for x, y in collections.Counter(l).items() if y > 1]

def slowest():                         # 19.2 ms
    return list(set([x for x in l if l.count(x) > 1]))
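
The functions above read a global list l. A self-contained variant that takes the list as a parameter could look like this sketch; the function names and the sample data are made up, and the explicit loop replaces the `or seen_add(x)` trick for readability rather than speed.

```python
import collections

def duplicates_seen(items):
    """Track items already seen; anything met again is a duplicate."""
    seen = set()
    seen_twice = set()
    for x in items:
        if x in seen:
            seen_twice.add(x)
        else:
            seen.add(x)
    return list(seen_twice)

def duplicates_counter(items):
    """Count every item, then keep those that occur more than once."""
    return [x for x, n in collections.Counter(items).items() if n > 1]

sample = [1, 2, 3, 2, 1, 5, 6, 5, 5, 5]
print(sorted(duplicates_seen(sample)))
print(sorted(duplicates_counter(sample)))
```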

Start post-mortem debugger on exception

From stackoverflow [11]

>>> import pdb


Detect whether a variable is defined

Note it is bad practice to define a variable conditionally [12]. An interesting use case is to run code and define variable conditionally based on interactive status.

# Using try ... except
try: myvar
except NameError: print("variable 'myvar' is NOT defined")
else: print("variable 'myvar' IS defined")

# Using vars() / globals()
'myvar' in vars() or 'myvar' in globals()
# ...pedantic...
'myvar' in vars(__builtins__)

Analyse memory usage

  • See [13] — seems better suited to find memory leaks, not to analyse usage for memory hungry applications
sudo pip install -U memory_profiler
sudo pip install psutil
  • Add @profile decorator
def primes(n): 
  • Run the profiler
 python -m memory_profiler

The Pythonic way

Type import this in a Python interpreter, you get this:

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!

Detect Python 2 or Python 3 dependency

For instance, does gdb use Python 2 or 3?

ldd $(which gdb)|grep python
# => /usr/lib/x86_64-linux-gnu/ (0x00007f442a960000)

Find character in a string

The fastest and simplest is to use the in operator, like

if '.' in name:
    # ...

To detect more characters, we must use a regex [15]:

>>> import re
>>> def special_match(strg, search=re.compile(r'[^a-z0-9.]').search):
...     return not bool(search(strg))
>>> special_match("az09.")
>>> special_match("az09.\n")


  • search is faster than using match.
  • If using match, there is no need to use ^...$ to force a full match.
  • Regex should use raw string r'...'.
  • If using the regex multiple times, compile it once and reuse later!

Detect Python version, location...

From pwndbg [16]:

# Find the Python version
PYVER=$(python -c 'import platform; print(".".join(platform.python_version_tuple()[:2]))')
PYTHON=$(python -c 'import sys; print(sys.executable)')

# Find the Python site-packages that we need to use
SITE_PACKAGES=$(python -c 'import site; print(site.getsitepackages()[0])')
# or to get user site
SITE_PACKAGES=$(python -c 'import site; print(site.getusersitepackages())')

Using script above, one can install a module using pip for the given python/site installation.

# Install Python dependencies using pip
sudo ${PYTHON} -m pip install --target ${SITE_PACKAGES} -Ur requirements.txt

Display random distribution with seaborn

seaborn is a powerful python toolkit to visualize statistical data.

Assume a data file like

head -n 5 samples
# 19.2
# 6.6
# 7.9
# 5.5
# 3.6
# ...

To visualize it with seaborn:

# First setup seaborn -
%matplotlib gtk

import numpy as np
import pandas as pd
from scipy import stats, integrate
import matplotlib.pyplot as plt
import seaborn as sns
np.random.seed(sum(map(ord, "distributions")))

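# Then load our file -
y = []
with open('../samples', 'r') as file_in:
    for z in file_in.read().split('\n'):
        if z: y.append(float(z))

# Then tell seaborn to show the distribution. The plotting call is missing
# from these notes; sns.histplot (seaborn >= 0.11) is one possible choice.
sns.histplot(y, kde=True)

# Normally the graph should pop up automatically. If not:
plt.show()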

Convert bytes to str and vice-versa

Python v2 and v3 have different types of strings.

  • In v2, the type str is a sequence of bytes, while unicode is for Unicode text strings.
  • In v3, the type str is for Unicode text strings, and bytes is a sequence of bytes, also known as a bytestring or byte string.
# Python v3
isinstance(s,str)          # True if s is a unicode text string
isinstance('abc',str)      # True
isinstance(b,bytes)        # True if b is a bytestring
isinstance(b'abc',bytes)   # True
s.encode()                 # Convert a text string (str) to bytes
b.decode()                 # Convert a bytestring (bytes) to str
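
As an illustration of the conversions above in Python 3, here is a small round trip with an explicit codec. The sample text is arbitrary, and the last part shows what happens when the wrong codec is used for decoding.

```python
text = "naïve café"

raw_utf8 = text.encode("utf-8")        # str -> bytes
raw_latin1 = text.encode("latin-1")    # same text, different byte sequence

# Decoding with the codec used for encoding gives the text back.
assert raw_utf8.decode("utf-8") == text
assert raw_latin1.decode("latin-1") == text

# Decoding with the wrong codec fails...
try:
    raw_latin1.decode("utf-8")
    wrong_codec_failed = False
except UnicodeDecodeError:
    wrong_codec_failed = True

# ...unless errors are replaced by U+FFFD, which loses information.
lossy = raw_latin1.decode("utf-8", errors="replace")
print(wrong_codec_failed, lossy)
```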

XOR strings together

In Python 2.x [17]:

def sxor(s1,s2):
    return ''.join(chr(ord(a) ^ ord(b)) for a,b in zip(s1,s2))

In Python 3.x:

def bytes_xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))
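
A short usage sketch for the Python 3 version. Note that zip stops at the shorter input; xor_with_key is a hypothetical helper added here to repeat a short key with itertools.cycle.

```python
import itertools

def bytes_xor(a, b):
    # zip truncates to the shorter of the two inputs
    return bytes(x ^ y for x, y in zip(a, b))

def xor_with_key(data, key):
    # Hypothetical helper: repeat the key as often as needed
    return bytes(x ^ k for x, k in zip(data, itertools.cycle(key)))

message = b"attack at dawn"
key = b"key"
cipher = xor_with_key(message, key)
print(bytes_xor(b"abcd", b"1234"), cipher.hex())
```

Applying xor_with_key twice with the same key gives the original data back.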

Various conversion

Binary 00110101
# Or use bin to convert an integer into binary literal string ('0b' prefix)
>>> bin(173)
'0b10101101'
# Binary literals are regular integers
>>> 0b101111
47
# Use int(..., 2) to convert a binary string into integer
>>> int('01010101111', 2)
687
>>> int('11111111', 2)
255
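
A few more conversions in the same spirit, as a Python 3 sketch. The values are arbitrary; format, int, int.to_bytes, int.from_bytes, bytes.fromhex and bytes.hex are all standard.

```python
# Integer to zero-padded binary / hexadecimal strings (no prefix)
as_binary = format(53, '08b')
as_hex = format(255, '04x')

# Strings back to integers, in base 2 and base 16
from_binary = int('00110101', 2)
from_hex = int('ff', 16)

# Integer to and from a fixed-size big-endian byte string
as_bytes = (0xABCD).to_bytes(2, 'big')
from_bytes = int.from_bytes(as_bytes, 'big')

# Hex string to and from bytes
raw = bytes.fromhex('b9f9')
hex_again = raw.hex()

print(as_binary, as_hex, from_binary, from_hex, as_bytes, from_bytes, hex_again)
```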

Reverse a string

>>> 'hello world'[::-1]
'dlrow olleh'

Reload a module in interactive python

There is a reload function, whose location depends on the Python version:

  • Python3 >= 3.4: importlib.reload(some_module)
  • Python3 < 3.4: imp.reload(some_module)
  • Python2: reload(some_module) built-in

For instance

import importlib
import some_module

# hack hack...

importlib.reload(some_module)           # Reload module


  • reload does not reload dependencies.
  • It does not work when module is loaded like from some_module import *.
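
The second caveat can be reproduced with a throw-away module. This is a sketch for Python 3: the module name demo_reload_mod and its VALUE attribute are invented for the demonstration, and bytecode caching is disabled so the rewritten source is always re-read.

```python
import importlib
import pathlib
import sys
import tempfile

sys.dont_write_bytecode = True        # avoid reusing a stale .pyc file

with tempfile.TemporaryDirectory() as tmp:
    mod_path = pathlib.Path(tmp) / "demo_reload_mod.py"
    mod_path.write_text("VALUE = 1\n")
    sys.path.insert(0, tmp)
    try:
        import demo_reload_mod
        from demo_reload_mod import VALUE     # copies the binding

        # "hack hack": change the module source, then reload it
        mod_path.write_text("VALUE = 2222\n")
        importlib.invalidate_caches()
        importlib.reload(demo_reload_mod)

        via_module = demo_reload_mod.VALUE    # sees the new value
        via_from_import = VALUE               # still the old value
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("demo_reload_mod", None)

print(via_module, via_from_import)
```

Names imported with from ... import are plain copies of the old bindings, so they have to be imported again after the reload.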

Usually it's simpler to restart the interpreter (the -i option keeps it interactive after the command has run):

python3 -i -c 'from some_module import *'
# >>> hack hack...
# >>> <CTRL-D>
python3 -i -c 'from some_module import *'
# >>> ....

Benchmark an algorithm

From the shell, using the timeit module:

python -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99' '[item for sublist in l for item in sublist]'
# 10000 loops, best of 3: 143 usec per loop
python -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99' 'sum(l, [])'
# 1000 loops, best of 3: 969 usec per loop
python -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99' 'reduce(lambda x,y: x+y,l)'
# 1000 loops, best of 3: 1.1 msec per loop

Or directly in Python, using timeit.Timer:

>>> import timeit
>>> timeit.Timer(
...     '[item for sublist in l for item in sublist]',
...     'l=[[1, 2, 3], [4, 5, 6, 7, 8], [1, 2, 3, 4, 5, 6, 7]] * 10000'
... ).timeit(100)                     # Total time in seconds for 100 runs
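
For comparing several candidates from a script, timeit.repeat with the globals argument (Python 3.5+) avoids building setup strings. This is only a sketch: the statements and repetition counts are arbitrary choices, and the absolute timings depend on the machine.

```python
import itertools
import timeit

nested = [[1, 2, 3], [4, 5, 6], [7], [8, 9]] * 99

# Statements to compare; they look up their names in 'namespace'
candidates = {
    "comprehension": "[item for sublist in nested for item in sublist]",
    "chain": "list(itertools.chain.from_iterable(nested))",
    "sum": "sum(nested, [])",
}
namespace = {"nested": nested, "itertools": itertools}

# Keep the best of 3 repetitions of 100 runs each, in seconds
timings = {
    name: min(timeit.repeat(stmt, globals=namespace, number=100, repeat=3))
    for name, stmt in candidates.items()
}
fastest = min(timings, key=timings.get)

for name, seconds in sorted(timings.items(), key=lambda kv: kv[1]):
    print(f"{name:15s} {seconds * 1e6 / 100:8.1f} usec per loop")
```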

Flatten a list of lists (of lists...)

from SO:

import functools
import itertools
import operator

l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]

# Fastest - using iconcat
functools.reduce(operator.iconcat, l, [])

# Also among the fastest - using itertools
list(itertools.chain.from_iterable(l))  # Since Python 2.6, no unpacking needed

# Using list comprehension - very fast
flat_list = [item for sublist in l for item in sublist]

# Using sum - very compact, but quadratic: only reasonable for small lists
sum(l, [])

# Using lambda, slowest
functools.reduce(lambda x, y: x + y, l)

See also this blogspot, for a non-recursive solution that can process even deeply nested lists.
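
A non-recursive flatten can be sketched with an explicit stack of iterators. This is not the code from the linked post, just one way to do it; flatten_deep is a name chosen here, and only lists and tuples are treated as nested containers.

```python
def flatten_deep(nested):
    """Flatten arbitrarily nested lists/tuples without recursion."""
    flat = []
    stack = [iter(nested)]            # iterators still to be consumed
    while stack:
        try:
            item = next(stack[-1])
        except StopIteration:
            stack.pop()               # this level is exhausted
            continue
        if isinstance(item, (list, tuple)):
            stack.append(iter(item))  # descend one level
        else:
            flat.append(item)
    return flat

print(flatten_deep([1, [2, [3, [4]], 5], (6, 7), []]))
```

Because the nesting depth only grows the explicit stack, the function is not limited by the interpreter's recursion limit.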

Detect last element in a for loop

From SO:

def lookahead(iterable):
    """Pass through all values from the given iterable, augmented by the
    information if there are more values to come after the current one
    (False), or if it is the last value (True).
    """
    # Get an iterator and pull the first value.
    it = iter(iterable)
    try:
        last = next(it)
    except StopIteration:
        return                # Empty iterable: nothing to yield.
    # Run the iterator to exhaustion (starting from the second value).
    for val in it:
        # Report the *previous* value (more to come).
        yield last, False
        last = val
    # Report the last value.
    yield last, True

for i, is_last in lookahead(range(3)):
    print(i, is_last)

Swap two variables

The pythonic way [18]:

a,b = b,a

Print to stderr

# For Python 2:
# from __future__ import print_function
import sys

def eprint(*args, **kwargs):
    print(*args, file=sys.stderr, **kwargs)

Note that stderr is unbuffered in Python 2 and line-buffered since Python 3.9, so there is usually no need to flush [19].

Do's and don'ts

foo = 'abcdef'
l = list(foo)                     # DO
foo = 'abcdef'
l = [c for c in foo]              # don't
foo = list(...)
g = map(blah, foo)                # DO
foo = list(...)
g = [blah(i) for i in foo]        # don't
A = [[0]*5 for _ in range(5)]     # DO
A = [[0]*5]*5                     # don't
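
The last pair deserves a short demonstration: multiplying the outer list copies references to one and the same inner list, so every row changes at once. A minimal sketch with 3x3 grids:

```python
shared = [[0] * 3] * 3                      # three references to ONE row
independent = [[0] * 3 for _ in range(3)]   # three distinct rows

shared[0][0] = 1
independent[0][0] = 1

print(shared)        # every row shows the change
print(independent)   # only the first row changed
```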


Frequent mistakes. Beware the snake can bite you!

Confuse a method and a property in a test

if A.isdummy():            # This will fail if isdummy is a property
if A.isdummy:              # Always True if isdummy is a method

Note that a property should only be used to extend the behaviour of a class variable. Properties are designed to make it safe to publish variables in a class interface, and to get rid of useless mutators/accessors (see Python in a Nutshell, Why properties are important). Don't use a property as a replacement for a method when designing a new class.

Stick to a convention: for instance, always define predicates such as isxyz() or hasabc() as methods. Note that defining them as properties would raise an exception if they are called as functions, and hence might be safer.
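
Both failure modes can be seen with a small class. Widget and its attribute names are invented for this sketch.

```python
class Widget:
    def __init__(self, dummy):
        self._dummy = dummy

    def isdummy(self):            # a method
        return self._dummy

    @property
    def dummy(self):              # a property
        return self._dummy

w = Widget(False)

# Forgetting the parentheses on a method: a bound method is always truthy.
method_without_call = bool(w.isdummy)
method_with_call = w.isdummy()

# Calling a property: the returned value (here a bool) is not callable.
try:
    w.dummy()
    property_call_error = None
except TypeError as exc:
    property_call_error = str(exc)

print(method_without_call, method_with_call, property_call_error)
```

The first mistake is silent, while the second one raises immediately, which is the safety argument made above.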

Mix 0 with None in a sequence

When a sequence mixes 0 and None, a plain truth test cannot tell whether an element is defined:
a = [0,None,None,None]
bool(a[0])           # --> False
bool(a[1])           # --> False !!! How can we tell them apart?
a[1] is None         # --> True      This works

Mixing property and normal getter

SOLUTION: prefix all getter methods with get, like getvalue()
b = a.prop           # Using a property, OR
b = a.getprop()      # Using a getter

Forget that, in a python function, arguments are always passed by value

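def f(x, y):
    x = 23                 # Rebinds the local name only: the caller's a is unchanged
    y.append(42)           # Mutates the object that the caller also references
a = 77
b = [99]
f(a, b)
print(a, b)                # prints: 77 [99, 42]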

To reassign the content of a list in a function, use the a[:] construct:

def f(a):
   a[:]=a[::-1]             # This does NOT rebind a to a new list: it replaces the elements of the original list
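
The difference between rebinding a parameter and assigning to a slice can be checked as follows. This is a sketch; the function names are chosen for the illustration.

```python
def rebind(items):
    items = items[::-1]        # new list bound to the local name only

def reverse_in_place(items):
    items[:] = items[::-1]     # replaces the elements of the caller's list

data = [1, 2, 3]
alias = data                   # second reference to the same list

rebind(data)
after_rebind = list(data)      # unchanged

reverse_in_place(data)
after_slice_assign = list(data)

print(after_rebind, after_slice_assign, alias is data)
```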

Use bytes, not string of characters

Characters can be unicode and take more than one byte.
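
A quick Python 3 illustration, with an arbitrary sample string: the number of characters and the number of bytes differ, and cutting a byte string at an arbitrary offset can split a character.

```python
text = "héllo €"
raw = text.encode("utf-8")

n_chars = len(text)    # counts Unicode characters
n_bytes = len(raw)     # counts bytes: 'é' takes 2 bytes, '€' takes 3

# Truncating the bytes in the middle of 'é' makes them undecodable.
try:
    raw[:2].decode("utf-8")
    truncation_ok = True
except UnicodeDecodeError:
    truncation_ok = False

print(n_chars, n_bytes, truncation_ok)
```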


Mixing string and bytestring (v3)

buf = b'abc\n'
if buf.find(b'\n') != -1:  # MUST use BYTESTRING here; find returns -1 when not found
    # ....
txt = 'abc\n'
if txt.find('\n') != -1:   # MUST use STRING here
    # ....

Forget self. when using class members

class MyClass(object):
    buf = b''

    def UpdateBuf(self,new_buf):
        buf = new_buf                 # WRONG! This only binds a local variable
        self.buf = new_buf            # CORRECT!


Read a file line by line

Sources: [20]

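Shortest version, with universal line endings (mode "U"). The file is only closed when the file object is garbage-collected, and mode "U" is for Python 2: it is deprecated in Python 3 and removed in 3.11, where universal newlines are the default in text mode: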

for line in open("path/to/file.txt","U"): # U: universal line ending
    print(line.strip())                   # or strip('\r')

Slightly longer version with with:

with open("path/to/file.txt") as f:       # assume read-text mode "rt"
    for line in f:
        print(line.strip())               # or strip('\r')

Counting line numbers:

with open('path/to/file.txt') as f:
    for cnt, line in enumerate(f, start=1):
        print(f"Line {cnt}: {line.strip()}")

The long old way:

f = open("path/to/file.txt")
line = f.readline()
cnt = 1
while line:
    print(f"Line {cnt}: {line.strip()}")
    line = f.readline()
    cnt += 1
f.close()

Simple TCP server

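import socket, socketserver
import sys
import itertools

SERV_ADDR = ""                  # Assumption: listen on all interfaces
SERV_PORT = 8000                # Assumption: default port, not given in these notes


class Handler(socketserver.BaseRequestHandler):
    messages = b""
    def handle(self):
        token = 0
        client = self.request

        try:
            while True:
                buf = client.recv(1)
                if not buf:     # Client closed the connection
                    break
                # buf = client.recv(len)
                # client.send(buf)
        except socket.error as msg:
            print("Socket error:", msg, file=sys.stderr)


port = SERV_PORT
if len(sys.argv) > 1:
    port = int(sys.argv[1])
server = socketserver.TCPServer((SERV_ADDR, port), Handler)
server.serve_forever()          # Handle requests until interrupted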
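
To try such a server without a separate client program, it can be run in a background thread and queried from the same script. The sketch below is an assumption-laden variant, not the server above: EchoHandler echoes whatever it receives, the server binds to 127.0.0.1 on a port chosen by the OS (port 0), and echo_roundtrip is a helper invented here.

```python
import socket
import socketserver
import threading

class EchoHandler(socketserver.BaseRequestHandler):
    def handle(self):
        while True:
            data = self.request.recv(1024)
            if not data:                    # client closed its sending side
                break
            self.request.sendall(data)      # echo back

def echo_roundtrip(payload):
    """Start a temporary echo server, send payload, return the reply."""
    with socketserver.TCPServer(("127.0.0.1", 0), EchoHandler) as server:
        host, port = server.server_address  # port actually chosen by the OS
        thread = threading.Thread(target=server.serve_forever, daemon=True)
        thread.start()
        try:
            with socket.create_connection((host, port), timeout=5) as client:
                client.sendall(payload)
                client.shutdown(socket.SHUT_WR)   # signal end of data
                chunks = []
                while True:
                    chunk = client.recv(1024)
                    if not chunk:
                        break
                    chunks.append(chunk)
        finally:
            server.shutdown()               # stop serve_forever
            thread.join()
    return b"".join(chunks)

print(echo_roundtrip(b"ping"))
```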

Docstrings and Doctest

Specifications: pep-0257

  • To write good module docstrings, "think about somebody doing help(yourmodule) at the interactive interpreter's prompt — what do they want to know?" [21].
  • See pep-0257 for more recommendations
Using doctest

You can include tests, in the form of examples, in your Python modules' docstrings [22].

For instance, here is a file. It contains:

  • A function with a docstring, and example of use with some test values.
  • A footer code that will call doctest.testmod() function if the module is loaded as main file.
import binascii

def sxor(s1,s2):
    """Xor two hex strings together.
    >>> sxor('abcd','1234')
    'b9f9'
    """
    s1 = binascii.unhexlify(s1)
    s2 = binascii.unhexlify(s2)
    return binascii.hexlify(bytes(a ^ b for a,b in zip(s1,s2))).decode()

# Footer to trigger doctest automatically.
# Alternatively, trigger it with:
#     python -m doctest
if __name__ == "__main__":
    import doctest
    doctest.testmod()
Now, we can run the tests with:


No output means there were no errors. Use -v to get more output:

python3 -v
# Trying:
#     sxor('abcd','1234')
# Expecting:
#     'b9f9'
# ok
# 1 items had no tests:
#     __main__
# 1 items passed all tests:
#    1 tests in __main__.sxor
# 1 tests in 2 items.
# 1 passed and 0 failed.
# Test passed.

Instead of using the footer code, one may call doctest from the command line (since Python 2.6):

python3 -m doctest


Big numbers
  • gmpy based on GMP
  • libnum a lighter bignum library, but compatible with pypy.


Set source file encoding

Add any of these lines [23]:

# -*- coding: utf-8 -*-
# vim: set fileencoding=utf-8 :
Write the BOM

See [24]

import codecs

file = codecs.open("lol", "w", "utf-8")
file.write(u'\ufeff')                          # or use unicode name: u'\N{ZERO WIDTH NO-BREAK SPACE}'
file.close()

# Using the utf-8-sig codec, which writes the BOM automatically
with codecs.open("test_output", "w", "utf-8-sig") as temp:
    temp.write("hi mom\n")
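
The effect of the utf-8-sig codec can be checked in memory, without writing a file. A small Python 3 sketch:

```python
import codecs

with_bom = "hi mom\n".encode("utf-8-sig")      # BOM is prepended
has_bom = with_bom.startswith(codecs.BOM_UTF8)

stripped = with_bom.decode("utf-8-sig")        # BOM removed on decoding
kept = with_bom.decode("utf-8")                # BOM kept as U+FEFF

print(with_bom, has_bom, repr(stripped), repr(kept))
```

Decoding with utf-8-sig also accepts input without a BOM, which makes it a convenient choice when reading files of unknown origin.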
Handling unicode

Some recommend always processing Unicode internally, decoding on input and encoding on output [25]:

line = line.decode('utf-8')
# ...treat line as unicode...
print line.encode('utf-8')

But this is error-prone. Another proposed solution is to redefine sys.stdout:

import sys
import codecs
sys.stdout = codecs.getwriter('utf8')(sys.stdout)

A hackish way (not recommended):

# -*- coding: utf-8 -*-
import sys
print u"åäö"

Python 2 to Python 3

Use python v3 print in v2
from __future__ import print_function

This way, print() prints an empty line in v2 instead of ().

Coding style

From PEP 8, Coding Style.

  • Use pycodestyle to check code conformance:
pip install pycodestyle
  • Use autopep8 to format existing code:
pip install autopep8
autopep8 --in-place
Naming conventions
lower_case_variable = None

def lower_case_func():
    # ...

class ClassNameAreCapsWord:
    # ...
Some good/bad practices
# BAD - superfluous 'pass'
class InvalidAttribute(AttributeError):
    """Used to indicate attributes that could never be valid"""
    pass

# GOOD
class InvalidAttribute(AttributeError):
    """Used to indicate attributes that could never be valid"""

# BAD - the file is never closed explicitly
f = open('file.txt')
a = f.read()
print(a)

# GOOD - 'with' closes the file
with open('file.txt') as f:
    for line in f:
        print(line)

# BAD - backslash continuations
my_very_big_string = """For a long time I used to go to bed early. Sometimes, \
    when I had put out my candle, my eyes would close so quickly that I had not even \
    time to say “I’m going to sleep.”"""

from some.deep.module.inside.a.module import a_nice_function, another_nice_function, \
    yet_another_nice_function

# GOOD - implicit continuation inside parentheses
my_very_big_string = (
    "For a long time I used to go to bed early. Sometimes, "
    "when I had put out my candle, my eyes would close so quickly "
    "that I had not even time to say “I’m going to sleep.”"
)

from some.deep.module.inside.a.module import (
    a_nice_function, another_nice_function, yet_another_nice_function)


Troubleshooting a missing library

  • Use python -v -c "import mylibrary" to troubleshoot a module.
  • Look at the log for the loaded libraries.
  • Some libraries are statically linked in python and might be missing. Use ldd to see the linked libraries, and report missing ones.
ldd /path/to/your/
# =>  (0xf77c3000)
# => not found
# => not found
# => not found
# => /lib/i386-linux-gnu/ (0xf776a000)
# => /lib/i386-linux-gnu/ (0xf75b3000)
# /lib/ (0x5659b000)
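
Before digging into ldd output, it can help to separate "module not found" from "module found but failing to import". A rough sketch using importlib; diagnose_import is a name invented here, and it is meant for top-level module names.

```python
import importlib
import importlib.util

def diagnose_import(name):
    """Tell whether a top-level module is missing or fails while importing."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return "not found on sys.path"
    try:
        importlib.import_module(name)
    except ImportError as exc:
        # Typical for extension modules whose shared libraries are missing
        return f"found at {spec.origin}, but import failed: {exc}"
    return f"ok ({spec.origin})"

for module_name in ("json", "surely_missing_module_xyz"):
    print(module_name, "->", diagnose_import(module_name))
```

In the "found, but import failed" case, spec.origin gives the file to inspect with ldd.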