Python: Difference between revisions

Revision as of 08:43, 27 September 2018

References

Books

O'Reilly's Python in a Nutshell
The Python language reference

Links

==> The Python Standard Library <==
The Python Tutorial
Python 2.7.6 docs
Python Quick Reference 2.7 — Extremely complete

Other versions of Python are available [1]

Variants and distributions

ipython
Jupyter — The Jupyter Notebook is a web application that allows you to create and share documents that contain live code, equations, visualizations and explanatory text.
Anaconda

Python 3

Built-in Types

PEP

Tools

autopep8 — A tool that automatically formats Python code to conform to the PEP 8 style guide

sudo pip install --upgrade autopep8

Miscellaneous

Nice example of generating / testing regex in Python (with nice / small test framework) [2]

Libraries

seaborn is a powerful python toolkit to visualize statistical data.

Profiler

py-spy a sampling profiler for Python

Shell

In a command shell, use pydoc to get help:

pydoc repr               # Get help on 'repr' command

Same can be achieved in python interpreter:

help()                 # Interactive help
help('repr')           # Same as typing 'repr' in interactive help
help(repr)             # Help on repr builtin

Install

Virtual Environments

A Virtual Environment is a tool to keep the dependencies required by different projects in separate places, by creating virtual Python environments for them.

References

Guide to Python — Virtual Environments
Is it possible to install another version of Python to Virtualenv? (stackoverflow.com)

Install pip and setuptools

To install setuptools, the easiest is to use pip, which comes pre-installed in later versions of Python:

pip install -U setuptools

To bootstrap setuptools on a bare installation:

cd /path/to/your/python
wget https://bootstrap.pypa.io/ez_setup.py -O - | ./python
wget https://bootstrap.pypa.io/ez_setup.py -O - | sudo ./python       # System-wide
wget https://bootstrap.pypa.io/ez_setup.py -O - | ./python - --user   # User-local path

See Install pip setuptools and wheels for more information.

Install module online

Python comes with a wide range of libraries, called modules. There are several ways to install these modules.

Using the distribution

For instance, in Debian:

apt-cache search --names-only python-       # View available modules
sudo apt-get install python-pyscard         # Install the pyscard module

Using pip

pip is the new way to install modules. It uses the wheel format.

sudo pip install Pygments

This is equivalent to:

sudo python -m pip install Pygments

This last form can be used to state explicitly which Python runtime must be used:

sudo /path/to/your/python -m pip install Pygments

Use --user to install for user only:

pip install --user Pygments

Use --target SITE to specify manually the target SITE:

pip install --target SITE Pygments

See tip below on how to obtain the default site.

Using easy_install

easy_install is the old way to install modules. It uses the egg format.

sudo easy_install Pygments

Using the source

Download and unpack the package:

wget http://sourceforge.net/projects/pyscard/files/pyscard/pyscard%201.6.12/pyscard-1.6.12.tar.gz#md5=908d2530972ea91eb4bb66987e0e1e98
tar -xvzf pyscard-1.6.12.tar.gz
cd pyscard-1.6.12

To install globally (in /usr/local/lib/python2.7/dist-packages or similar):

sudo ./setup.py install

To install locally (in ~/.local/lib/python2.7/site-packages), use --user (without sudo, otherwise the module lands in root's home directory):

./setup.py install --user

One can also use pip to install from source:

sudo pip install .       # Global install
pip install --user .     # Local install

Install modules offline

To install a Python module on a machine that has no connection to Internet [3]:

On a machine with internet connection

# For instance, to install package neovim
mkdir tmp && cd tmp
pip download neovim

On the offline machine, which has access to tmp/:

# For instance, to install package neovim
cd tmp
pip install --no-index --find-links ./ neovim

Import modules

Assume we have a module named module.py:

import module                # Import everything in module.* namespace
from module import *         # Import everything in current namespace

Interactive mode

Python can be run interactively, which is a very powerful way to develop new applications.

Python

To import an existing module, use import as usual:

import mymod             # Import module in current session
from mymod import *      # Idem, but remove mymod. prefix to symbols

iPython / Jupyter

To import an existing module, use import as above or command run:

run mymod

Python variants

iPy

Use iPy (ipython) to get an interactive shell with auto-completion, instant help...

%magic                    # Get help on %magic commands (%run,...)
?run                      # Get help on %run magic
%run script.py            # Run given script
%run -i script.py         # ... with inspect mode on
%run -i -e script.py      # ... ... and ignore sys.exit() call
!cmd                      # Run shell command 'cmd', for instance ...
!ls                       # ... List file in current directory

Pypy

PyPy is a fast, compliant alternative implementation of the Python language, which usually runs python programs faster thanks to its Just-in-Time compiler.

Install: On Lucid 64-bit, the easiest is to download the dedicated tarball:

wget https://bitbucket.org/pypy/pypy/downloads/pypy-2.2.1-linux64.tar.bz2
tar -xvjf pypy-2.2.1-linux64.tar.bz2

Install virtualenv, then install pypy as virtual environment my-pypy-env

sudo apt-get install python-virtualenv
virtualenv -p pypy-2.2.1-linux64/bin/pypy my-pypy-env

Modules must be installed separately for this virtual environment. For instance:

./my-pypy-env/bin/pip install libnum

Run: Run python programs using python or pypy

./my-pypy-env/bin/pypy

Reference

range

Keywords

and     continue  except   global  lambda  raise   yield
as      def       exec     if      not     return
assert  del       finally  import  or      try
break   elif      for      in      pass    while
class   else      from     is      print   with

Reserved class of identifiers

From the Python reference:

_* — Not imported by from module import *. The special identifier _ is used in the interactive interpreter to store the result of the last evaluation.
__*__ — System-defined names (for instance __init__ used for constructors).
__* — Class-private names. Names in this category, when used within the context of a class definition, are re-written to use a mangled form to help avoid name clashes between “private” attributes of base and derived classes. See section Identifiers (Names)
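The name mangling of class-private names can be observed directly. The following is a minimal sketch; the class names Base and Derived and the attribute __token are hypothetical, chosen only for illustration:

```python
class Base(object):
    def __init__(self):
        self.__token = 'base'        # actually stored as _Base__token

    def base_token(self):
        return self.__token          # reads _Base__token


class Derived(Base):
    def __init__(self):
        Base.__init__(self)
        self.__token = 'derived'     # stored as _Derived__token, so no clash


d = Derived()
print(d.base_token())                # base
print(sorted(vars(d)))               # ['_Base__token', '_Derived__token']
```

Both attributes coexist on the same instance, which is the clash the mangling is meant to avoid.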

Operators

+  -  *  /  %   **  //  <<  >>  &
|  ^  ~  <  <=  >   >=  <>  !=  ==

In v3, @ is also an operator.

Operators and their evaluation order, from highest to lowest:

, [...] {...} `...`                   # Tuple, list & dict. creation; string conv.
s[i] s[i:j] s.attr f(...)             # indexing & slicing; attributes, function calls
+x, -x, ~x                            # Unary operators
x**y                                  # Power
x*y x/y x%y                           # mult, division, modulo
x+y x-y                               # addition, subtraction
x<<y   x>>y                           # Bit shifting
x&y                                   # Bitwise "and"; also intersection of sets
x^y                                   # Bitwise exclusive or
x|y                                   # Bitwise "or"; also union of sets
x<y  x<=y  x>y  x>=y  x==y x!=y  x<>y # Comparison,
x is y   x is not y                   # identity,
x in s   x not in s                   # membership
not x                                 # boolean negation
x and y                               # boolean and
x or y                                # boolean or
lambda args: expr                     # anonymous function

Delimiters

(   )   [   ]    {    }
,   :   .   `    =    ;   @
+=  -=  *=  /=   //=  %=
&=  |=  ^=  >>=  <<=  **=

Characters with special meanings as part of other tokens:

' " # \

Literals

42         # Integer literal
3.14       # Floating-point literal
1.0j       # Imaginary literal
'hello'    # String literal
"world"    # Another string literal
"""Good
night"""   # Triple-quoted string literal

[42, 3.14, 'hello']    # List
[]                     # Empty list
100, 200, 300          # Tuple
()                     # Empty tuple
{'x':42, 'y':3.14}     # Dictionary
{}                     # Empty dictionary
{1, 2, 4, 8, 'string'} # Set
# There is no literal to denote an empty set; use set() instead

More

See Python reference and Python in a Nutshell

Data types

Boolean

True            # constant for true
False           # constant for false
bool(x)         # To convert to bool built-in type

Avoid unnecessary call to bool(x).

if x:                     # GOOD
if bool(x):               # BAD
if x is True:             # BAD
if x == True:             # BAD
if bool(x) == True:       # BAD

A valid use:

def count_trues(seq): return sum(bool(x) for x in seq)   # Ensure each item is counted either as 0 or 1

Control flow statements

If

if x < 0: print('x is negative')
elif x % 2: print('x is positive and odd')
else: print('x is even and non-negative')

# Better style (PEP 8):
if x < 0:
    print('x is negative')
elif x % 2:
    print('x is positive and odd')
else:
    print('x is even and non-negative')

While

count = 0
while x > 0:
    x //= 2              # truncating division
    count += 1
print('The approximate log2 is', count)

For

for letter in 'ciao':
    print('give me a', letter, '...')

# target can be a tuple
for key, value in d.items():
    if key and value:        # print only true keys and values
        print(key, value)

# ... or something else (LHS expression)
prototype = [1, 'placemarker', 3]
for prototype[1] in 'xyz': print(prototype)
# prints [1, 'x', 3], then [1, 'y', 3], then [1, 'z', 3]

# Using range:
for i in range(n):
    statement(s)

#Using list comprehension:
result1 = [x+1 for x in some_sequence]
#... same as:
result2 = []
for x in some_sequence:
    result2.append(x+1)
# Comprehension list may have 'if', or nested for
result3 = [x+1 for x in some_sequence if x>23]
result5 = [x for sublist in listoflists for x in sublist]

# Dict comprehension
d = {n:n//2 for n in range(5)}
print(d) # prints: {0:0, 1:0, 2:1, 3:1, 4:2} or other order

break

while True:               # this loop can never terminate naturally
    x = get_next()
    y = preprocess(x)
    if not keep_looping(x, y): break
    process(x, y)

continue

for x in some_container:
    if not seems_ok(x): continue

for-else and while-else

for x in some_container:
    if is_ok(x): break # item x is satisfactory, terminate loop
else:
    print('Beware: no satisfactory item was found in container')
    x = None

Pass

if condition1(x):
    process1(x)
elif x>23 or condition2(x) and x<5:
    pass # nothing to be done in this case
elif condition3(x):
    process3(x)
else:
    process_default(x)

Try-raise

try:
    # statement(s)
except [expression [as target]]:
    # statement(s)
[else:
    # statement(s)]

With

The with statement is the Python embodiment of the well-known C++ idiom "resource acquisition is initialization" (RAII).

with expression [as varname]:
    statement(s)
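As an illustration of the with statement, here is a small sketch built on contextlib.contextmanager from the standard library. The resource function and the events list are hypothetical names used only to trace the order of operations; the point is that the release step runs even when the body raises:

```python
from contextlib import contextmanager

events = []                          # trace of what happened, for illustration


@contextmanager
def resource(name):
    events.append('acquire ' + name)
    try:
        yield name                   # value bound by 'as'
    finally:
        events.append('release ' + name)   # always runs, like a C++ destructor


try:
    with resource('db') as r:
        events.append('use ' + r)
        raise ValueError('boom')     # simulate a failure inside the block
except ValueError:
    events.append('handled')

print(events)
# ['acquire db', 'use db', 'release db', 'handled']
```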

Scope

a = 'global'
def afunction():
    global a                         # Use 'global' to change scope of a variable
    a = 'using global'
    b = 'local'

File and Text Operations

Source: O'Reilly Python in a Nutshell.

io module

To open a file:

# - mode can be 'r', 'w', 'a', 'r+', 'w+', 'a+'; Default is text 't', add 'b' for binary.
open(file, mode='r', buffering=-1, encoding=None, errors='strict', newline=None, closefd=True, opener=os.open)

with io.open(...) as f:            # PYTHONIC way, open is a manager
    # ...

f = io.open(...)                   # BAD. No guarantee that f gets closed

File operations:

f.close()
f.flush()
str = f.read(size=-1)              # bytestring in binary mode, text string otherwise.
str = f.readline(size=-1)
lst = f.readlines(size=-1)
f.write(s)
f.writelines(lst)                  # Same as: for line in lst: f.write(line)

Iterations:

for line in f:
    # ...                          # !!! 'break' and 'next(f)' interfere with the file's position
                                   # f.readline() is ok.

Text input and output

import sys

sys.stdout                         # Standard output
sys.stderr                         # Standard error

# Output (from any file)
from __future__ import print_function        # Enable v3 print in Python 2.x
print(value, ..., sep=' ', end='\n', file=sys.stdout, flush=False)

# Input (from stdin only)
input(prompt='')                   # v3: same as v2 raw_input; v2: same as eval(raw_input(prompt))
raw_input(prompt='')               # v2 only

# Using write with stdout
sys.stdout.write(...)

# Output to a file
print('...', file=f)               # Keyword arguments must come after positional ones
f.write('...')

Output formatting with format (v3)

# v3 - String formatting
# '{[selector][conversion]:[format_specifier]}'.format(value)
'First: {} second: {}'.format(1, 'two')
'Second: {1} first: {0}'.format(1, 'two')                        # Give positional for all 
'a: {a}, 1st: {}, 2nd: {}, a again: {a}'.format(1, 'two', a=3)   # Give name for some
'a: {a} first:{0} second: {1} first: {0}'.format(1, 'two', a=3)  # Can mix name and positional

# Using sequences and composites:
'p0[1]: {[1]} p1[0]: {[0]}'.format(('zero', 'one'), ('two', 'three'))
'p1[0]: {1[0]} p0[1]: {0[1]}'.format(('zero', 'one'), ('two', 'three'))
'{} {} {a[2]}'.format(1, 2, a=(5, 4, 3))
'First r: {.real} Second i: {a.imag}'.format(1+2j, a=3+4j)

# Field width
'{:^12s}'.format(s)
'{:.>12s}'.format(s)
print('{:,}'.format(12345678))

# Precision specification
'as f: {:.4f}'.format(x)
'as g: {:.4g}'.format(x)
'as s: {:.6s}'.format(s)

See Python in a Nutshell, chapter 8 for more information.
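To make the width and precision specifiers above concrete, here is a short sketch with the resulting strings. The sample values of s and x are assumptions picked for illustration:

```python
s = 'spam'                 # sample string (assumption)
x = 3.14159265             # sample float (assumption)

print(repr('{:^12s}'.format(s)))              # '    spam    '  (centered in 12 columns)
print('{:.>12s}'.format(s))                   # ........spam    (right-aligned, '.' as fill)
print('{:,}'.format(12345678))                # 12,345,678      (thousands separator)
print('as f: {:.4f}'.format(x))               # as f: 3.1416    (4 digits after the point)
print('as g: {:.4g}'.format(x))               # as g: 3.142     (4 significant digits)
print('as s: {:.6s}'.format('abcdefghij'))    # as s: abcdef    (truncated to 6 characters)
```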

Formatted String Literals (3.6)

print(f'{name!r} is {len(name)} characters long')
for width in 8, 11:
    for precision in 2, 3, 4, 5:
        print(f'{3.14159:{width}.{precision}}')

Legacy string formatting with %

# format % values
'result = %d' % x
'answers: %d %f' % (x, y)                  # Several values must be given as a tuple
'File not found %r' % filename             # !!! USE %r to log possibly erroneous strings !!!

Input parsing

# Using built-ins
print(int('2'))
print(float('3.14'))

# Using ast.literal_eval()
import ast
print(ast.literal_eval('23'))
# prints 23
print(ast.literal_eval('[2,3]')) # prints [2, 3]
print(ast.literal_eval('2+3'))
# raises ValueError
print(ast.literal_eval('2+'))
# raises SyntaxError
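Since ast.literal_eval() signals bad input with either ValueError or SyntaxError, it is convenient to wrap it. The helper below is a sketch only; parse_literal and its default parameter are hypothetical names, not part of the ast module:

```python
import ast


def parse_literal(text, default=None):
    """Return the Python literal described by text, or default if it is not one."""
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        # ValueError: valid syntax but not a plain literal (e.g. a function call)
        # SyntaxError: not valid Python at all (e.g. '2+')
        return default


print(parse_literal('23'))                  # 23
print(parse_literal('[2,3]'))               # [2, 3]
print(parse_literal('2+'))                  # None
print(parse_literal("__import__('os')"))    # None, nothing is executed
```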

Basic

Statements

try:
    statement(s)
except [expression [, target]]:
    statement(s)
[else:
    statement(s)]

try:
    statement(s)
finally:
    statement(s)

try:
    statement(s)
except [expression [, target]]:
    statement(s)
finally:
    statement(s)

expression is a class or a tuple of classes. target is the variable that will store the exception object. The `else` clause is executed only if the `try` block terminates normally, i.e. not when an exception is raised, nor when the block is left through `break`, `continue` or `return`. The combined `try-except-finally` form is available since Python 2.5.
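The order in which the clauses of a try statement run can be checked with a small sketch. The function run and its log list are hypothetical helpers that only record which clause was entered:

```python
def run(action):
    """Call action() and return the list of try-statement clauses that ran."""
    log = []
    try:
        log.append('try')
        action()
    except ZeroDivisionError:
        log.append('except')         # only when the try block raised
    else:
        log.append('else')           # only when the try block ended normally
    finally:
        log.append('finally')        # in every case
    return log


print(run(lambda: 1 / 1))            # ['try', 'else', 'finally']
print(run(lambda: 1 / 0))            # ['try', 'except', 'finally']
```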

Basic

for i in range(10):
    print i                      # carriage return

for i in range(10):
    print i,                     # no carriage return

for key in d:                       # Loop over keys in dictionary d
for key, value in d.iteritems():    # Loop over keys and values in dictionary d

a = 'global'
def afunction():
    global a
    a = 'still using global'
    b = 'local'

import os.path
os.path.isfile(fname)            # True if fname exists and is a file

if not os.path.exists(directory):
    os.makedirs(directory)       # Create directory if it does not exist

try:                             # Avoid race condition if directory is created by another process
    os.makedirs(path)            # (the solution above could be fixed the same way)
except OSError:                  # Always raised in the nominal case, i.e. directory already exists
    if not os.path.isdir(path):  
        raise
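In Python 3.2 and later, os.makedirs accepts an exist_ok argument that covers the same race without a try block. A minimal sketch, assuming Python 3 and using a temporary directory so that it can run anywhere (the sub-directory names are arbitrary):

```python
import os
import tempfile

base = tempfile.mkdtemp()                    # scratch location for the example
path = os.path.join(base, 'a', 'b')          # arbitrary nested directory

os.makedirs(path, exist_ok=True)             # creates a/ and a/b/
os.makedirs(path, exist_ok=True)             # second call is a no-op, no OSError

print(os.path.isdir(path))                   # True
```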

s.upper()                             # string s to uppercase
', '.join(set_3)                      # Join a sequence
hex_data = "deadbeef".decode("hex")   # "\xde\xad\xbe\xef"
map(ord, hex_data)                    # [0xDE, 0xAD, 0xBE, 0xEF]

sys.argv, len(sys.argv)          # Argument list, number of arguments ([0] -> exec name)
if ("-h" in sys.argv) or ("--help" in sys.argv):
    printUsage()
for a in range(len(sys.argv)):
    if sys.argv[a] == "-e":
        # handler

# Sort based on object attribute
ut.sort(key=lambda x: x.count, reverse=True)   # To sort the list in place...
newlist = sorted(ut, key=lambda x: x.count, reverse=True)  # To return a new list, use the sorted() built-in function...

(From stackoverflow [4])

for c in list(sha256.digest()):
    key.append(ord(c))

Operators

if (p.poll() is None):         # Use 'is' for testing None
    print "None"

List

a=[0,3,6]
print a[1]                     # 3

a=[0] * 1000                   # Array with 1000 elements
len(a)                         # Number of elements

a[:]=a[::-1]                   # Reassign element in the list (here in reverse order)
a=a[::-1]                      # Idem, but create a new object

a=[];
a.append(12);                  # Create object before appending
a[len(a):] = [13];             # Same as appending

def shiftRow(word, n):
    return word[n:]+word[0:n]
state[i::4] = shiftRow(state[i::4],i)      # Apply shiftRow on 4 bytes distant of 4 each

alist = map(lambda b: sbox[b],alist)

state[:] = [ a ^ b for a,b in zip(state,roundKey) ]    # Ex-oring 2 lists of integers

# Multi-dimensional list
matrix = [[0 for x in range(5)] for x in range(5)]     # Initialize bi-dimensional array
matrix = [[0]*5 for i in range(5)]                     # faster way
# matrix = 5*[5*[0]]                                   # DO NOT DO THIS - 5 times copy of same

# Sort
a.sort()
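The reason for the "DO NOT DO THIS" warning on multi-dimensional lists is that multiplying the outer list copies references, not rows. A short sketch (3x3 instead of 5x5 to keep the output readable):

```python
good = [[0] * 3 for _ in range(3)]   # three distinct row objects
bad = 3 * [3 * [0]]                  # three references to the SAME row

good[0][0] = 1
bad[0][0] = 1

print(good)                          # [[1, 0, 0], [0, 0, 0], [0, 0, 0]]
print(bad)                           # [[1, 0, 0], [1, 0, 0], [1, 0, 0]]
print(bad[0] is bad[1])              # True
```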

Dictionary

D = { 'x':42, 'y':3.14, 'z':7 }
D['x']                                                 # 42
del D[k]                                               # Removes from dictionary D the item whose key is k
# Sparse matrix
Matrix = {}
Matrix[1,2] = 15                                       # This works because 1,2 -- a tuple -- is used as a key

Random

from random import randint

IV = []
for i in range(16):
    IV.append(randint(0, 255))      # randint includes both bounds

Miscellaneous conversion

print list("abc")               # ['a', 'b', 'c']

Format operator `%` or `format` function

print '%x' % variable            # Print hex
print("{}-{}-{}".format(n1, n2, n3))

math

print 1//2                       # floor division (PEP-238)

System

sys.exit()

Classes

An empty class:

class Empty(object):
    pass

A class with constructor and data members:

class Basic(object):
    __param = None                           # __* denotes a class-private member

    def __init__(self, param):
        self.__param = param
        print "Basic is born with param %s" % param

A class that inherits:

class Child(Parent):
    __param = None

    def __init__(self, param):
        Parent.__init__(self)                # Must call EXPLICITLY parent constructor
        self.__param = param

Class members can be defined as properties:

class Rectangle(object):
    def __init__(self, width, height):
        self.width = width
        self.height = height
    @property
    def area(self):
        '''area of the rectangle'''
        return self.width * self.height
    @area.setter
    def area(self, value):
        scale = math.sqrt(value/self.area)
        self.width *= scale
        self.height *= scale

Modules

import datetime
print datetime.datetime.today()  
print datetime.datetime.now()    # similar, but possibly more accurate
print datetime.date.today()      # date only (datetime.date has no now() method)

Advanced

mymodule = __import__('mymodule')          # Import module from string - see http://effbot.org/zone/import-string.htm
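An alternative to calling __import__ directly is importlib.import_module from the standard library, which also returns the submodule itself for dotted names. A minimal sketch; the module names json and os.path are arbitrary examples:

```python
import importlib

name = 'json'                                   # any importable module name (example)
module = importlib.import_module(name)
print(module.dumps([1, 2]))                     # [1, 2]

# For a dotted name, import_module returns the submodule,
# whereas __import__('os.path') returns the top-level package os.
path_module = importlib.import_module('os.path')
print(path_module.basename('/tmp/x.txt'))       # x.txt
print(__import__('os.path').__name__)           # os
```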

Modular inverse [5]

# Using gmpy - FASTEST
import gmpy
gmpy.invert(1234567, p)                      # 1000000 loops, best of 3: 737 ns per loop (p 1024-bit)
gmpy.divm(1, 1234567, p)                     # 1000000 loops, best of 3: 933 ns per loop (p 1024-bit)

# Using egcd function - NO DEPS, BUT SLOWER
def egcd(a, b):
    if a == 0:
        return (b, 0, 1)
    else:
        g, y, x = egcd(b % a, a)
        return (g, x - (b // a) * y, y)

def modinv(a, m):
    g, x, y = egcd(a, m)
    if g != 1:
        raise Exception('modular inverse does not exist')
    else:
        return x % m
timeit modinv(1234567,p)                     # 100000 loops, best of 3: 13.6 us per loop (p 1024-bit)

# Using pow() - SIMPLEST BUT SLOWEST
timeit pow(1234567,p-2,p)                    # 100 loops, best of 3: 4.22 ms per loop
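Two remarks on the pow() variant, shown as a sketch. First, pow(a, p-2, p) relies on Fermat's little theorem and therefore gives the inverse only when p is prime. Second, Python 3.8 and later accept a negative exponent in three-argument pow(), which computes the inverse for any modulus coprime with a. The modulus below is the Mersenne prime 2**127 - 1, chosen for illustration instead of the 1024-bit p used in the timings above:

```python
a = 1234567
p = 2**127 - 1                       # a known prime, used here as example modulus

inv_builtin = pow(a, -1, p)          # Python 3.8+: modular inverse
inv_fermat = pow(a, p - 2, p)        # Fermat: valid only because p is prime

print(inv_builtin == inv_fermat)     # True
print((a * inv_builtin) % p)         # 1

try:
    pow(2, -1, 4)                    # 2 has no inverse modulo 4
except ValueError as err:
    print('no inverse:', err)
```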

modular exponentiation

from gmpy import mpz
def power_mod(a, b, n):
    return long(pow(mpz(a),b,n))

Python list

See [6]

Bitstring [7] (manual)

from bitstring import *
s  = Bits('0x8081828384858687')
s  = Bits(hex='8081828384858687')
s  = Bits(bytes=b'\x80\x81\x82\x83\x84\x85\x86\x87')
sa = BitArray('0x8081828384858687')    # same as Bits, but mutable

s << 8                           # Logical shift
s[8:] + '0x00'                   # ... same as above
s <<= 8                          # ... (with mutation)
sa.rol(8)                        # Cyclic shift (with mutation)
s[8:] + s[:8]                    # ... same as above

Cryptography

Package pycrypto

from Crypto.Cipher import AES
def toh(s):
    return s.encode('hex')
def tos(h):
    return h.replace(' ','').decode('hex')
def aes(k,p):
    a=AES.new(tos(k))
    return toh(a.encrypt(tos(p)))
def aesinv(k,c):
    a=AES.new(tos(k))
    return toh(a.decrypt(tos(c)))
def sxor(h1,h2):
    return toh(''.join(chr(ord(a) ^ ord(b)) for a,b in zip(tos(h1),tos(h2))))

Example of use:

ipython

run mycrypto                    # Assuming script in current dir and named 'mycrypto.py'
key='00112233 44556677 8899aabb ccddeeff'
p0='00000100 80000000 00000000 00000000'
c0=aes(key,p0)
p1='aaaaaaaa bbbbbbbb cccccccc dddddddd'
c1=aes(key,sxor(c0,p1))

os and filesystem operations

# Using os module
os.remove(path)                 # Remove a file
os.unlink(path)                 # ... idem
os.rmdir(path)                  # Remove a directory

# Using shutil module
shutil.rmtree(path, ignore_errors=False, onerror=None)
                                # Remove a directory and all its content

Doctest

The doctest module searches for pieces of text that look like interactive Python sessions, and then executes those sessions to verify that they work exactly as shown.

See example below.

# file dc.py

def toh(s):
    """ Convert a (binary) string into a hexadecimal string.
    >>> toh('DOH!')
    '444f4821'
    """
    return s.encode('hex')

if __name__ == "__main__":
    import doctest
    doctest.testmod()

Run the tests with:

python dc.py

Tips

Simple HTTP Server

It's very easy to set up an ad-hoc HTTP server with Python. Just open a shell in a folder with some contents to share, and type:

python -m SimpleHTTPServer        # Python 2
python3 -m http.server            # Python 3 equivalent

More available at http://docs.python.org/2/library/internet.html (see BaseHTTPServer and CGIHTTPServer).

Detect interactive mode

References: [8], [9]

The four detection methods:

First method:
import __main__ as main
print hasattr(main, '__file__')

Second method:
def in_ipython():
    try:
        __IPYTHON__
    except NameError:
        return False
    return True

Third method:
import sys
print hasattr(sys, 'ps1')

Fourth method:
import sys
print bool(sys.flags.interactive)

Result of each method, depending on how the module was started:

Started with	First method	Second method	Third method	Fourth method
`python mymod.py`	True	-	-	-
`python -i mymod.py`	True	-	-	True
`python` then `import mymod`	-	-	True	-
`ipython mymod.py`	True	True	-	-
`ipython -i mymod.py`	True	True	-	-
`ipython` then `run mymod.py`	True	True	-	-
`ipython` then `run -i mymod.py`	True	True	-	-
`ipython` then `import mymod`	-	True	-	-
`ipython -i` then `import mymod`	-	True	-	-

Find duplicates in list

From stackoverflow [10]

import collections

def fastest():                         # 134 us - Fastest
    seen = set()
    seen_add = seen.add                                            # To avoid looking up 'add' every time an item is inserted
    seen_twice = set( x for x in l if x in seen or seen_add(x) )   # adds all elements it doesn't know yet to seen and all other to seen_twice
    return list( seen_twice )                                      # turn the set into a list (as requested)

def compact():                         # 415 us
    return [x for x, y in collections.Counter(l).items() if y > 1]

def slowest():                         # 19.2 ms
    return list(set([x for x in l if l.count(x) > 1]))
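The three functions above read the list from a global variable l. The sketch below restates the first two with the list passed as a parameter and checks that they agree on a small sample. The function names and the sample data are assumptions, and the timings quoted above were not re-measured:

```python
import collections


def duplicates_seen(items):
    """Return the set of items occurring more than once, in a single pass."""
    seen = set()
    seen_add = seen.add              # set.add returns None, which is falsy
    # 'x in seen' is True from the second occurrence on; otherwise seen_add(x)
    # records x and evaluates to None, so x is filtered out the first time.
    return set(x for x in items if x in seen or seen_add(x))


def duplicates_counter(items):
    """Same result, using collections.Counter."""
    return set(x for x, n in collections.Counter(items).items() if n > 1)


data = [1, 2, 3, 2, 1, 5, 6, 5, 5, 5]
print(sorted(duplicates_seen(data)))       # [1, 2, 5]
print(sorted(duplicates_counter(data)))    # [1, 2, 5]
```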

Start post-mortem debugger on exception

From stackoverflow [11]

>>> import pdb
>>> pdb.pm()

Miscellaneous

Detect whether a variable is defined

Note it is bad practice to define a variable conditionally [12]. An interesting use case is to run code and define variable conditionally based on interactive status.

# Using try ... except
try: myvar
except NameError: print "variable 'myvar' is NOT defined"

# Using vars() / globals()
'myvar' in vars() or 'myvar' in globals()
# ...pedantic...
'myvar' in vars(__builtins__)

Analyse memory usage

Dowser

See [13] — seems better suited to find memory leaks, not to analyse usage for memory hungry applications

memory_profiler

See [14]
Install

sudo pip install -U memory_profiler
sudo pip install psutil

Add @profile decorator

@profile
def primes(n): 
    ...

Run the profiler

 python -m memory_profiler primes.py

The Pythonic way

Type import this in a Python interpreter, you get this:

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!

Detect Python 2 or Python 3 dependency

For instance, does gdb use Python 2 or 3?

ldd $(which gdb)|grep python
# libpython3.5m.so.1.0 => /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0 (0x00007f442a960000)

Find character in a string

The fastest and simplest way is to use the in operator:

if '.' in name:
    # ...

To detect more characters, we must use a regex [15]:

>>> import re
>>> def special_match(strg, search=re.compile(r'[^a-z0-9.]').search):
...     return not bool(search(strg))
>>> special_match("az09.")
True
>>> special_match("az09.\n")
False

Note:

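search (with a negated character class, as above) is reported to be faster than using match.
If using match, there is no need for a leading ^ because match anchors at the start of the string; the end must still be anchored (with \Z, or by using re.fullmatch in Python 3.4+) to force a full match.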
Regex should use raw string r'...'.
If using the regex multiple times, compile it once and reuse later!
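The difference between search, match and a full match can be seen on the same test strings. This is a sketch assuming Python 3.4 or later (for fullmatch); the names ALLOWED, DISALLOWED and the two helper functions are hypothetical:

```python
import re

ALLOWED = re.compile(r'[a-z0-9.]*')        # zero or more allowed characters
DISALLOWED = re.compile(r'[^a-z0-9.]')     # one forbidden character


def only_allowed_search(text):
    # True when no forbidden character occurs anywhere in text
    return DISALLOWED.search(text) is None


def only_allowed_fullmatch(text):
    # True when the whole text consists of allowed characters
    return ALLOWED.fullmatch(text) is not None


print(only_allowed_search('az09.'))             # True
print(only_allowed_search('az09.\n'))           # False
print(only_allowed_fullmatch('az09.\n'))        # False

# match only anchors at the start: it succeeds on the valid prefix 'az09.'
print(ALLOWED.match('az09.\n') is not None)     # True
```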

Detect Python version, location...

From pwndbg [16]:

# Find the Python version
PYVER=$(python -c 'import platform; print(".".join(platform.python_version_tuple()[:2]))')
PYTHON=$(python -c 'import sys; print(sys.executable)')
PYTHON="${PYTHON}${PYVER}"

# Find the Python site-packages that we need to use
SITE_PACKAGES=$(python -c 'import site; print(site.getsitepackages()[0])')
# or to get user site
SITE_PACKAGES=$(python -c 'import site; print(site.getusersitepackages())')

Using script above, one can install a module using pip for the given python/site installation.

# Install Python dependencies using pip
sudo ${PYTHON} -m pip install --target ${SITE_PACKAGES} -Ur requirements.txt

Display random distribution with seaborn

seaborn is a powerful python toolkit to visualize statistical data.

Assume a data file like

head -n 5 samples
# 19.2
# 6.6
# 7.9
# 5.5
# 3.6
# ...

To visualize into seaborn:

# First setup seaborn - https://seaborn.pydata.org/tutorial/distributions.html
%matplotlib gtk

import numpy as np
import pandas as pd
from scipy import stats, integrate
import matplotlib.pyplot as plt
import seaborn as sns
sns.set(color_codes=True)
np.random.seed(sum(map(ord, "distributions")))

# Then load our file - https://stackoverflow.com/questions/36343646/reading-a-text-file-and-converting-string-to-float
y = []                                   # Samples read from the file
with open('../samples', 'r') as file_in:
    for z in file_in.read().split('\n'):
        if z: y.append(float(z))

# Then tell seaborn to show the distribution.
sns.distplot(y)

# Normally the graph should pop up automatically. If not:
# plt.show()
# sns.plt.show();

Do's and don't's

# DO
foo = 'abcdef'
l = list(foo)
# don't
foo = 'abcdef'
l = [c for c in foo]

# DO
foo = list(...)
g = map(blah, foo)
# don't
foo = list(...)
g = [blah(i) for i in foo]

Traps

Frequent mistakes. Beware the snake can bite you!

Confuse a method and a property in a test

SOLUTION: Stick to a convention. Like always define methods like isxyyz() or hasabc() as methods. Note that defining them as property would raise an exception if used as a function, and hence might be safer.

if A.isdummy():            # This will fail if isdummy is a property
if A.isdummy:              # Always True if isdummy is a method

Mix `0` with `None` in a sequence

Testing whether an element is defined is more difficult.

a = [0,None,None,None]
bool(a[0])           # --> False
bool(a[1])           # --> False !!! How can we tell them apart?
a[1] == None         # --> True      This works, but is unusual and likely bad practice
a[1] is None         # --> True      Preferred: test identity with 'is'
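One way to tell the two apart is to test identity against None instead of truthiness. A minimal sketch using the same list:

```python
a = [0, None, None, None]

print([bool(x) for x in a])              # [False, False, False, False]
print([x is None for x in a])            # [False, True, True, True]

defined = [x for x in a if x is not None]    # keeps 0, drops the None entries
print(defined)                           # [0]
```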

Mixing property and normal getter

SOLUTION: prefix all getter methods with get, like getvalue()

b = a.prop           # Using a property, OR
b = a.getprop()      # Using a getter

Forget that, in a python function, arguments are always passed by value

def f(x, y):
    x = 23
    y.append(42)
a = 77
b = [99]
f(a, b)
print a, b                 # prints: 77 [99, 42]

To reassign a list in a function, use the a[:] construct, like:

def f(a):
   a[:]=a[::-1]             # This will NOT create a new list, but reassign elements in the original list

Use bytes, not string of characters

Characters can be unicode and take more than one byte.

b'abc'
bytes('abc')

Mixing string and bytestring (v3)

buf = b'abc\n'
if buf.find(b'\n'):        # MUST use BYTESTRING here
    # ....
str = 'abc\n'
if str.find('\n'):         #  MUST use STRING here
    # ....
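Two further points are worth checking in Python 3, sketched below. Mixing the two types raises TypeError, and find() returns an index (-1 when nothing is found), so using its result directly as a condition is itself a trap; the in operator avoids it:

```python
buf = b'abc\n'
text = 'abc\n'

print(b'\n' in buf)          # True
print('\n' in text)          # True

print(buf.find(b'x'))        # -1: not found, yet truthy in an 'if'
print(buf.find(b'a'))        # 0: found at index 0, yet falsy in an 'if'

try:
    buf.find('\n')           # str needle in a bytes haystack
except TypeError as err:
    print('TypeError:', err)
```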

Docstrings

Specifications: pep-0257

To write good module docstrings, "think about somebody doing help(yourmodule) at the interactive interpreter's prompt — what do they want to know?" [17].
See pep-0257 for more recommendations

Using doctest

You can include tests, in the form of examples, in your Python modules' docstrings. Properly written, these tests can be executed and verified by the doctest module. [18]

Libraries

Big numbers

gmpy based on GMP
libnum a lighter bignum library, but compatible with pypy.

Unicode

Set source file encoding

Add any of these lines [19]:

# -*- coding: utf-8 -*-
# vim: set fileencoding=utf-8 :

Write the BOM

See [20]

import codecs

file = codecs.open("lol", "w", "utf-8")
file.write(u'\ufeff')                          # or use unicode name: u'\N{ZERO WIDTH NO-BREAK SPACE}'
file.close()

# Using https://docs.python.org/2/library/codecs.html#module-encodings.utf_8_sig
with codecs.open("test_output", "w", "utf-8-sig") as temp:
    temp.write("hi mom\n")

Handling unicode

Some recommend always processing unicode internally, decoding on input and encoding on output [21]:

line = line.decode('utf-8')
# ...treat line as unicode...
print line.encode('utf-8')

But this is error prone. So another solution proposed is to redefine sys.stdout:

import sys
import codecs
sys.stdout = codecs.getwriter('utf8')(sys.stdout)
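The snippets above are Python 2. In Python 3 the same "decode on input, encode on output" policy is usually obtained by giving an encoding to open(). A minimal sketch, assuming Python 3 and a temporary file whose name is arbitrary:

```python
import io
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'sample.txt')   # scratch file (arbitrary name)

with io.open(path, 'w', encoding='utf-8') as f:
    f.write(u'\xe5\xe4\xf6')             # the three characters åäö, encoded on output

with io.open(path, 'r', encoding='utf-8') as f:
    line = f.read()                      # decoded on input: a text string

with io.open(path, 'rb') as f:
    raw = f.read()                       # the bytes actually stored

print(len(line))                         # 3 characters
print(len(raw))                          # 6 bytes, two per character in UTF-8
```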

A hackish way (not recommended):

# -*- coding: utf-8 -*-
import sys
reload(sys)
sys.setdefaultencoding('utf-8')
print u"åäö"

Python 2 to Python 3

Troubleshooting

Troubleshooting a missing library

Use python -v -c "import mylibrary" to troubleshoot a module.
Look at the log for the loaded libraries.
Some extension modules depend on shared libraries that might be missing. Use ldd to see the linked libraries, and report missing ones.

ldd /path/to/your/_hashlib.so
# linux-gate.so.1 =>  (0xf77c3000)
# libssl.so.6 => not found
# libcrypto.so.6 => not found
# libpython2.7.so.1.0 => not found
# libpthread.so.0 => /lib/i386-linux-gnu/libpthread.so.0 (0xf776a000)
# libc.so.6 => /lib/i386-linux-gnu/libc.so.6 (0xf75b3000)
# /lib/ld-linux.so.2 (0x5659b000)

