The Third Branch Of Physics Essays On Scientific Computing With Python


Python distributions and environments for scientific computing - Stack Overflow

I apologize upfront if this question is too broad. I come from the MATLAB world and have relatively little experience with Python.

After having spent some time reading about several Python-based environments and distributions for scientific computing, I feel that I still don't fully understand the landscape of solutions or the precise relationship between some notable packages, including:

  • Do any of the above packages provide similar functionality? Do they complement each other?
  • Does the installation of any of them include or require the installation of any of the others? If so, which ones include or require which?

Less importantly, are there any other packages similar to the ones above that provide similar functionality?

Thanks in advance

Scientific computing with Python means taking a plain vanilla language and bolting on a bunch of modules, each of which implements some aspect of MATLAB's functionality. As a result, the Python scientific programming experience is a little less cohesive than MATLAB's. However, Python as a language is much cleaner. So it goes.

The basic modules needed for scientific computing in Python are NumPy, Matplotlib, SciPy and, if you are doing 3D plotting, Mayavi/VTK. The others all depend on NumPy.

NumPy implements a new array type that behaves similarly to MATLAB arrays (i.e. fast vector calculations). It also defines a load of functions for these calculations, which are usually named the same as the corresponding MATLAB functions.

Matplotlib allows for 2D plotting with commands very similar to MATLAB's. Matplotlib also defines pylab, a module that, with a single import, brings most of the NumPy and Matplotlib functions into the global namespace. This is useful for rapid/interactive scripting where you don't want to be typing lots of namespace prefixes.

SciPy is a collection of Python modules arranged under the SciPy umbrella that are useful to scientists. Fitting routines are supplied in SciPy modules. NumPy is part of the wider SciPy ecosystem, although it is installed as a separate package that SciPy builds on.

Spyder is a desktop IDE (based on Qt) that loosely tries to emulate the MATLAB IDE. It is part of the Python-XY distribution.

IPython provides an enhanced interactive Python shell which is useful for trying out code and running your scripts and interacting with the results. It can now be served to a web interface as well as the traditional console. It is also embedded in the Spyder IDE.

Distributions

Getting all these modules running on your computer can be time consuming and so there are a few distributions that package them (plus many other modules) up for you.

Python-XY, WinPython, Enthought and, more recently, Anaconda are all full package distributions that include all the core modules, although Enthought does not come with Spyder.

Sage is another programming environment, served over the web or via a command line, which also comes as a full package including lots of other modules. Traditionally it came as a VMware image based on an install of Linux. Although you are writing Python in the Sage environment, it's a little different from ordinary Python programming: it kind of defines its own language and methodology based on Python.

If you are using Windows I would install WinPython. It installs everything that you need including Scipy and Spyder (which is the best replacement for MATLAB for Python IMHO) and because it is designed to be standalone it will not interfere with other installs of Python you may have on your system. If you are on OSX, Enthought is probably the best way to go - Spyder can be installed separately using e.g. MacPorts. For Linux you can install the components (Numpy, SciPy, Spyder, Matplotlib) separately.

I personally don't like the Sage way of working with Python 'hidden under the hood' but you may prefer that.

This link may be useful: https://www.cfa.harvard.edu/

It's the page of an astrophysicist at Harvard. It gives the point of view of someone switching from ITT-VIS IDL to python, on OS-X (but most tips also work on other operating systems).

EDIT: It seems the page was taken down. An alternative good introduction to python for a scientist/engineer is in this document (big PDF warning): http://stsdas.stsci.edu/perry/pydatatut.pdf Hope this one will not be taken down!

Other articles

Schoerghofer N

Size 2.7Mb
Date Jan 3, 2007


1 Analytic and Numeric Solutions; Chaos
Many equations that describe the behavior of physical systems cannot be solved analytically. In fact, it is said that “most” cannot. Numerical methods enable us to obtain solutions that would otherwise elude us. The results may be valuable not only because they deliver quantitative answers; they can also provide new insight. A pocket calculator or a short computer program suffices for a simple demonstration.

If we repeatedly take the sine function starting with an arbitrary value, $x_{n+1} = \sin(x_n)$, the number will decrease and slowly approach zero. For example, x = 1.000, 0.841, 0.746, 0.678, 0.628. (The values are rounded to three digits.) The sequence decreases because $\sin(x)/x < 1$ for any $x \neq 0$. Hence, with each iteration the value becomes smaller and smaller and approaches a constant.

But if we try instead $x_{n+1} = \sin(2.5 x_n)$, the iteration is no longer driven toward a constant. For example, x = 1.000, 0.598, 0.997, 0.604, 0.998, 0.602, 0.998, 0.603, 0.998, 0.603, 0.998, 0.603. The iteration settles into a periodic behavior.

There is no reason for the iteration to approach anything at all. For example, $x_{n+1} = \sin(3 x_n)$ produces x = 1.000, 0.141, 0.411, 0.943, 0.307, 0.796, 0.685, 0.885, 0.469, 0.986, 0.181, 0.518. One thousand iterations later x = 0.538, 0.999, 0.144, 0.418, 0.951, 0.286. This sequence does not approach a constant value, it does not grow indefinitely, and it is not periodic, even when continued over many more iterations. A behavior of this kind is called “chaotic.”

Can it be true that the iteration does not settle to a constant or into a periodic pattern, or is this an artifact of numerical inaccuracies? Consider the simple iteration $y_{n+1} = 1 - |2 y_n - 1|$, known as the “tent map.”
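The three sine iterations described above can be reproduced with a few lines of Python. This is only a minimal sketch: the function name `iterate_sine_map` and its parameters are illustrative and do not come from the book.

```python
import math


def iterate_sine_map(r, x0=1.0, n=12):
    """Return [x_0, x_1, ..., x_n] for the iteration x_{k+1} = sin(r * x_k)."""
    xs = [x0]
    for _ in range(n):
        xs.append(math.sin(r * xs[-1]))
    return xs


# r = 1 decays toward zero, r = 2.5 becomes periodic, r = 3 looks irregular.
for r in (1.0, 2.5, 3.0):
    values = iterate_sine_map(r)
    print(f"r = {r}: " + ", ".join(f"{x:.3f}" for x in values))
```

Printing to three digits, as the text does, makes it easy to compare the output with the sequences quoted above.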


Table 2-I: Newton’s method applied to sin(3x) − x = 0 with two different starting values.

 n    y_n (y_1 = 1./3.)
 0    1
 1    0.333333
 2    0.111111
 3    0.0370371
 4    0.0123458
 5    0.00411542
 6    0.00137213
 7    0.000458031
 8    0.000153983
 9    5.39401E-05
10    2.32047E-05
11    1.81843E-05
12    2.69602E-05
13    5.07843E-05
14    0.000100523

3 Roundoff and Number Representation
In a computer every real number is represented by a sequence of bits, most commonly 32 bits (4 bytes). One bit is for the sign, and the distribution of bits for mantissa and exponent can be platform dependent. Almost universally, however, a 32-bit number will have 8 bits for the exponent and 23 bits for the mantissa, leaving one bit for the sign (as illustrated in figure 1). In the decimal system this corresponds to a maximum/minimum exponent of ±38 and approximately 7 decimal digits (at least 6 and at most 9). For a 64-bit number (8 bytes) there are 11 bits for the exponent (±308) and 52 bits for the mantissa, which gives around 16 decimal digits of precision (at least 15 and at most 17).

Figure 1: |0|01011110|00111000100010110000010| (sign | exponent | mantissa) and, in decimal notation, +1.23456E-6 (sign, mantissa, exponent).

                        single     double
bytes                   4          8
bits for mantissa       23         52
bits for exponent       8          11
significant decimals    6–9        15–17
maximum finite          3.4E38     1.8E308
minimum normal          1.2E-38    2.2E-308
minimum subnormal       1.4E-45    4.9E-324
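The bit layout and the limits in the table can be inspected from Python using only the standard library. The sketch below assumes the platform uses IEEE 754 single and double precision, which is almost universal; the helper name `float32_fields` is made up for this illustration.

```python
import struct
import sys


def float32_fields(x):
    """Round x to single precision and return its (sign, exponent, mantissa) bit strings."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    s = format(bits, "032b")
    return s[0], s[1:9], s[9:]


sign, exponent, mantissa = float32_fields(1.0)
print(f"|{sign}|{exponent}|{mantissa}|")

# Properties of the native Python float, which is a 64-bit double.
info = sys.float_info
print("mantissa bits (including the implicit leading bit):", info.mant_dig)
print("guaranteed decimal digits:", info.dig)
print("maximum finite:", info.max)
print("minimum normal:", info.min)
```

Note that `sys.float_info.mant_dig` reports 53 rather than 52 because it counts the implicit leading bit that is not stored.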


It is helpful to reserve a few bit patterns for “exceptions.” There is a bit pattern for numbers exceeding the maximum representable number, a bit pattern for Inf (infinity), -Inf, and NaN (not a number). For example, 1./0. will produce Inf. An overflow is also an Inf. There is a positive and a negative zero. If a zero is produced as an underflow of a tiny negative.
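These special values can be tried out directly in Python, as a small sketch. One difference from the behavior described above is worth labeling: plain Python raises `ZeroDivisionError` for `1.0 / 0.0` instead of returning Inf, so the infinity below is produced by overflow.

```python
import math

inf = float("inf")
nan = float("nan")

overflow = 1e308 * 10.0       # exceeds the largest double and becomes inf
neg_zero = -1e-320 * 1e-10    # underflow of a tiny negative number gives -0.0

print(overflow, -overflow, nan)
print(nan == nan)                        # NaN compares unequal to everything, itself included
print(neg_zero == 0.0)                   # negative and positive zero compare equal
print(math.copysign(1.0, neg_zero))      # but the sign bit is still there

try:
    1.0 / 0.0
except ZeroDivisionError as err:
    print("Python refuses to divide by zero:", err)
```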


Problem: The convergence test indicates that $\|u_{2N} - u_N\| \to 0$ as the resolution $N$ goes to infinity (roundoff ignored). Does this mean $\|u_N - u\| \to 0$ as $N \to \infty$, where $u$ is the exact, correct answer?


Code Sources.
  • The Guide to Available Mathematical Software, http://math.nist.gov, maintains a directory of subroutines from numerous public and proprietary repositories.
  • NETLIB at www.netlib.org offers free sources by various authors and of varying quality.
  • A more specialized, refereed set of routines is available to the public from the Collected Algorithms of the ACM at www.acm.org/calgo.
  • Numerical Recipes, www.nr.com, explains and provides a broad and selective collection of reliable subroutines. (Sporadic weaknesses in the first edition are corrected in the second.) Each program is available in C, C++, Fortran 77, and Fortran 90.


Recommended References: Patterson & Hennessy, Computer Organization and Design: The Hardware/Software Interface.


Stars indicate nonzero elements and blank elements are zero. Eliminating the first column takes about $N^2$ floating-point operations, the second column $(N-1)^2$, the third column $(N-2)^2$, and so on. This yields a total of about $N^3/3$ floating-point operations. (One way to see that is to approximate the sum by an integral, and the integral of $N^2$ is $N^3/3$.) Once triangular form is reached, the value of one variable is known and can be substituted in all other equations, and so on. These substitutions require only $O(N^2)$ operations. A count of $N^3/3$ is less than the.
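The elimination and substitution steps, and the operation count, can be sketched in plain Python as follows. Two caveats: the row swap (partial pivoting) is an addition for numerical safety that the operation count above does not mention, and the names `gauss_solve` and `elimination_count` are illustrative only.

```python
def gauss_solve(a, b):
    """Solve a x = b by Gaussian elimination followed by back substitution."""
    n = len(b)
    # Build an augmented matrix from copies so the inputs stay untouched.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting (an addition to the text): use the largest entry as pivot.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        # Eliminate this column from all rows below the pivot row.
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    # Back substitution on the triangular system, O(n^2) operations.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - known) / m[row][row]
    return x


def elimination_count(n):
    """Sum n^2 + (n-1)^2 + ... + 1, the rough operation count from the text."""
    return sum(k * k for k in range(1, n + 1))


print(gauss_solve([[2.0, 1.0], [1.0, 3.0]], [3.0, 5.0]))
print(elimination_count(1000) / (1000**3 / 3))  # ratio approaches 1 for large n
```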


Recommended References: For generation and testing of random numbers see Knuth, The Art of Computer Programming, Vol. 2. Methods for generating probability distributions are found in Devroye, Non-Uniform Random Variate Generation, which is also available on the web at http://cg.scs.carleton.ca/~luc/rnbookindex.html.


Figure 3 shows the magnetization as a function of temperature obtained with such a program. Part (a) is for the one-dimensional Ising model and the spins are initialized in random orientations. The scatter of points at low temperatures arises from insufficient equilibration and averaging times. In one dimension the magnetization vanishes for any.


Entertainment: One good example of an online applet that demonstrates the spin fluctuations in the two-dimensional Ising model.

AGTGGACTTTGACAGA
AGTGGACTTAGATTTA
TGGATCTTGACAGATT
AGTTGACTTACGTGCA
ATCGATCTATTCACCG


There are two major distinct types of PDEs. One type describes the evolution over time, or any other variable, starting from an initial configuration. Physical examples are the propagation of sound waves (wave equation) and the spread of heat in a medium (diffusion equation or heat equation). These are “initial value problems.” The other group are static solutions constrained by boundary conditions. Examples are the electric field of charges at rest (Poisson equation) and the charge distribution of electrons in an atom (time-independent Schrödinger equation). These are “boundary value problems.” The same distinction can already be made for ordinary differential equations. For example, $-f''(x) = f(x)$ with $f(0) = 1$ and $f'(0) = -1$ is an initial value problem, while the same equation with $f(0) = 1$ and $f(1) = -1$ is a boundary value problem.
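The ordinary differential equation example can be checked numerically. The sketch below integrates $f'' = -f$ with the classical fourth-order Runge-Kutta method, which is a choice made for this illustration and not something the text prescribes. The initial value problem has the exact solution $f(x) = \cos x - \sin x$. For the boundary value problem, writing $f(x) = \cos x + B \sin x$ and imposing $f(1) = -1$ gives $B = -(1 + \cos 1)/\sin 1$, which is the initial slope that "shoots" to the right boundary value.

```python
import math


def integrate_oscillator(f0, df0, x_end, steps=1000):
    """Integrate f'' = -f from x = 0 to x_end with classical RK4; return (f, f')."""
    h = x_end / steps
    f, g = f0, df0  # g stands for f'
    for _ in range(steps):
        k1f, k1g = g, -f
        k2f, k2g = g + 0.5 * h * k1g, -(f + 0.5 * h * k1f)
        k3f, k3g = g + 0.5 * h * k2g, -(f + 0.5 * h * k2f)
        k4f, k4g = g + h * k3g, -(f + h * k3f)
        f += h * (k1f + 2 * k2f + 2 * k3f + k4f) / 6
        g += h * (k1g + 2 * k2g + 2 * k3g + k4g) / 6
    return f, g


# Initial value problem: f(0) = 1, f'(0) = -1.
f_at_1, _ = integrate_oscillator(1.0, -1.0, 1.0)
print(f_at_1, math.cos(1.0) - math.sin(1.0))

# Boundary value problem: f(0) = 1, f(1) = -1, solved via the analytic initial slope.
slope = -(1.0 + math.cos(1.0)) / math.sin(1.0)
f_right, _ = integrate_oscillator(1.0, slope, 1.0)
print(f_right)
```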



basic linear algebra, or introductory physics. The last two and a half chapters involve multivariable calculus and can be omitted by anyone who does not have this background. Prior knowledge of numerical analysis and a programming language are optional. The book can be roughly divided into two parts. The first half deals with small computations and the second mainly with large computations. The reader is exposed to a wide range of approaches, conceptual ideas, and practical issues. Although the book is focused on physicists, all but a few chapters are accessible to and relevant for a much broader audience in the physical sciences. Sections with a ∗ symbol are specifically intended for physicists and chemists. For better readability, references within the text are entirely omitted. Figure and table numbers are prefixed with the chapter number, unless the reference occurs in the text of the same chapter. Bits of entertainment, problems, dialogs, and quotes are used for variety of exposition. Problems at the end of several of the chapters do not require paper and pencil, but should stimulate thinking.

Numerical results are commonly viewed with suspicion, and often rightly so, but it all depends on how well they are done. The following anecdote is appropriate. Five physicists carried out a challenging analytic calculation and obtained five different results. They discussed their work with each other to resolve the discrepancies. Three realized mistakes in their analysis, but the others still ended up with two different answers. Soon after, the calculation was done numerically and the result did not agree with any of the five analytic calculations. The numeric result turned out to be the only correct answer.

Norbert Schörghofer
Honolulu, Hawaii
August 2006



Recommended References: The “father” of the IEEE 754 standard is William Kahan, who has a description of the standard and other roundoff-related notes online at www.cs.berkeley.edu/~wkahan. A technical summary is provided by David Goldberg, What every computer scientist should know about floating point arithmetic. It can be found all over the internet, for example at http://docs-pdf.sun.com/800-7895/800-7895.pdf.




Random number generators are not truly random, but use deterministic rules to generate “pseudorandom” numbers, for example $x_{i+1} = (23 x_i) \bmod (10^8 + 1)$, meaning the remainder of $23 x_i / 100000001$. The starting value $x_0$ is called the “seed.” Pseudorandom number generators can never ideally satisfy all desired statistical properties. For example, since there are only finitely many computer representable numbers they will ultimately always be periodic, though the period can be extremely long. Random number generators are said to be responsible for many wrong computational results. Particular choices of the seed can lead to short periods. Likewise, the coefficients in formulas like the one above need to be chosen carefully. Many implementations of pseudorandom number generators were simply badly chosen or faulty. The situation has however improved and current random number generators suffice for almost any practical purpose. Source code routines seem to be universally better than built-in random number generators provided by libraries.

Pseudorandom number generators produce a uniform distribution of numbers in an interval, typically either integers or real numbers in the interval from 0 to 1 (without perhaps one or both of the endpoints). How do we obtain a different distribution? A new probability distribution, $p(x)$, can be related to a given one, $q(y)$, by a transformation $y = y(x)$. The probability to be between $x$ and $x + dx$ is $p(x)\,dx$. By construction, this equals the probability to be between $y$ and $y + dy$. Hence, $|p(x)\,dx| = |q(y)\,dy|$, where the absolute values are needed because y could
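The example generator and the change of distribution can be sketched in Python. The recurrence is the one quoted above. The exponential target $p(x) = e^{-x}$ is an assumption chosen for illustration: with a uniform $q(y) = 1$ on $[0, 1)$, the relation $|p(x)\,dx| = |q(y)\,dy|$ gives $y = 1 - e^{-x}$, so $x = -\ln(1 - y)$. The function names are hypothetical.

```python
import math
from itertools import islice


def lcg(seed, a=23, m=10**8 + 1):
    """Yield x_{i+1} = (a * x_i) mod m indefinitely, starting from the seed x_0."""
    x = seed
    while True:
        x = (a * x) % m
        yield x


def uniform(seed):
    """Map the integer stream to real numbers in [0, 1)."""
    m = 10**8 + 1
    for x in lcg(seed, m=m):
        yield x / m


def to_exponential(y):
    """Transform a uniform y in [0, 1) into a sample of p(x) = exp(-x)."""
    return -math.log(1.0 - y)


print(list(islice(lcg(1), 6)))
print([round(to_exponential(y), 3) for y in islice(uniform(1), 5)])
```

A seed of 0 shows how a poor seed can ruin the sequence: every subsequent value is 0, a period of one.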




Trees, which we have encountered in the heapsort algorithm, are a “data structure.” Arrays are another, simple data structure. A further possibility is to store pointers to data, that is, every data entry includes a reference to where the next entry is stored. Such a storage arrangement is called a “list.” Inserting an element in a sequence of data is faster when the data are stored as a list rather than as an array. On the other hand, accessing the last element is faster in an array than in a list. Lists cause cache misses (described in chapter 9), because sequential elements are.
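A minimal sketch of such a list in Python makes the trade-off concrete. The class and function names are illustrative. Note that Python's built-in `list` type is actually an array, so the linked structure has to be written by hand.

```python
class Node:
    """One entry of a singly linked list: a value plus a reference to the next entry."""

    def __init__(self, value, next_node=None):
        self.value = value
        self.next = next_node


def insert_after(node, value):
    """Insert a new entry right after node; no other entries have to move."""
    node.next = Node(value, node.next)
    return node.next


def last_value(head):
    """Reach the last entry by following every reference, one step per entry."""
    node = head
    while node.next is not None:
        node = node.next
    return node.value


def to_array(head):
    """Copy the linked entries into an ordinary Python list (an array)."""
    values = []
    node = head
    while node is not None:
        values.append(node.value)
        node = node.next
    return values


head = Node(1)
second = insert_after(head, 2)
insert_after(second, 4)
insert_after(second, 3)      # constant-time insertion in the middle
print(to_array(head), last_value(head))
```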

PHY 546: Python for Scientific Computing

A weekly graduate seminar on techniques for scientific programming
Instructor: Michael Zingale

Some basic programming background, be it C/C++, Fortran, MATLAB, or Mathematica (enough to understand the logic of programming, control statements, basic data structures, etc.), would be useful.

This is intended to be a 1-credit class. The primary method of evaluation is class participation.

To make the most of this class, you should have python installed on a laptop that you can bring to the seminar. On Linux machines, you can get python and the needed libraries through your package manager. For Mac and Windows, you might want to consider the free distributions provided by Enthought Canopy or Anaconda. These both install everything you need.

All of the course slides (in LibreOffice flat XML format), scripts, and IPython notebooks are available on the course github page: https://github.com/sbu-python-class/python-science

Course Information:
Online Resources:
Other Readings (dealing with open science and managing code projects):
  • 10 Simple Rules for the Care and Feeding of Scientific Data by Goodman et al.
  • How to Scale a Code in the Human Dimension by Matt Turk
  • Practices in source code sharing in astrophysics by L. Shamir et al.
  • Best Practices for Scientific Computing by G. Wilson et al.
  • Best Practices for Computational Science: Software Infrastructure and Environments for Reproducible and Extensible Research by V. Stodden and Sheila Miguez
  • Reliability in the Face of Complexity; The Challenge of High-End Scientific Computing by G. Ferland
  • The Nature of Scientific Proof in the Age of Simulations by K. Heng
Python Resources by Discipline:
The following list provides links to discipline-specific python software:
  • Astronomy resources:
    • Astropy: A community Python package for astronomy, an article describing a community astronomy package for python
    • AstroPython
  • Atmospheric Sciences resources:
    • PyAOS, a list of python resources for Atmospheric Sciences
  • Biology resources:
    • Python—All a Scientist Needs, an article describing how python is used in bioinformatics
    • Biopython, a set of tools for computational biology
  • Cognitive Science resources:
    • pycogsci, a blog providing information about how python is used in Cognitive Science
  • Ocean and marine sciences resources:
    • OceanPython.org, a blog for the ocean and marine sciences communities
  • Physics resources:
    • QuTiP, the Quantum Toolbox in Python
  • Social sciences resources:
    • NetworkX, a library for exploring the structure and complexity of social networks
  • Solar physics resources:
    • SunPy, a library providing routines to analyze solar data
  • Psychology resources:
    • PsychoPy, psychology software to "allow the presentation of stimuli and collection of data for a wide range of neuroscience, psychology and psychophysics experiments."
Course Topics:


Note: this information will be updated continuously throughout the semester, so it is best to look at the relevant topics just before the class meeting.

Introduction to python
The NumPy library
Python Practices
Matplotlib and others
SciPy and numerical methods (lectures 8–9)
  • Readings:
    • The official SciPy Tutorial
    • The SciPy cookbook
    • SciKits are additional toolkits for SciPy which provide extra functionality
    • SciPy Central user-submitted SciPy snippets
    • NumPy for Matlab Users
    • Deterministic Nonperiodic Flow by E. N. Lorenz—this is the system we integrated when discussing ODEs
    • A simple example of an ill-conditioned matrix by G. J. Tee
  • Lecture slides: scipy.pdf
  • Lecture IPython notebooks: scipy-basics.ipynb
  • Other examples:
    • Gaussian elimination with pivoting: gauss.py (main module), gauss-test.py (test routine), matmul.py (auxiliary routine)
SymPy
Pandas and the data frame
GUIs
Extending python with C/Fortran & System Operations (lecture 13)
  • Readings:
    • Interfacing with C by Valentin Haenel as part of his SciPy lecture notes. This is a very nice comparison of different methods
    • Speeding up Python (NumPy, Cython, and Weave) by T. Oliphant
    • C-API:
      • Extending Python with C or C++: this is the "hard" way to do things.
    • ctypes:
      • Ctypes Cookbook: ctypes makes it easy to call existing C code.
    • f2py:
      • f2py Users Guide
      • F2PY: a tool for connecting Fortran and Python programs
    • Cython:
      • Cython, C-Extensions for Python: the official project page
      • Cython: The Best of Both Worlds by S. Behnel et al.
  • Lecture slides: extensions.pdf
  • Example codes:
    • C-API: test-C-API.py numpy-ex.c setup.py
    • ctypes: test-ctypes.py cfunc_multid.c Makefile
    • f2py: test_f2py.py numpy_in_f.f90 Makefile
    • Cython: test_cy.py square.pyx setup.py
    • Timing comparison for Laplace smoothing (this extends the comparison from the blog entry by T. Oliphant listed above): Makefile laplace_CAPI.c laplace_C.c laplace_cython.pyx laplace_fortran.f90 laplace.py setup.py
    • Calling an external command and capturing both stdout and stderr: githash.py
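The last item, calling an external command and capturing both stdout and stderr, can be sketched with the standard `subprocess` module. This is not the course's githash.py, whose contents are not shown here. The helper name `run_command` is hypothetical, and the child process is simply another Python interpreter so that the example runs on any platform.

```python
import subprocess
import sys


def run_command(args):
    """Run a command and return (return code, stdout text, stderr text)."""
    result = subprocess.run(args, capture_output=True, text=True)
    return result.returncode, result.stdout, result.stderr


# A child process that writes one line to each stream.
child_code = "import sys; print('to stdout'); print('to stderr', file=sys.stderr)"
code, out, err = run_command([sys.executable, "-c", child_code])
print(code, repr(out), repr(err))
```

Passing the command as a list of arguments avoids going through a shell, which sidesteps quoting problems. `capture_output` requires Python 3.7 or later.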
Building python applications / Packaging
Other topics (if time)
MayaVi

A Primer on Scientific Programming with Python, 4th Edition

The book serves as a first introduction to computer programming of scientific applications, using the high-level Python language. The exposition is example and problem-oriented, where the applications are taken from mathematics, numerical calculus, statistics, physics, biology and finance. The book teaches “Matlab-style” and procedural programming as well as object-oriented programming. High school mathematics is a required background and it is advantageous to study classical and numerical one-variable calculus in parallel with reading this book. Besides learning how to program computers, the reader will also learn how to solve mathematical problems, arising in various branches of science and engineering, with the aid of numerical methods and programming. By blending programming, mathematics and scientific applications, the book lays a solid foundation for practicing computational science.


Scientific Computing with Python


Course taught by: Dr. Emile Chimusa (emile(at)aims.ac.tz) from University of Cape Town (South Africa).

SPECIFIC OUTCOMES ADDRESSED

  1. Generally speaking: develop numerical/scientific computing and problem-solving skills and approaches through writing computer scripts.
  2. Understand the three types of control structures (sequence, repetition and selection) as building blocks for all scripts.
  3. Manipulate basic objects and data structures.
  4. Understand the concepts of variable assignment, different data types, the memory allocation model, functions and function calls, with the mechanics of argument passing.
  5. Appreciate the importance of writing programs with I/O capabilities.
  6. Introduction to object-oriented programming.
  7. Effectively write computer programs. The question of the target programming language to be chosen here seems to be resolved into a growing consensus around Python.
BOOKS & OTHER SOURCES USED

Introduction to Algorithms, 3rd edition, by Thomas Cormen, Charles Leiserson and colleagues (MIT Press, 2009).

Python: Built-in Data Types.

Sage: introduction to computational Mathematics.

String Manipulation and if Statements

Derived Data Types (Lists, tuples, sets, and dictionaries) and more Control statements.

Writing Functions, File Input/output and Exception Handling.

Modules and more about graphics.

Scipy-Numpy Arrays and Introduction to Python Classes and Objects.

Throughout the course, use an interactive Python shell to demonstrate concepts, plus a simple text editor later on, once the students start writing functions.

This section “practical component” follows the same structure as the previous section “Theory lectures”: practicals just aim at having the students manipulate the concepts seen in the lectures, right after they were introduced to them.


BACKGROUND KNOWLEDGE REQUIRED
Basic general-purpose scientific knowledge, linear algebra and basic arithmetic/calculus skills, and some familiarity with computers.
Notes

ASSESSMENT ACTIVITIES AND THEIR WEIGHTS
  1. Homework will be assigned every week. Homework problems will consist of a mix of general problems, programming assignments, and problems related to the class project.
  2. Grading
      1. Final Projects: 50%
      2. Weekly Homework: 50%
  3. Collaboration Policy:
    Students may discuss the homework problems with other students or use other resources such as textbooks or the Internet. However, students must not obtain answers directly from anyone else. All homework will be submitted individually.
  4. Final Project: working in groups
  • Identify how different variables work together to create the dynamics of the
    system.
  • Reduce the dimensionality of the data.
  • Decrease redundancy in the data.
  • Filter some of the noise in the data.
  • Compress the data.

Projects
1. Stability Analysis of a Predator-Prey Model using python:
The dynamic relationship between predators and their prey has long been and will continue to be one of the dominant themes in both ecology and mathematical ecology due to its universal existence and importance. The aim of this project is to compare the computational approaches to the stability analysis of a predator-prey model using Python and Sage.
2. Application of Programming Dynamic Models in Social Networks using Networkx in Python:
Finding patterns of social interaction within a population has wide-ranging applications including disease modelling, cultural and information transmission, and behavioural ecology. Social interactions are often modelled with networks. A key characteristic of social interactions is their continual change. However, most past analyses of social networks are essentially static in that all information about the time that social interactions take place is discarded. The aim of this short project is to use the NetworkX package in Python and some existing algorithms to illustrate the mathematical and computational framework that enables analysis of dynamic social networks and that explicitly makes use of information about when social interactions occur.
3. Computing fractal geometry using python:
Fractals are a new branch of mathematics and art. Perhaps this is the reason why most people recognize fractals only as pretty pictures useful as backgrounds on the computer screen or original postcard patterns. But what are they really? Most physical systems of nature and many human artefacts are not regular geometric shapes of the standard geometry derived from Euclid. Fractal geometry offers almost unlimited ways of describing, measuring and predicting these natural phenomena. But is it possible to define the whole world using mathematical equations? Fractal geometry has permeated many areas of science, such as astrophysics and the biological sciences, and has become one of the most important techniques in computer graphics. This project aims at discussing the mathematical and computational aspects of the most famous fractals. The discussion may cover the computational aspects using turtle or other Python packages, how those fractals were created, and the most important fractal properties, which make fractals useful for different domains of science.
4. Data Analysis: Application to tuberculosis data of the South African Coloured population:
Tuberculosis (TB) remains a source of morbidity and mortality worldwide, particularly in developing countries. One-third of the world’s individuals are infected with TB, but only 10% go on to develop active TB during their lifetime. In addition, twin studies in humans and animal models also demonstrate a strong genetic influence on TB susceptibility. This suggests that genetic factors may play an important role in TB susceptibility in determining both the host response and the outcome of infection. The second highest incidence of TB in the world is in the Western, Eastern and Northern Cape in South Africa, particularly in the mixed South African Coloured population. This project aims at looking at ancestry-specific TB risk using the genetic data of the mixed South African Coloured population. It also aims at evaluating the genetic ancestry of samples of TB cases and controls from this population. Importantly, it will examine whether the genetic contribution can increase tuberculosis prevalence.
5. Computing Forward-Backward Algorithm using python:
The forward-backward algorithm has very important applications to both hidden Markov models (HMMs) and conditional random fields (CRFs). It is a dynamic programming algorithm, and is closely related to the Viterbi algorithm for decoding with HMMs or CRFs. This project aims at describing the algorithm at a level of abstraction that applies to both HMMs and CRFs. It will also describe its specific application and its computational aspects using Python.
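As a starting point, the forward-backward recursion for a small HMM might be sketched as below. This is an unscaled textbook version, so it would underflow on long observation sequences. The two-state weather model and all of its probabilities are invented purely for illustration.

```python
def forward_backward(obs, states, start_p, trans_p, emit_p):
    """Return (posterior state probabilities per time step, probability of obs)."""
    n = len(obs)
    # Forward pass: fwd[t][s] = P(obs[0..t], state at t is s).
    fwd = [{} for _ in range(n)]
    for s in states:
        fwd[0][s] = start_p[s] * emit_p[s][obs[0]]
    for t in range(1, n):
        for s in states:
            incoming = sum(fwd[t - 1][r] * trans_p[r][s] for r in states)
            fwd[t][s] = emit_p[s][obs[t]] * incoming
    # Backward pass: bwd[t][s] = P(obs[t+1..] | state at t is s).
    bwd = [{} for _ in range(n)]
    for s in states:
        bwd[n - 1][s] = 1.0
    for t in range(n - 2, -1, -1):
        for s in states:
            bwd[t][s] = sum(
                trans_p[s][r] * emit_p[r][obs[t + 1]] * bwd[t + 1][r] for r in states
            )
    evidence = sum(fwd[n - 1][s] for s in states)
    posterior = [
        {s: fwd[t][s] * bwd[t][s] / evidence for s in states} for t in range(n)
    ]
    return posterior, evidence


# Hypothetical two-state model; the numbers carry no empirical meaning.
states = ("rainy", "sunny")
start_p = {"rainy": 0.5, "sunny": 0.5}
trans_p = {"rainy": {"rainy": 0.7, "sunny": 0.3}, "sunny": {"rainy": 0.3, "sunny": 0.7}}
emit_p = {"rainy": {"umbrella": 0.9, "none": 0.1}, "sunny": {"umbrella": 0.2, "none": 0.8}}

posterior, evidence = forward_backward(
    ["umbrella", "umbrella", "none"], states, start_p, trans_p, emit_p
)
for t, dist in enumerate(posterior):
    print(t, {s: round(p, 3) for s, p in dist.items()})
```

Each posterior combines evidence from before and after a time step, which is what distinguishes this smoothing computation from Viterbi decoding of a single best path.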
6. Computational Signal Processing and Visualization using Python:
An ever-increasing number of scientific studies are generating larger, more complex, and multi-modal datasets. This results in data analysis tasks becoming more demanding. To help tackle these new challenges, more disciplines now need to incorporate advanced visualization techniques into their standard data processing and analysis methods. While many systems have been developed to allow scientists to explore, analyse, and visualize their data, many of these solutions are domain specific, limiting their scope as general processing tools. This project aims at discussing a development environment suitable for both computational and visualization tasks. It will describe the basic mathematics of computational signal processing and visualization using Python, with applications in neuroscience.
7. Application and Computing Principal Component Analysis (PCA):
Principal component analysis (PCA) is a statistical procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called principal components. It is one of several statistical tools available for reducing the dimensionality of a data set. The major goal of principal components analysis is to reveal hidden structure in a data set. This project will discuss the mathematical and computational aspect of PCA including
8. Computational Generalization of Mixed Models:
Dependent data arise in many studies. Frequently adopted sampling designs, such as cluster, multilevel, spatial, and repeated measures, may induce this dependence, which the analysis of the data needs to take into due account. This project involves the exploration of the generalization of mixed models, its applications and its computational aspects using Python. In addition, the project will discuss the computation of parameters in mixed models using the Monte Carlo Expectation Maximization algorithm.
9. Modelling and Visualizing Human Protein-Protein Interactions Network using Python:
Understanding complex systems often requires a bottom-up analysis towards a systems biology approach. The need emerges to investigate a system not only as individual components but as a whole. This can be done by examining the elementary constituents individually and then how these are connected. The myriad components of a system and their interactions are best characterized as networks, and they are mainly represented as graphs where thousands of nodes are connected by thousands of edges. This project will use graph theory to model and visualize Human Protein-Protein Interactions and will discuss ways in which graphs can be used to reveal hidden properties and features of a network using networkx in Python and R.

NetworkX 1

Geeks3D

NetworkX is a Python package for the creation, manipulation, and study of the structure, dynamics, and functions of complex networks. An update (1.0 RC1) was released a few days ago.

I took advantage of this update to test NetworkX with GeeXLab. I created a simple graph (path_graph) and played with the Fruchterman-Reingold algorithm to position the graph’s nodes.

Fruchterman-Reingold is an algorithm that attempts to produce aesthetically pleasing, two-dimensional pictures of graphs by doing simplified simulations of physical systems.

The Fruchterman-Reingold Algorithm is a force-directed layout algorithm. The idea of a force directed layout algorithm is to consider a force between any two nodes. In this algorithm, the nodes are represented by steel rings and the edges are springs between them. The attractive force is analogous to the spring force and the repulsive force is analogous to the electrical force. The basic idea is to minimize the energy of the system by moving the nodes and changing the forces between them. For more details refer to the Force Directed algorithm.
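
The spring-and-charge picture above can be sketched in a few lines of NumPy. This is a simplified illustration, not the NetworkX implementation (NetworkX exposes its own version as spring_layout); the function name, the linear cooling schedule and the default parameters are assumptions made for the example.

```python
import numpy as np

def fruchterman_reingold(n_nodes, edges, iterations=200, seed=0):
    """Simplified force-directed layout in the unit square (sketch)."""
    rng = np.random.default_rng(seed)
    pos = rng.random((n_nodes, 2))        # random start positions
    k = np.sqrt(1.0 / n_nodes)            # ideal distance between nodes
    temperature = 0.1                     # maximum step length
    cooling = temperature / (iterations + 1)

    for _ in range(iterations):
        disp = np.zeros_like(pos)

        # Repulsion between every pair of nodes, magnitude k^2 / d
        for i in range(n_nodes):
            delta = pos[i] - pos          # vectors pointing towards node i
            dist = np.linalg.norm(delta, axis=1)
            dist[i] = 1.0                 # delta[i] is zero, so no self-force
            dist = np.maximum(dist, 1e-9)
            disp[i] += (delta * (k**2 / dist**2)[:, None]).sum(axis=0)

        # Attraction along each edge (the "springs"), magnitude d^2 / k
        for a, b in edges:
            delta = pos[a] - pos[b]
            dist = max(np.linalg.norm(delta), 1e-9)
            pull = delta * dist / k
            disp[a] -= pull
            disp[b] += pull

        # Move each node, but never further than the current temperature
        length = np.maximum(np.linalg.norm(disp, axis=1), 1e-9)
        step = np.minimum(length, temperature)
        pos += disp / length[:, None] * step[:, None]
        temperature -= cooling

    return pos

# A path graph with five nodes, similar to the path_graph used in the post
path_edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
layout = fruchterman_reingold(5, path_edges)
```

The shrinking temperature caps how far a node may move per iteration, which is what lets the layout settle instead of oscillating.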

I’m absolutely not an expert in graphs, I’m rather a newbie, but it’s cool to see how we can play with a scientific library in GeeXLab. Thanks a lot, Python! Just for fun, it would be nice to see such a NetworkX graph rendered with cool lighting and other eye-catching post-processing effects. I’ll do it shortly, as soon as the alpha version of GeeXLab is out…

NetworkX is based on NumPy (Numerical Python). NumPy is the fundamental package needed for scientific computing with Python. The latest version at the time of writing was 1.3.0.

A Worked Example on Scientific Computing with Python

This worked example fetches a data file from a web site, applies that file as input data for a differential equation modeling a vibrating system, solves the equation, and visualizes various properties of the solution and the input data. The following programming topics are illustrated: downloading files from a web site, working with numpy arrays, flexible storage of objects in lists, easy storage of objects in files (persistence), signal processing and FFT, and curve plotting of data.

Physical Problem

The task is to make a simulation program that can predict how a (simple) mechanical system oscillates in response to environmental forces. Introducing u(t) as some displacement of the system at time t, application of Newton’s second law of motion to the system often results in the following type of equation for u:
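
The equation itself is missing from this copy. One form consistent with the quantities named later in the text (mass m, damping parameter b, restoring force f(u), environmental force F(t), initial displacement I) is sketched below. The linear damping term and the zero initial velocity are assumptions, not taken from the source.

```latex
m\ddot{u} + b\dot{u} + f(u) = F(t), \qquad u(0) = I, \quad \dot{u}(0) = 0 \tag{1}
```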

Vehicle on a bumpy road

Another example regards the vertical shaking of a building due to earthquake-induced movement of the ground. If the vertical displacement of the ground is recorded as a function of time, this results in a vertical force F(t). The soil foundation acts as a spring and damper on the building, modeled through the damping parameter b and normally a linear spring term f(u) = ku.

In both cases we drop the effect of gravity, which is just a constant compression of the spring.

Numerical Model and Implementation

The implementation of the computational algorithm can make use of an array u, with u[n] representing the displacement at time point n. The force at that time point is assumed to be available as the array element F[n]. The following Python function computes u given an array t with the time points, the initial displacement I, mass m, damping parameter b, restoring force f(u), and environmental forces F as an array (corresponding to t).
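
The function itself is not reproduced in this copy. The following is a minimal sketch with the argument list named in the text, assuming the model m*u'' + b*u' + f(u) = F(t) with linear damping, zero initial velocity, uniformly spaced time points, and a central finite difference scheme. The original implementation may differ in all of these respects.

```python
import numpy as np

def forced_vibrations(t, I, m, b, f, F):
    """
    Solve m*u'' + b*u' + f(u) = F(t) with u(0)=I, u'(0)=0 (assumed model).

    t: array of uniformly spaced time points
    f: callable restoring force f(u)
    F: array of force values at the time points in t
    Returns the array u, where u[n] approximates u at t[n].
    """
    dt = t[1] - t[0]                 # assumes uniform spacing
    u = np.zeros(t.size)
    u[0] = I
    # Special formula for the first step, derived from u'(0)=0
    u[1] = u[0] + dt**2 / (2 * m) * (F[0] - f(u[0]))
    # Central differences for u'' and u' at interior time points
    for n in range(1, t.size - 1):
        u[n + 1] = (2 * m * u[n] - (m - 0.5 * b * dt) * u[n - 1]
                    + dt**2 * (F[n] - f(u[n]))) / (m + 0.5 * b * dt)
    return u
```

With m = 1, b = 0, f(u) = u, F = 0 and I = 1 the exact solution is cos(t), which gives a quick sanity check of the scheme.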

Dissection of the Code. Functions in Python start with def, followed by the function name and the list of input objects separated by commas. The function body is indented, and the first non-indented line signifies the end of the function body block. The string, enclosed in triple double-quotes, right after the function definition, is a doc string used for documenting the function. Various tools can extract function definitions and doc strings to automatically produce manuals.

Array functionality is offered by the numpy package, here imported under the nickname np. This package contains MATLAB-like functionality. It is quite common to prefix a MATLAB-like function such as zeros by np (np.zeros), but one can also perform from numpy import * and then write just zeros without any prefix. The advantage is that the code closely resembles similar MATLAB code.

The total number of elements in an array t is obtained by t.size. One could also use len(t), but for multi-dimensional arrays len just gives the number of elements corresponding to the first index (number of rows in a matrix).
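
A small illustration of the difference, using made-up arrays:

```python
import numpy as np

t = np.linspace(0, 1, 5)     # one-dimensional array with 5 elements
A = np.zeros((3, 4))         # matrix with 3 rows and 4 columns

t_size, t_len = t.size, len(t)   # both count the 5 elements
A_size, A_len = A.size, len(A)   # 12 elements in total, but only 3 rows
```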

Arrays are indexed by square brackets, and indices always start at 0. For loops in Python are more general than those in Fortran, C, C++, and Java, as one can loop over any set of objects with the syntax for element in some_set. In numerical code, it is common to loop over array indices, i.e., a set of integers. Such a set is produced by range(start, stop, increment), which yields the integers start, start+increment, start+2*increment, and so on, up to but not including stop (a list in Python 2, a lazy range object in Python 3). Writing just range(stop) means range(0, stop, 1).
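
Both loop styles in a short example (the values are arbitrary; list(...) is used only to make the generated integers visible):

```python
# Loop directly over the elements of a collection
total = 0.0
for value in [0.5, 1.5, 2.0]:
    total += value

# Loop over integers produced by range
evens = list(range(0, 10, 2))     # 0, 2, 4, 6, 8; the stop value 10 is excluded
first_four = list(range(4))       # same as range(0, 4, 1)
```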

Every variable in Python is an object. In particular, the f function above is a function object, transferred to the function as any other object, and called as any other function.

The Force

Considering the application where the present mathematical model describes the vibrations of a vehicle driving along a bumpy road, we need to establish the force array F from the shape of the road h(x). Various shapes are available as a file with web address http://folk.uio.no/hpl/scripting/bumpy.dat.gz. The Python functionality for downloading this gzip compressed file as a local file bumpy.dat.gz and reading it into a numpy array goes as follows:
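
The original listing is missing here. A minimal sketch in Python 3 could look as follows; the helper names fetch and load_road_data are hypothetical, and the file layout (first column x coordinates, remaining columns road shapes) is an assumption that should be checked against the actual file.

```python
import os
import urllib.request
import numpy as np

def fetch(url, filename):
    """Download url to filename unless the file already exists."""
    if not os.path.isfile(filename):
        urllib.request.urlretrieve(url, filename)
    return filename

def load_road_data(filename):
    """Read a text table; np.loadtxt decompresses .gz files itself."""
    table = np.loadtxt(filename)
    x = table[:, 0]          # assumed: first column holds x coordinates
    h_data = table[:, 1:]    # assumed: remaining columns hold road shapes
    return x, h_data

# Usage (requires network access):
# fetch('http://folk.uio.no/hpl/scripting/bumpy.dat.gz', 'bumpy.dat.gz')
# x, h_data = load_road_data('bumpy.dat.gz')
```

Skipping the download when the file is already present keeps repeated runs from hitting the web server each time.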

In general, a[s:t:i,2] gives a view (not a copy) to the part of the array a where the first index goes from s to t, but not including the t value, in increments of i, and the second index is fixed at 2. Just writing : for an index means all possible index values.
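
The view semantics can be seen in a small example with an arbitrary 6x4 array:

```python
import numpy as np

a = np.arange(24).reshape(6, 4)   # rows 0..5, columns 0..3

part = a[1:5:2, 2]        # rows 1 and 3 (5 excluded), column 2
selected = part.copy()    # snapshot of the selected values
part[0] = -1              # writing to the view changes a itself

column = a[:, 2]          # ":" picks every row of column 2
```

Because part is a view, the assignment changes a[1, 2] as well. Use .copy() when an independent array is needed.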

Here, u is a local variable, which lives just inside the function, while k is a global variable, which must be initialized outside the function prior to calling f with any u argument.

Parameters can be set as

This choice corresponds to a velocity of 36 km/h and a mass of 60 kg, i.e. bicycle conditions.

For each shape we want to compute the corresponding vertical displacement u(t) using the mathematical model (1). This can be accomplished by looping over the columns of h_data and calling forced_vibrations for each column, i.e., each realization of the force F. The major arrays from the computations are collected in a list data holding x, t, and, for each road shape, a 3-list [h, a, u].

The code above is naturally implemented as a Python function:

Since the roads have a quite noisy shape, the force looks very noisy and the response to this excitation is quite noisy too; see the figure "First realization of a bumpy road, with corresponding acceleration of the wheel and resulting vibrations" for an example. It may be useful to compute the root mean square value of the various realizations of u(t) and add this array to the data list of input and output data in the problem:
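
The listing is missing in this copy. A minimal sketch, assuming the layout data = [x, t, [h, a, u], [h, a, u], ...] described earlier and taking the root mean square over the realizations at each time point (the function name rms is an assumption):

```python
import numpy as np

def rms(data):
    """Append the RMS of u over all realizations to the data list."""
    t = data[1]
    realizations = data[2:]              # the [h, a, u] triplets
    sum_of_squares = np.zeros(t.size)
    for h, a, u in realizations:
        sum_of_squares += u**2
    u_rms = np.sqrt(sum_of_squares / len(realizations))
    data.append(u_rms)                   # last element is now the RMS array
    return data
```

After the call, the last element of data is the RMS array rather than a 3-list, so later loops over the realizations should use data[2:-1].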