This page shows how to process data and generate figures in parallel from a Jupyter Notebook by launching external Python scripts.
The advantages of this method: you don't have to wait for the processing to finish, so you can keep executing other cells in the notebook, and you don't need the more complicated ipyparallel or multiprocessing modules. Below, I first show the original Python code that processes the data serially, then the code that processes it in parallel. The resulting figures are not shown because they are not the point of this page.
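The key point is that `subprocess.Popen` returns immediately instead of blocking, so the notebook stays responsive while the child process works. A minimal self-contained sketch of the idea (the one-line sleeping script here is only a stand-in for a real plotting script):

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical stand-in for the plotting script: it just sleeps briefly.
with tempfile.NamedTemporaryFile('w', suffix='.py', delete=False) as f:
    f.write("import time; time.sleep(1)")
    path = f.name

# Popen returns at once; the child keeps running in the background,
# so the notebook cell finishes without waiting for it.
proc = subprocess.Popen([sys.executable, path])
print('still running:', proc.poll() is None)  # usually True right after launch

proc.wait()        # only needed when you must block until the child is done
os.unlink(path)
```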
In [1]:
import matplotlib.pyplot as plt
import numpy as np
import pickle
from os.path import join
import subprocess
%matplotlib agg
First, generate the data to plot.
In this example, data suitable for contourf is saved in pickle format.
In [2]:
import os
NUMFIG = 1000
datadir = './data2graph_parallelly'
os.makedirs(datadir, exist_ok=True)  # make sure the output directory exists
xs = np.linspace(-2.0, 2.0, 10)
ys = np.linspace(-2.0, 2.0, 10)
XX, YY = np.meshgrid(xs, ys)
ZZs = [base**(XX*YY) for base in np.linspace(1, 5, NUMFIG)]
with open(join(datadir, 'xyz.pkl'), mode='wb') as f:
    pickle.dump([XX, YY, ZZs], f)
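As a sanity check, the pickle round-trips cleanly. A small self-contained sketch (a temporary directory stands in for `datadir`, with only 3 arrays instead of 1000):

```python
import pickle
import tempfile
import numpy as np
from os.path import join

datadir = tempfile.mkdtemp()              # stand-in for './data2graph_parallelly'
xs = np.linspace(-2.0, 2.0, 10)
XX, YY = np.meshgrid(xs, xs)
ZZs = [base**(XX * YY) for base in np.linspace(1, 5, 3)]

# Save and reload exactly as in the notebook cell above.
with open(join(datadir, 'xyz.pkl'), mode='wb') as f:
    pickle.dump([XX, YY, ZZs], f)
with open(join(datadir, 'xyz.pkl'), mode='rb') as f:
    XX2, YY2, ZZs2 = pickle.load(f)

print(XX2.shape, len(ZZs2))  # → (10, 10) 3
```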
Convert the data to PNG files serially.
In [3]:
# Never show figures inline in this notebook
%matplotlib agg
for idx in range(NUMFIG):
    plt.figure(facecolor='w', figsize=(5, 4))
    cont = plt.contourf(XX, YY, ZZs[idx], 50)
    cbar = plt.colorbar(cont)
    plt.savefig(join(datadir, 'xyz%03d.png' % idx))
    plt.close()  # free the figure; otherwise 1000 figures stay open in memory
Prepare for parallel processing.
Write the code and save it as an external .py file.
In [4]:
scriptname = 'data2graph.py'
with open(scriptname, mode='w') as f:
    f.write("""
import numpy as np
import pickle
import matplotlib
matplotlib.use('Agg')  # no display needed in the external process
import matplotlib.pyplot as plt
import argparse
from os.path import join

def main():
    with open(join(args.datadir, args.fname), mode='rb') as f:
        XX, YY, ZZs = pickle.load(f)
    for idx in [int(i) for i in args.idxs.split(',')]:
        plt.figure(facecolor='w', figsize=(5, 4))
        cont = plt.contourf(XX, YY, ZZs[idx], 50)
        cbar = plt.colorbar(cont)
        plt.savefig(join(args.datadir, 'xyz%03d.png' % idx))
        plt.close()

if __name__ == '__main__':
    p = argparse.ArgumentParser(description='load data and convert to image(s)')
    p.add_argument('datadir', type=str,
                   help='directory name to store output graph(s)')
    p.add_argument('fname', type=str,
                   help='file name which contains data to process')
    p.add_argument('idxs', type=str,
                   help='comma separated index number(s) of data to process')
    args = p.parse_args()
    main()
""")
In [5]:
ret = subprocess.check_output(['python', scriptname, '-h'])
print(ret.decode())
Run 8 Python scripts (because my PC has 8 cores).
In [6]:
NCORE = 8
for st in range(NCORE):
    # round-robin split: worker st handles indices st, st+NCORE, st+2*NCORE, ...
    idxs = ','.join(map(str, range(st, NUMFIG, NCORE)))
    cmd = ['python', scriptname, datadir, 'xyz.pkl', idxs]
    subprocess.Popen(cmd)
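If a later cell depends on the PNG files, keep the `Popen` handles and wait for every worker before moving on. A self-contained sketch of the pattern (a trivial `python -c` command stands in for `data2graph.py` here):

```python
import subprocess
import sys

NCORE = 8
procs = []
for st in range(NCORE):
    # In the notebook this would be:
    #   ['python', scriptname, datadir, 'xyz.pkl', idxs]
    cmd = [sys.executable, '-c', 'print(%d)' % st]
    procs.append(subprocess.Popen(cmd))

# Block until every worker has finished, then check the exit codes.
for p in procs:
    p.wait()
print(all(p.returncode == 0 for p in procs))  # → True when all succeeded
```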