Using GenX from the command line (with mpi)¶
Introduction¶
Using GenX from the command line lets you, in the simplest case, start up the gui. You can also run fits without starting up the gui at all. This opens possibilities to make a batch script of multiple GenX runs and, in addition, you can run GenX on machines without a desktop environment. GenX also supports mpi for fitting in parallel opens up the possibility to use it on clusters. The mpi implementation was contributed by Canrong Qiu. Note that, currently, the command line is only fully implemented in the source/pip versions.
Dependencies¶
If you only intend to run GenX from the command line you do not need an installation of wxPython or matplotlib.
See section Clusters for installation instructions.
Command line arguments¶
The arguments to GenX can be viewed by executing the program with the --help
option.
Note
All commands should work from the source folder without installation by using python scripts/genx
as execuable.
$ genx --help
usage: genx [-h] [-r | --mpi | -g | --pars | --mod]
[--pr PR] [--cs CS] [--mgen MGEN] [--pops POPS] [--asi ASI] [--km KM] [--kr KR]
[-s] [-e] [--var VAR] [--bumps]
[-d DATA_SET] [--load DATAFILE] [--export SAVE_DATAFILE]
[-l LOGFILE] [--debug] [--dpi-scale DPI_OVERWRITE] [--no-curses] [--disable-nb] [--nb1]
[infile] [outfile]
GenX 3.6.0, fits data to a model.
positional arguments:
infile The .gx or .hgx file to load or .ort file to use as basis for model
outfile The .gx or hgx file to save into
optional arguments:
-h, --help show this help message and exit
-r, --run run GenX fit (no gui)
--mpi run GenX fit with mpi (no gui)
-g, --gen generate data.y with poisson noise added (no gui)
--pars extract the parameters from the infile (no gui)
--mod modify the GenX file (no gui)
optimization arguments:
--pr PR Number of processes used in parallel fitting.
--cs CS Chunk size used for parallel processing.
--mgen MGEN Maximum number of generations that is used in a fit
--pops POPS Population size - number of individuals.
--asi ASI Auto save interval (generations).
--km KM Mutation constant (float 0 < km < 1)
--kr KR Cross over constant (float 0 < kr < 1)
-s, --esave Force save evals to gx file.
-e, --error Calculate error bars before saving to file.
--var VAR Minimum relative parameter variation to stop the fit (%)
--bumps Use Bumps DREAM optimizer instead of GenX Differential Evolution
data arguments:
-d DATA_SET Active data set to act upon. Index starting at 0.
--load DATAFILE Load file into active data set. Index starting at 0.
--export SAVE_DATAFILE
Save active data set to file. Index starting at 0.
startup options:
-l LOGFILE, --logfile LOGFILE
Output debug information to logfile.
--debug Show additional debug information on console/logfile
--dpi-scale DPI_OVERWRITE
Overwrite the detection of screen dpi scaling factor (=72/dpi)
--no-curses Disable Curses interactive console interface for command line fitting on UNIX systems.
--disable-nb Disable the use of numba JIT compiler
--nb1 Compile numba JIT functions without parallel computing support (use one core only).
This does disable caching to prevent parallel versions from being loaded.
For support, manuals and bug reporting see http://genx.sf.net
To run a fit using the multiprocessing module (forking different processes) which is the same code as in the gui the following command can be executed.
$ python ./scripts/genx --run --mgen=10 --pr 8 ./genx/examples/X-ray_Reflectivity.hgx test.hgx
INFO: *** GenX 3.6.0 Logging started ***
INFO: Loading model C:\Users\Artur\genx\genx\genx\examples\X-ray_Reflectivity.hgx...
INFO: Simulating model...
INFO: Setting up the optimizer...
INFO: DiffEv Optimizer:
Fitting:
use_start_guess=True use_boundaries=True
use_autosave=False autosave_interval=10
save_all_evals=False max_log_elements=100000
Differential Evolution:
km 0.6
kr 0.6
create_trial best_1_bin
use_pop_mult=False pop_mult=3 pop_size=50
use_max_generations=True max_generations=10 max_generation_mult=6
min_parameter_spread 0.0
Parallel processing:
use_parallel_processing True
parallel_processes 8
parallel_chunksize 1
INFO: Saving the initial model to C:\Users\Artur\genx\genx\test.hgx
INFO: Fitting starting...
INFO: DE initilized
INFO: Setting up a pool of workers ...
INFO: Starting the fit...
INFO: Starting a pool with 8 workers ...
INFO: Calculating start FOM ...
INFO: Going into optimization ...
INFO: FOM: 0.321 Generation: 1 Speed: 2777.7
INFO: FOM: 0.293 Generation: 2 Speed: 2500.0
INFO: FOM: 0.254 Generation: 3 Speed: 2500.2
INFO: FOM: 0.217 Generation: 4 Speed: 2499.9
INFO: FOM: 0.217 Generation: 5 Speed: 2777.7
INFO: FOM: 0.217 Generation: 6 Speed: 2941.2
INFO: FOM: 0.217 Generation: 7 Speed: 2941.2
INFO: FOM: 0.206 Generation: 8 Speed: 2941.3
INFO: FOM: 0.206 Generation: 9 Speed: 3124.8
INFO: FOM: 0.206 Generation: 10 Speed: 2941.3
INFO: Stopped at Generation: 10 after 500 fom evaluations...
INFO: Fitting finished!
INFO: Time to fit: 0.05453455845514933 min
INFO: Updating the parameters
INFO: Saving the fit to C:\Users\Artur\genx\genx\test.hgx
INFO: Fitting successfully completed
INFO: *** GenX 3.6.0 Logging ended ***
As can be seen this loads the file .genx/examples/X-ray_Reflectivity.hgx
sets the maximum number of generation to run
to 10 and then runs the fit. The result is saved to test.hgx
. Note that to be able to analyse the fits (calculate error bars
for example) the option --esave
should be used. If the fits take a long time to run it is advisable to save them
every now and then with the --asi
command that specifies how often the current result should be written to file.
It can also be good idea to directly calculate the errorbars before saving to file with the -e
command.
Another point to see is that there is a significant speed-up when only using the command line. This is probably due to
that the GUI does not have to be updated.
For UNIX systems the default command line output uses the curses library to better visualize the progress, the output during refinement will look something like this:
FOM: 0.051 Generation: 25 Speed: 2162.7
FOM: 0.046 Generation: 26 Speed: 2141.1
FOM: 0.046 Generation: 27 Speed: 2123.3
FOM: 0.046 Generation: 28 Speed: 2120.4
FOM: 0.046 Generation: 29 Speed: 1865.8
FOM: 0.046 Generation: 30 Speed: 2185.8
FOM: 0.046 Generation: 31 Speed: 2176.6
FOM: 0.046 Generation: 32 Speed: 2227.9
Relative value and spread of fit parameters: best/width
Parameter 00: [ ==# ] 0.53/0.03
Parameter 01: [ ===================#==================== ] 0.34/0.51
Parameter 02: [ ==========================================#=== ] 0.94/0.58
Parameter 03: [ =============================#=================== ] 0.64/0.62
Parameter 04: [ =======================#== ] 0.94/0.33
Parameter 05: [ =========================#===================== ] 0.33/0.59
Parameter 06: [ =============#========== ] 0.42/0.31
Parameter 07: [ =================#====== ] 0.79/0.31
Parameter 08: [ ============#================= ] 0.17/0.38
Note
The fit can be stopped before the breaking conditions using q
. To deactivate the interactive
view use the --no-curses
option.
Stopping with q only works on UNIX without curses if <enter>
is pressed afterwords. This can
also be used to stop a MPI refinement at any time.
Using MPI¶
If MPI and mpi4py is installed on the system the --mpi
switch will be activated. Note that the description for
--mpi
in the help will not appear until the mpi4py can be loaded correctly. In order to use mpi the command mpirun
or mpiexec
has to be used. The argument -np
defines how many processes to use. An example can be seen below.
$ mpirun -np 2 python -m genx.run --mpi --mgen=10 ./genx/examples/X-ray_Reflectivity.hgx test.hgx
INFO: *** GenX 3.6.0 Logging started ***
INFO: Loading model /mnt/c/Users/Artur/genx/genx/genx/examples/X-ray_Reflectivity.hgx...
INFO: Simulating model...
INFO: Setting up the optimizer...
INFO: DiffEv Optimizer:
Fitting:
use_start_guess=True use_boundaries=True
use_autosave=False autosave_interval=10
save_all_evals=False max_log_elements=100000
Differential Evolution:
km 0.6
kr 0.6
create_trial best_1_bin
use_pop_mult=False pop_mult=3 pop_size=50
use_max_generations=True max_generations=10 max_generation_mult=6
min_parameter_spread 0.0
Parallel processing:
use_parallel_processing False
parallel_processes 2
parallel_chunksize 1
INFO: Saving the initial model to /mnt/c/Users/Artur/genx/genx/test.hgx
INFO: Fitting starting...
INFO: DE initilized
INFO: Inits mpi with 2 processes ...
INFO: Starting the fit...
INFO: Calculating start FOM ...
INFO: Going into optimization ...
INFO: FOM: 0.301 Generation: 1 Speed: 1244.8
INFO: FOM: 0.234 Generation: 2 Speed: 1262.8
INFO: FOM: 0.234 Generation: 3 Speed: 1225.5
INFO: FOM: 0.234 Generation: 4 Speed: 1229.7
INFO: FOM: 0.234 Generation: 5 Speed: 1148.9
INFO: FOM: 0.234 Generation: 6 Speed: 1226.7
INFO: FOM: 0.234 Generation: 7 Speed: 1112.0
INFO: FOM: 0.234 Generation: 8 Speed: 1214.3
INFO: FOM: 0.234 Generation: 9 Speed: 1200.5
INFO: FOM: 0.234 Generation: 10 Speed: 1000.2
INFO: Stopped at Generation: 10 after 500 fom evaluations...
INFO: Fitting finished!
INFO: Time to fit: 0.011236679553985596 min
INFO: Updating the parameters
INFO: Saving the fit to /mnt/c/Users/Artur/genx/genx/test.hgx
INFO: Fitting successfully completed
INFO: *** GenX 3.6.0 Logging ended ***
As MPI defines its process externally and the code calculates the chunk size automatically the arguments -pr
and
--cr
will not be used in this case. This should be the only changes compared to using it from the command line as
usual.
If a logfile is written with the -l
option the MPI process number will be added to the file name with the
primary process starting with number 00
.
Using remote refinement server¶
To have the advantage of high performance computing and interactive refinement GenX has a server script that can be started on the cluster and a desktop client within the same network can use this as worker for refinement from a GUI client.
To start the server with the standard parameters run the genx_server command or execute with python directly:
$ genx_server
INFO: *** GenX 3.6.0 Logging started ***
INFO: Importing numba based modules to pre-compile JIT functions, this can take some time
INFO: Modules imported successfully
INFO: Starting RemoteController
INFO: Starting listening on localhost with port=3000
The fitting is then started from the GUI client selecting the “Remote DiffEv” optimizer. The configuration is done the same way as for the standard optimizer with additional options for the server configuration. From the client side the fit should look like a local run refinement and the server outputs a short information on the console (if –debug is not set).
INFO: Setting a new model
INFO: Start fit was triggered
INFO: Stop fit was triggered
It is also possible to use MPI on the server by starting it using mpiexec
or mpirun
:
mpiexec -np 32 python -m genx.server
The client optimizer settings will determine if multiprocessing or MPI will be used.
Connection settings¶
The genx_server script takes two optional arguments address
and port
. By default the sever listens only to
connections from localhost on port 3000.
You can choose to listen on any incoming network interfaces by supplying 0.0.0.0 as address
but this is
not very secure as anyone on the local network would be able to connect to this client.
The communication protocol does use a simple password authentication but communication is not encrypted so
it is adviced to keep the port open only locally and using ssh tunnel (-L option) to connect from you machine.
$ ssh -L 3000:localhost:3000 {server_with_genx}
$ mpiexec -np 32 genx_server