Getting started
This example illustrates a sample workflow using the pre-implemented dummy engine. To use it, configure the .strucscan configuration file and copy it to your home directory. You can start from the template that ships with the repository and point the absolute paths of the structure and resource repositories to the default ones in the repository.
[1]:
! cat ../.strucscan
PROJECT_PATH: "data" # corresponds to the top node of your data tree
STRUCTURES_PATH: "structures"
RESOURCE_PATH: "resources"
DEBUG: FALSE # Default: FALSE
STRUCT_FILE_FORMAT: cfg # Default: cfg
SLEEP_TIME: 45 # Default: 45
In general, you will want to adapt .strucscan to your own structure and resource directories and set up a resource directory accordingly. You do not need to worry about this for now; we will discuss it later. This example uses the Jupyter notebook interface. The second example demonstrates how to use strucscan from the command line.
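To check quickly which paths a .strucscan file points to, the file can be read into a dictionary. The following is a minimal sketch that assumes the simple `KEY: value # comment` layout shown above; the helper name `read_strucscan_config` is hypothetical, and strucscan itself may read the file differently.

```python
def read_strucscan_config(text):
    """Parse 'KEY: value  # comment' lines into a dict of strings."""
    config = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line or ":" not in line:
            continue  # skip blank or malformed lines
        key, value = line.split(":", 1)
        config[key.strip()] = value.strip().strip('"')
    return config


# The template content from the cell above, used here as sample input.
example = '''
PROJECT_PATH: "data" # corresponds to the top node of your data tree
STRUCTURES_PATH: "structures"
RESOURCE_PATH: "resources"
DEBUG: FALSE # Default: FALSE
STRUCT_FILE_FORMAT: cfg # Default: cfg
SLEEP_TIME: 45 # Default: 45
'''

config = read_strucscan_config(example)
print(config["PROJECT_PATH"], config["SLEEP_TIME"])
```

All values are kept as strings here; converting SLEEP_TIME to a number or DEBUG to a boolean is left to the caller.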
The input dictionary
All information about the calculations you want to perform is handed over to strucscan in the form of a Python dictionary. This input dictionary accepts several keys, some of which are mandatory and others optional. If optional keys are left out, strucscan falls back to default values. Each value needs to be of type string unless it is a boolean.
Mandatory keys
Let’s have a look at the general mandatory keys that every engine requires:
[2]:
from strucscan.resources.inputyaml import GENERAL
GENERAL().MANDATORY
[2]:
{'species': 'str',
'engine': 'str',
'machine': 'str',
'ncores': 'str',
'nnodes': 'str',
'queuename': 'str',
'potential': 'str',
'properties': 'str',
'prototypes': 'str',
'settings': 'str'}
As you see, this gives you an idea which value type to enter for each key. Let’s have a detailed look at the mandatory keys:
species: (str) chemical species. You can enter multiple species separated by spaces, e.g. “Ni Al”.
engine: (str) the material simulation code. This depends on the implemented interfaces. As a ‘dummy’ engine and an interface to VASP are already implemented, possible values are 'dummy' or 'VASP'.
machine: (str) name of the machine. This refers to the configurations you have made in the resource directory.
ncores: (str) number of cores per node, entered as a string, e.g. '1'.
nnodes: (str) number of nodes, entered as a string, e.g. '1'.
queuename: (str) name of the queue. When running on queuing systems, this refers to a template file that you deposit in resources/machineconfig/<machine>/<scheduler>/<queuename>.<scheduler_suffix>.
potential: (str) in the case of VASP, this refers to the exchange-correlation functional ('PBE' or 'LDA'). For VASP, please enter one potential per species. For LAMMPS or any other engine, this refers to a file that you deposit in resources/engine/<engine>/potentials.
settings: (str) this refers to a template engine settings file that you deposit in resources/engine/<engine>/settings. Strucscan checks this template and adapts tags and values if necessary.
properties: (str) material properties to calculate. You can enter multiple properties separated by spaces. The available properties are discussed below.
prototypes: (str) names of the structure files you want to analyse. Strucscan looks in resources/structures for the structure files. You can enter the full names of the structure files separated by spaces or on multiple lines. You can also enter a directory containing structures by enclosing it in '<' and '>', e.g. <bulk/fcc/>. Please make sure that every structure has a unique name.
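Before handing a dictionary to strucscan, it can help to check it against the mandatory keys yourself. The sketch below copies the key names from the GENERAL().MANDATORY output shown above and reports missing keys and non-string values. The helper `find_input_problems` is hypothetical and is not part of strucscan, which performs its own input check.

```python
# Mandatory keys as printed by GENERAL().MANDATORY above.
MANDATORY_KEYS = [
    "species", "engine", "machine", "ncores", "nnodes",
    "queuename", "potential", "properties", "prototypes", "settings",
]


def find_input_problems(input_dict):
    """Return a list of human-readable problems; empty if none are found."""
    problems = []
    for key in MANDATORY_KEYS:
        if key not in input_dict:
            problems.append(f"missing mandatory key: {key}")
        elif not isinstance(input_dict[key], str):
            problems.append(f"value of '{key}' should be a string")
    return problems


candidate = {
    "species": "Ni Al",
    "engine": "dummy",
    "machine": "dummy",
    "ncores": 1,  # wrong on purpose: should be the string "1"
    "nnodes": "1",
    "queuename": "none",
    "potential": "none",
    "properties": "static",
    "prototypes": "fcc.cfg",
    # "settings" is left out on purpose
}

problems = find_input_problems(candidate)
for problem in problems:
    print(problem)
```

The two deliberate mistakes in `candidate` are reported in the order of the mandatory key list.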
Optional keys
Next, let’s look at the optional keys.
[3]:
from strucscan.resources.inputyaml import GENERAL
GENERAL().OPTIONAL
[3]:
{'initial atvolume': 'default',
'verbose': False,
'monitor': True,
'submit': True,
'collect': True}
initial atvolume: (str) initial scaling of the structures. Enter one float per species, e.g. 10. 12., or type d or default to use the default atomic volumes deposited in strucscan.resources.atomicvolumes.py.
verbose: (bool) toggles command line output.
monitor: (bool) if true, strucscan will check the status of each job.
submit: (bool) if true, strucscan will submit the job.
collect: (bool) if true, strucscan will collect the data of the job.
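The fallback to default values can be pictured as a dictionary merge in which user-supplied keys take precedence. This is only an illustration of the idea, using the defaults printed by GENERAL().OPTIONAL above; the helper `with_defaults` is hypothetical and does not reflect how strucscan implements the fallback internally.

```python
# Defaults as printed by GENERAL().OPTIONAL above.
OPTIONAL_DEFAULTS = {
    "initial atvolume": "default",
    "verbose": False,
    "monitor": True,
    "submit": True,
    "collect": True,
}


def with_defaults(input_dict):
    """Return a new dict in which missing optional keys take their defaults."""
    completed = dict(OPTIONAL_DEFAULTS)  # start from a copy of the defaults
    completed.update(input_dict)         # user-supplied values win
    return completed


user_input = {"species": "Al", "verbose": True, "submit": False}
completed = with_defaults(user_input)

for key in OPTIONAL_DEFAULTS:
    print(f"{key}: {completed[key]}")
```

Because the merge works on a copy, the original `user_input` stays unchanged.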
Properties
Now let’s see which properties are available.
[4]:
from strucscan.resources.properties import *
Strucscan distinguishes between properties that require a condition and properties that run without any prerequisites. For example, calculating the energy of a structure or optimizing the structure in some way requires no condition. These are the tasks static (calculate the energy only), atomic (optimize the internal degrees of freedom of the structure), and total (fully optimize the structure).
[5]:
OPTIMIZATIONS
[5]:
['static', 'atomic', 'total']
An example of a task requiring a condition is an energy-volume calculation. Usually, you pre-process the structure before you create the strained images. The task eos therefore belongs to the advanced tasks. Advanced tasks and their conditions can be defined in properties.yaml in strucscan.resources, which is read by the properties module and stored in properties_conifg_dict:
[6]:
from pprint import pprint
pprint(properties_conifg_dict)
{'default_option': 'atomic', 'dos': 'total', 'eos': 'static, atomic, total'}
The dictionary is built so that each key is the name of an advanced task and its value lists the conditions for that task. The key default_option can be configured in properties.yaml and is used whenever no condition is specified for an advanced task. If default_option is not configured, strucscan sets it to the default value, which is atomic. You can update the dictionary as follows:
[7]:
properties_conifg_dict.update({"default_option": "total"})
pprint(properties_conifg_dict)
{'default_option': 'total', 'dos': 'total', 'eos': 'static, atomic, total'}
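To make the role of default_option concrete, the sketch below shows one way a property name could be expanded into a full task name, so that eos becomes eos_total when the default is total. The function `resolve_property`, the explicit `<task>_<condition>` spelling, and the check against the listed conditions are assumptions made for illustration; strucscan's own resolution logic may differ.

```python
OPTIMIZATIONS = ["static", "atomic", "total"]

# Same content as the dictionary printed above (after the update).
properties_config = {
    "default_option": "total",
    "dos": "total",
    "eos": "static, atomic, total",
}


def resolve_property(name, config):
    """Expand a property name to '<task>_<condition>', e.g. 'eos' -> 'eos_total'."""
    if name in OPTIMIZATIONS:
        return name  # plain optimizations need no condition
    task, _, condition = name.partition("_")
    if task == "default_option" or task not in config:
        raise ValueError(f"unknown property: {name}")
    allowed = [c.strip() for c in config[task].split(",")]
    if not condition:
        condition = config["default_option"]  # no condition given: use the default
    if condition not in allowed:
        raise ValueError(f"'{condition}' is not a listed condition for '{task}'")
    return f"{task}_{condition}"


resolved = [resolve_property(p, properties_config)
            for p in "static atomic total eos".split()]
print(" ".join(resolved))
```

With default_option set to total, the plain optimizations pass through unchanged and only the advanced task receives a suffix.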
Input example: VASP
Now that we have an idea of what the input dictionary looks like, let’s make it a little clearer by looking at an example for VASP.
[8]:
from strucscan.resources.inputyaml import VASP
VASP().EXAMPLE
[8]:
{'species': 'Ni Al_pv',
'engine': 'VASP 5.4',
'machine': 'example_vasp',
'ncores': '1',
'nnodes': '1',
'queuename': 'none',
'potential': 'PBE',
'properties': 'atomic',
'prototypes': 'L1_2',
'settings': '500_SP.incar',
'magnetic configuration': 'SP',
'initial magnetic moments': '2.0 0.',
'kdens': '0.15',
'kmesh': 'Monkhorst-pack',
'initial atvolume': 'default',
'verbose': False,
'monitor': True,
'submit': True,
'collect': False}
You see that VASP adds keys for the magnetic configuration and the initial magnetic moments as well as for the k-point distribution. Looking at the mandatory keys that VASP requires, we see that the magnetism keys are mandatory, as VASP is sensitive to them. The keys for the k-point distribution are optional.
[9]:
list(VASP().MANDATORY.keys())
[9]:
['species',
'engine',
'machine',
'ncores',
'nnodes',
'queuename',
'potential',
'properties',
'prototypes',
'settings',
'magnetic configuration',
'initial magnetic moments']
[10]:
list(VASP().OPTIONAL.keys())
[10]:
['initial atvolume',
'verbose',
'monitor',
'submit',
'collect',
'kdens',
'kmesh',
'k points file']
Input example: dummy
As VASP requires an available installation, let’s move on to the pre-implemented ‘dummy’ engine to get started with strucscan. Instead of calculating any energies, it only copies the initial structure file, structure.cfg, to the final file, final.cfg, writes a short log file, and then waits briefly (sleep 1 in the configuration shown below). You can configure this command in the machine configuration file:
[11]:
import yaml
with open("../resources/machineconfig/dummy/config.yaml", "r") as stream:
    config = yaml.safe_load(stream)
pprint(config)
{'DUMMY': {'parallel': 'cp structure.cfg final.cfg | echo "This is a dummy log '
'file." > log.out | sleep 1\n',
'serial': 'cp structure.cfg final.cfg | echo "This is a dummy log '
'file." > log.out | sleep 1\n'},
'scheduler': 'noqueue',
'smallest queue': None}
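For readers less familiar with the shell, the following sketch reproduces in Python what the dummy command is meant to do: copy structure.cfg to final.cfg, write a log file, and wait. The function `run_dummy_job` is hypothetical and only illustrates the effect of the command. Note that the configured command joins the three steps with pipes, whereas the sketch simply runs them one after another.

```python
import shutil
import tempfile
import time
from pathlib import Path


def run_dummy_job(job_dir, sleep_seconds=1):
    """Mimic the dummy command: copy the structure, write a log, wait."""
    job_dir = Path(job_dir)
    shutil.copy(job_dir / "structure.cfg", job_dir / "final.cfg")
    (job_dir / "log.out").write_text("This is a dummy log file.\n")
    time.sleep(sleep_seconds)


# Try it out in a throwaway directory with a placeholder structure file.
with tempfile.TemporaryDirectory() as tmp:
    job_dir = Path(tmp)
    (job_dir / "structure.cfg").write_text("placeholder structure\n")
    run_dummy_job(job_dir, sleep_seconds=0)
    produced = sorted(p.name for p in job_dir.iterdir())
    final_text = (job_dir / "final.cfg").read_text()

print(produced)
```

After the call, the job directory contains the untouched input, its copy, and the log file.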
The input dictionary for our dummy:
[12]:
from strucscan.resources.inputyaml import DUMMY
DUMMY().EXAMPLE
[12]:
{'species': 'Al',
'engine': 'dummy',
'machine': 'dummy',
'initial atvolume': 'default',
'ncores': '1',
'nnodes': '1',
'queuename': 'none',
'properties': 'static atomic total eos',
'prototypes': 'fcc.cfg',
'potential': 'none',
'settings': 'none'}
Settings and potential are set to 'none'. The ‘dummy’ engine does not require a potential or settings file and will not look for them in the resource directory.
[13]:
list(DUMMY().MANDATORY)
[13]:
['species',
'engine',
'machine',
'ncores',
'nnodes',
'queuename',
'potential',
'properties',
'prototypes',
'settings']
[14]:
list(DUMMY().OPTIONAL)
[14]:
['initial atvolume', 'verbose', 'monitor', 'submit', 'collect']
Let’s use this input example and set verbose to True, so that we get a little more insight into what strucscan is doing.
[15]:
input_dict = DUMMY().EXAMPLE
input_dict.update({'verbose': True})
input_dict
[15]:
{'species': 'Al',
'engine': 'dummy',
'machine': 'dummy',
'initial atvolume': 'default',
'ncores': '1',
'nnodes': '1',
'queuename': 'none',
'properties': 'static atomic total eos',
'prototypes': 'fcc.cfg',
'potential': 'none',
'settings': 'none',
'verbose': True}
The JobManager
Once the input dictionary is set up properly, you can hand it over to the JobManager, which is the main class of strucscan. Initializing the JobManager with your input starts the process. Using the ‘dummy’ example, the process includes the following steps:
checking your input: strucscan checks your input for mandatory and optional keys. If you left out any optional key, strucscan falls back to the default value.
initializing the job list: strucscan creates a list of all jobs, assembled from a loop over the prototypes and a loop over the properties you entered.
updating the job list: after initialization, strucscan updates each job in the list by checking its status. If a job does not exist yet, strucscan creates it and, if submit is true, submits it. If a job exists, strucscan checks it for possible errors and attempts troubleshooting where possible. If the job is already running, strucscan leaves the job files unchanged.
monitoring: if monitor is true, strucscan repeats the update process until each job in the list is finished or has been restarted the maximum number of three times. If collect is true, strucscan collects the data from the data tree after each cycle.
exiting: if collect is true, strucscan collects the data from the data tree one more time.
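The job list initialization can be sketched as two nested loops, one over prototypes and one over properties. The job path pattern `<ENGINE>/<species>/<property>__<prototype>__<species>` used here is inferred from the output of the dummy run in this example and covers only the single-species case; the function `build_job_list` is hypothetical and is not the JobManager's actual implementation.

```python
from itertools import product


def build_job_list(engine, species, prototypes, properties):
    """Combine every prototype with every property into one job path each."""
    jobs = []
    for prototype, prop in product(prototypes.split(), properties.split()):
        stem = prototype.rsplit(".", 1)[0]  # "fcc.cfg" -> "fcc"
        jobs.append(f"{engine.upper()}/{species}/{prop}__{stem}__{species}")
    return jobs


jobs = build_job_list(
    engine="dummy",
    species="Al",
    prototypes="fcc.cfg",
    properties="static atomic total eos_total",
)

for index, job in enumerate(jobs):
    print(f"{index}: {job}")
```

One prototype and four properties give four jobs. The order in which strucscan lists them may differ from the order produced by this sketch.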
[16]:
from strucscan.core.jobmanager import JobManager
JobManager(input_dict)
Data tree path: /home/users/pietki8q/git/strucscan-master/data
Structure repository: /home/users/pietki8q/git/strucscan-master/structures
Resource repository: /home/users/pietki8q/git/strucscan-master/resources
Optional key 'monitor' not provided. Default value will be used: True
Optional key 'submit' not provided. Default value will be used: True
Optional key 'collect' not provided. Default value will be used: True
key: : your input what strucscan reads
----------------------------------------------------------------------------------------------------
species : Al Al
engine : dummy dummy
machine : dummy dummy
initial atvolume : default default
ncores : 1 1
nnodes : 1 1
queuename : none none
properties : static atomic total eos static atomic total eos_total
prototypes : fcc.cfg fcc.cfg
potential : none none
settings : none none
verbose : True True
monitor : (not set) True
submit : (not set) True
collect : (not set) True
>> Initializing:
Initialized Al static
Initialized Al static
Initialized Al atomic
Initialized Al total
Initialized Al eos_total
4 jobs in JobList:
------------------------------------------------------------------------------------------------------------------
#: jobpath prototype path
------------------------------------------------------------------------------------------------------------------
0: DUMMY/Al/static__fcc__Al unaries/bulk/fcc.cfg
1: DUMMY/Al/eos_total__fcc__Al DUMMY/Al/total__fcc__Al/final.cfg
2: DUMMY/Al/total__fcc__Al DUMMY/Al/atomic__fcc__Al/final.cfg
3: DUMMY/Al/atomic__fcc__Al DUMMY/Al/static__fcc__Al/final.cfg
#: jobpath id status start end
------------------------------------------------------------------------------------------------------------------
0 DUMMY/Al/static__fcc__Al None does not exist
1 DUMMY/Al/eos_total__fcc__Al None does not exist
2 DUMMY/Al/total__fcc__Al None does not exist
3 DUMMY/Al/atomic__fcc__Al None does not exist
>> Entering loop:
Submitted: static__fcc__Al
Submitted: atomic__fcc__Al
#: jobpath id status start end
------------------------------------------------------------------------------------------------------------------
0 DUMMY/Al/static__fcc__Al None finished 06/22/2022 09:57
1 DUMMY/Al/eos_total__fcc__Al None does not exist
2 DUMMY/Al/total__fcc__Al None does not exist
3 DUMMY/Al/atomic__fcc__Al None finished 06/22/2022 09:57
Submitted: total__fcc__Al
#: jobpath id status start end
------------------------------------------------------------------------------------------------------------------
0 DUMMY/Al/static__fcc__Al None finished 06/22/2022 09:57
1 DUMMY/Al/eos_total__fcc__Al None does not exist
2 DUMMY/Al/total__fcc__Al None finished 06/22/2022 09:57
3 DUMMY/Al/atomic__fcc__Al None finished 06/22/2022 09:57
Submitted: eos_total__fcc__Al
#: jobpath id status start end
------------------------------------------------------------------------------------------------------------------
0 DUMMY/Al/static__fcc__Al None finished 06/22/2022 09:57
1 DUMMY/Al/eos_total__fcc__Al None finished 06/22/2022 09:57
2 DUMMY/Al/total__fcc__Al None finished 06/22/2022 09:57
3 DUMMY/Al/atomic__fcc__Al None finished 06/22/2022 09:57
Finished.
[16]:
<strucscan.core.jobmanager.JobManager at 0x7f106ae135f8>