Getting started

This example illustrates a sample workflow using the pre-implemented dummy engine. To use it, configure the .strucscan configuration file and copy it to your home directory. You can start from the template that ships with the repository and point the absolute paths for the structure and resource repositories at the default directories in the repository.

[1]:

! cat ../.strucscan

PROJECT_PATH: "data"    # corresponds to the top node of your data tree
STRUCTURES_PATH: "structures"
RESOURCE_PATH: "resources"

DEBUG: FALSE                # Default: FALSE
STRUCT_FILE_FORMAT: cfg     # Default: cfg
SLEEP_TIME: 45              # Default: 45

In general, you will want to adapt .strucscan to your own structure and resource directories and set up the resource directory accordingly. You do not need to worry about that for now; we will discuss it later. This example uses the Jupyter notebook interface. The second example demonstrates how to use strucscan from the command line.
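Before running anything, it can help to confirm that the three paths in .strucscan point to existing directories. The following is a minimal sketch of such a check. It uses a small hand-written parser for the `KEY: value  # comment` lines shown above; the function name `read_strucscan_config` is hypothetical, and this is not how strucscan itself reads the file.

```python
from pathlib import Path


def read_strucscan_config(text):
    """Parse simple 'KEY: value  # comment' lines into a dict of strings."""
    config = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line or ":" not in line:
            continue
        key, value = line.split(":", 1)
        config[key.strip()] = value.strip().strip('"')
    return config


# Content copied from the template shown above.
EXAMPLE = '''
PROJECT_PATH: "data"    # corresponds to the top node of your data tree
STRUCTURES_PATH: "structures"
RESOURCE_PATH: "resources"

DEBUG: FALSE                # Default: FALSE
STRUCT_FILE_FORMAT: cfg     # Default: cfg
SLEEP_TIME: 45              # Default: 45
'''

config = read_strucscan_config(EXAMPLE)
for key in ("PROJECT_PATH", "STRUCTURES_PATH", "RESOURCE_PATH"):
    path = Path(config[key]).expanduser()
    print(f"{key}: {path} (exists: {path.exists()})")
```

To check your own file, pass the text of the .strucscan file in your home directory instead of `EXAMPLE`.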

The `input` dictionary

All information about the calculations you want to perform is handed over to strucscan in the form of a Python dictionary. This input dictionary accepts several keys, some mandatory and others optional. If optional keys are left out, strucscan falls back to default values. Each value needs to be of type string unless it is a boolean. Let’s have a look at the general mandatory keys that every engine requires:

Mandatory keys

[2]:

from strucscan.resources.inputyaml import GENERAL

GENERAL().MANDATORY

[2]:

{'species': 'str',
 'engine': 'str',
 'machine': 'str',
 'ncores': 'str',
 'nnodes': 'str',
 'queuename': 'str',
 'potential': 'str',
 'properties': 'str',
 'prototypes': 'str',
 'settings': 'str'}

As you can see, this shows which value type to enter for each key. Let’s have a detailed look at the mandatory keys:

species: (str) chemical species. You can enter multiple species separated by spaces, e.g. “Ni Al”.
engine: (str) the material simulation code. This depends on the implemented interfaces. As a ‘dummy’ engine and an interface to VASP are already implemented, possible values are 'dummy' or 'VASP'.
machine: (str) name of the machine. This refers to the configuration you have set up in the resource directory.
ncores: (str) number of cores per node, entered as a string, e.g. '1'.
nnodes: (str) number of nodes, entered as a string, e.g. '1'.
queuename: (str) name of the queue. When running on queuing systems, this refers to a template file that you place in resources/machineconfig/<machine>/<scheduler>/<queuename>.<scheduler_suffix>.
potential: (str) in case of VASP, this refers to the exchange-correlation functional ('PBE' or 'LDA'). For VASP, please enter one potential per species. For LAMMPS or any other engine, this refers to a file that you place in resources/engine/<engine>/potentials.
settings: (str) this refers to a template engine settings file that you place in resources/engine/<engine>/settings. Strucscan will check this template and adapt tags and values if necessary.
properties: (str) material properties to calculate. You can enter multiple properties separated by spaces. We will discuss the available properties below.
prototypes: (str) names of the structure files you want to analyse. Strucscan will look in resources/structures for the structure files. You can enter the full names of the structure files separated by spaces or on multiple lines. You can also enter a directory containing structures by enclosing it in '<' and '>', e.g. <bulk/fcc/>. Please make sure that every structure has a unique name.
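To catch typos before handing a dictionary to strucscan, you could run a quick check of your own. The sketch below is an illustration only: the key names are copied from the `GENERAL().MANDATORY` output above, while the helper `check_mandatory` is hypothetical and is not part of strucscan.

```python
# Key names copied from the GENERAL().MANDATORY output shown above.
MANDATORY_KEYS = [
    "species", "engine", "machine", "ncores", "nnodes",
    "queuename", "potential", "properties", "prototypes", "settings",
]


def check_mandatory(input_dict):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for key in MANDATORY_KEYS:
        if key not in input_dict:
            problems.append(f"missing mandatory key: {key!r}")
        elif not isinstance(input_dict[key], str):
            kind = type(input_dict[key]).__name__
            problems.append(f"{key!r} should be a string, got {kind}")
    return problems


candidate = {
    "species": "Ni Al",
    "engine": "dummy",
    "machine": "dummy",
    "ncores": 1,              # wrong on purpose: should be the string "1"
    "nnodes": "1",
    "queuename": "none",
    "properties": "static",
    "prototypes": "fcc.cfg",
    "settings": "none",
}                             # 'potential' is left out on purpose

for problem in check_mandatory(candidate):
    print(problem)
```

Returning a list of problems instead of raising on the first one lets you fix several mistakes in a single pass.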

Optional keys

Next, let’s look at the optional keys.

[3]:

from strucscan.resources.inputyaml import GENERAL
GENERAL().OPTIONAL

[3]:

{'initial atvolume': 'default',
 'verbose': False,
 'monitor': True,
 'submit': True,
 'collect': True}

initial atvolume: (str) initial scaling of the structures. Enter one float per species, e.g. 10. 12., or type d or default to use the default atomic volumes stored in strucscan.resources.atomicvolumes.py.
verbose: (bool) toggles command line output.
monitor: (bool) if true, strucscan will check the status of each job.
submit: (bool) if true, strucscan will submit the job.
collect: (bool) if true, strucscan will collect the data of the job from the data tree.
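The fallback to default values can be pictured as a dictionary merge. The sketch below assumes the defaults from the `GENERAL().OPTIONAL` output above; the helpers `with_defaults` and `parse_atvolume` are hypothetical names used for illustration and may differ from what strucscan does internally.

```python
# Defaults copied from the GENERAL().OPTIONAL output shown above.
OPTIONAL_DEFAULTS = {
    "initial atvolume": "default",
    "verbose": False,
    "monitor": True,
    "submit": True,
    "collect": True,
}


def with_defaults(input_dict):
    """Return a copy of input_dict with missing optional keys filled in."""
    completed = dict(OPTIONAL_DEFAULTS)
    completed.update(input_dict)  # user-supplied values win over defaults
    return completed


def parse_atvolume(value, species):
    """Map 'initial atvolume' to per-species floats, or None for defaults."""
    if value in ("d", "default"):
        return None
    names = species.split()
    volumes = [float(v) for v in value.split()]
    if len(volumes) != len(names):
        raise ValueError("enter one volume per species")
    return dict(zip(names, volumes))


user_input = {"species": "Ni Al", "initial atvolume": "10. 12.", "verbose": True}
completed = with_defaults(user_input)
print(completed["monitor"])   # filled in from the defaults
print(parse_atvolume(completed["initial atvolume"], completed["species"]))
```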

Properties

Now let’s see which properties are available.

[4]:

from strucscan.resources.properties import *

Strucscan distinguishes between properties that require a condition and properties that run without any prerequisites. For example, calculating the energy of a structure or optimizing the structure requires no condition. These are the tasks static (calculate the energy only), atomic (optimize the internal degrees of freedom of the structure), and total (fully optimize the structure).

[5]:

OPTIMIZATIONS

[5]:

['static', 'atomic', 'total']

An example of a task that requires a condition is an energy-volume calculation. Usually, you pre-process the structure before you create the strained images. The task eos therefore belongs to the advanced tasks. Advanced tasks and their conditions are defined in properties.yaml in strucscan.resources, which is read by the properties module and stored in properties_conifg_dict:

[6]:

from pprint import pprint
pprint(properties_conifg_dict)

{'default_option': 'atomic', 'dos': 'total', 'eos': 'static, atomic, total'}

The dictionary is built so that each key is the name of an advanced task and its value lists the conditions the task accepts. The key default_option can be configured in properties.yaml and is used whenever no condition is specified for an advanced task. If default_option is not configured, strucscan sets it to the default value, atomic. You can update the dictionary as follows:

[7]:

properties_conifg_dict.update({"default_option": "total"})
pprint(properties_conifg_dict)

{'default_option': 'total', 'dos': 'total', 'eos': 'static, atomic, total'}
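To make the role of default_option concrete, here is a minimal sketch of how a property name without a condition could be expanded. The `<task>_<condition>` naming is inferred from the name eos_total that appears in the JobManager output further down; the function `resolve_property`, the acceptance of an explicit `eos_static` style name, and the error handling are assumptions, not the strucscan implementation.

```python
OPTIMIZATIONS = ["static", "atomic", "total"]

# Copied from the pprint output above, after default_option was set to "total".
properties_config = {
    "default_option": "total",
    "dos": "total",
    "eos": "static, atomic, total",
}


def resolve_property(name, config):
    """Turn a property name into '<task>_<condition>' where one is needed."""
    if name in OPTIMIZATIONS:
        return name  # basic tasks need no condition
    task, _, condition = name.partition("_")
    if task == "default_option" or task not in config:
        raise ValueError(f"unknown property: {name!r}")
    allowed = [c.strip() for c in config[task].split(",")]
    if not condition:
        condition = config["default_option"]  # no condition given by the user
    if condition not in allowed:
        # Assumption: reject conditions that the task does not list.
        raise ValueError(f"{task!r} does not accept condition {condition!r}")
    return f"{task}_{condition}"


requested = "static atomic total eos"
resolved = [resolve_property(p, properties_config) for p in requested.split()]
print(" ".join(resolved))  # prints: static atomic total eos_total
```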

Input example: VASP

Now that we have an idea of what the input dictionary looks like, let’s make it a little clearer by looking at an example for VASP.

[8]:

from strucscan.resources.inputyaml import VASP
VASP().EXAMPLE

[8]:

{'species': 'Ni Al_pv',
 'engine': 'VASP 5.4',
 'machine': 'example_vasp',
 'ncores': '1',
 'nnodes': '1',
 'queuename': 'none',
 'potential': 'PBE',
 'properties': 'atomic',
 'prototypes': 'L1_2',
 'settings': '500_SP.incar',
 'magnetic configuration': 'SP',
 'initial magnetic moments': '2.0 0.',
 'kdens': '0.15',
 'kmesh': 'Monkhorst-pack',
 'initial atvolume': 'default',
 'verbose': False,
 'monitor': True,
 'submit': True,
 'collect': False}

You can see that VASP accepts keys that define the magnetic moments and configuration as well as the k-point distribution. Looking at the mandatory keys that VASP requires, we see that the magnetism keys are mandatory, as VASP is sensitive to them. The k-point keys are optional.

[9]:

list(VASP().MANDATORY.keys())

[9]:

['species',
 'engine',
 'machine',
 'ncores',
 'nnodes',
 'queuename',
 'potential',
 'properties',
 'prototypes',
 'settings',
 'magnetic configuration',
 'initial magnetic moments']

[10]:

list(VASP().OPTIONAL.keys())

[10]:

['initial atvolume',
 'verbose',
 'monitor',
 'submit',
 'collect',
 'kdens',
 'kmesh',
 'k points file']
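A quick way to see what VASP adds on top of the general keys is to compare the lists directly. The sketch below works on copies of the key lists printed above, so it runs without strucscan installed; the helper `extra_keys` is a hypothetical name.

```python
# Key lists copied from the outputs shown above.
GENERAL_MANDATORY = [
    "species", "engine", "machine", "ncores", "nnodes",
    "queuename", "potential", "properties", "prototypes", "settings",
]
GENERAL_OPTIONAL = ["initial atvolume", "verbose", "monitor", "submit", "collect"]

VASP_MANDATORY = [
    "species", "engine", "machine", "ncores", "nnodes",
    "queuename", "potential", "properties", "prototypes", "settings",
    "magnetic configuration", "initial magnetic moments",
]
VASP_OPTIONAL = [
    "initial atvolume", "verbose", "monitor", "submit", "collect",
    "kdens", "kmesh", "k points file",
]


def extra_keys(engine_keys, general_keys):
    """Keys an engine defines in addition to the general ones, in order."""
    return [key for key in engine_keys if key not in general_keys]


print(extra_keys(VASP_MANDATORY, GENERAL_MANDATORY))
print(extra_keys(VASP_OPTIONAL, GENERAL_OPTIONAL))
```

With strucscan installed, the same comparison should work on `list(VASP().MANDATORY)` and `list(GENERAL().MANDATORY)` instead of the copied lists.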

Input example: dummy

As VASP requires an available installation, let’s move on to the pre-implemented ‘dummy’ engine to get started with strucscan. Instead of calculating any energies, it only copies the initial structure file, structure.cfg, to the final file, final.cfg, writes a short log file, and waits briefly (sleep 1 in the configuration shown below). You can configure this command in the machine configuration file:

[11]:

import yaml
with open("../resources/machineconfig/dummy/config.yaml", "r") as stream:
    config = yaml.safe_load(stream)
pprint(config)

{'DUMMY': {'parallel': 'cp structure.cfg final.cfg | echo "This is a dummy log '
                       'file." > log.out | sleep 1\n',
           'serial': 'cp structure.cfg final.cfg | echo "This is a dummy log '
                     'file." > log.out | sleep 1\n'},
 'scheduler': 'noqueue',
 'smallest queue': None}
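For readers who prefer Python to shell, the effect of the dummy command can be reproduced as follows. This is only an illustration of what the command leaves behind in a job directory: the function `run_dummy_job` is hypothetical, it runs the three steps one after another instead of through a shell pipeline, and it works in a temporary directory instead of the strucscan data tree.

```python
import shutil
import tempfile
import time
from pathlib import Path


def run_dummy_job(job_dir, wait_seconds=1.0):
    """Copy structure.cfg to final.cfg, write log.out, then wait."""
    job_dir = Path(job_dir)
    shutil.copy(job_dir / "structure.cfg", job_dir / "final.cfg")
    (job_dir / "log.out").write_text("This is a dummy log file.\n")
    time.sleep(wait_seconds)


with tempfile.TemporaryDirectory() as tmp:
    job_dir = Path(tmp)
    # Placeholder content; a real job directory holds an actual structure file.
    (job_dir / "structure.cfg").write_text("placeholder structure\n")
    run_dummy_job(job_dir, wait_seconds=0.1)
    produced = sorted(p.name for p in job_dir.iterdir())
    print(produced)  # prints: ['final.cfg', 'log.out', 'structure.cfg']
```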

This machine configuration is set up for machines without a queuing system, so you can test strucscan with the ‘dummy’ engine right away on your local machine without any further prerequisites.

Let’s check the example input dictionary for our dummy:

[12]:

from strucscan.resources.inputyaml import DUMMY
DUMMY().EXAMPLE

[12]:

{'species': 'Al',
 'engine': 'dummy',
 'machine': 'dummy',
 'initial atvolume': 'default',
 'ncores': '1',
 'nnodes': '1',
 'queuename': 'none',
 'properties': 'static atomic total eos',
 'prototypes': 'fcc.cfg',
 'potential': 'none',
 'settings': 'none'}

Settings and potential are set to 'none': the ‘dummy’ engine does not require a potential or settings file and will not look for one in the resource directory.

[13]:

list(DUMMY().MANDATORY)

[13]:

['species',
 'engine',
 'machine',
 'ncores',
 'nnodes',
 'queuename',
 'potential',
 'properties',
 'prototypes',
 'settings']

[14]:

list(DUMMY().OPTIONAL)

[14]:

['initial atvolume', 'verbose', 'monitor', 'submit', 'collect']

Let’s use this input example and enable verbose, so we get a little more insight into what strucscan is doing.

[15]:

input_dict = DUMMY().EXAMPLE
input_dict.update({'verbose': True})
input_dict

[15]:

{'species': 'Al',
 'engine': 'dummy',
 'machine': 'dummy',
 'initial atvolume': 'default',
 'ncores': '1',
 'nnodes': '1',
 'queuename': 'none',
 'properties': 'static atomic total eos',
 'prototypes': 'fcc.cfg',
 'potential': 'none',
 'settings': 'none',
 'verbose': True}

The JobManager

Once the input dictionary is set up properly, you can hand it over to the JobManager which is the main class of strucscan. Initialising the JobManager with your input will start the process. Using the ‘dummy’ example, the process will include the following steps:

checking your input: strucscan checks your input for mandatory and optional keys. If you left out an optional key, strucscan falls back to the default value.
initializing the job list: strucscan creates a list of all jobs, assembled from a loop over the prototypes and a loop over the properties you entered.
updating the job list: after initialization, strucscan updates each job in the list by checking its status. If a job does not exist yet, strucscan creates it and, if submit is true, submits it. If a job exists, strucscan checks it for errors and attempts troubleshooting where possible. If the job is already running, strucscan leaves the job files unchanged.
monitoring: if monitor is true, strucscan repeats the update process until each job in the list has finished or has been restarted the maximum of three times. If collect is true, strucscan collects the data from the data tree after each cycle.
exiting: if collect is true, strucscan collects the data from the data tree one more time.
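The steps above can be sketched in a few lines of plain Python. This is a simplified model, not the JobManager itself. The job path pattern `<ENGINE>/<species>/<property>__<prototype>__<species>` is inferred from the output of the dummy run below and is only shown for a single species; `build_job_list`, `monitor`, and the `run_job` callback are hypothetical names, and the waiting time between cycles is left out.

```python
from itertools import product


def build_job_list(input_dict):
    """One job per (prototype, property) pair, named like the dummy output."""
    engine = input_dict["engine"].upper()
    species = input_dict["species"]  # sketch handles a single species only
    jobs = []
    for prototype, prop in product(input_dict["prototypes"].split(),
                                   input_dict["properties"].split()):
        name = prototype.rsplit(".", 1)[0]  # 'fcc.cfg' -> 'fcc'
        jobs.append(f"{engine}/{species}/{prop}__{name}__{species}")
    return jobs


def monitor(jobs, run_job, max_restarts=3):
    """Repeat update cycles until every job is finished or out of restarts."""
    status = {job: "does not exist" for job in jobs}
    restarts = {job: 0 for job in jobs}
    while True:
        progressed = False
        for job in jobs:
            if status[job] == "finished":
                continue
            if status[job] == "error":
                if restarts[job] == max_restarts:
                    continue  # give up on this job
                restarts[job] += 1
            # run_job stands in for "create, submit and check the job".
            status[job] = "finished" if run_job(job) else "error"
            progressed = True
        if not progressed:
            return status


input_dict = {
    "engine": "dummy",
    "species": "Al",
    "prototypes": "fcc.cfg",
    "properties": "static atomic total eos_total",
}
jobs = build_job_list(input_dict)
final_status = monitor(jobs, run_job=lambda job: True)
for job in jobs:
    print(job, final_status[job])
```

A job that keeps failing is attempted four times in this model: the first submission plus three restarts.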

[16]:

from strucscan.core.jobmanager import JobManager

JobManager(input_dict)

Data tree path:                /home/users/pietki8q/git/strucscan-master/data
Structure repository:          /home/users/pietki8q/git/strucscan-master/structures
Resource repository:           /home/users/pietki8q/git/strucscan-master/resources

Optional key 'monitor' not provided. Default value will be used: True
Optional key 'submit' not provided. Default value will be used: True
Optional key 'collect' not provided. Default value will be used: True


key:                           : your input                                         what strucscan reads
----------------------------------------------------------------------------------------------------
species                        : Al                                                 Al
engine                         : dummy                                              dummy
machine                        : dummy                                              dummy
initial atvolume               : default                                            default
ncores                         : 1                                                  1
nnodes                         : 1                                                  1
queuename                      : none                                               none
properties                     : static atomic total eos                            static atomic total eos_total
prototypes                     : fcc.cfg                                            fcc.cfg
potential                      : none                                               none
settings                       : none                                               none
verbose                        : True                                               True
monitor                        : (not set)                                          True
submit                         : (not set)                                          True
collect                        : (not set)                                          True

>> Initializing:
Initialized  Al static
Initialized  Al static
Initialized  Al atomic
Initialized  Al total
Initialized  Al eos_total

4 jobs in JobList:
------------------------------------------------------------------------------------------------------------------
  #: jobpath                                                       prototype path
------------------------------------------------------------------------------------------------------------------
  0: DUMMY/Al/static__fcc__Al                                      unaries/bulk/fcc.cfg
  1: DUMMY/Al/eos_total__fcc__Al                                   DUMMY/Al/total__fcc__Al/final.cfg
  2: DUMMY/Al/total__fcc__Al                                       DUMMY/Al/atomic__fcc__Al/final.cfg
  3: DUMMY/Al/atomic__fcc__Al                                      DUMMY/Al/static__fcc__Al/final.cfg

  #: jobpath                                                      id       status   start                end
------------------------------------------------------------------------------------------------------------------
  0 DUMMY/Al/static__fcc__Al                                     None     does not exist
  1 DUMMY/Al/eos_total__fcc__Al                                  None     does not exist
  2 DUMMY/Al/total__fcc__Al                                      None     does not exist
  3 DUMMY/Al/atomic__fcc__Al                                     None     does not exist


>> Entering loop:
Submitted: static__fcc__Al
Submitted: atomic__fcc__Al
  #: jobpath                                                      id       status   start                end
------------------------------------------------------------------------------------------------------------------
  0 DUMMY/Al/static__fcc__Al                                     None     finished                      06/22/2022 09:57

  1 DUMMY/Al/eos_total__fcc__Al                                  None     does not exist

  2 DUMMY/Al/total__fcc__Al                                      None     does not exist

  3 DUMMY/Al/atomic__fcc__Al                                     None     finished                      06/22/2022 09:57


Submitted: total__fcc__Al
  #: jobpath                                                      id       status   start                end
------------------------------------------------------------------------------------------------------------------
  0 DUMMY/Al/static__fcc__Al                                     None     finished                      06/22/2022 09:57

  1 DUMMY/Al/eos_total__fcc__Al                                  None     does not exist

  2 DUMMY/Al/total__fcc__Al                                      None     finished                      06/22/2022 09:57

  3 DUMMY/Al/atomic__fcc__Al                                     None     finished                      06/22/2022 09:57


Submitted: eos_total__fcc__Al
  #: jobpath                                                      id       status   start                end
------------------------------------------------------------------------------------------------------------------
  0 DUMMY/Al/static__fcc__Al                                     None     finished                      06/22/2022 09:57

  1 DUMMY/Al/eos_total__fcc__Al                                  None     finished                      06/22/2022 09:57

  2 DUMMY/Al/total__fcc__Al                                      None     finished                      06/22/2022 09:57

  3 DUMMY/Al/atomic__fcc__Al                                     None     finished                      06/22/2022 09:57



Finished.

[16]:

<strucscan.core.jobmanager.JobManager at 0x7f106ae135f8>
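The job list in the output shows that each job reads its structure from the final.cfg of another job, which explains the order in which jobs are submitted. The sketch below derives that order from the table. The mapping is copied from the output above; the functions `prerequisite` and `submission_order` are hypothetical and only illustrate the dependency chain, not how strucscan schedules jobs.

```python
# Job path -> prototype path, copied from the JobList table above.
JOB_INPUTS = {
    "DUMMY/Al/static__fcc__Al": "unaries/bulk/fcc.cfg",
    "DUMMY/Al/eos_total__fcc__Al": "DUMMY/Al/total__fcc__Al/final.cfg",
    "DUMMY/Al/total__fcc__Al": "DUMMY/Al/atomic__fcc__Al/final.cfg",
    "DUMMY/Al/atomic__fcc__Al": "DUMMY/Al/static__fcc__Al/final.cfg",
}


def prerequisite(job):
    """Return the job whose final.cfg this job starts from, if any."""
    parent = JOB_INPUTS[job].rsplit("/", 1)[0]
    return parent if parent in JOB_INPUTS else None


def submission_order(jobs):
    """Order jobs so that every prerequisite comes before its dependents."""
    ordered = []
    remaining = list(jobs)
    while remaining:
        ready = [job for job in remaining
                 if prerequisite(job) is None or prerequisite(job) in ordered]
        if not ready:
            raise ValueError("circular dependency between jobs")
        ordered.extend(ready)
        remaining = [job for job in remaining if job not in ready]
    return ordered


order = submission_order(JOB_INPUTS)
for job in order:
    print(job.split("/")[-1])
```

The resulting order (static, atomic, total, eos_total) matches the sequence of "Submitted:" lines in the output.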
