Normally, one wouldn't worry about binding to cores when running SMP. An
exception would be if SMP LS-DYNA and MPP LS-DYNA are running simultaneously
on a node. In that case, for SMP LS-DYNA compiled with Intel Fortran, you can
set the KMP_AFFINITY environment variable. See the following:

https://software.intel.com/en-us/node/522691
http://www.nas.nasa.gov/hecc/support/kb/using-intel-openmp-thread-affinity-for-pinning_285.html

sp Ticket#2018012510000033
______________________________________________________________________________

By default, an MPI process migrates between cores as the OS manages resources
and attempts to get the best load balance on the system. But because LS-DYNA
is a memory-intensive application, such migration can significantly degrade
performance, since memory access takes longer if the process is moved to a
core farther from the memory it is using. To avoid this performance
degradation, it is important to bind each MPI process to a core. Each MPI
implementation has its own way of binding processes to cores, and furthermore,
threaded MPP (HYBRID) employs a different strategy from pure MPP.

I. Pure MPP
===========

To bind processes to cores, include the following MPI execution line
directives according to the type of MPI used.

HP-MPI, Platform MPI, and IBM Platform MPI:
   -cpu_bind  or  -cpu_bind=rank
   -cpu_bind=MAP_CPU:0,1,2,...   <<<< not recommended unless the user really
                                      needs to bind MPI processes to specific
                                      cores

IBM Platform MPI 9.1.4 and later:
   -affcycle=numa

Intel MPI:
   -genv I_MPI_PIN_DOMAIN=core

Open MPI:
   --bind-to numa

II. HYBRID MPP
==============

First, set the maximum number of OMP threads and the thread distribution with
the following environment variables.

   setenv OMP_NUM_THREADS 8      <<<< allows up to 8 SMP threads, i.e.,
                                      |ncpu| can be <= 8
   setenv KMP_AFFINITY compact   <<<< threads are placed close together

Then, to bind processes to cores, include the following MPI execution line
directive(s) according to the type of MPI used. (The following examples
specify 8 MPI ranks.)

HP-MPI, Platform MPI, and IBM Platform MPI:
   -cpu_bind=MASK_CPU:F,F0,F00,F000 -np 8

IBM Platform MPI 9.1.4 and later:
   -affcycle=numa -affwidth=4core -np 8

Intel MPI:
   -genv I_MPI_PIN_DOMAIN=core -genv I_MPI_PIN_ORDER=compact -np 8 -ppn 8

Open MPI:
   --bind-to numa -cpus-per-proc 4 -np 8
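
For illustration only (these command lines are not part of the original
note), a complete Pure MPP launch using Intel MPI might be sketched as
follows. The executable name mppdyna, the input deck name input.k, and the
rank count of 16 are placeholders; only the I_MPI_PIN_DOMAIN setting is taken
from the Intel MPI entry above.

   mpirun -genv I_MPI_PIN_DOMAIN=core -np 16 mppdyna i=input.k
                                 <<<< each MPI rank is pinned to its own core

Similarly, a HYBRID MPP launch with Intel MPI, combining the environment
variables and directives above, might look like the sketch below. Again,
mppdyna and input.k are placeholders, and ncpu=8 (the SMP thread count per
MPI rank) is assumed here to match OMP_NUM_THREADS.

   setenv OMP_NUM_THREADS 8      <<<< up to 8 SMP threads per MPI rank
   setenv KMP_AFFINITY compact   <<<< keep each rank's threads close together
   mpirun -genv I_MPI_PIN_DOMAIN=core -genv I_MPI_PIN_ORDER=compact \
          -np 8 -ppn 8 mppdyna i=input.k ncpu=8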